CN1148881C - Realising method for parallel cascade convolution code hardware decoder - Google Patents

Realising method for parallel cascade convolution code hardware decoder

Info

Publication number
CN1148881C
CN1148881C, CNB021004293A, CN02100429A
Authority
CN
China
Prior art keywords
convolution code
hardware decoder
cascade convolution
unit
parallel cascade
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB021004293A
Other languages
Chinese (zh)
Other versions
CN1378345A (en)
Inventor
国 卫
卫国
黄源良
赵春明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Research Institute of Telecommunications Transmission Ministry of Industry and Information Technology
Original Assignee
University of Science and Technology of China USTC
Research Institute of Telecommunications Transmission Ministry of Industry and Information Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC, Research Institute of Telecommunications Transmission Ministry of Industry and Information Technology
Priority to CNB021004293A priority Critical patent/CN1148881C/en
Publication of CN1378345A publication Critical patent/CN1378345A/en
Application granted granted Critical
Publication of CN1148881C publication Critical patent/CN1148881C/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Landscapes

  • Error Detection And Correction (AREA)

Abstract

The present invention relates to an implementation method for a parallel cascade convolution code hardware decoder. To reduce the memory required by the forward state metric calculator (FSMC) and the reverse state metric calculator (RSMC), the decoder, when implemented on a field programmable gate array (FPGA), spends additional logic resources in exchange for a lower memory data throughput. The concrete method is to use two sets of RSMCs, which work simultaneously, begin iteration from different initial instants, and have the correct portion of their output data selected. The advantage of the method is that, during decoding, the data throughput of the memory units depends only on a value L rather than on the number N of data to be decoded, and taking L larger than 5 times the constraint length of the component convolutional encoder is sufficient. Since N is generally far larger than L, this requirement is easy to satisfy and the number of memories used can be greatly reduced. Because the calculations run synchronously, the number of FSM values to be stored decreases correspondingly, which further lowers the memory data throughput.

Description

Implementation method of a parallel cascade convolution code hardware decoder
(1) Technical field:
The invention belongs to the field of communication technology, and in particular relates to a simplified implementation of an error-correcting code decoder.
(2) Background technology:
The parallel cascade convolution code (Turbo code) is an error-control code proposed by Berrou et al. of France in 1993. In the additive white Gaussian noise channel its error-correcting performance approaches the Shannon limit and surpasses that of earlier error-control codes. The Turbo encoder connects two convolutional encoders in parallel, separated by an interleaving unit. Overall, this encoder structure improves the distance distribution of the coded sequence and strengthens its error-correcting ability, in particular its ability to correct burst errors on the channel.
Traditional decoders use multi-stage iterative decoding. The existing hardware-implemented iterative decoding algorithm (Berrou, C., Glavieux, A. and Thitimajshima, P., Near Shannon Limit Error-Correcting Coding and Decoding: Turbo Codes, Proc. of ICC '93, 1064-1070) needs memories to hold intermediate state values. When the data volume is large, the number of memories required grows sharply and the reads and writes become very frequent. If off-chip memory chips are adopted instead, the operating speed of the decoder is limited. Moreover, the decoding delay is proportional to the amount of data to be decoded and increases sharply as that amount grows.
(3) Summary of the invention:
The object of the present invention is to provide a device that can improve the efficiency of Turbo code hardware decoding.
The device of the present invention is a specific implementation that reduces the memory data throughput of the Turbo code decoding algorithm, and it concerns the way hardware logic resources are used.
The decoding algorithm used by the device of the present invention is the maximum a posteriori decoding criterion in the logarithmic domain (Log-MAP), which comprises the following four elementary units: the forward state metric unit (Forward State Metric Calculator, FSMC), the reverse state metric unit (Reverse State Metric Calculator, RSMC), the branch metric unit (Branch Metric Calculator, BMC) and the log-likelihood ratio calculation unit (Log Likelihood Ratio Calculator, LLRC). Calculating in the log domain turns multiplication and division operations into additions and subtractions, which reduces the computational complexity.
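To illustrate why working in the log domain removes multiplications, the following minimal sketch (in Python, illustrative only and not part of the patent; the name max_star is a conventional choice) shows the Jacobian-logarithm operator commonly used in Log-MAP decoders:

```python
import math

def max_star(a: float, b: float) -> float:
    """ln(e^a + e^b), computed as a maximum plus a small correction term."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

# In the log domain a product of probabilities becomes a simple sum:
p1, p2 = 0.3, 0.6
assert abs(math.log(p1 * p2) - (math.log(p1) + math.log(p2))) < 1e-12
# and a sum of probabilities becomes the max_star of their logarithms:
assert abs(math.log(p1 + p2) - max_star(math.log(p1), math.log(p2))) < 1e-12
```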
The intermediate values that the device of the present invention needs to store are the forward state metrics (FSM) and the branch metrics (BM); in this way the reverse state metrics (RSM) can be computed conveniently, and the likelihood ratios finally obtained are output synchronously.
To reduce the memory demanded by the FSMC and the RSMC, when the device of the present invention is implemented with a field programmable gate array (FPGA), additional logic resources are spent in exchange for a reduction in memory data throughput. The concrete method is to use two sets of RSMCs: the two RSMCs work simultaneously and begin iteration from different initial instants, and the correct portion of the data output is selected. The choice of the iteration starting instants does not depend on the number N of data to be decoded; instead a value L is chosen such that L << N. A state metric recursion can begin at any instant. If the chosen starting instant of the recursion is not the instant at which the last data to be decoded enter the decoder, then the metrics calculated over an initial stretch are certainly incorrect; however, after a period of state transitions (in general, several constraint lengths of the convolutional code), the metrics become just as correct as those obtained by recursing from the final state. This guarantees the correctness of the decoding algorithm.
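The following short sketch (Python, illustrative only; the indexing convention and the strict alternation of the two engines are assumptions, not quoted from the patent) shows how the two reverse recursions can be scheduled over windows of length L so that every trellis step receives one reliable backward metric after an L-step warm-up:

```python
def rsmc_schedule(n_data: int, L: int):
    """Yield (engine, recursion_start, reliable_window) for each window of L steps.
    Each backward recursion spans 2L steps: the first L are warm-up and discarded,
    the last L are kept as reliable reverse state metrics."""
    for w, lo in enumerate(range(0, n_data, L)):
        hi = min(lo + L, n_data)
        start = min(lo + 2 * L, n_data)        # begin 2L ahead of the window (or at the block end)
        engine = "I" if w % 2 == 0 else "II"   # the two RSMCs alternate between windows
        yield engine, start, (lo, hi)

for engine, start, (lo, hi) in rsmc_schedule(n_data=160, L=32):
    print(f"RSMC {engine}: recurse backward from step {start}, keep metrics for steps [{lo}, {hi})")
```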
In practical applications the present invention can adopt different L values according to the specific performance requirements. The larger L is, the more memory is used and the larger the throughput becomes; conversely, a smaller L reduces the throughput. It should be noted, however, that in applying the present invention the value of L must not be taken too small; otherwise, because the number of reverse recursion steps is too few, the result computed by the RSMC cannot approach the performance of the traditional algorithm. In general, L may be any integer from 1 to N, but it is usually taken as an integer power of 2, commonly 16, 32 or 64.
Beneficial effects of the present invention: the advantage of the present invention is that the data throughput of the memory units during decoding does not depend on the number N of data to be decoded but only on the value L, and taking L larger than 5 times the constraint length of the component convolutional encoder is sufficient. In general, N >> L, so this requirement is met and the number of memories used can be significantly reduced. Because of the synchronous calculation, the number of FSM values that must be stored is reduced proportionally, so the data throughput of the memories decreases. The decoding delay of the device of the present invention is low: as soon as the calculation and storage of L forward state metrics and reverse state metrics are completed, the calculation of the log-likelihood ratios can begin, which shortens the idle waiting time of the LLRC and makes the decoded data appear at the output earlier.
(4) Description of drawings:
Fig. 1 is the implementation block diagram of the traditional maximum a posteriori decoding algorithm in the log domain;
Fig. 2 is the implementation block diagram used by the present invention;
Fig. 3 is the decoding-unit timing chart of the traditional maximum a posteriori decoding algorithm in the log domain;
Fig. 4 is the decoding-unit timing chart of the present invention;
Fig. 5 is a bit error rate (BER) performance curve of the decoding unit of the present invention;
Fig. 6 is the implementation block diagram of a decoding unit with two additional reverse state metric units;
Fig. 7 is the implementation block diagram of a decoding unit using the non-log-domain MAP algorithm;
Fig. 8 is the implementation block diagram of a decoding unit using the max-log-MAP decoding algorithm;
Fig. 9 is a hardware structure for implementing the BMC;
Fig. 10 is a hardware structure for implementing the FSMC.
(5) Embodiments:
The embodiments of the invention are further described below in conjunction with the accompanying drawings.
Fig. 1 shows the implementation block diagram of the traditional maximum a posteriori decoding algorithm in the log domain. Here 1 is the branch metric unit BMC, 2 is the forward state metric unit FSMC, 3 is the reverse state metric unit RSMC, 4 is the log-likelihood ratio calculation unit LLRC, and 5 is the memory unit, whose size equals the length N of the data to be decoded.
For convenience of description the following notation is adopted: x_k denotes the systematic (original information) bits in the data to be decoded, y_k denotes the parity (check) bits in the data to be decoded, and L_k denotes the log-likelihood ratio output.
Fig. 2 shows the implementation block diagram used by the present invention. Here 3_I and 3_II are the two reverse state metric units RSMC, and 6 is also a memory unit, but unlike memory unit 5 of Fig. 1 its capacity equals the value L. That is to say, the capacity of memory unit 6 is in general much smaller than that of memory unit 5.
In the traditional algorithm, the calculation of L_k can only start after the FSMC and RSMC results for all N data to be decoded have been calculated and stored, so the decoding delay is large. The corresponding timing diagram is shown in Fig. 3.
In the present invention, one additional group of reverse state metric units 3 is added to the original architecture, and the two groups work simultaneously. The decoding timing is shown in Fig. 4: x_k and y_k are input to the unit, and the branch metric unit 1 calculates and stores the branch metric values; after a delay of 2L instants, the forward state metric unit 2 starts to calculate the forward state metrics from the initial state, and its results are correct throughout. There are now two reverse state metric calculation units 3: unit 3_I starts at instant 2L and reads the values BM(2L) ... BM(0) to calculate the reverse state metrics RSM_I, noting that RSM_I(2L) ... RSM_I(L) are unreliable and are drawn with dotted lines in the figure; unit 3_II starts at instant 3L and reads the values BM(3L) ... BM(L) to calculate the reverse state metrics RSM_II, where RSM_II(3L) ... RSM_II(2L) are unreliable and are likewise drawn with dotted lines. From instant 3L onward, the reliable one of RSM_I and RSM_II is selected as the reverse state metric RSM, the forward state metrics FSM calculated and stored before this instant are fetched, and the output log-likelihood ratio L_k is calculated. Thus, by adding one reverse state metric calculation unit, the present invention brings a significant reduction in memory usage.
Take the 384k downlink service in WCDMA as an example. In one decoding unit, storing the BM values requires 4L × 32 = 128L bits of memory and storing the FSM values requires 2L × 64 = 128L bits of memory; taking L = 32, a total of 8192 bits of memory is needed. It can be seen that this simplified algorithm has the following advantages. With 6-bit quantization of the input coded data, the simulation results shown in Fig. 5 indicate that the intermediate variables obtained by the present invention are no different from those of the ordinary Log-MAP algorithm and the decoding performance is not affected, while a large amount of memory is saved; the decoder can therefore be realized in a single FPGA, is not limited by the speed of accesses to external memory, can make full use of the hardware speed of the FPGA, and achieves a higher decoding rate. On the other hand, the likelihood ratios L_k can be output continuously with a delay of only 3L instants, which saves time and nearly doubles the speed.
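As a back-of-the-envelope cross-check of the figures above (a sketch only; the decomposition into 4 branch metrics of 8 bits and 8 states of 8 bits per trellis step is an assumption based on the bit widths stated later in this embodiment):

```python
L = 32
bm_bits_per_step  = 4 * 8    # assumed: 4 branch metrics per step, 8 bits each -> 32 bits
fsm_bits_per_step = 8 * 8    # assumed: 8 trellis states per step, 8 bits each -> 64 bits
bm_store  = 4 * L * bm_bits_per_step    # BM buffer depth of 4L steps -> 128L bits
fsm_store = 2 * L * fsm_bits_per_step   # FSM buffer depth of 2L steps -> 128L bits
print(bm_store, fsm_store, bm_store + fsm_store)   # 4096 4096 8192 bits in total
```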
To realize Turbo decoding in the WCDMA system, the present invention uses the modified Log-MAP algorithm described above for decoding the convolutional code inside the decoding unit, and reuses the decoding unit at high speed in time to complete four decoding iterations. Only an indication of the data block size N is needed; the interleaving pattern required for decoding is computed and generated in real time according to the Turbo code internal interleaving scheme specified by the protocol. The embodiment is implemented on one APEX20K400 chip from Altera; the coded input data are 6 bits wide, the BM, FSM, RSM and L_k values are all 8 bits wide, and 4795 logic elements and 167680 bits of memory are occupied. A 30 MHz on-chip clock is used, and the 8 states of the component convolutional code are processed in parallel, i.e. one clock cycle processes the 8 state values of one data item; the decoding unit is reused in time to complete the four decoding iterations, so the maximum data rate the decoder can reach is 30M / (4 × 2) = 3.75 Mbit/s, which meets the 2 Mbit/s data rate required by WCDMA.
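The rate arithmetic above can be summarized as follows (a sketch, assuming one trellis step per clock cycle because the 8 states are processed in parallel):

```python
clock_hz           = 30e6   # on-chip clock of this embodiment
iterations         = 4      # the decoding unit is reused for four decoding iterations
component_decoders = 2      # each iteration runs the two constituent decoders in turn
max_rate = clock_hz / (iterations * component_decoders)
print(max_rate)             # 3750000.0, i.e. 3.75 Mbit/s > the 2 Mbit/s WCDMA target
```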
For the 384k service in the WCDMA protocol the data block size is 4224. Over the AWGN channel, a Turbo decoding performance simulation of the hardware decoder was carried out for each given signal-to-noise ratio, with 4224 × 1000 = 4224000 data in total. The simulation results are shown in Fig. 5. It can be seen that as the number of decoding iterations increases, the decoding performance improves significantly. For the four-iteration decoding realized here, the decoded bit error rate at a signal-to-noise ratio of 1.8 dB is 7.5758e-006.
It should be noted that the present invention can be applied not only to the maximum a posteriori decoding algorithm in the log domain (Log-MAP), but also to other decoding algorithms that similarly use memories to store intermediate states. When another Turbo code decoding algorithm is used, as long as the algorithm likewise uses memories to store intermediate states, the device of the present invention remains applicable. The max-log-MAP algorithm, a variant widely used in engineering, is one example. Because the computational load of the log-domain MAP algorithm is still rather large, a further simplification can be made by adopting the mathematical approximation
ln(Σ_i e^(x_i)) ≈ max_i(x_i)
In this way the logarithm operation is converted into a maximization operation and the computational complexity is reduced; that is, each state metric becomes a maximization. However, like the log-domain MAP algorithm, this algorithm still needs to store the forward state metrics (FSM) and branch metrics (BM) obtained in the intermediate calculations. Fig. 6 gives the implementation block diagram of the max-log-MAP algorithm, in which the branch metric unit 7, the forward state metric unit 8, the reverse state metric calculation unit 9 and the likelihood ratio calculation unit 10 all use the max-log-MAP algorithm. Memory unit 6 is exactly the memory used in the present invention, and the overall structure does not change much.
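The approximation can be seen numerically in the following small sketch (Python, illustrative only):

```python
import math

def log_sum_exp(xs):
    """Exact ln(sum_i e^{x_i}), evaluated in a numerically stable way."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

xs = [2.0, 1.5, -0.3, 0.2]
print(log_sum_exp(xs))   # about 2.63 (exact Log-MAP value)
print(max(xs))           # 2.0 -- the max-log-MAP approximation keeps only the largest term
```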
The decoding device proposed by the present invention adds one reverse state metric unit and thereby significantly reduces the memory throughput. More reverse state metric units can be added to reduce the memory-unit usage further, but the number added must not be too large, otherwise the hardware logic becomes more complex and the timing control becomes cumbersome. In general, the result obtained with two reverse metric units is already satisfactory. Fig. 7 gives the implementation block diagram with two additional reverse state metric units; the implementation of each unit in the figure is the same as the corresponding part of Fig. 2.
The present invention is formulated in the log domain in order to reduce computational complexity and facilitate engineering realization. However, the structure of the present invention is fully applicable to decoding implementations in the non-log domain (such an algorithm may be called a non-log-domain MAP algorithm); it is merely that the computational complexity is higher in principle. Fig. 8 gives the block diagram for completing the decoding in the non-log domain.
The present invention is applicable to decoding procedures that reuse a decoding unit at high speed in time, and also to non-multiplexed decoding procedures.
A specific implementation of each metric unit applicable to the present invention is given below as a reference for completing the decoding unit. Fig. 9 gives an implementation of the branch metric unit (BMC), in which |x_k| and |y_k| are the absolute values of the inputs x_k and y_k, and BM_k^(i,j) (i, j ∈ {0,1}) are the calculated branch metrics (BM). Fig. 10 gives an implementation of the forward state metric unit (FSMC), in which the min unit performs a minimization and the E unit adds the corresponding BM to the FSM and then completes a logarithmic correction by table lookup. The FSM iteration process can be seen from the figure, namely how the values of the states at step k are obtained from the FSM of the states at step k-1. The implementation of the reverse state metric unit (RSMC) is very similar to that of the FSMC, except that the values of the states at step k-1 are deduced backwards from the RSM of the states at step k. The log-likelihood ratio calculation unit (LLRC) also uses E units and min units to decide and output the final likelihood ratio, so the implementation block diagrams of the RSMC and the LLRC are not drawn here.
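As an illustration of the min/E structure described above, the following sketch (Python, illustrative only; the transition table is a placeholder, not the constituent code of this embodiment) shows one forward-recursion step with negative-log-domain metrics, where the min unit selects the better incoming path and the E unit supplies the table-lookup correction:

```python
import math

def min_star(a: float, b: float) -> float:
    """-ln(e^-a + e^-b): a minimum plus a correction term (realized in hardware by a small table)."""
    return min(a, b) - math.log1p(math.exp(-abs(a - b)))

def fsm_step(fsm_prev, bm, predecessors):
    """Compute FSM_k[s] for every state s from FSM_{k-1}[.].
    predecessors[s] = ((p0, b0), (p1, b1)): the two predecessor states of s and the
    indices of the branch metrics on the corresponding transitions (placeholder layout)."""
    return [min_star(fsm_prev[p0] + bm[b0], fsm_prev[p1] + bm[b1])
            for (p0, b0), (p1, b1) in predecessors]

# Tiny 4-state example with made-up tables, just to show the data flow:
prev  = [0.0, 3.1, 2.4, 5.0]
bm    = [0.2, 1.1, 0.7, 1.9]                      # BM_k^(0,0), BM_k^(0,1), BM_k^(1,0), BM_k^(1,1)
preds = [((0, 0), (1, 3)), ((2, 1), (3, 2)),
         ((0, 3), (1, 0)), ((2, 2), (3, 1))]
print(fsm_step(prev, bm, preds))
```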

Claims (7)

1. An implementation method of a parallel cascade convolution code hardware decoder, comprising the use of a forward state metric unit, a branch metric unit, a log-likelihood ratio calculation unit and at least two sets of reverse state metric units, characterized in that: the forward state metric unit and the said at least two sets of reverse state metric units are connected in parallel; the said at least two sets of reverse state metric units work simultaneously and begin iteration from different initial instants, and the correct portion of the data output is selected; the selection of the iteration starting instants does not depend on the number N of data to be decoded, and a value L is chosen, where L may be any integer from 1 to N.
2. The implementation method of a parallel cascade convolution code hardware decoder according to claim 1, characterized in that the value of the said L can be taken as an integer power of 2.
3. The implementation method of a parallel cascade convolution code hardware decoder according to claim 2, characterized in that the said value of L may be taken as 16, 32 or 64.
4. The implementation method of a parallel cascade convolution code hardware decoder according to claim 1, characterized in that it can be applied to the maximum a posteriori decoding algorithm in the log domain, and can also be applied to other decoding algorithms that use memories to store intermediate states.
5. The implementation method of a parallel cascade convolution code hardware decoder according to claim 1, characterized in that two or three reverse state metric units are added in exchange for a further reduction in memory usage.
6. The implementation method of a parallel cascade convolution code hardware decoder according to claim 1, characterized in that it is applicable not only to log-domain decoding algorithms but also to non-log-domain decoding algorithms.
7. The implementation method of a parallel cascade convolution code hardware decoder according to claim 1, characterized in that it is applicable to decoding procedures that reuse a decoding unit at high speed in time, and also to non-multiplexed decoding procedures.
CNB021004293A 2002-01-30 2002-01-30 Realising method for parallel cascade convolution code hardware decoder Expired - Fee Related CN1148881C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB021004293A CN1148881C (en) 2002-01-30 2002-01-30 Realising method for parallel cascade convolution code hardware decoder

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB021004293A CN1148881C (en) 2002-01-30 2002-01-30 Realising method for parallel cascade convolution code hardware decoder

Publications (2)

Publication Number Publication Date
CN1378345A CN1378345A (en) 2002-11-06
CN1148881C true CN1148881C (en) 2004-05-05

Family

ID=4739362

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB021004293A Expired - Fee Related CN1148881C (en) 2002-01-30 2002-01-30 Realising method for parallel cascade convolution code hardware decoder

Country Status (1)

Country Link
CN (1) CN1148881C (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102394663B (en) * 2011-10-11 2013-08-28 东南大学 Segment parallel coding method of feedforward convolutional code
CN108509382B (en) * 2018-03-27 2022-06-07 南开大学 Method for realizing quick convolution operation of super-long sequence based on FPGA

Also Published As

Publication number Publication date
CN1378345A (en) 2002-11-06

Similar Documents

Publication Publication Date Title
EP1030457B1 (en) Methods and system architectures for turbo decoding
US20040039769A1 (en) Method for decoding error correcting code, its program and its device
CN102412850B (en) Turbo code parallel interleaver and parallel interleaving method thereof
CN100546207C (en) A kind of dual-binary Turbo code encoding method based on the DVB-RCS standard
CN102340320B (en) Bidirectional and parallel decoding method of convolutional Turbo code
US7464316B2 (en) Modified branch metric calculator to reduce interleaver memory and improve performance in a fixed-point turbo decoder
CN101442321B (en) Parallel decoding of turbine code and data processing method and device
CN104092470A (en) Turbo code coding device and method
CN1254121C (en) Method for decoding Tebo code
CN1157883C (en) Maximal posterior probability algorithm of parallel slide windows and its high-speed decoder of Turbo code
CN1148881C (en) Realising method for parallel cascade convolution code hardware decoder
CN101938330A (en) Multi-code rate Turbo encoder and storage resource optimization method thereof
US20010054170A1 (en) Apparatus and method for performing parallel SISO decoding
CN101217336B (en) A TD-SCDMA/3G hard core turbo decoder
CN102594369B (en) Quasi-cyclic low-density parity check code decoder based on FPGA (field-programmable gate array) and decoding method
CN103595424A (en) Component decoding method, decoder, Turbo decoding method and Turbo decoding device
CN101882934A (en) Arithmetic circuit
CN1159933C (en) Universal convolution encoder and viterbi decoder
CN102594507B (en) High-speed parallel Turbo interpretation method in a kind of software radio system and system
EP1317071B1 (en) Normalisation in a turbo decoder using the two's complement format
CN102571107A (en) System and method for decoding high-speed parallel Turbo codes in LTE (Long Term Evolution) system
CN108449092B (en) Turbo code decoding method and device based on cyclic compression
CN1286533A (en) Decoding method and decoder for high-speed parallel cascade codes
CN2506034Y (en) Turbo decoder
CN103701475A (en) Decoding method for Turbo codes with word length of eight bits in mobile communication system

Legal Events

Date Code Title Description
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C06 Publication
PB01 Publication
C14 Grant of patent or utility model
GR01 Patent grant
C19 Lapse of patent right due to non-payment of the annual fee
CF01 Termination of patent right due to non-payment of annual fee