CN102118173B

CN102118173B - High-speed decoding method for LDPC codes and shortened codes thereof

Info

Publication number: CN102118173B
Application number: CN 201110029300
Authority: CN
Inventors: 牛毅; 马忠松; 傅得立
Original assignee: Individual
Current assignee: Individual
Priority date: 2011-01-27
Filing date: 2011-01-27
Publication date: 2013-09-18
Anticipated expiration: 2031-01-27
Also published as: CN102118173A

Abstract

The invention discloses a high-speed decoding method for LDPC (low-density parity-check) codes and their shortened codes. The method adopts the standard (8176, 7154) code established by the CCSDS (Consultative Committee for Space Data Systems), decodes high-speed data streams on a lower-cost hardware platform, and enables long low-density parity-check codes to be decoded at a high code rate.

Description

A high-speed decoding method for LDPC codes and shortened codes thereof

Technical field

The invention belongs to the technical field of encoding and decoding, and relates to high-speed decoding of LDPC codes and their shortened codes.

Background technology

The LDPC code was proposed by Gallager in his doctoral thesis as early as the 1960s. Limited by the technology of the time and lacking a feasible decoding algorithm, it was largely ignored for the following 35 years. During that period, in 1981, Tanner generalized LDPC (low-density parity-check) codes and gave them a graph representation, later known as the Tanner graph. Berrou et al. discovered Turbo codes in 1993; building on this, MacKay and Neal re-examined LDPC codes around 1995 and proposed a feasible decoding algorithm, which revealed the good performance of LDPC codes and quickly attracted strong interest. LDPC codes are now widely used in deep-space communication, optical-fiber communication, satellite digital video and audio broadcasting, and other fields. The LDPC code has become a strong competitor for fourth-generation (4G) communication systems, and an LDPC-based coding scheme has been adopted by the next-generation satellite digital video broadcasting standard DVB-S2.

In China's space industry, LDPC channel coding was first applied on Chang'e-2, launched on October 1, 2010. Its downlink data rate is 12 Mbit/s, so the LDPC codec operates at a frequency of 12 MHz. As Chinese space technology develops, the volume of data that payloads must downlink during space exploration keeps growing, which requires LDPC codes, with their good error-correcting capability, to adapt effectively to high bit rates.

Current hardware implementations of LDPC decoding are mostly based on codes with short code lengths (tens to hundreds of bits) and run at low operating speeds (up to the order of tens of Mbit/s). Moreover, because hardware resources are limited, a more capable LDPC decoder needs a high-performance chip, or even a chip designed specifically for a particular algorithm, so the hardware platform is very costly, which hinders large-scale application.

Summary of the invention

The technical problem solved by the invention is to overcome the deficiencies of the prior art by providing an LDPC decoder and decoding method. Taking the CCSDS standard as its basis and targeting the (8176, 7154) code, the invention decodes high-speed data streams on a lower-cost hardware platform and solves the problem of decoding long low-density parity-check codes at a high code rate.

The technical solution of the invention is realized by the following steps:

(1) The i-th data frame of the data to be decoded is examined:

If the input is LDPC data, the original code of the frame header in the LDPC data is stored at the address of the output data register in use, and the frame header of the data frame from the previous operation cycle is output through the output data register;

If the input is shortened-code LDPC data, the original code of the frame header and fill data in the LDPC data is stored at the address of the output data register in use, and the frame header and fill data of the data frame from the previous operation cycle are output through the output data register;

(2) The valid data of the i-th data frame are stored at the addresses of the intermediate data register and iteration result data register in use, and the decoded data from the previous operation cycle are output through the output data register described in step (1);

(3) After the i-th data frame has been stored, the iterative check-node and variable-node operations are performed on it; at the same time, the (i+1)-th input data frame is processed from step (1) using a different output data register, intermediate data register, and iteration result data register;

(4) After the operations on the i-th data frame are finished and its result is output, the output data register, intermediate data register, and iteration result data register that served the i-th data frame are reused for a newly arrived data frame.

In the above steps, all input data frames are operated on simultaneously. The time spent operating on a data frame after it has been stored equals the sum of the times needed to store each of the other data frames that are input after it.

The iterative check-node and variable-node operations in step (3) are carried out by a check-node computing module and a variable-node computing module respectively, and these modules can be multiplexed among the continuously arriving data frames.

The number of data frames operated on simultaneously is 4. After the operations on the i-th data frame are finished, the (i+4)-th input data frame reuses the output data register, intermediate data register, and iteration result data register that served the i-th data frame.
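The register reuse described above amounts to a four-bank rotation. The following Python sketch is only an illustration of that schedule, under the assumption that one "slot" is the time needed to store one incoming frame; the names `NUM_BANKS` and `slot_activity` are hypothetical and do not come from the patent.

```python
NUM_BANKS = 4  # the text operates on 4 data frames at once


def slot_activity(t, num_banks=NUM_BANKS):
    """Describe what happens in frame slot t (slots and frames are numbered from 0).

    Assumption: frame t is stored during slot t, iterated during the next
    num_banks - 1 slots, and output during slot t + num_banks, while the
    frame t + num_banks is being stored into the same register bank.
    """
    bank = t % num_banks                                   # bank reused every 4 frames
    outputting = t - num_banks if t >= num_banks else None  # previous user of this bank
    iterating = [f for f in range(t - num_banks + 1, t) if f >= 0]
    return {"bank": bank, "load": t, "output": outputting, "iterate": iterating}


for t in range(6):
    print(t, slot_activity(t))
```

In slot 5, for example, frame 5 is loaded into bank 1 while frame 1 is output from it, and frames 2, 3 and 4 are being iterated. Each frame is therefore iterated for three slots, which matches the statement that its operating time equals the summed storage times of the other frames.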

Compared with the prior art, the invention has the following advantages:

(1) A semi-parallel data processing scheme with 2 check-node modules and 16 variable-node modules balances resource usage against computational efficiency. Compared with fully serial processing, the number of iterations can be doubled while buffering 3 frames of data; compared with fully parallel processing, a large amount of chip resources is saved, which makes decoding at this scale feasible on lower-end chips. In addition, building on this method, allocating independent arithmetic units to every two adjacent frames can further raise the system operating frequency at the cost of a small increase in hardware resources. Depending on practical engineering needs, a suitable modification of the timing control can also double the number of iterations again while lowering the operating frequency by 50 percent, giving better decoding performance.

(2) Semi-parallel data processing also produces a large amount of intermediate data. When decoding an LDPC code with a short code length, the on-chip register resources of the FPGA can hold these data temporarily. For a long code such as (8176, 7154), however, the on-chip registers are insufficient, and frequent read and write access to registers at different locations during computation also lowers the system operating frequency. The design therefore stores intermediate data in Block RAM, which makes effective use of dedicated circuits inside the FPGA, saves a large amount of logic and routing resources, and solves the above problems. In addition, because Block RAM is read and written by address, the portability of the design is improved.

(3) The multi-level pipelined control scheme is in effect an optimization of the semi-parallel data processing method. In terms of data flow, this parallel structure is dynamic: while raising the system operating frequency, it also keeps the input and output of the decoded data stream continuous.

Description of drawings

Fig. 1 is a structural diagram of the invention;

Fig. 2 is the Tanner graph of the example check matrix for the Min-sum algorithm;

Fig. 3 is a schematic diagram of the check-node operation;

Fig. 4 is the workflow diagram of the invention.

Embodiment

The invention is further described below with reference to the accompanying drawings.

The decoding algorithms for LDPC codes include:

1. SPA (Sum-Product Algorithm), probability domain;

2. SPA, log domain;

3. Min-sum (a simplification of the log-domain SPA) and its various modified versions;

4. BF (Bit-Flipping);

5. MLG (Majority-Logic), applicable only to cyclic LDPC codes.

The complexity and error performance of the algorithms listed above decrease in order. The probability-domain SPA is too complex to compute, while the BF and MLG algorithms cannot provide good error performance, so none of these is suitable in practice. Among the usable algorithms, the modified Min-sum algorithm approaches the error performance of SPA while keeping relatively low complexity; after weighing these factors, the present design adopts the modified Min-sum algorithm.

Iterative decoding with the Min-sum algorithm:

H = \begin{bmatrix}
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 1
\end{bmatrix}

The Min-sum algorithm decodes iteratively on the Tanner graph of the LDPC check matrix H. The check matrix H above serves as an example to illustrate the computation; its Tanner graph is shown in Fig. 2. Let c_i and f_j be the arithmetic units of the variable nodes and check nodes respectively (c_i corresponds to column i of H and f_j to row j). Each line between c_i and f_j is a bidirectional channel: the message from c_i to f_j (output of c_i, input of f_j) is denoted q_{ij}, and the message from f_j to c_i (output of f_j, input of c_i) is denoted r_{ji}.
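The edges of the Tanner graph follow directly from the nonzero entries of H. A minimal Python sketch lists them, taking f_j as row j and c_i as column i; the helper name `tanner_edges` is illustrative and not part of the patent.

```python
# Example check matrix H from the text: 5 rows (f_0..f_4), 10 columns (c_0..c_9).
H = [
    [1, 1, 1, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 1, 1, 0, 0, 0],
    [0, 1, 0, 0, 1, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 0, 1, 0, 1, 1],
]


def tanner_edges(h):
    """Return (f_to_c, c_to_f): for each row the connected columns, and vice versa."""
    f_to_c = [[i for i, bit in enumerate(row) if bit] for row in h]
    c_to_f = [[j for j, row in enumerate(h) if row[i]] for i in range(len(h[0]))]
    return f_to_c, c_to_f


f_to_c, c_to_f = tanner_edges(H)
print([len(v) for v in f_to_c])  # [5, 4, 4, 4, 4]
print([len(v) for v in c_to_f])  # [2, 2, 2, 2, 2, 2, 2, 3, 2, 2]
print(f_to_c[2], c_to_f[7])      # [1, 4, 7, 8] [0, 2, 3]
```

The degree lists show that only f_0 has five edges and only c_7 has three, and that f_2 is connected to c_1, c_4, c_7 and c_8.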

Min-sum first initializes the inputs q_{ij} of f_j. Each q_{ij} is processed by f_j to produce the outputs r_{ji}, which are in turn processed by c_i to produce the updated q_{ij}; this completes one iteration. Iteration continues until a valid codeword is obtained or the preset maximum number of iterations is reached, at which point decoding ends. The concrete steps are as follows:

1. Initialize q_{ij}: given the received noisy codeword γ = (γ_0, γ_1, ..., γ_9) with noise variance σ^2, take q_i = 2γ_i/σ^2 as the original input and initialize every q_{ij} to:

q_{ij} = q_i = 2 \gamma_i / \sigma^2

2. Arithmetic unit f_j: f_0 has 5 inputs and 5 outputs; every other f_j has 4 inputs and 4 outputs. The outputs are related to the inputs by:

r_{ji} = \prod_{i' \neq i} \operatorname{sign}(q_{i'j}) \cdot \min_{i' \neq i} |q_{i'j}|

That is, the sign of each output is the product of the signs of all inputs other than the corresponding input q_{ij}, and the magnitude of each output is the minimum magnitude among those other inputs. For example, the output of f_2 to c_1 is:

r_{21} = \operatorname{sign}(q_{42}) \operatorname{sign}(q_{72}) \operatorname{sign}(q_{82}) \cdot \min(|q_{42}|, |q_{72}|, |q_{82}|)

3. Arithmetic unit c_i: c_7 has 3 inputs and 3 outputs; every other c_i has 2 inputs and 2 outputs. The outputs are related to the inputs by:

Q_i = \sum_{j'} r_{j'i} + q_i

q_{ij} = Q_i - r_{ji}

That is, each output is the sum of all inputs other than the corresponding input r_{ji}, plus the original input q_i = 2γ_i/σ^2 of the unit c_i. The intermediate result Q_i is kept for the codeword decision in the next step. Taking c_7 as an example (see Fig. 3), the output q_{70} to f_0 is:

Q_7 = r_{07} + r_{27} + r_{37} + 2 \gamma_7 / \sigma^2

q_{70} = Q_7 - r_{07} = r_{27} + r_{37} + 2 \gamma_7 / \sigma^2

4. Compute the codeword and check the stopping condition: steps 2 and 3 complete one iteration, after which Q_i is used to decide each code bit:

c_i = \begin{cases} 1, & Q_i < 0 \\ 0, & Q_i > 0 \end{cases}

If c H^T = 0 for the decided word c = (c_0, c_1, ..., c_9), decoding has succeeded; in that case, or when the preset maximum number of iterations has been reached, iteration stops. Otherwise steps 2 and 3 are repeated.
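The four steps above can be put together in a short floating-point sketch. This is an illustration in Python of the plain (unmodified) Min-sum update on the 5 x 10 example matrix, not the hardware design of the patent. The function name `min_sum_decode`, the test vector, and the convention that a positive received value means bit 0 are assumptions made for the example; a value of Q_i exactly equal to zero is decided as bit 0 here.

```python
H = [
    [1, 1, 1, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 1, 1, 0, 0, 0],
    [0, 1, 0, 0, 1, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 0, 1, 0, 1, 1],
]


def min_sum_decode(h, gamma, sigma2, max_iter=10):
    """Min-sum decoding in the q_ij / r_ji notation of the text.

    Assumes every row of h has at least two ones. Returns (bits, success).
    """
    m, n = len(h), len(h[0])
    rows = [[i for i in range(n) if h[j][i]] for j in range(m)]  # c_i attached to f_j
    cols = [[j for j in range(m) if h[j][i]] for i in range(n)]  # f_j attached to c_i
    q0 = [2.0 * g / sigma2 for g in gamma]                       # q_i = 2*gamma_i/sigma^2
    q = {(i, j): q0[i] for j in range(m) for i in rows[j]}       # step 1: q_ij = q_i
    bits = [1 if v < 0 else 0 for v in q0]
    for _ in range(max_iter):
        # Step 2: r_ji = product of signs times minimum magnitude of the other inputs.
        r = {}
        for j in range(m):
            for i in rows[j]:
                others = [q[(k, j)] for k in rows[j] if k != i]
                sign = -1 if sum(v < 0 for v in others) % 2 else 1
                r[(j, i)] = sign * min(abs(v) for v in others)
        # Step 3: Q_i = sum of r_ji plus q_i, then q_ij = Q_i - r_ji.
        total = [q0[i] + sum(r[(j, i)] for j in cols[i]) for i in range(n)]
        for i in range(n):
            for j in cols[i]:
                q[(i, j)] = total[i] - r[(j, i)]
        # Step 4: hard decision and parity check c * H^T = 0.
        bits = [1 if t < 0 else 0 for t in total]
        if all(sum(bits[i] for i in rows[j]) % 2 == 0 for j in range(m)):
            return bits, True
    return bits, False


# All-zero codeword sent as +1.0 per bit; position 3 is received with the wrong sign.
received = [1.0, 1.0, 1.0, -0.2, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(min_sum_decode(H, received, sigma2=1.0))
```

With this input, the two checks attached to c_3 each return a message of +2 to it, so Q_3 becomes positive after the first iteration and the all-zero word is recovered with the success flag set.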

In September 2005 NASA proposed an (8176, 7154) LDPC code constructed by the finite-geometry method, and in May 2006 designated it as the standard channel code to replace the concatenated R-S and convolutional code. A technical document published in August 2006 by the CCSDS (Consultative Committee for Space Data Systems) recommends this code as the standard code for near-Earth applications (reference: "CCSDS 131.1-O-1, Aug 2006").

This code has the following characteristics:

1. As a finite-geometry code, it shows no error floor at BER = 10^-10;

2. As a finite-geometry code, it converges quickly under iterative decoding;

3. It is a quasi-cyclic, systematic code that can be implemented with logic circuits and needs no real-number arithmetic;

4. Its code rate is 7/8, so the transmission bandwidth does not increase significantly.

Based on these characteristics, the invention adopts the (8176, 7154) LDPC code constructed by the finite-geometry method.

The hardware implementation uses a programmable logic device from Xilinx and makes full use of its RAM resources in place of look-up tables. Pipelined control is adopted and the key logic functions of the algorithm are multiplexed, which effectively raises the operating frequency and reduces resource usage.

Decoding uses the Min-sum algorithm, whose core is the check-node and variable-node operations. The main problem is that decoding a long LDPC code requires a very large number of operations and produces a large amount of intermediate data that must be stored. Taking a frame length of 8176 bits as an example, one iteration requires 1022 check-node operations and 8176 variable-node operations, and decoding one frame requires 10 iterations.

Based on the characteristics of the LDPC check matrix, the following measures are used to decode a 100 Mbps data stream continuously:

2 groups of check-node operations run in parallel, halving the check-node operation cycles compared with a serial structure;

16 groups of variable-node operations run in parallel, halving the variable-node operation cycles compared with a serial structure;

Pipelined control of the operation timing increases data throughput and raises the system operating speed;

Two frames of data share one group of check-node and variable-node computing modules, reducing resource usage;

The internal RAM of the FPGA replaces registers for storing intermediate data.
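The figures quoted above (1022 check-node and 8176 variable-node operations per iteration, 10 iterations, 2-way and 16-way parallelism, 100 Mbps input) can be combined into a rough count. The Python sketch below is plain arithmetic on those numbers; treating one "step" as one batch of parallel node operations is an assumption made here, and the result says nothing about actual clock cycles in the hardware.

```python
import math

N_BITS = 8176       # frame length in bits
N_CHECKS = 1022     # check-node operations per iteration
N_VARS = 8176       # variable-node operations per iteration
ITERATIONS = 10     # iterations per frame
CN_PARALLEL = 2     # check-node operations performed at once
VN_PARALLEL = 16    # variable-node operations performed at once
BIT_RATE = 100e6    # input data rate, 100 Mbps

cn_steps = math.ceil(N_CHECKS / CN_PARALLEL)   # batches of check-node work per iteration
vn_steps = math.ceil(N_VARS / VN_PARALLEL)     # batches of variable-node work per iteration
steps_per_frame = ITERATIONS * (cn_steps + vn_steps)
frame_time_us = N_BITS / BIT_RATE * 1e6        # time to receive one frame, in microseconds

print(cn_steps, vn_steps, steps_per_frame, round(frame_time_us, 2))
```

Both node types come out at 511 batches per iteration, so the two halves of an iteration are balanced, and one frame needs 10220 batches in total while a new frame arrives roughly every 81.76 microseconds.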

Divided by function, the decoder of the invention comprises a data conversion and control module, a data output module, n intermediate data registers, n iteration result registers, n output data registers, n/2 check-node computing modules, and n/2 variable-node computing modules. The structure is shown in Fig. 1; the function of each part is described below:

Data conversion and control module

This is the control core of LDPC decoding. It performs format conversion of the raw input data, input selection, and address selection; controls the read and write operations of the intermediate data registers, iteration result registers, and output data registers; routes the data flows; and controls the timing of all functional units.

Data storage module

The data storage module comprises three parts: the intermediate data registers, the iteration result registers, and the output data registers. It provides temporary storage for the iterative operations and serves as the node of data exchange; once the current frame has finished its operations, the storage is reassigned to a following frame.

The intermediate data registers, iteration result registers, and output data registers are implemented in Block RAM. Block RAM is a dedicated resource in the FPGA that supports simultaneous access through two ports and does not consume the FPGA's look-up table resources.

Check-node computing module

This module performs the check-node operation. For the (8176, 7154) LDPC code recommended by CCSDS, its core takes 32 inputs of 7 bits each, finds the minimum and second-minimum values among them, assigns the second minimum to the position that held the minimum and the minimum to all other positions, and outputs 32 values of 7 bits each. According to the data input selection signal supplied by the data conversion and control module, it operates on one of two groups of 32 7-bit inputs.

In the concrete implementation, the check-node computing module comprises 4 sub-modules, each of which selects the minimum and second-minimum values among its 8 input data and outputs the position of the minimum.
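The minimum and second-minimum selection can be sketched in software as four 8-input searches followed by a merge. The Python below handles magnitudes only and is a behavioural illustration, not the patent's circuit: sign handling, the 7-bit word format, and the way the four sub-module results are merged are assumptions, and the function names are hypothetical.

```python
INF = float("inf")


def two_smallest(values, offset=0):
    """One sub-module: return (min, second_min, index_of_min) of its inputs."""
    m1, m2, pos = INF, INF, -1
    for k, v in enumerate(values):
        if v < m1:
            m1, m2, pos = v, m1, offset + k
        elif v < m2:
            m2 = v
    return m1, m2, pos


def check_node_magnitudes(mags):
    """32 input magnitudes -> 32 output magnitudes using 4 sub-modules of 8 inputs."""
    assert len(mags) == 32
    partial = [two_smallest(mags[g * 8:(g + 1) * 8], offset=g * 8) for g in range(4)]
    m1, m2, pos = INF, INF, -1
    for p1, p2, ppos in partial:       # merge the four partial results
        if p1 < m1:
            m2 = min(m1, p2)
            m1, pos = p1, ppos
        else:
            m2 = min(m2, p1)
    # The position of the minimum receives the second minimum; all others the minimum.
    return [m2 if k == pos else m1 for k in range(32)]


mags = [5] * 32
mags[10], mags[20] = 1, 3
out = check_node_magnitudes(mags)
print(out[10], out[0], out[20])
```

For this input the overall minimum 1 sits at position 10 and the second minimum 3 at position 20, so position 10 receives 3 and every other position receives 1.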

Variable-node computing module

This module performs the variable-node operation. For the same (8176, 7154) LDPC code recommended by CCSDS, it operates on 5 inputs of 7-bit width: it computes and outputs the sum of the inputs; computes and outputs the difference between the sum and the data at each position; and clips values that are too large.

Data output control module

A multiplexer (MUX) that, according to the input data selection signal supplied by the data conversion and control module, outputs the 4 decoded frames continuously in turn.
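A behavioural sketch of the sum, difference and clipping operations in Python follows. The symmetric clipping range of -63 to 63 for a 7-bit word and the reading of the 5 inputs as four check-node messages plus one channel value are assumptions made for illustration; the patent text does not spell them out, and the function names are hypothetical.

```python
def saturate(value, bits=7):
    """Clip to the symmetric range of a signed word (assumed -63..63 for 7 bits)."""
    limit = (1 << (bits - 1)) - 1
    return max(-limit, min(limit, value))


def variable_node(inputs, bits=7):
    """Return (clipped sum, clipped differences sum - input[k]) for the given inputs.

    A difference is produced for every input position; under the assumption that
    one input is the channel value, its difference would simply go unused.
    """
    total = sum(inputs)
    outputs = [saturate(total - x, bits) for x in inputs]
    return saturate(total, bits), outputs


print(variable_node([1, 2, 3, 4, 5]))
print(variable_node([10, -3, 20, 40, 30]))
```

The first call stays within range and returns the sum 15 with differences 14, 13, 12, 11, 10. In the second call the raw sum is 97, so the sum and all differences except 97 - 40 = 57 are clipped to 63.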

Fig. 4 shows the workflow of the device. The synchronous data stream output by the demodulator ahead of the decoder can be single-bit (hard-decision decoding) or multi-bit quantized (soft-decision decoding). After the data enter the decoder, the timing control module first converts the data format, writing the true form (sign-magnitude) of the raw (encoded) data to the intermediate data registers and the two's complement form to the iteration result registers. In accordance with the structure of the check matrix recommended by CCSDS, one frame of data is stored in 32 intermediate data registers and 16 iteration result registers. Once all data have been written, the addresses of the intermediate data registers and iteration result registers are initialized, and the data at the corresponding addresses of the 32 intermediate data registers are read simultaneously into the check-node computing module; when the check-node operation finishes, its result overwrites the previously input data. With two groups of check-node computing modules working at the same time, check-node operations are performed on 1022 groups of data in total, completing one check-node update of the frame. The addresses of the 32 intermediate data registers and 16 iteration result registers are then initialized again, and the data at the corresponding addresses of these registers are read simultaneously into the variable-node computing modules; when the variable-node operation finishes, its result overwrites the previously input data. With 16 variable-node computing modules working at the same time, variable-node operations are performed on 8176 groups of data in total, completing one variable-node update of the frame.
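The format conversion mentioned in the workflow can be illustrated with two small helpers, reading the translated terms "true form" and "complement code" as sign-magnitude and two's complement respectively. That reading, the 7-bit width applied here, and the function names are assumptions for the sake of the example, written in Python.

```python
BITS = 7  # word width used for node data in the text


def to_sign_magnitude(value, bits=BITS):
    """Signed integer -> sign-magnitude word (top bit = sign, rest = magnitude)."""
    limit = (1 << (bits - 1)) - 1
    magnitude = min(abs(value), limit)            # magnitude is clipped to 6 bits
    sign_bit = (1 << (bits - 1)) if value < 0 else 0
    return sign_bit | magnitude


def to_twos_complement(value, bits=BITS):
    """Signed integer (assumed within -64..63) -> two's-complement word."""
    return value & ((1 << bits) - 1)


for v in (5, -5):
    print(v, format(to_sign_magnitude(v), "07b"), format(to_twos_complement(v), "07b"))
```

For -5 this prints 1000101 as the sign-magnitude word and 1111011 as the two's-complement word. A sign-magnitude copy suits the magnitude comparisons of the check-node operation, while a two's-complement copy suits the additions of the variable-node operation.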

The above is one complete iteration on one frame of data; after 10 such iterations, decoding of the frame is finished. To reach higher speed, parallelism, buffering, and pipelining are combined. When the first frame of raw data has been read in and its processing begins, the second frame of raw data starts to be read into a second group of 32 intermediate data registers and 16 iteration result registers; the third and fourth frames are read in the same way. That is, during input the first to fourth frames enter the pipeline in turn, each occupying its own group of intermediate data registers and iteration result registers. While the first frame is being decoded, the other three frames are, from the viewpoint of the operations on the first frame, in effect buffered in the intermediate data registers and iteration result registers. Viewed over the cyclic processing of the whole data stream, the decoding of any frame corresponds to the buffering of three other frames. The second, third, and fourth frames then enter the iterative decoding pipeline in turn. When the data stream arrives continuously, not only are the check-node and variable-node operations within each frame parallel, but processing is also parallel between frames: at any moment one frame is in the input/output state and the other three are in the iterative decoding state. In terms of timing control, the processing of each frame is governed by a state machine with about 50 states in total from raw data input to output of the decoding result; with cyclic decoding over a period of four frames, processing a continuous data stream passes through 200 state combinations.

The parts of the invention not described in detail belong to the common knowledge of those skilled in the art.

Claims

1. A high-speed decoding method for LDPC codes and shortened codes thereof, characterized in that it is realized by the following steps:

(1) The i-th data frame of the data to be decoded is examined:

2. The high-speed decoding method for LDPC codes and shortened codes thereof according to claim 1, characterized in that: the iterative check-node and variable-node operations in step (3) are carried out by a check-node computing module and a variable-node computing module respectively, and these modules can be multiplexed among the continuously arriving data frames.

3. The high-speed decoding method for LDPC codes and shortened codes thereof according to claim 1, characterized in that: the number of data frames operated on simultaneously is 4; after the operations on the i-th data frame are finished, the (i+4)-th input data frame reuses the output data register, intermediate data register, and iteration result data register that served the i-th data frame.