Background technology
In a communication system, redundancy is added to the data stream according to a certain rule before the information is sent, so that the receiving end can detect and correct errors and estimate the data originally transmitted. Iteratively decoded product codes are one such method that performs well and is easy to implement.
A product code is a long block code built from two short block codes, C1 and C2. A short block code appends N-K check bits to a sequence of K information bits to form a codeword of N bits. The coding parameters of code C1 are (N1, K1) and those of code C2 are (N2, K2), where Ni and Ki denote the code length and the number of information bits, respectively. Encoding, shown in Fig. 1, proceeds as follows:
1. Arrange the information bits a_0, a_1, a_2, ..., a_(K1×K2-1) into a K1 × K2 matrix of K1 rows and K2 columns;
2. Encode the K1 rows one by one with C2; N2-K2 check bits are appended to each row, giving a K1 × N2 matrix;
3. Encode the N2 columns one by one with C1; N1-K1 check bits are appended to each column, giving an N1 × N2 matrix.
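For illustration only, the three encoding steps can be sketched in Python. The component code used here is a single even-parity bit, chosen purely as a stand-in for C1 and C2 (the text does not specify the component codes), and the function names are hypothetical.

```python
def spc_encode(bits):
    """Hypothetical component code: append one even-parity bit, giving a (K+1, K) code."""
    return bits + [sum(bits) % 2]


def product_encode(info, k1, k2, encode_c1=spc_encode, encode_c2=spc_encode):
    """Encode k1*k2 information bits into an N1 x N2 product code matrix."""
    assert len(info) == k1 * k2
    # Step 1: arrange the information bits into a K1 x K2 matrix, row by row.
    rows = [info[r * k2:(r + 1) * k2] for r in range(k1)]
    # Step 2: encode every row with C2, giving a K1 x N2 matrix.
    rows = [encode_c2(row) for row in rows]
    n2 = len(rows[0])
    # Step 3: encode every column with C1, giving an N1 x N2 matrix.
    cols = [encode_c1([rows[r][c] for r in range(k1)]) for c in range(n2)]
    n1 = len(cols[0])
    # Transpose the encoded columns back into row-major form.
    return [[cols[c][r] for c in range(n2)] for r in range(n1)]


codeword = product_encode([1, 0, 1, 1, 1, 0], k1=2, k2=3)
for row in codeword:
    print(row)
```

With these stand-in codes every row and every column of the result has even parity, which is the property that the row and column decoders rely on.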
After encoding, every row of the product code matrix is a codeword of code C2 and every column is a codeword of code C1. The decoding algorithm is built around the soft-input soft-output decoding algorithms of the column code C1 and the row code C2, and decodes the product code matrix iteratively, row by row and column by column, many times.
A single decoding iteration recomputes, from the received sequence and the result of the previous computation, the likelihood ratio estimate of the binary decision for each symbol (the probability of 1 divided by the probability of 0), and thereby obtains a correction to the reliability of the received sequence, i.e. the extrinsic information. As shown in Fig. 2, iterative decoding of a product code decodes column by column and then row by row, then again column by column and row by row, and so on many times. On each pass the intermediate result, the extrinsic information, is passed between the row decoder and the column decoder, so that the final result approaches the correct result more and more closely. Product code decoding must store and repeatedly read the received symbol sequence, and must also repeatedly read and update the extrinsic information.
Iterative decoding severely limits decoding speed. For example, with the traditional serial decoding method, a decoder that performs 4 iterations needs 8 clock cycles to decode 1 bit, where one iteration comprises one column-by-column decoding and one row-by-row decoding. At an operating frequency of 50 MHz, the throughput is then about 6 Mbit/s. Parallel processing is a key way to raise the decoding speed. One existing parallel decoding method, shown in Fig. 3, performs the row-by-row and column-by-column decoding simultaneously. For a product code with a two-dimensional matrix, however, the parallelism of this method can only be 2, so the speed can at most be doubled, and two sets of intermediate decoding results must be stored, which increases the memory requirement.
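The throughput figure above can be checked with a short calculation. This is a rough sketch that assumes one clock cycle per bit for each column pass and each row pass (which reproduces the 8 cycles per bit quoted for 4 iterations) and ideal scaling with the parallelism; the function name is hypothetical.

```python
def throughput_bps(clock_hz, iterations, parallelism=1):
    """Estimated decoded bits per second.

    Assumption: each iteration costs 2 clock cycles per bit in the serial
    decoder (one column pass plus one row pass), and parallel decoders
    speed this up ideally.
    """
    cycles_per_bit = 2 * iterations
    return clock_hz * parallelism / cycles_per_bit


serial = throughput_bps(50e6, iterations=4)
doubled = throughput_bps(50e6, iterations=4, parallelism=2)
print(f"serial: {serial / 1e6:.2f} Mbit/s, parallelism 2: {doubled / 1e6:.2f} Mbit/s")
```

Under these assumptions the serial decoder reaches 6.25 Mbit/s, and a parallelism of 2 reaches 12.5 Mbit/s.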
Summary of the invention
The object of the present invention is to provide a method and a decoding device for increasing product code decoding speed that both raise the speed of the decoder and keep the hardware resources used as small as possible.
In a product code, every row and every column independently forms a short codeword, and the numbers of rows and columns of the matrix are fairly large, so the code is well suited to parallel decoding, that is, to computing the extrinsic information of the likelihood ratios of the symbols of several rows or several columns at the same time. The basic block diagram is similar to that of traditional serial decoding; the difference is that the row-by-row and column-by-column decoder is replaced with a P-row or P-column parallel decoder. The block diagram of a single iteration is shown in Fig. 4.
The P-row or P-column parallel decoder consists of P row decoders and P column decoders working in parallel, which takes more hardware resources. If the rows and columns use the same coding parameters, row decoding and column decoding can share the same decoders, which reduces the number of decoders. In addition, parallel decoding requires the memory to be divided into several small memories to support parallel read and write operations. The method keeps the number of memories required as small as possible while leaving the total storage unchanged: the number of memories is P times the original, and the capacity of each memory is 1/P of the original.
The method proposed by the invention is characterized in that it is a decoding method that uses a P-row or P-column decoder, formed by P row decoders working in parallel and P column decoders working in parallel, to process P rows or P columns of the product code at a time, in parallel.
The decoding method comprises the following steps in order:
(a) Store the log-likelihood values of the received signal in the N × N memory cells of the received-signal memory R1. The RAM number of each value within R1 is obtained as follows: let M=N/P; take the difference between the row number and the column number of the value in the matrix, divide it by M and round toward zero, then take the remainder modulo P; the result is the RAM number;
(b) Each time, add P columns of R1 to the P columns at the corresponding positions of the extrinsic information memory R2 to obtain P sum vectors, decode them with P soft-input soft-output (SISO) sub-decoders, and store the output of each sub-decoder, i.e. each extrinsic information vector, in the corresponding column of R2; the initial value of the extrinsic information is 0;
(c) Repeat step (b) M times to obtain the extrinsic information of all bits in the product code;
(d) Add P rows of R1 to the P rows at the corresponding positions of R2 to obtain P sum vectors, send them to the P sub-decoders, and store the P extrinsic information vectors obtained in the P corresponding rows of R2;
(e) Repeat step (d) M times to obtain the extrinsic information of all bits in the product code;
(f) Repeat steps (b), (c), (d) and (e) several times, and finally output the decision results for all information bits.
The decoding device proposed by the invention is characterized in that it comprises: a RAM read/write controller that collects the received signal; a multilayer counter formed by a vector element counter, a vector counter and an iteration counter connected in series, each of which outputs its data to the RAM read/write controller; P sub-decoders that are controlled by the RAM read/write controller and process P rows or P columns of information in parallel; and a received-signal memory and an extrinsic information memory, each containing P RAMs, which under the control of the RAM read/write controller each receive read/write control signals and write data and output read data.
Embodiment
Assume the product code matrix has N rows and N columns and the parallelism is P, where P divides N. The decoder needs two groups of RAM, R1 and R2; each group contains N × N memory cells corresponding to the product code matrix. R1 stores the log-likelihood values of the received signal, and R2 stores the extrinsic information. For parallel processing, each group of RAM consists of P small RAMs; each small RAM is logically divided into P parts, and each part has a depth of M × M, viewed as M rows by M columns, where M=N/P. A 64 × 64 product code with a decoder of P=8 is used below as the example.
The code matrix of this product code is 64 × 64, so each group of RAM contains 64 × 64 memory cells. Because the parallelism is 8, each group of RAM is divided into 8 small RAMs. Each small RAM is physically an independent memory with a depth of 512 (= 8 × 8 × 8) and a 9-bit address. Logically, each small RAM is viewed as 8 parts of 8 × 8 memory cells each. The structure of the RAM is shown in Fig. 7. The left diagram shows one group of RAM (R1 or R2), with each cell representing one part. Starting from each of the 8 parts of the first column (the 8 cells numbered 0 to 7), a diagonal is drawn toward the lower right; when a diagonal reaches the last row it continues from the first row of the next column, and it ends at the last column. The 8 parts on each diagonal belong to the same small RAM, and the integer marked on each cell indicates which small RAM it belongs to. There are 8 such small RAMs in total, numbered RAM_num from 0 to 7. The 8 parts within a small RAM are ordered by the column they lie in, from small to large; that is, the high 3 address bits of a part equal its column number: in the left diagram of Fig. 7, the high 3 address bits of the 8 parts in column 0 are all 0, and those of the 8 parts in column 1 are all 1. The corresponding address ranges are shown in the right diagram of Fig. 7: the part labelled 3.0 occupies addresses 0 to 63, and the part labelled 3.1 occupies addresses 64 to 127. Each part corresponds to an 8 × 8 square block of the product code matrix. Within a part, the addresses run down one column and then on to the next column, i.e. the middle 3 bits of the 9-bit address give the column number and the low 3 bits give the row number. The figure gives the internal address layout of parts 3.0 and 3.7.
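As a rough illustration of this organization, the following Python sketch prints which small RAM each 8 × 8 part belongs to. It assumes the block-level rule RAM_num = (row block - column block) mod 8, which is what the diagonal layout and the address relations given later imply; the function names are hypothetical.

```python
N, P = 64, 8          # matrix size and parallelism of the example
M = N // P            # each part is an M x M block


def part_ram(row_block, col_block):
    """Small RAM holding the part at (row_block, col_block); assumed diagonal rule."""
    return (row_block - col_block) % P


def part_address_range(col_block):
    """First and last address of a part: the high 3 address bits equal its column."""
    base = col_block * M * M
    return base, base + M * M - 1


def cell_ram(row, col):
    """Small RAM holding matrix cell (row, col)."""
    return part_ram(row // M, col // M)


grid = [[part_ram(rb, cb) for cb in range(P)] for rb in range(P)]
for line in grid:
    print(" ".join(str(x) for x in line))
```

Each row and each column of the printed grid contains every RAM number exactly once, so the 8 parts touched by any group of 8 parallel accesses lie in 8 different RAMs.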
There are 8 decoders, numbered Dec_num from 0 to 7. Because the parallelism is 8 and the code length of the block code is 64, with 64 rows and 64 columns, each decoder needs to process 8 vectors of 64 elements each to complete one pass of column-by-column or row-by-row decoding. The vector number is Vec_num (3 bits wide) and the element number is Ele_num (6 bits wide). Each sub-decoder reads a vector sequentially from R1 and R2 and stores it in its internal buffer; after the same processing time, each writes the updated extrinsic information sequentially back to R2. The 8 elements read or written at the same time therefore have the same element number Ele_num.
In column-by-column decoding, the data in row 0 of columns 0, 8, 16, 24, 32, 40, 48 and 56 of the matrix are read first, with counter Vec_num at 0 and Ele_num at 0. The data in rows 1, 2, ..., 63 of these 8 columns are then read in turn; Vec_num stays at 0 and Ele_num increments through 1, 2, ..., 63. The data of column 0 are processed by decoder 0, the data of column 8 by decoder 1, ..., and the data of column 56 by decoder 7. After these 8 columns, the data of columns 1, 9, 17, 25, 33, 41, 49 and 57 are processed, with Vec_num at 1 and Ele_num running through 0, 1, 2, ..., 63. Likewise, the data of column 1 are processed by decoder 0, the data of column 9 by decoder 1, ..., and the data of column 57 by decoder 7. Vec_num runs from 0 to 7. It follows that decoder 0 processes columns 0 to 7 in turn, decoder 1 processes columns 8 to 15 in turn, and decoder 7 processes columns 56 to 63 in turn. The rule is that each pass processes the 8 columns numbered 8 × Dec_num + Vec_num (Dec_num = 0, 1, 2, ..., 7), column 8 × Dec_num + Vec_num being handled by decoder Dec_num, with the element number Ele_num within a column running through 0, 1, 2, ..., 63. Once Vec_num has taken the values 0, 1, 2, ..., 7, the column-by-column decoding of the whole matrix is complete.
In row-by-row decoding, the data in column 0 of rows 0, 8, 16, 24, 32, 40, 48 and 56 of the matrix are read first, with counter Vec_num at 0 and Ele_num at 0. The data in columns 1, 2, ..., 63 of these 8 rows are then read in turn; Vec_num stays at 0 and Ele_num increments through 1, 2, ..., 63. The data of row 0 are processed by decoder 0, the data of row 8 by decoder 1, ..., and the data of row 56 by decoder 7. After these 8 rows, the data of rows 1, 9, 17, 25, 33, 41, 49 and 57 are processed, with Vec_num at 1 and Ele_num running through 0, 1, 2, ..., 63. Likewise, the data of row 1 are processed by decoder 0, the data of row 9 by decoder 1, ..., and the data of row 57 by decoder 7. Vec_num runs from 0 to 7. Each pass processes the 8 rows numbered 8 × Dec_num + Vec_num (Dec_num = 0, 1, 2, ..., 7), row 8 × Dec_num + Vec_num being handled by decoder Dec_num, with the element number Ele_num within a row running through 0, 1, 2, ..., 63. Once Vec_num has taken the values 0, 1, 2, ..., 7, the row-by-row decoding of the whole matrix is complete.
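The assignment of vectors to decoders described above can be tabulated with a few lines of Python. This is only a sketch of the rule 8 × Dec_num + Vec_num; the function name is hypothetical, and the same index applies to columns in column-by-column decoding and to rows in row-by-row decoding.

```python
N, P = 64, 8
M = N // P  # vectors handled by each decoder in one pass


def vector_index(dec_num, vec_num):
    """Row or column handled by decoder dec_num when the vector counter is vec_num."""
    return M * dec_num + vec_num


for vec_num in range(M):
    group = [vector_index(dec_num, vec_num) for dec_num in range(P)]
    print(f"Vec_num={vec_num}: decoders 0..7 handle vectors {group}")
```

Over the 8 values of Vec_num every one of the 64 vectors is visited exactly once, and decoder Dec_num always works within the band 8 × Dec_num to 8 × Dec_num + 7.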
It follows that, in column-by-column decoding, if the small RAM numbered RAM_num exchanges data with the decoder numbered Dec_num, the read/write address and the numbers satisfy the following relations:
addr = (Ele_num(5 downto 3) - RAM_num) & Vec_num & Ele_num(2 downto 0);
RAM_num + Dec_num = Ele_num(5 downto 3);
(Addition and subtraction here are modulo 8; "m downto n" denotes bits m down to n of the binary value; "&" denotes concatenation of binary numbers, for example 1 & 2 = "01" & "10" = "0110" = 6.)
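The notation can be mirrored in Python as a small sketch. The helper names `bits` and `concat` are hypothetical; `bits` models the bit-slice "m downto n", `concat` models binary concatenation of fields of given widths, and Python's `%` operator supplies the modulo-8 arithmetic.

```python
def bits(value, high, low):
    """Return value(high downto low) as an integer."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


def concat(*fields):
    """Concatenate (value, width) fields, most significant field first."""
    result = 0
    for value, width in fields:
        result = (result << width) | (value & ((1 << width) - 1))
    return result


print(concat((1, 2), (2, 2)))   # "01" & "10"
print(bits(37, 5, 3), bits(37, 2, 0))
print((4 - 6) % 8)              # subtraction modulo 8
```

Note that in Python the result of `%` with a positive modulus is never negative, so `(4 - 6) % 8` already gives the modulo-8 value required by the relations.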
In row-by-row decoding, if the small RAM numbered RAM_num exchanges data with the decoder numbered Dec_num, the read/write address and the numbers satisfy the following relations:
addr = Ele_num & Vec_num;
Dec_num - RAM_num = Ele_num(5 downto 3);
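The two sets of relations can be cross-checked in Python. The sketch below is written for the 64 × 64, P=8 example only (all fields are 3 bits wide), and the function names are hypothetical. It verifies that both decoding modes reach every matrix cell at the same RAM and address, and that the 8 decoders never access the same RAM in the same cycle.

```python
N, P = 64, 8
M = N // P  # 8, so every address field below is 3 bits wide


def column_mode(dec_num, vec_num, ele_num):
    """(RAM_num, addr) used by decoder dec_num in column-by-column decoding."""
    ele_hi, ele_lo = ele_num >> 3, ele_num & 7
    ram_num = (ele_hi - dec_num) % P            # RAM_num + Dec_num = Ele_num(5 downto 3)
    addr = (((ele_hi - ram_num) % P) << 6) | (vec_num << 3) | ele_lo
    return ram_num, addr


def row_mode(dec_num, vec_num, ele_num):
    """(RAM_num, addr) used by decoder dec_num in row-by-row decoding."""
    ram_num = (dec_num - (ele_num >> 3)) % P    # Dec_num - RAM_num = Ele_num(5 downto 3)
    addr = (ele_num << 3) | vec_num             # Ele_num & Vec_num
    return ram_num, addr


def check():
    """Return True if the mapping is consistent, one-to-one and conflict-free."""
    seen = set()
    for row in range(N):
        for col in range(N):
            by_col = column_mode(col // M, col % M, row)
            by_row = row_mode(row // M, row % M, col)
            assert by_col == by_row             # same cell, same storage location
            seen.add(by_col)
    assert len(seen) == N * N                   # no two cells share a location
    for vec_num in range(M):
        for ele_num in range(N):
            assert len({column_mode(d, vec_num, ele_num)[0] for d in range(P)}) == P
            assert len({row_mode(d, vec_num, ele_num)[0] for d in range(P)}) == P
    return True


print(check())
print(column_mode(55 // M, 55 % M, 37))  # the cell in row 37, column 55
```

For the cell in row 37, column 55, both modes give RAM 6 and address 445 (binary 110 111 101: column block 6, column 7 within the block, row 5 within the block).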
The RAM read/write controller uses the values of the individual counters and the above algorithm to generate the addresses and enable signals for reading and writing the RAMs, routes the 8 input data streams to the data ports of the corresponding RAMs, and routes the 8 data streams read from the 8 RAMs to the data ports of the 8 sub-decoders. Controlling data storage in this way minimizes the number of RAMs.
In summary, the decoding procedure at the receiving end is as follows:
A) Store the log-likelihood values of the received signal in the 64 × 64 memory cells of R1. To find the RAM number, take the difference between the row number and the column number of the value in the matrix, divide by 8 and round toward zero ([/8]), then take the remainder modulo 8 (%8). For example, if the received signal lies in row 37, column 55, then 37-55=-18, [-18/8]=-2 and (-2)%8=6, so it is stored in RAM 6; if it lies in row 55, column 37, then 55-37=18, [18/8]=2 and 2%8=2, so it is stored in RAM 2.
B) Each time, add 8 columns of R1 to the 8 columns at the corresponding positions of R2 to obtain 8 sum vectors. These 8 vectors are decoded by 8 soft-input soft-output (SISO) sub-decoders. The output of the i-th decoder, i.e. the i-th extrinsic information vector, is stored in the corresponding i-th column of R2. Note that the initial value of the extrinsic information is 0, so in the first decoding pass only the 8 columns of R1 are sent to the sub-decoders.
C) Repeat step B 8 times to obtain the extrinsic information of all bits in the product code.
D) Add 8 rows of R1 to the 8 rows at the corresponding positions of R2 to obtain 8 sum vectors, and send them to the 8 sub-decoders for decoding. Store the 8 extrinsic information vectors obtained in the 8 corresponding rows of R2.
E) Repeat step D 8 times to obtain the extrinsic information of all bits in the product code.
F) Repeat steps B, C, D and E 4 times, and finally output the decision results for all information bits.
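The control flow of these steps can be sketched in Python as follows. This is a software model under stated assumptions, not the hardware design: the P sub-decoders are simulated by an inner loop, `siso` is a caller-supplied placeholder for the SISO sub-decoder (its algorithm is not specified here), and the final hard decision on the sign of R1 + R2 is an assumption based on the likelihood ratio being defined as the probability of 1 over the probability of 0.

```python
def decode(r1, siso, parallelism, iterations):
    """Run the column/row schedule on an N x N matrix of log-likelihood values.

    r1   : list of N rows, each a list of N received log-likelihood values
    siso : hypothetical callable mapping a soft-input vector to an
           extrinsic information vector of the same length
    """
    n = len(r1)
    m = n // parallelism
    r2 = [[0.0] * n for _ in range(n)]      # extrinsic information starts at 0
    for _ in range(iterations):
        # Column passes: m steps, each covering one column per sub-decoder.
        for vec in range(m):
            for dec in range(parallelism):  # parallel in hardware
                c = m * dec + vec
                soft = [r1[r][c] + r2[r][c] for r in range(n)]
                ext = siso(soft)
                for r in range(n):
                    r2[r][c] = ext[r]
        # Row passes: m steps, each covering one row per sub-decoder.
        for vec in range(m):
            for dec in range(parallelism):  # parallel in hardware
                r = m * dec + vec
                soft = [r1[r][c] + r2[r][c] for c in range(n)]
                r2[r] = list(siso(soft))
    # Assumed decision rule: a positive log-likelihood value means bit 1.
    return [[1 if r1[r][c] + r2[r][c] > 0 else 0 for c in range(n)]
            for r in range(n)]


# Usage with a trivial stand-in SISO that returns no extrinsic information.
received = [[1.0 if (r + c) % 2 else -1.0 for c in range(4)] for r in range(4)]
decisions = decode(received, lambda v: [0.0] * len(v), parallelism=2, iterations=4)
print(decisions)
```

With the stand-in SISO the decisions simply follow the signs of the received values; a real sub-decoder would return non-zero extrinsic information that refines them from pass to pass.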
A decoder with a parallelism of 8 is 8 times as fast as serial decoding. Additional hardware resources are needed to implement the 8 sub-decoders, including the buffers inside the sub-decoders. The capacity of R1 and R2, which store the received signal and the extrinsic information of the whole product code matrix, is unchanged; each is merely divided into 8 small RAMs. If the parallelism is too high, the area of a hardware implementation becomes very large, which is highly undesirable; if it is too low, processing is slow and suits only low-speed decoders. The parallelism should therefore be chosen as a compromise between speed and area. On this basis, a parallelism of 8 is adopted in the concrete implementation.
The decoding device is described below:
The decoding device, i.e. the product code decoder, comprises sub-decoders that process P rows or P columns in parallel, a received-signal memory, an extrinsic information memory, a RAM read/write controller and a multilayer counter.
The sub-decoders perform the row-by-row or column-by-column decoding, computing and updating the extrinsic information of each symbol from the received signal and the result of the previous iteration. The P sub-decoders process P rows or P columns in parallel, handling P elements in parallel in each clock cycle. The decoders work in a pipeline: while the extrinsic information just obtained is being output, the next group of row (or column) vectors is read in and processed. For an N × N product code matrix, each decoder processes N/P rows or columns to cover the whole matrix. One iteration comprises one row-by-row decoding and one column-by-column decoding; the number of iterations is chosen as a compromise between decoding speed and error-correcting performance. Each sub-decoder contains a buffer for the vector it is processing, implemented with dual-port memory.
The received-signal memory stores the received signal and comprises P RAMs. Each time a signal is received, a RAM is selected for it, and it is stored there, on the principle that the P data items processed simultaneously during decoding must not lie in the same RAM.
The extrinsic information memory stores the extrinsic information and comprises P RAMs. In the first iteration the extrinsic information is 0 (no extrinsic information needs to be read). In each iteration the received signal and the corresponding extrinsic information are read, and after processing the updated extrinsic information is written back. Because each extrinsic information value is read from the same position as the received signal, the extrinsic information RAM is organized in the same way as the received-signal RAM.
The multilayer counter comprises three counters. The vector element counter counts sequentially through the N elements of a row or column while the sub-decoders decode row by row or column by column. The vector counter counts sequentially through the N/P row vectors (or column vectors) handled by each sub-decoder in one pass. The iteration counter counts the iterations; one row-by-row decoding or one column-by-column decoding counts as half an iteration.
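The cascade of the three counters can be modelled in Python as nested loops, with the element counter innermost. This is only a behavioural sketch that ignores pipeline latency; the generator name is hypothetical, and the iteration counter is represented in half-iteration units.

```python
def multilayer_counter(n, parallelism, iterations):
    """Yield (half_iteration, vec_num, ele_num) in the order the counters advance."""
    for half_iteration in range(2 * iterations):    # iteration counter, in half steps
        for vec_num in range(n // parallelism):     # vector counter
            for ele_num in range(n):                # vector element counter
                yield half_iteration, vec_num, ele_num


states = list(multilayer_counter(n=64, parallelism=8, iterations=4))
print(len(states), states[0], states[64], states[-1])
```

For the 64 × 64, P=8 example with 4 iterations the model produces 4096 counter states for the 4096 symbols of the matrix, i.e. one state per symbol, compared with the 8 clock cycles per bit of the serial decoder.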
The RAM read/write controller generates the addresses of the RAMs to be read and written from the three counters above and the read/write mode of the decoder, and controls the exchange of data between the P RAMs and the P sub-decoders.
Fig. 5 is the overall diagram of the P-row P-column parallel decoder.
Fig. 6 is the program flow chart of the decoding method used by the P-row P-column parallel decoder for a 64 × 64 product code with P=8.
Fig. 7 shows the organization of the RAMs and the allocation of data storage addresses for a 64 × 64 product code with parallelism P=8, implemented by the data storage part of Fig. 5.
The method can be implemented in various programmable logic devices. The concrete implementation builds the product code decoder on a Xilinx Virtex-E XCV600E HQ240-6 chip; it can also be realized as an application-specific integrated circuit.