Summary of the invention
The present invention discloses a high-speed, low-latency BM (Berlekamp-Massey) iterative decoding circuit. Its purpose is to solve the problem that, in the BM iterative decoding stage of a conventional BCH decoder, speed drops and latency grows as the number of correctable bits increases.
To achieve this purpose, the invention adopts a syndrome calculation and processing circuit and a parallel iterative decoding circuit, which reduce the computation of the BCH decoding algorithm and accelerate decoding, and which reduce the number of clock cycles and logic gates required by conventional BM iterative decoding. The result is a BM iterative decoding circuit for high-speed NAND Flash memory devices with strong error-correcting capability, low latency, and high error-correction data throughput.
The invention provides a high-speed, low-latency BM iterative decoding circuit for a BCH decoder, comprising an odd syndrome calculation circuit (102), a one-by-one even-syndrome calculation and syndrome ordering circuit (104), and a parallel iterative decoding circuit (106);
The odd syndrome calculation circuit (102) receives BCH-encoded input data and calculates the odd syndromes of the BCH-encoded input data;
The one-by-one even-syndrome calculation and syndrome ordering circuit (104), coupled to the output of the odd syndrome calculation circuit (102), calculates the even syndromes of the BCH-encoded input data and outputs the calculated odd and even syndromes to the parallel iterative decoding circuit (106). When the BCH-encoded input data can correct t bit errors, then in the j-th of the 1st to (t-1)-th loops of the circuit (104): when $j \le t/2$, the syndromes $S_{2j+1}, S_{2j}, S_{2j-1}, \dots, S_1$ are output; when $j > t/2$, the t+1 syndromes $S_{2j+1}, S_{2j}, S_{2j-1}, \dots, S_{2j-t+2}, S_{2j-t+1}$ are output; if the syndromes $S_{2j+1}, S_{2j}, S_{2j-1}, \dots, S_1$ number fewer than t+1, the remaining outputs take arbitrary values;
Each of the t-1 loops comprises k cycles; in each cycle, p syndromes are output in sequence in descending index order, where k is a positive integer and p*k = t+1;
The parallel iterative decoding circuit (106) calculates the error location polynomial coefficients from the syndromes output by the one-by-one even-syndrome calculation and syndrome ordering circuit (104), based on an inversion-free BM algorithm;
The parallel iterative decoding circuit (106) comprises a first multiplier group (502), a second multiplier group (508), a third multiplier group (509), a multi-input adder (503), an adder group (510), an error polynomial register (511), an error location polynomial buffer (512), an auxiliary polynomial buffer (513), a non-zero difference register (507) and an iteration difference register (504); the multipliers and adders are GF-field arithmetic units;
Corresponding to the t-1 loops of the circuit (104), the parallel iterative decoding circuit (106) performs t-1 iterations, and additionally performs a t-th iteration. In each of the k cycles of each iteration: the first multiplier group (502) multiplies the outputs of the circuit (104) by the values stored in the error polynomial register (511); the multi-input adder (503) sums the p products output by the first multiplier group (502) together with the result that the multi-input adder (503) computed in the previous cycle; the third multiplier group (509) multiplies each of p values from the error location polynomial buffer (512) by the value of the non-zero difference register (507); the second multiplier group (508) multiplies each of p values from the auxiliary polynomial buffer (513) by the value of the iteration difference register (504); the adder group (510) adds the outputs of the second multiplier group (508) and the third multiplier group (509) to obtain p sums, which are output to the error polynomial register (511); and it is judged whether the iteration difference register (504) equals zero or whether the degree of the error location polynomial is greater than the iteration count j:
If the iteration difference register (504) is non-zero and the degree of the error location polynomial is not greater than the iteration count j, the value of the iteration difference register (504) is stored in the non-zero difference register (507), the output of the error polynomial register (511) is stored in the error location polynomial buffer (512), and the error location polynomial buffer (512) outputs the error location polynomial calculated in the previous cycle to the auxiliary polynomial buffer (513);
If the iteration difference register (504) equals zero or the degree of the error location polynomial is greater than the iteration count j, neither the value in the non-zero difference register (507) nor the auxiliary polynomial buffer (513) is updated;
And, in the last cycle of each iteration, the iteration difference register (504) is updated with the output value of the multi-input adder (503).
The invention also discloses a squaring circuit for the field GF($2^{13}$), comprising 13 signal inputs and 13 signal outputs:
The 1st and 12th signal inputs are XORed, and the result is output on the 1st signal output;
The 8th, 12th and 13th signal inputs are XORed, and the result is output on the 2nd signal output;
The 2nd and 8th signal inputs are XORed, and the result is output on the 3rd signal output;
The 9th, 12th and 13th signal inputs are XORed, and the result is output on the 4th signal output;
The 3rd, 8th, 9th, 12th and 13th signal inputs are XORed, and the result is output on the 5th signal output;
The 8th and 10th signal inputs are XORed, and the result is output on the 6th signal output;
The 4th, 9th, 10th and 13th signal inputs are XORed, and the result is output on the 7th signal output;
The 9th and 11th signal inputs are XORed, and the result is output on the 8th signal output;
The 5th, 10th and 11th signal inputs are XORed, and the result is output on the 9th signal output;
The 10th and 12th signal inputs are XORed, and the result is output on the 10th signal output;
The 6th, 11th and 12th signal inputs are XORed, and the result is output on the 11th signal output;
The 11th and 13th signal inputs are XORed, and the result is output on the 12th signal output;
The 7th, 12th and 13th signal inputs are XORed, and the result is output on the 13th signal output.
The invention also discloses a high-speed BM iterative decoder for a BCH decoder, comprising an odd syndrome calculation circuit (102), a one-by-one even-syndrome calculation and syndrome ordering circuit (104), and a parallel iterative decoding circuit (106);
The odd syndrome calculation circuit (102) receives BCH-encoded input data and calculates the odd syndromes of the BCH-encoded input data;
The one-by-one even-syndrome calculation and syndrome ordering circuit (104), coupled to the output of the odd syndrome calculation circuit (102), calculates the even syndromes of the BCH-encoded input data when the odd syndromes of that data are not all 0, and outputs the calculated odd and even syndromes to the parallel iterative decoding circuit (106);
The one-by-one even-syndrome calculation and syndrome ordering circuit (104) comprises 2t-2 first ordering register units $RS_i$ ($i \ne p-1$) and one second ordering register unit $RS_{p-1}$, where p denotes the degree of parallelism of the circuit (104) and t is the error-correcting capability of the BM iterative decoding circuit; each of the 2t-2 first ordering register units $RS_i$ ($i \ne p-1$) receives the outputs of ordering register units $RS_{i-p}$ and $RS_{i+t+3-p}$; the second ordering register unit $RS_{p-1}$ receives the outputs of the first ordering register unit $RS_{2t-2}$ and of ordering register units $RS_t, RS_{t-1}, RS_{t-2}, \dots, RS_3$, as well as the odd syndrome $S_1$ from the odd syndrome calculation circuit (102);
In the first cycle of each of the t-1 loops of the circuit (104), the output of ordering register unit $RS_{i+t+3-p}$ is selected as the output of the first ordering register unit $RS_i$ ($i \ne p-1$); in the other k-1 cycles of each loop, the output of ordering register unit $RS_{i-p}$ is selected as the output of the first ordering register unit $RS_i$ ($i \ne p-1$);
In the first cycle of each of the t loops, corresponding to the loop number, the square of the syndrome $S_1$ and of the outputs of ordering register units $RS_t, RS_{t-1}, RS_{t-2}, \dots, RS_4, RS_3$ is calculated in turn and used as the output of the second ordering register unit $RS_{p-1}$; in the other k-1 cycles of each loop, the output of the first ordering register unit $RS_{2t-2}$ is selected as the output of the second ordering register unit $RS_{p-1}$;
The outputs of ordering register units $RS_1, RS_2, \dots, RS_{p-1}, RS_p$ serve as the output of the one-by-one even-syndrome calculation and syndrome ordering circuit (104), with k*p = t+1, where k and p are positive integers;
The parallel iterative decoding circuit (106) calculates the error location polynomial coefficients from the odd and even syndromes output by the one-by-one even-syndrome calculation and syndrome ordering circuit (104), based on an inversion-free BM algorithm;
The parallel iterative decoding circuit (106) comprises a first multiplier group (502), a second multiplier group (508), a third multiplier group (509), a multi-input adder (503), an adder group (510), an error polynomial register (511), an error location polynomial buffer (512), an auxiliary polynomial buffer (513), a non-zero difference register (507) and an iteration difference register (504); the multipliers and adders are GF-field arithmetic units;
Corresponding to the t-1 loops of the circuit (104), the parallel iterative decoding circuit (106) performs t-1 iterations, and additionally performs a t-th iteration. In each of the k cycles of each iteration: the first multiplier group (502) multiplies the outputs of the circuit (104) by the values stored in the error polynomial register (511); the multi-input adder (503) sums the p products output by the first multiplier group (502) together with the result that the multi-input adder (503) computed in the previous cycle; the third multiplier group (509) multiplies each of p values from the error location polynomial buffer (512) by the value of the non-zero difference register (507); the second multiplier group (508) multiplies each of p values from the auxiliary polynomial buffer (513) by the value of the iteration difference register (504); the adder group (510) adds the outputs of the second multiplier group (508) and the third multiplier group (509) to obtain p sums, which are output to the error polynomial register (511); and it is judged whether the iteration difference register (504) equals zero or whether the degree of the error location polynomial is greater than the iteration count j:
If the iteration difference register (504) is non-zero and the degree of the error location polynomial is not greater than the iteration count j, the value of the iteration difference register (504) is stored in the non-zero difference register (507), the output of the error polynomial register (511) is stored in the error location polynomial buffer (512), and the error location polynomial buffer (512) outputs the error location polynomial calculated in the previous cycle to the auxiliary polynomial buffer (513);
If the iteration difference register (504) equals zero or the degree of the error location polynomial is greater than the iteration count j, neither the value in the non-zero difference register (507) nor the auxiliary polynomial buffer (513) is updated;
And, in the last cycle of each iteration, the iteration difference register (504) is updated with the output value of the multi-input adder (503).
The invention also discloses a BM iterative decoding circuit for a BCH decoder, comprising an odd syndrome calculation circuit (102), a one-by-one even-syndrome calculation and syndrome ordering circuit (104), and a parallel iterative decoding circuit (106);
The odd syndrome calculation circuit (102) receives BCH-encoded input data and calculates the odd syndromes of the BCH-encoded input data;
The one-by-one even-syndrome calculation and syndrome ordering circuit (104) calculates the even syndromes of the BCH-encoded input data and, within one cycle, outputs p syndromes in order to the parallel iterative decoding circuit (106), where p is the degree of parallelism of the BCH decoder;
The parallel iterative decoding circuit (106) calculates the error equation coefficients of the BCH-encoded input data, characterized in that, within one clock cycle, the parallel iterative decoding circuit (106) converts p syndromes into p error equation coefficients.
The invention greatly reduces the number of clock cycles needed to solve the error location equation, reduces the decoding latency of the circuit system, and effectively improves the data throughput of the BCH decoder.
Embodiment
Fig. 1 shows the basic structure of the Berlekamp-Massey iterative decoding circuit of the invention. The iterative circuit comprises three main parts: the odd syndrome calculation circuit 102, the one-by-one even-syndrome calculation and syndrome ordering circuit 104, and the parallel iterative decoding circuit 106.
When binary BCH-encoded data 101 carrying multiple check bits are input to the iterative decoding circuit, the odd syndrome calculation circuit 102 first calculates the odd syndromes 103 of the data. When the data contain no error bits, all odd syndromes are 0 and iterative decoding is skipped. When the data contain error bits, the odd syndromes are not all 0; they are then input to the one-by-one even-syndrome calculation and syndrome ordering circuit 104, which derives the even syndromes and outputs the syndromes 105 (both odd and even) in parallel in a fixed order. This syndrome output serves as the input of the parallel iterative decoding circuit 106, which calculates the error location equation 107.
1. Odd syndrome calculation circuit
The first step of BCH decoding is to calculate the syndromes.
A binary transmitted codeword C(x) of length n bits can be expressed as the polynomial
$$C(x) = c_{n-1}x^{n-1} + \dots + c_1 x + c_0$$
where $c_{n-1}, c_{n-2}, \dots, c_0 \in \{0, 1\}$ are the bits of the binary transmitted codeword.
After passing through the channel, the data received by the decoder are
$$R(x) = C(x) + E(x) = r_{n-1}x^{n-1} + \dots + r_1 x + r_0$$
where $E(x) = e_{n-1}x^{n-1} + \dots + e_1 x + e_0$ is the error pattern produced by the channel, so the syndromes can be expressed as:
$$S_i = R(\alpha^i) = C(\alpha^i) + E(\alpha^i) = E(\alpha^i)$$
where $S_i$ is a syndrome and $\alpha^i$ is a root of the polynomial equation C(x) (that is, $C(\alpha^i) = 0$). Therefore, for BCH input data without channel errors (E(x) = 0), the syndromes are 0, i.e. $S_i = 0$.
The syndrome formula can be rewritten as:
$$S_i = R(\alpha^i) = r_{n-1}(\alpha^i)^{n-1} + r_{n-2}(\alpha^i)^{n-2} + \dots + r_1\alpha^i + r_0 = (\dots(r_{n-1}\alpha^i + r_{n-2})\alpha^i + \dots + r_1)\alpha^i + r_0 \qquad (1)$$
This shows that a syndrome can be obtained with a multiply-add loop. The most significant bit $r_{n-1}$ of the input data is multiplied by $\alpha^i$ (GF-field multiplication), and the product is added to the next input data bit $r_{n-2}$; that result is again multiplied by $\alpha^i$, and so on, until the last bit $r_0$ has been added and the calculation is complete.
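The multiply-add loop of formula (1) can be sketched in software as follows. This is a minimal illustration, not the circuit itself: the field polynomial $x^{13}+x^4+x^3+x+1$, the choice of $\alpha$ as the element x, and the helper names `gf_mul`, `gf_pow`, `syndrome_horner` and `syndrome_direct` are assumptions introduced here; the text does not state them.

```python
M = 13
POLY = 0x201B  # assumed field polynomial x^13 + x^4 + x^3 + x + 1
ALPHA = 2      # assumed: alpha is the element x


def gf_mul(a, b):
    """Multiply two GF(2^13) elements (shift-and-add with reduction)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= POLY
    return r


def gf_pow(a, e):
    """Raise a to the non-negative integer power e by square-and-multiply."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def syndrome_horner(bits, i):
    """Formula (1): bits[0] is r_{n-1}, bits[-1] is r_0."""
    a_i = gf_pow(ALPHA, i)
    s = 0
    for r in bits:
        s = gf_mul(s, a_i) ^ r  # multiply by alpha^i, then add the next bit
    return s


def syndrome_direct(bits, i):
    """Term-by-term evaluation of R(alpha^i), for comparison."""
    n = len(bits)
    s = 0
    for pos, r in enumerate(bits):
        if r:
            s ^= gf_pow(ALPHA, i * (n - 1 - pos))
    return s
```

Both functions evaluate the same polynomial, so they should agree for any bit sequence; the multiply-add form needs only one multiplier and one adder per syndrome.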
Fig. 2 shows the odd syndrome calculation circuit of the invention, in which the parallel input data 201 is a BCH-encoded byte (8 bits) that may contain multiple error bits. For a binary BCH code that can correct t error bits, t odd syndromes $S_1, S_3, S_5, \dots, S_{2t-1}$ must be calculated. When BCH-encoded data are input in parallel to the odd syndrome calculation circuit, the $RC(\alpha^i)$ combinational logic circuit 202 first calculates the result of 8 multiply-add steps:
$$RC(\alpha^i) = (\dots((R_{n-m}(\alpha^i)\,\alpha^i + r_{n-m-1})\,\alpha^i + r_{n-m-2})\,\alpha^i + \dots + r_{n-m-7})\,\alpha^i + r_{n-m-8}$$
Those of ordinary skill in the art will readily appreciate that one byte (8 bits) of BCH-encoded data is input at a time here, and that for this 1-byte input the number of multiply-add steps performed by the $RC(\alpha^i)$ combinational logic circuit 202 is accordingly 8 (equal to the number of bits input at a time). This is only an example chosen for ease of description; another number of bits B may be used as the input width, with B multiply-add steps performed accordingly.
In the calculation performed by the odd syndrome calculation circuit of Fig. 2, within one cycle the $RC(\alpha^i)$ calculation circuit 202 computes $(\dots((R_{n-m}(\alpha^i)\,\alpha^i + r_{n-m-1})\,\alpha^i + r_{n-m-2})\,\alpha^i + \dots + r_{n-m-7})\,\alpha^i + r_{n-m-8}$, where $r_{n-m-1}, r_{n-m-2}, \dots, r_{n-m-8}$ are the 8 bits of one input byte and $R_{n-m}(\alpha^i)$ is the result of the previous calculation stored in register 203. The result of the $RC(\alpha^i)$ calculation, $R_{n-m-8}(\alpha^i)$, is stored in register 203. In the next cycle, when another byte is input, the same operation is performed, and so on until the last byte has been input; the data in the register are then the odd syndromes 204 of the BCH-encoded data. According to the formula above, $S_i = 0$ when no error occurs; therefore, when the odd syndromes are all 0, the data need no correction and the iterative decoding circuit is bypassed. When the odd syndromes are not all 0, they are sent to the one-by-one even-syndrome calculation and syndrome ordering circuit 104 for processing.
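The byte-per-cycle operation of Fig. 2 can be modelled as a register that absorbs eight multiply-add steps per input byte. The sketch below is illustrative only; the field polynomial, the most-significant-bit-first ordering inside a byte, and the names `rc_step` and `syndrome_bytes` are assumptions not given in the text.

```python
M = 13
POLY = 0x201B  # assumed field polynomial x^13 + x^4 + x^3 + x + 1


def gf_mul(a, b):
    """Multiply two GF(2^13) elements (shift-and-add with reduction)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= POLY
    return r


def rc_step(reg, byte, a_i):
    """One clock cycle of block 202: eight multiply-add steps on one byte."""
    for k in range(7, -1, -1):  # assumed bit order: most significant first
        reg = gf_mul(reg, a_i) ^ ((byte >> k) & 1)
    return reg


def syndrome_bytes(data, a_i):
    """Model of register 203: starts at 0 and is updated once per byte."""
    reg = 0
    for byte in data:
        reg = rc_step(reg, byte, a_i)
    return reg
```

In hardware the eight steps are flattened into one block of combinational logic, so each byte costs a single clock cycle; the loop in `rc_step` only describes what that logic computes.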
2. One-by-one even-syndrome calculation and syndrome ordering circuit
The main feature of the one-by-one even-syndrome calculation and syndrome ordering circuit 104 is that, while outputting the syndromes in parallel in a fixed order, it uses only a single GF squaring unit to calculate the even syndromes one by one.
The even syndromes of a BCH code satisfy the following GF-field formula:
$$S_{2i} = S_i^2$$
According to this formula, $S_2$ can be calculated from $S_1$, $S_4$ from $S_2$, $S_6$ from $S_3$, and so on.
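The squaring relation for even syndromes follows from two properties of a binary code over a field of characteristic 2, sketched here as a short derivation using the notation of formula (1): each received bit satisfies $r_k^2 = r_k$, and $(a+b)^2 = a^2 + b^2$.

```latex
\begin{aligned}
S_{2i} &= R(\alpha^{2i}) = \sum_{k=0}^{n-1} r_k\,\alpha^{2ik}
        = \sum_{k=0}^{n-1} r_k^{2}\,\bigl(\alpha^{ik}\bigr)^{2} \\
       &= \Bigl(\sum_{k=0}^{n-1} r_k\,\alpha^{ik}\Bigr)^{2}
        = R(\alpha^{i})^{2} = S_i^{2}
\end{aligned}
```

This is why one squaring unit, applied repeatedly, is enough to derive every even syndrome from the syndromes already available.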
Fig. 3 shows the one-by-one even-syndrome calculation and syndrome ordering circuit 104 of the invention. The figure illustrates the circuit structure for an error-correcting capability of 15 bits with 4 syndromes output in parallel.
Initially, the odd syndromes are loaded into the ordering register units of the ordering register array 300 shown in Fig. 3. Driven by the clock, the ordering register array 300 then performs 14 loops; each loop comprises 4 cycles and each cycle outputs 4 syndromes in parallel, so each loop outputs 16 syndromes. The ordering register units $RS_4$, $RS_3$, $RS_2$, $RS_1$ provide the output of the ordering register array, and the syndrome output pattern is shown in Table 1. Taking the first clock cycle of the 1st loop as an example, $RS_4$ outputs $S_3$, $RS_3$ outputs $S_2$, $RS_2$ outputs $S_1$, and the output of $RS_1$ is a don't-care (denoted NA). Table 1 thus details what the ordering register units $RS_4$, $RS_3$, $RS_2$ and $RS_1$ output in each cycle of the 14 loops.
Table 1: Parallel output order of the syndromes
In Table 1, j is the loop number, and $S_{2j}(S_j^2)$ denotes that $S_{2j}$ is calculated from $S_j$ by the GF squaring unit 324.
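The output order can be regenerated from the rule stated in the summary: in loop j the syndromes $S_{2j+1}, S_{2j}, \dots$ are output in descending order, t+1 values per loop and p per cycle, with positions whose index would fall below 1 treated as NA. The sketch below reconstructs that schedule under this reading; the function name `syndrome_schedule` and the use of `None` for NA are assumptions made for illustration.

```python
def syndrome_schedule(t, p):
    """Return schedule[j-1][c] = syndrome indices output in cycle c of loop j.

    Each inner list is ordered from RS_p down to RS_1; None marks an NA slot.
    """
    assert (t + 1) % p == 0, "p*k must equal t+1"
    k = (t + 1) // p
    schedule = []
    for j in range(1, t):  # loops 1 .. t-1
        idx = [2 * j + 1 - m for m in range(t + 1)]
        idx = [s if s >= 1 else None for s in idx]
        schedule.append([idx[c * p:(c + 1) * p] for c in range(k)])
    return schedule
```

For t = 15 and p = 4, `syndrome_schedule(15, 4)[0][0]` evaluates to `[3, 2, 1, None]`, which matches the first clock cycle of the 1st loop described above ($RS_4$ outputs $S_3$, $RS_3$ outputs $S_2$, $RS_2$ outputs $S_1$, $RS_1$ is NA).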
The specific embodiment of the one-by-one even-syndrome calculation and syndrome ordering circuit 104 is described below with reference to Fig. 3. Reference numeral 310 in Fig. 3 denotes the ordering register units $RS_i$ ($i \ne 3$); numeral 320 denotes the ordering register unit $RS_3$, which has the function of calculating even syndromes. For an $RS_i$ ($i \ne 3$) ordering unit 310, the inputs are $R_{i-4}$ 313 and $R_{i+14}$ 314, which are the outputs of ordering register units $RS_{i-4}$ and $RS_{i+14}$ respectively, where i is a cyclic index with period 29, and:
$$R_{i-4} = R_{i-4+29}\ (\text{when } i-4 \le 0), \qquad R_{i+14} = R_{i+14-29}\ (\text{when } i+14 > 29)$$
$R_{i-4}$ 313 and $R_{i+14}$ 314 are connected to register 311 through MUX 312, and the output signal of ordering register unit $RS_i$ ($i \ne 3$) is $R_i$ 315.
The inputs of the $RS_3$ ordering register unit 320 fall into two parts. The first part comprises the signals $S_1, R_{15}, R_{14}, \dots, R_3$, where $S_1$ is the syndrome $S_1$ and $R_{15}, R_{14}, \dots, R_3$ are the outputs of ordering register units $RS_{15}, RS_{14}, \dots, RS_3$; they are fed through a MUX 323 to the GF squaring unit 324, which calculates the even syndromes. The second part is $R_{28}$, used for the ordering shift. The two parts are connected to register 326 through a MUX 325, and the output of the $RS_3$ ordering register unit 320 is $R_3$. As stated above, $RS_1$, $RS_2$, $RS_3$, $RS_4$ are connected to the downstream parallel iterative decoding circuit 106 as the parallel syndrome output 303.
The operation of the one-by-one even-syndrome calculation and syndrome ordering circuit 104 is now described with reference again to Table 1 and Fig. 3. Its operation divides into two processes: the loop ordering shift and the cycle ordering shift.
The loop ordering shift is carried out in the 1st cycle of each loop. MUX 312 selects $R_{i+14}$ 314 as the input signal of register 311 in $RS_i$ ($i \ne 3$). According to the loop number j, for j = 1, 2, ..., 14, MUX 323 selects $S_1, R_{15}, R_{14}, \dots, R_4, R_3$ in turn as the input of the GF squaring unit 324, which calculates the even syndrome; MUX 325 selects the output of the GF squaring unit 324 as the input of register 326 in $RS_3$.
The cycle ordering shift is carried out in the 2nd, 3rd and 4th cycles of each loop. MUX 312 selects $R_{i-4}$ 313 as the input signal of register 311 in $RS_i$ ($i \ne 3$), and MUX 325 selects $R_{28}$ as the input of register 326 in $RS_3$. Optionally, MUX 323 and the GF squaring unit 324 can be disabled at this time to reduce the power consumption of the decoder.
To further speed up GF squaring, the invention also provides a new GF squaring unit 324 for calculating the even syndromes. For the field GF($2^{13}$), the square can be calculated as follows.
Let c={c[12],c[11],c[10],c[9],c[8],c[7],c[6],c[5],c[4],c[3],c[2],c[1],c[0]};
a={a[12],a[11],a[10],a[9],a[8],a[7],a[6],a[5],a[4],a[3],a[2],a[1],a[0]};
c[0]=a[0]^a[11];
c[1]=a[7]^a[11]^a[12];
c[2]=a[1]^a[7];
c[3]=a[8]^a[11]^a[12];
c[4]=a[2]^a[7]^a[8]^a[11]^a[12];
c[5]=a[7]^a[9];
c[6]=a[3]^a[8]^a[9]^a[12];
c[7]=a[8]^a[10];
c[8]=a[4]^a[9]^a[10];
c[9]=a[9]^a[11];
c[10]=a[5]^a[10]^a[11];
c[11]=a[10]^a[12];
c[12]=a[6]^a[11]^a[12];
Here "c" is the result of the GF squaring, "a" is the input to be squared, and "^" is the bitwise XOR operation. The GF squaring unit can therefore be implemented with nothing more than a set of XOR gates.
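The XOR equations above can be checked against a generic field multiplication. The text does not name the field polynomial; the equations are consistent with $x^{13}+x^4+x^3+x+1$ in a polynomial basis, so the sketch below assumes that polynomial. The function names `gf_mul` and `gf_square_xor` are introduced here for illustration.

```python
M = 13
POLY = 0x201B  # assumed field polynomial x^13 + x^4 + x^3 + x + 1


def gf_mul(a, b):
    """Generic GF(2^13) multiplication (shift-and-add with reduction)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= POLY
    return r


def gf_square_xor(a):
    """Square a GF(2^13) element using only the XOR equations c[0]..c[12]."""
    b = [(a >> i) & 1 for i in range(13)]
    c = [
        b[0] ^ b[11],
        b[7] ^ b[11] ^ b[12],
        b[1] ^ b[7],
        b[8] ^ b[11] ^ b[12],
        b[2] ^ b[7] ^ b[8] ^ b[11] ^ b[12],
        b[7] ^ b[9],
        b[3] ^ b[8] ^ b[9] ^ b[12],
        b[8] ^ b[10],
        b[4] ^ b[9] ^ b[10],
        b[9] ^ b[11],
        b[5] ^ b[10] ^ b[11],
        b[10] ^ b[12],
        b[6] ^ b[11] ^ b[12],
    ]
    return sum(bit << i for i, bit in enumerate(c))
```

Comparing `gf_square_xor(a)` with `gf_mul(a, a)` over all 8192 field elements is a quick way to confirm that a given field polynomial matches the wiring; the largest equation needs only five XOR inputs, which is why the unit is so much smaller than a full multiplier.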
Compared with the prior art, the advantages of the one-by-one even-syndrome calculation and syndrome ordering circuit are:
1) A GF squaring unit replaces the GF multiplication unit. The GF squaring circuit occupies less than 3% of the logic gates of a GF multiplier circuit; Chinese patent application 200910046088.X uses GF multipliers to calculate the even syndromes.
2) Ordering registers are used to output the syndromes in a fixed order. This avoids complex state machines or processor instructions for syndrome output, which simplifies the circuit structure, increases operating speed, and reduces circuit area.
3) Squaring is embedded in the register ordering step and the even syndromes are calculated serially, so a single squaring unit suffices to calculate the even syndromes one by one. This avoids using several GF multiplication units to calculate the even syndromes simultaneously and greatly reduces hardware area.
3. Parallel iterative decoding circuit
Fig. 4 shows the iterative decoding flow chart by which the invention solves the error location equation, and Fig. 5 shows the structure of the parallel iterative decoding circuit of the invention.
The invention adopts a simplified BM algorithm without GF-field inversion to solve for the coefficients of the error location equation. The iterative decoding circuit receives the syndromes input in parallel (in the order shown in Table 1) and calculates the coefficients of the error location equation in parallel using multiplier groups, an adder group, and a ring-shaped loop circuit. For an iterative decoding circuit with an error-correcting capability of 15, the whole process needs 15 iterations of 4 cycles each, and the last iteration yields the 16 coefficients of the error location equation. The composition and operating principle of the iterative decoding circuit are described in detail below with reference to Fig. 4 and Fig. 5.
Let σ(x) be the error location polynomial, τ(x) the auxiliary polynomial, Δ the iteration difference, δ the non-zero iteration difference, Deg the degree of a polynomial, j the iteration count, and t the error-correcting capability.
At the start of the iterative computation, the relevant registers are initialized (401):
$$\sigma^{(-1)}(x) = \tau^{(-1)}(x) = 1,\quad \Delta^{(0)} = S_1,\quad \delta = 1$$
That is, the error location polynomial buffer 512 and the auxiliary polynomial buffer 513 are set to 1, the error polynomial register 511 is set to 1, the non-zero difference δ register 507 is set to 1, register 505 is set to 0, and the iteration difference Δ register 504 is set to $S_1$, where $S_1$ is obtained by the odd syndrome calculation circuit.
Referring to Fig. 4, the iterative circuit mainly involves two calculations: calculation step 402, which solves for the iteration difference Δ, and calculation step 403, which solves for the error location polynomial coefficients σ. The ring-shaped loop circuit in the iterative decoding circuit can carry out both steps simultaneously. Each is described in detail below.
A. Calculation step 402 for the iteration difference Δ
First, as described for Table 1 and Fig. 3, the one-by-one even-syndrome calculation and syndrome ordering circuit 104 delivers 4 syndromes as input 501 in each cycle, in order, to "multiplier group 1" 502 of the parallel iterative decoding circuit 106. "Multiplier group 1" 502 comprises 4 GF($2^{13}$) multipliers. In the first clock cycle of the j-th iteration, it multiplies, in the GF field, the 4 polynomial coefficients $\sigma_0^{(j)}, \sigma_1^{(j)}, \sigma_2^{(j)}, \sigma_3^{(j)}$ in the error polynomial register 511 by the 4 input syndromes $S_{2j+1}, S_{2j}, S_{2j-1}, S_{2j-2}$, giving the 4 products $\sigma_0^{(j)}S_{2j+1}$, $\sigma_1^{(j)}S_{2j}$, $\sigma_2^{(j)}S_{2j-1}$, $\sigma_3^{(j)}S_{2j-2}$ (when an output of circuit 104 is NA, the corresponding GF product counts as 0). The multi-input adder 503 then sums 5 inputs: the 4 products and the result of the previous clock cycle held in register 505. In the second clock cycle, $\sigma_4^{(j)}S_{2j-3}$, $\sigma_5^{(j)}S_{2j-4}$, $\sigma_6^{(j)}S_{2j-5}$, $\sigma_7^{(j)}S_{2j-6}$ are calculated, and the multi-input adder 503 sums these 4 products with the previous-cycle result held in register 505. After the 4 cycles of the j-th iteration (loop), that is, in the last cycle of each iteration, the iteration difference is obtained as the accumulated sum of the products $\sigma_i^{(j)}S_{2j+1-i}$ over all 16 coefficients.
This iteration difference is stored in the iteration difference register 504, and the value of the iteration difference Δ register 504 remains unchanged within each iteration (loop). The 4 cycles of each loop of the circuit 104 of Fig. 3 correspond to these 4 cycles, and the j-th loop in Table 1 corresponds to the j-th iteration of the decoding process of Fig. 4. In the 15th iteration, the parallel iterative decoding circuit only calculates $\sigma^{(j)}(x)$ and does not calculate the iteration difference Δ, so no syndrome input is needed; hence the circuit 104 performs 14 loops while the parallel iterative decoding circuit performs 15 iterations.
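The accumulation performed by "multiplier group 1" 502, the multi-input adder 503 and register 505 can be sketched as a loop that consumes p coefficient/syndrome pairs per clock cycle. This is an illustrative model only: the field polynomial, the use of `None` for NA outputs, and the names `gf_mul` and `iteration_difference` are assumptions not taken from the text.

```python
M = 13
POLY = 0x201B  # assumed field polynomial x^13 + x^4 + x^3 + x + 1


def gf_mul(a, b):
    """Multiply two GF(2^13) elements (shift-and-add with reduction)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= POLY
    return r


def iteration_difference(sigma, synd, p=4):
    """Accumulate sum(sigma[i] * synd[i]) over len(sigma)/p clock cycles.

    sigma[i] is paired with synd[i]; a syndrome of None (NA) contributes 0.
    """
    reg505 = 0  # cleared at the start of the iteration
    for start in range(0, len(sigma), p):  # one clock cycle per slice
        products = [0 if s is None else gf_mul(c, s)
                    for c, s in zip(sigma[start:start + p],
                                    synd[start:start + p])]
        total = reg505  # fifth adder input: previous-cycle result
        for prod in products:
            total ^= prod  # GF(2^m) addition is XOR
        reg505 = total
    return reg505  # latched into register 504 after the last cycle
```

With p = 4 and 16 coefficients the loop runs 4 times, mirroring the 4 cycles per iteration described above.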
B. Calculation step 403 for the error location polynomial coefficients $\sigma^{(j)}(x)$
Calculation step 403 for the error location polynomial coefficients $\sigma^{(j)}(x)$ is carried out simultaneously with calculation step 402 for the iteration difference Δ. The detailed steps for solving the error location polynomial coefficients are as follows.
Calculate one by one and each output cycle of syndrome ranking circuit 104 corresponding to the even number syndrome, judge whether the iteration difference equals zero or σ
(j)(x) whether polynomial dimension is greater than iterations j, promptly
Δ=0 or Deg σ
j(x)>j---(Boolean expression 1)
If the result of Boolean expression 1 is "No", the value of the iteration difference Δ register 504 is written into the non-zero difference register 507 through MUX 506. In the j-th iteration, the inputs of "multiplier group 2" 508 are the auxiliary polynomial buffer 513 and the iteration difference register 504, and it computes $\Delta^{(j)}x^{2}\tau_{i-2}^{(j-1)}(x)$, $\Delta^{(j)}x^{2}\tau_{i-1}^{(j-1)}(x)$, $\Delta^{(j)}x^{2}\tau_{i}^{(j-1)}(x)$, $\Delta^{(j)}x^{2}\tau_{i+1}^{(j-1)}(x)$. The inputs of "multiplier group 3" 509 are the error locator polynomial buffer 512 and the non-zero difference register 507, and it computes $\delta\sigma_{i}^{(j-1)}(x)$, $\delta\sigma_{i+1}^{(j-1)}(x)$, $\delta\sigma_{i+2}^{(j-1)}(x)$, $\delta\sigma_{i+3}^{(j-1)}(x)$. Finally, the "adder group" 510 computes in parallel $\sigma^{(j)}(x) = \delta\sigma^{(j-1)}(x) + \Delta^{(j)}x^{2}\tau^{(j-1)}(x)$, giving 4 coefficients $\sigma_{i}^{(j)}$, $\sigma_{i+1}^{(j)}$, $\sigma_{i+2}^{(j)}$, $\sigma_{i+3}^{(j)}$ of the polynomial $\sigma^{(j)}(x)$. These are recorded in the error polynomial register 511. The output of the error polynomial register 511 is connected to the input of "multiplier group 1" 502 and is at the same time stored in the error locator polynomial buffer 512. The error locator polynomial buffer 512 outputs $\sigma^{(j-1)}$, computed in the previous iteration (the (j-1)-th iteration), to the auxiliary polynomial buffer 513; this is step 405 in Fig. 4, $\tau^{(j)}(x) = \sigma^{(j-1)}$. $\tau^{(j)}(x)$ serves as the input of "multiplier group 2" 508 in the next iteration (the (j+1)-th iteration).
If the result of Boolean expression 1 is "Yes", MUX 506 writes the value of the non-zero difference register 507 back into that register instead of updating it with the value of the iteration difference Δ register 504. "Multiplier group 2" 508, "multiplier group 3" 509 and the "adder group" 510 compute in parallel $\sigma^{(j)}(x) = \delta\sigma^{(j-1)}(x) + \Delta^{(j)}x^{2}\tau^{(j-1)}(x)$ (step 403), giving 4 coefficients $\sigma_{i}^{(j)}$, $\sigma_{i+1}^{(j)}$, $\sigma_{i+2}^{(j)}$, $\sigma_{i+3}^{(j)}$ of the polynomial $\sigma^{(j)}(x)$, which are recorded in the error polynomial register 511, output to "multiplier group 1" 502, and stored in the error locator polynomial buffer 512. The value in the auxiliary polynomial buffer 513 does not need to be updated; this is step 407 in Fig. 4, $\tau^{(j)}(x) = x\tau^{(j-1)}(x)$.
After the 4 cycles of the j-th iteration, all 16 coefficients of the error polynomial have been computed (each cycle produces 4 coefficients $\sigma_{i}^{(j)}$, $\sigma_{i+1}^{(j)}$, $\sigma_{i+2}^{(j)}$, $\sigma_{i+3}^{(j)}$ of the polynomial $\sigma^{(j)}(x)$), which gives the error locator polynomial $\sigma^{(j)}(x) = \sum_{i=0}^{15} \sigma_{i}^{(j)} x^{i}$.
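The coefficient update performed by "multiplier group 2", "multiplier group 3" and the "adder group" can be sketched as follows. This is an illustrative model under stated assumptions: GF(2^4) with primitive polynomial x^4 + x + 1 stands in for the real field, and the names `gf_mul` and `update_sigma` and the list layout (lowest-degree coefficient first) are hypothetical. The factor x^2 is modeled as a shift of the auxiliary polynomial by two positions, which is why coefficient i of the result uses coefficient i-2 of τ.

```python
def gf_mul(a, b, m=4, poly=0b10011):
    """Multiply a and b in GF(2^m); poly is an assumed primitive polynomial."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << m):
            a ^= poly
    return r


def update_sigma(sigma_prev, tau_prev, delta, diff, p=4):
    """Return delta*sigma_prev(x) + diff*x^2*tau_prev(x), truncated to len(sigma_prev).

    delta: value of the non-zero difference register.
    diff:  value of the iteration difference register.
    Coefficients are produced p at a time, one group per clock cycle.
    """
    n = len(sigma_prev)
    sigma = [0] * n
    for base in range(0, n, p):                    # one clock cycle per group
        for i in range(base, min(base + p, n)):    # parallel lanes in hardware
            tau_term = tau_prev[i - 2] if i >= 2 else 0   # x^2 shifts tau by two
            sigma[i] = gf_mul(delta, sigma_prev[i]) ^ gf_mul(diff, tau_term)
    return sigma
```

With 16 coefficients and p = 4, the outer loop runs 4 times, matching the 4 cycles per iteration described above.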
In step 408 of Fig. 4 the iteration count is incremented by 1, and step 409 checks whether the iteration count j is less than the error correcting capability t. If "Yes", the next iteration is carried out; a BM iterative decoder with an error correcting capability of 15 needs 15 iterations. If "No", the iteration is complete and the error locator polynomial σ(x) has been obtained. Step 410 then checks whether the degree of σ(x) is greater than t: if "Yes", the number of error positions in this codeword exceeds the maximum error correcting capability; if "No", σ(x) is the error locator polynomial of this BCH code.
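For orientation, the overall loop can be modeled with a textbook inversionless Berlekamp-Massey iteration for binary BCH codes. The sketch below is not a transcription of Fig. 4. Its bookkeeping is a conventional choice made for this example: the auxiliary polynomial starts as x, is shifted by x^2 in every iteration, and a length counter `L` replaces the degree test. GF(2^4) with primitive polynomial x^4 + x + 1 and all function names are likewise assumptions.

```python
def gf_mul(a, b, m=4, poly=0b10011):
    """Multiply a and b in GF(2^m); poly is an assumed primitive polynomial."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << m):
            a ^= poly
    return r


def gf_pow(a, n):
    """Raise a to the n-th power by repeated multiplication."""
    r = 1
    for _ in range(n):
        r = gf_mul(r, a)
    return r


def poly_eval(coeffs, x):
    """Evaluate a polynomial (lowest degree first) at x using Horner's rule."""
    r = 0
    for c in reversed(coeffs):
        r = gf_mul(r, x) ^ c
    return r


def bm_binary_bch(S, t):
    """Return a non-zero scalar multiple of sigma(x), lowest degree first.

    S[k] holds syndrome S_k for k = 1..2t (S[0] is unused).
    """
    n = t + 1
    sigma = [1] + [0] * t             # sigma(x) = 1
    aux = [0, 1] + [0] * (t - 1)      # shifted auxiliary polynomial, starts as x
    delta = 1                         # last non-zero difference
    L = 0                             # current register length
    for j in range(t):
        diff = 0                      # difference built from S_{2j+1} downward
        for i in range(n):
            k = 2 * j + 1 - i
            if k >= 1:
                diff ^= gf_mul(sigma[i], S[k])
        new_sigma = [gf_mul(delta, sigma[i]) ^ gf_mul(diff, aux[i])
                     for i in range(n)]
        if diff != 0 and L <= j:      # length change: remember the old sigma
            aux = [0, 0] + sigma[:n - 2]
            L = 2 * j + 1 - L
            delta = diff
        else:                         # otherwise only shift the auxiliary polynomial
            aux = [0, 0] + aux[:n - 2]
        sigma = new_sigma
    return sigma
```

Because no division is used, the result may differ from the monic σ(x) by a non-zero constant factor; its roots, and therefore the error positions, are unchanged. Evaluating the result at the inverse of α raised to each bit position (with α = 2 in this field) locates the errors.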
Compared with the prior art, the parallel iterative decoding circuit has the following advantages:
1) Several syndromes are processed in parallel each time, which reduces the number of clock cycles needed to solve the equation and lowers the solving delay of the circuit. For BM decoding with an error correcting capability of 15, an implementation based on a state machine or CPU instructions needs hundreds to thousands of clock cycles, and serial decoding needs at least 240 clock cycles, whereas decoding with a parallelism of 4 obtains the error location equation in only 60 clock cycles. By comparison, according to the performance described in Chinese patent application 200910024526.2, state-machine decoding with a 16-bit error correcting capability needs 830 clock cycles to complete the BM iterative computation of the error locator polynomial.
2) A ring-shaped loop circuit tightly couples the computation of the error polynomial coefficients with the computation of the iteration difference, so the iteration difference and the error locator polynomial coefficients are computed at the same time. The 3 GF multiplier groups are always working, and no clock cycle of the iteration is spent waiting, which effectively reduces the delay of the decoding circuit.
3) Compared with BM decoding implemented with a state machine or CPU instructions, this parallel iterative circuit needs very little area and decodes faster.
4) For the same error correction data throughput, the parallel iterative circuit structure of the invention has a clear advantage in chip area.
4. Error correcting capability and parallelism
The above description of the even syndrome successive calculation and syndrome sorting circuit and of the parallel iterative decoding circuit covers only one example, with an error correcting capability of 15 bits and a parallelism of 4. The decoding process of a BM iterative circuit able to correct 15 bit errors was described in full by way of example, so that the operating principle of the iterative circuit is easy to follow. On this basis, those of ordinary skill in the art will readily recognize that, for an even syndrome successive calculation and syndrome sorting circuit 104 with a t-bit error correcting capability, a sorting circuit with a different parallelism and number of cycles can be designed as needed, and a parallel iterative decoding circuit with the corresponding parallelism can be obtained.
The loop count J, the parallelism p, the number of cycles per loop k and the error correcting capability t satisfy the following relation:
Loop count J = t - 1, parallelism p × number of cycles k = t + 1
where J, p, k and t are natural numbers. Hence t + 1 is a multiple of p, and p can be any natural number from 1 to t + 1 that divides t + 1.
For example, for a 15-bit error correcting capability, a sorting circuit can be designed with 14 loops, 8 cycles per loop, and 2 syndromes output in parallel per cycle; alternatively, it can be designed with 14 loops, 1 cycle per loop, and 16 syndromes output in parallel per cycle. For an 8-bit error correcting capability, a sorting circuit can be designed with 7 loops, 3 cycles per loop, and 3 syndromes output in parallel per cycle.
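The admissible sorting circuit configurations follow directly from the relation J = t - 1, p × k = t + 1. The short sketch below enumerates them; the function name `sorter_configs` is hypothetical.

```python
def sorter_configs(t):
    """List (loops J, parallelism p, cycles per loop k) for every p dividing t + 1."""
    return [(t - 1, p, (t + 1) // p)
            for p in range(1, t + 2)
            if (t + 1) % p == 0]
```

For t = 15 this yields the parallelisms 1, 2, 4, 8 and 16, which include the p = 2 and p = 16 designs mentioned above as well as the p = 4 example used throughout.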
Fig. 6 shows the even syndrome successive calculation and syndrome sorting circuit in its general form. In the figure, t is the error correcting capability, p is the parallelism, and m is the number of rows of the sorting register array. Every p consecutive registers form one row, and the 2t-1 RS register cells in total fill m complete rows, where the following relation holds:
2t - 1 - p < m × p ≤ 2t - 1
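The row count m that satisfies this inequality is the integer quotient of 2t-1 by p. The sketch below computes it together with the number of register cells left over in a partial row; the function name `sorting_array_rows` is an assumption made for illustration.

```python
def sorting_array_rows(t, p):
    """Return (m, leftover): full rows of p cells and cells remaining in a partial row."""
    cells = 2 * t - 1                 # total number of RS register cells
    m = cells // p                    # largest m with m * p <= cells
    assert cells - p < m * p <= cells # the relation stated for Fig. 6
    return m, cells - m * p
```

For t = 15 and p = 4 there are 29 cells, giving 7 full rows and 1 remaining cell.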
In Fig. 6, RS denotes a register cell. Register cells $RS_i$ ($i \ne p-1$) support the per-cycle sorting shift and the per-loop sorting shift. Register cell $RS_{p-1}$ supports both shifts and, in addition, computes the even syndromes. Table 2 lists the syndrome output order for error correcting capability t and parallelism p. The circuit of Fig. 6 works in the same way as the even syndrome successive calculation and syndrome sorting circuit described earlier (t = 15, p = 4), so the description is not repeated here.
Table 2: Syndrome output order for error correcting capability t and parallelism p
For a parallel iterative decoding circuit 106 with a t-bit error correcting capability, an iterative decoding circuit with a different parallelism and number of cycles can likewise be designed as needed. The iteration count J, the parallelism p and the number of cycles per iteration k satisfy the following relation:
Iteration count J = t, parallelism p × number of cycles k = t + 1
where J, p, k and t are natural numbers. Hence t + 1 is a multiple of p, and p can be any natural number from 1 to t + 1 that divides t + 1.
For example, for a 15-bit error correcting capability, an iterative decoding circuit can be designed with 15 iterations, 8 cycles per iteration, and a parallelism of 2; alternatively, it can be designed with 15 iterations, 1 cycle per iteration, and a parallelism of 16. For an 8-bit error correcting capability, an iterative decoding circuit can be designed with 8 iterations, 3 cycles per iteration, and a parallelism of 3.
A microprocessor system that executes computation instructions can also be used to implement the even syndrome successive calculation and syndrome sorting circuit, outputting syndromes to the parallel iterative decoding circuit in the order shown in Table 2. Since a syndrome calculation method for BCH codes implemented with microprocessor instructions is readily obtained from formula (1) above, it is not described further, to keep this specification concise.
Fig. 7 shows the structure of the parallel iterative decoding circuit of the invention. To support a parallelism of p, each multiplier group consists of p GF multipliers and the adder group consists of p GF adders. Its operation and flow are consistent with those of the parallel iterative decoding circuit with 15-bit error correcting capability described above.
For a BM iterative decoding circuit with error correcting capability t and parallelism p, the required number of clock cycles T is:
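Taking the relations stated above at face value (J = t iterations, each of k = (t + 1)/p cycles), the cycle count works out as T = J × k = t × (t + 1)/p. The sketch below assumes exactly this and ignores any pipeline fill or control overhead; the function name `bm_clock_cycles` is hypothetical.

```python
def bm_clock_cycles(t, p):
    """Clock cycles for t iterations of (t + 1) / p cycles each."""
    if (t + 1) % p != 0:
        raise ValueError("parallelism p must divide t + 1")
    return t * ((t + 1) // p)
```

This reproduces the figures quoted in advantage 1): 60 cycles for t = 15 with p = 4, and 240 cycles for the serial case p = 1.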
It should be emphasized that the even syndrome successive calculation and syndrome sorting circuit 104 and the parallel iterative decoding circuit 106 must have the same parallelism p.
For a fixed error correcting capability t, a larger parallelism p means fewer clock cycles for decoding and a smaller delay of the BM decoding circuit; at the same time, a larger parallelism p means a larger circuit area for the decoding circuit. In practice, the parallelism can therefore be chosen flexibly according to the delay and circuit area requirements of the decoding system.
Using the decoding circuit disclosed in the invention, BCH decoding circuits with 15-bit and 8-bit error correcting capability have been implemented. The BCH decoding circuit with 15-bit error correcting capability has a data throughput above 2 GB/s and, at a clock frequency of 250 MHz, an average BM iterative decoding delay of only 250 ns. The BCH decoding circuit with 8-bit error correcting capability has a data throughput above 4.5 GB/s and, at a clock frequency of 250 MHz, an average BM iterative decoding delay of only 100 ns. The circuit has been used in the ECC module design of a high-speed solid-state storage device, where it raised the bandwidth and IOPS of the storage system, lowered the overall latency of the storage system, and reduced the I/O wait time of upper-layer software.
The description of the invention has been presented for purposes of illustration and description, and is not intended to be exhaustive or to limit the invention to the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art.