CN102142926A - Processing method, processing unit and decoder for reducing resource consumption while ensuring throughput - Google Patents


Info

Publication number
CN102142926A
Authority
CN
China
Prior art keywords
data
multiplexer
switching network
processing unit
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2010102546303A
Other languages
Chinese (zh)
Other versions
CN102142926B (en)
Inventor
常德远
肖治宇
喻凡
李扬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201010254630.3A priority Critical patent/CN102142926B/en
Publication of CN102142926A publication Critical patent/CN102142926A/en
Application granted granted Critical
Publication of CN102142926B publication Critical patent/CN102142926B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Time-Division Multiplex Systems (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention provides a processing method, a processing unit and a decoder that reduce resource consumption while ensuring throughput. The decoder comprises N first switch networks, Q processing units and N second switch networks. Each first switch network consists of Q second multiplexers, each of which is an m:1 multiplexer. The Q processing units are connected to the Q second multiplexers of each first switch network and process the data output by the N first switch networks in parallel. Each second switch network consists of Q second demultiplexers, each of which is a 1:m demultiplexer; the Q second demultiplexers of each second switch network are connected to the Q processing units respectively, and each second switch network demultiplexes the Q input data and outputs L data. According to the embodiments of the invention, resource consumption is reduced while throughput is ensured.

Description

Processing method, processing unit and decoder for reducing resource consumption while ensuring throughput
Technical field
The present invention relates to data processing technology, and in particular to a processing method, a processing unit and a decoder that reduce resource consumption while ensuring throughput.
Background technology
A Low-Density Parity-Check (LDPC) code is a forward error correction (FEC) code whose gain characteristic can approach the Shannon limit, but decoding an LDPC code consumes a large amount of logic resources. A quasi-cyclic LDPC (QC-LDPC) code is a special LDPC code: besides the sparsity common to all LDPC codes, its check matrix has the distinctive feature of being composed of a series of circulant matrices, called the submatrices of the check matrix. For example, the check matrix may consist of M × N submatrices arranged in M rows and N columns, each submatrix being an L × L matrix. While offering excellent performance, QC-LDPC codes are also easier to encode and decode.
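As an illustration of this structure, the following sketch (plain Python, not part of the patent) assembles a QC-LDPC check matrix from cyclically shifted identity submatrices; the shift values in the toy base matrix are arbitrary example numbers.

```python
def qc_check_matrix(shifts, L):
    """Assemble an (M*L) x (N*L) QC-LDPC check matrix H from an M x N
    grid of shift values: each grid entry s becomes an L x L identity
    matrix cyclically shifted right by s columns."""
    M, N = len(shifts), len(shifts[0])
    H = [[0] * (N * L) for _ in range(M * L)]
    for bi, row in enumerate(shifts):
        for bj, s in enumerate(row):
            for r in range(L):
                # row r of the circulant has its single 1 in column (r + s) % L
                H[bi * L + r][bj * L + (r + s) % L] = 1
    return H

# Toy example: M = 2 rows, N = 3 columns of submatrices, L = 4
H = qc_check_matrix([[0, 1, 2],
                     [3, 0, 1]], L=4)
```

Each row and each column of every submatrix contains exactly one non-zero element, which is the property the sublayer pipelining described later relies on.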
In an existing LDPC decoder, the check matrix is divided into M layers by rows or into N layers by columns, each layer containing L rows or L columns. Within each layer, L serial processing units (SPUs) process the codeword information of the L rows or L columns in parallel; each SPU serially processes the information corresponding to one row, and can output a result only after all data associated with that row have been read in. Moreover, the layers are processed serially: an SPU starts processing the information of the next layer only after finishing the current one. For a decoder that requires high throughput, such a serial decoding structure consumes so many logic resources that it is barely practical.
In the course of realizing the present invention, the inventors found at least the following problem in the prior art: to achieve high throughput, the prior art consumes a large amount of resources.
Summary of the invention
The embodiments of the invention provide a processing method, a processing unit and a decoder that reduce resource consumption while ensuring throughput, so as to solve the prior-art problem of high resource consumption at high throughput.
In one aspect, an embodiment of the invention provides a processing unit, comprising:
P first multiplexers, configured to multiplex N input data and output P data, wherein each first multiplexer is an n:1 multiplexer, n is a pre-obtained optimal block count, P is the number of data in each block, and N = n × P;
P processing modules, connected to the P first multiplexers respectively and configured to process the data output by the first multiplexers in parallel;
P first demultiplexers, connected to the P processing modules respectively and configured to demultiplex the P data output by the processing modules and output N data, wherein each first demultiplexer is a 1:n demultiplexer.
In another aspect, the invention provides a decoder, comprising:
N first switch networks, each consisting of Q second multiplexers, each second multiplexer being an m:1 multiplexer, each first switch network being configured to multiplex L input data and output Q data, wherein m is a predetermined optimal layering number of the check matrix, N is the number of columns of the check matrix, L is the number of rows or columns in each layer of the check matrix, Q is the number of rows or columns in each sublayer of the check matrix, and L = m × Q;
Q processing units, connected to the Q second multiplexers of each first switch network respectively and configured to process the data output by the N first switch networks in parallel;
N second switch networks, each consisting of Q second demultiplexers, each second demultiplexer being a 1:m demultiplexer, the Q second demultiplexers of each second switch network being connected to the Q processing units respectively, each second switch network being configured to demultiplex the Q input data and output L data.
In yet another aspect, an embodiment of the invention provides a decoder comprising the above processing unit, and further comprising:
N switch networks, each connected to each processing unit respectively, each switch network comprising at least two shift modules with different step sizes, wherein N is the number of columns of the check matrix and L is the number of rows or columns in each layer of the check matrix.
In one aspect, an embodiment of the invention provides a processing method that reduces resource consumption while ensuring throughput, comprising:
multiplexing N input data with P first multiplexers and outputting P data, wherein each first multiplexer is an n:1 multiplexer, n is a pre-obtained optimal block count, P is the number of data in each block, and N = n × P;
processing the P data in parallel with P processing modules;
demultiplexing the P processed data with P first demultiplexers and outputting N data, wherein each first demultiplexer is a 1:n demultiplexer.
In another aspect, an embodiment of the invention provides a processing method that reduces resource consumption while ensuring throughput, comprising:
receiving L input data with each of N first switch networks and outputting Q data, wherein each first switch network consists of Q second multiplexers, each second multiplexer is an m:1 multiplexer, m is a predetermined optimal layering number of the check matrix, N is the number of columns of the check matrix, L is the number of rows or columns in each layer of the check matrix, Q is the number of rows or columns in each sublayer of the check matrix, and L = m × Q;
processing the data output by the N first switch networks in parallel with Q processing units;
demultiplexing the Q input data with each of N second switch networks and outputting L data, wherein each second switch network consists of Q second demultiplexers, and each second demultiplexer is a 1:m demultiplexer.
As can be seen from the above technical solutions, the embodiments of the invention multiplex the input data so that the number of modules needed for subsequent parallel processing is reduced, thereby reducing resource consumption. The embodiments of the invention can therefore reduce resource consumption while ensuring high throughput.
Description of drawings
To describe the technical solutions of the embodiments of the invention or of the prior art more clearly, the accompanying drawings required in the description of the embodiments or of the prior art are briefly introduced below. Apparently, the drawings described below illustrate some embodiments of the invention, and those of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of an LDPC decoder in the prior art;
Fig. 2 is a schematic structural diagram of the check matrix of a QC-LDPC code in the prior art;
Fig. 3 is a timing diagram of a prior-art message processor when processing by layer;
Fig. 4 is a timing diagram of a message processor in an embodiment of the invention when processing by sublayer;
Fig. 5 is a schematic structural diagram of a decoder after sublayer division in an embodiment of the invention;
Fig. 6 is a schematic structural diagram of a decoder according to the first embodiment of the invention;
Fig. 7 is a schematic flowchart of a method according to the second embodiment of the invention;
Fig. 8 is a timing diagram of a prior-art processing unit performing fully parallel processing on input data;
Fig. 9 is a timing diagram of a prior-art processing unit performing fully serial processing on input data;
Fig. 10 is a schematic structural diagram of a processing unit according to the third embodiment of the invention;
Fig. 11 is a schematic flowchart of a method according to the fourth embodiment of the invention;
Fig. 12 is a schematic structural diagram of a switch network according to the fifth embodiment of the invention;
Fig. 13 is a schematic structural diagram of a decoder according to the sixth embodiment of the invention;
Fig. 14 is a schematic flowchart of a method according to the seventh embodiment of the invention.
Embodiment
To make the purpose, technical solutions and advantages of the embodiments of the invention clearer, the technical solutions in the embodiments of the invention are described below clearly and completely with reference to the accompanying drawings. Apparently, the described embodiments are only part rather than all of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the invention without creative effort fall within the protection scope of the invention.
Fig. 1 is a schematic structural diagram of a prior-art LDPC decoder, comprising a memory 11, a switch network 12, a message processor 13, a control sequence generator 14 and a process controller 15. The memory 11 stores codeword information, for example a posteriori probability (APP) information. The data flow between these modules is as follows: a piece of the codeword information in the memory 11 is read and sent into the switch network 12, which reorders it into the expected order and sends it into the message processor 13. The message processor 13, which consists of parallel processing units, processes the data and delivers the resulting updated values back to the switch network 12; the switch network 12 exchanges the data back into the original order, and the data are then stored at the corresponding positions of the memory 11, updating the data read previously. This completes one update of one piece of data. The next piece of data is then read and the above process repeated until a whole codeword has been traversed, at which point one iterative update of the codeword is finished. Repeating this process for a preset number of iterations completes decoding, and the final decoding result is stored in the memory 11. The order in which the switch network 12 exchanges the data is controlled by the control sequence generator 14, which produces control signals so that the switch network 12 performs exchanges of different orders on the input data. In addition, each module performs read and write operations on the data (for example, data are first read from the memory 11 and later written back into it), so each module needs read/write timing control, which is performed by the process controller 15. The specific data flow and control flow are shown in Fig. 1, where solid lines represent data flow and dotted lines represent control flow.
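The read, exchange, process, restore and write-back loop just described can be summarized in the following Python sketch (an illustration only, not the patent's implementation); `permute`, `process_block` and `unpermute` are caller-supplied stand-ins for the switch network, the message processor and the reverse exchange.

```python
def decode(memory, blocks, num_iters, permute, process_block, unpermute):
    """One decoding run as in Fig. 1: every block of codeword (APP) data
    is read, reordered by the switch network, updated by the message
    processor, restored to its original order, and written back; a full
    pass over all blocks is one iteration of the codeword."""
    for _ in range(num_iters):
        for b in blocks:
            data = permute(b, memory[b])       # switch network: expected order
            data = process_block(data)         # message processor update
            memory[b] = unpermute(b, data)     # restore original order, write back
    return memory
```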
Fig. 2 is a schematic structural diagram of the check matrix of a QC-LDPC code in the prior art. Referring to Fig. 2, the check matrix H is composed of submatrices arranged in M rows and N columns, each submatrix being a matrix of L rows and L columns obtained by cyclically shifting an identity matrix, i.e. the matrix whose diagonal elements are 1 and whose remaining elements are 0.
According to the boundaries of the submatrices, the check matrix can be divided into M horizontal layers of L rows each, or into N vertical layers of L columns each. During decoding, to guarantee decoding performance while reducing the number of computation iterations, a hybrid Belief Propagation (BP) algorithm can be adopted, in which the information corresponding to the check matrix is computed layer by layer, and within one iteration the information updated by the current layer's computation is used in the next layer's computation.
Fig. 3 is a timing diagram of a prior-art message processor when processing by layer. Referring to Fig. 3, suppose each layer contains L = 99 rows; the timing operations corresponding to rows 1–99 of layer 1 then comprise read (RE), operation-1, operation-2, operation-3 and write (WR), where operation-1 to operation-3 are the logic operations the message processor has to perform. This embodiment takes operation-1 to operation-3 as an example, but the number of logic operations is not limited to 3.
Because the layers are processed serially, the timing operations corresponding to rows 1–99 of layer 2 must wait until the timing operations of layer 1 are finished; that is, as shown in Fig. 3, the logic operations of layer 2 are read (RE) and written (WR) only after the logic operations of layer 1. Since each layer must wait for the previous one, wait clocks have to be inserted, and idle logic resources are wasted.
The above per-layer parallel processing requires L processing units and serial processing between layers, incurring a long delay; reaching a given throughput requirement therefore demands more resource consumption.
To reduce resource consumption, the embodiments of the invention optimize the number of processing units in the message processor. This is done by dividing each layer of the check matrix once more into a plurality of sublayers and processing the data corresponding to each sublayer in parallel, instead of processing the data corresponding to each whole layer. For example, with L = 99 rows per layer, each layer can be divided into m = 3 sublayers of Q = 33 rows each. Processing by layer requires 99 processing units, whereas processing by sublayer requires only Q = 33 processing units.
Fig. 4 is a timing diagram of a message processor in an embodiment of the invention when processing by sublayer. Referring to Fig. 4, because the information of the different sublayers is uncorrelated (which follows from each row and column of a submatrix of the H matrix having only one non-zero element), the data corresponding to the sublayers can be pipelined: unlike Fig. 3, processing the data of layer 2 need not wait until processing the data of layer 1 has finished. For example, in Fig. 4, the RE timing of the second sublayer need not wait for the WR timing operation of the first sublayer to complete. In the scheme of Fig. 3, the execution delay between the codeword data of successive layers equals the operating time of a whole layer, whereas in the pipelined scheme after sublayer division the execution delay between successive layers is smaller than the operating time of a layer, so the time delay is reduced and the throughput improved.
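The latency gain from pipelining can be made concrete with the standard pipeline-latency formula below; the cycle counts are illustrative numbers, not taken from the patent.

```python
def total_time(num_layers, layer_time, launch_interval):
    """Total execution time when consecutive layers may start every
    `launch_interval` cycles: serial processing (Fig. 3) has
    launch_interval == layer_time, while sublayer pipelining (Fig. 4)
    launches the next layer after only one sublayer's worth of work."""
    return (num_layers - 1) * launch_interval + layer_time

M, T = 4, 12                          # example: 4 layers, 12 cycles per layer
serial    = total_time(M, T, T)       # wait for the whole previous layer
pipelined = total_time(M, T, T // 3)  # m = 3 sublayers: start after T/m cycles
```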
Since this embodiment divides each layer into sublayers, the optimal sublayer count m must be determined. The throughput and resource consumption under different values of m can be calculated, and the m that best matches the throughput requirement and resource consumption taken as the optimal sublayer count. The logic structure under different m can be designed as follows.
Fig. 5 is a schematic structural diagram of a decoder after sublayer division in an embodiment of the invention. Referring to Fig. 5, it comprises a memory unit 51, first switch networks 52, processing units 53 and second switch networks 54. The memory unit 51 is divided into N blocks, each containing L pieces of APP information. Each first switch network 52 consists of Q second multiplexers (MUX), each of which is an m:1 multiplexer; each second switch network 54 consists of Q second demultiplexers, each of which is a 1:m demultiplexer, where L = m × Q.
After the APP information is read from the memory unit 51, it is sent into the first switch network 52 to obtain the expected order and number, processed by the processing units 53, restored to the original order and number by the second switch network 54, and finally saved back into the memory unit 51 to update the original values.
With the structure shown in Fig. 5, the throughput and resource consumption for different sublayer counts can be calculated. For example, once m is determined, the memory units, multiplexers, processing units and demultiplexers to be used are all determined, so the resource consumption can be counted; likewise, once m is determined, the throughput can be calculated from the decoding delay and the codeword length. The throughput and resource consumption at different m can thus be calculated, and the sublayer count m that best matches the throughput requirement and resource consumption taken as the optimum.
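Such a selection can be sketched as below; the linear resource model and inverse throughput model are placeholder assumptions for illustration only, not formulas from the patent.

```python
def choose_sublayer_count(L, resource_of, throughput_of, min_throughput):
    """Among the divisors m of L, return the one that meets the
    throughput requirement at the lowest resource cost."""
    candidates = [m for m in range(1, L + 1)
                  if L % m == 0 and throughput_of(m) >= min_throughput]
    if not candidates:
        raise ValueError("no sublayer count meets the throughput target")
    return min(candidates, key=resource_of)

# Placeholder cost models: resources dominated by the Q = L/m processing
# units, throughput falling as sublayers are time-multiplexed.
L = 99
best_m = choose_sublayer_count(
    L,
    resource_of=lambda m: (L // m) * 100 + m * 5,
    throughput_of=lambda m: 300 / m,
    min_throughput=90,
)
```

With these example models, m = 1 and m = 3 meet the throughput target, and m = 3 wins on resources (33 units instead of 99).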
By the above analysis, once the optimal sublayer count m is obtained, the existing message processor can be optimized: whereas the existing message processor comprises L parallel processing units, the message processor of the embodiments of the invention comprises multiplexers, processing units and demultiplexers, with the number of processing units being Q. Although this embodiment adds multiplexers and demultiplexers, the logic resources are consumed mainly by the processing units, and this embodiment markedly reduces their number (to 1/m of the original), so the logic resource overhead is reduced. Specifically, this embodiment provides the following decoder.
Fig. 6 is a schematic structural diagram of the decoder of the first embodiment of the invention, which reduces resource consumption while ensuring throughput. Referring to Fig. 6, this embodiment comprises Q × N second multiplexers 61, Q processing units 62 and Q × N second demultiplexers 63. Every Q second multiplexers 61 form one first switch network, giving N first switch networks in total; each first switch network consists of Q second multiplexers 61 (second multiplexer 1 to second multiplexer Q), each second multiplexer 61 is an m:1 multiplexer, and each first switch network multiplexes L input data and outputs Q data, where m is a predetermined optimal layering number of the check matrix, N is the number of columns of the check matrix, L is the number of rows or columns in each layer of the check matrix, Q is the number of rows or columns in each sublayer of the check matrix, and L = m × Q. The Q processing units 62 (processing unit 1 to processing unit Q) are connected to the Q second multiplexers 61 of each first switch network respectively and process the data output by the N first switch networks in parallel. There are N second switch networks, each consisting of Q second demultiplexers 63 (second demultiplexer 1 to second demultiplexer Q); each second demultiplexer 63 is a 1:m demultiplexer, each second switch network is connected to the Q processing units respectively, and each second switch network demultiplexes the Q input data and outputs L data.
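Functionally, the mux, process and demux path through one such switch-network column can be modelled as below. This is a behavioural sketch only: the contiguous grouping of each sublayer's Q values is an assumption about the data ordering, and `process` stands in for a processing unit's update.

```python
def process_by_sublayers(layer_data, m, process):
    """Pass one layer of L = m * Q values through Q processing units:
    on sublayer t, the Q m:1 multiplexers each select one value, the Q
    units work in parallel, and the Q 1:m demultiplexers put the
    results back at the original positions."""
    L = len(layer_data)
    Q = L // m
    out = [None] * L
    for t in range(m):                               # one pass per sublayer
        selected = layer_data[t * Q:(t + 1) * Q]     # Q muxes pick sublayer t
        results = [process(x) for x in selected]     # Q units in parallel
        out[t * Q:(t + 1) * Q] = results             # Q demuxes restore order
    return out
```

In hardware the m passes overlap as in Fig. 4; the sequential loop here only models the routing, not the pipelining.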
This embodiment needs only Q processing units, rather than the L processing units adopted in the prior art, which reduces resource consumption; moreover, the data corresponding to the sublayers can be pipelined, which improves throughput. This embodiment can therefore reduce resource consumption while ensuring throughput.
Corresponding to this structure, an embodiment of the invention provides a processing method that reduces resource consumption while ensuring throughput, specifically as follows:
Fig. 7 is a schematic flowchart of the method of the second embodiment of the invention. This embodiment can process data with the message processor described in Fig. 6. Referring to Fig. 7, the method comprises:
Step 71: receiving L input data with each of N first switch networks and outputting Q data, wherein each first switch network consists of Q second multiplexers, each second multiplexer is an m:1 multiplexer, m is a predetermined optimal layering number of the check matrix, N is the number of columns of the check matrix, L is the number of rows or columns in each layer of the check matrix, Q is the number of rows or columns in each sublayer of the check matrix, and L = m × Q;
Step 72: processing the data output by the N first switch networks in parallel with Q processing units;
Step 73: demultiplexing the Q input data with each of N second switch networks and outputting L data, wherein each second switch network consists of Q second demultiplexers, and each second demultiplexer is a 1:m demultiplexer.
This embodiment needs only Q processing units, rather than the L processing units adopted in the prior art, which reduces resource consumption; moreover, the data corresponding to the sublayers can be pipelined, which improves throughput. This embodiment can therefore reduce resource consumption while ensuring throughput.
The above optimizes the number of processing units; the processing unit itself can also be optimized, specifically as follows:
Each processing unit processes the data corresponding to one row at a time; for example, if N is 10, each processing unit processes 10 data at a time. The processing unit can process these 10 data fully in parallel, or fully serially.
Fig. 8 is a timing diagram of a prior-art processing unit performing fully parallel processing on input data. Referring to Fig. 8, the 10 data are processed simultaneously through RE, min and WR, where min denotes a minimum-finding operation; this operation is merely an example, and other operations can be performed.
Fully parallel processing computes on the 10 data simultaneously, but the min operation involved occupies a large amount of resources and needs several clock cycles to complete.
Fig. 9 is a timing diagram of a prior-art processing unit performing fully serial processing on input data. Referring to Fig. 9, each data item is processed individually: each item undergoes a read (RE) operation and a min operation in turn, and after the min operations of all 10 items have completed, the write (WR) operations are performed in turn.
Fully serial processing requires very few logic resources, and the min operation completes in a single clock cycle; however, the decoding delay is then maximal, which reduces the decoder's throughput.
In this embodiment, to satisfy the throughput requirement while reducing logic resource consumption as far as possible, the data input to the processing unit can be divided into blocks, for example into n blocks, after which the input data are processed block by block, one block in parallel at a time. The optimal block count n must then be determined. Similar to the design of the message processor, the corresponding throughput and resource consumption of the processing unit can be calculated for different block counts, and the block count that best matches the throughput requirement and resource consumption determined as the optimum.
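As an illustration of this trade-off, the sketch below performs the min operation of Figs. 8 and 9 block by block: n = 1 is the fully parallel case, n = N the fully serial case, and intermediate n trades latency for comparator logic. The code is illustrative only, not the patent's circuit.

```python
def blockwise_min(values, n):
    """Minimum of N inputs using P = N // n parallel compare modules:
    each cycle the P n:1 multiplexers feed one block of P values, the
    running partial minima are updated, and a final reduction over the
    P partials gives the answer after n cycles."""
    N = len(values)
    P = N // n
    partial = [float("inf")] * P
    for t in range(n):                             # one block per cycle
        block = values[t * P:(t + 1) * P]          # P muxes select block t
        partial = [min(a, b) for a, b in zip(partial, block)]
    return min(partial)                            # combine the P partials
```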
Fig. 10 is a schematic structural diagram of the processing unit of the third embodiment of the invention, comprising P first multiplexers 101 (first multiplexer 1 to first multiplexer P), P processing modules 102 (processing module 1 to processing module P) and P first demultiplexers 103 (first demultiplexer 1 to first demultiplexer P). The P first multiplexers 101 multiplex N input data and output P data, where each first multiplexer is an n:1 multiplexer, n is the pre-obtained optimal block count, P is the number of data in each block, and N = n × P. The P processing modules 102 are connected to the P first multiplexers 101 respectively and process the data output by the first multiplexers 101 in parallel. The P first demultiplexers 103 are connected to the P processing modules 102 respectively, demultiplex the P data output by the processing modules 102 and output N data, where each first demultiplexer 103 is a 1:n demultiplexer.
Similar to the division of a layer into sublayers, this embodiment can calculate the corresponding throughput and logic resources under different block divisions to determine the optimal block count n. The above n:1 multiplexers, 1:n demultiplexers and processing modules are then used for processing. The processing modules 102 can be implemented as the modules that perform parallel processing in an existing processing unit.
By adopting multiplexers, this embodiment can divide the input data into blocks, thereby reducing the number of processing modules and improving the processing delay, so that resource consumption is reduced while throughput is ensured.
Corresponding to this structure, an embodiment of the invention provides a processing method that reduces resource consumption while ensuring throughput, specifically as follows:
Fig. 11 is a schematic flowchart of the method of the fourth embodiment of the invention. This embodiment can process data with the message processor described in Fig. 10. Referring to Fig. 11, the method comprises:
Step 111: multiplexing N input data with P first multiplexers and outputting P data, wherein each first multiplexer is an n:1 multiplexer, n is a pre-obtained optimal block count, P is the number of data in each block, and N = n × P;
Step 112: processing the P data in parallel with P processing modules;
Step 113: demultiplexing the P processed data with P first demultiplexers and outputting N data, wherein each first demultiplexer is a 1:n demultiplexer.
By adopting multiplexers, this embodiment can divide the input data into blocks, thereby reducing the number of processing modules and improving the processing delay, so that resource consumption is reduced while throughput is ensured.
To guarantee a given throughput and reduce resource consumption, the first embodiment and the third embodiment may be implemented separately or simultaneously; that is, the number of processing units may be optimized, the interior of the processing unit may be optimized, or both may be optimized together.
The above takes sublayer division in the horizontal direction as an example; it is understood that the same applies analogously to sublayer division in the vertical direction.
The above is an optimization scheme for the message processor; the embodiments of the invention also provide an optimization scheme for the switch network.
Figure 12 is a schematic structural diagram of the switching network of the fifth embodiment of the invention. The switching network comprises at least two shift modules 121, each shift module 121 having a different step size.
Referring to Figure 12, suppose for example that three step sizes step1, step2 and step3 are provided. The number of positions that can be moved from input to output in one pass is then k = n1 × step1 + n2 × step2 + n3 × step3, where n1, n2 and n3 are the numbers of moves performed at the corresponding step sizes.
In an existing LDPC decoder, the switching network performs the exchange by cyclic shift. For example, if the input data are 1, 2, 3, ..., N and the output data are k, k+1, k+2, ..., N, 1, 2, 3, ..., k-1, then with a cyclic-shift step size of 1 the exchange requires k shift steps. When k is large this incurs a long delay, and the throughput is low. To improve throughput, the switching network may instead be implemented by circuit switching, in which a dedicated link is provided between every possible input-output pair; this improves throughput, but when N is large the resource consumption is very high.
To move the data by k positions, the existing fixed-step-size scheme with step size 1 requires k steps, whereas the present embodiment requires only n1 + n2 + n3 moves; when k is large, n1 + n2 + n3 is far smaller than k, so the present embodiment reduces delay and improves throughput. In addition, with N inputs and N outputs, the existing circuit-switching scheme requires N circuits to be deployed, whereas the present embodiment requires only 3 circuits (one per configured step size); when N is large, N is far greater than 3, so the present embodiment reduces resource consumption.
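The move-count saving can be illustrated with a small sketch. The concrete step sizes below are assumptions for illustration; the patent leaves both the number of step sizes and their values open. A greedy decomposition from the largest step size down works whenever the step sizes include 1 and each divides the next larger one:

```python
# Illustrative sketch of decomposing a cyclic shift of k positions into
# k = n1*step1 + n2*step2 + n3*step3 moves. Step sizes (64, 8, 1) are an
# assumed example configuration, not values from the patent.
def shift_moves(k, steps=(64, 8, 1)):
    """Greedily count the moves n1 + n2 + n3 needed to shift by k."""
    moves = 0
    for step in sorted(steps, reverse=True):
        moves += k // step
        k %= step
    return moves

# Shifting by k = 200 takes 200 moves at fixed step size 1, but only
# 3 + 1 + 0 = 4 moves with steps (64, 8, 1): 200 = 3*64 + 1*8 + 0*1.
assert shift_moves(200) == 4
```

With step sizes chosen as successive powers, the worst-case move count grows roughly logarithmically in k rather than linearly, which is the delay reduction the embodiment claims.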
The number of step sizes used in this embodiment may be set according to actual needs and is not limited to 3; the specific values of the step sizes are likewise unrestricted.
By using variable-step, multi-stage shifting, the present embodiment avoids both the long delay caused by the single-step shift scheme and the heavy resource consumption caused by the multi-line (circuit-switched) exchange scheme, thereby reducing delay and reducing resource consumption.
The variable-step switching network of this embodiment is not limited to LDPC decoders; it is applicable to any routing or exchange scenario with N inputs and N outputs.
Based on the processing unit shown in Figure 10 and the switching network shown in Figure 12, an embodiment of the invention provides a decoder. Figure 13 is a schematic structural diagram of the decoder of the sixth embodiment of the invention. Referring to Figure 13, the decoder comprises N switching networks 131 and L processing units 132, where N is the number of columns of the check matrix and L is the number of rows or columns contained in each layer of the check matrix. Specifically, the structure of each switching network 131 may be as shown in Figure 12, and the structure of each processing unit 132 may be as shown in Figure 10.
Corresponding to this structure, an embodiment of the invention provides a processing method that reduces resource consumption while guaranteeing throughput, as follows:
Figure 14 is a schematic flowchart of the method of the seventh embodiment of the invention. Referring to Figure 14, the method comprises:
Step 141: use the N switching networks to perform at least two shift operations on the input data, each shift operation using a different step size, and output N data;
Step 142: use P first multiplexers to multiplex the N input data and output P data, where each first multiplexer is an n:1 multiplexer, n is the optimum block count obtained in advance, P is the number of data in each block, and N = n × P;
Step 143: use the P processing modules to process the P data in parallel;
Step 144: use P first demultiplexers to demultiplex the P data after parallel processing and output N data, where each first demultiplexer is a 1:n demultiplexer.
By using variable-step, multi-stage shifting, the present embodiment avoids both the long delay caused by the single-step shift scheme and the heavy resource consumption caused by the multi-line exchange scheme, thereby reducing delay and reducing resource consumption.
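Steps 141–144 together can be sketched end to end as follows. This is a behavioral software model only: a single cyclic shift stands in for the switching network (which in hardware performs it as several variable-step moves), and `process` is a hypothetical stand-in for the processing modules:

```python
# End-to-end behavioral sketch of steps 141-144, under assumed models:
# a cyclic shift for the switching network, then n:1 multiplexing,
# parallel processing, and 1:n demultiplexing for the processing unit.
def decode_pass(data, shift_k, n, process):
    N = len(data)
    # Step 141: the switching network cyclically shifts the N data.
    shifted = data[shift_k % N:] + data[:shift_k % N]
    # Step 142: P first multiplexers (n:1) form P streams of n data each.
    P = N // n
    streams = [shifted[p * n:(p + 1) * n] for p in range(P)]
    # Step 143: P processing modules handle the streams in parallel.
    processed = [[process(x) for x in s] for s in streams]
    # Step 144: P first demultiplexers (1:n) restore N output data.
    return [x for s in processed for x in s]

out = decode_pass(list(range(8)), shift_k=3, n=4, process=lambda x: 2 * x)
# out == [6, 8, 10, 12, 14, 0, 2, 4]
```

One layered-decoding iteration would repeat such a pass per layer of the check matrix, with the shift amounts taken from the matrix's cyclic structure.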
Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be implemented by program instructions controlling the relevant hardware; the program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium capable of storing program code, such as a ROM, a RAM, a magnetic disk or an optical disc.
Finally, it should be noted that the above embodiments are merely intended to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A processing unit, characterized by comprising:
P first multiplexers, configured to multiplex N input data and output P data, wherein each first multiplexer is an n:1 multiplexer, n is an optimum block count obtained in advance, P is the number of data in each block, and N = n × P;
P processing modules, respectively connected to the P first multiplexers and configured to process the data output by the first multiplexers in parallel;
P first demultiplexers, respectively connected to the P processing modules and configured to demultiplex the P data output by the processing modules and output N data, wherein each first demultiplexer is a 1:n demultiplexer.
2. A decoder, characterized by comprising:
N first switching networks, each first switching network being composed of Q second multiplexers, each second multiplexer being an m:1 multiplexer, each first switching network being configured to multiplex L input data and output Q data, wherein m is a predetermined optimum layering number of a check matrix, N is the number of columns of the check matrix, L is the number of rows or columns contained in each layer of the check matrix, Q is the number of rows or columns contained in each sublayer of the check matrix, and L = m × Q;
Q processing units, respectively connected to the Q second multiplexers of each first switching network and configured to process the data output by the N first switching networks in parallel;
N second switching networks, each second switching network being composed of Q second demultiplexers, each second demultiplexer being a 1:m demultiplexer, the Q second demultiplexers of each second switching network being respectively connected to the Q processing units, each second switching network being configured to demultiplex Q input data and output L data.
3. The decoder according to claim 2, characterized in that the processing unit is the processing unit according to claim 1.
4. A decoder, characterized by comprising L processing units according to claim 1, and further comprising:
N switching networks, each switching network being respectively connected to each processing unit, each switching network comprising at least two shift modules, each shift module having a different step size, wherein N is the number of columns of a check matrix and L is the number of rows or columns contained in each layer of the check matrix.
5. A processing method that reduces resource consumption while guaranteeing throughput, characterized by comprising:
multiplexing N input data with P first multiplexers and outputting P data, wherein each first multiplexer is an n:1 multiplexer, n is an optimum block count obtained in advance, P is the number of data in each block, and N = n × P;
processing the P data in parallel with P processing modules;
demultiplexing the P data after parallel processing with P first demultiplexers and outputting N data, wherein each first demultiplexer is a 1:n demultiplexer.
6. The method according to claim 5, characterized by further comprising:
performing at least two shift operations on the input data with N switching networks, each shift operation using a different step size, and outputting N data to the P first multiplexers respectively.
7. The method according to claim 5 or 6, characterized in that the optimum block count n is determined as follows: the resource consumption and the throughput for different block counts are calculated, and the block count that meets the throughput requirement with the least resource consumption is determined to be the optimum block count n.
8. A processing method that reduces resource consumption while guaranteeing throughput, characterized by comprising:
receiving L input data with each first switching network and outputting Q data, wherein there are N first switching networks in total, each first switching network is composed of Q second multiplexers, each second multiplexer is an m:1 multiplexer, m is a predetermined optimum layering number of a check matrix, N is the number of columns of the check matrix, L is the number of rows or columns contained in each layer of the check matrix, Q is the number of rows or columns contained in each sublayer of the check matrix, and L = m × Q;
processing the data output by the N first switching networks in parallel with Q processing units;
demultiplexing the Q input data with each second switching network and outputting L data, wherein there are N second switching networks in total, each second switching network is composed of Q second demultiplexers, and each second demultiplexer is a 1:m demultiplexer.
9. The method according to claim 8, characterized in that each processing unit processes the data output by the N first switching networks as follows:
multiplexing N input data with P first multiplexers and outputting P data, wherein each first multiplexer is an n:1 multiplexer, n is an optimum block count obtained in advance, P is the number of data in each block, and N = n × P;
processing the P data in parallel;
demultiplexing the P data after parallel processing with P first demultiplexers and outputting N data, wherein each first demultiplexer is a 1:n demultiplexer.
10. The method according to claim 8 or 9, characterized in that the optimum layering number m is determined as follows: the resource consumption and the throughput for different layering numbers are calculated, and the layering number that meets the throughput requirement with the least resource consumption is determined to be the optimum layering number m.
CN201010254630.3A 2010-08-13 2010-08-13 Processing method, processing unit and decoder for reducing resource consumption while ensuring throughput Active CN102142926B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010254630.3A CN102142926B (en) 2010-08-13 2010-08-13 Processing method, processing unit and decoder for reducing resource consumption while ensuring throughput

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010254630.3A CN102142926B (en) 2010-08-13 2010-08-13 Processing method, processing unit and decoder for reducing resource consumption while ensuring throughput

Publications (2)

Publication Number Publication Date
CN102142926A true CN102142926A (en) 2011-08-03
CN102142926B CN102142926B (en) 2014-12-31

Family

ID=44410181

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010254630.3A Active CN102142926B (en) 2010-08-13 2010-08-13 Processing method, processing unit and decoder for reducing resource consumption while ensuring throughput

Country Status (1)

Country Link
CN (1) CN102142926B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104821860A (en) * 2014-01-31 2015-08-05 三星显示有限公司 A system for relayed data transmission in a high-speed serial link in a display

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101106381A (en) * 2007-08-09 2008-01-16 上海交通大学 Hierarchical low density check code decoder and decoding processing method
CN101212277A (en) * 2006-12-29 2008-07-02 中兴通讯股份有限公司 Multi-protocol supporting LDPC decoder

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101212277A (en) * 2006-12-29 2008-07-02 中兴通讯股份有限公司 Multi-protocol supporting LDPC decoder
CN101106381A (en) * 2007-08-09 2008-01-16 上海交通大学 Hierarchical low density check code decoder and decoding processing method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SU Lingjie et al.: "Implementation of a Novel LDPC Decoder Based on the DTMB Standard", Journal of Fuzhou University (Natural Science Edition), vol. 38, no. 2, 30 April 2010 (2010-04-30) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104821860A (en) * 2014-01-31 2015-08-05 三星显示有限公司 A system for relayed data transmission in a high-speed serial link in a display
CN104821860B (en) * 2014-01-31 2019-08-23 三星显示有限公司 System for the relay data transmission in high speed serialization link

Also Published As

Publication number Publication date
CN102142926B (en) 2014-12-31

Similar Documents

Publication Publication Date Title
CN101106381B (en) Hierarchical low density check code decoder and decoding processing method
US8966339B1 (en) Decoder supporting multiple code rates and code lengths for data storage systems
CN101771421A (en) Ultrahigh-speed and low-power-consumption QC-LDPC code decoder based on TDMP
CN102377437B (en) Method and device for coding quasi-cyclic low density parity check codes
CN102543209B (en) The error correction device of multi-channel flash memory controller, method and multi-channel flash memory controller
US9195536B2 (en) Error correction decoder and error correction decoding method
CN104868925A (en) Encoding method, decoding method, encoding device and decoding device of structured LDPC codes
US20150227419A1 (en) Error correction decoder based on log-likelihood ratio data
US9250996B2 (en) Multicore type error correction processing system and error correction processing apparatus
CN103227646A (en) Method, apparatus, computer program product and device providing semi-parallel low density parity check decoding using a block structured parity check matrix
CN101777921B (en) Structured LDPC code decoding method and device for system on explicit memory chip
CN101188426B (en) Decoder for parallel processing of LDPC code of aligning cycle structure and its method
CN101800559A (en) High-speed configurable QC-LDPC code decoder based on TDMP
CN105814799B (en) Error correction decoding device
CN113783576A (en) Method and apparatus for vertical layered decoding of quasi-cyclic low density parity check codes constructed from clusters of cyclic permutation matrices
CN110868225B (en) LDPC code decoder
EP2992429B1 (en) Decoder having early decoding termination detection
CN101692611A (en) Multi-standard LDPC encoder circuit base on SIMD architecture
US20140281794A1 (en) Error correction circuit
CN111313912B (en) LDPC code encoder and encoding method
CN102142926A (en) Processing method, processing unit and decoder for reducing resource consumption while ensuring throughput
CN102594369B (en) Quasi-cyclic low-density parity check code decoder based on FPGA (field-programmable gate array) and decoding method
CN102299719B (en) Non-iterative type LDPC code decoder
CN102611462B (en) LDPC-CC (Low-Density Parity-Check Convolution Codes) decoding algorithm and decoder
CN104052500A (en) LDPC code translator and implementation method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant