CN102142926B - Processing method, processing unit and decoder for reducing resource consumption while ensuring throughput


Info

Publication number: CN102142926B
Application number: CN201010254630.3A
Authority: CN (China)
Other versions: CN102142926A (Chinese)
Prior art keywords: data, multiplexer, block, switching network, resource consumption
Legal status: Active (granted)
Inventors: 常德远, 肖治宇, 喻凡, 李扬
Original and current assignee: Huawei Technologies Co Ltd
Application filed by Huawei Technologies Co Ltd; publication of application CN102142926A; application granted; publication of grant CN102142926B

Landscapes

  • Time-Division Multiplex Systems (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention provides a processing method, a processing unit and a decoder that reduce resource consumption while ensuring throughput. The decoder comprises N first switching networks, Q processing units and N second switching networks. Each first switching network consists of Q second multiplexers, each second multiplexer being an m:1 multiplexer. The Q processing units are connected to the Q second multiplexers of each first switching network and process, in parallel, the data output by the N first switching networks. Each second switching network consists of Q second demultiplexers, each second demultiplexer being a 1:m demultiplexer; the Q second demultiplexers of each second switching network are connected to the Q processing units respectively, and each second switching network demultiplexes its Q input data and outputs L data. According to the embodiments of the invention, resource consumption is reduced while throughput is ensured.

Description

Processing method, processing unit and decoder for reducing resource consumption while ensuring throughput
Technical field
The present invention relates to data processing technology, and in particular to a processing method, a processing unit and a decoder that reduce resource consumption while ensuring throughput.
Background art
Low-Density Parity-Check (LDPC) codes are a class of forward error correction (FEC) codes whose gain characteristics can approach the Shannon limit, but decoding an LDPC code consumes a large amount of logic resources. Quasi-cyclic LDPC (QC-LDPC) codes are a special class of LDPC codes: besides the sparsity common to all LDPC codes, their check matrices have the characteristic feature of being composed of a series of circulant matrices, called the submatrices of the check matrix. For example, a check matrix may be composed of M × N degree-1 submatrices arranged in M rows and N columns, each submatrix being an L × L matrix. While retaining good performance, QC-LDPC codes are easier to encode and decode.
In an existing LDPC decoder, the check matrix can be divided into M layers by rows, or into N layers by columns, each layer containing L rows or L columns. Within each layer, L serial processing units (SPUs) process in parallel the codeword information corresponding to the L rows or L columns, while each SPU serially processes the codeword information corresponding to one row, i.e. it can output a processing result only after all data associated with that row have been read in. Furthermore, the layers are processed serially: only after the SPUs finish the information of one layer do they process the next layer. For a decoder requiring high throughput, the logic resources consumed by such a serial decoding structure are huge, even prohibitive in practice.
In the course of making the present invention, the inventors found at least the following problem in the prior art: achieving high throughput requires large resource consumption.
Summary of the invention
Embodiments of the present invention provide a processing method, a processing unit and a decoder that reduce resource consumption while ensuring throughput, so as to solve the prior-art problem of large resource consumption at high throughput.
In one aspect, an embodiment of the invention provides a processing unit, comprising:
P first multiplexers, configured to multiplex N input data and output P data, wherein each first multiplexer is an n:1 multiplexer, n is a pre-determined optimal block count, P is the number of data in each block, and N = n × P;
P processing modules, connected to the P first multiplexers respectively, configured to process in parallel the data output by the first multiplexers;
P first demultiplexers, connected to the P processing modules respectively, configured to demultiplex the P data output by the processing modules and output N data, wherein each first demultiplexer is a 1:n demultiplexer.
In another aspect, the invention provides a decoder, comprising:
N first switching networks, each composed of Q second multiplexers, each second multiplexer being an m:1 multiplexer; each first switching network is configured to multiplex L input data and output Q data, wherein m is a pre-determined optimal sublayer count of the check matrix, N is the number of columns of the check matrix, L is the number of rows or columns contained in each layer of the check matrix, Q is the number of rows or columns contained in each sublayer of the check matrix, and L = m × Q;
Q processing units, connected to the Q second multiplexers of each first switching network respectively, configured to process in parallel the data output by the N first switching networks;
N second switching networks, each composed of Q second demultiplexers, each second demultiplexer being a 1:m demultiplexer; the Q second demultiplexers of each second switching network are connected to the Q processing units respectively, and each second switching network is configured to demultiplex Q input data and output L data.
In yet another aspect, an embodiment of the invention provides a decoder comprising the above processing unit, and further comprising:
N switching networks, each connected to each processing unit respectively, wherein each switching network comprises at least two shift modules with different step sizes, N is the number of columns of the check matrix, and L is the number of rows or columns contained in each layer of the check matrix.
In one aspect, an embodiment of the invention provides a processing method for reducing resource consumption while ensuring throughput, comprising:
multiplexing, by P first multiplexers, N input data and outputting P data, wherein each first multiplexer is an n:1 multiplexer, n is a pre-determined optimal block count, P is the number of data in each block, and N = n × P;
processing the P data in parallel by P processing modules;
demultiplexing, by P first demultiplexers, the P data after parallel processing and outputting N data, wherein each first demultiplexer is a 1:n demultiplexer.
In another aspect, an embodiment of the invention provides a processing method for reducing resource consumption while ensuring throughput, comprising:
receiving, by each first switching network, L input data and outputting Q data, wherein there are N first switching networks in total, each composed of Q second multiplexers, each second multiplexer is an m:1 multiplexer, m is a pre-determined optimal sublayer count of the check matrix, N is the number of columns of the check matrix, L is the number of rows or columns contained in each layer of the check matrix, Q is the number of rows or columns contained in each sublayer of the check matrix, and L = m × Q;
processing in parallel, by Q processing units, the data output by the N first switching networks;
demultiplexing, by each second switching network, Q input data and outputting L data, wherein there are N second switching networks in total, each composed of Q second demultiplexers, and each second demultiplexer is a 1:m demultiplexer.
As can be seen from the above technical solutions, the embodiments of the present invention multiplex the input data so that fewer modules are needed for subsequent parallel processing, which reduces resource consumption; the embodiments therefore reduce resource consumption while ensuring high throughput.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of an LDPC decoder in the prior art;
Fig. 2 is a schematic structural diagram of the check matrix of a QC-LDPC code in the prior art;
Fig. 3 is a timing diagram of a prior-art message processor processing data layer by layer;
Fig. 4 is a timing diagram of a message processor in an embodiment of the present invention processing data sublayer by sublayer;
Fig. 5 is a schematic structural diagram of a decoder after sublayer division in an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of the decoder of the first embodiment of the invention;
Fig. 7 is a schematic flowchart of the method of the second embodiment of the invention;
Fig. 8 is a timing diagram of a prior-art processing unit performing fully parallel processing on input data;
Fig. 9 is a timing diagram of a prior-art processing unit performing fully serial processing on input data;
Fig. 10 is a schematic structural diagram of the processing unit of the third embodiment of the invention;
Fig. 11 is a schematic flowchart of the method of the fourth embodiment of the invention;
Fig. 12 is a schematic structural diagram of the switching network of the fifth embodiment of the invention;
Fig. 13 is a schematic structural diagram of the decoder of the sixth embodiment of the invention;
Fig. 14 is a schematic flowchart of the method of the seventh embodiment of the invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Fig. 1 is a schematic structural diagram of a prior-art LDPC decoder, comprising a memory 11, a switching network 12, a message processor 13, a control sequence generator 14 and a process controller 15. The memory 11 stores codeword information, for example a posteriori probability (APP) information. The data flow among these modules is as follows: a block of the codeword information is read from the memory 11 and fed into the switching network 12, which permutes it into the desired order and feeds it to the message processor 13. The message processor 13, which consists of parallel processing units, processes the data. The updated values produced by this processing are sent back to the switching network 12, which permutes them back into the original order; they are then stored to the corresponding positions of the memory 11, overwriting the previously read data. This completes one update of one block of data. The next block is then read and the above process repeated until the whole codeword has been traversed, which completes one iterative update of the codeword. After a preset number of such iterations, decoding is complete and the final decoding result is stored in the memory 11. The order in which the switching network 12 permutes the data is controlled by the control sequence generator 14, which generates control signals so that the switching network 12 applies different permutations to its input data. In addition, each module performs read and write operations (for example, data are first read from the memory 11 and later written back to it), so each module requires read/write timing control, which is provided by the process controller 15. The concrete data flow and control flow are shown in Fig. 1, where solid lines represent data flow and dotted lines represent control flow.
Fig. 2 is a schematic structural diagram of the check matrix of a QC-LDPC code in the prior art. Referring to Fig. 2, the check matrix H is composed of M rows and N columns of submatrices, each submatrix being an L × L matrix obtained by cyclically shifting an identity matrix, i.e. a matrix whose diagonal elements are 1 and whose remaining elements are 0.
According to the positions of the submatrices, the check matrix can be divided into M horizontal layers of L rows each, or into N vertical layers of L columns each. During decoding, to preserve decoding performance while reducing the number of iterations, a hybrid belief propagation (BP) algorithm can be adopted, in which the information corresponding to the check matrix is computed layer by layer, and within the same iteration the information updated by one layer is used in the computation of the next layer.
Fig. 3 is a timing diagram of a prior-art message processor processing data layer by layer. Referring to Fig. 3, suppose each layer contains L = 99 rows; then the sequence of operations corresponding to rows 1–99 of the 1st layer comprises read (RE), operation-1, operation-2, operation-3 and write (WR), where operation-1, operation-2 and operation-3 are the logic operations required by the message processor. This description takes three operations as an example, but the number of logic operations is not limited to 3.
Since the layers are processed serially, the sequence of operations for rows 1–99 of the 2nd layer must wait until the sequence of operations of the 1st layer completes; that is, referring to Fig. 3, the read (RE) of the 2nd layer follows the write (WR) of the 1st layer. Because each layer must wait for the previous one, wait clocks have to be inserted, and logic resources sit idle and are wasted.
The above layer-parallel approach requires L processing units, and the layers must still be processed serially, incurring a large delay; therefore, reaching a given throughput requirement consumes considerable resources.
To reduce resource consumption, embodiments of the present invention optimize the number of processing units in the message processor. This is done by further dividing each layer of the check matrix into multiple sublayers and processing the data corresponding to each sublayer in parallel, instead of processing the data of a whole layer at once. For example, with L = 99 rows per layer, each layer can be divided into m = 3 sublayers, each sublayer containing Q = 33 rows. Processing in units of layers would require 99 processing units, whereas processing in units of sublayers requires only Q = 33 processing units.
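The arithmetic above can be sketched as follows. This is an illustrative model, not part of the patent disclosure: the round-robin assignment of rows to processing units is one plausible mapping, chosen only to show that m sublayers of Q rows cover all L rows of a layer.

```python
# Illustrative sketch: one layer of L = 99 rows is split into m = 3
# sublayers, so only Q = L // m = 33 processing units are needed;
# each unit handles one row per sublayer pass.
L, m = 99, 3
Q = L // m  # rows per sublayer = number of processing units

# Sublayer s holds rows [s*Q, (s+1)*Q); unit u handles row s*Q + u in pass s.
schedule = [[s * Q + u for u in range(Q)] for s in range(m)]

assert len(schedule) == m and all(len(p) == Q for p in schedule)
# Every row of the layer is processed exactly once across the m passes.
assert sorted(r for p in schedule for r in p) == list(range(L))
```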
Fig. 4 is a timing diagram of a message processor in an embodiment of the present invention processing data sublayer by sublayer. Referring to Fig. 4, because the information of the different sublayers is uncorrelated (each row and each column of a submatrix of the H matrix contains only one nonzero element), the data corresponding to the sublayers can be pipelined: unlike in Fig. 3, the data of the 2nd sublayer need not wait until the data of the 1st sublayer have finished executing. For example, in Fig. 4 the RE of the 2nd sublayer does not wait for the WR of the 1st sublayer to complete. In the scheme of Fig. 3, the execution delay between the codeword data of consecutive layers equals the operating time of a whole layer, whereas in the pipelined scheme after sublayer division this delay is less than the operating time of a layer; the delay is therefore reduced and the throughput improved.
Because this embodiment divides each layer into sublayers, the optimal sublayer count m must be determined. The throughput and resource consumption under different values of m can be calculated, and the m that meets the throughput requirement with minimum resource consumption is taken as the optimal sublayer count. The logic structure under a given m can be as follows.
Fig. 5 is a schematic structural diagram of the decoder after sublayer division in an embodiment of the present invention. Referring to Fig. 5, it comprises a memory unit 51, first switching networks 52, processing units 53 and second switching networks 54. The memory unit 51 is divided into N blocks, each block holding L items of APP information; each first switching network 52 is composed of Q second multiplexers (MUX), each an m:1 multiplexer; each second switching network 54 is composed of Q second demultiplexers, each a 1:m demultiplexer, where L = m × Q.
After the APP information is read from the memory unit 51, it is fed into the first switching network 52 to obtain the desired order and number of data; after processing by the processing units 53, the data are restored to the original order and number by the second switching network 54, and are finally stored back into the memory unit 51 to update the original values.
With the structure shown in Fig. 5, the throughput and resource consumption for different sublayer counts can be calculated. For example, once m is determined, the memory units, multiplexers, processing units and demultiplexers to be used are all determined, so the resource consumption can be counted; likewise, once m is determined, the throughput can be calculated from the decoding delay and the codeword length. Therefore, the throughput and resource consumption for each candidate m can be computed, and the m that reaches the throughput requirement with minimum resource consumption is taken as the optimal sublayer count.
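The selection procedure just described can be sketched as a search over candidate values of m. The cost and throughput models below (`unit_cost`, `mux_cost`, throughput proportional to 1/m) are invented placeholders for illustration only — the patent states only that both quantities become computable once m is fixed.

```python
# Hedged sketch: pick the sublayer count m that meets a throughput
# requirement with minimum estimated resource consumption.
def pick_optimal_m(L, unit_cost=100, mux_cost=2, required_throughput=0.25):
    best = None
    for m in range(1, L + 1):
        if L % m:
            continue                                  # need L = m * Q exactly
        Q = L // m
        resources = Q * unit_cost + Q * m * mux_cost  # Q units + Q m:1 muxes
        throughput = 1.0 / m                          # m serial passes per layer
        if throughput >= required_throughput:
            if best is None or resources < best[1]:
                best = (m, resources)
    return best  # (optimal m, its resource estimate)

m_opt, cost = pick_optimal_m(99)
assert 99 % m_opt == 0 and 1.0 / m_opt >= 0.25
```

Under this toy model, m = 3 wins for L = 99: it needs only 33 processing units while still meeting the (assumed) throughput floor, whereas m = 9 and beyond would be cheaper but too slow.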
From the above analysis, once the optimal sublayer count m is obtained, the existing message processor can be optimized: whereas the existing message processor comprises L parallel processing units, the message processor in the embodiments of the invention comprises multiplexers, processing units and demultiplexers, with the number of processing units being Q. Although this embodiment adds multiplexers and demultiplexers, logic resource consumption is dominated by the processing units, and this embodiment markedly reduces their number (to 1/m of the original), so the logic resource overhead is reduced. Specifically, this embodiment provides the following decoder.
Fig. 6 is a schematic structural diagram of the decoder of the first embodiment of the invention; this embodiment reduces resource consumption while ensuring throughput. Referring to Fig. 6, the embodiment comprises Q × N second multiplexers 61, Q processing units 62 and Q × N second demultiplexers 63. Every Q second multiplexers 61 form one first switching network; there are N first switching networks in total, each composed of Q second multiplexers 61 (second multiplexer 1 to second multiplexer Q), each second multiplexer 61 being an m:1 multiplexer. Each first switching network multiplexes L input data and outputs Q data, where m is the pre-determined optimal sublayer count of the check matrix, N is the number of columns of the check matrix, L is the number of rows or columns contained in each layer of the check matrix, Q is the number of rows or columns contained in each sublayer, and L = m × Q. The Q processing units 62 (processing unit 1 to processing unit Q) are connected to the Q second multiplexers 61 of each first switching network respectively and process in parallel the data output by the N first switching networks. There are N second switching networks, each composed of Q second demultiplexers 63 (second demultiplexer 1 to second demultiplexer Q), each second demultiplexer 63 being a 1:m demultiplexer; each second switching network is connected to the Q processing units respectively, and demultiplexes Q input data into L output data.
This embodiment needs only Q processing units rather than the L processing units of the prior art, reducing resource consumption, and the data corresponding to the sublayers can be pipelined, improving throughput. This embodiment therefore reduces resource consumption while ensuring throughput.
Corresponding to this structure, an embodiment of the invention provides a processing method for reducing resource consumption while ensuring throughput, as follows:
Fig. 7 is a schematic flowchart of the method of the second embodiment of the invention. This embodiment may use the message processor described in Fig. 6 to process data. Referring to Fig. 7, the method comprises:
Step 71: each first switching network receives L input data and outputs Q data, wherein there are N first switching networks in total, each composed of Q second multiplexers, each second multiplexer is an m:1 multiplexer, m is the pre-determined optimal sublayer count of the check matrix, N is the number of columns of the check matrix, L is the number of rows or columns contained in each layer of the check matrix, Q is the number of rows or columns contained in each sublayer, and L = m × Q;
Step 72: Q processing units process in parallel the data output by the N first switching networks;
Step 73: each second switching network demultiplexes Q input data and outputs L data, wherein there are N second switching networks in total, each composed of Q second demultiplexers, and each second demultiplexer is a 1:m demultiplexer.
This embodiment needs only Q processing units rather than the L processing units of the prior art, reducing resource consumption, and the data corresponding to the sublayers can be pipelined, improving throughput. This embodiment therefore reduces resource consumption while ensuring throughput.
The above optimizes the number of processing units; the processing unit itself can also be optimized, as follows.
Each processing unit processes the data corresponding to one row at a time; for example, if N is 10, each processing unit processes 10 data at a time. The processing unit may process these 10 data fully in parallel, or fully serially.
Fig. 8 is a timing diagram of a prior-art processing unit performing fully parallel processing on input data. Referring to Fig. 8, the 10 data are processed simultaneously, through RE, min and WR operations, where min denotes a minimum-finding operation; this operation is only an example, and other operations may be performed.
Fully parallel processing computes the 10 data simultaneously, but the min operation occupies very large resources and requires multiple clock cycles to complete.
Fig. 9 is a timing diagram of a prior-art processing unit performing fully serial processing on input data. Referring to Fig. 9, the data are processed one by one: each datum undergoes a read (RE) operation and a min operation in turn, and after all 10 data have undergone the min operation, the write (WR) operations are performed in turn.
Fully serial processing requires little logic resource, and each min operation completes in one clock cycle, but the decoding delay is then largest, which reduces the throughput of the decoder.
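The parallel/serial trade-off can be made concrete with an assumed cost model (not taken from the patent): a fully parallel minimum over n values can be built as a comparator tree of n − 1 comparators and about log2(n) pipeline stages, while a fully serial minimum reuses one comparator over n − 1 cycles.

```python
import math

# Illustrative cost model for computing the minimum of n values.
def min_costs(n):
    parallel = {"comparators": n - 1,                  # full comparator tree
                "cycles": math.ceil(math.log2(n))}     # tree depth in stages
    serial = {"comparators": 1,                        # one reused comparator
              "cycles": n - 1}                         # one compare per cycle
    return parallel, serial

par, ser = min_costs(10)
assert par == {"comparators": 9, "cycles": 4}
assert ser == {"comparators": 1, "cycles": 9}
```

The blocked scheme described next sits between these two extremes: it trades some of the serial scheme's latency for some of the parallel scheme's resources.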
In this embodiment, to meet the throughput requirement while reducing the logic resource overhead as far as possible, the data input to the processing unit can be divided into blocks, for example into n blocks of P data each, and the blocks then processed one by one with P-way parallelism. The optimal block count n must then be determined. Similarly to the design of the message processor, the throughput and resource consumption of the processing unit can be calculated for different block counts, and the block count that reaches the throughput requirement with minimum resource consumption is determined as the optimal block count.
Fig. 10 is a schematic structural diagram of the processing unit of the third embodiment of the invention, comprising P first multiplexers 101 (first multiplexer 1 to first multiplexer P), P processing modules 102 (processing module 1 to processing module P) and P first demultiplexers 103 (first demultiplexer 1 to first demultiplexer P). The P first multiplexers 101 multiplex N input data and output P data, where each first multiplexer is an n:1 multiplexer, n is the pre-determined optimal block count, P is the number of data in each block, and N = n × P. The P processing modules 102 are connected to the P first multiplexers 101 respectively and process in parallel the data output by the first multiplexers 101. The P first demultiplexers 103 are connected to the P processing modules 102 respectively; they demultiplex the P data output by the processing modules 102 and output N data, each first demultiplexer 103 being a 1:n demultiplexer.
Similarly to the division of layers into sublayers, this embodiment can calculate the throughput and logic resource consumption for different block counts to determine the optimal block count n. Afterwards, the n:1 multiplexers, 1:n demultiplexers and processing modules described above are used for processing. The processing module 102 may be implemented as the parallel-processing module of an existing processing unit.
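A minimal behavioural sketch of this mux/process/demux data path, under assumed behaviour: N inputs are split into n blocks of P values, each block is steered onto the P processing modules in turn, and the demultiplexers restore every result to its original position. The per-value "processing" here (negation) and the function name `process_blocked` are stand-ins chosen purely for illustration.

```python
# Sketch of the third embodiment's data path: n:1 muxes feed P parallel
# modules block by block; 1:n demuxes put results back in input order.
def process_blocked(data, n, process=lambda x: -x):
    N = len(data)
    assert N % n == 0
    P = N // n                                # data items per block
    out = [None] * N
    for block in range(n):                    # each n:1 mux selects one block
        for p in range(P):                    # P modules work in parallel
            idx = block * P + p
            out[idx] = process(data[idx])     # 1:n demux restores position
    return out

result = process_blocked(list(range(12)), n=3)
assert result == [-x for x in range(12)]
```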
By adopting multiplexers, this embodiment divides the input data into blocks, thereby reducing the number of processing modules while improving the processing delay, and thus reduces resource consumption while ensuring throughput.
Corresponding to this structure, an embodiment of the invention provides a processing method for reducing resource consumption while ensuring throughput, as follows:
Fig. 11 is a schematic flowchart of the method of the fourth embodiment of the invention. This embodiment may use the processing unit described in Fig. 10 to process data. Referring to Fig. 11, the method comprises:
Step 111: P first multiplexers multiplex N input data and output P data, wherein each first multiplexer is an n:1 multiplexer, n is the pre-determined optimal block count, P is the number of data in each block, and N = n × P;
Step 112: P processing modules process the P data in parallel;
Step 113: P first demultiplexers demultiplex the P data after parallel processing and output N data, wherein each first demultiplexer is a 1:n demultiplexer.
By adopting multiplexers, this embodiment divides the input data into blocks, thereby reducing the number of processing modules while improving the processing delay, and thus reduces resource consumption while ensuring throughput.
To ensure a given throughput while reducing resource consumption, the first embodiment and the third embodiment may be implemented separately or together: the number of processing units may be optimized, the interior of each processing unit may be optimized, or both may be optimized at once.
The above divides sublayers in the horizontal direction; it will be understood that the scheme can equally be implemented with sublayers divided in the vertical direction.
The above are optimization schemes for the message processor; the embodiments of the present invention also provide an optimization scheme for the switching network.
Fig. 12 is a schematic structural diagram of the switching network of the fifth embodiment of the invention, comprising at least two shift modules 121, each shift module 121 having a different step size.
Referring to Fig. 12, suppose, for example, that three step sizes are set: step1, step2 and step3. Then the number of positions shifted per input/output pass is k = n1 × step1 + n2 × step2 + n3 × step3, where n1, n2 and n3 are the numbers of moves performed at the corresponding step sizes.
In existing ldpc decoder, switching network is the exchange based on cyclic shift, and such as, input data are 1,2,3 ..., N, exporting data is k, k+1, k+2 ..., N, 1,2,3 ..., k-1, supposes that step length of cyclic shift is 1, then obtaining above-mentioned exchange needs mobile k to walk.When k is larger, needs very large time delay, cause throughput lower.In order to improve throughput, switching network also can adopt the mode of circuit switched to realize, and in which, arrange a link between often kind of possible inputoutput data, which can improve throughput, but when n is large, the resource of consumption is very large.
The figure place realizing movement is k, if adopt existing fixed step size be 1 displacement mode, need mobile k to walk, and the present embodiment only needs n1+n2+n3, when k is larger, n1+n2+n3 is far smaller than k, and therefore, the present embodiment can reduce time delay, improve throughput; In addition, when the data of input and output are respectively N number of, if adopt the mode of existing circuit switched, need to dispose N number of circuit, and the present embodiment only need dispose 3 (number of the step-length of setting) individual circuit, when n is large, N is far longer than 3, therefore, the present embodiment can reduce resource consumption.
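A sketch of the staged, variable-step shifting described above: the required shift k is first decomposed into per-step-size counts (n1, n2, n3, ...), then the cyclic shifts are applied stage by stage. The greedy decomposition is an assumption for illustration; the embodiment does not prescribe how the counts are obtained:

```python
def decompose_shift(k, steps):
    """Greedily express k as sum(count_i * step_i), with step sizes
    given in descending order; assumes the smallest step is 1 so that
    any k is reachable."""
    counts = []
    for step in steps:
        counts.append(k // step)
        k %= step
    return counts

def staged_cyclic_shift(data, steps, counts):
    """Apply each step size the given number of times; the total number
    of moves is sum(counts), versus k moves in the fixed-step scheme."""
    for step, cnt in zip(steps, counts):
        shift = (step * cnt) % len(data)
        data = data[shift:] + data[:shift]  # left cyclic shift
    return data
```

For example, with step sizes (64, 8, 1) a shift of k = 100 decomposes into counts (1, 4, 4), i.e. 9 moves instead of 100.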
The multiple step sizes used in this embodiment can be set according to actual needs; they are not limited to 3 in number, and their specific values are likewise unrestricted.
This embodiment adopts variable-step, staged shifting, which avoids both the long delay caused by single-step shifting and the high resource consumption caused by multi-line circuit switching, thereby reducing delay and resource consumption.
The variable-step switching network of this embodiment is not limited to LDPC decoders; it is applicable to any data routing and switching scenario with N inputs and N outputs.
Based on the processing unit shown in Figure 10 and the switching network shown in Figure 12, an embodiment of the present invention provides a decoder. Referring to Figure 13, a schematic structural diagram of a decoder according to a sixth embodiment of the present invention, the decoder comprises N switching networks 131 and L processing units 132, where N is the number of columns of the check matrix and L is the number of rows or columns contained in each layer of the check matrix. Specifically, the structure of each switching network 131 may be as shown in Figure 12, and the structure of each processing unit 132 may be as shown in Figure 10.
Corresponding to this structure, an embodiment of the present invention provides a processing method for reducing resource consumption while ensuring throughput, as follows:
Figure 14 is a schematic flowchart of a method according to a seventh embodiment of the present invention. Referring to Figure 14, the method comprises:
Step 141: use N switching networks to perform at least two shift operations on the input data, the step size of each shift being different, and output N data;
Step 142: use P first multiplexers to multiplex the N input data and output P data, wherein each first multiplexer is an n:1 multiplexer, n is the optimum block count obtained in advance, P is the number of data in each block, and N = n × P;
Step 143: use P processing modules to process the P data in parallel;
Step 144: use P first demultiplexers to demultiplex the P data after parallel processing and output N data, wherein each first demultiplexer is a 1:n demultiplexer.
This embodiment adopts variable-step, staged shifting, which avoids both the long delay caused by single-step shifting and the high resource consumption caused by multi-line circuit switching, thereby reducing delay and resource consumption.
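The flow of steps 141–144 can be modeled end to end in Python. Again this is purely illustrative: the shift schedule is assumed to be given as (step, count) pairs, and `process` stands in for an arbitrary processing module:

```python
def decode_pass(data, shifts, n, process):
    """Model steps 141-144: staged cyclic shifts with differing step
    sizes, then n:1 multiplexing into P = N/n parallel processing
    modules and 1:n demultiplexing back to N outputs."""
    # Step 141: each (step, count) pair applies `count` shifts of `step`
    for step, count in shifts:
        s = (step * count) % len(data)
        data = data[s:] + data[:s]
    # Steps 142-144: block-multiplexed parallel processing
    N = len(data)
    P = N // n
    out = [None] * N
    for c in range(n):  # one pass per multiplexer cycle
        results = [process(data[p * n + c]) for p in range(P)]
        for p in range(P):
            out[p * n + c] = results[p]
    return out
```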
Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be implemented by program instructions running on related hardware. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium capable of storing program code, such as ROM, RAM, magnetic disk or optical disc.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features replaced by equivalents, without causing the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (4)

1. A decoder, characterized by comprising:
N first switching networks, each first switching network consisting of Q second multiplexers, each second multiplexer being an m:1 multiplexer, each first switching network being configured to multiplex L input data and output Q data, wherein m is a predetermined layered optimization number of a check matrix, N is the number of columns of the check matrix, L is the number of rows or columns contained in each layer of the check matrix, Q is the number of rows or columns contained in each sublayer of the check matrix, and L = m × Q, and wherein the layered optimization number m is determined as follows: compute the resource consumption and throughput under different layer counts, and take as the layered optimization number m the layer count with the lowest resource consumption among those meeting the throughput requirement;
Q processing units, respectively connected to the Q second multiplexers of each first switching network, configured to process in parallel the data output by the N first switching networks;
N second switching networks, each second switching network consisting of Q second demultiplexers, each second demultiplexer being a 1:m demultiplexer, the Q second demultiplexers of each second switching network being respectively connected to the Q processing units, each second switching network being configured to demultiplex Q input data and output L data;
wherein each processing unit comprises:
P first multiplexers, configured to multiplex N input data and output P data, wherein each first multiplexer is an n:1 multiplexer, n is an optimum block count obtained in advance, P is the number of data in each block, and N = n × P, and wherein the optimum block count n is determined as follows: compute the resource consumption and throughput under different block counts, and take as the optimum block count n the block count with the lowest resource consumption among those meeting the throughput requirement;
P processing modules, respectively connected to the P first multiplexers, configured to process in parallel the data output by the first multiplexers;
P first demultiplexers, respectively connected to the P processing modules, configured to demultiplex the P data output by the processing modules and output N data, wherein each first demultiplexer is a 1:n demultiplexer.
2. A decoder, characterized by comprising L processing units, and further comprising:
N switching networks, each switching network respectively connected to each processing unit, each switching network comprising at least two shift modules, the step size of each shift module being different, wherein N is the number of columns of a check matrix and L is the number of rows or columns contained in each layer of the check matrix;
wherein each processing unit comprises:
P first multiplexers, configured to multiplex N input data and output P data, wherein each first multiplexer is an n:1 multiplexer, n is an optimum block count obtained in advance, P is the number of data in each block, and N = n × P, and wherein the optimum block count n is determined as follows: compute the resource consumption and throughput under different block counts, and take as the optimum block count n the block count with the lowest resource consumption among those meeting the throughput requirement;
P processing modules, respectively connected to the P first multiplexers, configured to process in parallel the data output by the first multiplexers;
P first demultiplexers, respectively connected to the P processing modules, configured to demultiplex the P data output by the processing modules and output N data, wherein each first demultiplexer is a 1:n demultiplexer.
3. A processing method for reducing resource consumption while ensuring throughput, characterized by comprising:
using P first multiplexers to multiplex N input data and output P data, wherein each first multiplexer is an n:1 multiplexer, n is an optimum block count obtained in advance, P is the number of data in each block, and N = n × P, and wherein the optimum block count n is determined as follows: compute the resource consumption and throughput under different block counts, and take as the optimum block count n the block count with the lowest resource consumption among those meeting the throughput requirement;
using P processing modules to process the P data in parallel;
using P first demultiplexers to demultiplex the P data after parallel processing and output N data, wherein each first demultiplexer is a 1:n demultiplexer;
using N switching networks to perform at least two shift operations on input data, the step size of each shift being different, and outputting N data respectively to the P first multiplexers.
4. A processing method for reducing resource consumption while ensuring throughput, characterized by comprising:
using each of N first switching networks to receive L input data and output Q data, wherein each first switching network consists of Q second multiplexers, each second multiplexer is an m:1 multiplexer, m is a predetermined layered optimization number of a check matrix, N is the number of columns of the check matrix, L is the number of rows or columns contained in each layer of the check matrix, Q is the number of rows or columns contained in each sublayer of the check matrix, and L = m × Q, and wherein the layered optimization number m is determined as follows: compute the resource consumption and throughput under different layer counts, and take as the layered optimization number m the layer count with the lowest resource consumption among those meeting the throughput requirement;
using Q processing units to process in parallel the data output by the N first switching networks;
using each of N second switching networks to demultiplex Q input data and output L data, wherein each second switching network consists of Q second demultiplexers and each second demultiplexer is a 1:m demultiplexer;
wherein each processing unit processes the data output by the N first switching networks as follows:
using P first multiplexers to multiplex N input data and output P data, wherein each first multiplexer is an n:1 multiplexer, n is an optimum block count obtained in advance, P is the number of data in each block, and N = n × P, and wherein the optimum block count n is determined as follows: compute the resource consumption and throughput under different block counts, and take as the optimum block count n the block count with the lowest resource consumption among those meeting the throughput requirement;
processing the P data in parallel;
using P first demultiplexers to demultiplex the P data after parallel processing and output N data, wherein each first demultiplexer is a 1:n demultiplexer.
CN201010254630.3A 2010-08-13 2010-08-13 Processing method, processing unit and decoder for reducing resource consumption while ensuring throughput Active CN102142926B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010254630.3A CN102142926B (en) 2010-08-13 2010-08-13 Processing method, processing unit and decoder for reducing resource consumption while ensuring throughput


Publications (2)

Publication Number Publication Date
CN102142926A CN102142926A (en) 2011-08-03
CN102142926B true CN102142926B (en) 2014-12-31

Family

ID=44410181

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010254630.3A Active CN102142926B (en) 2010-08-13 2010-08-13 Processing method, processing unit and decoder for reducing resource consumption while ensuring throughput

Country Status (1)

Country Link
CN (1) CN102142926B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9536495B2 (en) * 2014-01-31 2017-01-03 Samsung Display Co., Ltd. System for relayed data transmission in a high-speed serial link

Citations (2)

Publication number Priority date Publication date Assignee Title
CN101106381A (en) * 2007-08-09 2008-01-16 上海交通大学 Hierarchical low density check code decoder and decoding processing method
CN101212277A (en) * 2006-12-29 2008-07-02 中兴通讯股份有限公司 Multi-protocol supporting LDPC decoder


Non-Patent Citations (1)

Title
Su Lingjie et al., "Implementation of a novel LDPC decoder based on the DTMB standard," Journal of Fuzhou University (Natural Science Edition), Vol. 38, No. 2, 2010. *



Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant