Summary of the invention
The technical problem to be solved by the invention is to provide a method for Turbo decoding under LTE, and a Turbo decoder, that satisfy the 100 Mbit/s downlink transmission rate required by LTE.
To solve the above problem, the invention discloses a method for Turbo decoding in LTE, comprising: receiving data to be decoded; based on the data volume of the data to be decoded and according to a first preset rule, dividing the data to be decoded into N blocks, assigning a corresponding Turbo decoder processing unit to each block, and decoding the N blocks in parallel by the corresponding Turbo decoder processing units, where N is an integer greater than or equal to 1; when a Turbo decoder processing unit processes the block assigned to it, dividing that block into M segments based on its data volume and according to a second preset rule, and decoding the M segments serially, one after another, by that processing unit so as to complete the decoding of the whole block, where M is an integer greater than or equal to 1; and outputting the decoding result obtained by each Turbo decoder processing unit; wherein, among the M segments, the tail of the earlier segment and the head of the later segment of any two adjacent segments overlap by a certain number of bits.
Preferably, the decoding of one of the M segments by a Turbo decoder processing unit specifically comprises:
A first a posteriori decoding step, which completes one iteration; its input is the data to be decoded Sym and the redundant information P1 of the current segment, or the deinterleaving result of the deinterleaving step and the redundant information P1;
An interleaving step, which interleaves the iteration result of the first a posteriori decoding step so as to scramble the information;
A second a posteriori decoding step, which completes one iteration; its input is the interleaving result of the interleaving step and the redundant information P2;
A deinterleaving step, which deinterleaves the iteration result of the second a posteriori decoding step to obtain the deinterleaving result.
Preferably, after the first and second a posteriori decoding steps have completed 8 iterations in total, the corresponding deinterleaving result is output as the decoding result of the current Turbo decoder processing unit for the current segment; or, when the deinterleaving result of the deinterleaving step and the CRC check data satisfy a preset condition, the corresponding deinterleaving result is output as the decoding result of the current Turbo decoder processing unit for the current segment.
Preferably, the decoding results obtained by the Turbo decoder processing units are output as follows: the decoding result of a Turbo decoder processing unit for each segment is stored temporarily; after that processing unit has finished decoding all M segments, the stored results are gathered to obtain its decoding result for the corresponding block; the decoding results of all Turbo decoder processing units for their blocks are then gathered to obtain the total decoding result, which is output.
Preferably, the decoding results obtained by the Turbo decoder processing units may also be output as follows: as soon as a Turbo decoder processing unit has finished decoding one segment, the result is output directly as a partial decoding result.
According to another embodiment of the invention, a Turbo decoder is also disclosed, comprising:
A receiving unit, for receiving data to be decoded;
A first controller, for dividing the data to be decoded into N blocks based on its data volume and according to a first preset rule, and assigning a corresponding Turbo decoder processing unit to each block, so that the N blocks are decoded in parallel by the corresponding Turbo decoder processing units; where N is an integer greater than or equal to 1;
At least one Turbo decoder processing unit, for carrying out the decoding process and obtaining a decoding result;
A second controller, for dividing the block assigned to a Turbo decoder processing unit into M segments, based on the data volume of that block and according to a second preset rule, when the processing unit processes the block, so that the M segments are decoded serially, one after another, by that processing unit to complete the decoding of the whole block; where M is an integer greater than or equal to 1;
An output unit, for outputting the decoding result obtained by each Turbo decoder processing unit;
Wherein, among the M segments, the tail of the earlier segment and the head of the later segment of any two adjacent segments overlap by a certain number of bits.
Preferably, the output unit comprises:
Processing unit output buffers, one for each Turbo decoder processing unit, for temporarily storing the decoding result of that processing unit for each segment; after the processing unit has finished decoding all M segments, the stored results are gathered to obtain its decoding result for the corresponding block;
A decoder output module, for gathering the decoding results of all Turbo decoder processing units for their blocks, obtaining the total decoding result, and outputting it.
Preferably, the Turbo decoder processing unit specifically comprises:
A first a posteriori decoder, which completes one iteration; its input is the data to be decoded Sym and the redundant information P1 of the current segment, or the deinterleaving result of the deinterleaver and the redundant information P1;
An interleaver, which interleaves the iteration result of the first a posteriori decoder so as to scramble the information;
A second a posteriori decoder, which completes one iteration; its input is the interleaving result of the interleaver and the redundant information P2;
A deinterleaver, which deinterleaves the iteration result of the second a posteriori decoder to obtain the deinterleaving result.
Preferably, the Turbo decoder may further comprise a decision unit which, after the first and second a posteriori decoders have completed 8 iterations in total, outputs the corresponding deinterleaving result as the decoding result of the current Turbo decoder processing unit for the current segment;
Or which, when the deinterleaving result of the deinterleaver and the CRC check data satisfy a preset condition, outputs the corresponding deinterleaving result as the decoding result of the current Turbo decoder processing unit for the current segment.
Compared with the prior art, the invention has the following advantages:
The invention uses parallel Turbo decoding processors, divides the data to be decoded into multiple segments, and decodes them with simple windowing. This raises the system transmission rate to the 100 Mbit/s downlink required by LTE, reduces the data storage space and the demands on chip processing performance, and so meets the needs of industrial production. Further, using polynomial interleaving in Turbo encoding and decoding avoids memory read/write conflicts in the interleaving step during decoding.
Detailed description of the embodiments
To make the above objects, features, and advantages of the invention easier to understand, the invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 shows the flow of steps of an embodiment of the method for Turbo decoding in LTE according to the invention, which may specifically comprise:
Step 101: receive the data to be decoded;
The performance targets of 3GPP LTE require peak rates of 100 Mbps downlink and 50 Mbps uplink in a 20 MHz spectrum bandwidth. Existing Turbo decoding procedures cannot meet this requirement, so the efficiency of the whole decoding process must be greatly improved. The idea of the invention is to shorten decoding time through parallelism, and to use windowing to reduce the data storage space and the performance required of devices such as the decoder, interleaver, and deinterleaver, thereby meeting the 3GPP LTE performance targets in the most direct way and at lower cost.
Step 102: based on the data volume of the data to be decoded and according to the first preset rule, divide the data to be decoded into N blocks, assign a corresponding Turbo decoder processing unit to each block, and decode the N blocks in parallel by the corresponding Turbo decoder processing units; where N is an integer greater than or equal to 1;
Specifically, when the data volume of the data to be decoded is large, it can be divided into several blocks that are decoded in parallel by several Turbo decoder processing units, which improves decoding efficiency. How many blocks the data should be divided into, that is, the value of N, can be determined by those skilled in the art through testing or study; the invention does not limit it. Naturally, determining N requires considering not only the data volume to be decoded but also the number of Turbo decoder processing units that are currently idle.
Commonly, the first preset rule may be: decide the number of blocks according to the size of the data.
Alternatively, the first preset rule may decide the number of blocks according to the data structure of the data to be decoded (for example, if a frame of data to be decoded consists of several data blocks, it can be divided into that many blocks for parallel decoding).
Alternatively again, the first preset rule may decide the number of blocks according to the number of available Turbo decoder processing units.
The first preset rules above are only examples, and those skilled in the art can define their own. One of the above rules, or a combination of several, may also be used directly; the invention does not limit this.
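As one possible illustration of a first preset rule, the sketch below combines two of the criteria listed above: data volume and the number of idle processing units. The function names, the `max_block_bits` threshold, and the even-split policy are assumptions made for this example and are not specified by the invention.

```python
import math
from typing import List, Sequence


def choose_block_count(total_bits: int, max_block_bits: int, idle_units: int) -> int:
    """Pick N >= 1 from the data volume, capped by the number of idle units.

    max_block_bits is a hypothetical tuning parameter, not a value from the text.
    """
    if total_bits <= 0 or max_block_bits <= 0 or idle_units <= 0:
        raise ValueError("all arguments must be positive")
    wanted = math.ceil(total_bits / max_block_bits)
    return max(1, min(wanted, idle_units))


def split_into_blocks(data: Sequence[int], n_blocks: int) -> List[List[int]]:
    """Split data into n_blocks contiguous blocks whose sizes differ by at most one."""
    base, extra = divmod(len(data), n_blocks)
    blocks: List[List[int]] = []
    start = 0
    for i in range(n_blocks):
        size = base + (1 if i < extra else 0)
        blocks.append(list(data[start:start + size]))
        start += size
    return blocks
```

With 1024 bits, a 512-bit threshold, and four idle units, this rule would select N = 2; with fewer idle units than wanted blocks, the unit count becomes the limit.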
Step 103: when a Turbo decoder processing unit processes the block assigned to it, divide that block into M segments based on its data volume and according to the second preset rule, and decode the M segments serially, one after another, by that processing unit so as to complete the decoding of the whole block; where M is an integer greater than or equal to 1;
The parallel strategy of step 102 can greatly shorten decoding time, but the number of processing units in a Turbo decoder is usually limited. Even after the data to be decoded has been divided into several blocks, decoding each block remains difficult for a processing unit because each block is still large: the iterative computation still places high demands on the processing unit, and the processing unit needs considerable data storage space during decoding, which increases chip area and cost. The invention therefore segments each block further, so that a Turbo decoder processing unit decodes only one segment at a time. This greatly reduces both the data storage space required by the decoding process and the iterative computation performance required of the processing unit. The second preset rule, which divides a block into M segments, can be set by those skilled in the art according to the application; the factors usually considered include the data volume and the performance of the processing unit.
Step 104: output the decoding result obtained by each Turbo decoder processing unit.
In practice, the output mode can be chosen by those skilled in the art according to the specific application; the invention does not limit it. Two feasible modes are given below as examples.
Mode 1
The Turbo decoder processing unit decodes the M segments serially. After the first segment has been decoded, its decoding result is cached. After the second segment has been decoded, its decoding result is cached as well. When the processing unit has finished decoding all M segments, the cached results are gathered to obtain the decoding result of this processing unit for the corresponding block;
Further, the decoding results of all Turbo decoder processing units for their blocks are gathered to obtain the total decoding result, which is output.
The cache here can sit outside the iterative decoding process and does not need to be accessed frequently; it serves only as temporary storage for decoding results and does not affect the performance of the processing unit.
Mode 2
Mode 1 outputs the result only after all N blocks of the data to be decoded (M segments each) have been decoded. In practice, there may be cases in which output is possible as soon as part of the decoding is finished; mode 2 therefore gives an example of partial decoding with partial output.
Specifically, as soon as a Turbo decoder processing unit has finished decoding one segment, the result is output directly as a partial decoding result for use by subsequent modules.
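The difference between the two output modes can be sketched as follows for a single processing unit. This is a minimal illustration only: `decode_segment` stands for whatever per-segment decoder is used, and both function names are hypothetical.

```python
from typing import Callable, Iterator, List, Sequence

Bits = List[int]
SegmentDecoder = Callable[[Sequence[int]], Bits]


def decode_block_streaming(segments: Sequence[Sequence[int]],
                           decode_segment: SegmentDecoder) -> Iterator[Bits]:
    """Mode 2: hand over each segment's result as soon as it is decoded."""
    for segment in segments:
        yield decode_segment(segment)


def decode_block_buffered(segments: Sequence[Sequence[int]],
                          decode_segment: SegmentDecoder) -> Bits:
    """Mode 1: cache every segment result, then output the whole block at once."""
    buffer: List[Bits] = []
    for partial in decode_block_streaming(segments, decode_segment):
        buffer.append(partial)  # temporary storage outside the decoding loop
    return [bit for partial in buffer for bit in partial]
```

In the buffered form the cache is touched once per segment, which matches the remark that it does not need to be accessed frequently.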
Suppose the received data to be decoded is 1024 bits long. The invention can divide it into two blocks of 512 bits each, which are decoded in parallel by processing unit 1 and processing unit 2 respectively. The 512-bit block of processing unit 1 is further divided into two segments of 256 bits each; processing unit 1 first decodes the leading 256 bits and then decodes the trailing 256 bits of the block. Once processing unit 1 and processing unit 2 have each finished decoding their 512-bit block, decoding of the 1024 bits of data is complete.
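The 1024-bit example can be mirrored in a short sketch in which threads stand in for the hardware processing units. `decode_segment_stub` is a hypothetical placeholder that returns its input unchanged; a real unit would run the iterative Turbo decoding there.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def decode_segment_stub(segment: Sequence[int]) -> List[int]:
    """Placeholder for the real per-segment Turbo decoding (hypothetical)."""
    return list(segment)


def decode_block(block: Sequence[int], segment_bits: int) -> List[int]:
    """One processing unit: decode the M segments of its block serially."""
    decoded: List[int] = []
    for start in range(0, len(block), segment_bits):
        decoded.extend(decode_segment_stub(block[start:start + segment_bits]))
    return decoded


def decode_all(data: Sequence[int], block_bits: int, segment_bits: int) -> List[int]:
    """Split data into blocks, decode the blocks in parallel, and gather in order."""
    blocks = [data[i:i + block_bits] for i in range(0, len(data), block_bits)]
    with ThreadPoolExecutor(max_workers=max(1, len(blocks))) as pool:
        # pool.map returns results in block order, so gathering is a simple join
        results = list(pool.map(lambda b: decode_block(b, segment_bits), blocks))
    return [bit for block in results for bit in block]
```

Calling `decode_all(data, 512, 256)` on 1024 bits gives N = 2 blocks and M = 2 segments per block, as in the example above.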
Preferably, when decoding by segments, the tail of the earlier segment and the head of the later segment of any two adjacent segments among the M segments overlap by a certain number of bits. For example, the 512-bit block above can be divided into two segments:
The first segment is: the first 256 bits counted from the start of the block, plus the next 20 bits;
The second segment is: the last 256 bits counted from the end of the block, plus the preceding 20 bits.
It is easy to see that the two segments overlap through their 20-bit extensions. The benefit of this preferred scheme is that the second segment can use part of the preceding segment when it is decoded; because that part of the data may also carry some information, the decoding accuracy of the second segment can be improved.
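The overlapped segmentation can be sketched as a computation of bit ranges. The sketch assumes one reading of the example: each segment keeps its own 256-bit core and is extended by 20 bits into each neighbouring segment, so the two 512-bit-block segments cover bits 0 to 275 and 236 to 511. The function name and the equal-core restriction are assumptions of this illustration.

```python
from typing import List, Tuple


def overlapped_segments(block_bits: int, n_segments: int,
                        overlap_bits: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) bit ranges, one per segment.

    Each segment's core is extended by overlap_bits into each neighbour;
    the extension is clipped at the block boundaries.
    """
    if block_bits % n_segments != 0:
        raise ValueError("this sketch assumes equal-sized segment cores")
    core = block_bits // n_segments
    ranges: List[Tuple[int, int]] = []
    for m in range(n_segments):
        start = max(0, m * core - overlap_bits)
        end = min(block_bits, (m + 1) * core + overlap_bits)
        ranges.append((start, end))
    return ranges
```

For the 512-bit block, `overlapped_segments(512, 2, 20)` yields the ranges `(0, 276)` and `(236, 512)`. Under this reading the shared region is the two 20-bit extensions taken together.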
As a further extension, the invention may also use partial overlap when dividing the data into blocks; for example, the tail of the earlier block and the head of the later block overlap by a certain number of bits.
How a Turbo decoder processing unit decodes one of the M segments is described in detail below.
With reference to Fig. 2, the whole process can be roughly divided into the following 4 steps:
Step 201, the first a posteriori decoding step, completes one iteration; its input is the data to be decoded Sym (that is, the systematic information of the Turbo code) and the redundant information P1 of the current segment, or the deinterleaving result of the deinterleaving step and the redundant information P1;
Step 202, the interleaving step, interleaves the iteration result of the first a posteriori decoding step so as to scramble the information;
Step 203, the second a posteriori decoding step, completes one iteration; its input is the interleaving result of the interleaving step and the redundant information P2;
Here, P1 and P2 are the redundant information of the Turbo code. The received Turbo code contains P1 and P2 in order to make transmission more stable and error correction stronger.
Step 204, the deinterleaving step, deinterleaves the iteration result of the second a posteriori decoding step to obtain the deinterleaving result.
These 4 steps usually have to be repeated in a loop to realize an iterative process and thereby complete decoding; that is, the result is output once the a posteriori probability has reached a certain level. Two concrete examples of how to end the iteration (or loop) are given below.
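The loop formed by steps 201 to 204 can be sketched structurally as follows. The two a posteriori decoders are passed in as functions, because their internals are a separate matter; all names here are hypothetical, the permutation `perm` is an arbitrary example, and counting one pass through all four steps as one iteration is an assumption of this sketch.

```python
from typing import Callable, List, Optional, Sequence

Llr = List[float]
AppDecoder = Callable[[Sequence[float], Sequence[float]], Llr]


def interleave(values: Sequence[float], perm: Sequence[int]) -> Llr:
    """Step 202: output position i takes the value at input position perm[i]."""
    return [values[p] for p in perm]


def deinterleave(values: Sequence[float], perm: Sequence[int]) -> Llr:
    """Step 204: restore the original order (inverse of interleave)."""
    out = [0.0] * len(values)
    for i, p in enumerate(perm):
        out[p] = values[i]
    return out


def decode_segment(sym: Sequence[float], p1: Sequence[float], p2: Sequence[float],
                   perm: Sequence[int], app1: AppDecoder, app2: AppDecoder,
                   max_iterations: int = 8,
                   stop: Optional[Callable[[Llr], bool]] = None) -> Llr:
    """Run steps 201 to 204 in a loop and return the last deinterleaving result."""
    first_input: Llr = list(sym)   # the first pass uses Sym
    result: Llr = list(sym)
    for _ in range(max_iterations):
        l1 = app1(first_input, p1)                  # step 201
        l1_interleaved = interleave(l1, perm)       # step 202
        l2 = app2(l1_interleaved, p2)               # step 203
        result = deinterleave(l2, perm)             # step 204
        if stop is not None and stop(result):
            break
        first_input = result   # later passes use the deinterleaving result
    return result
```

The optional `stop` callback is where an early-termination check would plug in; without it the loop simply runs for the fixed number of passes.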
Example 1
After the first and second a posteriori decoding steps have completed 8 iterations in total, the corresponding deinterleaving result is output as the decoding result of the current Turbo decoder processing unit for the current segment.
According to tests and studies by those skilled in the art, 8 iterations are usually enough to satisfy the required decoding accuracy in current applications. The benefit of this approach is that no check data needs to be introduced, which reduces the computation required to decide when to stop. In practice, of course, the number of iterations can be determined by those skilled in the art after testing and is not limited to 8. This example only shows that a fixed number of iterations can serve as the decision condition for output.
Example 2
When the deinterleaving result of the deinterleaving step and the CRC check data satisfy a preset condition, the corresponding deinterleaving result is output as the decoding result of the current Turbo decoder processing unit for the current segment. This approach controls decoding accuracy better, but because CRC check data must be introduced, it adds some computational complexity.
For example, a Turbo code may carry a CRC polynomial; if the polynomial corresponding to the deinterleaving result of the deinterleaving step is consistent with the CRC polynomial carried by the Turbo code, the result can be judged ready for output.
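The two stopping conditions can be combined in a small decision function, sketched below. The CRC is computed by plain polynomial division over GF(2); the degree-3 generator used in the usage note is a toy value chosen for illustration and is not taken from this specification, and the mapping "negative LLR means bit 1" is likewise an assumption.

```python
from typing import List, Optional, Sequence


def crc_remainder(bits: Sequence[int], poly: Sequence[int]) -> List[int]:
    """Remainder of bits (most significant bit first) divided by poly over GF(2)."""
    work = list(bits)
    degree = len(poly) - 1
    for i in range(len(work) - degree):
        if work[i]:
            for j, coeff in enumerate(poly):
                work[i + j] ^= coeff
    return work[-degree:]


def attach_crc(message: Sequence[int], poly: Sequence[int]) -> List[int]:
    """Append the CRC bits so that the whole codeword divides evenly by poly."""
    degree = len(poly) - 1
    return list(message) + crc_remainder(list(message) + [0] * degree, poly)


def crc_passes(codeword: Sequence[int], poly: Sequence[int]) -> bool:
    """True when the codeword leaves a zero remainder."""
    return not any(crc_remainder(codeword, poly))


def should_output(llrs: Sequence[float], iteration: int,
                  poly: Optional[Sequence[int]] = None,
                  max_iterations: int = 8) -> bool:
    """Example 2 (CRC passes) if poly is given, else Example 1 (fixed count)."""
    hard_bits = [1 if value < 0 else 0 for value in llrs]  # assumed sign convention
    if poly is not None and crc_passes(hard_bits, poly):
        return True
    return iteration >= max_iterations
```

For instance, with the toy generator `[1, 0, 1, 1]`, a codeword built by `attach_crc` passes the check, while the same codeword with one bit flipped does not and would keep iterating until the fixed limit is reached.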
The first and second a posteriori decoding steps in Fig. 2 can use the same a posteriori probability algorithm. The invention preferably uses the max-log-MAP algorithm, which is relatively simple and can further improve the decoding efficiency of the invention.
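The simplification that max-log-MAP makes can be shown numerically. In the log domain, adding two likelihoods is the exact operation log(e^a + e^b); max-log-MAP keeps only the maximum and drops the logarithmic correction term. The sketch below uses hypothetical function names and also checks that the approximation scales linearly with its inputs, which is the property behind not needing a signal-to-noise-ratio estimate.

```python
import math


def max_star(a: float, b: float) -> float:
    """Exact log(e^a + e^b), written as a maximum plus a correction term."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def max_log(a: float, b: float) -> float:
    """Max-log-MAP approximation: the logarithmic correction term is dropped."""
    return max(a, b)


def approximation_error(a: float, b: float) -> float:
    """Error of the approximation; it lies between 0 and log(2)."""
    return max_star(a, b) - max_log(a, b)
```

Because `max_log(c * a, c * b)` equals `c * max_log(a, b)` for any positive `c`, a wrong scaling of the inputs only scales the outputs, whereas the exact form does not have this property.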
The max-log-MAP algorithm is briefly introduced below. Working in the log domain, it ignores the logarithmic component in the likelihood-addition expression, so that likelihood addition becomes purely a maximum operation. Besides saving most of the additions, the biggest benefit is that no signal-to-noise-ratio estimate is needed, which makes the algorithm more robust. Specifically, it can be expressed briefly by the following formulas:
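1. Initialization:
$$\alpha_0(s=0) = 0, \qquad \alpha_0(s \neq 0) = -\infty$$
$$\beta_0(s=0) = 0, \qquad \beta_0(s \neq 0) = -\infty$$
2. Forward iteration: [the formula and the definitions of its terms are missing from the source text]
3. After the forward iteration is finished, the backward iteration is carried out, and the LLR is calculated at the same time: [the formula and the definitions of its terms are missing from the source text]
The LLR is exactly the output of the a posteriori decoding step. For the first a posteriori decoding step, it is output to the interleaving step; for the second a posteriori decoding step, it is output to the deinterleaver. [The symbols denoting these two outputs are missing from the source text.]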
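For reference, the max-log-MAP recursions are commonly written in the following textbook form. This is a sketch in assumed notation and is not a reproduction of the formulas of this specification: $\gamma_k(s', s)$ denotes the branch metric of the transition from state $s'$ to state $s$ at trellis step $k$, and the sign convention of the LLR (bit 1 over bit 0) is likewise an assumption.

```latex
\begin{aligned}
\alpha_k(s) &= \max_{s'} \bigl[ \alpha_{k-1}(s') + \gamma_k(s', s) \bigr] \\
\beta_{k-1}(s') &= \max_{s} \bigl[ \beta_k(s) + \gamma_k(s', s) \bigr] \\
L(u_k) &= \max_{(s', s):\, u_k = 1} \bigl[ \alpha_{k-1}(s') + \gamma_k(s', s) + \beta_k(s) \bigr]
        - \max_{(s', s):\, u_k = 0} \bigl[ \alpha_{k-1}(s') + \gamma_k(s', s) + \beta_k(s) \bigr]
\end{aligned}
```

Every operation is an addition or a maximum, which is why the log-domain correction term and the noise-variance scaling do not appear.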
Fig. 3 shows the decoding iterations for two processing units.
The abscissa is the trellis time and the ordinate is the processing time. L is the iteration period. The a posteriori decoding iteration is carried out 5 times, that is, the α and β iterations are each carried out 5 times; each β iteration overlaps with the one after it (the dashed and solid lines overlap along the abscissa).
In the figure, $\beta_1$ and $\beta_2$ denote two iteration arithmetic units; that is, two arithmetic units are used for the β iteration, which increases speed. Because the invention uses a parallel, windowed design, a single arithmetic unit can also basically meet the requirement.
The interleaving step in Fig. 2 is briefly introduced below.
In a Turbo code system, the role of the interleaver is to reduce the correlation between the check bits and thereby reduce the bit error rate during iterative decoding. The characteristics and basic design principles of a well-performing interleaver are as follows. Increasing the interleaver length improves decoding performance; a good interleaver makes the free distance of the overall codeword grow with the interleaver length, that is, it provides a certain interleaving distance. The interleaver should randomize the input sequence as far as possible, so that an information sequence which produces a low-weight codeword does not, after interleaving, again produce a low-weight codeword when encoded, which would reduce the free distance of the Turbo code. In short, interleaving is the process of rearranging the positions of the elements of a data sequence to obtain the interleaved sequence; the inverse process, which restores the interleaved elements to their original order, is called deinterleaving.
The invention preferably uses a polynomial interleaving algorithm, whose main benefit is that it prevents memory access conflicts during parallel Turbo decoding.
Fig. 4a shows the structure of an embodiment of a Turbo decoder according to the invention, comprising:
A receiving unit 401, for receiving data to be decoded;
A first controller 402, for dividing the data to be decoded into N blocks based on its data volume and according to the first preset rule, and assigning a corresponding Turbo decoder processing unit to each block, so that the N blocks are decoded in parallel by the corresponding Turbo decoder processing units; where N is an integer greater than or equal to 1;
At least one Turbo decoder processing unit 403, for carrying out the decoding process and obtaining a decoding result;
A second controller 404, for dividing the block assigned to a Turbo decoder processing unit into M segments, based on the data volume of that block and according to the second preset rule, when the processing unit processes the block, so that the M segments are decoded serially, one after another, by that processing unit to complete the decoding of the whole block; where M is an integer greater than or equal to 1; preferably, among the M segments, the tail of the earlier segment and the head of the later segment of any two adjacent segments overlap by a certain number of bits.
An output unit 405, for outputting the decoding result obtained by each Turbo decoder processing unit.
In practice, the arrangement of Fig. 4b may also be used, in which the second controller 404 is located inside the Turbo decoder processing unit 403 and each processing unit has its own second controller 404.
Specifically, corresponding to output mode 1 of the method embodiment, the output unit 405 may comprise:
Processing unit output buffers, one for each Turbo decoder processing unit, for temporarily storing the decoding result of that processing unit for each segment; after the processing unit has finished decoding all M segments, the stored results are gathered to obtain its decoding result for the corresponding block;
A decoder output module, for gathering the decoding results of all Turbo decoder processing units for their blocks, obtaining the total decoding result, and outputting it.
Corresponding to output mode 2 of the method embodiment, as soon as a Turbo decoder processing unit has finished decoding one segment, the output unit 405 can output the result directly as a partial decoding result for use by subsequent modules.
The Turbo decoder processing unit is briefly introduced below. With reference to Fig. 5, it may specifically comprise:
A first a posteriori decoder (APP1) 501, which completes one iteration; its input is the data to be decoded Sym and the redundant information P1 of the current segment, or the deinterleaving result of the deinterleaver and the redundant information P1;
An interleaver 502, which interleaves the iteration result of the first a posteriori decoder 501 (the symbol used for it in Fig. 5 is missing from this text) so as to scramble the information;
A second a posteriori decoder (APP2) 503, which completes one iteration; its input is the interleaving result of the interleaver 502 (denoted $L(u_n)$ in Fig. 5) and the redundant information P2;
A deinterleaver 504, which deinterleaves the iteration result of the second a posteriori decoder 503 (the symbol used for it in Fig. 5 is missing from this text) to obtain the deinterleaving result (denoted $L(u_k)$ in Fig. 5).
Further, Fig. 5 also includes a decision unit 505, whose inputs are the deinterleaving result of the deinterleaver (denoted $L(u_k)$ in Fig. 5) and the CRC check data. When the deinterleaving result and the CRC check data satisfy a preset condition, the corresponding deinterleaving result is output as the decoding result of the current Turbo decoder processing unit for the current segment.
In another embodiment of the invention, the CRC check data need not be introduced; instead, the decision unit 505 counts the iterations directly. After the first and second a posteriori decoders have completed a preset number of iterations in total (for example 8), the corresponding deinterleaving result is output as the decoding result of the current Turbo decoder processing unit for the current segment.
As the description above shows, running the four parts in a loop yields the decoding result for the current segment; that is, the current Turbo decoder processing unit has finished decoding one of the M segments.
In the processing unit structure shown in Fig. 5, the internal structures of the first a posteriori decoder 501 and the second a posteriori decoder 503 are essentially the same. Their design block diagram is outlined below with reference to Fig. 6.
The APP (a posteriori probability) decoder 501 or 503 may comprise the following modules: an input data buffer 601, a Beta computing unit 602, a Beta storage unit 603, and an LLR computing unit 604;
The interleaver 502 may comprise: an interleaver storage unit 605 and an interleaver address generation unit 606. Wherein,
The input data buffer 601 buffers the input data;
The Beta computing unit 602 performs the backward iteration;
The Beta storage unit 603 stores the Beta iteration data;
The LLR computing unit 604 calculates the log-likelihood ratio (LLR) function, including the α iteration;
The interleaver storage unit 605 stores the interleaving result;
The interleaver address generation unit 606 performs the interleaving, because the interleaver is realized by generating interleaved addresses.
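Generating interleaved addresses one after another can be sketched as follows, again assuming a quadratic permutation polynomial pi(i) = (f1*i + f2*i^2) mod K, which this text does not itself specify. Because pi(i+1) - pi(i) = f1 + f2*(2i + 1), each new address follows from the previous one by two modular additions, with no multiplication. The function name and parameters are hypothetical.

```python
from typing import Iterator


def qpp_addresses(k: int, f1: int, f2: int) -> Iterator[int]:
    """Yield pi(0), pi(1), ..., pi(k-1) using additions only."""
    address = 0                     # pi(0) = 0
    step = (f1 + f2) % k            # pi(1) - pi(0)
    step_increment = (2 * f2) % k   # the step grows by 2*f2 at every index
    for _ in range(k):
        yield address
        address = (address + step) % k
        step = (step + step_increment) % k
```

For K = 40, f1 = 3, f2 = 10 (an illustrative parameter set), the generator starts 0, 13, 6, 19 and agrees with direct evaluation of the polynomial at every index.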
The embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and for the parts that are the same or similar the embodiments may be referred to one another. Because the device embodiment is essentially similar to the method embodiment, it is described more briefly; for the relevant parts, refer to the corresponding description of the method embodiment.
The method for Turbo decoding in LTE and the Turbo decoder provided by the invention have been described in detail above. Specific examples have been used to explain the principle and implementation of the invention, and the description of the embodiments is intended only to help in understanding the method of the invention and its core idea. At the same time, those of ordinary skill in the art may, following the idea of the invention, make changes to the specific implementation and scope of application. In summary, the content of this specification should not be construed as limiting the invention.