CN112751572B - Four-path parallel LTE-based 4Turbo interleaving address generation method - Google Patents
- Publication number: CN112751572B (application CN202110019218.1A)
- Authority: CN (China)
- Prior art keywords: iteration, path, iterative, iterator, address
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- H03M13/00: Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
- H03M13/27: Coding, decoding or code conversion, for error detection or error correction, using interleaving techniques
- H03M13/29: Coding, decoding or code conversion, for error detection or error correction, combining two or more codes or code structures, e.g. product codes, generalised product codes, concatenated codes, inner and outer codes
- Y02D30/70: Reducing energy consumption in communication networks in wireless communication networks
Abstract
The invention discloses a four-way parallel LTE-based 4Turbo interleaving address generation method. It realizes a low-complexity four-way, eight-group parallel interleaving address generator that occupies few resources, consumes little power and outputs 4-way odd-even interleaving addresses in parallel. Compared with a traditional FPGA implementation of interleaved address calculation, the ASIC-based hardware circuit with dedicated calculation functions has higher performance; compared with a scheme that uses a dedicated iteration initial value calculation circuit, reusing the iterators for the initial value calculation effectively reduces the design area; and the gated clock technique effectively reduces the design power consumption. The clock gating rate of the design is 98.94%, and the area after synthesis in DC under a 500 MHz clock is about 9536.9 μm². The method suits the practical requirements of most radar communication integrated ASIC applications.
Description
Technical Field
The invention belongs to the technical field of radar communication integration, and particularly relates to a four-path parallel LTE-based 4Turbo interleaving address generation method.
Background
As radar target detection is applied to communication systems, radar communication integrated systems have emerged. In such systems it is desirable to use modulated pulses such as BPSK and QPSK to achieve radar target detection while performing communication processing. An FPGA or ASIC implementation of the turbo interleaving address calculation defined in 3GPP TS 36.212 can effectively meet the interleaving address calculation requirements of a miniaturized, low-power, high-performance radar communication integrated system.
The original formula for the interleaved address is $\pi(i) = (f_1 i + f_2 i^2) \bmod K$, with $i = 0, 1, 2, \ldots, K-1$, $1024 \le K \le 6144$ and $K \bmod 32 = 0$, where $\pi(i)$ is the output address of the interleaver, $i$ is the sequential address before interleaving, $K$ is the code block length, and $f_1$, $f_2$ are preset constants uniquely determined by $K$. The formula contains multiplication, remainder (division) and addition operations. Evaluating it directly in hardware would introduce a multiplier and a divider, which consume a large amount of logic resources; the larger the dividend, the higher the hardware complexity.
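As a software reference point, the definition formula can be evaluated directly. The Python sketch below is illustrative only: the function name `qpp_interleave` is hypothetical, and the parameter set K = 1504, f1 = 49, f2 = 846 is the example quoted later in this description.

```python
def qpp_interleave(K, f1, f2):
    """Return [pi(0), ..., pi(K-1)] computed directly from the definition formula."""
    return [(f1 * i + f2 * i * i) % K for i in range(K)]


# Example parameter set quoted in this description (K = 1504, f1 = 49, f2 = 846).
addresses = qpp_interleave(1504, 49, 846)

# For this parameter set the mapping is a permutation of 0..K-1:
# every address appears exactly once.
is_permutation = sorted(addresses) == list(range(1504))
```

This direct form uses one multiplication by i, one squaring and one remainder per address, which is exactly the cost the hardware scheme tries to avoid.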
If the interleaving address is solved by iteration, the initial values for the iteration must be calculated in advance, which normally requires a separate initial value solving circuit. Although part of that logic can be shared in some way, the circuit is active only during initialization and sits idle while the interleaved addresses are generated iteratively, which wastes area.
Disclosure of Invention
To address the problems in the prior art, the invention provides a four-path parallel LTE-based 4Turbo interleaving address generation method. It realizes a low-complexity four-path, eight-group parallel interleaving address generator with low resource occupation and low power consumption, implements the turbo interleaving address calculation defined in 3GPP TS 36.212, and outputs 4-way odd-even interleaving addresses in parallel.
In order to achieve the purpose, the invention is realized by adopting the following technical scheme.
The four-path parallel LTE-based 4Turbo interleaving address generation method comprises the following steps:
Further, the 8 π-value iteration paths and the two δ paths are: the $\pi(2n)$ iteration path, the $\pi(2n+1)$ iteration path, the $\pi(2n+K/4)$ iteration path, the $\pi(2n+1+K/4)$ iteration path, the $\pi(2n+K/2)$ iteration path, the $\pi(2n+1+K/2)$ iteration path, the $\pi(2n+3K/4)$ iteration path, the $\pi(2n+1+3K/4)$ iteration path, the $\delta(2n)$ iteration path and the $\delta(2n+1)$ iteration path.
Further, after the iterative process begins, the iteration output of the $\pi(2n+3K/4)$ iteration path is monitored continuously; the iteration process terminates when this path outputs 0.
Furthermore, the value range of $K$ is 1024 to 6144, and the minimum interval between two $K$ values is 32; $f_1$ and $f_2$ are odd and even, respectively.
Further, when the natural address is an odd number, the interleaving address is an odd number; when the natural address is an even number, the interleaved address is an even number.
Compared with the prior art, the invention has the beneficial effects that:
(1) The method realizes a low-complexity four-way, eight-group parallel interleaving address generator with low resource occupation and low power consumption, implements the turbo interleaving address calculation defined in 3GPP TS 36.212, and outputs 4-way odd-even interleaving addresses in parallel. Compared with a traditional FPGA implementation of interleaved address calculation, the ASIC-based hardware circuit with dedicated calculation functions has higher performance; compared with a scheme that uses a dedicated iteration initial value calculation circuit, the iterators of the invention are multiplexed in the initial value calculation stage, which effectively reduces the design area.
(2) The invention adopts the gated clock technique, which effectively reduces the design power consumption. The clock gating rate of the invention is 98.94%, and the clock frequency can reach 1.1 GHz with the TSMC 90 nm process library. The area after synthesis in DC under a 500 MHz clock is about 9536.9 μm², which suits the practical requirements of most radar communication integrated ASIC applications. In addition, an automatic simulation script is provided, and the design can be verified quickly with one-click simulation in a ModelSim 10.6 environment.
Drawings
The invention is described in further detail below with reference to the figures and specific embodiments.
FIG. 1 is a circuit diagram of a prior art sum and remainder iterator;
FIG. 2 is a diagram of a circuit for solving an initial value of an interleaving address in the prior art;
FIG. 3 is a diagram of a remainder circuit configuration according to an embodiment of the invention;
FIG. 4 is a structure diagram of the $\delta_{init}$ solving circuit for the case where the input data range is greater than 2K, according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a π(2n) iterative path circuit according to an embodiment of the present invention;
FIG. 6 is a diagram of a π(2n + 1) iterative path circuit according to an embodiment of the present invention;
FIG. 10 is a diagram of a δ(2n) iterative path circuit according to an embodiment of the present invention;
FIG. 14 is a diagram of a δ(2n + 1) iterative path circuit according to an embodiment of the present invention;
FIG. 15 is a timing diagram illustrating an iterative detection of interleaved addresses in accordance with an embodiment of the present invention;
FIG. 16 is a timing diagram illustrating exemplary operation of an interleaved address generation module according to an embodiment of the present invention;
FIG. 17 is a ModelSim platform output diagram in accordance with an embodiment of the present invention;
FIG. 18 is a diagram of the clock gating rate after DC synthesis according to an embodiment of the present invention;
FIG. 19 is the top-level module after DC synthesis according to an embodiment of the present invention;
FIG. 20 is a diagram of the synthesized area calculation result according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to examples, but it will be understood by those skilled in the art that the following examples are only illustrative of the present invention and should not be construed as limiting the scope of the present invention.
The four-path parallel LTE-based 4Turbo interleaving address generation method provided by the embodiment of the invention comprises the following steps of:
First, the multiplication is eliminated as follows:
the iterative formula of the original interleaved address is:
$$\pi(i+k) = \left(f_1(i+k) + f_2(i+k)^2\right) \bmod K = \left(i f_1 + i^2 f_2 + k f_1 + 2ik f_2 + k^2 f_2\right) \bmod K$$
let $\delta(i) = (k f_1 + 2ik f_2 + k^2 f_2) \bmod K$; then:
$$\pi(i+k) = (\pi(i) + \delta(i)) \bmod K$$
the iterative calculation of $\delta(i)$ is then:
$$\delta(i+k) = \left(k f_1 + 2(i+k)k f_2 + k^2 f_2\right) \bmod K = \left(k f_1 + 2ik f_2 + 2k^2 f_2 + k^2 f_2\right) \bmod K = (\delta(i) + b) \bmod K$$
where $\pi(i)$ is the original interleaving address, $k$ is the iteration step, $b = 2k^2 f_2$, and $K$ is the code block length;
Then the remainder operation is eliminated. Because both operands are smaller than $K$, their sum is smaller than $2K$, so the remainder in the iterative formula reduces to checking whether the sum is at least $K$ (and below $2K$) and, if so, subtracting $K$ once. This gives the simplified iterative formula:
$$\pi(i+k) = \begin{cases} \pi(i) + \delta(i), & 0 \le \pi(i) + \delta(i) < K \\ \pi(i) + \delta(i) - K, & K \le \pi(i) + \delta(i) < 2K \end{cases}$$
where $0 \le \delta(i) < K$, $0 \le \pi(i) < K$ and $0 \le \pi(i) + \delta(i) < 2K$;
Because $\pi(i) + \delta(i)$ always lies in $[0, 2K)$, the simplified formula maps directly onto a hardware circuit and avoids division and remainder operations. Only simple comparison, addition and subtraction remain, which can be built from basic arithmetic logic units such as comparators, multiplexers and adders.
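The two simplifications can be sketched in Python as follows. This is a minimal illustration, not the patented circuit: the names `add_mod` and `iterate_pi` are hypothetical, the three starting values are computed directly from the formulas as a software shortcut, and the increment b is reduced modulo K up front so that every sum stays below 2K.

```python
def add_mod(a, b, K):
    """(a + b) mod K for 0 <= a, b < K, using one comparison and one subtraction."""
    s = a + b
    return s - K if s >= K else s


def iterate_pi(K, f1, f2, start, step, count):
    """Generate pi(start), pi(start + step), ... with additions only inside the loop."""
    # Software shortcut (assumption): starting values come straight from the
    # formulas; the hardware described in the text derives them differently.
    pi = (f1 * start + f2 * start * start) % K
    delta = (step * f1 + 2 * start * step * f2 + step * step * f2) % K
    b = (2 * step * step * f2) % K
    out = []
    for _ in range(count):
        out.append(pi)
        pi = add_mod(pi, delta, K)      # pi(i + k) = (pi(i) + delta(i)) mod K
        delta = add_mod(delta, b, K)    # delta(i + k) = (delta(i) + b) mod K
    return out
```

For example, `iterate_pi(1504, 49, 846, 1, 2, 752)` walks the odd natural addresses 1, 3, 5, ... with step k = 2.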
Next, consider the initial values of the 8 groups of parallel interleaved addresses, since the iterative method can only proceed from known initial values. The 8 required groups of interleaved addresses have the general form $\pi(2n)$, $\pi(2n+1)$, $\pi(2n+K/4)$, $\pi(2n+1+K/4)$, $\pi(2n+K/2)$, $\pi(2n+1+K/2)$, $\pi(2n+3K/4)$, $\pi(2n+1+3K/4)$, where $n = 0, 1, \ldots, K/8 - 1$.
When $n = 0$, the corresponding four-way parallel 8 groups of interleaving address initial values are $\pi(0)$, $\pi(1)$, $\pi(K/4)$, $\pi(K/4+1)$, $\pi(K/2)$, $\pi(K/2+1)$, $\pi(3K/4)$ and $\pi(3K/4+1)$.
To derive the four-way parallel interleaving address initial value $\pi_{init}$, let $p = 0, 1, 2, 3$ denote the four parallel row indexes and $P = 4$ the number of parallel rows. The length of each address sequence output after interleaving is $L = K/P$.
It follows that $i = p \cdot L$ and $K = P \cdot L$. Substituting these into the original interleaved address formula gives:
$$\pi_{init}(i) = [(f_1 + i f_2) \times i] \bmod K$$
$$\pi_{init}(i) = [(f_1 + pL f_2) \times pL] \bmod (P \cdot L)$$
Factoring out $L$ gives:
$$\pi_{init}(i) = \{[(f_1 + pL f_2) \times p] \bmod P\} \times L$$
Since $P = 4$ is the number of parallel rows, the remainder modulo the code block length $K$ is converted into a remainder modulo the fixed value $P$, which greatly reduces the hardware complexity.
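The factoring step can be checked numerically. The sketch below compares the factored form against the definition formula at the four start addresses i = p·L; the function names are hypothetical, and the parameters are the K = 1504 example used elsewhere in this description.

```python
def pi_direct(K, f1, f2, i):
    """Definition formula pi(i) = (f1*i + f2*i^2) mod K."""
    return (f1 * i + f2 * i * i) % K


def pi_init_factored(K, f1, f2, p, P=4):
    """Factored form {[(f1 + p*L*f2) * p] mod P} * L with L = K / P."""
    L = K // P
    return (((f1 + p * L * f2) * p) % P) * L


K, f1, f2 = 1504, 49, 846
# One (factored, direct) pair for each parallel row p = 0, 1, 2, 3.
pairs = [(pi_init_factored(K, f1, f2, p), pi_direct(K, f1, f2, p * (K // 4)))
         for p in range(4)]
```

The only remainder left in the factored form is modulo 4, which in hardware is a selection of the two low-order bits.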
To meet the design requirement, the iteration step is taken as $k = 2$, and the initial value of each interleaving address path is simplified case by case:
when $i = 0$, $\pi(i) = 0$;
when $i = 1$, $\pi(i) = (f_1 + f_2) \bmod K$; since $f_1 + f_2 \in (0, K)$ for the entries in the protocol table, $\pi(i) = f_1 + f_2$;
when $i = L$ ($p = 1$):
$$\pi_{init}(i) = \{[(f_1 + 1 \cdot L f_2) \times 1] \bmod P\} \times L$$
where $L = K/4$. From the condition $K \bmod 32 = 0$ in the 3GPP TS 36.212 table, $L$ is a multiple of 8, so $(L f_2) \bmod 4 = 0$.
The formula is further simplified to:
$$\pi_{init}(i) = (f_1 \bmod 4) \times L$$
Since $f_1 \bmod 4$ is a remainder by an integer power of 2, it equals the lower 2 bits of the binary representation of $f_1$. Because $f_1$ is odd, its lowest bit must be 1, so only the second-lowest bit $f_1[1]$ needs to be examined to obtain $\pi_{init}(i)$: $\pi_{init}(K/4) = K/4$ when $f_1[1] = 0$, and $\pi_{init}(K/4) = 3K/4$ when $f_1[1] = 1$;
when $i = 2L$ ($p = 2$):
$$\pi_{init}(i) = \{[(f_1 + 2 \cdot L f_2) \times 2] \bmod P\} \times L$$
Because $K \bmod 32 = 0$, $(K f_2) \bmod 4 = 0$ holds, and an odd $f_1$ multiplied by 2 always leaves a remainder of 2 modulo 4. The formula therefore simplifies to $\pi_{init}(K/2) = 2L = K/2$;
when $i = 3L$ ($p = 3$):
$$\pi_{init}(i) = \{[(f_1 + 3 \cdot L f_2) \times 3] \bmod P\} \times L$$
Similarly, it can be deduced that $\pi_{init}(i) = [(3 f_1) \bmod 4] \times L$, that is, $\pi_{init}(3K/4) = 3K/4$ when $f_1[1] = 0$, and $\pi_{init}(3K/4) = K/4$ when $f_1[1] = 1$.
The two quarter-point initial values are therefore complementary: when $\pi_{init}(K/4) = K/4$, $\pi_{init}(3K/4) = 3K/4$, and when $\pi_{init}(K/4) = 3K/4$, $\pi_{init}(3K/4) = K/4$.
Next, the initial values of $\delta(i)$ for the iteration are derived. With iteration step $k = 2$, the iterative formulas for $\delta(i)$ are:
$$\delta(i) = (2f_1 + 4i f_2 + 4f_2) \bmod K$$
$$\delta(i+2) = (\delta(i) + b) \bmod K, \quad b = 8f_2$$
When $i = 0$, $\delta(0) = (2f_1 + 4f_2) \bmod K$;
When $i = 1$, $\delta(1) = (2f_1 + 8f_2) \bmod K$;
For the other even start addresses $i = K/4$, $K/2$ and $3K/4$, the second term $4i f_2$ becomes $K f_2$, $2K f_2$ and $3K f_2$. Since each of these leaves a remainder of zero modulo $K$, the initial value of the even channels of the sequential address of $\delta(i)$ is:
$$\delta(0) = \delta(K/4) = \delta(K/2) = \delta(3K/4) = (2f_1 + 4f_2) \bmod K$$
A similar relationship holds for the odd channels of the sequential address, whose initial value is:
$$\delta(1) = \delta(K/4+1) = \delta(K/2+1) = \delta(3K/4+1) = (2f_1 + 8f_2) \bmod K$$
Thus all $\delta(i)$ sequences obtained by iteration with step $k = 2$ satisfy:
$$\delta(2n + pL) = \delta(2n), \qquad \delta(2n + 1 + pL) = \delta(2n + 1)$$
where $p = 1, 2, 3$ and $L = K/4$. Therefore the $\delta(i)$ iteration circuit only needs to share iterators according to the parity of the sequential address $i$, so 2 parallel $\delta(i)$ iterators suffice and the area consumption is reduced. When the module structure is optimized further, the 8 parallel iterators for solving $\pi(i)$ can also be reused to reduce the area further.
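The claim that two δ iterators can serve all eight π paths rests on δ repeating with period K/4 within each parity class. A short Python check of that property, using the k = 2 formula above and the K = 1504 example (the function names are hypothetical):

```python
def delta(K, f1, f2, i):
    """delta(i) for iteration step k = 2."""
    return (2 * f1 + 4 * i * f2 + 4 * f2) % K


def delta_is_shared(K, f1, f2):
    """True if delta(2n + p*K/4) == delta(2n) and likewise for odd addresses."""
    L = K // 4
    for n in range(K // 8):
        for p in (1, 2, 3):
            if delta(K, f1, f2, 2 * n + p * L) != delta(K, f1, f2, 2 * n):
                return False
            if delta(K, f1, f2, 2 * n + 1 + p * L) != delta(K, f1, f2, 2 * n + 1):
                return False
    return True


shared = delta_is_shared(1504, 49, 846)
```

Because the extra term p·K·f2 is a multiple of K, the check passes for any K divisible by 4, not only for this parameter set.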
Both iterative simplification and initial value calculation can be achieved by the simplest logic, which provides a simplified model for the hardware mapping scheme.
The basic iterative operation units of the existing iterators share the same structure, of the form $(a + b) \bmod K$; each unit performs a "sum and remainder" function with $\pi + \delta \in [0, 2K)$ or $\delta + b \in [0, 2K)$.
When the interleaving address is solved with the iterative formula, the initial values for the iteration must be calculated in advance. The prior art designs an independent initial value solving module to calculate initial values such as $\pi(0)$, $\pi(1)$, $\delta_{init}(even)$ and $\delta_{init}(odd)$ separately, but this scheme introduces an additional initial value calculation circuit. Although part of that logic can be shared in some way, the circuit is active only during initialization and sits idle while the interleaved addresses are generated iteratively, which wastes area.
Fig. 2 is a schematic diagram of a conventional circuit structure for solving the initial values of the interleaved address. The circuit calculates $\pi(0)$, $\pi(1)$ and part of the remaining initial values in the first clock cycle after user input, and obtains the rest in the second clock cycle.
FIG. 3 shows the circuit structure of a single remainder (mod) module. Because the comparator-subtractor remainder structure works correctly only when the input value lies between 0 and 2K, the input must satisfy din ∈ [0, 2K); otherwise the remainder cannot be calculated correctly. However, consider K = 1504, f1 = 49 and f2 = 846, where the initial values of $\delta(i)$ are calculated as:
$$\delta_{init}(even) = 2f_1 + 4f_2$$
$$\delta_{init}(odd) = 2f_1 + 8f_2$$
A simple analysis shows that both $4f_2$ and $8f_2$ are larger than $2K$. Therefore, when solving the initial value of $\delta(i)$, the initial value calculation circuit must be designed to handle input data larger than $2K$. The circuit designed for this requirement is shown in fig. 4, and its working flow is as follows:
(1) First, $f_2$ is shifted left by 1 bit and the remainder modulo K is taken, giving $2f_2 \% K$.
(2) The result of step (1) is shifted left by 1 bit and the remainder modulo K is taken again, giving $4f_2 \% K$; this result is summed with $2f_1$ to give $\delta_{init}(even)$.
(3) The $4f_2 \% K$ obtained in step (2) is shifted left by 1 bit and the remainder modulo K is taken, giving $8f_2 \% K$; this result is summed with $2f_1$ to give $\delta_{init}(odd)$.
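The three steps above can be sketched in Python with a remainder unit that, like the comparator-subtractor, only accepts operands below 2K. The names `reduce_once` and `delta_init` are hypothetical, and two details are assumptions of this sketch rather than statements from the text: f1 and f2 are taken to be smaller than K, and the final sum with 2f1 is also passed through the remainder unit.

```python
def reduce_once(x, K):
    """Comparator-subtractor remainder: valid only for 0 <= x < 2K."""
    assert 0 <= x < 2 * K, "operand outside the range the unit can handle"
    return x - K if x >= K else x


def delta_init(K, f1, f2):
    """Return (delta_init(even), delta_init(odd)) using shifts and single subtractions."""
    assert 0 < f1 < K and 0 < f2 < K            # assumption of this sketch
    two_f2 = reduce_once(f2 << 1, K)            # step (1): 2*f2 % K
    four_f2 = reduce_once(two_f2 << 1, K)       # step (2): 4*f2 % K
    eight_f2 = reduce_once(four_f2 << 1, K)     # step (3): 8*f2 % K
    two_f1 = reduce_once(f1 << 1, K)
    even = reduce_once(two_f1 + four_f2, K)     # (2*f1 + 4*f2) % K
    odd = reduce_once(two_f1 + eight_f2, K)     # (2*f1 + 8*f2) % K
    return even, odd
```

Each intermediate result is below K, so doubling it never exceeds the 2K limit, which is why three passes suffice even though 8·f2 itself is far larger than 2K for K = 1504.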
The existing scheme computes these remainders with an independent initial value calculation circuit. That circuit is separate from the iterative circuit and works only at the start of the operation; it sits idle while the interleaving addresses are generated, which clearly wastes computing resources.
The iterative operation unit used for interleaved address generation computes $(a + b) \% K$, and all intermediate variables and results in the initial value calculation lie in $[0, K)$. This means that every operation of the initial value calculation can be performed by the iterative operation unit.
The invention designs a data processing flow in which the iterative operation units are multiplexed to calculate the iteration initial values, so the initial value calculation module of the original structure is omitted and the design area is greatly reduced.
To reduce the complexity of data transfer in the multiplexed structure, the initial value calculation follows the principle of "in-situ operation": every iteration initial value is produced in its own iteration unit (e.g., $\pi(1)$ is calculated in the iterator that generates $\pi(1), \pi(3), \pi(5), \pi(7), \ldots$). For some K values $f_2$ is large, so the result cannot be obtained by a single remainder operation and must be produced by three successive operations. Since $\pi(0) = 0$ needs no further operation, the iterator corresponding to $\pi(0)$ is used to calculate $8f_2 \% K$.
The work flow of the interleaving address generator designed by the invention is as follows:
In the invention, the 8 π-value iteration paths and the two δ paths are: the $\pi(2n)$ iteration path, the $\pi(2n+1)$ iteration path, the $\pi(2n+K/4)$ iteration path, the $\pi(2n+1+K/4)$ iteration path, the $\pi(2n+K/2)$ iteration path, the $\pi(2n+1+K/2)$ iteration path, the $\pi(2n+3K/4)$ iteration path, the $\pi(2n+1+3K/4)$ iteration path, the $\delta(2n)$ iteration path and the $\delta(2n+1)$ iteration path.
Further, the $\pi(2n)$ iteration path, the $\pi(2n+1)$ iteration path, the $\pi(2n+1+K/4)$ iteration path, the $\pi(2n+K/2)$ iteration path, the $\pi(2n+1+K/2)$ iteration path, the $\delta(2n)$ iteration path and the $\pi(2n+1+3K/4)$ iteration path each comprise a corresponding iterator and two 2-to-1 data selectors, as shown in figs. 5 to 11. When the input init_done signal is low, the 0 inputs of the two data selectors are active, and the data on them is output to the iterator to calculate its initial value. After the initial value calculation is completed, the init_done signal is high, the 1 inputs of the two data selectors are active, and the data on them is output to the iterator, which iterates cyclically from the initial value and the input data to generate the corresponding interleaved address sequence.
Further, the $\pi(2n+K/4)$ iteration path comprises a $\pi(2n+K/4)$ iterator and three 2-to-1 data selectors, as shown in fig. 12.
When the init_done signal is low, one of the data selectors outputs 0 to the $\pi(2n+K/4)$ iterator if $f_1[1] = 0$ and outputs K >> 1 if $f_1[1] = 1$; the 0 inputs of the other two data selectors are active, and the data K >> 2 on them is output to the $\pi(2n+K/4)$ iterator, which calculates the initial value $\pi(K/4)$ from the two input data.
When the initial value calculation is completed, the init_done signal is high, and the 1 inputs of the two data selectors directly connected to the $\pi(2n+K/4)$ iterator are active; their input data $\delta_{even}$ is output to the iterator, which iterates cyclically from the initial value $\pi(K/4)$ and $\delta_{even}$ to generate the $\pi(2n+K/4)$ interleaved address sequence.
Further, the $\pi(2n+3K/4)$ iteration path comprises a $\pi(2n+3K/4)$ iterator and three 2-to-1 data selectors, as shown in fig. 13.
When the init_done signal is low, one of the data selectors outputs K >> 1 to the $\pi(2n+3K/4)$ iterator if $f_1[1] = 0$ and outputs 0 if $f_1[1] = 1$; the 0 inputs of the other two data selectors are active, and the data K >> 2 on them is output to the $\pi(2n+3K/4)$ iterator, which calculates the initial value $\pi(3K/4)$ from the two input data.
When the initial value calculation is completed, the init_done signal is high, and the 1 inputs of the two data selectors directly connected to the $\pi(2n+3K/4)$ iterator are active; their input data $\delta_{even}$ is output to the iterator, which iterates cyclically from the initial value $\pi(3K/4)$ and $\delta_{even}$ to generate the $\pi(2n+3K/4)$ interleaved address sequence.
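The initial-value phase of the two quarter-point paths (init_done low) can be modelled as a selector followed by one sum-and-remainder step. The sketch below is a behavioural illustration only; the function name and the string labels "K/4" and "3K/4" are hypothetical, and it assumes an odd f1 and K divisible by 32 as stated elsewhere in the description.

```python
def quarter_path_init(K, f1, path):
    """Initial value of the path starting at natural address K/4 or 3K/4."""
    bit1 = (f1 >> 1) & 1                  # second-lowest bit f1[1]
    if path == "K/4":
        sel = (K >> 1) if bit1 else 0     # selector: 0 when f1[1] = 0, K >> 1 otherwise
    elif path == "3K/4":
        sel = 0 if bit1 else (K >> 1)     # selector with the two choices swapped
    else:
        raise ValueError("path must be 'K/4' or '3K/4'")
    s = sel + (K >> 2)                    # the iterator adds the constant K >> 2
    return s - K if s >= K else s         # single conditional subtraction
```

With K = 1504 and f1 = 49 (f1[1] = 0) the two paths start at 376 and 1128; with an f1 whose bit 1 is set, the two values swap.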
Further, the δ(2n + 1) iteration path comprises a δ(2n + 1) iterator and one 2-to-1 data selector, as shown in fig. 14.
When the init_done signal is low, the 0 input of the data selector is active, and its input $f_1 \ll 1$ is output to the δ(2n + 1) iterator; $f_2 \ll 3$ is input to the δ(2n + 1) iterator directly, and the iterator calculates the initial value δ(1) from the two input data. When the initial value calculation is completed, the init_done signal is high, the 1 input of the data selector is active, and the input data of the δ(2n + 1) iterator are δ(1) and $f_2 \ll 3$. The loop iteration then continues in sequence to generate the π(2n + 1) interleaved address sequence until generation is finished.
Since the four-way parallel LTE-based 4turbo interleaver outputs 8 interleaving addresses in parallel, a total of K/8 groups of interleaving addresses are output for a fixed K value. The simplest way to detect the end of the iteration is to store the value K/8 in a register and build a counter in the hardware circuit: the counter is incremented by 1 each time a valid output is generated, and when the count reaches K/8 the module stops and enters the idle state.
However, this structure introduces an extra counter for flow control. Theory shows that, when iterating with step $k = 2$, the following holds:
$$\pi(2n + 3K/4)\big|_{n = K/8} = \pi(K) = \pi(0) = 0$$
That is, once the number of iterations reaches K/8, the next interleaved address repeats from the head of the sequence. $\pi(0) = 0$ regardless of the value of K, and the $\pi(2n+3K/4)$ iteration path reaches $\pi(K) = \pi(0) = 0$ after K/8 iterations.
According to this principle, it is sufficient to monitor the iteration output of the $\pi(2n+3K/4)$ path continuously after the iteration starts; the iteration process terminates when this path outputs 0. One counter is saved in this way. The interleave address iteration completion detection timing of the present invention is shown in fig. 15.
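A behavioural sketch of the whole iteration stage with counter-free termination is given below. It is a software model under stated assumptions: the names are hypothetical, the eight initial π values and two initial δ values are computed directly from the formulas rather than by iterator reuse, and the path that starts at natural address 3K/4 is stored at list index 6.

```python
def add_mod(a, b, K):
    """(a + b) mod K for 0 <= a, b < K with one conditional subtraction."""
    s = a + b
    return s - K if s >= K else s


def generate_groups(K, f1, f2):
    """Return the groups of 8 addresses, stopping when the 3K/4 path wraps to 0."""
    L = K // 4
    # Natural start addresses: 0, 1, K/4, K/4+1, K/2, K/2+1, 3K/4, 3K/4+1.
    starts = [p * L + r for p in range(4) for r in (0, 1)]
    # Software shortcut (assumption): initial values straight from the formulas.
    pi = [(f1 * s + f2 * s * s) % K for s in starts]
    d_even = (2 * f1 + 4 * f2) % K
    d_odd = (2 * f1 + 8 * f2) % K
    b = (8 * f2) % K
    groups = []
    while True:
        groups.append(list(pi))
        pi = [add_mod(v, d_odd if s % 2 else d_even, K) for v, s in zip(pi, starts)]
        d_even = add_mod(d_even, b, K)
        d_odd = add_mod(d_odd, b, K)
        if pi[6] == 0:      # the 3K/4 path has reached pi(K) = pi(0) = 0
            return groups


groups = generate_groups(1504, 49, 846)
```

For K = 1504 the loop stops after K/8 = 188 groups without any iteration counter, and the groups together contain every address from 0 to K - 1 exactly once.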
Some values and bits in the design are fixed. Such bits are tied to a constant level and never change, so resource consumption can be reduced by optimizing the corresponding bits in the code. The cases are as follows:
(a) The range of K is 1024 to 6144 and the minimum interval between two K values is 32, which means that the low 5 bits of K are always 0.
(b) According to the protocol definition, $f_1$ and $f_2$ are odd and even, respectively, so the lowest bit of the input $f_1$ is always 1 and the lowest bit of $f_2$ is always 0.
(c) When solving δ, the formula is $\delta_{new} = (\delta + b) \bmod K$ with $b = (8f_2) \bmod K$. The low three bits of $b$ are always 0, so the low three bits of $\delta_{new}$ can be optimized.
(d) According to the interleaving address definition formula, when the natural address is odd the interleaving address is odd, and when the natural address is even the interleaving address is even. Hence the lowest bit of $\pi_{new}$ output by every odd-path π iterator is always 1, and the lowest bit of $\pi_{new}$ output by every even-path π iterator is always 0.
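These fixed-bit properties can be confirmed numerically. The sketch below checks them for one parameter set; the function name is hypothetical, and item (c) is read here as "the low three bits of b are zero, so the low three bits of δ never change", which is an interpretation of the text.

```python
def check_fixed_bits(K, f1, f2):
    """Return four booleans for properties (a), (b), (c) and (d)."""
    low5_of_K_zero = (K & 0b11111) == 0                          # (a)
    f_parity_ok = (f1 & 1) == 1 and (f2 & 1) == 0                # (b)
    b = (8 * f2) % K
    delta = (2 * f1 + 4 * f2) % K
    low3 = delta & 0b111
    delta_low3_stable = (b & 0b111) == 0
    for _ in range(K // 8):                                      # (c)
        delta = (delta + b) % K
        delta_low3_stable = delta_low3_stable and (delta & 0b111) == low3
    parity_kept = all(((f1 * i + f2 * i * i) % K) % 2 == i % 2   # (d)
                      for i in range(K))
    return low5_of_K_zero, f_parity_ok, delta_low3_stable, parity_kept


properties = check_fixed_bits(1504, 49, 846)
```

Property (d) follows from K and f2 being even and f1 being odd, so the parity of π(i) equals the parity of i.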
The port description of the interleaved address generator designed by the present invention is shown in table 1 below:
table 1 port description
The typical work flow of the module is shown in fig. 16. During normal operation the reset signal should be high and the enable signal en should be high. The user input signals k, f1 and f2 are stored by the module when vld_in is high, and there is a latency of 5 clock cycles from the valid vld_in signal to the output of the first group of parallel interleaving addresses. The interleaving addresses are then output continuously: the vld_out signal stays high while they are output, and valid interleaving addresses are output for K/8 clock cycles.
Table 2 below illustrates the calculation process of each path of initial values in the parameter calculation stage, and since some initial values have interdependencies and cannot be calculated in one clock, there is a certain clock delay.
TABLE 2 parameter calculation clock
Table 2 shows that in the first three clock cycles the π(0) iterator is multiplexed to calculate the terms $4f_2$ and $8f_2$ of $\delta(0) = 2f_1 + 4f_2$ and $\delta(1) = 2f_1 + 8f_2$. Some K values in the protocol have a large $f_2$, e.g. $f_1 = 49$, $f_2 = 846$ at K = 1504. In this case $4f_2 > 2K$, so the remainder cannot be obtained by a single pass through the remainder unit of the design. The π(0) iterator therefore first calculates $2f_2 \% K$, then at the second clock calculates $4f_2 \% K = [(2f_2 \% K) \ll 1] \% K$, and at the third clock calculates $8f_2 \% K = [(4f_2 \% K) \ll 1] \% K$. Once $4f_2 \% K$ and $8f_2 \% K$ are available, $\delta_{init}(odd)$ and $\delta_{init}(even)$ can be calculated in the next clock.
Iteration stage: the 8 π-value iteration paths and the two δ paths operate together, and a new iteration value is generated in each clock cycle. The vld_out signal stays high throughout the iteration, and the output interfaces pi_0 to pi_7 output valid π values.
Simulation experiment
The correctness and effectiveness of the invention are further illustrated by the simulation data processing result.
1) The simulation platform work flow:
The test covers the f1 and f2 values for every K defined in the protocol with 1024 ≤ K ≤ 6144. After the system is initialized, the auto_test system task starts the verification of a single K value and then enters the result_check task. The result_check task generates the natural addresses and calculates the interleaving addresses with an interleaving address calculation function that uses the definition formula directly (the variables are defined as long int to prevent numerical overflow). The results are compared with the module output, and three outcomes are defined for the comparison: abnormal termination, output error and verification passed.
Abnormal termination: when a K value is input, the module should normally output parallel interleaved addresses continuously for K/8 clock cycles. If, because of some error, the vld_out signal of the module goes inactive before the K/8 clock cycles have elapsed, the simulation terminates early.
Output error: the output calculated by the interleaving address module differs from the result calculated by the function. The simulation platform then reports the erroneous information, such as the natural address and the interleaving address.
Verification passed: the module outputs only correct interleaving addresses during the K/8 clock cycles of valid output, so the module works normally for this K value and passes the test.
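The three outcomes can be illustrated with a small Python checker. This is a hypothetical software analogue of the result_check task, not the testbench itself: the function name is invented, and the ordering of the eight addresses inside a group (start addresses 0, 1, K/4, K/4+1, K/2, K/2+1, 3K/4, 3K/4+1) is an assumption.

```python
def result_check(K, f1, f2, groups):
    """Classify module output as 'pass', 'output error' or 'abnormal termination'."""
    L = K // 4
    if len(groups) < K // 8:
        return "abnormal termination"      # valid output ended too early
    for n, group in enumerate(groups[:K // 8]):
        naturals = [p * L + 2 * n + r for p in range(4) for r in (0, 1)]
        expected = [(f1 * i + f2 * i * i) % K for i in naturals]
        if list(group) != expected:
            return "output error"          # mismatch against the definition formula
    return "pass"


# Usage with a golden output built from the definition formula itself.
K, f1, f2 = 1504, 49, 846
good = [[(f1 * i + f2 * i * i) % K
         for i in (p * (K // 4) + 2 * n + r for p in range(4) for r in (0, 1))]
        for n in range(K // 8)]
verdict = result_check(K, f1, f2, good)
```

Python integers do not overflow, so the precaution the testbench takes with a wide integer type is not needed in this sketch.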
After the simulation is run, the script window outputs the information shown in FIG. 17 when the corresponding test case passes. All cases should show this result, because the design was verified carefully before submission.
2) Analysis of results
(1) Low power consumption analysis
In most digital systems, the clock tree accounts for a large share of the power consumption, so controlling clock tree power is important for achieving a low-power system. Clock gating effectively reduces power consumption and is one of the main methods of low-power design. With a gated clock, toggling of the input clock is stopped while the register data does not change, which reduces dynamic power consumption.
The design has a high gating rate and an intelligent clock control scheme that automatically turns off the module clock when the module is not producing valid output.
For a generic module, the internal clock can be gated through an enable interface en at the top of the module. This control is coarse-grained, however: in a complex control system (e.g., an encoding system), the gating enable of the whole module is asserted whenever the encoding system is working, so the clocks of all sub-modules in the module are running. In practice, not all sub-modules are active throughout the operation of the encoding system, so part of the power is wasted.
To solve this problem, intelligent clock control is adopted: the gated clock runs only while the interleaving address generation module is working and stops immediately after interleaving address generation is finished.
The interleaving address generation module contains a clock control register power_en, which is set when the module input vld_in is valid and cleared after the module has output all interleaved addresses. Clock gating inside the module is controlled by the logical AND of the clock control register power_en and the module's top-level clock enable signal en. Fig. 19 shows the gated clock rate after DC synthesis.
With this control logic, the clock inside the module runs only during the iterative initial value calculation and the iterative calculation of the interleaved addresses, and stops automatically once output is finished. This gives fine-grained control over the clock tree inside the module and effectively reduces the power consumption of the subsystem.
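The gating rule described above can be modelled cycle by cycle. This is a behavioural sketch under simplifying assumptions, not the RTL: the function name `simulate_gating` and the `done` flag (marking the cycle after the last address has been output) are hypothetical, and the register update is treated as taking effect in the same cycle.

```python
def simulate_gating(en, vld_in, done):
    """Return the per-cycle clock-gate enable (power_en AND en).

    en, vld_in, done: lists of 0/1 values, one per clock cycle.
    'done' is a hypothetical flag marking that all addresses were output.
    """
    power_en = 0
    gate = []
    for e, v, d in zip(en, vld_in, done):
        if v:
            power_en = 1      # set when the module input is valid
        elif d:
            power_en = 0      # cleared once all addresses are out
        gate.append(power_en & e)
    return gate


# Example: the clock runs only between vld_in and the end of output.
print(simulate_gating(en=[1, 1, 1, 1, 1, 1],
                      vld_in=[0, 1, 0, 0, 0, 0],
                      done=[0, 0, 0, 0, 1, 0]))
```

With the top-level enable held low, the gate stays closed regardless of power_en, which is the coarse-grained control the text describes.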
(2) Area analysis
Fig. 19 shows the top module of the design after DC synthesis. The design is compiled with the TSMC 90 nm process library under the fast condition and a 2 ns clock period constraint (500 MHz), and synthesized with the map_effort, area_effort high, and gate_clock options. The synthesized area report is shown in Fig. 20, which details the resource consumption of each module. The total area of the design is 9536.8896 μm², which suits the practical requirements of most integrated radar-communication ASIC applications.
(3) Frequency analysis
In order to better verify the working effect of the interleaving address generator, the clock constraint frequency is modified, and a clock frequency-area table is obtained. The results are given in table 3 below:
TABLE 3 clock frequency-area results
The optimization in the invention focuses mainly on area, and the design achieves a small area under the 500 MHz clock requirement. Most of the design area is consumed by the 8 π iterators and the 2 δ iterators.
Although the present invention has been described in detail in this specification with reference to specific embodiments and illustrative embodiments, it will be apparent to those skilled in the art that modifications and improvements can be made thereto based on the present invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.
Claims (8)
1. The four-path parallel LTE-based 4Turbo interleaving address generation method is characterized by comprising the following steps of:
step 1, iterative path optimization: optimizing an iterative formula of an original interleaving address to eliminate multiplication and remainder operation in the original iterative formula to obtain an optimized iterative formula; further obtaining four-path parallel 4Turbo interleaving address iterative formulas;
the optimization of the iterative formula of the original interleaving address specifically comprises the following steps:
first, the multiplication operations are eliminated as follows:
the iterative formula for the original interleaved address is:
$$\pi(i+k) = \left(f_1(i+k) + f_2(i+k)^2\right) \bmod K = \left(if_1 + i^2f_2 + kf_1 + 2ikf_2 + k^2f_2\right) \bmod K$$
let $\delta(i) = (kf_1 + 2ikf_2 + k^2f_2) \bmod K$; then:
$$\pi(i+k) = (\pi(i) + \delta(i)) \bmod K$$
the iterative calculation of $\delta(i)$ is then:
$$\delta(i+k) = \left(kf_1 + 2(i+k)kf_2 + k^2f_2\right) \bmod K = \left(kf_1 + 2ikf_2 + 2k^2f_2 + k^2f_2\right) \bmod K = (\delta(i) + b) \bmod K$$
where $\pi(i)$ is the original interleaved address, $k$ is the iteration step, $b = 2k^2f_2$, and $K$ is the code block length;
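The recursion above can be checked numerically against the definition formula. The sketch below is illustrative: the function names and the example parameters (K = 40, f1 = 3, f2 = 10, and K = 1024, f1 = 31, f2 = 64) are assumptions chosen for the example. Multiplications appear only when the seed values are computed; each loop step uses additions and a remainder.

```python
def pi_direct(i, K, f1, f2):
    """Definition formula: (f1*i + f2*i^2) mod K."""
    return (f1 * i + f2 * i * i) % K


def pi_by_recursion(start, count, k, K, f1, f2):
    """Generate pi(start), pi(start+k), ... using only additions per step."""
    pi = pi_direct(start, K, f1, f2)                          # seed value
    delta = (k * f1 + 2 * start * k * f2 + k * k * f2) % K    # delta(start)
    b = (2 * k * k * f2) % K                                  # constant increment
    out = []
    for _ in range(count):
        out.append(pi)
        pi = (pi + delta) % K        # pi(i+k) = (pi(i) + delta(i)) mod K
        delta = (delta + b) % K      # delta(i+k) = (delta(i) + b) mod K
    return out


# Example: odd-index addresses for K = 40 with step k = 2.
print(pi_by_recursion(start=1, count=5, k=2, K=40, f1=3, f2=10))
```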
then, the remainder operation is eliminated: the remainder operation in the iterative formula is converted into a check of whether the value to be reduced is not less than K and less than 2K, and if so, K is subtracted from it once; this yields the simplified iterative formula:
wherein $0 \le \delta(i) < K$ and $0 \le \pi(i) + \delta(i) < 2K$;
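The simplified formula itself does not survive in this text. A plausible reconstruction from the description and the stated bounds, assuming in addition that the increment satisfies $0 \le b < K$ (an assumption added here, not stated in the source), is:

$$\pi(i+k) = \begin{cases} \pi(i) + \delta(i), & \pi(i) + \delta(i) < K \\ \pi(i) + \delta(i) - K, & K \le \pi(i) + \delta(i) < 2K \end{cases}$$

$$\delta(i+k) = \begin{cases} \delta(i) + b, & \delta(i) + b < K \\ \delta(i) + b - K, & K \le \delta(i) + b < 2K \end{cases}$$

Because both operands are below $K$, their sum is below $2K$, so a single comparison and one subtraction give the same result as the remainder operation.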
the four-path parallel 4Turbo interleaving address iterative formula is as follows:
let the initial values of the four-path parallel interleaving addresses be $\pi_{init}$, where $p = 0, 1, 2, 3$ is the index of the parallel path and $P = 4$ is the number of parallel paths; the length of each address sequence output after interleaving is $L = K/P$;
it follows that $i = p \cdot L$ and $K = P \cdot L$, which are substituted into the original interleaved address formula:
$$\pi_{init}(i) = [(f_1 + if_2) \times i] \bmod K = [(f_1 + pLf_2) \times pL] \bmod (P \cdot L) = \{[(f_1 + pLf_2) \times p] \bmod P\} \times L$$
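The last step of this derivation uses the identity (a·L) mod (P·L) = (a mod P)·L. A short numerical check, with hypothetical function names and illustrative (K, f1, f2) triples that are assumptions for the example, is:

```python
def pi_init_direct(p, K, f1, f2, P=4):
    """Initial address of path p from the definition formula at i = p*L."""
    L = K // P
    i = p * L
    return ((f1 + i * f2) * i) % K


def pi_init_factored(p, K, f1, f2, P=4):
    """Same value using the factored form {[(f1 + p*L*f2) * p] mod P} * L."""
    L = K // P
    return (((f1 + p * L * f2) * p) % P) * L


# Example: the four path start addresses for K = 1024.
print([pi_init_factored(p, 1024, 31, 64) for p in range(4)])
```

The factored form reduces the remainder to a modulo-P operation, which for P = 4 only involves the two lowest bits of the operand.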
taking the iteration step $k = 2$, the initial value of each path's interleaving address simplifies as follows:
when $i = 0$, $\pi(i) = 0$;
when $i = 1$, $\pi(i) = (f_1 + f_2) \bmod K$; since $f_1 + f_2 \in (0, K)$, $\pi(i) = f_1 + f_2$;
from the condition $K \bmod 32 = 0$ in the protocol table of 3GPP 36.212, it follows that $K \bmod 4 = 0$;
The formula is further simplified as follows:
the above formula is equivalent to:
similarly, deducing:
when [formula], [formula];
when [formula], [formula];
the iteration of $\delta(i)$ also requires initial values; with the iteration step $k = 2$, the $\delta(i)$ iteration formula is:
$\delta(i+2) = (\delta(i) + b) \bmod K$, where $b = 8f_2$;
when $i = 0$, $\delta(0) = (2f_1 + 4f_2) \bmod K$;
when $i = 1$, $\delta(1) = (2f_1 + 8f_2) \bmod K$;
the initial value of the even sequential-address channel of $\delta(i)$ is:
similarly, the initial value of the odd sequential-address channel of $\delta(i)$ is:
thus, with the iteration step $k = 2$, all $\delta(i)$ sequences satisfy:
wherein [formula]; therefore, the iteration path of $\delta(i)$ only needs to multiplex the iterators according to the parity of the sequential address $i$;
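One way to read the parity remark is that, for k = 2, δ(i) = (2f1 + 4if2 + 4f2) mod K takes the same value at i and at i + p·K/4, because the extra term 4·p·(K/4)·f2 is a multiple of K. Two δ iterators, one for even and one for odd indices, can then serve all four paths. The sketch below checks this numerically; the function names and the example parameters are assumptions for the illustration.

```python
def delta_direct(i, K, f1, f2):
    """delta(i) for step k = 2: (2*f1 + 4*i*f2 + 4*f2) mod K."""
    return (2 * f1 + 4 * i * f2 + 4 * f2) % K


def delta_streams(K, f1, f2, count):
    """Iterate the even and odd channels with delta(i+2) = (delta(i) + 8*f2) mod K."""
    b = (8 * f2) % K
    even = (2 * f1 + 4 * f2) % K   # delta(0)
    odd = (2 * f1 + 8 * f2) % K    # delta(1)
    evens, odds = [], []
    for _ in range(count):
        evens.append(even)
        odds.append(odd)
        even = (even + b) % K
        odd = (odd + b) % K
    return evens, odds


# Example: first few even-channel and odd-channel values for K = 1024.
print(delta_streams(1024, 31, 64, 4))
```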
step 2, parameter loading: in the first clock cycle in which vld_in is valid, the user inputs $K$, $f_1$ and $f_2$ are stored; in the four-path parallel LTE-based 4Turbo interleaved address output format, the addresses within each path are output separately for odd and even sequential address indexes, giving 8 groups of interleaved addresses in total; the general terms of the 8 groups of interleaving addresses are $\pi(2n)$, $\pi(2n+1)$, [formula], wherein [formula], $K$ is the code block length, and $n$ is the iteration count; $f_1$ and $f_2$ are preset constants;
step 3, iterative initial value calculation: the corresponding 8 π value iteration paths and two δ paths are designed according to the four-path parallel 4Turbo interleaving address iterative formula; in the 5 clock cycles after vld_in, the 8 π value iteration paths calculate the iteration initial values of their parallel paths, and all iteration initial values are output to the corresponding iteration paths in the 5th clock cycle;
step 4, iteration stage: the 8 π value iteration paths and the two δ paths operate together to calculate the interleaving addresses, generating a new iteration value in every clock cycle; during the iteration, the 8 output interfaces output the corresponding π values, namely the interleaving addresses.
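The four steps can be put together in a behavioural model that produces 8 addresses per cycle for K/8 cycles. This is a sketch under stated assumptions rather than the claimed hardware: the lane ordering (lane 2p for the even index and lane 2p+1 for the odd index of path p), the function names, and the example parameters are hypothetical, and the initial values are computed here with the definition formula instead of the simplified hardware forms.

```python
def add_sub(a, b, K):
    """Modulo-K addition by one compare and one subtract (needs 0 <= a, b < K)."""
    s = a + b
    return s - K if s >= K else s


def generate_addresses(K, f1, f2):
    """Return K/8 rows of 8 interleaved addresses (4 paths x even/odd index)."""
    L = K // 4
    b = (8 * f2) % K                                  # increment for step k = 2
    # Initial values of the 8 pi paths: pi(p*L) and pi(p*L + 1), p = 0..3.
    pi = [(f1 * i + f2 * i * i) % K
          for p in range(4) for i in (p * L, p * L + 1)]
    d_even = (2 * f1 + 4 * f2) % K                    # delta(0)
    d_odd = (2 * f1 + 8 * f2) % K                     # delta(1)
    rows = []
    for _ in range(K // 8):
        rows.append(list(pi))
        # Even lanes share d_even, odd lanes share d_odd.
        pi = [add_sub(v, d_even if lane % 2 == 0 else d_odd, K)
              for lane, v in enumerate(pi)]
        d_even = add_sub(d_even, b, K)
        d_odd = add_sub(d_odd, b, K)
    return rows


# Example: the first output row for K = 40, f1 = 3, f2 = 10.
print(generate_addresses(40, 3, 10)[0])
```

Inside the loop there is no multiplication or remainder operation, only additions, comparisons, and subtractions.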
2. The four-path parallel LTE-based 4Turbo interleaving address generation method according to claim 1, wherein the 8 π value iteration paths and the two δ paths are: the π(2n) iteration path, the π(2n+1) iteration path, the remaining six π iteration paths [formulas], the δ(2n) iteration path, and the δ(2n+1) iteration path.
3. The four-path parallel LTE-based 4Turbo interleaving address generation method according to claim 2, wherein the π(2n) iteration path, the π(2n+1) iteration path, the [formula] iteration path, the [formula] iteration path, the [formula] iteration path, the δ(2n) iteration path, and the [formula] iteration path each comprise a corresponding iterator and two 2-to-1 data selectors;
when the input init_done signal is low, the 0 inputs of the two 2-to-1 data selectors are selected, and the data at those inputs are passed to the iterator to calculate its initial value; after the initial value calculation is completed, the init_done signal is high, the 1 inputs of the two 2-to-1 data selectors are selected, the data at the 1 inputs are passed to the corresponding iterator, and the iterator iterates cyclically from the initial value and the input data to generate the corresponding interleaved address sequence in turn.
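The structure in claim 3 can be modelled as one adder with a register and two selectors switched by init_done. The wiring assumed below (selector A chooses between an initial operand and the register feedback, selector B between a second initial operand and δ) is one plausible reading, not something this text confirms; the class and parameter names are hypothetical.

```python
class IterationPath:
    """One iterator (modulo-K adder plus register) fed by two 2-to-1 selectors."""

    def __init__(self, K):
        self.K = K
        self.reg = 0

    def clock(self, init_done, init_a, init_b, delta):
        # Selector A: input 0 = init_a, input 1 = register feedback (assumed wiring).
        a = self.reg if init_done else init_a
        # Selector B: input 0 = init_b, input 1 = delta (assumed wiring).
        b = delta if init_done else init_b
        s = a + b
        self.reg = s - self.K if s >= self.K else s   # compare and subtract once
        return self.reg


# Example for the pi(2n+1) path with K = 40, f1 = 3, f2 = 10:
# the initial cycle forms pi(1) = f1 + f2, later cycles add delta(2n+1).
path = IterationPath(40)
first = path.clock(0, 3, 10, 0)
second = path.clock(1, 0, 0, (2 * 3 + 8 * 10) % 40)
print(first, second)
```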
4. The four-path parallel LTE-based 4Turbo interleaving address generation method according to claim 2, wherein the [formula] iteration path comprises a [formula] iterator and three 2-to-1 data selectors;
when the init_done signal is low, one of the 2-to-1 data selectors selects 0 when f1[1] = 0 and selects K>>1 when f1[1] = 1, and passes the selected value to the [formula] iterator; the 0 inputs of the other two 2-to-1 data selectors are the active inputs, and the data K>>2 at the active input is passed to the [formula] iterator; the [formula] iterator calculates the initial value [formula] from the two input data;
when the initial value calculation is completed, the init_done signal is high, the 1 inputs of the two 2-to-1 data selectors directly connected to the [formula] iterator are the active inputs, and the input data δ_even at the active input is passed to the [formula] iterator; the [formula] iterator iterates cyclically from the input initial value [formula] and δ_even to generate the [formula] interleaved address sequence.
5. The four-path parallel LTE-based 4Turbo interleaving address generation method according to claim 2, wherein the [formula] iteration path comprises a [formula] iterator and three 2-to-1 data selectors;
when the init_done signal is low, one of the 2-to-1 data selectors selects K>>1 when f1[1] = 0 and selects 0 when f1[1] = 1, and passes the selected value to the [formula] iterator; the 0 inputs of the other two 2-to-1 data selectors are the active inputs, and the data K>>2 at the active input is passed to the [formula] iterator; the [formula] iterator calculates the initial value [formula] from the two input data;
when the initial value calculation is completed, the init_done signal is high, the 1 inputs of the two 2-to-1 data selectors directly connected to the [formula] iterator are the active inputs, and the input data δ_even at the active input is passed to the [formula] iterator; the [formula] iterator iterates cyclically from the input initial value [formula] and δ_even to generate the [formula] interleaved address sequence.
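The selector rules in claims 4 and 5 form an initial value from K>>2 plus either 0 or K>>1, chosen by bit 1 of f1. The path labels are missing from this text; the sketch below assumes that claim 4 describes the path starting at index K/4 and claim 5 the path starting at index 3K/4, and checks that reading against the definition formula. The function names and the (K, f1, f2) triples are illustrative assumptions.

```python
def pi_direct(i, K, f1, f2):
    """Definition formula: (f1*i + f2*i^2) mod K."""
    return (f1 * i + f2 * i * i) % K


def init_value_claim4(K, f1):
    """K>>2 plus 0 when f1[1] = 0, or plus K>>1 when f1[1] = 1."""
    sel = (K >> 1) if (f1 >> 1) & 1 else 0
    return sel + (K >> 2)


def init_value_claim5(K, f1):
    """K>>2 plus K>>1 when f1[1] = 0, or plus 0 when f1[1] = 1."""
    sel = 0 if (f1 >> 1) & 1 else (K >> 1)
    return sel + (K >> 2)


# Example: compare with the definition formula at i = K/4 and i = 3K/4.
for K, f1, f2 in [(1024, 31, 64), (1056, 17, 66), (6144, 263, 480)]:
    print(K,
          init_value_claim4(K, f1) == pi_direct(K // 4, K, f1, f2),
          init_value_claim5(K, f1) == pi_direct(3 * K // 4, K, f1, f2))
```

The match relies on f1 being odd and on (K/4)·f2 being a multiple of 4, which holds for the example triples.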
6. The four-path parallel LTE-based 4Turbo interleaving address generation method according to claim 2, wherein the δ(2n+1) iteration path comprises a δ(2n+1) iterator and one 2-to-1 data selector;
when the init_done signal is low, the 0 input of the 2-to-1 data selector is the active input, and its input f1<<1 is passed to the δ(2n+1) iterator; f2<<3 is input directly to the δ(2n+1) iterator, and the δ(2n+1) iterator calculates the initial value δ(1) from the two input data; when the initial value calculation is completed, the init_done signal is high, the 1 input of the 2-to-1 data selector is the active input, and the input data of the δ(2n+1) iterator are δ(1) and f2<<3; loop iteration is repeated continuously to generate the π(2n+1) interleaved address sequence in turn until generation is finished.
8. The four-path parallel LTE-based 4Turbo interleaving address generation method according to claim 1, wherein K ranges from 1024 to 6144, and the minimum spacing between two K values is 32; f1 and f2 are odd and even, respectively.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110019218.1A CN112751572B (en) | 2021-01-07 | 2021-01-07 | Four-path parallel LTE-based 4Turbo interleaving address generation method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112751572A CN112751572A (en) | 2021-05-04 |
CN112751572B true CN112751572B (en) | 2023-03-14 |
Family
ID=75650232
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110019218.1A Active CN112751572B (en) | 2021-01-07 | 2021-01-07 | Four-path parallel LTE-based 4Turbo interleaving address generation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112751572B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101087181A (en) * | 2007-07-11 | 2007-12-12 | 中兴通讯股份有限公司 | A method for removing interweaving and speed match |
CN102739358A (en) * | 2012-06-01 | 2012-10-17 | 武汉邮电科学研究院 | Method for realizing parallel Turbo code interweaver and used in LTE (Long Term Evolution) |
CN103986557A (en) * | 2014-05-23 | 2014-08-13 | 西安电子科技大学 | LTE Turbo code parallel block decoding method with low path delay |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2880483A1 (en) * | 2004-12-31 | 2006-07-07 | France Telecom | INTERLACING METHOD AND DEVICE |
US8140932B2 (en) * | 2007-11-26 | 2012-03-20 | Motorola Mobility, Inc. | Data interleaving circuit and method for vectorized turbo decoder |
US20110087949A1 (en) * | 2008-06-09 | 2011-04-14 | Nxp B.V. | Reconfigurable turbo interleavers for multiple standards |
TW201209711A (en) * | 2010-08-19 | 2012-03-01 | Ind Tech Res Inst | Address generation apparatus and method for quadratic permutation polynomial interleaver |
CN102412850B (en) * | 2010-09-25 | 2014-02-05 | 中兴通讯股份有限公司 | Turbo code parallel interleaver and parallel interleaving method thereof |
US8762808B2 (en) * | 2012-02-22 | 2014-06-24 | Lsi Corporation | Multi-processing architecture for an LTE turbo decoder (TD) |
Non-Patent Citations (2)
Title |
---|
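Implementation of a flexible parallel contention-free Turbo code interleaver in LTE; Huang Yuebin et al.; Journal of Fudan University (Natural Science); 20130615 (No. 03); full text * |
Hardware-friendly 3GPP-LTE Turbo interleaver design; Yao Yanbin et al.; Chinese High Technology Letters; 20170115 (No. 01); full text * |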
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||