CN112751572B

CN112751572B - Four-path parallel LTE-based 4Turbo interleaving address generation method

Info

Publication number: CN112751572B
Application number: CN202110019218.1A
Authority: CN
Inventors: 曹运合; 郭超; 孙正源; 李�城; 牛艺锋
Original assignee: Xidian University
Current assignee: Xidian University
Priority date: 2021-01-07
Filing date: 2021-01-07
Publication date: 2023-03-14
Anticipated expiration: 2041-01-07
Also published as: CN112751572A

Abstract

The invention discloses a four-way parallel LTE-based 4Turbo interleaving address generation method. It realizes a low-complexity four-way, eight-group parallel interleaving address generator that occupies few resources, consumes little power, and outputs four ways of odd and even interleaving addresses in parallel. Compared with a traditional FPGA implementation of interleaved address calculation, the ASIC-oriented hardware circuit design achieves higher performance; compared with a scheme that uses a dedicated iteration initial value calculation circuit, reusing the iterators for initial value calculation effectively reduces the design area; and clock gating effectively reduces the design power consumption. The clock gating rate of the design is 98.94%, and the area after DC synthesis at a 500 MHz clock is about 9536.9 μm². The method suits the practical requirements of most integrated radar-communication ASIC applications.

Description

Four-way parallel LTE-based 4Turbo interleaving address generation method

Technical Field

The invention belongs to the technical field of radar communication integration, and particularly relates to a four-path parallel LTE-based 4Turbo interleaving address generation method.

Background

With the application of radar target detection to communication systems, integrated radar-communication systems have emerged, in which modulated pulses such as BPSK and QPSK are used to achieve radar target detection while performing communication processing. An FPGA or ASIC implementation of the Turbo interleaving address calculation in the 3GPP 36.212 protocol can effectively meet the interleaving address calculation requirements of a miniaturized, low-power, high-performance integrated radar-communication system.

The original calculation formula of the interleaved address is $\pi(i) = (f_1 i + f_2 i^2) \bmod K$, with $i = 0, 1, 2, \ldots, K-1$, $1024 \le K \le 6144$ and $K \bmod 32 = 0$, where $\pi(i)$ is the output address of the interleaver, $i$ is the sequential address before interleaving, $K$ is the code block length, and $f_1$, $f_2$ are preset constants uniquely determined by $K$. The formula contains multiplication, division (remainder) and addition operations. Evaluating it directly in hardware requires a multiplier and a divider, which consume a large amount of logic resources; and the larger the dividend, the more complex the hardware.
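As a reference point for the rest of the description, the defining formula can be evaluated directly in software. The following is a minimal sketch, not part of the patented circuit; the function name `qpp_interleave` is an assumption introduced here, and the parameter set K = 1504, f1 = 49, f2 = 846 is the one quoted later in the description.

```python
def qpp_interleave(K, f1, f2):
    """Return [pi(0), ..., pi(K-1)] using pi(i) = (f1*i + f2*i^2) mod K."""
    return [(f1 * i + f2 * i * i) % K for i in range(K)]


# Parameter set quoted later in the description.
K, f1, f2 = 1504, 49, 846
addresses = qpp_interleave(K, f1, f2)

# A usable interleaver must visit every address exactly once.
is_permutation = sorted(addresses) == list(range(K))
print(addresses[:4], is_permutation)
```

This direct form needs a multiplication and a remainder per address, which is exactly the cost the iterative scheme below avoids.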

If the interleaving address is instead solved by iteration, the initial values for the iteration must be calculated in advance, which conventionally requires an independent initial value solving circuit. Although part of its logic can be multiplexed, this circuit is active only during initialization and sits idle while the interleaved addresses are generated iteratively, which wastes area.

Disclosure of Invention

Aiming at the problems in the prior art, the invention aims to provide a four-path parallel LTE-based 4Turbo interleaving address generation method, which realizes a four-path eight-group parallel interleaving address generator with low complexity, has less resource occupation and low power consumption, completes the Turbo interleaving address calculation function in a 3GPP36.212 protocol and can output 4 odd-even interleaving addresses in parallel.

In order to achieve the purpose, the invention is realized by adopting the following technical scheme.

The four-path parallel LTE-based 4Turbo interleaving address generation method comprises the following steps:

step 1, iterative path optimization: optimizing an iterative formula of an original interleaving address to eliminate multiplication and remainder operation in the original iterative formula and obtain an optimized iterative formula; further obtaining a four-path parallel 4Turbo interleaving address iterative formula;

step 2, parameter loading: storing the $K$, $f_1$ and $f_2$ input by the user in the first clock cycle in which vld_in is valid; the output format of the four-path parallel LTE-based 4Turbo interleaved addresses is that the addresses within each path are split by odd and even sequential address index, so that 8 groups of interleaving addresses are output in total; the general formula of the 8 groups of interleaving addresses is $\pi(2n)$, $\pi(2n+1)$, $\pi(2n+K/4)$, $\pi(2n+1+K/4)$, $\pi(2n+K/2)$, $\pi(2n+1+K/2)$, $\pi(2n+3K/4)$, $\pi(2n+1+3K/4)$,

wherein $n = 0, 1, \ldots, K/8-1$; $K$ is the code block length, and $n$ is the iteration number; $f_1$, $f_2$ are preset constants;

step 3, iterative initial value calculation: designing the corresponding 8 $\pi$ value iteration paths and two $\delta$ paths according to the four-path parallel 4Turbo interleaving address iteration formula; in the 5 clock cycles after vld_in, using the 8 $\pi$ value iteration paths to calculate the iteration initial values of the corresponding parallel paths, and outputting all the iteration initial values to the corresponding iteration paths in the 5th clock cycle;

step 4, iteration stage: the 8 $\pi$ value iteration paths and the two $\delta$ paths operate together to calculate the interleaving addresses; a new iteration value is generated in each clock cycle, and during the iteration the 8 output interfaces output the corresponding $\pi$ values, namely the interleaving addresses.

Further, the 8 $\pi$ value iteration paths and the two $\delta$ paths are: the $\pi(2n)$ iteration path, the $\pi(2n+1)$ iteration path, the $\pi(2n+K/4)$ iteration path, the $\pi(2n+1+K/4)$ iteration path, the $\pi(2n+K/2)$ iteration path, the $\pi(2n+1+K/2)$ iteration path, the $\pi(2n+3K/4)$ iteration path, the $\pi(2n+1+3K/4)$ iteration path, the $\delta(2n)$ iteration path, and the $\delta(2n+1)$ iteration path.

Further, after the iterative process begins, the iteration output of the $\pi(2n+3K/4)$ iteration channel is continuously monitored; when that channel outputs 0, the iteration process has terminated.

Furthermore, the value range of $K$ is 1024 to 6144, and the minimum interval between two values of $K$ is 32; $f_1$ and $f_2$ are odd and even, respectively.

Further, when the natural address is an odd number, the interleaving address is an odd number; when the natural address is an even number, the interleaved address is an even number.

Compared with the prior art, the invention has the beneficial effects that:

(1) The method realizes a low-complexity four-way eight-group parallel interleaving address generator, has less resource occupation and low power consumption, completes the turbo interleaving address calculation function in a 3GPP36.212 protocol, and can output 4-way odd-even interleaving addresses in parallel. Compared with the traditional FPGA implementation interleaved address calculation scheme, the hardware circuit design scheme based on the special calculation function of the ASIC has higher performance; compared with a special iteration initial value calculation scheme, the iterator provided by the invention realizes multiplexing in an initial value calculation stage, and can effectively reduce the design area.

(2) The invention adopts clock gating, which effectively reduces the design power consumption. The clock gating rate of the invention is 98.94%; the clock frequency can reach 1.1 GHz with the TSMC 90 nm process library, giving high performance; and the area after DC synthesis at a 500 MHz clock is about 9536.9 μm², which suits the practical requirements of most integrated radar-communication ASIC applications. In addition, an automated simulation script is provided that supports one-click simulation in a ModelSim 10.6 environment, so that the design can be verified quickly.

Drawings

The invention is described in further detail below with reference to the figures and specific embodiments.

FIG. 1 is a circuit diagram of a prior art sum and remainder iterator;

FIG. 2 is a diagram of a circuit for solving an initial value of an interleaving address in the prior art;

FIG. 3 is a diagram of a remainder circuit configuration according to an embodiment of the invention;

FIG. 4 is a circuit structure diagram for solving $\delta_{init}$ when the input data range is greater than 2K according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of a pi (2 n) iterative path circuit according to an embodiment of the present invention;

FIG. 6 is a diagram of a π (2n + 1) iterative path circuit according to an embodiment of the present invention;

FIG. 7 is a circuit structure diagram of a further $\pi$ value iteration path according to an embodiment of the present invention;

FIG. 8 is a circuit structure diagram of a further $\pi$ value iteration path according to an embodiment of the present invention;

FIG. 9 is a circuit structure diagram of a further $\pi$ value iteration path according to an embodiment of the present invention;

FIG. 10 is a diagram of a delta (2 n) iterative path circuit according to an embodiment of the present invention;

FIG. 11 is a circuit structure diagram of a further $\pi$ value iteration path according to an embodiment of the present invention;

FIG. 12 is a circuit structure diagram of the $\pi(2n+K/4)$ iteration path according to an embodiment of the present invention;

FIG. 13 is a circuit structure diagram of the $\pi(2n+3K/4)$ iteration path according to an embodiment of the present invention;

FIG. 14 is a diagram of an iterative path circuit for δ (2n + 1) according to an embodiment of the present invention;

FIG. 15 is a timing diagram illustrating an iterative detection of interleaved addresses in accordance with an embodiment of the present invention;

FIG. 16 is a timing diagram illustrating exemplary operation of an interleaved address generation module according to an embodiment of the present invention;

FIG. 17 is a modelsim platform output diagram in accordance with an embodiment of the present invention;

FIG. 18 shows the clock gating rate after DC synthesis according to an embodiment of the present invention;

FIG. 19 shows the top-level module after DC synthesis according to an embodiment of the present invention;

FIG. 20 shows the synthesized area report according to an embodiment of the present invention.

Detailed Description

Embodiments of the present invention will be described in detail below with reference to examples, but it will be understood by those skilled in the art that the following examples are only illustrative of the present invention and should not be construed as limiting the scope of the present invention.

The four-path parallel LTE-based 4Turbo interleaving address generation method provided by the embodiment of the invention comprises the following steps of:

step 1, iterative formula optimization: optimizing an iterative formula of an original interleaving address to eliminate multiplication and remainder operation in the original iterative formula and obtain an optimized iterative formula; further obtaining four-path parallel 4Turbo interleaving address iterative formulas;

firstly, eliminating multiplication operation and performing the following steps:

the iterative formula of the original interleaved address is:

$$\pi(i+k) = \left(f_1 (i+k) + f_2 (i+k)^2\right) \bmod K = \left(i f_1 + i^2 f_2 + k f_1 + 2ik f_2 + k^2 f_2\right) \bmod K$$

let $\delta(i) = (k f_1 + 2ik f_2 + k^2 f_2) \bmod K$, then:

$$\pi(i+k) = (\pi(i) + \delta(i)) \bmod K$$

the iterative calculation of $\delta(i)$ is then:

$$\delta(i+k) = \left(k f_1 + 2(i+k)k f_2 + k^2 f_2\right) \bmod K = \left(k f_1 + 2ik f_2 + 2k^2 f_2 + k^2 f_2\right) \bmod K = (\delta(i) + b) \bmod K$$

wherein $\pi(i)$ is the original interleaving address, $k$ is the iteration step, $b = 2k^2 f_2$, and $K$ is the code block length;

then, the remainder operation is eliminated: the remainder operation in the iterative formula is converted into a test of whether the operand is greater than or equal to $K$ (and less than $2K$); if so, $K$ is subtracted from it once. This gives the simplified iterative formula:

$$\pi(i+k) = \begin{cases} \pi(i)+\delta(i), & \pi(i)+\delta(i) < K \\ \pi(i)+\delta(i)-K, & K \le \pi(i)+\delta(i) < 2K \end{cases}$$

wherein $0 \le \delta(i) < K$, $0 \le \pi(i) < K$, and $0 \le \pi(i)+\delta(i) < 2K$;

The simplified form follows from the value range $0 \le \pi(i)+\delta(i) < 2K$. It maps directly onto a hardware circuit and avoids division and remainder operations: only simple comparison and addition are needed, which can be realized with simple arithmetic logic units such as comparators, multiplexers and adders.
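The recurrence and the compare-and-subtract replacement for the remainder can be checked in software. The sketch below is illustrative only: the names `add_mod` and `iterate_path` are assumptions introduced here, the initial values are taken from the defining formula rather than from the hardware initial value flow, and b is reduced modulo K before use so that every sum stays below 2K.

```python
def add_mod(a, b, K):
    """(a + b) mod K for 0 <= a, b < K, using one compare and one subtract."""
    s = a + b
    return s - K if s >= K else s


def iterate_path(K, f1, f2, start, step, count):
    """Generate pi(start), pi(start + step), ... without multiply or divide
    inside the loop. Initial values come from the defining formula here."""
    pi = (f1 * start + f2 * start * start) % K
    delta = (step * f1 + 2 * start * step * f2 + step * step * f2) % K
    b = (2 * step * step * f2) % K  # b reduced mod K so that delta + b < 2K
    out = []
    for _ in range(count):
        out.append(pi)
        pi = add_mod(pi, delta, K)     # pi(i+k) = (pi(i) + delta(i)) mod K
        delta = add_mod(delta, b, K)   # delta(i+k) = (delta(i) + b) mod K
    return out


K, f1, f2 = 1504, 49, 846
even_addresses = iterate_path(K, f1, f2, start=0, step=2, count=K // 2)
odd_addresses = iterate_path(K, f1, f2, start=1, step=2, count=K // 2)
```

Inside the loop only additions, comparisons and subtractions remain, which is the property the hardware mapping relies on.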

Next consider the calculation of the initial values of the 8 groups of parallel interleaved addresses, since the iteration cannot proceed without them. The required 8 groups of interleaved addresses have the general formula $\pi(2n)$, $\pi(2n+1)$, $\pi(2n+K/4)$, $\pi(2n+1+K/4)$, $\pi(2n+K/2)$, $\pi(2n+1+K/2)$, $\pi(2n+3K/4)$, $\pi(2n+1+3K/4)$,

wherein $n = 0, 1, \ldots, K/8-1$.

When $n = 0$, the corresponding initial values of the four-way parallel 8 groups of interleaving addresses are $\pi(0)$, $\pi(1)$, $\pi(K/4)$, $\pi(1+K/4)$, $\pi(K/2)$, $\pi(1+K/2)$, $\pi(3K/4)$, $\pi(1+3K/4)$.

To derive the four-way parallel interleaving address initial value $\pi_{init}$, let $p = 0, 1, 2, 3$ denote the index of each of the four parallel paths and $P = 4$ denote the number of parallel paths; the length of each address sequence output after interleaving is:

$$L = K / P$$

It follows that $i = p \cdot L$ and $K = P \cdot L$; substituting these into the original interleaved address formula gives:

$$\pi_{init}(i) = [(f_1 + i f_2) \times i] \bmod K$$

$$\pi_{init}(i) = [(f_1 + pL f_2) \times pL] \bmod (P \cdot L)$$

Factoring $L$ out of the formula gives:

$$\pi_{init}(i) = \{[(f_1 + pL f_2) \times p] \bmod P\} \times L$$

Since $P = 4$ is the number of parallel paths, the remainder with respect to the code block length $K$ is converted into a remainder with respect to the fixed value $P$, which greatly reduces the complexity of the hardware implementation.
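The factoring step rests on a general identity for remainders, spelled out here for completeness. The symbols $a$, $q$ and $r$ are introduced only for this note and do not appear in the original text. Writing $a = (f_1 + pL f_2) \times p$ and dividing $a$ by $P$:

$$a = qP + r,\quad 0 \le r < P \;\Longrightarrow\; aL = q(PL) + rL,\quad 0 \le rL < PL$$

$$\Longrightarrow\; (aL) \bmod (PL) = rL = (a \bmod P) \times L$$

So the remainder modulo $K = PL$ of a multiple of $L$ reduces to a remainder modulo $P$, scaled by $L$.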

To meet the design requirement, the iteration step is taken as $k = 2$, and the initial value of each interleaving address path is simplified case by case:

when $i = 0$, $\pi(i) = 0$;

when $i = 1$, $\pi(i) = (f_1 + f_2) \bmod K$; since $f_1 + f_2 \in (0, K)$ for the entries in the protocol table, $\pi(i) = f_1 + f_2$;

When $i = K/4$, i.e. parallel channel index $p = 1$:

$$\pi_{init}(i) = \{[(f_1 + 1 \cdot L f_2) \times 1] \bmod P\} \times L$$

wherein $L = K/4$. According to the known condition $K \bmod 32 = 0$ in the 3GPP 36.212 protocol table, $L$ is a multiple of 4, so $(L f_2) \bmod 4 = 0$ holds.

The formula is further simplified as follows:

$$\pi_{init}(i) = (f_1 \bmod 4) \times L$$

Since $f_1 \bmod 4$ is a remainder with respect to an integer power of 2, it equals the lower 2 bits of the binary representation of $f_1$. And because $f_1$ is odd, its lowest bit must be 1, so $\pi_{init}(i)$ is determined by the second-lowest bit $f_1[1]$ alone:

$$\pi_{init}(i) = \begin{cases} K/4, & f_1[1] = 0 \\ 3K/4, & f_1[1] = 1 \end{cases}$$

When $i = K/2$, i.e. parallel channel index $p = 2$:

$$\pi_{init}(i) = \{[(f_1 + 2 \cdot L f_2) \times 2] \bmod P\} \times L$$

From $K \bmod 32 = 0$ it follows that $(K f_2) \bmod 4 = 0$ holds, and since $f_1$ is odd, $2 f_1 \bmod 4$ is always the constant 2. The above formula can therefore be further simplified as follows:

$$\pi_{init}(i) = 2L = K/2$$

When $i = 3K/4$, i.e. parallel channel index $p = 3$:

$$\pi_{init}(i) = \{[(f_1 + 3 \cdot L f_2) \times 3] \bmod P\} \times L$$

similarly, it can be deduced that:

$$\pi_{init}(i) = [(3 f_1) \bmod 4] \times L = \begin{cases} 3K/4, & f_1[1] = 0 \\ K/4, & f_1[1] = 1 \end{cases}$$

In summary, the initial values satisfy the following relations:

when $i = K/4$, $\pi_{init}(i) = K/4$ if $f_1[1] = 0$ and $3K/4$ if $f_1[1] = 1$;

when $i = K/2$, $\pi_{init}(i) = K/2$;

when $i = 3K/4$, $\pi_{init}(i) = 3K/4$ if $f_1[1] = 0$ and $K/4$ if $f_1[1] = 1$;
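The case analysis for the path initial values can be cross-checked against the defining formula. The sketch below is an illustration under stated assumptions: the function name `pi_init_from_bit` is introduced here, and the value f1 = 51 is a hypothetical odd constant (not a protocol entry) used only to exercise the $f_1[1] = 1$ branch.

```python
def pi_init_from_bit(K, f1, p):
    """Initial value pi(p*K/4) chosen from shifts of K and the bit f1[1] only."""
    quarter = K >> 2                 # L = K/4
    bit1 = (f1 >> 1) & 1             # second-lowest bit of f1
    if p == 0:
        return 0
    if p == 1:
        return 3 * quarter if bit1 else quarter
    if p == 2:
        return K >> 1
    if p == 3:
        return quarter if bit1 else 3 * quarter
    raise ValueError("p must be 0, 1, 2 or 3")


def pi_direct(K, f1, f2, i):
    """Reference value from pi(i) = (f1*i + f2*i^2) mod K."""
    return (f1 * i + f2 * i * i) % K


K, f2 = 1504, 846
# f1 = 49 is the value quoted in the description; f1 = 51 is hypothetical.
checks = [
    pi_init_from_bit(K, f1, p) == pi_direct(K, f1, f2, p * (K >> 2))
    for f1 in (49, 51)
    for p in range(4)
]
print(all(checks))
```

No multiplier or remainder unit is involved: each initial value is a shifted copy of K selected by a single bit.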

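Next the initial value of $\delta(i)$ for the iteration is derived. According to the iterative formula of $\delta(i)$ with iteration step $k = 2$:

$$\delta(i) = (2 f_1 + 4 i f_2 + 4 f_2) \bmod K$$

$$\delta(i+2) = (\delta(i) + b) \bmod K \quad (\text{where } b = 8 f_2)$$

When $i = 0$, $\delta(0) = (2 f_1 + 4 f_2) \bmod K$;

When $i = 1$, $\delta(1) = (2 f_1 + 8 f_2) \bmod K$;

When $i = K/4$, $K/2$ and $3K/4$:

$$\delta(K/4) = (2 f_1 + K f_2 + 4 f_2) \bmod K$$

$$\delta(K/2) = (2 f_1 + 2K f_2 + 4 f_2) \bmod K$$

$$\delta(3K/4) = (2 f_1 + 3K f_2 + 4 f_2) \bmod K$$

Observing the above three formulas, the second terms are $K f_2$, $2K f_2$ and $3K f_2$, whose remainders modulo $K$ are zero, so the following relation holds; that is, the initial value of the even sequential-address channels of $\delta(i)$ is:

$$\delta_{init}(even) = \delta(0) = \delta(K/4) = \delta(K/2) = \delta(3K/4) = (2 f_1 + 4 f_2) \bmod K$$

similarly, the same kind of relation holds for the odd sequential-address channels, and the initial value is:

$$\delta_{init}(odd) = \delta(1) = \delta(1+K/4) = \delta(1+K/2) = \delta(1+3K/4) = (2 f_1 + 8 f_2) \bmod K$$

Hence all $\delta(i)$ sequences obtained by iterating with step $k = 2$ satisfy:

$$\delta(2n) = \delta(2n+K/4) = \delta(2n+K/2) = \delta(2n+3K/4)$$

$$\delta(2n+1) = \delta(2n+1+K/4) = \delta(2n+1+K/2) = \delta(2n+1+3K/4)$$

wherein $n = 0, 1, \ldots, K/8-1$.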

Therefore the $\delta(i)$ iteration circuit only needs one iterator per parity of the sequential address $i$, i.e. 2 parallel $\delta(i)$ iterators shared by all four paths, which reduces area. When the module structure is optimized further, the 8 parallel $\pi(i)$ iterators can also be reused to reduce the area even more.
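The claim that two $\delta$ iterators suffice for all eight address groups can be checked numerically. This is a small sketch under the step $k = 2$ formula above; the function name `delta` and the variable `shared` are assumptions introduced here.

```python
def delta(K, f1, f2, i):
    """delta(i) for iteration step k = 2."""
    return (2 * f1 + 4 * i * f2 + 4 * f2) % K


K, f1, f2 = 1504, 49, 846
quarter = K // 4

# For every iteration n and every path p, the even (offset 0) and odd
# (offset 1) delta values must equal those of path 0.
shared = all(
    delta(K, f1, f2, 2 * n + offset + p * quarter) == delta(K, f1, f2, 2 * n + offset)
    for n in range(K // 8)
    for p in range(4)
    for offset in (0, 1)
)
print(shared)
```

The equality holds because the path offset contributes $4 \cdot (pK/4) \cdot f_2 = pKf_2$, a multiple of $K$.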

Both iterative simplification and initial value calculation can be achieved by the simplest logic, which provides a simplified model for the hardware mapping scheme.

The basic iterative operation units of the existing iterators share the same structure: each has the form $(a + b) \bmod K$ and performs 'sum and remainder', with $\pi + \delta \in [0, 2K)$ or $\delta + b \in [0, 2K)$.

When the interleaving address is solved by an iterative formula, the initial values for the iteration must be calculated in advance. In the prior art, an independent initial value solving module is designed to calculate $\pi(0)$, $\pi(1)$, $\pi(K/4)$, $\pi(1+K/4)$, $\pi(K/2)$, $\pi(1+K/2)$, $\pi(3K/4)$, $\pi(1+3K/4)$, $\delta_{init}(even)$ and $\delta_{init}(odd)$ separately, but this scheme introduces an additional initial value calculation circuit. Although part of its logic can be multiplexed, this circuit is active only during initialization and sits idle while the interleaved addresses are generated iteratively, which wastes area.

Fig. 2 is a schematic diagram of a conventional circuit structure for solving the initial values of the interleaved address. The circuit calculates $\pi(0)$, $\pi(1)$ and part of the remaining initial values in the first clock cycle after the user input; the other initial values are obtained in the second clock cycle.

FIG. 3 shows the circuit structure of the single-remainder mod module. The data at the input of this module must satisfy $din \in [0, 2K)$; otherwise the remainder cannot be calculated correctly. Because the comparator-subtractor remainder structure works correctly only when the input value lies between 0 and $2K$, the input operand must be guaranteed to be smaller than $2K$. However, consider the case $K = 1504$, $f_1 = 49$ and $f_2 = 846$. The initial values of $\delta(i)$ are calculated as:

$$\delta_{init}(even) = 2 f_1 + 4 f_2$$

$$\delta_{init}(odd) = 2 f_1 + 8 f_2$$

A simple analysis shows that both $4 f_2$ and $8 f_2$ are larger than $2K$. Therefore, when solving the initial value of $\delta(i)$, the initial value calculation circuit must be designed to handle input data larger than $2K$. The circuit designed for this requirement is shown in fig. 4, and its working flow is as follows:

(1) First, $f_2$ is shifted left by 1 bit and the remainder modulo $K$ is taken, giving $2 f_2 \bmod K$.

(2) The result of step (1) is shifted left by 1 bit and the remainder modulo $K$ is taken, giving $4 f_2 \bmod K$; this result is summed with $2 f_1$ to give $\delta_{init}(even)$.

(3) The $4 f_2 \bmod K$ obtained in step (2) is shifted left by 1 bit and the remainder modulo $K$ is taken, giving $8 f_2 \bmod K$; this result is summed with $2 f_1$ to give $\delta_{init}(odd)$.
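The three-step shift-and-reduce flow can be modelled as follows. This is a hedged sketch rather than the circuit itself: the names `reduce_once` and `delta_init` are assumptions introduced here, and reducing $2f_1$ and the final sums with the same compare-and-subtract unit is an assumption made so that every operand stays below $2K$.

```python
def reduce_once(x, K):
    """Comparator-subtractor remainder: valid only for 0 <= x < 2K."""
    assert 0 <= x < 2 * K, "operand outside the range the unit can handle"
    return x - K if x >= K else x


def delta_init(K, f1, f2):
    """Return (delta_init(even), delta_init(odd)) using shifts and single
    conditional subtractions only."""
    f2_x2 = reduce_once(f2 << 1, K)      # step (1): 2*f2 mod K
    f2_x4 = reduce_once(f2_x2 << 1, K)   # step (2): 4*f2 mod K
    f2_x8 = reduce_once(f2_x4 << 1, K)   # step (3): 8*f2 mod K
    f1_x2 = reduce_once(f1 << 1, K)      # assumption: 2*f1 is reduced as well
    even = reduce_once(f1_x2 + f2_x4, K)
    odd = reduce_once(f1_x2 + f2_x8, K)
    return even, odd


K, f1, f2 = 1504, 49, 846
initial_deltas = delta_init(K, f1, f2)
print(initial_deltas)
```

Each shift doubles a value already below K, so the operand of every reduction stays inside $[0, 2K)$ even though $4f_2$ and $8f_2$ themselves exceed $2K$ for this parameter set.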

The existing remainder calculation relies on an independent initial value calculation circuit. That circuit is separate from the iterative circuit and works only at the start of operation; it is idle while the interleaving addresses are generated, which clearly wastes 'computing power'.

The iterative operation unit for interleaved address generation computes $(a + b) \bmod K$, and all intermediate operands in the initial value calculation lie in $[0, 2K)$, which means that every operation in the initial value calculation can be performed by the iterative operation unit.

The invention designs a data processing flow, and the iterative operation unit is multiplexed to calculate the iterative initial value, so that an initial value calculation module in the original structure is omitted, and the design area can be greatly reduced.

To reduce the complexity of data transfer in the multiplexing structure, the initial value calculation follows the principle of 'in-place operation': every iteration initial value is produced in its own iteration unit (e.g. $\pi(1)$ is calculated in the iterator that generates $\pi(1), \pi(3), \pi(5), \pi(7), \ldots$). Because $f_2$ is large for some values of $K$, the result cannot be obtained with a single remainder operation and must be generated by three rounds of cyclic operation; since $\pi(0) = 0$ requires no further operation, the iterator corresponding to $\pi(0)$ is used to calculate $8 f_2 \bmod K$.

The work flow of the interleaving address generator designed by the invention is as follows:

step 2, parameter loading: storing the user inputs $K$, $f_1$ and $f_2$ in the first clock cycle in which vld_in is valid; the output format of the four-path parallel LTE-based 4Turbo interleaved addresses is that the addresses within each path are split by odd and even sequential address index, so that 8 groups of interleaved addresses are output in total; the general formula of the 8 groups of interleaving addresses is $\pi(2n)$, $\pi(2n+1)$, $\pi(2n+K/4)$, $\pi(2n+1+K/4)$, $\pi(2n+K/2)$, $\pi(2n+1+K/2)$, $\pi(2n+3K/4)$, $\pi(2n+1+3K/4)$,

wherein $n = 0, 1, \ldots, K/8-1$;

step 3, iterative initial value calculation: designing the corresponding 8 $\pi$ value iteration paths and two $\delta$ paths according to the four-path parallel 4Turbo interleaving address iteration formula; in the 5 clock cycles after vld_in, using the 8 $\pi$ value iteration paths to calculate the iteration initial values of the corresponding parallel paths, and outputting all the iteration initial values to the corresponding iteration paths in the 5th clock cycle;

In the invention, the 8 $\pi$ value iteration paths and the two $\delta$ paths are: the $\pi(2n)$ iteration path, the $\pi(2n+1)$ iteration path, the $\pi(2n+K/4)$ iteration path, the $\pi(2n+1+K/4)$ iteration path, the $\pi(2n+K/2)$ iteration path, the $\pi(2n+1+K/2)$ iteration path, the $\pi(2n+3K/4)$ iteration path, the $\pi(2n+1+3K/4)$ iteration path, the $\delta(2n)$ iteration path, and the $\delta(2n+1)$ iteration path.

Further, the $\pi(2n)$ iteration path, the $\pi(2n+1)$ iteration path, the $\pi(2n+1+K/4)$ iteration path, the $\pi(2n+K/2)$ iteration path, the $\pi(2n+1+K/2)$ iteration path, the $\delta(2n)$ iteration path and the $\pi(2n+1+3K/4)$ iteration path each comprise a corresponding iterator and two two-to-one data selectors, as shown in FIGS. 5 to 11. When the input init_done signal is low, the 0 inputs of the two data selectors are active, and the data on the active inputs is output to the iterator to calculate the iterator's initial value; once the initial value calculation is complete, the init_done signal goes high, the 1 inputs of the two data selectors become active, their input data is output to the iterator, and the iterator iterates cyclically from the initial value and the input data to generate the corresponding interleaved address sequence.

Further, the $\pi(2n+K/4)$ iteration path comprises a $\pi(2n+K/4)$ iterator and three two-to-one data selectors, as shown in fig. 12.

When the init_done signal is low, one of the two-to-one data selectors outputs 0 to the $\pi(2n+K/4)$ iterator if $f_1[1] = 0$ and outputs $K \gg 1$ to the $\pi(2n+K/4)$ iterator if $f_1[1] = 1$; the 0 inputs of the other two data selectors are the active inputs, and the data $K \gg 2$ on the active input is output to the $\pi(2n+K/4)$ iterator, which calculates the initial value $\pi(K/4)$ from the two input data. When the initial value calculation is complete, the init_done signal is high, and the 1 inputs of the two data selectors directly connected to the $\pi(2n+K/4)$ iterator are the active inputs; their input data $\delta_{even}$ is output to the $\pi(2n+K/4)$ iterator, which iterates cyclically from the initial value $\pi(K/4)$ and $\delta_{even}$ to generate the $\pi(2n+K/4)$ interleaved address sequence.

Further, the $\pi(2n+3K/4)$ iteration path comprises a $\pi(2n+3K/4)$ iterator and three two-to-one data selectors, as shown in fig. 13.

When the init_done signal is low, one of the two-to-one data selectors outputs $K \gg 1$ to the $\pi(2n+3K/4)$ iterator if $f_1[1] = 0$ and outputs 0 to the $\pi(2n+3K/4)$ iterator if $f_1[1] = 1$; the iterator calculates the initial value $\pi(3K/4)$ from its two input data. The $\pi(2n+3K/4)$ iterator then iterates cyclically from the initial value $\pi(3K/4)$ and $\delta_{even}$ to generate the $\pi(2n+3K/4)$ interleaved address sequence.

Further, the $\delta(2n+1)$ iteration path comprises a $\delta(2n+1)$ iterator and one two-to-one data selector, as shown in fig. 14.

When the init_done signal is low, the 0 input of the two-to-one data selector is the active input, and its input $f_1 \ll 1$ is output to the $\delta(2n+1)$ iterator; $f_2 \ll 3$ is input directly to the $\delta(2n+1)$ iterator, which calculates the initial value $\delta(1)$ from these two inputs. When the initial value calculation is complete, the init_done signal is high, the 1 input of the data selector is the active input, and the inputs of the $\delta(2n+1)$ iterator are $\delta(1)$ and $f_2 \ll 3$; cyclic iteration then continues in sequence, producing the $\delta(2n+1)$ values that drive generation of the $\pi(2n+1)$ interleaved address sequence until generation is finished.

Since the four-way parallel LTE-based 4Turbo interleaver outputs 8 interleaving addresses in parallel, each output produces K/8 interleaving addresses in total for a fixed K. The simplest way to detect the end of the iteration is to store the value K/8 in a register and build a counter in the hardware circuit that is incremented by 1 each time a valid output is generated; the circuit stops working and enters the idle state when the count reaches K/8.

However, this structure introduces an additional counter for flow control. Theory shows that, when iterating with step $k = 2$, the following holds:

$$\pi(2n + pK/4)\big|_{n = K/8} = \pi\big((p+1)K/4\big), \quad p = 0, 1, 2, 3$$

that is, when the number of iterations reaches $K/8$, each path starts to repeat the interleaved addresses from the head of the next path (e.g. $\pi(2 \cdot K/8 + K/4) = \pi(K/2)$). Since $\pi(0) = 0$ regardless of the value of $K$, the output of the $\pi(2n+3K/4)$ iteration channel becomes $\pi(K) = \pi(0) = 0$ once the number of iterations reaches $K/8$.

According to this principle, it suffices to monitor the iteration output of the $\pi(2n+3K/4)$ channel continuously after the iteration starts; when that channel outputs 0, the iteration process has terminated. In this way one counter is saved. The interleave address iteration completion detection timing of the present invention is shown in fig. 15.
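The counter-free termination check can be simulated by running only the last even path and counting outputs until a 0 appears. The sketch below is illustrative; the function name `valid_outputs_before_zero` is an assumption introduced here, and the initial values are taken from the defining formula.

```python
def valid_outputs_before_zero(K, f1, f2):
    """Iterate the pi(2n + 3K/4) path with step 2 and count how many
    addresses it produces before its output becomes 0."""
    i = 3 * K // 4
    pi = (f1 * i + f2 * i * i) % K
    delta = (2 * f1 + 4 * i * f2 + 4 * f2) % K
    b = (8 * f2) % K
    count = 0
    while pi != 0:                  # 0 marks pi(K) = pi(0): iteration finished
        count += 1                  # one valid address was output this cycle
        pi = (pi + delta) % K
        delta = (delta + b) % K
    return count


K, f1, f2 = 1504, 49, 846
cycles = valid_outputs_before_zero(K, f1, f2)
print(cycles, K // 8)
```

The check is safe because the interleaver is a permutation with $\pi(0) = 0$, so no address with $0 < i < K$ can map to 0 before the path wraps around.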

Some values and bits in the design are fixed: each such bit is tied to a specific level and never changes, so targeted optimization of these bits in the code can reduce resource consumption. The specific cases are as follows:

(a) The range of $K$ is 1024 to 6144 and the minimum interval between two $K$ values is 32, which means that the low 5 bits of $K$ are always 0.

(b) According to the protocol definition, $f_1$ and $f_2$ are odd and even, respectively, so the lowest bit of the input $f_1$ is always 1 and the lowest bit of $f_2$ is always 0.

(c) When solving $\delta$, the formula is $\delta_{new} = \delta + (8 f_2) \bmod K$; the low three bits of $b = (8 f_2) \bmod K$ are always 0, so the low three bits of $\delta_{new}$ can be optimized.

(d) According to the interleaving address definition formula, when the natural address is odd the interleaving address is odd, and when the natural address is even the interleaving address is even. Hence the lowest bit of $\pi_{new}$ output by every odd-path $\pi$ iterator is always 1, and the lowest bit of $\pi_{new}$ output by every even-path $\pi$ value iterator is always 0.
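The fixed-bit observations (a) to (d) can be confirmed numerically for the parameter set quoted in the description. This is only a spot check for one K, not a proof; all variable names are assumptions introduced here.

```python
K, f1, f2 = 1504, 49, 846
b = (8 * f2) % K

# (a) K is a multiple of 32, so its low 5 bits are 0.
k_low5_zero = (K & 0b11111) == 0

# (b) f1 is odd and f2 is even.
f_parity_ok = (f1 & 1) == 1 and (f2 & 1) == 0

# (c) The low three bits of b are 0, so the low three bits of every
#     delta value in a path never change during iteration.
b_low3_zero = (b & 0b111) == 0
even_delta_low3 = {((2 * f1 + 4 * i * f2 + 4 * f2) % K) & 0b111 for i in range(0, K, 2)}

# (d) The interleaved address has the same parity as the natural address.
parity_preserved = all(((f1 * i + f2 * i * i) % K) % 2 == i % 2 for i in range(K))

print(k_low5_zero, f_parity_ok, b_low3_zero, len(even_delta_low3), parity_preserved)
```

In hardware these constant bits can be tied off instead of being computed, which is the directional optimization the list describes.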

The port description of the interleaved address generator designed by the present invention is shown in table 1 below:

table 1 port description

A typical work flow of the module is shown in fig. 16. During normal operation the reset signal and the enable signal en should both be high. The module stores the user inputs k, f1 and f2 when vld_in is high, and there is a latency of 5 clock cycles from the valid vld_in signal to the output of the first group of parallel interleaving addresses. The interleaving addresses are then output continuously; the vld_out signal stays high while they are output, and valid interleaving addresses are output for K/8 clock cycles.

Table 2 below illustrates the calculation process of each path of initial values in the parameter calculation stage, and since some initial values have interdependencies and cannot be calculated in one clock, there is a certain clock delay.

TABLE 2 parameter calculation clock

Table 2 shows that in the first three clock cycles the $\pi(0)$ iterator is reused to calculate the terms $4 f_2$ and $8 f_2$ of $\delta(0) = 2 f_1 + 4 f_2$ and $\delta(1) = 2 f_1 + 8 f_2$. For some values of $K$ in the protocol the corresponding $f_2$ is large, e.g. $f_1 = 49$ and $f_2 = 846$ at $K = 1504$; in that case $4 f_2 > 2K$, which the remainder unit in the design cannot reduce in one pass. Therefore the $\pi(0)$ iterator first calculates $2 f_2 \bmod K$, then at the second clock calculates $4 f_2 \bmod K = [(2 f_2 \bmod K) \ll 1] \bmod K$, and at the third clock calculates $8 f_2 \bmod K = [(4 f_2 \bmod K) \ll 1] \bmod K$. Once $4 f_2 \bmod K$ and $8 f_2 \bmod K$ are available, $\delta_{init}(even)$ and $\delta_{init}(odd)$ can be calculated in the next clock.

Step 4, iteration stage: the 8 $\pi$ value iteration paths and the two $\delta$ paths operate together, generating a new iteration value in each clock cycle; the vld_out signal stays high throughout the iteration, and the output interfaces pi_0 to pi_7 output valid $\pi$ values.
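The complete flow, with eight $\pi$ paths sharing two $\delta$ paths, can be put together in a behavioural model. This is a sketch under stated assumptions and not the patented circuit: the names `add_mod` and `generate` are introduced here, clock-level timing is not modelled, and the odd-path initial values use the relation $\pi(pK/4 + 1) = (\pi(pK/4) + f_1 + f_2) \bmod K$, which is derived here from the defining formula and the fact that $f_2$ is even rather than quoted from the source.

```python
def add_mod(a, b, K):
    """(a + b) mod K for 0 <= a, b < K using one compare and one subtract."""
    s = a + b
    return s - K if s >= K else s


def generate(K, f1, f2):
    """Return K/8 rows; each row holds the 8 addresses of one clock cycle in
    the order pi(2n), pi(2n+1), pi(2n+K/4), pi(2n+1+K/4), ..."""
    quarter = K >> 2
    bit1 = (f1 >> 1) & 1
    # Even-path initial values pi(p*K/4), selected by the bit f1[1].
    even = [0, 3 * quarter if bit1 else quarter, K >> 1,
            quarter if bit1 else 3 * quarter]
    pi1 = f1 + f2                       # pi(1); assumes f1 + f2 < K
    # Assumption derived here: pi(p*K/4 + 1) = (pi(p*K/4) + pi(1)) mod K.
    odd = [add_mod(v, pi1, K) for v in even]
    delta_even = (2 * f1 + 4 * f2) % K  # shared by the four even paths
    delta_odd = (2 * f1 + 8 * f2) % K   # shared by the four odd paths
    b = (8 * f2) % K
    rows = []
    for _ in range(K // 8):
        rows.append(tuple(v for pair in zip(even, odd) for v in pair))
        even = [add_mod(v, delta_even, K) for v in even]
        odd = [add_mod(v, delta_odd, K) for v in odd]
        delta_even = add_mod(delta_even, b, K)
        delta_odd = add_mod(delta_odd, b, K)
    return rows


K, f1, f2 = 1504, 49, 846
rows = generate(K, f1, f2)
print(len(rows), rows[0])
```

Each row corresponds to one cycle in which vld_out would be high, so the model produces K/8 rows of 8 addresses and covers every address exactly once.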

Simulation experiment

The correctness and effectiveness of the invention are further illustrated by the simulation data processing result.

1) The simulation platform work flow:

f1 and f2 values corresponding to each K value test all cases of 1024 < = K < 6144 defined in a protocol, after the system is initialized, a verification process of a single K value is started by an auto _ test system task, then the verification process enters a result _ check task, a natural address is generated in the result _ check task, and then an interleaving address is calculated by an interleaving address calculation function directly by using a definition formula (defined as long int to prevent numerical overflow). And comparing with the module output, and defining three conditions in the comparison process, namely abnormal termination, output error and pass verification.

Abnormal termination: for an input value K, the module should normally output parallel interleaving addresses continuously for K/8 clock cycles. If, because of an error, the module's vld_out signal goes low before K/8 clock cycles have elapsed, the simulation terminates early.

Output error: the output of the interleaving address module differs from the result calculated by the function. The simulation platform then prints the error details, such as the natural address and the interleaving addresses.

Verification passed: the module outputs correct interleaving addresses in all K/8 clock cycles of valid output, so the module works correctly for that K value and passes the test.
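The three outcomes can be illustrated with a small reference checker in Python. This is a sketch under stated assumptions, not the testbench of the invention: the function names are hypothetical, and the order of the eight addresses within one clock cycle (even and odd address of path 0, then of paths 1, 2 and 3) is assumed for illustration.

```python
def reference_address(i, K, f1, f2):
    """Interleaving address from the definition formula (Python ints cannot overflow)."""
    return (f1 * i + f2 * i * i) % K


def expected_cycle(n, K, f1, f2):
    """Eight addresses assumed to be output in clock cycle n (ordering is an assumption)."""
    L = K // 4  # length of each of the four parallel address sequences
    return [reference_address(2 * n + parity + p * L, K, f1, f2)
            for p in range(4) for parity in (0, 1)]


def check_module(cycles, K, f1, f2):
    """Classify module output; `cycles` holds one 8-address list per valid clock cycle."""
    if len(cycles) < K // 8:
        return "abnormal termination"   # vld_out dropped before K/8 cycles
    for n in range(K // 8):
        if list(cycles[n]) != expected_cycle(n, K, f1, f2):
            return "output error"       # module and reference disagree
    return "pass"
```

Feeding the checker the reference output itself returns "pass", dropping the last cycle returns "abnormal termination", and altering any single address returns "output error".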

After the simulation has run, the script window prints the information shown in FIG. 17 for each test case that passes. All cases are expected to be listed, because the design was verified thoroughly before submission.

2) Analysis of results

(1) Low power consumption analysis

In most digital systems the clock tree accounts for a large share of the power consumption, so controlling clock tree power is important for a low-power system. Clock gating effectively reduces power consumption and is one of the main techniques of low-power design: while the register data does not change, the gated clock stops the input clock from toggling, which reduces dynamic power.

The design has a high gating rate and an intelligent clock control scheme that automatically shuts off the module clock whenever the module is not producing valid output.

For a generic module, the internal clock can be gated through an enable interface en at the top of the module. However, due to the "coarse-grained" nature of the control, for a complex control system (e.g., "encoding system"), the gating enable of the entire module is valid when the entire encoding system is working, and all sub-module clocks in the module are in a working state. In practice, however, not all sub-modules are "active" during the operation phase of the coding system, resulting in a waste of part of the power consumption.

In order to solve the problem, an intelligent clock control technology can be adopted, the gated clock works only when the interleaving address generation module works, and the gated clock stops immediately after the interleaving address generation is finished.

The interleaving address generation module contains a clock control register, power_en. The register is set when the module input vld_in is valid and is cleared after the module has output all interleaving addresses. Clock gating inside the module is controlled by the logical AND of power_en and the module's top-level clock enable signal en. Fig. 19 shows the gated clock rate after DC synthesis.

The control logic enables the clock in the module to automatically run only in the process of iterative initial value calculation and iterative calculation of the interleaved address, and automatically stop after the output is finished, thereby finishing the fine-grained control on the clock tree in the module and effectively reducing the power consumption of the subsystem.
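The behavior of this control logic can be modeled cycle by cycle with a few lines of Python. The sketch is illustrative only: `done` is a hypothetical flag marking the cycle in which the last address has been output, and the register is updated before the enable is sampled in the same cycle, which simplifies the real register timing.

```python
def gated_clock_enables(vld_in, done, en):
    """Per-cycle gated-clock enable, modeled as power_en AND en.

    vld_in, done and en are equal-length sequences of 0/1 samples,
    one sample per clock cycle.
    """
    power_en = 0
    enables = []
    for v, d, e in zip(vld_in, done, en):
        if v:
            power_en = 1      # set when the module input is valid
        elif d:
            power_en = 0      # cleared once all addresses have been output
        enables.append(power_en & e)
    return enables
```

With en held high, vld_in pulsed in cycle 1 and done pulsed in cycle 3, the gated clock is enabled only in cycles 1 and 2 of a six-cycle window.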

(2) Area analysis

Fig. 19 shows the top module of the design after DC synthesis. The design is compiled with the TSMC 90 nm process library under the fast corner and a 2 ns clock period constraint (500 MHz), and is synthesized with the map_effort and area_effort options set to high and the gate_clock option enabled. Fig. 20 shows the synthesis area report, which details the resource consumption of each module. The total area of the design is 9536.8896 μm², which meets the practical requirements of most radar communication integrated ASIC applications.

(3) Frequency analysis

In order to better verify the working effect of the interleaving address generator, the clock constraint frequency is modified, and a clock frequency-area table is obtained. The results are given in table 3 below:

TABLE 3 clock frequency-area results

The optimization in the invention focuses mainly on area, and the design achieves a small area under the 500 MHz clock requirement. Most of the area is consumed by the 8 π iterators and the 2 δ iterators.

Although the present invention has been described in detail in this specification with reference to specific embodiments and illustrative embodiments, it will be apparent to those skilled in the art that modifications and improvements can be made thereto based on the present invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.

Claims

1. The four-path parallel LTE-based 4Turbo interleaving address generation method is characterized by comprising the following steps of:

step 1, iterative path optimization: optimizing an iterative formula of an original interleaving address to eliminate multiplication and remainder operation in the original iterative formula to obtain an optimized iterative formula; further obtaining four-path parallel 4Turbo interleaving address iterative formulas;

the optimization of the iterative formula of the original interleaving address specifically comprises the following steps:

first, the multiplication is eliminated as follows:

the iterative formula for the original interleaved address is:

π(i+k) = (f₁(i+k) + f₂(i+k)²) mod K

= (if₁ + i²f₂ + kf₁ + 2ikf₂ + k²f₂) mod K

let δ(i) = (kf₁ + 2ikf₂ + k²f₂) mod K, then:

π(i+k) = (π(i) + δ(i)) mod K

the iterative calculation of δ(i) is then:

δ(i+k) = (kf₁ + 2(i+k)kf₂ + k²f₂) mod K

= (kf₁ + 2ikf₂ + 2k²f₂ + k²f₂) mod K

= (δ(i) + b) mod K

where π(i) is the original interleaving address, k is the iteration step, b = 2k²f₂, and K is the code block length;

then, the remainder operation is eliminated: the remainder operation in the iterative formula is replaced by a check of whether the value to be reduced is greater than or equal to K and less than 2K, and if so, K is subtracted from it once; this gives the simplified iterative formula:

π(i+k) = π(i) + δ(i), if π(i) + δ(i) < K; π(i+k) = π(i) + δ(i) − K, if K ≤ π(i) + δ(i) < 2K

wherein 0 ≤ δ(i) < K and 0 ≤ π(i) + δ(i) < 2K;
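The recurrences of this step can be checked numerically with a short Python sketch. It is an illustration under assumptions rather than the claimed circuit: the function names are hypothetical, the starting values of π and δ are taken from the definition formula for brevity, and b is reduced modulo K in advance so that every sum stays below 2K.

```python
def cond_sub(x, K):
    """Return x mod K for 0 <= x < 2K using one comparison and one subtraction."""
    return x - K if x >= K else x


def iterate_addresses(K, f1, f2, k=1, start=0):
    """Yield pi(start), pi(start+k), ... using only additions and conditional subtractions."""
    # Starting values come from the definition formula (a simplification of the sketch).
    pi = (f1 * start + f2 * start * start) % K
    delta = (k * f1 + 2 * start * k * f2 + k * k * f2) % K
    b = (2 * k * k * f2) % K
    for _ in range(start, K, k):
        yield pi
        pi = cond_sub(pi + delta, K)       # pi(i+k) = (pi(i) + delta(i)) mod K
        delta = cond_sub(delta + b, K)     # delta(i+k) = (delta(i) + b) mod K


addresses = list(iterate_addresses(1504, 49, 846, k=2, start=1))
```

Here `addresses` holds π(1), π(3), π(5), ... for K = 1504, f₁ = 49, f₂ = 846, obtained without any multiplication or division inside the loop.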

the four-path parallel 4Turbo interleaving address iterative formula is as follows:

let π_init(i) denote the initial values of the four parallel interleaving address paths, where p = 0, 1, 2, 3 is the index of the parallel path and P = 4 is the number of parallel paths; the length of each address sequence output after interleaving is:

L = K/P

with i = p·L and K = P·L, substituting into the original interleaved address formula gives:

π_init(i) = [(f₁ + if₂) × i] mod K

= [(f₁ + pLf₂) × pL] mod (P·L)

= {[(f₁ + pLf₂) × p] mod P} × L

taking the iteration step k = 2, the initial value of each interleaving address path simplifies as follows:

when i = 0, π(i) = 0;

when i = 1, π(i) = (f₁ + f₂) mod K, and since f₁ + f₂ ∈ (0, K), π(i) = f₁ + f₂;

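when i = K/4, i.e. parallel channel index p = 1:

π_init(K/4) = {[(f₁ + (K/4)·f₂) × 1] mod 4} × (K/4)

according to the known condition K mod 32 = 0 in the protocol table of 3GPP 36.212, K mod 4 = 0 and (K/4)·f₂ mod 4 = 0

the formula is further simplified as follows:

π_init(K/4) = (f₁ mod 4) × (K/4)

the above formula is equivalent to:

π_init(K/4) = K/4 when f₁[1] = 0, and π_init(K/4) = 3K/4 when f₁[1] = 1

when i = K/2, i.e. parallel channel index p = 2:

π_init(K/2) = (2f₁ mod 4) × (K/4) = K/2

when i = 3K/4, i.e. parallel channel index p = 3:

π_init(3K/4) = (3f₁ mod 4) × (K/4), i.e. 3K/4 when f₁[1] = 0 and K/4 when f₁[1] = 1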

similarly, deducing:

when i = K/4 + 1, π_init(K/4 + 1) = (π_init(K/4) + f₁ + f₂) mod K;

when i = K/2 + 1, π_init(K/2 + 1) = (π_init(K/2) + f₁ + f₂) mod K;

when i = 3K/4 + 1, π_init(3K/4 + 1) = (π_init(3K/4) + f₁ + f₂) mod K;

δ(i) is an initial value in the iteration, and for iteration step k = 2 the iterative formula of δ(i) is:

δ(i+2) = (δ(i) + b) mod K, where b = 8f₂

when i = 0, δ(0) = (2f₁ + 4f₂) mod K;

when i = 1, δ(1) = (2f₁ + 8f₂) mod K;

when i = p·K/4 with p = 1, 2, 3, δ(p·K/4) = (2f₁ + 4f₂ + p·K·f₂) mod K = δ(0),

so the initial value of δ(i) for the even sequential-address channels is:

δ_init(even) = δ(0) = (2f₁ + 4f₂) mod K

similarly, the initial value of δ(i) for the odd sequential-address channels is:

δ_init(odd) = δ(1) = (2f₁ + 8f₂) mod K

thus, with iteration step k = 2, all δ(i) sequences satisfy:

δ(2n + p·K/4) = δ(2n) and δ(2n + 1 + p·K/4) = δ(2n + 1)

wherein p = 0, 1, 2, 3;

therefore, the δ(i) iteration paths only need to share iterators according to the parity of the sequential address i;

wherein K is the code block length, n is the iteration count, and f₁ and f₂ are preset constants;
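The four-way scheme of this claim can be checked end to end with the following Python sketch, which generates eight addresses per clock cycle and can be compared with the definition formula. It is a behavioral illustration under assumptions, not the claimed hardware: the function names are hypothetical, the output order within a cycle (even and odd address of path 0, then of paths 1, 2 and 3) is assumed, the sketch relies on f₁ being odd, f₂ being even, K mod 32 = 0 and f₁ + f₂ < K, and the initial δ values and b are reduced with `%` for brevity.

```python
def cond_sub(x, K):
    """Return x mod K for 0 <= x < 2K with one conditional subtraction."""
    return x - K if x >= K else x


def four_way_addresses(K, f1, f2):
    """Yield, per clock, the 8 addresses pi(2n + parity + p*K/4), p = 0..3, parity = 0, 1."""
    quarter, half = K >> 2, K >> 1
    bit1 = (f1 >> 1) & 1                         # f1[1], second least significant bit
    even = [0,                                   # pi(0)
            quarter + (half if bit1 else 0),     # pi(K/4)  = (f1 mod 4) * K/4
            half,                                # pi(K/2)  = K/2 for odd f1
            quarter + (0 if bit1 else half)]     # pi(3K/4) = (3*f1 mod 4) * K/4
    odd = [cond_sub(e + f1 + f2, K) for e in even]   # pi(p*K/4 + 1)
    d_even = (2 * f1 + 4 * f2) % K               # delta(0), shared by the even paths
    d_odd = (2 * f1 + 8 * f2) % K                # delta(1), shared by the odd paths
    b = (8 * f2) % K
    for _ in range(K // 8):
        yield [a for pair in zip(even, odd) for a in pair]
        even = [cond_sub(e + d_even, K) for e in even]
        odd = [cond_sub(o + d_odd, K) for o in odd]
        d_even = cond_sub(d_even + b, K)
        d_odd = cond_sub(d_odd + b, K)


first_cycle = next(four_way_addresses(1504, 49, 846))
```

Only two δ sequences are needed for the eight π paths because δ(i + p·K/4) equals δ(i) modulo K, so the even paths share δ(2n) and the odd paths share δ(2n+1).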

2. The four-way parallel LTE-based 4Turbo interleaving address generation method according to claim 1, wherein the 8-way pi value iteration path and the two-way delta path are: pi (2 n) iteration path, pi (2n + 1) iteration path,

An iterative path,

An iterative path,

An iterative path,

An iterative path,

3. The four-path parallel LTE-based 4Turbo interleaving address generation method according to claim 2, wherein the pi (2 n) iteration path, the pi (2n + 1) iteration path, the,

An iterative path,

An iterative path,

An iteration path, a delta (2 n) iteration path,

The iteration path comprises a corresponding iterator and two 2-to-1 data selectors respectively;

when the input init_done signal is low, the 0 inputs of the two 2-to-1 data selectors are selected, and the data on those inputs is passed to the iterator to calculate the iterator's initial value; after the initial value calculation is complete, the init_done signal goes high, the 1 inputs of the two 2-to-1 data selectors are selected, the data on the 1 inputs is passed to the corresponding iterator, and the iterator iterates cyclically from the initial value and the input data to generate the corresponding interleaving address sequence in order.
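The selector-plus-iterator structure of such a path can be modeled in Python as follows. The sketch is hypothetical in its details: the class and argument names are invented for illustration, and the wiring of the 1 inputs (the iterator's own output fed back on one selector, the current δ value on the other) is an assumption that the claim text does not spell out.

```python
def mux2(sel, in0, in1):
    """2-to-1 data selector: returns in0 when sel is 0 and in1 when sel is 1."""
    return in1 if sel else in0


class IterationPath:
    """One iteration path: two 2-to-1 selectors feeding an add and conditional-subtract iterator."""

    def __init__(self, K):
        self.K = K
        self.value = 0  # iterator output register

    def clock(self, init_done, init_a, init_b, delta):
        a = mux2(init_done, init_a, self.value)   # assumed feedback on the 1 input
        b = mux2(init_done, init_b, delta)        # assumed delta on the 1 input
        s = a + b                                 # both operands are below K, so s < 2K
        self.value = s - self.K if s >= self.K else s
        return self.value


K, f1, f2 = 1504, 49, 846
path = IterationPath(K)
pi_1 = path.clock(0, f1, f2, 0)                    # init phase: pi(1) = f1 + f2
pi_3 = path.clock(1, f1, f2, (2 * f1 + 8 * f2) % K)  # iteration phase: pi(3) = pi(1) + delta(1)
```

In this example the path plays the role of the π(2n+1) path: during initialization it adds f₁ and f₂, and afterwards it accumulates the δ(2n+1) values.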

4. The four-way parallel LTE-based 4Turbo interleaving address generation method according to claim 2, wherein the π(2n+K/4) iteration path comprises a π(2n+K/4) iterator and three 2-to-1 data selectors;

when the init_done signal is low, one of the 2-to-1 data selectors selects 0 when f₁[1] = 0 and selects K>>1 when f₁[1] = 1, and outputs the selected value to the π(2n+K/4) iterator; the 0 inputs of the other two 2-to-1 data selectors are selected, and the data K>>2 on those inputs is output to the π(2n+K/4) iterator; the π(2n+K/4) iterator calculates the initial value π(K/4) from the two input data; the iterator then iterates cyclically from the initial value π(K/4) and δ_even to generate the π(2n+K/4) interleaving address sequence.

5. The four-way parallel LTE-based 4Turbo interleaving address generation method according to claim 2, wherein the π(2n+3K/4) iteration path comprises a π(2n+3K/4) iterator and three 2-to-1 data selectors;

when the init_done signal is low, one of the 2-to-1 data selectors selects K>>1 when f₁[1] = 0 and selects 0 when f₁[1] = 1, and outputs the selected value to the π(2n+3K/4) iterator; the π(2n+3K/4) iterator calculates the initial value π(3K/4) from the two input data; the 1 inputs of the other two 2-to-1 data selectors then become the active inputs, and the data δ_even on those inputs is output to the π(2n+3K/4) iterator; the iterator then iterates cyclically from the initial value π(3K/4) and δ_even to generate the π(2n+3K/4) interleaving address sequence.

6. The four-way parallel LTE-based 4Turbo interleaving address generation method according to claim 2, wherein the δ(2n+1) iteration path comprises a δ(2n+1) iterator and one 2-to-1 data selector;

when the init_done signal is low, the 0 input of the 2-to-1 data selector is selected, and the data f₁<<1 on that input is output to the δ(2n+1) iterator; f₂<<3 is input directly to the δ(2n+1) iterator, and the δ(2n+1) iterator calculates the initial value δ(1) from the two input data; when the initial value calculation is complete, the init_done signal is high, the 1 input of the 2-to-1 data selector is selected, and the input data of the δ(2n+1) iterator are δ(1) and f₂<<3; the loop iteration is repeated continuously to generate the π(2n+1) interleaving address sequence in order until generation is finished.

7. The four-way parallel LTE-based 4Turbo interleaving address generation method according to claim 1, wherein, after the iteration process starts, the output of the π(2n+3K/4) iteration path is monitored continuously, and when that path outputs 0, the iteration process has terminated.

8. The four-path parallel LTE-based 4Turbo interleaving address generation method according to claim 1, wherein K ranges from 1024 to 6144, the minimum interval between two adjacent K values is 32, and f₁ and f₂ are odd and even, respectively.