CN115001911B

CN115001911B - FPGA-based high-speed FTN signal iterative equalization method and system

Info

Publication number: CN115001911B
Application number: CN202210385857.4A
Authority: CN
Inventors: 武楠; 王皓铮; 李彬; 张婷婷; 戚远靖; 秦臻
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2022-04-13
Filing date: 2022-04-13
Publication date: 2024-02-20
Anticipated expiration: 2042-04-13
Also published as: CN115001911A

Abstract

The invention belongs to the technical field of wireless communication and relates to an FPGA-based high-speed FTN signal iterative equalization system and method. A block-based data structure and a simplified iterative equalization algorithm are adopted, and separate feedforward and feedback structures significantly reduce complexity while preserving system performance. A parallel structure and pipelined processing improve processing speed and throughput. All computing demands in the system are consolidated, each computing module is implemented with a uniform interface and architecture, and multiplexing the general-purpose computing modules saves a large amount of hardware resources.

Description

FPGA-based high-speed FTN signal iterative equalization method and system

Technical Field

The invention belongs to the technical field of wireless communication, and relates to a high-speed FTN signal iterative equalization system and method based on an FPGA.

Background

In recent years, the rapid development of 5G technology has made mobile access, which aims at the Internet of Everything, one of the important forms of Internet access, and more and more communication systems are deployed in miniaturized terminal devices with limited power. Embedded devices with high performance and low power consumption have therefore become a research hotspot in engineering applications, and FPGA-based signal processing systems are the most widespread solution. At the same time, the Internet of Everything has made spectrum resources increasingly scarce, which seriously affects signal transmission speed and communication quality. Against this background, faster-than-Nyquist (FTN) transmission, which offers higher spectral efficiency and channel capacity, has attracted a great deal of attention in the field of wireless communication. In 1975, Mazo proposed the faster-than-Nyquist theory and showed that when signals are transmitted over an additive white Gaussian noise channel at a rate exceeding the Nyquist rate by up to 25%, the minimum Euclidean distance between the signals remains unchanged and the error performance is not affected. This conclusion breaks the constraint of orthogonal transmission and demonstrates the feasibility of non-orthogonal transmission, which gave rise to FTN transmission techniques. FTN transmission breaks through the Nyquist rate limit by reducing the symbol transmission interval and has higher spectral efficiency and channel capacity than a traditional orthogonal system, but it also introduces inter-symbol interference (ISI), which makes FTN signal detection more difficult and greatly increases receiver complexity. Meanwhile, the logic and storage resources in an FPGA chip are limited, so realizing faster-than-Nyquist transmission on an FPGA still presents technical difficulty.

Since FTN theory was proposed, new equalization algorithms suited to FTN systems have appeared, such as the truncated Viterbi algorithm, the M-BCJR algorithm based on the Ungerboeck observation model, the successive interference cancellation algorithm, and the Turbo equalization algorithm based on a weighting factor graph. These reduce complexity or improve bit error rate performance to some extent, but are still not easy to implement in hardware. To address the large computational load of equalization, the Japanese scholar S. Sugiura et al. proposed applying frequency domain equalization to the FTN transmission system: the ISI introduced by FTN is truncated by adding a cyclic prefix, and the frequency domain equalization coefficients are calculated with the MMSE criterion, which effectively reduces the amount of computation and is very easy to implement in hardware. However, the coefficient calculation considers only the ISI introduced by FTN and not the influence of the actual channel, so the error performance is poor and additional channel coding is needed to obtain a better bit error rate.

In the same year, Ganhao et al. introduced an Iterative Block Decision Feedback Equalizer (IBDFE) into the FTN transmission system. It demaps the symbols in each iteration and thereby calculates and simultaneously updates the filter coefficients and the feedback compensation amounts used in the next iteration. This approach significantly improves the bit error rate performance of the receiver but also increases system complexity. Subsequently, Xu Yang et al. proposed a simplified IBDFE algorithm on this basis: the error relation between the decided symbols and the transmitted symbols is used to introduce the variance of the decision error, and the bit error rate and the optimal signal-to-noise ratio are set manually. This simplifies the calculation of the filter coefficients to some extent, but the algorithm is still some way from easy hardware implementation.

Disclosure of Invention

The technical problem addressed by the invention is as follows: to provide a method that effectively reduces complexity while preserving performance, is easy to deploy in miniaturized FPGA-based equipment, and raises data-processing throughput through parallel processing to meet high-speed requirements.

The technical scheme of the invention is as follows:

An FPGA-based iterative equalization method for a high-speed FTN signal, comprising the following steps:

Step 1, grouping the original symbol sequence to be transmitted and transmitting the grouped symbol sequence, wherein each group of data comprises a data block of length N and a cyclic prefix of length 2v;

the data block is obtained by dividing the original symbol sequence to be transmitted into segments of N symbols;

the cyclic prefix is formed by appending the first 2v symbols of the data block to the end of the data block, giving a group of data of length N+2v;

Step 2, preprocessing the received symbol sequence: removing the cyclic prefix of v symbols at the front and v symbols at the back of each group of data, performing serial-to-parallel conversion to obtain the data block r to be processed, and finally initializing the iteration count of the data block to l=0;

the data block r to be processed is N parallel paths of data, and the subsequent steps process it in units of N-path parallel data blocks;

the iteration count l is the number of iterative processing passes performed in the subsequent steps, and its value is greater than or equal to 0;

Step 3, performing an N-point fast Fourier transform (FFT) on the data block r to obtain R;

Step 4, checking the iteration count of the current processing pass: if l=0, i.e. the initial pass, executing step 4.1; otherwise l is greater than 0 and step 4.2 is executed;

step 4.1, frequency domain equalization, wherein frequency domain equalization is carried out on the data block R output in the step 3 according to the formula (1) to obtain Y:

Y＝R·W (1)

wherein W represents an equalization coefficient, and the element W (i) thereof is calculated by the formulas (2) - (5):

λ＝F(g₁) (3)

where N₀ represents the noise power, P represents the signal power, F(·) represents the Fourier transform, and h_s(t) and h_r(t) represent the FTN signal shaping filter coefficients and the matched filter coefficients, respectively;

Step 4.2, calculating the feedback compensation coefficient, and compensating and correcting the frequency-domain-equalized data Y obtained in step 4.1 in the l-th iteration according to formula (6) to obtain U^(l):

U^(l)＝Y+Z^(l) (6)

where U^(l) represents the feedback-compensated data block in the l-th iteration. The compensation Z^(l) involves the frequency-domain form of the remapped symbol sequence in the l-th iteration and the compensation coefficient given by the feedback compensation coefficient calculation module in the (l-1)-th iteration, whose elements can be calculated from formulas (6) - (8):

These formulas use the correlation coefficient between the decided symbol sequence and the original symbol sequence, and the energy of the decided symbol sequence. In practice, the decision operation greatly reduces the error between the decided symbol sequence and the original symbol sequence, so one of the terms is in practice far greater than the other; the complex division is therefore simplified according to the following formula, which reduces its resource consumption:

Step 5, performing an inverse fast Fourier transform (IFFT) on U^(l) to obtain u^(l). It should be noted that in the 0th iteration, U^(0) is obtained directly from the frequency domain equalization of step 4.1, i.e. U^(0)＝Y.

Step 6, checking whether the current pass completes the iteration, i.e. whether the iteration count has reached the preset value; if so, outputting u^(l) as the final iterative equalization result; otherwise, adding 1 to the current iteration count l and remapping u^(l) to obtain the remapped symbol sequence.

The remapping means that the bit sequence obtained by hard decision on the symbol sequence is mapped back into a symbol sequence according to its modulation mode;

an FFT is performed on the remapped symbol sequence to obtain its frequency-domain form, and step 4 is performed again.
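The control flow of steps 3 to 6 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the naive `dft` helper stands in for the N-point FFT/IFFT, BPSK is an assumed modulation for the remapping, Z^(l) is assumed to be the element-wise product of the compensation coefficient and the frequency-domain remapped symbols, and `comp_coeff` is a hypothetical callable standing in for the coefficient formulas.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) DFT; stands in for the N-point FFT/IFFT blocks."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out

def remap_bpsk(u):
    """Hard decision followed by remapping; BPSK is an assumed modulation."""
    return [1.0 if s.real >= 0 else -1.0 for s in u]

def iterative_equalize(r, W, comp_coeff, iterations):
    """Feedforward equalization once, then feedback compensation per pass.

    comp_coeff(l, X_hat) is a hypothetical callable returning the per-bin
    compensation coefficients for iteration l.
    """
    R = dft(r)                                   # step 3
    Y = [Rk * Wk for Rk, Wk in zip(R, W)]        # step 4.1, formula (1)
    U = Y                                        # U^(0) = Y
    l = 0
    while True:
        u = dft(U, inverse=True)                 # step 5
        if l == iterations:                      # step 6: preset count reached
            return u
        l += 1
        X_hat = dft(remap_bpsk(u))               # remap, then FFT
        B = comp_coeff(l, X_hat)
        Z = [Bk * Xk for Bk, Xk in zip(B, X_hat)]    # assumed form of Z^(l)
        U = [Yk + Zk for Yk, Zk in zip(Y, Z)]    # step 4.2, formula (6)
```

Note that Y is computed once and reused in every pass, which is what allows the feedforward and feedback paths to be separated.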

An FPGA-based high-speed FTN signal iterative equalization system comprises a preprocessing module, a parallel FFT (fast Fourier transform) calculation unit, a frequency domain equalization module, a parallel complex multiplication calculation unit, a data buffer module, a parallel IFFT calculation module, a feedback compensation parameter calculation module, a remapping module and an output module;

The preprocessing module preprocesses the input symbol sequence of the high-speed FTN signal iterative equalization system: it performs serial-to-parallel conversion on each group of data of length N+2v and removes the cyclic prefix of v symbols at the front and back of each group, yielding the N-path parallel data block r to be processed. In addition, to meet the processing requirements of the subsequent modules, a tag is built for the data block r. The binary format of the tag is [b₇b₆b₅b₄b₃b₂b₁b₀], where [b₂b₁b₀] represents the current iteration count of the data block and is set to [000] in the preprocessing module, representing the 0th iteration. [b₅b₄b₃] is the identifier of the data block; it distinguishes different data blocks and serves as the address under which the data block is cached in subsequent modules. The identifier is updated in the preprocessing module by a counter, i.e. the value of [b₅b₄b₃] is increased by 1 for each data block to be processed. [b₇b₆] marks the data flow in the system and is set to [00] in the preprocessing module. The preprocessing module passes the data block r and its tag as output to the parallel FFT calculation unit. Finally, to support pipelined processing, the preprocessing module also provides a uniform drive enable for the rest of the system to keep all modules stepping in coordination.

The parallel FFT calculation unit serves the system's parallel FFT calculations in a multiplexed manner and consists of an FFT multiplexer, an N-path parallel FFT calculator and an FFT result distributor. The FFT multiplexer monitors FFT calculation requests from the preprocessing module, the parallel IFFT calculation module and the remapping module and passes them to the FFT calculator over the data bus; the FFT calculator performs the FFT calculation with a radix-2 algorithm and passes the result to the FFT result distributor; the FFT result distributor checks [b₇b₆] to determine the source and flow direction of the data and outputs the FFT result to the parallel IFFT calculation module, the parallel complex multiplication calculation unit or the feedback compensation parameter calculation module.

The frequency domain equalization module receives the input data block R from the parallel FFT calculation unit and invokes the parallel complex multiplication calculation unit to perform the calculation of formula (1). The module stores the frequency domain equalization coefficient W, calculated in advance by formulas (2) - (5), and after receiving the data block R outputs R together with W to the parallel complex multiplication calculation unit.

The parallel complex multiplication calculation unit serves the system's parallel complex multiplications in a multiplexed manner. Like the parallel FFT calculation unit, it consists of a complex multiplication multiplexer, an N-path parallel complex multiplier and a multiplication result distributor. The complex multiplication multiplexer monitors complex multiplication requests from the frequency domain equalization module and the feedback compensation parameter calculation module and passes them to the N-path parallel complex multiplier over the data bus; the N-path parallel complex multiplier performs the complex multiplication; the multiplication result distributor checks [b₇b₆] to determine the source and flow direction of the data and outputs the result to the data buffer module or the feedback compensation parameter calculation module.

The data buffer module wraps a RAM to cache the data needed during iteration. Specifically, the module checks the iteration count [b₂b₁b₀] in the tag of the input data block. If it is 0, i.e. the data block is the frequency domain equalization result Y obtained in the 0th iteration, the data block is cached in RAM at the address given by its identifier [b₅b₄b₃]; otherwise, the corresponding data is fetched from RAM at the address given by the identifier [b₅b₄b₃] of the input data block and added to the input data, yielding U^(l), which is output to the parallel IFFT calculation module.
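The store-or-compensate behaviour of the data buffer module can be sketched as below. The class name `DataBuffer` and the use of a Python dict to model the RAM are illustrative assumptions. Passing Y straight through in the 0th iteration follows the embodiment description rather than this paragraph, and the tag update is omitted for brevity.

```python
class DataBuffer:
    """RAM-backed cache keyed by the 3-bit block identifier [b5 b4 b3]."""

    def __init__(self):
        self.ram = {}  # models a RAM with 8 addressable entries

    def process(self, tag, block):
        iteration = tag & 0b111        # [b2 b1 b0]
        address = (tag >> 3) & 0b111   # [b5 b4 b3]
        if iteration == 0:
            # block is Y from the feedforward equalizer: cache it, pass it on
            self.ram[address] = list(block)
            return list(block)
        # block is the compensation Z^(l): add the cached Y to obtain U^(l)
        cached_y = self.ram[address]
        return [y + z for y, z in zip(cached_y, block)]
```

For example, a block with identifier 2 is first stored with tag `0b00010000` and later compensated when a block tagged `0b00010001` arrives.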

The parallel IFFT calculation module performs a parallel IFFT on the input from the data buffer module. Because the IFFT and the FFT have the same computational structure, the parallel IFFT is realized by calling the parallel FFT calculation unit. Specifically, the parallel IFFT module takes the conjugate of the input data block U^(l), outputs it to the parallel FFT calculation unit, and modifies [b₇b₆] in the tag of the data block to indicate its data source and flow direction. After receiving the result returned by the parallel FFT calculation unit, it takes the conjugate again, modifies the tag, and passes the result as output u^(l) to the remapping module and the output module.
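The conjugate, FFT, conjugate reuse described above rests on the identity IFFT(X) = conj(FFT(conj(X)))/N. A minimal sketch follows, with a recursive radix-2 FFT standing in for the shared FFT calculator. The 1/N scaling is an assumption here, since the text does not say where the scaling is applied in hardware.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]   # twiddle factor
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def ifft_via_fft(X):
    """Inverse transform that reuses the forward FFT via two conjugations."""
    n = len(X)
    y = fft([v.conjugate() for v in X])      # conjugate, then forward FFT
    return [v.conjugate() / n for v in y]    # conjugate again, scale by 1/N
```

Because only conjugations are added around the forward transform, a single FFT calculator can serve both directions.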

The feedback compensation parameter calculation module receives the outputs of the parallel FFT calculation unit and the parallel complex multiplication calculation unit and executes formulas (6) - (8) to calculate the compensation parameters in each iteration. The module comprises a parallel real multiplier, a parallel real divider, a FIFO, a RAM, a parallel complex divider and a multiplexer. For an input from the parallel FFT calculation unit, the iteration count [b₂b₁b₀] in the tag is checked first. If it is 0, the input data block is the FFT result R of the system input r; it is sent to the parallel complex divider to calculate R(n)/λ(n), and the result is stored in RAM according to its identifier [b₅b₄b₃]. Otherwise, the input data block is the frequency-domain form of the remapped symbol sequence, and the following three operations are performed: it is passed to the multiplexer together with the R(n)/λ(n) result fetched from RAM to request a calculation by the parallel complex multiplication calculation unit; it is passed to the multiplexer together with its own conjugate to request a second calculation by the parallel complex multiplication calculation unit; and it is stored in the FIFO according to its identifier [b₅b₄b₃] for later use. The two results returned by the parallel complex multiplication calculation unit are averaged, after which the compensation coefficient is calculated by the parallel real multiplier and the parallel real divider; at the same time, the data block stored in the FIFO is read out and sent together with the coefficient to the parameter calculation multiplexer. Finally, the parameter calculation multiplexer processes the request and outputs the data to the parallel complex multiplication calculation unit over the bus.

The remapping module checks the iteration count of the input data block u^(l) from the parallel IFFT calculation module to determine whether the iteration is complete. If not, it adds 1 to the iteration count, obtains a bit sequence by hard decision according to the modulation mode, and remaps the bit sequence into a symbol sequence; otherwise, it performs no operation on the data block.
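A minimal sketch of the remapping module is given below. The text does not fix a modulation, so unit-energy QPSK with a sign-based decision is assumed purely for illustration; `remapping_module` and `max_iterations` are hypothetical names, and the iteration count is taken from the low three tag bits [b₂b₁b₀].

```python
SCALE = 2 ** -0.5  # per-axis amplitude of unit-energy QPSK (assumed modulation)

def remap_qpsk(block):
    """Hard-decide each symbol by quadrant, then map back to ideal QPSK."""
    out = []
    for s in block:
        bit_i = 0 if s.real >= 0 else 1    # hard decision on the I component
        bit_q = 0 if s.imag >= 0 else 1    # hard decision on the Q component
        out.append(complex(SCALE * (1 - 2 * bit_i), SCALE * (1 - 2 * bit_q)))
    return out

def remapping_module(tag, block, max_iterations):
    """Return (new_tag, remapped block), or None when iteration is complete."""
    iteration = tag & 0b111                # [b2 b1 b0]
    if iteration >= max_iterations:
        return None                        # completed blocks are left alone
    new_tag = (tag & ~0b111) | (iteration + 1)
    return new_tag, remap_qpsk(block)
```

Returning None for a completed block mirrors the statement that the module performs no operation on it; the output module handles that case instead.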

The output module checks the iteration count of the input data block u^(l) from the parallel IFFT calculation module to determine whether the iteration is complete. If so, it removes the tag of the signal, performs parallel-to-serial conversion, and outputs the final high-speed FTN signal iterative equalization result; otherwise, the module does nothing.

Advantageous effects

1. A block-based data structure and a simplified iterative equalization algorithm are adopted, and separate feedforward and feedback structures significantly reduce complexity while preserving system performance;

2. a parallel structure and pipelined processing improve processing speed and throughput;

3. all computing demands in the system are consolidated, each computing module is implemented with a uniform interface and architecture, and multiplexing the general-purpose computing modules saves a large amount of hardware resources;

4. the structure of the existing iterative equalization algorithm is changed: the original iterative update process is separated into two independent parts, a feedforward loop and a feedback loop, which simplifies the iterative update structure and reduces its complexity; the formulas of the existing iterative equalization algorithm are reasonably approximated, which significantly reduces the amount of computation while preserving system performance;

5. a parallel structure and pipelined processing improve processing speed and throughput;

6. all computing requirements in the device are consolidated, each computing module is implemented with a uniform interface and architecture, and identical calculations in all steps share the same computing module by time-division multiplexing, which saves a large amount of hardware logic resources.

Drawings

FIG. 1 is a flowchart of an FPGA implementation method for iterative equalization of a high-speed FTN signal according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of an algorithm structure according to an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of an FPGA-based high-speed FTN signal iterative equalization system according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of an input preprocessing module according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a feedback compensation parameter calculation module according to an embodiment of the present invention.

Detailed Description

In order to better understand the technical solutions of the present application, the following will make a clear and complete description of the technical solutions of the embodiments of the present application with reference to the drawings in the embodiments of the present application.

The invention provides an FPGA-based high-speed FTN signal iterative equalization method and implementation device. As shown in FIG. 1, the structure comprises the following parts:

Preprocessing module. This module preprocesses the input symbol sequence, including blocking, removing the cyclic prefix and adding the address tag, to meet the processing requirements of the subsequent modules and steps.

Parallel FFT calculation unit. This unit consists of a multiplexer, a parallel FFT calculator and a distributor, and performs a parallel FFT on the input data while multiplexing hardware logic resources.

Frequency domain equalization module. This module performs frequency domain equalization of the data block.

Parallel complex multiplication unit. Similar to the FFT calculation unit, this unit performs N-path parallel complex multiplication.

Feedback compensation parameter calculation module. This module calculates the feedback compensation coefficients for iterative equalization; some of its complex multiplications are delegated to the complex multiplication unit.

Data buffer module. This module caches the data needed in subsequent iterations.

Remapping module. This module makes a hard decision, according to the modulation mode, on data whose iteration is not complete to obtain a bit sequence, remaps the bit sequence into a symbol sequence, and sends the symbol sequence into the next iteration.

Output module. This module detects whether the iteration count has been reached and restores the format of the original data sequence for output.

The implementation device of the invention can be used for iterative equalization of high-speed FTN signals; the modules can be implemented on an FPGA and are connected through a unified data interface. To better explain the working principle, specific implementation parameters are fixed first. Taking into account the bit error rate performance of the system and hardware conditions of current mainstream FPGAs such as clock frequency and logic resources, this embodiment uses a data block length N of 128, a cyclic prefix length v of 8, an input parallel path number p of 8 and an iteration number l of 3 to describe in detail the FPGA implementation method for iterative equalization in a high-speed faster-than-Nyquist system. It should be noted that the structural parameters in this embodiment are only a preferred implementation; they can be adjusted for specific application scenarios to achieve the desired effects and purposes.

The flow chart of the FPGA implementation method for iterative equalization of the high-speed FTN signal is shown in fig. 2, the algorithm structure is shown in fig. 3, and the method comprises the following steps:

Step one, preprocessing the input symbol sequence. The received 8-path parallel symbol sequence is shown in FIG. 4, where each group of data includes valid data symbols of length 128 and a cyclic prefix of length 16. The subsequent processing operates on data blocks of length 128, so the input symbol sequence must undergo serial-to-parallel conversion and have the 8 symbols at the front and the 8 symbols at the back removed. Specifically, the preprocessing module and the subsequent modules use the same clock signal; the serial-to-parallel conversion adopts a FIFO structure, and while data enable is valid the 8 parallel input symbols of each clock cycle are buffered into a 128-path parallel data block, as shown in FIG. 4. The assignment of registers requires a delay of one clock cycle. Meanwhile, a counter is used for control; its value is incremented by 1 every clock cycle and reset every 18 clock cycles. When the counter reaches the 17th clock cycle, the 128-path parallel data block holds the valid data to be processed; the output data enable is set valid at this moment and invalid at all other times.
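The counter-driven serial-to-parallel conversion can be illustrated with a small cycle-level model. The generator below is only a sketch: it assumes the counter counts from 0 so that the block completes at counter value 17, it ignores the one-cycle register assignment delay, and the name `serial_to_parallel` is hypothetical.

```python
def serial_to_parallel(words, n=128, v=8, p=8):
    """Collect p-symbol words into groups of n + 2v symbols and strip guards.

    Yields (cycle, block), where block holds the n payload symbols and cycle
    is the index of the clock cycle in which the group was completed.
    """
    cycles_per_group = (n + 2 * v) // p     # 144 / 8 = 18 in the embodiment
    buffer, counter = [], 0
    for cycle, word in enumerate(words):
        buffer.extend(word)                 # one p-symbol word per clock cycle
        if counter == cycles_per_group - 1:     # counter value 17
            yield cycle, buffer[v:v + n]    # drop v symbols at each end
            buffer, counter = [], 0         # counter resets every 18 cycles
        else:
            counter += 1
```

With the embodiment's parameters a valid 128-symbol block is therefore produced once every 18 clock cycles, which sets the period available to downstream shared units.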

Because the hardware implementation method provided by the invention uses pipelined processing, several unprocessed data blocks exist in the system at any moment. Meanwhile, since iterative processing is required, the implementation structure contains many multiplexed logic resources and feedback loops. For these two reasons, a tag must be added to each data block to distinguish it and to mark its iteration count. Specifically, the format of the data block tag is [b₇b₆b₅b₄b₃b₂b₁b₀], where:

[b₂b₁b₀] represents the current iteration count of the data block and is set to [000] in the preprocessing module, representing the 0th iteration. [b₅b₄b₃] is the identifier of the data block and also serves as the address under which the data block is cached in subsequent modules; the identifier is updated in the preprocessing module by a counter, i.e. the value of [b₅b₄b₃] is increased by 1 each time a valid data block is prepared. [b₇b₆] marks the data flow in the multiplexing modules and is set to [00] in the preprocessing module. It should be noted that the data block tag is modified and set as required during subsequent data processing; in addition, as the subsequent steps and module descriptions show, the tag format provided in this embodiment identifies the data blocks at each stage in the system unambiguously and uniquely.
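The 8-bit tag layout can be expressed as simple bit-field packing. The helper names below are hypothetical, and the wrap-around of the 3-bit identifier counter from 7 back to 0 is an assumption that follows from its width rather than something the text states.

```python
def pack_tag(flow, block_id, iteration):
    """Build [b7 b6 | b5 b4 b3 | b2 b1 b0] = flow | block identifier | iteration."""
    if not (0 <= flow < 4 and 0 <= block_id < 8 and 0 <= iteration < 8):
        raise ValueError("tag field out of range")
    return (flow << 6) | (block_id << 3) | iteration

def unpack_tag(tag):
    """Split a tag into (flow, block_id, iteration)."""
    return (tag >> 6) & 0b11, (tag >> 3) & 0b111, tag & 0b111

def next_block_id(block_id):
    """Counter update in the preprocessing module; wraps after 7 (assumed)."""
    return (block_id + 1) & 0b111
```

For instance, `pack_tag(1, 5, 3)` gives `0b01101011`, and `unpack_tag` recovers the three fields from it.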

Finally, the preprocessing module delays the input data enable by one clock cycle and uses it as the drive enable of the subsequent modules. Because the input symbol sequence is fed continuously into the pipeline, it must be ensured that multiple valid data blocks never collide in a multiplexing module; the output delay of each module is therefore strictly controlled so that the modules step in coordination. Under the unified drive enable, the delay between modules remains fixed, and valid data is marked by data enable.

In summary, the ports of all the subsequent modules in the embodiment of the present invention are composed of a clock signal, a driving enable and a plurality of input/output interfaces, where each input/output interface includes data enable, a tag and 128 parallel paths of data.

Step two, frequency domain equalization. Before frequency domain equalization, the data first undergoes an FFT. As shown in FIG. 1, the output of the preprocessing module first goes to the FFT multiplexer, which receives the inputs of the different modules and, based on the data enable of each input interface, selects the valid input and passes it to the bus; this introduces a processing delay of 1 clock cycle. The data to be calculated then enters the 128-point FFT calculation module via the bus. The FFT module uses a radix-2 butterfly structure implemented recursively, with a processing delay of 12 clock cycles. The transformed data enters a distributor, which sends the data on the bus to different destination modules for subsequent processing according to tag bits [b₇b₆]: specifically, [00] is output from port 1 and [01] from port 2. It should be noted that the distributor is implemented with combinational logic and has no processing delay. Multiplexing of the FFT calculation module is thus achieved by introducing a multiplexer and a distributor and connecting them by a bus.
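The multiplexer, shared calculator and distributor arrangement can be modelled in a few lines. In this sketch the request dictionaries, the function names and the choice to raise an error on a collision are all illustrative assumptions; the text itself relies on fixed pipeline delays to guarantee that at most one input is valid at a time.

```python
def multiplex(requests):
    """Select the single request whose data enable is set (None if idle)."""
    valid = [req for req in requests if req["enable"]]
    if len(valid) > 1:
        raise ValueError("two valid blocks collided at the shared unit")
    return valid[0] if valid else None

def distribute(tag, result):
    """Route a result by [b7 b6]: 0b00 goes to port 1, 0b01 to port 2."""
    flow = (tag >> 6) & 0b11
    ports = {0b00: 1, 0b01: 2}
    return ports[flow], result

def shared_unit(requests, compute):
    """Multiplexer, then the shared calculator, then the distributor."""
    req = multiplex(requests)
    if req is None:
        return None
    return distribute(req["tag"], compute(req["data"]))
```

The `compute` argument stands for whatever calculator is being shared, so the same wrapper describes both the FFT unit and the complex multiplication unit.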

In the feedforward frequency domain equalization stage, [b₇b₆] in the tag of the data block is [00], so the data block enters the multiplication module via port 1 of the distributor, where the following calculation is performed:

Y＝R·W (1)

where the element W(i) of the equalization coefficient W can be calculated by formulas (2) - (5):

λ＝F(g₁) (3)

where N₀ represents the noise power, P represents the signal power, F(·) represents the Fourier transform, and h_s(t) and h_r(t) represent the FTN shaping filter coefficients and the matched filter coefficients, respectively. Since the feedforward frequency domain equalization coefficient is not updated during processing and can be pre-calculated, it can be stored directly in the distributor and output from port 1 together with the data block R. The input interface of the complex multiplication unit must accordingly be widened by 128 data paths so that the two operands can enter in parallel.

Meanwhile, the complex multiplication unit has a structure similar to that of the FFT calculation unit in order to realize the multiplexing function. The parallel multiplier implements the parallel computation by instantiating 128 complex multiplication IP cores, with a processing delay of 1 clock cycle. After frequency domain equalization is completed, [b₇b₆] in the tag of data block Y is still [00], and the distributor of the complex multiplication unit sends it from port 2 to the data buffer module.

Step three, calculating the feedback compensation coefficient and compensating and correcting the received data. First, the data buffer module determines the iteration count of the input data block Z^(l) from its tag:

If it is the 0th iteration, i.e. l＝[b₂b₁b₀]＝[000], the input of the data buffer module is the result of the frequency domain equalization (i.e. Z^(l)＝Y). In this case Y is passed to the module output port and the iteration count is increased by 1, i.e. U^(l+1)＝Y; at the same time, Y is stored in RAM at the address given by its identifier [b₅b₄b₃] for use in subsequent iterations. The rest of this step is then skipped.

If it is not the 0th iteration, the data buffer module uses the identifier [b₅b₄b₃] to fetch from RAM the data block Y produced by frequency domain equalization (i.e. the 0th iteration), then compensates according to U^(l+1)＝Y+Z^(l) and outputs the result. The quantities involved are the symbol sequence after the decision of the l-th iteration and the compensation coefficient given by the feedback compensation coefficient calculation module in the l-th iteration, whose elements can be calculated from formulas (6) - (8):

These formulas use the correlation coefficient between the decided symbol sequence and the original symbol sequence, and the energy of the decided symbol sequence.

In the specific implementation, the compensation coefficient is obtained by the feedback compensation parameter calculation module. As shown in FIG. 1, the two groups of input ports of the feedback compensation parameter calculation module are connected to output port 1 of the FFT calculation unit distributor and output port 2 of the complex multiplication unit distributor, respectively, and its one group of output ports is connected to an input of the complex multiplication unit multiplexer.

Specifically, as shown in fig. 5, the feedback compensation parameter calculation module first determines the number of iterations for the data block from the FFT allocator port 1 by the tag to distinguish R from R

In the frequency domain equalization stage, the FFT-transformed original data block R is sent to an 8-way parallel complex divider to compute the term R(n)/λ(n) of formula (7). Note that a complex divider consumes a large amount of hardware resources, so the calculation cannot be implemented with high parallelism. However, the result is only used in the next iteration and is calculated once per data block R, which leaves a relatively generous processing delay: the calculation can be spread over a period of time on a small number of parallel dividers. In the embodiment of the invention, the 128-way data block is processed sequentially by the 8-way parallel complex divider. This occupies the dividers for 16 clock cycles, which is no greater than the valid-data output period of the preprocessing module, so no conflict arises. The implementation of the parallel complex divider is shown in Fig. 5; including the input and output assignment process, the total processing delay is 18 clock cycles. The division result is stored in RAM, with its tag as the address, for use in subsequent iterations.
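The timing argument above is simple arithmetic and can be checked with a short sketch. The function name `divider_latency` is hypothetical, and the 2-cycle input/output overhead is an assumption inferred from the stated total of 18 cycles against 16 cycles of divider occupancy.

```python
import math


def divider_latency(lanes, dividers, io_overhead=2):
    """Return (busy_cycles, total_delay) when `dividers` are time-shared over `lanes`.

    io_overhead is an assumed figure, inferred from 18 - 16 = 2 in the text.
    """
    busy = math.ceil(lanes / dividers)  # cycles during which the dividers are occupied
    return busy, busy + io_overhead


OUTPUT_PERIOD = 18  # valid-data output period of the preprocessing module

busy, total = divider_latency(lanes=128, dividers=8)
print(busy, total)            # 16 18
print(busy <= OUTPUT_PERIOD)  # True: the dividers are free before the next block
```

With `lanes=128, dividers=32` the same function returns a total delay of 6 cycles, which agrees with the figure the description gives for the 32-way parallel real divider.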

In the iteration stage, the following three operations are performed on the result of the previous iteration:

(1) It is output together with the R(n)/λ(n) value taken out of RAM;

(2) After a delay of 2 clock cycles, it is output together with its own conjugate;

(3) It is stored in the FIFO for later use.

For the outputs of the above operations, the data block tag [b₇ b₆] is modified to [01], and they are then sent to the complex multiplication unit to perform the calculations of formulas (7) and (8). The distributor of the complex multiplication unit checks the tag [b₇ b₆] and sends the calculation results back to the feedback compensation parameter calculation module through port 2. Note that, because operation (2) is delayed by two clock cycles, the calculation results output by port 1 occupy 2 consecutive clock cycles, the former being the product from operation (1) and the latter the product from operation (2). This simplifies the control logic. After receiving the first product, the feedback compensation parameter calculation module performs the 128-way parallel multiplication in formula (9). Because one set of multiplicands in this calculation is fixed and the processing delay is 1 clock cycle, the result is ready at exactly the moment the second product arrives, and the two are sent together to a 32-way parallel real divider to perform the division in formula (9), giving the compensation coefficient. The 32-way parallel real divider has the same structure as the 8-way parallel complex divider, and its processing delay is 6 clock cycles. In addition, the module issues a read control signal at the appropriate time to take the stored data block out of the FIFO and outputs it, together with the compensation coefficient, to the complex multiplication unit to calculate the compensation Z^(l), which is sent to the data buffer module through distributor port 1.

Step four, the compensated data block U^(l+1) is sent to the IFFT module for inverse fast Fourier transform. By the properties of the fast Fourier transform, the IFFT can be computed with the same structure as the FFT, i.e., x(n) = conj(FFT(conj(X(k))))/N. Thus, in a specific implementation, the IFFT module takes the conjugate of the received data U^(l+1), modifies the tag [b₇ b₆] to [01], and sends it to the FFT calculation unit. The result is sent back to the IFFT module through port 2 of the FFT calculation unit distributor, where the conjugate is taken again to obtain the inverse-transformed data u^(l+1), which is output to the decision module and the output module while the tag [b₇ b₆] is restored to [00]. Note that the IFFT module is implemented in combinational logic and has no processing delay. In this way, the FFT and the IFFT share the same calculation module, which saves a large amount of hardware resources.
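The conjugation identity used in step four can be illustrated with a small software model. The sketch below is not the parallel FPGA structure: it uses a recursive radix-2 FFT in pure Python, and the function names `fft` and `ifft_via_fft` are hypothetical. It only shows that an inverse transform can be obtained from the forward transform plus two conjugations and a division by N.

```python
import cmath


def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft_via_fft(X):
    """IFFT computed as conj(FFT(conj(X))) / N, reusing the forward FFT."""
    n = len(X)
    y = fft([v.conjugate() for v in X])
    return [v.conjugate() / n for v in y]


x = [1 + 2j, -1 + 0j, 0.5 - 1j, 3 + 0j]
x_back = ifft_via_fft(fft(x))
print(max(abs(a - b) for a, b in zip(x, x_back)) < 1e-12)  # True
```

Because the two conjugations are element-wise sign flips of the imaginary part, they cost no arithmetic, which is consistent with the description's remark that the IFFT module adds no processing delay.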

Step five, the output module and the decision module determine whether the iteration number of the current data block u^(l+1) is greater than the expected number of iterations:

If it is greater, the output module performs the inverse of the operations of the input preprocessing module, namely tag removal and parallel-to-serial conversion; the data block then exits the iteration and the detected symbol sequence is output. The decision module takes no action.

If not, the output module takes no action; the decision module makes a hard decision on the data block u^(l+1), remaps it into the data block s^(l+1), and sends it to the FFT module to calculate its frequency-domain form. Step three is then executed again for the next iteration.
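The branch in step five can be sketched as follows. The modulation is not fixed in this passage, so Gray-mapped, unit-energy QPSK is assumed purely for illustration; the function names `hard_decide_qpsk`, `remap_qpsk` and `step_five` are likewise hypothetical.

```python
import math


def hard_decide_qpsk(u):
    """Hard-decide each complex symbol to a bit pair (assumed QPSK mapping)."""
    return [(int(v.real < 0), int(v.imag < 0)) for v in u]


def remap_qpsk(bits):
    """Map bit pairs back to unit-energy QPSK symbols (assumed mapping)."""
    a = 1 / math.sqrt(2)
    return [complex(a * (1 - 2 * b0), a * (1 - 2 * b1)) for b0, b1 in bits]


def step_five(u, iteration, expected_iterations):
    """Return ('output', u) when iteration is finished, else ('iterate', s)."""
    if iteration > expected_iterations:
        # Output module acts; decision module does nothing.
        return "output", u
    # Decision module acts: hard decision followed by remapping.
    return "iterate", remap_qpsk(hard_decide_qpsk(u))


action, s = step_five([0.9 + 0.2j, -0.1 - 1.3j], iteration=1, expected_iterations=3)
```

In the described system the remapped block would next be sent to the shared FFT calculation unit; that step is omitted here.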

Note that, to ensure that no data conflict occurs on the input bus of a multiplexing module (i.e. several data blocks occupying one bus at the same time), the following condition must hold: taking one data block as an example, the times at which it occupies each bus during a complete iteration process, reduced modulo the output period of the preprocessing module (18 in this embodiment), must all be distinct. For example, assuming a data block is output from the preprocessing module at time 0, it occupies the bus of the FFT calculation unit at times 1, 18, 32, 65, 79 and 112, so within one period the bus is occupied at (1, 18, 32, 65, 79, 112) mod 18 = (1, 0, 14, 11, 7, 4), and no conflict occurs. The other buses are similar. If a conflict does occur, the processing delays of the modules must be adjusted to satisfy the above condition, for example by adding register buffers or by changing the processing delay of an IP core.
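The no-conflict condition is a distinct-residues check and can be verified with a short sketch. The function name `bus_conflict_free` is hypothetical; the occupancy times and the period of 18 are the values given in the example above.

```python
def bus_conflict_free(occupancy_times, period):
    """Return (ok, residues); ok is True if no two times share a residue mod period."""
    residues = [t % period for t in occupancy_times]
    return len(set(residues)) == len(residues), residues


ok, residues = bus_conflict_free([1, 18, 32, 65, 79, 112], period=18)
print(residues)  # [1, 0, 14, 11, 7, 4]
print(ok)        # True
```

If a module delay were changed so that two occupancy times fell on the same residue, the check would return False, which corresponds to the case where a register buffer or a different IP core latency is needed.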

This completes the FPGA hardware implementation of iterative equalization for high-speed FTN signals. The advantages of the invention are as follows: by adopting a partitioned data structure and an iterative equalization algorithm, and by separating the feedforward structure from the feedback structure, complexity is significantly reduced while system performance is preserved; processing speed and throughput are improved through a parallel structure and pipelined processing; and all calculation demands in the system are consolidated, with each calculation module implemented through a unified interface and architecture, so that a large amount of hardware resources is saved through multiplexing.

In summary, although embodiments of the present invention have been described with reference to specific structural parameters and drawings, the present invention is not limited to the specific structural parameters. It will be apparent to those skilled in the art that several modifications can be made to the structural parameters in conjunction with the specific application scenario without departing from the principles and concepts of the present invention, which are also to be considered as falling within the scope of the present invention.

Claims

1. The method for iteratively equalizing the high-speed FTN signal based on the FPGA is characterized by comprising the following steps of:

step 3, performing N-point fast Fourier transform on the data block r to obtain R;

Y = R·W (1)

λ = F(g₁) (3)

U^(l) = Y + Z^(l) (6)

wherein U^(l) represents the feedback-compensated data block in the l-th iteration; the remaining symbols represent, respectively, the frequency domain form of the remapped symbol sequence in the l-th iteration and the compensation coefficient given by the feedback compensation coefficient calculation module in the (l-1)-th iteration, whose elements can be calculated from formulas (7)-(9):

where the quantities appearing in these formulas are the correlation coefficient between the decided symbol sequence and the original symbol sequence, and the energy of the decided symbol sequence;

step 5, performing an inverse fast Fourier transform on U^(l) to obtain u^(l);

2. The FPGA-based high-speed FTN signal iterative equalization method of claim 1, wherein:

in the step 1, the data block is obtained by dividing the original symbol sequence to be transmitted into groups of N symbols; the cyclic prefix refers to adding the first 2v symbols of the data block to the end of the data block, thereby constituting a group of data of length N+2v.

3. The method for iterative equalization of FPGA-based high-speed FTN signals of claim 2, wherein:

in the step 2, the data block r to be processed is parallel N paths of data.

4. The method for iterative equalization of FPGA-based high-speed FTN signals of claim 3, further comprising:

in the step 5, in the 0th iteration, U⁽⁰⁾ is obtained directly from the frequency domain equalization of step 4.1, i.e. U⁽⁰⁾ = Y.

5. The method for iterative equalization of FPGA-based high-speed FTN signals of claim 4, wherein:

in the step 6, remapping refers to remapping the bit sequence obtained after hard decision is performed on the symbol sequence into the symbol sequence according to the modulation mode.

6. The method for iterative equalization of FPGA-based high-speed FTN signals according to any one of claims 1-5, wherein:

an FFT is performed on the remapped symbol sequence to obtain its frequency domain form, and step 4 is performed again.

7. An FPGA-based high-speed FTN signal iterative equalization system based on the method of claim 1, characterized in that: the system comprises a preprocessing module, a parallel FFT computing unit, a frequency domain equalizing module, a parallel complex multiplication computing unit, a data buffer module, a parallel IFFT computing module, a feedback compensation parameter computing module, a remapping module and an output module;

the preprocessing module is used for preprocessing an input symbol sequence of the high-speed FTN signal iterative equalization system, carrying out serial-parallel conversion on data with the length of N+2v in each group, and simultaneously removing cyclic prefixes of v symbols before and after each group of data to obtain parallel N paths of data blocks r to be processed;

the parallel FFT calculation unit comprises an FFT multiplexer, an N-way parallel FFT calculator and an FFT result distributor, wherein the FFT multiplexer monitors FFT calculation call requests from the preprocessing module, the parallel IFFT calculation module and the remapping module and passes them to the FFT calculator through a data bus; the FFT calculator uses a radix-2 algorithm to carry out the actual FFT calculation and passes the result to the FFT result distributor; the FFT result distributor checks [b₇ b₆] to determine the source and destination of the data, and accordingly outputs the FFT result to the parallel IFFT calculation module, the parallel complex multiplication unit or the feedback compensation parameter calculation module;

the frequency domain equalization module receives the input data block R from the parallel FFT calculation unit and calls the parallel complex multiplication unit to execute the calculation of formula (1); the frequency domain equalization coefficient W, pre-calculated from formulas (2)-(5), is stored in the module, and after the data block R is received, R and W are output together to the parallel complex multiplication unit;

the parallel complex multiplication unit comprises a complex multiplication multiplexer, an N-way parallel complex multiplier and a multiplication result distributor, wherein the complex multiplication multiplexer monitors complex multiplication call requests from the frequency domain equalization module and the feedback compensation parameter calculation module and passes them to the N-way parallel complex multiplier through a data bus; the N-way parallel complex multiplier carries out the actual complex multiplication; the multiplication result distributor checks [b₇ b₆] to determine the source and destination of the data, and accordingly outputs the complex multiplication result to the data buffer module or the feedback compensation parameter calculation module;

the data caching module is used for caching data needed in the iterative process by packaging the RAM;

the parallel IFFT calculation module is used for performing parallel IFFT calculation on the input from the data caching module;

the feedback compensation parameter calculation module is used for receiving the outputs from the parallel FFT calculation unit and the parallel complex multiplication unit and executing formulas (6)-(8) to calculate the compensation parameters in each iteration, and comprises a parallel real multiplier, a parallel real divider, a FIFO, a RAM, a parallel complex divider and a parameter calculation multiplexer; for an input from the parallel FFT calculation unit, the tag [b₂ b₁ b₀] is checked first: if it is 0, the input data block is the FFT result R of the system input r, which is sent to the parallel complex divider to calculate R(n)/λ(n), and the result is stored in RAM according to its tag [b₅ b₄ b₃]; otherwise, the input data block is the FFT result of the remapped symbol sequence, on which the following three operations are performed: it is passed, together with the R(n)/λ(n) result fetched from RAM, to the multiplexer to request a call to the parallel complex multiplication unit; it is passed, together with its own conjugate, to the multiplexer to request a call to the parallel complex multiplication unit; and it is stored in the FIFO according to its tag [b₅ b₄ b₃]; for the two products input from the parallel complex multiplication unit, after averaging, the compensation coefficient is calculated by the parallel real multiplier and the parallel real divider, and at the same time the data block held in the FIFO is taken out and sent with a processing request to the parameter calculation multiplexer; finally, the parameter calculation multiplexer outputs the data to the parallel complex multiplication unit through the bus;

the remapping module is used for examining the input data block u^(l) from the parallel IFFT calculation module to determine whether iteration is completed; if not, it adds 1 to the iteration number, makes a hard decision according to the modulation mode to obtain a bit sequence, and remaps the bit sequence into a symbol sequence; otherwise, it performs no operation on the data block;

the output module is used for examining the input data block u^(l) from the parallel IFFT calculation module to determine whether iteration is completed; if so, it removes the tag of the data block and performs parallel-to-serial conversion to output the final high-speed FTN signal iterative equalization result; otherwise, the module performs no operation;

the preprocessing module also constructs a tag for the data block r, the binary format of which is [b₇ b₆ b₅ b₄ b₃ b₂ b₁ b₀], wherein [b₂ b₁ b₀] represents the current iteration number of the data block and is set to [000] in the preprocessing module, representing iteration 0; [b₅ b₄ b₃] represents the index of the data block and is used to distinguish different data blocks, the index being updated in the preprocessing module by a counter, i.e. the value of [b₅ b₄ b₃] is increased by 1 each time a data block to be processed is prepared; [b₇ b₆] marks the data flow within the system and is set to [00] in the preprocessing module; the preprocessing module passes the data block r and its tag as output to the parallel FFT calculation unit; finally, the preprocessing module also provides a unified drive enable for the rest of the system, so as to keep all modules of the system in step.

8. The FPGA-based high-speed FTN signal iterative equalization system of claim 7, wherein:

the data buffer module, which wraps a RAM, checks the iteration number [b₂ b₁ b₀] in the tag of the input data block: if it is 0, i.e. the data block is the frequency domain equalization result Y obtained in iteration 0, the data block is cached in RAM with its index [b₅ b₄ b₃] as the address; otherwise, the corresponding data is fetched from RAM using the index [b₅ b₄ b₃] of the input data block as the address and added to the input data, and the resulting U^(l) is output to the parallel IFFT calculation module.

9. The FPGA-based high-speed FTN signal iterative equalization system of claim 7, wherein:

the parallel IFFT calculation module takes the conjugate of the input data block U^(l) and outputs it to the parallel FFT calculation unit, while modifying the tag [b₇ b₆] of the data block to indicate the data source and destination; finally, on receiving the calculation result returned by the parallel FFT calculation unit, it takes the conjugate again, modifies the tag, and outputs the result u^(l) to the remapping module and the output module.