CN113918875B - Fast processing method of two-dimensional FFT - Google Patents
Fast processing method of two-dimensional FFT
- Publication number
- CN113918875B CN113918875B CN202111114501.9A CN202111114501A CN113918875B CN 113918875 B CN113918875 B CN 113918875B CN 202111114501 A CN202111114501 A CN 202111114501A CN 113918875 B CN113918875 B CN 113918875B
- Authority
- CN
- China
- Prior art keywords
- unit
- data
- cache
- fft calculation
- ram
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/141—Discrete Fourier transforms
- G06F17/142—Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/20—Handling requests for interconnection or transfer for access to input/output bus
- G06F13/28—Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Mathematics (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Optimization (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Analysis (AREA)
- Algebra (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Discrete Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
The invention relates to a fast processing method of a two-dimensional FFT. A DMA unit reads the original data in a RAM unit sequentially into cache units A1, A2, A3 and A4; FFT calculation units F1, F2, F3 and F4 then compute the data in turn to obtain the one-dimensional FFT results, which are stored in cache units B1, B2, B3 and B4 and written back to the RAM unit by the DMA unit, completing the first-dimension FFT calculation. The second-dimension FFT calculation is performed in the same manner as the first. Because the invention executes the DMA unit, the hardware accelerator FFTA and the CPU in stages, RAM read-write operations run in parallel with FFT calculation, improving operational efficiency and giving the system good real-time performance.
Description
Technical Field
The invention relates to the field of data processing, and in particular to a fast processing method of a two-dimensional FFT.
Background
Existing radar data (or other data) is processed by a two-dimensional FFT as follows: first, a DMA read operation, FFT calculation and DMA write operation are performed in sequence on the raw AD data in RAM, and the one-dimensional result is stored back in RAM; then a DMA read operation, FFT calculation and DMA write operation are performed in sequence on that one-dimensional FFT result, and the two-dimensional result is stored in RAM. This processing method has the following problems:
1. The DMA read-write operations and the FFT processing execute serially, so operational efficiency is low and real-time requirements cannot be met;
2. Although the DMA can support dual channels, performing read and write operations simultaneously causes random read-write access to the RAM (a problem especially pronounced for DRAM), resulting in low execution efficiency and failure to meet real-time requirements;
3. During FFT processing, the thread occupies the CPU for a long time, so other threads cannot be scheduled normally;
4. Hardware computing resources are not fully utilized, so the CPU may require a higher clock frequency to meet application requirements, making cost reduction difficult.
In view of the above problems, the present inventors propose a fast two-dimensional FFT processing method.
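The cost of the serial scheme can be illustrated with a toy timing model. The stage times below are arbitrary placeholders (the patent gives no concrete figures); the point is only that a serialized read/compute/write loop pays the sum of all three stages per block, while an idealized pipeline pays roughly the slowest stage per block:

```python
# Hypothetical timing model for serial vs. pipelined DMA + FFT processing.
# All times are arbitrary units; none of these numbers come from the patent.

def serial_time(n_blocks, t_read, t_fft, t_write):
    # Serial scheme: each block is read, transformed, and written back
    # before the next block starts.
    return n_blocks * (t_read + t_fft + t_write)

def pipelined_time(n_blocks, t_read, t_fft, t_write):
    # Idealized pipeline: after the fill phase (one pass through all three
    # stages), every additional block costs only the slowest stage.
    t_max = max(t_read, t_fft, t_write)
    return (t_read + t_fft + t_write) + (n_blocks - 1) * t_max

if __name__ == "__main__":
    n, tr, tf, tw = 64, 3, 5, 3
    print(serial_time(n, tr, tf, tw))     # 64 * 11 blocks of serial work
    print(pipelined_time(n, tr, tf, tw))  # fill cost + 63 * slowest stage
```

As the block count grows, the pipelined total approaches `n_blocks * max(stage)`, which is the overlap the invention aims for.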
Disclosure of Invention
The invention aims to provide a fast processing method of a two-dimensional FFT (fast Fourier transform) so as to improve the operational efficiency of the FFT.
To achieve the above purpose, the technical scheme adopted by the invention is as follows:
The fast processing method of the two-dimensional FFT is implemented on a processing system comprising an FFT calculation unit, a cache unit, a RAM unit and a DMA unit.
The FFT calculation unit performs FFT calculation through a CPU and a hardware accelerator FFTA, and comprises FFT calculation units F1, F2, F3 and F4.
The cache unit comprises cache units A1, A2, A3, A4, B1, B2, B3 and B4.
The RAM unit stores the data awaiting FFT calculation and the data for which FFT calculation is complete. The DMA unit reads data from the RAM into cache units A1, A2, A3 and A4, or writes the data in cache units B1, B2, B3 and B4 back to the RAM unit.
The processing method comprises three phases: initial reading and calculation; cyclic reading, writing and calculation; and ending calculation and writing.
The initial reading and calculation is as follows:
(1) Start the DMA unit, read the data to be processed from the RAM unit, and place it into cache units A1 and A2;
(2) FFT calculation unit F1 performs the FFT on the data in cache unit A1 using the CPU and the hardware accelerator FFTA, and the result is stored in cache unit B1.
The cyclic reading, writing and calculation is as follows:
(1) FFT calculation unit F2 performs the FFT on the data in cache unit A2 using the CPU and the hardware accelerator FFTA, and the result is stored in cache unit B2; at the same time, the DMA unit is started to read the next data from the RAM unit into cache units A3 and A4;
(2) FFT calculation unit F3 performs the FFT on the data in cache unit A3 using the CPU and the hardware accelerator FFTA, and the result is stored in cache unit B3; at the same time, the DMA unit is started to write the data in cache units B1 and B2 to the RAM;
(3) FFT calculation unit F4 performs the FFT on the data in cache unit A4 using the CPU and the hardware accelerator FFTA, and the result is stored in cache unit B4; at the same time, the DMA unit is started to read the next data from the RAM unit into cache units A1 and A2;
(4) FFT calculation unit F1 performs the FFT on the data in cache unit A1 using the CPU and the hardware accelerator FFTA, and the result is stored in cache unit B1; at the same time, the DMA unit is started to write the data in cache units B3 and B4 to the RAM;
(5) Steps (1) to (4) repeat until all data in the RAM requiring FFT calculation has been read.
The ending calculation and writing is as follows:
(1) FFT calculation unit F4 performs the FFT on the data in cache unit A4 using the CPU and the hardware accelerator FFTA, and the result is stored in cache unit B4;
(2) The DMA unit is started, and the data in cache units B3 and B4 are written to the RAM.
The CPU maintains a BUSY flag, which is set to 1 while the FFT calculation unit is computing.
Adopting the above scheme, the invention has the following beneficial effects:
1. The DMA unit, the hardware accelerator FFTA and the CPU execute in stages, so RAM read-write operations run in parallel with FFT calculation, improving operational efficiency and giving the system good real-time performance.
2. Pipeline and ping-pong operation separate the DMA unit's read and write accesses to the RAM, avoiding the inefficiency caused by random read-write of the RAM unit, and of DRAM in particular. The CPU can also exit and release its resources promptly during data processing, improving the flexibility of CPU usage.
Drawings
FIG. 1 is a schematic block diagram of the present invention;
Fig. 2 is a flow chart of the method of the present invention.
Detailed Description
As shown in fig. 1, the present invention discloses a fast processing system for a two-dimensional FFT, which includes an FFT calculation unit, a cache unit, a RAM unit, and a DMA unit (not shown in the figure).
The FFT calculation unit performs FFT calculation through the CPU and the hardware accelerator FFTA, and comprises FFT calculation units F1, F2, F3 and F4. The cache unit comprises cache units A1, A2, A3, A4, B1, B2, B3 and B4. The RAM unit stores the data awaiting FFT calculation and the data for which FFT calculation is complete. The DMA unit reads data from the RAM into cache units A1, A2, A3 and A4, or writes the data in cache units B1, B2, B3 and B4 back to the RAM unit.
With continued reference to fig. 1, the invention also discloses a fast processing method of the two-dimensional FFT based on this system. The DMA unit reads the original data in the RAM unit sequentially into cache units A1, A2, A3 and A4; FFT calculation units F1, F2, F3 and F4 compute the data in turn to obtain the one-dimensional FFT results, which are stored in cache units B1, B2, B3 and B4 and then written back to the RAM unit by the DMA unit, completing the first-dimension FFT calculation. The second-dimension FFT proceeds in the same way: the DMA unit reads the first-dimension FFT results from the RAM unit sequentially into cache units A1, A2, A3 and A4; FFT calculation units F1, F2, F3 and F4 compute the data in turn to obtain the two-dimensional FFT results, which are stored in cache units B1, B2, B3 and B4 and then written back to the RAM unit by the DMA unit.
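The two-pass structure described above relies on the separability of the two-dimensional DFT: taking a 1-D FFT of every row and then a 1-D FFT of every column of the intermediate result yields the full 2-D transform. A minimal pure-Python sketch of this property (illustrative only, not the patented FFTA implementation):

```python
import cmath

def fft(x):
    # Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two.
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

def fft2d(matrix):
    # First dimension: FFT of every row; second dimension: FFT of every
    # column of the row-transformed result (the two passes of the patent).
    rows = [fft(row) for row in matrix]
    cols = [fft(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def dft2d(matrix):
    # Direct O(N^4) 2-D DFT, used only to cross-check fft2d.
    n, m = len(matrix), len(matrix[0])
    out = [[0j] * m for _ in range(n)]
    for u in range(n):
        for v in range(m):
            out[u][v] = sum(
                matrix[x][y] * cmath.exp(-2j * cmath.pi * (u * x / n + v * y / m))
                for x in range(n) for y in range(m))
    return out
```

For any power-of-two matrix the two results agree to floating-point precision, which is why the method only ever needs a 1-D FFT engine plus two passes over the data.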
Specifically, the fast processing method of the present invention comprises three phases, namely initial reading and calculation; cyclic reading, writing and calculation; and ending calculation and writing, as shown in fig. 2.
1. Initial reading and calculation:
(1) Start the DMA unit, read the data to be processed from the RAM unit, and place it into cache units A1 and A2;
(2) FFT calculation unit F1 performs the FFT on the data in cache unit A1 using the CPU and the hardware accelerator FFTA, and the result is stored in cache unit B1.
2. Cyclic reading, writing and calculation:
(1) FFT calculation unit F2 performs the FFT on the data in cache unit A2 using the CPU and the hardware accelerator FFTA, and the result is stored in cache unit B2; at the same time, the DMA unit is started to read the next data from the RAM unit into cache units A3 and A4;
(2) FFT calculation unit F3 performs the FFT on the data in cache unit A3 using the CPU and the hardware accelerator FFTA, and the result is stored in cache unit B3; at the same time, the DMA unit is started to write the data in cache units B1 and B2 to the RAM;
(3) FFT calculation unit F4 performs the FFT on the data in cache unit A4 using the CPU and the hardware accelerator FFTA, and the result is stored in cache unit B4; at the same time, the DMA unit is started to read the next data from the RAM unit into cache units A1 and A2;
(4) FFT calculation unit F1 performs the FFT on the data in cache unit A1 using the CPU and the hardware accelerator FFTA, and the result is stored in cache unit B1; at the same time, the DMA unit is started to write the data in cache units B3 and B4 to the RAM;
(5) Steps (1) to (4) repeat until all data in the RAM requiring FFT calculation has been read.
3. Ending calculation and writing:
(1) FFT calculation unit F4 performs the FFT on the data in cache unit A4 using the CPU and the hardware accelerator FFTA, and the result is stored in cache unit B4;
(2) The DMA unit is started, and the data in cache units B3 and B4 are written to the RAM.
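The loop above can be sketched as an event schedule in which each time slot pairs one FFT computation with one DMA transfer on disjoint buffers. Buffer and unit names (A1-A4, B1-B4, F1-F4) follow the patent; the scheduling code itself is a hypothetical reconstruction, and the drain phase follows the patent's "ending calculation and writing" steps verbatim:

```python
# Hypothetical event-level sketch of the ping-pong schedule. Entries that
# share a slot ("fft" and "dma") run in parallel; note that each slot's
# FFT source buffer is always disjoint from the DMA's target buffers.

def schedule(n_rounds):
    slots = []
    # Initial reading and calculation: fill A1, A2, then compute A1.
    slots.append({"dma": "read RAM -> A1,A2"})
    slots.append({"fft": "F1: A1 -> B1"})
    # Cyclic reading, writing and calculation: steps (1)-(4) of the loop.
    for _ in range(n_rounds):
        slots.append({"fft": "F2: A2 -> B2", "dma": "read RAM -> A3,A4"})
        slots.append({"fft": "F3: A3 -> B3", "dma": "write B1,B2 -> RAM"})
        slots.append({"fft": "F4: A4 -> B4", "dma": "read RAM -> A1,A2"})
        slots.append({"fft": "F1: A1 -> B1", "dma": "write B3,B4 -> RAM"})
    # Ending calculation and writing, as listed in the patent.
    slots.append({"fft": "F4: A4 -> B4"})
    slots.append({"dma": "write B3,B4 -> RAM"})
    return slots

if __name__ == "__main__":
    for i, slot in enumerate(schedule(1)):
        print(i, slot)
```

Every slot inside the loop carries both an "fft" and a "dma" entry, which is exactly the overlap the serial baseline lacks.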
The key points of the invention are as follows. First, the DMA unit, the hardware accelerator FFTA and the CPU execute in stages, so the read-write operations on the RAM run in parallel with FFT calculation, improving operational efficiency and giving the system good real-time performance. Second, pipeline and ping-pong operation separate the DMA unit's read and write accesses to the RAM, avoiding the inefficiency caused by random read-write of the RAM unit, and of DRAM in particular. Third, the thread occupied by the data processing flow can temporarily exit during the loop to respond to task requests from other threads.
On this basis, the CPU maintains a BUSY flag, which is set to 1 while the FFT calculation unit is computing. Different threads can request CPU resources concurrently, giving the system good flexibility and maximizing hardware resource utilization, which can further reduce cost, improve performance and enhance product competitiveness.
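The BUSY-flag behaviour can be sketched as follows. Names such as `FftWorker`, `process_block` and `yields` are illustrative inventions for this sketch, not from the patent; the flag is raised only while a block is being computed and cleared between blocks so that other threads get a chance to be scheduled:

```python
# Hypothetical sketch of the BUSY flag around block-wise FFT processing.
# process_block stands in for handing a block to the hardware accelerator;
# here it just doubles each value so the sketch is self-checking.

class FftWorker:
    def __init__(self, blocks):
        self.busy = 0        # BUSY flag: 1 while the FFT unit is computing
        self.blocks = blocks
        self.results = []
        self.yields = 0      # times the CPU was released between blocks

    def process_block(self, block):
        # Placeholder for the CPU + FFTA computation on one cache unit.
        return [2 * v for v in block]

    def run(self):
        for block in self.blocks:
            self.busy = 1                    # other threads see the unit as busy
            self.results.append(self.process_block(block))
            self.busy = 0                    # release between blocks
            self.yields += 1                 # a scheduler could run other threads here

if __name__ == "__main__":
    w = FftWorker([[1, 2], [3, 4]])
    w.run()
    print(w.results, w.busy, w.yields)
```

The design choice mirrored here is cooperative: because the flag drops between blocks rather than at the end of the whole transform, other threads never wait longer than one block's computation time.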
The foregoing embodiments do not limit the technical scope of the present invention; any minor modifications, equivalent variations and adaptations made to the above embodiments according to the technical principles of the present invention still fall within the scope of the technical solution of the present invention.
Claims (2)
1. A fast processing method of a two-dimensional FFT, characterized in that: the processing method is implemented on a processing system comprising an FFT calculation unit, a cache unit, a RAM unit and a DMA unit;
the FFT calculation unit performs FFT calculation through a CPU and a hardware accelerator FFTA, and comprises an FFT calculation unit F1, an FFT calculation unit F2, an FFT calculation unit F3 and an FFT calculation unit F4;
the cache unit comprises a cache unit A1, a cache unit A2, a cache unit A3, a cache unit A4, a cache unit B1, a cache unit B2, a cache unit B3 and a cache unit B4;
the RAM unit is used for storing data awaiting FFT calculation and data for which FFT calculation is complete; the DMA unit is used for reading data from the RAM into the cache units A1, A2, A3 and A4, or writing the data in the cache units B1, B2, B3 and B4 into the RAM unit;
the processing method comprises initial reading and calculation; cyclic reading, writing and calculation; and ending calculation and writing;
the initial reading and calculation is as follows:
starting the DMA unit, reading the data to be processed from the RAM unit, and placing it into the cache units A1 and A2;
the FFT calculation unit F1 performing the FFT on the data in the cache unit A1 using the CPU and the hardware accelerator FFTA, the result being stored in the cache unit B1;
the cyclic reading, writing and calculation is as follows:
the FFT calculation unit F2 performing the FFT on the data in the cache unit A2 using the CPU and the hardware accelerator FFTA, the result being stored in the cache unit B2; at the same time, starting the DMA unit to read the next data from the RAM unit into the cache units A3 and A4;
the FFT calculation unit F3 performing the FFT on the data in the cache unit A3 using the CPU and the hardware accelerator FFTA, the result being stored in the cache unit B3; at the same time, starting the DMA unit to write the data in the cache units B1 and B2 to the RAM;
the FFT calculation unit F4 performing the FFT on the data in the cache unit A4 using the CPU and the hardware accelerator FFTA, the result being stored in the cache unit B4; at the same time, starting the DMA unit to read the next data from the RAM unit into the cache units A1 and A2;
the FFT calculation unit F1 performing the FFT on the data in the cache unit A1 using the CPU and the hardware accelerator FFTA, the result being stored in the cache unit B1; at the same time, starting the DMA unit to write the data in the cache units B3 and B4 to the RAM;
the above steps repeating until all data in the RAM requiring FFT calculation has been read;
the ending calculation and writing is as follows:
the FFT calculation unit F4 performing the FFT on the data in the cache unit A4 using the CPU and the hardware accelerator FFTA, the result being stored in the cache unit B4;
starting the DMA unit, and writing the data in the cache units B3 and B4 to the RAM.
2. The fast processing method of a two-dimensional FFT of claim 1, characterized in that: the CPU maintains a BUSY flag, which is set to 1 while the FFT calculation unit is computing.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111114501.9A CN113918875B (en) | 2021-09-23 | 2021-09-23 | Fast processing method of two-dimensional FFT |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113918875A CN113918875A (en) | 2022-01-11 |
CN113918875B true CN113918875B (en) | 2024-05-03 |
Family
ID=79235854
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111114501.9A Active CN113918875B (en) | 2021-09-23 | 2021-09-23 | Fast processing method of two-dimensional FFT |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113918875B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1992018940A1 (en) * | 1991-04-18 | 1992-10-29 | Sharp Kabushiki Kaisha | Quasi radix-16 processor and method |
WO2003034269A1 (en) * | 2001-10-12 | 2003-04-24 | Pts Corporation | Method of performing a fft transform on parallel processors |
WO2010045808A1 (en) * | 2008-10-24 | 2010-04-29 | 中兴通讯股份有限公司 | Hardware apparatus and method for implementing fast fourier transform and inverse fast fourier transform |
CN102340796A (en) * | 2011-05-16 | 2012-02-01 | 中兴通讯股份有限公司 | Secondary synchronization channel detection method and device |
EP2778948A2 (en) * | 2013-03-15 | 2014-09-17 | Analog Devices, Inc. | FFT Accelerator |
WO2018129930A1 (en) * | 2017-01-12 | 2018-07-19 | 深圳市中兴微电子技术有限公司 | Fast fourier transform processing method and device, and computer storage medium |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7856465B2 (en) * | 2006-12-21 | 2010-12-21 | Intel Corporation | Combined fast fourier transforms and matrix operations |
- 2021-09-23: application CN202111114501.9A filed in China; granted as patent CN113918875B (status: Active)
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1992018940A1 (en) * | 1991-04-18 | 1992-10-29 | Sharp Kabushiki Kaisha | Quasi radix-16 processor and method |
WO2003034269A1 (en) * | 2001-10-12 | 2003-04-24 | Pts Corporation | Method of performing a fft transform on parallel processors |
WO2010045808A1 (en) * | 2008-10-24 | 2010-04-29 | 中兴通讯股份有限公司 | Hardware apparatus and method for implementing fast fourier transform and inverse fast fourier transform |
CN102340796A (en) * | 2011-05-16 | 2012-02-01 | 中兴通讯股份有限公司 | Secondary synchronization channel detection method and device |
EP2778948A2 (en) * | 2013-03-15 | 2014-09-17 | Analog Devices, Inc. | FFT Accelerator |
WO2018129930A1 (en) * | 2017-01-12 | 2018-07-19 | 深圳市中兴微电子技术有限公司 | Fast fourier transform processing method and device, and computer storage medium |
Non-Patent Citations (1)
Title |
---|
An application-customized instruction-set reconfigurable architecture and FFT algorithm mapping optimization (一种应用定制指令集可重构结构及FFT算法映射优化); Liu Lei, Yang Ziyu, Shen Jianliang, Li Sikun; Journal of National University of Defense Technology; 2012-12-28 (06); full text *
Also Published As
Publication number | Publication date |
---|---|
CN113918875A (en) | 2022-01-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106991011B (en) | CPU multithreading and GPU (graphics processing unit) multi-granularity parallel and cooperative optimization based method | |
CN108733415B (en) | Method and device for supporting vector random access | |
US6865631B2 (en) | Reduction of interrupts in remote procedure calls | |
CN106681660B (en) | IO scheduling method and IO scheduling device | |
CN116501249A (en) | Method for reducing repeated data read-write of GPU memory and related equipment | |
CN113918875B (en) | Fast processing method of two-dimensional FFT | |
CN106469119A (en) | A kind of data write buffer method based on NVDIMM and its device | |
CN102810133A (en) | Ray query method for network game, and scene server | |
CN112463037B (en) | Metadata storage method, device, equipment and product | |
CN113569189B (en) | Fast Fourier transform calculation method and device | |
CN102495710B (en) | Method for processing data read-only accessing request | |
CN114063923A (en) | Data reading method and device, processor and electronic equipment | |
US20230070827A1 (en) | Accelerating computations in a processor | |
CN113220608A (en) | NVMe command processor and processing method thereof | |
CN111368250B (en) | Data processing system, method and equipment based on Fourier transformation/inverse transformation | |
CN116820333B (en) | SSDRAID-5 continuous writing method based on multithreading | |
CN106951311B (en) | Data processing method and server cluster | |
US8677028B2 (en) | Interrupt-based command processing | |
CN112837205B (en) | Delay correction-based batch matrix inversion method on graphics processor | |
US11829768B2 (en) | Method for scheduling out-of-order queue and electronic device items | |
US20240004653A1 (en) | Approach for managing near-memory processing commands from multiple processor threads to prevent interference at near-memory processing elements | |
CN111367625A (en) | Thread awakening method and device, storage medium and electronic equipment | |
Lee et al. | Parallel srp-phat for GPUs | |
US11809282B2 (en) | Optimized pipeline to boost de-dup system performance | |
CN114281554B (en) | 3D-CNN acceleration method and device for 3D image processing and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |