CN112966209B - FFT processor and method for processing data thereof - Google Patents

Publication number
CN112966209B
Authority
CN
China
Prior art keywords
domain data
data
module
frequency domain
time domain
Prior art date
Legal status
Active
Application number
CN202110267342.XA
Other languages
Chinese (zh)
Other versions
CN112966209A (en
Inventor
李涌睿
谢宜壮
陈禾
陈亮
Current Assignee
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT
Priority to CN202110267342.XA
Publication of CN112966209A
Application granted
Publication of CN112966209B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/14 Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/141 Discrete Fourier transforms
    • G06F17/142 Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52 Multiplying; Dividing
    • G06F7/523 Multiplying only
    • G06F7/527 Multiplying only in serial-parallel fashion, i.e. one operand being entered serially and the other in parallel
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3867 Concurrent instruction execution, e.g. pipeline or look ahead using instruction pipelines
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Discrete Mathematics (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Complex Calculations (AREA)

Abstract

The invention relates to an FFT processor and a method for processing data thereof. The FFT processor includes a serial-parallel module that converts received first time domain data from serial form into 2^n parallel channels of second time domain data, where 2^n is not greater than the length of the first time domain data. Each of the 2^n processing channels comprises m operation level modules and m-1 multiplier modules arranged in sequence; the second time domain data of the current processing channel is input, and first frequency domain data is obtained after operation by the m operation level modules and the m-1 multiplier modules and is output to a twiddle factor multiplier module. The twiddle factor multiplier module multiplies the first frequency domain data from the 2^n processing channels by twiddle factors respectively to obtain and output 2^n groups of second frequency domain data. A radix-2^n butterfly unit performs a radix-2^n operation on the 2^n groups of second frequency domain data and outputs the resulting third frequency domain data to a parallel-serial module, which combines the 2^n groups of third frequency domain data into serial fourth frequency domain data.

Description

FFT processor and method for processing data thereof
Technical Field
The present application relates to the field of signal processing, and more particularly, to an FFT processor and a method for processing data thereof.
Background
Synthetic aperture radar (SAR) is a typical application scenario for FFT algorithms. SAR is a two-dimensional imaging radar with all-day, all-weather working capability, and it is widely applied in many important fields of national defense and the national economy, such as battlefield early warning and reconnaissance, precision weapon guidance, target identification, mineral resource exploration, and major disaster monitoring. Radar imaging requires good resolution and a high signal-to-noise ratio. Common SAR imaging algorithms employ pulse compression as a key technique to obtain higher resolution and a higher signal-to-noise ratio at the same time. The FFT is used many times in this technique, for example in signal spectrum acquisition, matched filtering, and signal restoration, and is an important component of the algorithm flow. The performance of the FFT processor affects the real-time performance and imaging quality of the SAR system; an efficient and stable FFT processor is therefore an important component of the SAR system.
With the continuous increase in SAR image resolution requirements, the number of points that the FFT processor must process at one time also increases. The FFT IP core provided by Xilinx can be configured for at most 64k points, which makes it difficult to meet the FFT length requirements of a high-resolution SAR system.
Disclosure of Invention
In order to solve the above problems, the present application proposes an FFT processor and a method of processing data thereof. The FFT processor provided by the application adopts a pipeline structure. An FFT processor with a pipeline structure can accept input and produce output continuously, offers high data throughput and good real-time performance, and is the most widely used type. Among pipeline structures, the circuit structure that occupies the fewest resources and performs best is the single-path delay feedback (SDF) structure.
In a first aspect, the present invention provides an FFT processor. The FFT processor includes:
a serial-parallel module for receiving first time domain data, converting the received first time domain data from serial form into 2^n parallel channels of second time domain data, and inputting the 2^n channels of second time domain data into 2^n processing channels in one-to-one correspondence; wherein 2^n is not greater than the length of the first time domain data, and n is an integer greater than 0;
the 2^n processing channels, wherein any current processing channel among the 2^n processing channels comprises m operation level modules arranged in sequence and m-1 multiplier modules arranged in sequence, the ith operation level module being connected to the (i+1)th operation level module through the ith multiplier module, m being an integer greater than 0, and i being an integer greater than 0 and not greater than m; the second time domain data of the current processing channel is input, a group of first frequency domain data is obtained after operation by the m operation level modules and the m-1 multiplier modules, and the first frequency domain data is output to a twiddle factor multiplier module;
the twiddle factor multiplier module, for multiplying the first frequency domain data from the 2^n processing channels by twiddle factors respectively to obtain and output 2^n groups of second frequency domain data;
a radix-2^n butterfly unit for performing a radix-2^n operation on the 2^n groups of second frequency domain data to obtain third frequency domain data and outputting the third frequency domain data to a parallel-serial module;
the parallel-serial module, for combining the 2^n groups of third frequency domain data into serial fourth frequency domain data.
Optionally, the FFT processor further comprises m-1 twiddle factor generators arranged in sequence, the ith twiddle factor generator being configured to provide twiddle factors for the ith multiplier module in each of said processing channels.
Optionally, the operation level module includes a first processing unit, a second processing unit, and a third processing unit; wherein,
the first processing unit comprises a first SDF structure and a first complex conversion unit; the first SDF structure is used for performing a radix-2 butterfly operation on the current data it receives; the first complex conversion unit is used for exchanging the real part and the imaginary part of the complex number obtained by the first SDF structure operation to obtain and output a first conversion result; when i is 1, the current data is the second time domain data; when i is greater than 1 and not greater than m, the current data is the output of the (i-1)th operation level module;
the second processing unit comprises a second SDF structure, a second complex conversion unit and a constant factor conversion unit; the second SDF structure is used for performing a radix-2 butterfly operation on the received first conversion result; the second complex conversion unit is used for exchanging the real part and the imaginary part of the complex number obtained by the second SDF structure operation to obtain and output a second conversion result; the constant factor conversion unit is used for applying a specific rotation to the second conversion result and then outputting it;
the third processing unit comprises a third SDF structure; the third SDF structure is used for performing a radix-2 butterfly operation on the current data it receives, the current data being the output of the second processing unit.
Optionally, the data type of the time domain data is a fixed point number.
Optionally, the ith operation level module among the m operation level modules processes the current data it receives to obtain and output target data, wherein the difference between the bit width of the target data and the bit width of the current data is 4 bits; when i is 1, the received current data is the second time domain data; when i is greater than 1 and not greater than m, the received current data is the output of the (i-1)th operation level module.
In a second aspect, the present invention provides a method for processing data by an FFT processor, comprising:
receiving first time domain data through a serial-parallel module, converting the received first time domain data from serial form into 2^n parallel channels of second time domain data, and inputting the 2^n channels of second time domain data into 2^n processing channels in one-to-one correspondence; wherein 2^n is not greater than the length of the first time domain data, and n is an integer greater than 0; any current processing channel among the 2^n processing channels comprises m operation level modules arranged in sequence and m-1 multiplier modules arranged in sequence, the ith operation level module being connected to the (i+1)th operation level module through the ith multiplier module, m being an integer greater than 0, and i being an integer greater than 0 and not greater than m; for each current processing channel among the 2^n processing channels, performing operation through the m operation level modules and the m-1 multiplier modules included in the current processing channel to obtain a group of first frequency domain data, and outputting the group of first frequency domain data to a twiddle factor multiplier module;
multiplying, by the twiddle factor multiplier module, the first frequency domain data from the 2^n processing channels by twiddle factors respectively to obtain and output 2^n groups of second frequency domain data;
performing, by a radix-2^n butterfly unit, a radix-2^n operation on the 2^n groups of second frequency domain data to obtain third frequency domain data, and outputting the third frequency domain data to a parallel-serial module;
combining, by the parallel-serial module, the 2^n groups of third frequency domain data into serial fourth frequency domain data.
Optionally, the method of processing data further comprises: providing twiddle factors for the ith multiplier module in each of the processing channels by the ith twiddle factor generator in the m-1 twiddle factor generators arranged in sequence.
Optionally, in the method for processing data, the data type of the time domain data is a fixed point number.
The method and the device provided by the embodiments of this specification can serve large-point FFT application scenarios such as SAR imaging systems, reduce the occupation of logic resources, and effectively improve the utilization of hardware resources.
Brief Description of the Drawings
In order to illustrate the technical solutions of the embodiments of this specification more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and a person skilled in the art may obtain other drawings from them without inventive effort.
Fig. 1 is an application scenario diagram of a technical solution provided in an embodiment of the present application;
Fig. 2 is a schematic diagram of the overall structure of an FFT processor provided in an embodiment of the present application;
Fig. 3 is a schematic structural diagram of an operation level module in the FFT processor provided in the embodiment of the present application;
Fig. 4 is a schematic structural diagram of an SDF in an operation level module provided in an embodiment of the present application;
Fig. 5 is a schematic diagram of a data bit width planning scheme provided in an embodiment of the present application;
Fig. 6 is a process schematic diagram of an FFT processor processing method provided in an embodiment of the present application;
Fig. 7 is a schematic diagram of a structural implementation of the three (-1) terms in the algorithm provided in an embodiment of the present application;
Fig. 8 is an algorithm implementation structure provided in an embodiment of the present application;
Fig. 9 is another algorithm implementation structure provided in an embodiment of the present application.
Detailed Description
The technical scheme provided by the invention is further described in detail below with reference to the accompanying drawings and the embodiments.
Fig. 1 is an application scenario diagram of a technical solution provided in an embodiment of the present application. As shown in fig. 1, data input is received, time domain data is input to an FFT processor, frequency domain data is obtained after processing by the FFT processor, and the obtained frequency domain data is output.
Fig. 2 is a schematic diagram of the overall structure of the FFT processor provided in the embodiment of the present application.
As shown in Fig. 2, the FFT processor provided in the present application includes a serial-parallel module, 2^n processing channels, a twiddle factor multiplier module, a radix-2^n butterfly unit and a parallel-serial module.
The serial-parallel module is used for receiving the first time domain data, converting the received first time domain data from serial form into 2^n parallel channels of second time domain data, and inputting the 2^n channels of second time domain data into the 2^n processing channels in one-to-one correspondence, wherein 2^n is not greater than the length of the first time domain data, and n is an integer greater than 0.
For example, the number of first time domain data points received by the serial-parallel module is 128k and the value of n is 2. The serial-parallel module converts the serial 128k-point first time domain data into 4 channels of 32k-point time domain data, i.e. the serial first time domain data is converted into 4 channels of parallel second time domain data.
Each processing channel, i.e. any current processing channel among the 2^n processing channels, comprises m operation level modules arranged in sequence and m-1 multiplier modules arranged in sequence, wherein the ith operation level module is connected to the (i+1)th operation level module through the ith multiplier module, m is an integer greater than 0, and i is an integer greater than 0 and not greater than m. The second time domain data of the current processing channel is input, a group of first frequency domain data is obtained after operation by the m operation level modules and the m-1 multiplier modules, and the first frequency domain data is output to the twiddle factor multiplier module.
Specifically, referring to Fig. 2 and taking m = 5 as an example, a single processing channel includes 5 operation level modules (Stage1, Stage2, Stage3, Stage4 and Stage5) and 4 multiplier modules arranged in sequence. The 32k-point data (second time domain data) passes through the 5 operation level modules and the 4 multiplier modules in sequence to obtain a group of first frequency domain data, and the last operation level module (Stage5) outputs the obtained group of first frequency domain data to the twiddle factor multiplier module.
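The sizes in this example are consistent if each operation level module, with its three radix-2 butterflies, resolves three bits of the transform index. That reading is an inference from the description rather than a statement in the source, and the function names below are hypothetical. A short arithmetic check under that assumption:

```python
# Assumption: each operation level module contains three radix-2 butterfly
# stages, so one module resolves 3 index bits (a factor of 2**3 = 8).
BUTTERFLIES_PER_MODULE = 3


def points_per_channel(m: int) -> int:
    """Transform length handled by one channel with m operation level modules."""
    return 2 ** (BUTTERFLIES_PER_MODULE * m)


def total_points(n: int, m: int) -> int:
    """Overall length when 2**n channels feed a final radix-2**n butterfly."""
    return (2 ** n) * points_per_channel(m)


if __name__ == "__main__":
    print(points_per_channel(5))  # 32768, i.e. 32k points per channel
    print(total_points(2, 5))     # 131072, i.e. 128k points overall
```

With m = 5 and n = 2 this gives 32k points per channel and 128k points in total, matching the figures quoted in the example.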
The twiddle factor multiplier module is used for multiplying the first frequency domain data from the 2^n processing channels by twiddle factors respectively to obtain and output 2^n groups of second frequency domain data.
The radix-2^n butterfly unit is used for performing a radix-2^n operation on the 2^n groups of second frequency domain data to obtain third frequency domain data and outputting the third frequency domain data to the parallel-serial module.
The parallel-serial module is used for combining the 2^n groups of third frequency domain data into serial fourth frequency domain data.
Specifically, the twiddle factor multiplier module multiplies the first frequency domain data by the twiddle factor
[formula image BDA0002972693580000061]
to obtain the second frequency domain data. The second frequency domain data then undergoes the radix-4 butterfly unit operation to obtain the third frequency domain data. The third frequency domain data enters the parallel-serial module, which combines the four channels of data to finally obtain the fourth frequency domain data. At this point the FFT processor has finished processing one block of time domain data.
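The data flow described above (serial-parallel split, per-channel transform, twiddle multiplication, radix-2^n butterfly, parallel-serial merge) can be sketched in software. The sketch below is illustrative only: it assumes an interleaved split (channel r receives samples r, r + 2^n, ...), twiddle factors of the form W_N^(r*k), and a direct DFT standing in for each hardware channel. None of these details is confirmed by the source, and the function names are hypothetical.

```python
import cmath


def dft(x):
    """Direct O(N^2) DFT, standing in for one pipelined processing channel."""
    size = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / size) for t in range(size))
        for k in range(size)
    ]


def parallel_fft(x, n=2):
    """DFT of x via 2**n channels, twiddle factors and a radix-2**n butterfly."""
    p = 2 ** n              # number of processing channels
    total = len(x)          # overall transform length N
    q = total // p          # points handled by each channel
    # Serial-to-parallel (assumed interleaved): channel r gets x[r], x[r+p], ...
    channels = [x[r::p] for r in range(p)]
    # Each channel yields its group of "first frequency domain data".
    first = [dft(ch) for ch in channels]
    out = [0j] * total
    for k in range(q):
        # Twiddle factor multiplier: W_N^(r*k) for channel r ("second" data).
        second = [
            first[r][k] * cmath.exp(-2j * cmath.pi * r * k / total)
            for r in range(p)
        ]
        # Radix-2**n butterfly: a p-point DFT across the channels ("third" data).
        third = dft(second)
        # Parallel-to-serial: butterfly output s is frequency bin k + s*q.
        for s in range(p):
            out[k + s * q] = third[s]
    return out
```

With n = 2 the final butterfly is the radix-4 case named in the text. In the described processor each channel would be the pipelined chain of operation level modules rather than a direct DFT.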
In some possible examples, the FFT processor further comprises m-1 twiddle factor generators arranged in sequence, the ith twiddle factor generator being configured to provide twiddle factors for the ith multiplier module in each of said processing channels.
For example, the twiddle factors used in the multiplier module after each operation level module are not set in advance; they are generated in real time by a twiddle factor generator tw_gen. To save resources, the ith multiplier modules of all 2^n processing channels share one twiddle factor generator, i.e. the ith multiplier module of each of the 2^n processing channels uses the same twiddle factor. In other words, the function or operation of the ith multiplier module is the same in each of the 2^n processing channels.
In some possible examples, the operation level module may include 3 processing units, a first processing unit, a second processing unit, and a third processing unit. The processing unit is used for rotating the data in a specific direction.
Fig. 3 is a schematic structural diagram of an operation stage module in the FFT processor provided in the embodiment of the present application. Fig. 4 is a schematic structural diagram of an SDF in an operation level module provided in an embodiment of the present application.
In some more specific examples, referring to Fig. 3 and Fig. 4:
the first processing unit comprises a first SDF structure and a first complex conversion unit; the first SDF structure is used for performing a radix-2 butterfly operation on the current data it receives; the first complex conversion unit is used for exchanging the real part and the imaginary part of the complex number obtained by the first SDF structure operation to obtain and output a first conversion result; when i is 1, the current data is the second time domain data; when i is greater than 1 and not greater than m, the current data is the output of the (i-1)th operation level module;
the second processing unit comprises a second SDF structure, a second complex conversion unit and a constant factor conversion unit; the second SDF structure is used for performing a radix-2 butterfly operation on the received first conversion result; the second complex conversion unit is used for exchanging the real part and the imaginary part of the complex number obtained by the second SDF structure operation to obtain and output a second conversion result; the constant factor conversion unit is used for applying a specific rotation to the second conversion result and then outputting it;
the third processing unit comprises a third SDF structure; the third SDF structure is used for performing a radix-2 butterfly operation on the current data it receives, the current data being the output of the second processing unit.
Specifically, the current data input into an operation level module is processed by the first processing unit, then by the second processing unit, then by the third processing unit, and is finally output to the next operation level module or to the twiddle factor multiplier module. When i is 1, the current data is the second time domain data; when i is greater than 1 and not greater than m, the current data is the output of the (i-1)th operation level module.
As described above, an SDF structure performs a radix-2 butterfly operation on the current data it receives, where the current data is the output of the processing unit preceding the one to which the SDF structure belongs, or the output of the operation level module preceding the one to which the SDF structure belongs. As shown in Fig. 4, an SDF structure includes a memory unit and a radix-2 butterfly unit. The memory unit is used for caching the result generated by the lower path of the radix-2 butterfly unit.
The complex conversion unit is used for exchanging the real part and the imaginary part of a complex number. The constant factor conversion unit is used for applying a specific rotation to the data processed by the second complex conversion unit, the specific rotation being a 45° clockwise rotation of the complex data on the complex plane.
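The two conversion units can be illustrated on separate real and imaginary parts. This is a minimal sketch, not the patented circuit: it assumes the complex conversion realizes a multiplication by -j (a swap plus one sign change) and that the 45° clockwise rotation corresponds to the factor (sqrt(2)/2)(1 - j) = e^(-j*pi/4). The function names are hypothetical.

```python
import math


def mul_neg_j(re, im):
    """(re + j*im) * (-j) = im - j*re: swap the parts and negate one of them."""
    return im, -re


def rotate_45_clockwise(re, im):
    """Multiply by (sqrt(2)/2) * (1 - j), i.e. rotate 45 degrees clockwise."""
    c = math.sqrt(2) / 2
    # (re + j*im) * (1 - j) = (re + im) + j*(im - re)
    return c * (re + im), c * (im - re)


if __name__ == "__main__":
    print(mul_neg_j(3, 4))            # (4, -3)
    print(rotate_45_clockwise(1, 1))  # approximately (1.41421356, 0.0)
```

The swap needs no multiplier at all, while the rotation needs only two additions and two scalings by the same constant, which is why these factors are cheaper than a general twiddle multiplication.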
For example, each channel contains 5 operation level modules, the operation level modules are connected through multiplier modules, and the twiddle factor generators provide twiddle factors for the multiplier modules. Each operation level module comprises 3 processing units (a first processing unit, a second processing unit and a third processing unit). The first processing unit comprises an SDF structure and a complex conversion unit -j. The second processing unit comprises an SDF structure, a complex conversion unit -j and a constant factor conversion unit
[formula image BDA0002972693580000081]
(the 45° clockwise rotation). The third processing unit comprises an SDF structure. Each SDF structure comprises a memory unit (RAM) and a radix-2 butterfly unit.
Specifically, when the 32k-point time domain data flows into the first processing unit of the first operation level module (Stage1) of the current channel, the first 16k points are cached in the memory unit of the first stage. As the last 16k points flow in, each time one data point is received a butterfly operation is performed directly on it and the corresponding data in the cache; the result generated by the upper path is sent to the next stage, and the result generated by the lower path is fed back to the cache, overwriting the data that has just been used. When all of the last 16k points have been received, the results of the lower path of the first-stage butterfly are all in the first-stage cache. At this point the SDF structure artificially generates a sequence of zeros as input to drive the butterfly unit; because the lower-path input is 0, the data in the cache is unchanged and is sent to the next stage one by one, which takes the place of a commutator. The subsequent stages operate in the same way, and only zeros need to be input as the drive.
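The three phases described above (fill the cache, compute butterflies while overwriting the cache, then flush the cache with a zero input) can be modeled behaviorally. The sketch below is a software illustration for a single block, with a hypothetical function name. It ignores clocking, the complex conversion units and the twiddle multiplications.

```python
def sdf_radix2_stage(block):
    """Behavioral model of one radix-2 SDF stage for one input block.

    Returns the samples in the order they leave the stage: first the
    upper-path sums, then the cached lower-path differences.
    """
    half = len(block) // 2
    buffer = [0] * half
    out = []
    # Phase 1: the first half of the block is written into the memory unit.
    for i in range(half):
        buffer[i] = block[i]
    # Phase 2: each arriving sample meets its cached partner in the butterfly.
    for i in range(half):
        a, b = buffer[i], block[half + i]
        out.append(a + b)   # upper path goes to the next stage
        buffer[i] = a - b   # lower path overwrites the used cache entry
    # Phase 3: a zero input drives the butterfly, so the cached values
    # pass through unchanged, one per cycle.
    for i in range(half):
        out.append(buffer[i] + 0)
    return out


if __name__ == "__main__":
    print(sdf_radix2_stage([1, 2, 3, 4]))  # [4, 6, -2, -2]
```

In hardware the flush of one block would overlap with the arrival of the next block's first half; the model processes a single block for clarity.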
In some possible examples, the data type of the time domain data is fixed point number.
Specifically, the decimal point position of a fixed point number is fixed, the data precision is higher than that of a floating point number, and the circuit is simple to implement. With the fixed point format adopted, the representable range is -1 to 0.9999999995 and the representation precision is 2^(-31).
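The stated range and precision match a signed format with 31 fractional bits. The sketch below assumes exactly that layout (1 sign bit plus 31 fractional bits) and adds saturation on conversion; both the layout and the saturation behavior are assumptions for illustration, and the names are hypothetical.

```python
FRAC_BITS = 31           # assumed: 1 sign bit + 31 fractional bits
SCALE = 1 << FRAC_BITS
Q_MIN = -SCALE           # raw integer representing -1.0
Q_MAX = SCALE - 1        # raw integer representing 1 - 2**-31


def to_fixed(value: float) -> int:
    """Quantize a real value to the raw integer, saturating at the limits."""
    raw = round(value * SCALE)
    return max(Q_MIN, min(Q_MAX, raw))


def to_float(raw: int) -> float:
    """Convert a raw fixed point integer back to a real value."""
    return raw / SCALE


if __name__ == "__main__":
    print(to_float(Q_MIN))  # -1.0
    print(to_float(Q_MAX))  # about 0.9999999995
    print(to_float(1))      # 2**-31, the step between adjacent values
```

The largest value is 1 - 2^(-31), which rounds to the 0.9999999995 quoted above.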
In some possible examples, the ith operation level module among the m operation level modules processes the current data it receives to obtain and output target data, wherein the difference between the bit width of the target data and the bit width of the current data is 4 bits; when i is 1, the received current data is the second time domain data; when i is greater than 1 and not greater than m, the received current data is the output of the (i-1)th operation level module.
Fig. 5 is a schematic diagram of a data bit width planning scheme provided in an embodiment of the present application. For example, referring to Fig. 5, the bit width of the data before it flows into the butterfly unit is the initial bit width, which is the bit width of both the real part and the imaginary part of the data. When the data passes through the first-stage radix-2 butterfly unit, the operation is one addition or one subtraction, and the data bit width needs to be expanded by 1 bit to ensure that no result overflows.
When the data is multiplied by the factor -j, the real and imaginary parts of the data are simply swapped, and the data bit width does not need to be expanded.
When the data passes through the second-stage radix-2 butterfly unit, the data bit width is expanded by 1 bit again.
Before the data enters the third-stage radix-2 butterfly unit, the operation can be decomposed into a multiplication by -j, which does not change the bit width, followed by a multiplication by
[formula image BDA0002972693580000091]
i.e. a 45° clockwise rotation of the data on the complex plane. Consider the extreme case for overflow: the original data lies on the angle bisector of a quadrant and the rotated data falls on a coordinate axis. In this case the ratio by which the real or imaginary part changes before and after the rotation is at its maximum, namely √2 times. Therefore, the data bit width only needs to be expanded by 1 bit to ensure that the calculation does not overflow.
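The worst case can be checked with a short worked equation. This sketch assumes the constant factor is e^{-j\pi/4} = (\sqrt{2}/2)(1 - j), which is the 45° clockwise rotation named in the text; a and b denote the real and imaginary parts of the input.

```latex
% Assumption: the constant factor is e^{-j\pi/4} = \tfrac{\sqrt{2}}{2}(1 - j)
\begin{aligned}
(a + jb)\cdot\tfrac{\sqrt{2}}{2}(1 - j)
  &= \tfrac{\sqrt{2}}{2}\big[(a + b) + j\,(b - a)\big], \\
\text{worst case } a = b:\quad
a(1 + j)\cdot\tfrac{\sqrt{2}}{2}(1 - j) &= \sqrt{2}\,a, \\
\big|\tfrac{\sqrt{2}}{2}(a \pm b)\big|
  \le \sqrt{2}\,\max(|a|, |b|) &< 2\,\max(|a|, |b|).
\end{aligned}
```

Since the growth factor is bounded by \sqrt{2}, which is less than 2, a single additional bit is enough to hold either output component.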
When the data passes through the third-stage radix-2 butterfly unit, the data bit width is expanded by 1 bit.
In summary, the data bit width is expanded by 4 bits on passing through one operation level module.
Specifically, each independent multiplier module expands the data bit width by 1 bit. According to the fixed-point FFT overflow planning scheme, each operation level module expands the bit width by 4 bits and each multiplier module outside the operation level modules expands it by 1 bit, so 5 operation level modules and 4 multiplier modules expand the bit width by 24 bits in total.
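The bit width bookkeeping above reduces to simple arithmetic. The sketch below tallies the per-operation growth values stated in the text; the constant and function names are hypothetical.

```python
# Bit growth per operation, as stated in the description.
BUTTERFLY_GROWTH = 1    # one addition or subtraction per radix-2 butterfly
SWAP_GROWTH = 0         # multiplying by -j only swaps real and imaginary parts
ROTATION_GROWTH = 1     # 45-degree rotation grows a component by at most sqrt(2)
MULTIPLIER_GROWTH = 1   # each multiplier module between operation level modules


def module_growth() -> int:
    """Growth of one operation level module: 3 butterflies, 2 swaps, 1 rotation."""
    return 3 * BUTTERFLY_GROWTH + 2 * SWAP_GROWTH + ROTATION_GROWTH


def channel_growth(m: int) -> int:
    """Growth of a channel with m operation level modules and m-1 multipliers."""
    return m * module_growth() + (m - 1) * MULTIPLIER_GROWTH


if __name__ == "__main__":
    print(module_growth())    # 4
    print(channel_growth(5))  # 24
```

For m = 5 this reproduces the 24-bit total: 5 × 4 bits from the operation level modules plus 4 × 1 bits from the multiplier modules.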
By adopting the FFT processor provided by the embodiment of the application, multiplier module resources can be saved and a large-point FFT processor can be realized.
Fig. 6 shows a processing method of an FFT processor provided in an embodiment of the present application. As shown in Fig. 6, the processing method includes:
S601: receiving first time domain data through a serial-parallel module, converting the received first time domain data from serial form into 2^n parallel channels of second time domain data, and inputting the 2^n channels of second time domain data into 2^n processing channels in one-to-one correspondence; wherein 2^n is not greater than the length of the first time domain data, and n is an integer greater than 0. Any current processing channel among the 2^n processing channels comprises m operation level modules arranged in sequence and m-1 multiplier modules arranged in sequence, wherein the ith operation level module is connected to the (i+1)th operation level module through the ith multiplier module, m is an integer greater than 0, and i is an integer greater than 0 and not greater than m.
In some possible examples, the time domain data type in the processing method is a fixed point number.
S602: for each current processing channel among the 2^n processing channels, performing operation through the m operation level modules and the m-1 multiplier modules included in the current processing channel to obtain a group of first frequency domain data, and outputting the group of first frequency domain data to a twiddle factor multiplier module.
The algorithm adopted in each operation level module is realized according to the following principle:
the N-point DFT formula is:
Figure BDA0002972693580000101
when N is an integer multiple of 8, n and k can be decomposed as follows:
Figure BDA0002972693580000102
substituting this decomposition into the DFT formula yields:
Figure BDA0002972693580000103
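The formula images referenced above are not reproduced in this text. As a hedged sketch only, the block below assumes the standard radix-2^3 index mapping; the symbols n_1 to n_4, k_1 to k_4 and W_N = e^{-j2π/N} follow common usage, and the patent's own formulas may use a different but equivalent notation. Under that assumption, the factors discussed in the description (three (-1) terms, the (-j) terms, a constant W_8 factor that is non-trivial only when k_1 and n_3 are both 1, and a final twiddle factor) arise as follows:

```latex
\begin{align*}
X(k) &= \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \qquad W_N = e^{-j 2\pi / N}, \quad k = 0, 1, \dots, N-1 \\[4pt]
n &= \tfrac{N}{2} n_1 + \tfrac{N}{4} n_2 + \tfrac{N}{8} n_3 + n_4,
  \qquad n_1, n_2, n_3 \in \{0, 1\}, \quad n_4 = 0, \dots, \tfrac{N}{8} - 1 \\
k &= k_1 + 2 k_2 + 4 k_3 + 8 k_4,
  \qquad k_1, k_2, k_3 \in \{0, 1\}, \quad k_4 = 0, \dots, \tfrac{N}{8} - 1 \\[4pt]
W_N^{nk} &= (-1)^{n_1 k_1} \, (-1)^{n_2 k_2} \, (-1)^{n_3 k_3}
            \, (-j)^{n_2 k_1} \, (-j)^{n_3 k_2}
            \, W_8^{n_3 k_1}
            \, W_N^{n_4 (k_1 + 2 k_2 + 4 k_3)}
            \, W_{N/8}^{n_4 k_4}
\end{align*}
```

All other cross terms in the product nk are integer multiples of N and therefore contribute a factor of 1. The last factor is the kernel of an N/8-point DFT over n_4, which is why the same structure can be applied recursively in the next operation level module.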
DFT can be regarded as a superposition operation: the value for each k requires all x(n) to participate in the superposition, but each x(n) carries a different weight, namely a twiddle factor, and the purpose of the decomposition is to convert the twiddle factor into a multi-stage implementation. Fig. 7 is a schematic diagram of a structural implementation of the three (-1) terms in the algorithm provided in the embodiments of the present application. Referring to Fig. 7, in combination with the idea of superposing x(n), the three (-1) terms can be realized by the structure of Fig. 7. The range of the n_4 term is omitted in Fig. 7. Each box in the figure represents a data set, and the range of n_4 in each data set is:
Figure BDA0002972693580000104
Based on the basic framework of Fig. 7, the (-j) terms, the term shown in formula image BDA0002972693580000105, and the last twiddle factor can be added to the block diagram. During the addition, note the special term shown in formula image BDA0002972693580000106, which takes effect only when k_1 and n_3 are both 1. As can be seen from Fig. 7, both k_1 and n_3 are present in columns 2 and 3, so there are two ways of adding the term shown in formula image BDA0002972693580000111, shown in Fig. 8 and Fig. 9 respectively. Fig. 8 and Fig. 9 are two different implementation structures provided in the embodiments of the present application. The present invention adopts the second implementation structure, shown in Fig. 9.
Multiplication by -j can be achieved by exchanging the real and imaginary parts of the complex number and adding a negative sign, without using a multiplier module. The term shown in formula image BDA0002972693580000112 is a constant factor, and multiplication by it can be achieved with a specific set of shifters and adders, again without using a multiplier module. Avoiding multiplier modules for these two factors is the key reason the algorithm in the present application saves multiplier modules.
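The two multiplier-free rotations can be illustrated on integer (fixed-point) data. The sketch below is an assumption-laden illustration, not the patent's circuit: it assumes the constant factor is W_8 = (√2/2)(1 - j), and the particular shift-and-add decomposition of √2/2 (2^-1 + 2^-3 + 2^-4 + 2^-6 + 2^-8 = 0.70703125) is a hypothetical choice. The function names are likewise hypothetical.

```python
def mul_neg_j(re, im):
    # (re + j*im) * (-j) = im - j*re:
    # swap the two parts and negate the new imaginary part.
    return im, -re


def scale_inv_sqrt2(x):
    # Approximates x * sqrt(2)/2 with shifts and adds only.
    # Hypothetical term set: 2^-1 + 2^-3 + 2^-4 + 2^-6 + 2^-8 = 0.70703125.
    return (x >> 1) + (x >> 3) + (x >> 4) + (x >> 6) + (x >> 8)


def mul_w8(re, im):
    # (re + j*im) * (sqrt(2)/2) * (1 - j)
    #   = (sqrt(2)/2) * ((re + im) + j*(im - re))
    return scale_inv_sqrt2(re + im), scale_inv_sqrt2(im - re)


print(mul_neg_j(3, 5))         # (5, -3)
print(scale_inv_sqrt2(65536))  # 46336
print(mul_w8(65536, 0))        # (46336, -46336)
```

For comparison, the exact value of 65536 × √2/2 is about 46340.95, so this five-term approximation is accurate to roughly 13 bits; a real design would pick the number of terms to match its word length.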
In some possible examples, the operation level module in the processing method includes three processing units: a first processing unit, a second processing unit, and a third processing unit, wherein:
the first processing unit comprises a first SDF structure and a first complex conversion unit; the first SDF structure performs a radix-2 butterfly operation on the current data it receives; the first complex conversion unit interchanges the real part and the imaginary part of the complex number obtained by the first SDF structure and outputs the result;
the second processing unit comprises a second SDF structure, a second complex conversion unit and a constant factor conversion unit; the second SDF structure performs a radix-2 butterfly operation on the current data it receives, the current data being the output of the first processing unit; the second complex conversion unit interchanges the real part and the imaginary part of the complex number obtained by the second SDF structure and outputs the result; the constant factor conversion unit applies a specific rotation to the data processed by the second complex conversion unit and outputs the result;
the third processing unit comprises a third SDF structure; the third SDF structure performs a radix-2 butterfly operation on the current data it receives, the current data being the output of the second processing unit.
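A functional view of the three processing units can be sketched for the smallest case, an 8-point transform. This is a software illustration under stated assumptions, not the pipelined SDF hardware: it assumes the index mapping n = 4·n1 + 2·n2 + n3 and k = k1 + 2·k2 + 4·k3, assumes the constant factor is W_8 = e^{-jπ/4}, and uses ordinary complex multiplication where the hardware would swap parts or use shift-and-add. The names `dft8_three_units` and `dft_direct` are hypothetical.

```python
import cmath

W8 = cmath.exp(-1j * cmath.pi / 4)  # assumed constant factor W_8


def dft_direct(x):
    # Reference N-point DFT, used only to check the result.
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]


def dft8_three_units(x):
    assert len(x) == 8
    # First unit: radix-2 butterfly over n1, then rotation (-j)^(n2*k1).
    s1 = [[[0j, 0j], [0j, 0j]], [[0j, 0j], [0j, 0j]]]  # s1[k1][n2][n3]
    for n2 in range(2):
        for n3 in range(2):
            a, b = x[2 * n2 + n3], x[4 + 2 * n2 + n3]
            s1[0][n2][n3] = a + b
            s1[1][n2][n3] = (a - b) * (-1j) ** n2
    # Second unit: butterfly over n2, then (-j)^(n3*k2) and W8^(n3*k1).
    s2 = [[[0j, 0j], [0j, 0j]], [[0j, 0j], [0j, 0j]]]  # s2[k1][k2][n3]
    for k1 in range(2):
        for n3 in range(2):
            a, b = s1[k1][0][n3], s1[k1][1][n3]
            s2[k1][0][n3] = (a + b) * W8 ** (n3 * k1)
            s2[k1][1][n3] = (a - b) * (-1j) ** n3 * W8 ** (n3 * k1)
    # Third unit: butterfly over n3, no further rotation for N = 8.
    out = [0j] * 8
    for k1 in range(2):
        for k2 in range(2):
            a, b = s2[k1][k2][0], s2[k1][k2][1]
            out[k1 + 2 * k2] = a + b
            out[k1 + 2 * k2 + 4] = a - b
    return out


x = [complex(i, 2 - i) for i in range(8)]
err = max(abs(g - r) for g, r in zip(dft8_three_units(x), dft_direct(x)))
print(err < 1e-9)  # True
```

The only non-trivial rotation appears after the second butterfly and only when k1 and n3 are both 1, which is consistent with the second processing unit being the one that contains the constant factor conversion unit.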
In some possible examples, the processing method further comprises providing, by the ith twiddle factor generator of m-1 sequentially arranged twiddle factor generators, a twiddle factor for the ith multiplier module in each processing channel.
S603: will come from 2 by twiddle factor multiplier module n The first frequency domain data of each processing channel are multiplied by a twiddle factor respectively to obtain 2 n And outputting the second frequency domain data after being assembled.
S604: through group-2 n Butterfly unit pair 2 n Group second frequency domain data basis-2 n And calculating and outputting the third frequency domain data to the parallel-to-serial module.
S605: pass parallel-serial module to convert 2 n The third set of frequency domain data is combined into serial fourth frequency domain data.
TABLE 1
Figure BDA0002972693580000121
Table 1 compares the resource usage of the radix-2 algorithm, the radix-4 algorithm, and the present method. As can be seen from Table 1, the present method saves a large number of multiplier module resources.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (5)

1. An FFT processor, comprising:
a serial-to-parallel module for receiving first time domain data, converting the received first time domain data from serial into 2^n parallel paths of second time domain data, and inputting the 2^n paths of second time domain data into 2^n processing channels in one-to-one correspondence; wherein 2^n is not greater than the length of the first time domain data, and n is an integer greater than 0;
the 2^n processing channels, wherein any current processing channel among the 2^n processing channels comprises m sequentially arranged operation level modules and m-1 sequentially arranged multiplier modules, the ith operation level module being connected to the (i+1)th operation level module through the ith multiplier module, m being an integer greater than 0, and i being an integer greater than 0 and not greater than m; the second time domain data input to the current processing channel is operated on by the m operation level modules and the m-1 multiplier modules to obtain a group of first frequency domain data, and the first frequency domain data is output to a twiddle factor multiplier module; the operation level module comprises a first processing unit, a second processing unit and a third processing unit; wherein the first processing unit includes a first SDF structure and a first complex conversion unit; the first SDF structure is used for performing a radix-2 butterfly operation on the current data received by the first SDF structure; the first complex conversion unit is used for exchanging the real part and the imaginary part of the complex number obtained by the first SDF structure operation to obtain and output a first conversion result; when i is 1, the current data is the second time domain data; when i is greater than 1 and not greater than m, the current data is the output of the (i-1)th operation level module; the second processing unit comprises a second SDF structure, a second complex conversion unit and a constant factor conversion unit; the second SDF structure is used for performing a radix-2 butterfly operation on the received first conversion result; the second complex conversion unit is used for exchanging the real part and the imaginary part of the complex number obtained by the second SDF structure operation to obtain and output a second conversion result; the constant factor conversion unit is used for rotating the second conversion result and outputting the rotated result; the third processing unit includes a third SDF structure; the third SDF structure is used for performing a radix-2 butterfly operation on the current data received by the third SDF structure, the current data being the output of the second processing unit;
m-1 twiddle factor generators arranged in sequence, the ith twiddle factor generator being used for providing twiddle factors for the ith multiplier module in each processing channel;
a twiddle factor multiplier module for multiplying the first frequency domain data from each of the 2^n processing channels by a twiddle factor to obtain 2^n groups of second frequency domain data, and outputting the second frequency domain data;
a radix-2^n butterfly unit for performing a radix-2^n operation on the 2^n groups of second frequency domain data to obtain third frequency domain data, and outputting the third frequency domain data to a parallel-to-serial module;
the parallel-to-serial module, for combining the 2^n groups of third frequency domain data into serial fourth frequency domain data.
2. The processor of claim 1, wherein the data type of the first time domain data and the second time domain data is a fixed point number.
3. The processor of claim 2, wherein an ith operation level module of the m operation level modules processes current data received by the ith operation level module to obtain and output target data, and a difference between a bit width of the target data and a bit width of the current data is 4 bits; when i is 1, the received current data is the second time domain data; when i is greater than 1 and not greater than m, the received current data is the output of the (i-1)th operation level module.
4. An FFT processor processing method, comprising:
receiving first time domain data through a serial-to-parallel module, converting the received first time domain data from serial into 2^n parallel paths of second time domain data, and inputting the 2^n paths of second time domain data into 2^n processing channels in one-to-one correspondence; wherein 2^n is not greater than the length of the first time domain data, and n is an integer greater than 0; any current processing channel among the 2^n processing channels comprises m sequentially arranged operation level modules and m-1 sequentially arranged multiplier modules, wherein the ith operation level module is connected to the (i+1)th operation level module through the ith multiplier module, m is an integer greater than 0, and i is an integer greater than 0 and not greater than m; the operation level module comprises 3 processing units, namely a first processing unit, a second processing unit and a third processing unit; wherein the first processing unit comprises a first SDF structure and a first complex conversion unit; a radix-2 butterfly operation is performed on the current data received by the first SDF structure; the real part and the imaginary part of the complex number obtained by the first SDF structure are interchanged by the first complex conversion unit and the result is output; the second processing unit comprises a second SDF structure, a second complex conversion unit and a constant factor conversion unit; a radix-2 butterfly operation is performed on the current data received by the second SDF structure, the current data being the output of the first processing unit; the real part and the imaginary part of the complex number obtained by the second SDF structure are interchanged by the second complex conversion unit and the result is output; the data processed by the second complex conversion unit is rotated by the constant factor conversion unit and then output; the third processing unit includes a third SDF structure; a radix-2 butterfly operation is performed on the current data received by the third SDF structure, the current data being the output of the second processing unit;
providing a twiddle factor for the ith multiplier module in each of the processing channels by the ith twiddle factor generator of m-1 sequentially arranged twiddle factor generators;
for each current processing channel among the 2^n processing channels, performing the operation through the m operation level modules and m-1 multiplier modules included in the current processing channel to obtain a group of first frequency domain data, and outputting the group of first frequency domain data to a twiddle factor multiplier module;
multiplying, by the twiddle factor multiplier module, the first frequency domain data from each of the 2^n processing channels by a twiddle factor to obtain 2^n groups of second frequency domain data, and outputting the second frequency domain data;
performing, by a radix-2^n butterfly unit, a radix-2^n operation on the 2^n groups of second frequency domain data to obtain third frequency domain data, and outputting the third frequency domain data to a parallel-to-serial module;
combining, by the parallel-to-serial module, the 2^n groups of third frequency domain data into serial fourth frequency domain data.
5. The method of claim 4, wherein the data type of the first time domain data and the second time domain data is a fixed point number.
CN202110267342.XA 2021-03-11 2021-03-11 FFT processor and method for processing data thereof Active CN112966209B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110267342.XA CN112966209B (en) 2021-03-11 2021-03-11 FFT processor and method for processing data thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110267342.XA CN112966209B (en) 2021-03-11 2021-03-11 FFT processor and method for processing data thereof

Publications (2)

Publication Number Publication Date
CN112966209A CN112966209A (en) 2021-06-15
CN112966209B true CN112966209B (en) 2023-05-05

Family

ID=76277290

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110267342.XA Active CN112966209B (en) 2021-03-11 2021-03-11 FFT processor and method for processing data thereof

Country Status (1)

Country Link
CN (1) CN112966209B (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050015420A1 (en) * 2003-07-18 2005-01-20 Gibb Sean G. Recoded radix-2 pipeline FFT processor
CN107391439B (en) * 2017-07-11 2020-08-14 创耀(苏州)通信科技股份有限公司 Processing method capable of configuring fast Fourier transform
CN109558638B (en) * 2018-10-23 2023-03-24 清华大学 FFT processor
CN109522674A (en) * 2018-12-05 2019-03-26 天津大学 The Fast Fourier Transform (FFT) hardware design methods of base 2-2 algorithm
CN110765709B (en) * 2019-10-15 2023-08-29 天津大学 FPGA-based base 2-2 fast Fourier transform hardware design method
CN112100568B (en) * 2020-06-29 2022-11-25 北京理工大学 Fixed-point Fourier transform FFT processor and processing method

Also Published As

Publication number Publication date
CN112966209A (en) 2021-06-15

Similar Documents

Publication Publication Date Title
CN110765709B (en) FPGA-based base 2-2 fast Fourier transform hardware design method
US10706353B2 (en) Integrated circuit
CN102680945A (en) Doppler modulation frequency estimation method based on field programmable gate array (FPGA)
CN110361691A (en) Coherent DOA based on nonuniform noise estimates FPGA implementation method
CN101847137B (en) FFT processor for realizing 2FFT-based calculation
Wang et al. K-means clustering for multispectral images using floating-point divide
CN112966209B (en) FFT processor and method for processing data thereof
Xiao et al. FPGA-based scalable and highly concurrent convolutional neural network acceleration
CN102364456A (en) 64-point fast Fourier transform (FFT) calculator
CN113203997B (en) FPGA-based radar super-resolution direction finding method, system and application
Hou et al. An FPGA-based multi-core system for synthetic aperture radar data processing
De Souza et al. A radio astronomy correlator optimized for the Xilinx Virtex-4 SX FPGA
Yang et al. A efficient design of a real-time FFT architecture based on FPGA
CN113778378A (en) Device and method for solving complex number N-degree square root
CN113778379A (en) CORDIC-based low-complexity hardware system and application method
KR20230053405A (en) Range-doppler algorithm based sar imaging apparatus and method thereof
CN107193784B (en) High-precision low-hardware-complexity sinc interpolation implementation method and system
CN110532510B (en) Generator for generating twiddle factor and correction factor
CN112835073A (en) FFT (fast Fourier transform) processor for satellite signal acquisition
CN113378109A (en) Mixed base fast Fourier transform calculation circuit based on memory calculation
CN113094970A (en) Function calculation accelerating device based on Taylor expansion
Lin et al. A Real-Time High-Resolution Variable-Size Imaging Processor for Spaceborne Synthetic Aperture Radar
Shashikala et al. Efficient Implementation of Radix-2 FFT Architecture using CORDIC for Signal Processing Applications
Ghosh et al. Design of an application specific VLSI chip for image rotation
Shrestha et al. Multi-level domain-decomposition strategy for solving the eikonal equation with the fast-sweeping method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant