CN205486097U - FFT device based on FPGA - Google Patents
- Publication number
- CN205486097U (CN201620035015.6U)
- Authority
- CN
- China
- Prior art keywords
- data
- butterfly computation
- fft
- butterfly
- base
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Complex Calculations (AREA)
Abstract
The utility model relates to an FPGA-based FFT device comprising a cache module, a control module and a radix-4 butterfly unit. The control module is connected to the cache module and to the radix-4 butterfly unit respectively; it controls the input and output of data, controls the buffering of data into the cache module in ping-pong fashion, and controls the data so that the FFT operation is completed in the radix-4 butterfly unit by means of cyclic addressing. The cache module holds the first 3/4 of the initial input data and the last 3/4 of the output results, and stores intermediate data. The radix-4 butterfly unit takes the last 1/4 of the initial input data and outputs the first 1/4 of the results. The utility model improves operation speed and, with a DSP multiplier count comparable to that of an FFT device using radix-2 butterfly units, reduces the total depth of the RAM used to store data.
Description
Technical field
This utility model relates to the field of digital signal processing, and in particular to an FPGA-based FFT device.
Background technology
In wireless communication systems, the fast Fourier transform (FFT) is commonly used to transform an input time-domain signal so that its frequency-domain waveform can be observed and its frequency-domain characteristics obtained. OFDM uses the inverse discrete Fourier transform and the discrete Fourier transform (IDFT/DFT) in place of multi-carrier modulation and demodulation: the transmitter applies an IFFT to the data to be modulated, and the receiver applies an FFT to the received data to demodulate it, which greatly reduces the complexity of implementing the system.
FPGAs handle parallelism and speed requirements well, and they are flexible to configure and easy to upgrade, so they are a common way of implementing the FFT. For example, the Virtex6 family chips of Xilinx provide multiple computing units called DSP Slices inside the FPGA, as well as readable and writable LUT units and dual-port RAM units. The FFT soft IP core for the Xilinx Virtex6 family currently offers four modes: Pipelined, Streaming I/O; Radix-4, Burst I/O; Radix-2, Burst I/O; and Radix-2 Lite, Burst I/O. Structurally these fall into two kinds, Pipelined and Burst. The implementation of the two structures is briefly described below:
(1) Pipelined, Streaming I/O.
The Pipelined, Streaming I/O structure achieves continuous data processing with a pipeline of radix-2 butterfly processing engines. Each processing engine has its own memory block that stores input data and intermediate data.
(2) Radix-4, Burst I/O.
In the Radix-4, Burst I/O structure, the FFT IP core is implemented with a radix-4 butterfly processing engine.
With the Pipelined, Streaming I/O structure, the IP core can load the next frame of input data and output the transform results of the previous frame while it computes the transform of the current frame, so continuous input data yields continuous output after a fixed computation latency. Input data is in natural order; output data can be in bit-reversed or natural order. An 8-point FFT device with a radix-2 butterfly pipeline is described below as an example.
Radix-2 DIF performs butterfly operations on 2 points at a time. Data is buffered before it enters the computation so that the first half of the input data can be combined with the second half. The basic structure is as follows:
Suppose one data item is buffered per clock cycle, i.e. item 0 on the first clock, item 1 on the second clock, and so on. The input buffer then needs a RAM depth of 4: when the 5th item, "4", arrives, the buffered item "0" and item "4" go directly into the butterfly operation, and item "4" does not need to be stored. The final frequency-domain output is in bit-reversed order. The input/output order of the 8-point radix-2 pipelined FFT is shown in Table 1:
Table 1: Input/output order of the radix-2 pipelined FFT
Input (natural order) | Decimal | Output (bit-reversed order) | Decimal |
000 | 0 | 000 | 0 |
001 | 1 | 100 | 4 |
010 | 2 | 010 | 2 |
011 | 3 | 110 | 6 |
100 | 4 | 001 | 1 |
101 | 5 | 101 | 5 |
110 | 6 | 011 | 3 |
111 | 7 | 111 | 7 |
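The output column of Table 1 is the 3-bit reversal of the input index. The following minimal Python sketch reproduces that column; the helper name `bit_reverse` is illustrative and does not come from the patent.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Return `index` with its lowest `bits` bits written in reverse order."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the current lowest bit into the result
        index >>= 1
    return result


# Output position k of the 8-point radix-2 DIF pipeline carries frequency bin bit_reverse(k, 3).
output_order = [bit_reverse(k, 3) for k in range(8)]
print(output_order)
```

Because bit reversal is its own inverse, the same function maps a frequency bin back to its output position.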
According to the butterfly diagram, the 8-point radix-2 FFT has 3 stages. Buffering the data before computation needs a depth of 4, the intermediate data needs depths of 4 and 2, and natural-order output requires the points to be buffered first and then read out by address, which needs a depth of 8. The total RAM depth used is therefore 18. Each stage uses one butterfly unit, for 3 butterfly units in total; assuming each butterfly unit uses 3 DSP multipliers, 9 DSP multipliers are used in total.
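The totals in this paragraph are plain sums. The sketch below only restates the paragraph's own figures for the 8-point case; the variable names are illustrative.

```python
# RAM depth, using the figures quoted for the 8-point radix-2 pipeline.
input_buffer_depth = 4          # buffering before the first butterfly
intermediate_depths = [4, 2]    # intermediate data between stages
output_buffer_depth = 8         # reordering buffer for natural-order output
total_ram_depth = input_buffer_depth + sum(intermediate_depths) + output_buffer_depth

# DSP multipliers: one butterfly unit per stage.
stages = 3                      # log2(8) radix-2 stages
multipliers_per_butterfly = 3   # assumption stated in the text
total_multipliers = stages * multipliers_per_butterfly

print(total_ram_depth, total_multipliers)
```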
The radix-2 Pipelined, Streaming I/O structure places a butterfly unit and intermediate-data storage at every stage so that a fixed-point FFT can run on the data continuously. As the number of FFT points grows, the resources used grow with it. Because each stage uses only a radix-2 butterfly unit and the order in which the points are computed is fixed, extra RAM is needed when the last stage must output in natural order. Table 2 lists, for an FFT device using radix-2 butterfly units and operating in scaled mode, the total depth of the RAM used to store data and the number of DSP multipliers used for computation.
Table 2: Resources used by the radix-2 Pipelined, Streaming I/O structure
Utility model content
The technical problem to be solved by this utility model is that existing FFT devices make poor use of RAM when outputting data in natural order and therefore need more FPGA resources.
To solve this technical problem, the utility model proposes an FPGA-based pipelined FFT device. The FPGA-based pipelined FFT device comprises:
a cache module, a control module and a radix-4 butterfly unit;
the control module is connected to the cache module and to the radix-4 butterfly unit respectively, and is used to control the input and output of data, to control the buffering of data into the cache module in ping-pong fashion, and to control the data so that the FFT operation is completed in the radix-4 butterfly unit by means of cyclic addressing;
the cache module is used for the first 3/4 of the initial input data and the last 3/4 of the output results, and for storing intermediate data;
the radix-4 butterfly unit is used for the last 1/4 of the initial input data and outputs the first 1/4 of the results.
Optionally, the cache module is a plurality of dual-port RAMs or a plurality of single-port RAMs.
Optionally, the number of dual-port RAMs is 7 or 8, determined by the number of FFT points.
Optionally, the total depth of the RAMs is less than or equal to twice the number of FFT points.
Optionally, the number of radix-4 butterfly units is 1 or 2, determined by the number of FFT points.
The FPGA-based FFT device proposed by this utility model uses a radix-4 butterfly unit, which improves operation speed. Cyclic addressing removes the need for extra RAM for intermediate data, and no extra RAM is needed for natural-order output. With a DSP multiplier count comparable to that of an FFT device using radix-2 butterfly units, the total depth of the RAM used to store data is reduced, which improves RAM utilization and saves FPGA resources.
Brief description of the drawings
The features and advantages of this utility model can be understood more clearly with reference to the accompanying drawings. The drawings are schematic and should not be construed as limiting the utility model in any way. In the drawings:
Fig. 1 is a structural diagram of an FFT device using radix-2 butterfly units;
Fig. 2 is a structural diagram of the FPGA-based FFT device of one embodiment of this utility model;
Fig. 3 is a schematic diagram of the FPGA-based FFT device of one embodiment of this utility model;
Fig. 4 is a schematic diagram of the FPGA-based FFT method of one embodiment of this utility model.
Detailed description of the invention
Embodiments of this utility model are described in detail below with reference to the accompanying drawings.
Fig. 2 shows a structural diagram of the FPGA-based FFT device of one embodiment of this utility model.
As shown in Fig. 2, the FPGA-based FFT device of this embodiment comprises:
a cache module 1, a control module 2 and a radix-4 butterfly unit 3;
the control module 2 is connected to the cache module 1 and to the radix-4 butterfly unit 3 respectively, and is used to control the input and output of data, to control the buffering of data into the cache module 1 in ping-pong fashion, and to control the data so that the FFT operation is completed in the radix-4 butterfly unit 3 by means of cyclic addressing;
the cache module 1 is used for the first 3/4 of the initial input data and the last 3/4 of the output results, and for storing intermediate results;
the radix-4 butterfly unit 3 is used for the last 1/4 of the initial input data and outputs the first 1/4 of the results.
The FPGA-based FFT device of this embodiment uses a radix-4 butterfly unit, which improves operation speed. With cyclic addressing, no extra RAM is needed to store intermediate data, and no extra RAM is needed for natural-order output. With a DSP multiplier count comparable to that of an FFT device using radix-2 butterfly units, the total depth of the RAM used to store data is reduced, which improves RAM utilization and saves FPGA resources.
In an optional embodiment, the cache module is a plurality of dual-port RAMs or a plurality of single-port RAMs. In the FPGA-based FFT device, using dual-port RAMs for the cache module reduces the number of RAMs needed.
The number of dual-port RAMs is 7 or 8, determined by the number of FFT points.
The total depth of the RAMs is less than or equal to twice the number of FFT points.
The number of radix-4 butterfly units is 1 or 2, determined by the number of FFT points.
Fig. 3 is a schematic diagram of the FPGA-based FFT device of one embodiment of this utility model.
As shown in Fig. 3, this FFT device includes several dual-port RAMs, butterfly units and selectors, where the total RAM depth is at most 2 times the number of FFT points and the RAM width is the data width. At most 2 radix-4 butterfly units are provided. Every 8 RAM blocks can output 8 data items in parallel within one cycle, which keeps both radix-4 butterfly units fully used and improves operation speed.
Fig. 4 is a schematic diagram of the FPGA-based FFT method of one embodiment of this utility model. As shown in Fig. 4, the FFT method using the FPGA-based FFT device described above comprises:
S41: input the first frame of data sequentially; after stage 1 of the butterfly operation on the first frame is complete, use ping-pong buffering to input the second frame sequentially, and complete the M stages of butterfly operations on the first frame;
S42: output the butterfly results of the first frame in natural order while buffering and performing butterfly operations on the second frame;
S43: complete the M stages of butterfly operations on the second frame while using ping-pong buffering to buffer the third frame, and start stage 1 of the butterfly operation on the third frame;
S44: keep repeating the buffering, butterfly operation and result output to complete the butterfly operations on multiple frames;
where M is the number of butterfly stages and N is the number of FFT points, N=4^M; data is read and stored using cyclic addressing.
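To make the relation between N and the M butterfly stages concrete, here is a minimal software sketch of a generic textbook radix-4 decimation-in-frequency FFT computed in place. It is not the patent's hardware addressing scheme: the function name, the use of Python complex numbers and the final base-4 digit-reversal reordering are assumptions made for illustration.

```python
import cmath


def radix4_dif_fft(samples):
    """Radix-4 DIF FFT of a length-4**M sequence; returns bins in natural order."""
    x = list(samples)
    n = len(x)
    stages = 0
    while 4 ** stages < n:
        stages += 1
    if 4 ** stages != n:
        raise ValueError("length must be a power of 4")

    span = n  # size of the sub-transforms handled by the current stage
    while span >= 4:
        quarter = span // 4
        for start in range(0, n, span):
            for i in range(quarter):
                p0 = start + i
                p1, p2, p3 = p0 + quarter, p0 + 2 * quarter, p0 + 3 * quarter
                a, b, c, d = x[p0], x[p1], x[p2], x[p3]
                w = cmath.exp(-2j * cmath.pi * i / span)  # twiddle factor
                # Results overwrite the four inputs (in-place computation).
                x[p0] = a + b + c + d
                x[p1] = (a - 1j * b - c + 1j * d) * w
                x[p2] = (a - b + c - d) * w ** 2
                x[p3] = (a + 1j * b - c - 1j * d) * w ** 3
        span = quarter

    def digit_reverse(index):
        """Reverse the base-4 digits of `index` (M digits)."""
        reversed_index = 0
        for _ in range(stages):
            reversed_index = reversed_index * 4 + index % 4
            index //= 4
        return reversed_index

    # After M stages, position p holds bin digit_reverse(p); undo that ordering.
    return [x[digit_reverse(k)] for k in range(n)]


spectrum = radix4_dif_fft([1, 0, 0, 0])  # unit impulse: flat spectrum
```

The outer `while` loop runs exactly M times for N=4^M points, one pass per butterfly stage, and each pass reads and writes the same four positions, which is the in-place property the device relies on.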
Further, inputting the first frame sequentially, using ping-pong buffering to input the second frame sequentially after stage 1 of the butterfly operation on the first frame is complete, completing the M stages of butterfly operations on the first frame, and outputting the butterfly results of the first frame in natural order while buffering and performing butterfly operations on the second frame comprises:
sequentially inputting the first 3/4 of the first frame into the first part of the cache module; when the last 1/4 of the first frame arrives at the radix-4 butterfly unit, performing the butterfly operation directly with the data in the cache module according to the butterfly diagram, and saving the stage-1 results to the first part of the cache module;
completing the M stages of butterfly operations on the first frame, with the radix-4 butterfly unit outputting the first 1/4 of the first frame's results in natural order and the last 3/4 of the results being saved to the first part of the cache module; using ping-pong buffering to input the first 3/4 of the second frame sequentially into the second part of the cache module; when the last 1/4 of the second frame arrives at the radix-4 butterfly unit, performing the butterfly operation directly with the data in the cache module according to the butterfly diagram;
outputting the last 3/4 of the first frame's results in natural order from the first part of the cache module;
correspondingly, data in the cache module is read and stored using cyclic addressing.
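The pairing between an arriving sample in the last quarter of a frame and the three buffered samples can be written down directly. The sketch below uses 1-based sample positions, as the description's worked 4096-point example does; the function name is illustrative.

```python
def first_stage_partners(arrival: int, n_points: int) -> tuple:
    """1-based positions of the three buffered samples combined with sample `arrival`."""
    quarter = n_points // 4
    if not 3 * quarter < arrival <= n_points:
        raise ValueError("only samples in the last quarter of the frame trigger a butterfly")
    return (arrival - 3 * quarter, arrival - 2 * quarter, arrival - quarter)


print(first_stage_partners(3073, 4096))  # first sample of the last quarter
print(first_stage_partners(4096, 4096))  # last sample of the frame
```

For a 4096-point frame this pairs the 3073rd sample with the 1st, 1025th and 2049th, and the 4096th sample with the 1024th, 2048th and 3072nd.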
The ping-pong buffering process in this FPGA-based FFT method is illustrated below with a specific example.
Suppose one frame of serial data undergoes a 4096-point FFT using radix-4 DIF, and the RAMs used are RAM1-14 in Fig. 3. (Note that RAM1-14 in Fig. 3 are single-port RAMs, and the ping-pong buffering process below is also described for single-port RAMs; a 4096-point FFT can also use 7 dual-port RAMs, with a process and operating principle similar to those for single-port RAMs.) The process is as follows:
(1) Buffer input serial data frame 0. The buffer space is set to 3/4 of the number of points, i.e. 4096*0.75=3072, i.e. buffering up to RAM6.
(2) When the 3073rd data item arrives, it goes directly into a radix-4 butterfly operation with the 1st, 1025th and 2049th data items already in the buffer RAMs, according to the butterfly diagram. The results are stored in RAM1~RAM8.
(3) When the 3074th data item arrives, it goes directly into a radix-4 butterfly operation with the 2nd, 1026th and 2050th data items in the buffer, according to the butterfly diagram, and the results are stored in RAM1~RAM8.
(4) When the 3075th data item arrives, it goes directly into a radix-4 butterfly operation with the 3rd, 1027th and 2051st data items in the buffer, according to the butterfly diagram, and the results are stored in RAM1~RAM8.
When the 3076th data item arrives ....
When the 4096th data item arrives, it goes directly into a radix-4 butterfly operation with the 1024th, 2048th and 3072nd data items in the buffer, according to the butterfly diagram, and the results are stored in the buffer RAMs. All stage-1 butterfly operations are now complete. The data that has completed stage 1 is held in RAM1~RAM8.
(5) Buffer the next input frame, frame 1, with the buffer space starting from RAM9. During these 1024 clock cycles, 2 radix-4 butterfly units can continue processing the data in buffer RAMs 1~6, reading 8 points from the buffer per clock cycle for butterfly operations. In 1024 cycles this completes 1024*8=8192 points, i.e. 8192/4096=2 stages of butterfly operations. The data, now through stage 3, is stored back to RAM1~RAM8, giving in-place computation.
(6) Continue buffering frame 1, with RAM11 and RAM12 as the buffer space. During these 1024 clock cycles, 2 radix-4 butterfly units can continue processing the data in buffer RAMs 1~6, reading 8 points from the buffer per clock cycle for butterfly operations. In 1024 cycles this completes 1024*8=8192 points, i.e. 8192/4096=2 stages of butterfly operations. The data, now through stage 5, is stored back to RAM1~RAM8, giving in-place computation.
(7) Continue buffering frame 1, with RAM13 and RAM14 as the buffer space. During these 1024 clock cycles, 1 radix-4 butterfly unit can continue processing the data in buffer RAMs 7~14, reading 4 points from the buffer per clock cycle for butterfly operations. In 1024 cycles this completes 1024*4=4096 points, i.e. 4096/4096=1 stage of butterfly operations. The data, now through stage 6, is stored back to RAM1~RAM8, giving in-place computation. Because this is the last stage, the stage-6 results can be output directly during the computation; when all computation is complete, 1/4 of the results have been output.
(8) Perform stage 1 on frame 1 and store the results in RAM1~2 and RAM9~14, while RAM3 outputs the results of the previous frame, frame 0.
(9) Buffer into RAM3 and RAM4; the data buffered is the next frame, frame 2. Meanwhile RAM5 outputs the previous frame's results, and during this time frame 1 completes stage 3 using two butterfly units.
(10) Buffer into RAM5 and RAM6, while RAM7 starts outputting frame 0 data and frame 1 completes stage 5.
(11) Buffer into RAM7 and RAM8; frame 1 completes stage 6 and is output.
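The cycle counts in steps (5) to (7) can be checked with a few lines of arithmetic. The sketch below assumes, as the steps suggest, that each phase lasts the 1024 clock cycles in which one quarter of the next frame arrives; the names are illustrative.

```python
N_POINTS = 4096
CYCLES_PER_PHASE = N_POINTS // 4   # 1024 cycles per phase, as in steps (5) to (7)
POINTS_PER_BUTTERFLY = 4           # a radix-4 butterfly consumes 4 points per cycle


def stages_per_phase(butterfly_units: int) -> int:
    """Number of complete butterfly stages that fit into one 1024-cycle phase."""
    points_processed = CYCLES_PER_PHASE * butterfly_units * POINTS_PER_BUTTERFLY
    return points_processed // N_POINTS


# Stage 1 runs while the last quarter of the frame arrives, then steps (5), (6), (7).
stages_done = 1 + stages_per_phase(2) + stages_per_phase(2) + stages_per_phase(1)
print(stages_done)
```

The total matches the 6 stages a 4096-point radix-4 FFT needs, since 4^6=4096.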
Further, the cyclic addressing comprises:
performing the stage-1 butterfly operation and saving the stage-1 results to the cache module by cyclic addressing;
performing the intermediate-stage butterfly operations, reading the data in the cache module by cyclic addressing, and saving the intermediate-stage results to the cache module by cyclic addressing;
performing the last-stage butterfly operation, saving the results to the cache module by cyclic addressing, and reading the data in the cache module in turn to output the butterfly results in natural order.
Specifically, performing the stage-1 butterfly operation and saving the stage-1 results to the cache module by cyclic addressing comprises:
performing the stage-1 butterfly operation and dividing the stage-1 results into 16 groups; storing groups 0-3 of the 16 groups in turn into the first RAM, the second RAM, the third RAM and the fourth RAM; storing groups 4-7 in turn into the second RAM, the third RAM, the fourth RAM and the first RAM; storing groups 8-11 in turn into the third RAM, the fourth RAM, the first RAM and the second RAM; and storing groups 12-15 in turn into the fourth RAM, the first RAM, the second RAM and the third RAM.
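The storage pattern for the 16 result groups is a rotation by one RAM per block of four groups. A minimal sketch, with an illustrative function name and 1-based RAM numbers:

```python
def ram_for_group(group: int, ram_count: int = 4) -> int:
    """1-based number of the RAM that receives result group `group` (0..15)."""
    block, offset = divmod(group, ram_count)  # block 0 holds groups 0-3, block 1 groups 4-7, ...
    return (block + offset) % ram_count + 1   # each block starts one RAM further along


layout = [[ram_for_group(4 * block + k) for k in range(4)] for block in range(4)]
for row in layout:
    print(row)
```

Each row and each column of `layout` contains every RAM exactly once, so four groups that are later needed in the same cycle never sit in the same RAM.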
Specifically, performing the intermediate-stage butterfly operations, reading the data in the cache module by cyclic addressing, and saving the intermediate-stage results to the cache module by cyclic addressing comprises:
performing the intermediate-stage butterfly operation, in which data read from the cache module by cyclic addressing is input to the first port, second port, third port and fourth port of the radix-4 butterfly unit; data read from the cache module by cyclic addressing is input to the second port, third port, fourth port and first port of the radix-4 butterfly unit; data read from the cache module by cyclic addressing is input to the third port, fourth port, first port and second port of the radix-4 butterfly unit; and data read from the cache module by cyclic addressing is input to the fourth port, first port, second port and third port of the radix-4 butterfly unit;
where the length of each input-port switch is 1/4^M×N;
dividing the butterfly results of each intermediate stage into 16 groups and saving them to the cache module by cyclic addressing.
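The worked addressing example later in the description gives the concrete read pattern for stage 2: each RAM reads four address segments of length N/16, starting one segment later than the previous RAM and wrapping around, and each segment feeds a fixed input port. The sketch below covers only that stage-2 case; the function name and the tuple layout are assumptions made for illustration.

```python
def stage2_read_plan(n_points: int) -> dict:
    """Map RAM number -> list of (first_address, last_address, input_port), in read order."""
    segment = n_points // 16  # data length between input-port switches at stage 2
    plan = {}
    for ram in range(1, 5):
        reads = []
        for slot in range(4):
            seg = (ram - 1 + slot) % 4  # RAM r starts at segment r-1 and wraps around
            reads.append((seg * segment, (seg + 1) * segment - 1, seg + 1))
        plan[ram] = reads
    return plan


plan = stage2_read_plan(4096)
for ram, reads in plan.items():
    print(ram, reads)
```

In every time slot the four RAMs feed four different ports, which is what lets one radix-4 butterfly unit receive all four inputs in the same cycle.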
Specifically, performing the last-stage butterfly operation, saving the results to the cache module by cyclic addressing, and reading the data in the cache module in turn to output the butterfly results in natural order comprises:
performing the last-stage butterfly operation, saving the data of the first port of the radix-4 butterfly unit to the first RAM, the data of the second port to the third RAM, the data of the third port to the second RAM, and the data of the fourth port to the fourth RAM;
where the data in the cache module is divided into groups over multiple levels until each group contains 1 data item;
reading the data in the cache module in turn and outputting the butterfly results in natural order.
The cyclic addressing process in the FPGA-based FFT method is illustrated below with a specific example. (The method described here uses one butterfly unit; the method with two butterfly units is the same.)
(1) Input the N data points sequentially into the RAMs; once 3/4 of the data has been input to the RAMs, start the stage-1 addressing computation.
(2) Stage-1 addressing: RAM1~3 read out data sequentially at addresses 0~(1/4*N-1) as the first 3 inputs of the butterfly unit; the 4th input of the butterfly unit is the data arriving directly. After computation, data 0~(1/16*N-1) of butterfly output ports 1~4 are stored in turn in RAM1, 2, 3, 4; sequence numbers (1/16*N)~(1/8*N-1) are stored in turn in RAM2, 3, 4, 1; sequence numbers (1/8*N)~(3/16*N-1) are stored in turn in RAM3, 4, 1, 2; and sequence numbers (3/16*N)~(1/4*N-1) are stored in turn in RAM4, 1, 2, 3.
(3) Stage-2 addressing: RAM1 reads the data at addresses 0~(1/16*N-1), (1/16*N)~(1/8*N-1), (1/8*N)~(3/16*N-1), (3/16*N)~(1/4*N-1) as the data for butterfly input ports 1, 2, 3, 4 respectively. At the same time RAM2 reads the data at addresses (1/16*N)~(1/8*N-1), (1/8*N)~(3/16*N-1), (3/16*N)~(1/4*N-1), 0~(1/16*N-1) as the data for butterfly input ports 2, 3, 4, 1. RAM3 reads the data at addresses (1/8*N)~(3/16*N-1), (3/16*N)~(1/4*N-1), 0~(1/16*N-1), (1/16*N)~(1/8*N-1) as the data for butterfly input ports 3, 4, 1, 2. RAM4 reads the data at addresses (3/16*N)~(1/4*N-1), 0~(1/16*N-1), (1/16*N)~(1/8*N-1), (1/8*N)~(3/16*N-1) as the data for butterfly input ports 4, 1, 2, 3. After computation, data 0~(1/64*N-1) of butterfly output ports 1~4 are stored in turn in RAM1, 2, 3, 4; sequence numbers (1/64*N)~(1/32*N-1) are stored in turn in RAM2, 3, 4, 1; sequence numbers (1/32*N)~(3/64*N-1) are stored in turn in RAM3, 4, 1, 2; and sequence numbers (3/64*N)~(1/16*N-1) are stored in turn in RAM4, 1, 2, 3. The remaining data is handled in the same way: sequence numbers (1/16*N)~(5/64*N-1) are stored in turn in RAM1, 2, 3, 4; sequence numbers (5/64*N)~(6/64*N-1) in RAM2, 3, 4, 1; sequence numbers (6/64*N)~(7/64*N-1) in RAM3, 4, 1, 2; sequence numbers (7/64*N)~(8/64*N-1) in RAM4, 1, 2, 3; ....
(4) Stage-3 addressing: RAM1 reads the data at addresses 0~(1/64*N-1), (1/64*N)~(2/64*N-1), (2/64*N)~(3/64*N-1), (3/64*N)~(4/64*N-1) as the data for butterfly input ports 1, 2, 3, 4 respectively. At the same time RAM2 reads the data at addresses (4/64*N)~(5/64*N-1), (5/64*N)~(6/64*N-1), (6/64*N)~(7/64*N-1), (7/64*N)~(8/64*N-1) as the data for butterfly input ports 2, 3, 4, 1 respectively. RAM3 reads the data at addresses (8/64*N)~(9/64*N-1), (9/64*N)~(10/64*N-1), (10/64*N)~(11/64*N-1), (11/64*N)~(12/64*N-1) as the data for butterfly input ports 3, 4, 1, 2 respectively. RAM4 reads the data at addresses (12/64*N)~(13/64*N-1), (13/64*N)~(14/64*N-1), (14/64*N)~(15/64*N-1), (15/64*N)~(16/64*N-1) as the data for butterfly input ports 4, 1, 2, 3 respectively. The data at the remaining addresses is handled in the same way. After computation, data 0~(1/256*N-1) of butterfly output ports 1~4 are stored in turn in RAM1, 2, 3, 4; sequence numbers (1/256*N)~(2/256*N-1) are stored in turn in RAM2, 3, 4, 1; sequence numbers (2/256*N)~(3/256*N-1) are stored in turn in RAM3, 4, 1, 2; and sequence numbers (3/256*N)~(4/256*N-1) are stored in turn in RAM4, 1, 2, 3. The remaining data is handled in the same way: sequence numbers (4/256*N)~(5/256*N-1) are stored in turn in RAM1, 2, 3, 4; sequence numbers (5/256*N)~(6/256*N-1) in RAM2, 3, 4, 1; sequence numbers (6/256*N)~(7/256*N-1) in RAM3, 4, 1, 2; sequence numbers (7/256*N)~(8/256*N-1) in RAM4, 1, 2, 3; ....
(5) Stage 4, 5, 6, 7 addressing ....
(6) Last-stage addressing: first, RAM1 reads addresses 0, 2/16*N, 3/16*N, 1/16*N in turn, and the data read is the port 1 input of the butterfly unit. At the same time RAM2 reads addresses (a+a1...), (2/16*N+a+a1...), (3/16*N+a+a1...), (1/16*N+a+a1...) in turn, and the data read is the port 2 input of the butterfly unit. RAM3 reads addresses 2*(a+a1...), [2/16*N+2*(a+a1...)], [3/16*N+2*(a+a1...)], [1/16*N+2*(a+a1...)] in turn, and the data read is the port 3 input of the butterfly unit. RAM4 reads addresses 3*(a+a1...), [2/16*N+3*(a+a1...)], [3/16*N+3*(a+a1...)], [1/16*N+3*(a+a1...)] in turn, and the data read is the port 4 input of the butterfly unit. The port 1 result is output directly as final output data, the port 2 data is stored in place in RAM3, the port 3 data is stored in place in RAM2, and the port 4 data is stored in place in RAM4.
Next, RAM1 is read again at addresses 2/64*N, (2/16*N+2/64*N), (3/16*N+2/64*N), (1/16*N+2/64*N); RAM2 is read at addresses (2/64*N+a+a1...), [(2/16*N+2/64*N)+a+a1...], [(3/16*N+2/64*N)+a+a1...], [(1/16*N+2/64*N)+a+a1...]; RAM3 is read at addresses [2/64*N+2*(a+a1...)], [(2/16*N+2/64*N)+2*(a+a1...)], [(3/16*N+2/64*N)+2*(a+a1...)], [(1/16*N+2/64*N)+2*(a+a1...)]; and RAM4 is read at addresses [2/64*N+3*(a+a1...)], [(2/16*N+2/64*N)+3*(a+a1...)], [(3/16*N+2/64*N)+3*(a+a1...)], [(1/16*N+2/64*N)+3*(a+a1...)], with the data read likewise serving as the port 4 input of the butterfly unit. The port 1 result is output directly as final output data, the port 2 data is stored in place in RAM3, the port 3 data is stored in place in RAM2, and the port 4 data is stored in place in RAM4, and so on.
Here a, a1, ... depend on the number of stages. If the last stage is stage 3, i.e. 64 points, then a=4, a1=1. If the last stage is stage 4, i.e. 256 points, then a=16, a1=4, a2=1. If the last stage is stage M, i.e. 4^M points, then a=4^M/16, a1=4^M/64...aM-1=1.
(7) After the last-stage computation is complete, 1/4*N of the data has already been output in natural order; the data in RAM2~4 is then output in turn.
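The offsets a, a1, ... used in step (6) follow a simple pattern in the two examples given there (a=4, a1=1 for 64 points; a=16, a1=4, a2=1 for 256 points). The sketch below generates that list for 4^M points. It assumes the list has M-1 entries ending in 1, which is what both examples show; the function name is illustrative.

```python
def last_stage_offsets(stages: int) -> list:
    """Offsets a, a1, ... for a 4**stages-point FFT, following the examples in step (6)."""
    if stages < 2:
        raise ValueError("the pattern in the examples needs at least 2 stages")
    n_points = 4 ** stages
    # a = N/16, a1 = N/64, ... down to 1
    return [n_points // 4 ** (k + 2) for k in range(stages - 1)]


print(last_stage_offsets(3))  # 64-point case
print(last_stage_offsets(4))  # 256-point case

# RAM2's first read address in step (6) is the sum a + a1 + ...
ram2_first_address = sum(last_stage_offsets(3))
```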
Summarizing the addressing:
The stage-1 data can be read sequentially from each RAM and fed in order to the 4 ports of the butterfly unit for computation. The output data of the computation is divided into 16 groups (each butterfly unit outputs 4 groups of data at a time, and each butterfly output port produces 4 groups of data), which are stored in turn in RAM1, 2, 3, 4; RAM2, 3, 4, 1; RAM3, 4, 1, 2; and RAM4, 1, 2, 3.
When addressing intermediate-stage data, RAM1 always starts reading from address 0, and the data read is fed to input ports 1, 2, 3, 4 of the butterfly unit in turn, cyclically. The data length between input-port switches is 1/4*N for stage 1, 1/16*N for stage 2, ..., and 1/4^M*N for stage M, where N=4^M. The starting read address of RAM2 is a1+a2...: for stage 1, a1=1/4*N and a2, a3...=0; for stage 2, a1=1/4*N, a2=1/16*N and a3...=0; for stage M, a1=1/4*N, a2=1/16*N...aM=1/4^M*N. Addresses are then read sequentially; when the maximum address is reached, addressing returns to address 0 and continues until one full pass over the whole address space has been completed. The starting read addresses of RAM3 and RAM4 are 2(a1+a2...) and 3(a1+a2...) respectively, and they otherwise operate like RAM2. When the butterfly operation is complete and writing starts, the write addresses are the same as the read addresses, giving in-place storage. Note that the 4 groups of data from each butterfly output port must be placed in different RAMs, and that the length of each group depends on the stage: 1/16*N for the stage-1 output, 1/64*N for the stage-2 output, .... The storage locations for port 1 data cycle through RAM1, 2, 3, 4; for port 2, RAM2, 3, 4, 1; for port 3, RAM3, 4, 1, 2; and for port 4, RAM4, 1, 2, 3.
When addressing last-stage data, the addressing rule follows the final output order
of the radix-4 butterfly flow graph, as follows. Step 1: the data in each RAM are
divided by address into 4 groups, called level-1 groups. Each group has depth
1/16*N, and the groups are numbered 1 to 4. RAM1 is then addressed at 0, 2/16*N,
3/16*N, 1/16*N, i.e. in the order of the first addresses of group 1, group 3,
group 2 and group 4; the other RAMs add a1+a2+... to these addresses. Step 2: each
level-1 group is again divided into 4 groups, called level-2 groups, each of depth
1/64*N and numbered 1 to 4. RAM1 is addressed in the order group 1, group 3,
group 2, group 4, and the other RAMs again add a1+a2+... on this basis (this
addresses the level-2 groups within group 1 of the first step). Step 3: each
level-2 group is again divided into 4 groups, and so on; grouping stops when each
group holds a single datum. After the butterfly computation, the data from output
port 1 are sent directly to the output port of the overall module, the data from
port 2 are stored in RAM3, the data from port 3 are stored in RAM2, and the data
from port 4 are stored in RAM4. Once the first 1/4*N data have been output after
the butterfly computation, the data in RAM2 to RAM4 are read out and output in order.
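The last-stage output routing described above can be sketched in Python as follows. This is a simplified illustration rather than the patent's circuit: the function name `last_stage_output` and the string labels are hypothetical, and the sketch stores values in arrival order, ignoring the grouped read addressing of steps 1 to 3.

```python
def last_stage_output(port_data: dict) -> list:
    """Return the module's output stream for the last stage.

    `port_data` maps each butterfly output port (1..4) to its results in time order.
    """
    port_to_ram = {2: 3, 3: 2, 4: 4}  # port 2 -> RAM3, port 3 -> RAM2, port 4 -> RAM4
    rams = {2: [], 3: [], 4: []}
    stream = []
    for values in zip(port_data[1], port_data[2], port_data[3], port_data[4]):
        stream.append(values[0])      # port 1 goes straight to the module output
        for port in (2, 3, 4):
            rams[port_to_ram[port]].append(values[port - 1])
    for ram in (2, 3, 4):             # afterwards RAM2, RAM3, RAM4 are read in order
        stream.extend(rams[ram])
    return stream


if __name__ == "__main__":
    demo = {1: ["a0", "a1"], 2: ["b0", "b1"], 3: ["c0", "c1"], 4: ["d0", "d1"]}
    print(last_stage_output(demo))
```

In this sketch the first quarter of the output stream comes directly from port 1, followed by the contents of RAM2 (port 3 data), RAM3 (port 2 data) and RAM4 (port 4 data).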
As the above process shows, the FPGA-based FFT method of this embodiment uses a
radix-4 butterfly computation device, which improves computation speed, and uses
ping-pong buffering and cyclic addressing to compute the data in place. No extra
RAM is needed to store intermediate data, and none is needed to output the data in
order. With a DSP multiplier count comparable to that of an FFT device using
radix-2 butterfly computation units (compare Table 2 and Table 3), the total depth
of the RAM storing data is reduced, which improves RAM utilization and saves FPGA
resources. Table 3 lists the total RAM depth occupied by the stored data and the
number of DSP multipliers used for computation, where the width of the storage RAM
is the data bit width:
Table 3: Resources occupied by the FFT device using the radix-4 butterfly computation device
Those skilled in the art should understand that embodiments of this utility model
may be provided as a system or a computer program product. Accordingly, the device
of this utility model may take the form of an entirely hardware embodiment. This
utility model is described with reference to flowcharts and/or block diagrams of
the equipment (system) according to its embodiments. Although preferred embodiments
of this utility model have been described, those skilled in the art, once they
grasp the basic inventive concept, can make further changes and modifications to
these embodiments. The appended claims are therefore intended to be construed as
covering the preferred embodiments and all changes and modifications that fall
within the scope of this utility model.
The FPGA-based FFT device proposed by this utility model uses a radix-4 butterfly
computation device, which improves computation speed. By using cyclic addressing,
it needs no extra RAM to store intermediate data and no extra RAM to output the
data in order. With a DSP multiplier count comparable to that of an FFT device
using radix-2 butterfly computation units, it reduces the total depth of the RAM
storing data, improves RAM utilization, and saves FPGA resources.
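For readers unfamiliar with the radix-4 butterfly, the following Python sketch shows the textbook 4-point butterfly and the resulting stage count. It is a generic illustration under stated assumptions, not the circuit of this utility model; the function names are hypothetical, and twiddle-factor multiplication between stages is omitted.

```python
import cmath


def radix4_butterfly(a, b, c, d):
    """4-point DFT of (a, b, c, d) using only additions and multiplications by +/-j."""
    return (
        a + b + c + d,
        a - 1j * b - c + 1j * d,
        a - b + c - d,
        a + 1j * b - c - 1j * d,
    )


def direct_dft(x):
    """Reference DFT computed directly from the definition."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
        for k in range(n)
    ]


def stage_count(n_points: int, radix: int) -> int:
    """Number of butterfly stages for an N-point FFT, N being a power of `radix`."""
    count = 0
    while n_points > 1:
        n_points //= radix
        count += 1
    return count


if __name__ == "__main__":
    print(radix4_butterfly(1, 2, 3, 4))
    print(stage_count(1024, 4), stage_count(1024, 2))
```

A 1024-point transform needs 5 radix-4 stages against 10 radix-2 stages, which is one common reason a radix-4 design completes in fewer passes over the data.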
Although embodiments of this utility model have been described with reference to the
accompanying drawings, those skilled in the art can make various modifications and
variations without departing from the spirit and scope of this utility model, and
all such modifications and variations fall within the scope defined by the appended claims.
Claims (5)
1. An FPGA-based FFT device, characterized in that it comprises:
a cache module (1), a control module (2) and a radix-4 butterfly computation device (3);
the control module (2) is connected to the cache module (1) and to the radix-4
butterfly computation device (3) respectively, and is used to control the input and
output of data, to control the caching of data into the cache module (1) by
ping-pong buffering, and to control the data, by cyclic addressing, so that the FFT
computation is completed in the radix-4 butterfly computation device (3);
the cache module (1) is used to receive the first 3/4 of the initially input data,
to output the last 3/4 of the computation results, and to store intermediate data;
the radix-4 butterfly computation device (3) is used to receive the last 1/4 of the
initially input data and to output the first 1/4 of the computation results.
2. The FPGA-based FFT device according to claim 1, characterized in that
the cache module (1) is a plurality of dual-port RAMs or a plurality of single-port RAMs.
3. The FPGA-based FFT device according to claim 2, characterized in that
the number of dual-port RAMs is 7 or 8, determined by the number of points of the FFT computation.
4. The FPGA-based FFT device according to claim 2, characterized in that
the total depth of the plurality of RAMs is less than or equal to twice the number of points of the FFT computation.
5. The FPGA-based FFT device according to claim 1, characterized in that
the number of radix-4 butterfly computation devices (3) is 1 or 2, determined by the number of points of the FFT computation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201620035015.6U CN205486097U (en) | 2016-01-14 | 2016-01-14 | FFT device based on FPGA |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201620035015.6U CN205486097U (en) | 2016-01-14 | 2016-01-14 | FFT device based on FPGA |
Publications (1)
Publication Number | Publication Date |
---|---|
CN205486097U (en) | 2016-08-17 |
Family
ID=56667937
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201620035015.6U (Expired - Fee Related) CN205486097U (en) | FFT device based on FPGA | 2016-01-14 | 2016-01-14 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN205486097U (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106970895A (en) * | 2016-01-14 | 2017-07-21 | 普天信息技术有限公司 | FFT device and methods based on FPGA |
CN106970895B (en) * | 2016-01-14 | 2023-10-03 | 普天信息技术有限公司 | FFT device and method based on FPGA |
CN107167713A (en) * | 2017-06-01 | 2017-09-15 | 贵州电网有限责任公司 | A kind of cable local discharge pulse signal time frequency analysis system and method based on FPGA |
CN107167713B (en) * | 2017-06-01 | 2020-02-18 | 贵州电网有限责任公司 | Cable partial discharge pulse signal time-frequency analysis system and method based on FPGA |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Jo et al. | New continuous-flow mixed-radix (CFMR) FFT processor using novel in-place strategy | |
CN103699515B (en) | FFT (fast Fourier transform) parallel processing device and FFT parallel processing method | |
CN107392309A (en) | A kind of general fixed-point number neutral net convolution accelerator hardware structure based on FPGA | |
US20100017452A1 (en) | Memory-based fft/ifft processor and design method for general sized memory-based fft processor | |
CN101847137B (en) | FFT processor for realizing 2FFT-based calculation | |
CN102063411A (en) | FFT/IFFT processor based on 802.11n | |
CN1655143A (en) | Fast Fourier transform processor and method using half-sized memory | |
US20060253514A1 (en) | Memory-based Fast Fourier Transform device | |
CN205486097U (en) | FFT device based on FPGA | |
CN101937423B (en) | Streamline FFT/IFFT processing system | |
US9767074B2 (en) | Method and device for fast fourier transform | |
Lin et al. | A low-power 64-point FFT/IFFT design for IEEE 802.11 a WLAN application | |
CN100594490C (en) | Memory control method and calculating device therefor | |
CN107133194A (en) | Configurable FFT/IFFT coprocessors based on hybrid radix | |
CN104268124B (en) | A kind of FFT realizes apparatus and method | |
CN112446330A (en) | Solar radio frequency spectrum analysis method and system based on multichannel FFT algorithm | |
CN103034621B (en) | The address mapping method of base 2 × K parallel FFT framework and system | |
CN107391439A (en) | A kind of processing method of configurable Fast Fourier Transform (FFT) | |
EP3340066A1 (en) | Fft accelerator | |
CN103176949B (en) | Realize circuit and the method for FFT/IFFT conversion | |
JP2005196787A (en) | Fast fourier transform device improved in processing speed and its processing method | |
CN106970895A (en) | FFT device and methods based on FPGA | |
CN101937332A (en) | Multiplier multiplexing method in base 2<4> algorithm-based multi-path FFT processor | |
CN103493039B (en) | Data processing method, data processing equipment, access device and subscriber equipment | |
CN114422315B (en) | Ultra-high throughput IFFT/FFT modulation and demodulation method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20160817 Termination date: 20220114 |