CN105718424A - Parallel and rapid Fourier conversion processing method - Google Patents

Info

Publication number
CN105718424A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610052233.5A
Other languages
Chinese (zh)
Other versions
CN105718424B (en)
Inventor
禹霁阳
汪路元
李欣
徐轲
郭丽明
冯国平
徐勇
李珂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Spacecraft System Engineering
Original Assignee
Beijing Institute of Spacecraft System Engineering
Application filed by Beijing Institute of Spacecraft System Engineering
Priority to CN201610052233.5A
Publication of CN105718424A
Application granted
Publication of CN105718424B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/14 Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Complex Calculations (AREA)

Abstract

The invention provides a parallel fast Fourier transform (FFT) processing method. The method divides a data sequence x(n) of $N=r^S$ points into vr secondary data blocks and obtains the FFT result of the N/(vr) data points in each secondary data block by base-r FFT calculation, where $v=r^Z$, r and S are arbitrary integers, and Z = 0, 1, ..., S-2. The number of data points N can therefore take more values, from which the scheme requiring the least zero padding can be selected, reducing storage use and computation time. Multi-butterfly parallel calculation is adopted with $v=r^Z$ parallel butterfly computing units, so the degree of parallelism can be selected according to the available hardware resources, giving greater flexibility.

Description

A parallel fast Fourier transform processing method
Technical field
The present invention relates to the field of FFT processor design, and in particular to a parallel fast Fourier transform processing method.
Background art
The fast Fourier transform (FFT) is a key component of digital signal processing in communication and radar systems, where it converts digital signals between the time domain and the frequency domain. In practical applications it enables functions such as spectrum analysis, digital filtering and correlation processing.
At present, multi-butterfly parallel FFT computation is mostly implemented with base-2 or base-4 structures under in-place storage, and mainly comprises memory access address generation, conflict-free data exchange and butterfly calculation. In conflict-free data exchange, if extra buffers are added, buffer usage grows with the number of points, and the large number of buffer reads and writes lengthens the overall computation. Current multi-butterfly parallel FFTs therefore mostly use a conflict-free memory access address generation module, which cancels the conflicts that arise when data are accessed before the butterfly calculation, so that the multi-butterfly parallel computation runs under conflict-free memory access.
Patent CN101504638 of the Beijing Institute of Technology discloses a variable-point pipelined FFT processor comprising a first 1024-point variable FFT processing module, a twiddle factor processing module, a second 1024-point variable FFT processing module, and a selection and control module. These four modules, together with an intermediate data storage module outside the processor, jointly complete the two-dimensional processing of an ultra-long FFT. Patent CN101847986A of ZTE Corporation discloses a circuit and method for implementing FFT/IFFT transforms. The method is: determine the number of iterations, the depths of the first and second RAMs, and the depth of the ROM; store the first and second halves of the input data to be transformed in the second and first RAM respectively; perform the iterative butterfly operations. In the first iteration, the first and second RAMs are read in bit-reversed order, even-numbered butterfly results are written to the first RAM, and odd-numbered butterfly results are written to the second RAM. In the other iterations, the first and second RAMs are read in normal bit order and written back in the same way as in the first iteration. The last iteration accesses the memories in normal order.
The above methods have the following main problems:
(1) The single fixed radix restricts the choice of Fourier transform point count in practical applications
In the prior-art FFT, when the point count is chosen for base 2 or base 4, in the worst case nearly as many zeros as data points must be appended to reach a length that is a power of 2. The computation then needs up to twice the storage space and more than twice the computation time.
(2) The single fixed radix restricts the choice of butterfly computing unit parallelism in practical applications
In the prior art, because the radix is a power of 2, computational efficiency is highest only when the parallelism of the butterfly computing units is also a power of 2. In practice the hardware resources may not meet this requirement, and lowering the parallelism to fit the design then significantly reduces the computing performance of the system.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the prior art by providing a parallel fast Fourier transform processor and processing method.
The above object of the present invention is achieved by the following scheme:
A parallel fast Fourier transform processing method, comprising the following steps:
(1) Apply first-level grouping to the received data sequence x(n): divide the N data points into v first-level data blocks, each containing N/v data points. The $n_1$-th data point of the $n_2$-th first-level data block is $x'(n_1,n_2)=x(n_1 v+n_2)$, where $n_1=0,1,\dots,N/v-1$ and $n_2=0,1,\dots,v-1$; $N=r^S$, $v=r^Z$, $Z=0,1,\dots,S-2$, and S and r are integers;
(2) Apply second-level grouping to each first-level data block obtained in step (1): divide each first-level data block into r secondary data blocks, each containing N/(vr) data points. The $n'_1$-th data point of the $n'_2$-th secondary data block of the $n_2$-th first-level data block is $x''(n'_1,n'_2,n_2)=x'(n'_1 r+n'_2,n_2)$, where $n'_1=0,1,\dots,N/(vr)-1$, $n'_2=0,1,\dots,r-1$ and $n_2=0,1,\dots,v-1$;
(3) Perform an N/(vr)-point FFT on the N/(vr) data points in each secondary data block. The FFT result for the $n'_2$-th secondary data block of the $n_2$-th first-level data block is $X''(k'_1,n'_2,n_2)=\sum_{n'_1=0}^{N/(vr)-1} x''(n'_1,n'_2,n_2)\,W_{N/(vr)}^{n'_1 k'_1}$, where $k'_1=0,1,\dots,N/(vr)-1$, $n'_2=0,1,\dots,r-1$, $n_2=0,1,\dots,v-1$ and $W_{N/(vr)}=e^{-j2\pi vr/N}$;
(4) Merge the FFT results of the r secondary data blocks in each first-level data block to obtain the FFT result of each first-level data block. The FFT result of the N/v data points in the $n_2$-th first-level data block is $X'(k_1,n_2)=\sum_{n'_2=0}^{r-1}\left[W_{N/v}^{n'_2 k'_1}X''(k'_1,n'_2,n_2)\right]W_r^{n'_2 k'_2}$, where $k_1=k'_1+[N/(vr)]k'_2$, $k'_2=0,1,\dots,r-1$ and $n_2=0,1,\dots,v-1$;
(5) Merge the FFT results of the v first-level data blocks to obtain the N-point FFT result of the data sequence x(n): $X(k)=\sum_{n_2=0}^{v-1}\left[W_N^{n_2 k_1}X'(k_1,n_2)\right]W_v^{n_2 k_2}$, where $k=k_1+(N/v)k_2$, $k_2=0,1,\dots,v-1$, $k=0,1,\dots,N-1$ and $W_N=e^{-j2\pi/N}$.
In the above parallel fast Fourier transform processing method, step (2) uses vr dual-port memories of N/(vr) points each to hold the data of the vr secondary data blocks. The vr dual-port memories are divided into v memory groups, corresponding to the v first-level data blocks; each memory group contains r dual-port memories, corresponding to the r secondary data blocks of one first-level data block. An S-digit base-r counter yields the storage address of each data point in the sequence x(n), that is, it determines the storage address Add_ID(n) at which the n-th data point is stored in the Ram_Id(n)-th memory of the Group_Id(n)-th memory group, n = 0, 1, ..., N-1, as follows:
The base-r representation of the counter value n is $(a_{S-1}a_{S-2}\dots a_{Z+1}a_Z a_{Z-1}\dots a_1 a_0)_r$, where $a_0$ to $a_{S-1}$ are digits 1 to S of that representation and each digit ranges from 0 to r-1. The base-r value of the memory group index Group_Id(n) is then $(a_{Z-1}\dots a_1 a_0)_r$, the base-r value of the memory index Ram_Id(n) is $(a_Z)_r$, and the base-r value of the storage address Add_ID(n) is $(a_{S-1}a_{S-2}\dots a_{Z+1})_r$.
In the above parallel fast Fourier transform processing method, step (3) uses base-r FFT calculation to obtain the FFT result of the N/(vr) data points in each secondary data block; step (4) uses base-r FFT calculation to merge the FFT results of the r secondary data blocks in each first-level data block; step (5) uses base-v FFT calculation to merge the FFT results of the v first-level data blocks.
In the above parallel fast Fourier transform processing method, step (3) uses v base-r butterfly units to perform the base-r FFT calculation on the vr secondary data blocks, and the r secondary data blocks in each first-level data block share 1 base-r butterfly computing unit through a time-division multiplexing system. The time-division multiplexing system comprises r serial-to-parallel conversion modules, a first-level gating control unit, a base-r butterfly computing unit, a second-level gating control unit and r parallel-to-serial conversion modules, wherein:
The r serial-to-parallel conversion modules are in one-to-one correspondence with the r secondary data blocks of one first-level data block. They read data from the memories of the r secondary data blocks, giving r serial data streams of N/(vr) data points each. Each serial-to-parallel conversion module then converts its serial stream of N/(vr) data points into r parallel data streams of N/(vr²) data points each;
The first-level gating control unit gates the parallel data output by the r serial-to-parallel conversion modules. Each time it selects the r parallel data streams output by 1 of the serial-to-parallel conversion modules and passes them to the base-r butterfly computing unit;
The base-r butterfly computing unit receives the r parallel data streams, performs the base-r FFT calculation, and outputs r parallel results to the second-level gating control unit;
The second-level gating control unit gates among the r parallel-to-serial conversion modules and passes the received r parallel FFT results to 1 of them. The index of the selected parallel-to-serial conversion module matches the index of the serial-to-parallel conversion module selected by the first-level gating control unit;
The r parallel-to-serial conversion modules are in one-to-one correspondence with the r secondary data blocks of one first-level data block. The parallel-to-serial conversion module selected by the second-level gating control unit receives the r parallel FFT results, converts them into 1 serial data stream, and stores that stream in the memory of the corresponding secondary data block. The storage positions are the same positions from which the serial-to-parallel conversion module read the data, which realizes in-place storage.
Compared with the prior art, the present invention has the following advantages:
(1) The prior art restricts the number of FFT data points N to a power of 2, whereas the present invention uses a base-r fast Fourier transform with $N=r^S$ data points, where r and S are arbitrary integers. N can therefore take more values, and the value requiring the least zero padding can be selected, which reduces storage use and computation time;
(2) The present invention uses multi-butterfly parallel computation with $v=r^z$ parallel butterfly computing units, where z = 0, 1, ..., S-2 and r and S are arbitrary integers. The degree of parallelism can therefore be selected according to the available hardware resources, giving greater flexibility;
(3) When performing an $N=r^S$-point FFT with $v=r^z$ parallel butterfly computing units, the whole computation takes $r^{S-z-1}(S-z-r)+2r(S-z-1)$ cycles.
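Advantage (1) can be illustrated with a short calculation. The following is a minimal sketch, not part of the patent: the function names, the candidate radix list and the sample length of 600 are all hypothetical choices made for illustration. For each candidate base r it finds the smallest $N=r^S$ that holds the data and reports how many zeros would have to be padded.

```python
def smallest_power_at_least(length, r):
    """Return (S, N) with N = r**S the smallest power of r that is >= length."""
    s, n = 0, 1
    while n < length:
        n *= r
        s += 1
    return s, n


def padding_options(length, radices=(2, 3, 4, 5, 6, 7)):
    """Map each candidate base r to (S, N, zeros), where zeros = N - length."""
    options = {}
    for r in radices:
        s, n = smallest_power_at_least(length, r)
        options[r] = (s, n, n - length)
    return options


if __name__ == "__main__":
    length = 600  # hypothetical number of valid samples
    for r, (s, n, zeros) in padding_options(length).items():
        print(f"r={r}: N={r}^{s}={n}, zeros padded={zeros}")
```

For 600 samples, base 2 requires N = 1024 (424 zeros), while base 5 requires only N = 625 (25 zeros), which is the kind of saving the advantage describes.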
Brief description of the drawings
Fig. 1 is a flow chart of the parallel fast Fourier transform processing method of the present invention;
Fig. 2 is a schematic diagram of the conflict-free access address generation of the present invention;
Fig. 3 is a timing diagram of a base-r butterfly computing unit processing the data in a single dual-port memory;
Fig. 4 is a timing diagram of a time-division multiplexed base-r butterfly computing unit processing the data in multiple dual-port memories;
Fig. 5 is a block diagram of the base-r butterfly computing unit time-division multiplexing system of the present invention;
Fig. 6a is a schematic diagram of data reading and storage when r memories time-division multiplex 1 base-r butterfly computing unit to perform r sets of N/(vr)-point FFT calculations;
Fig. 6b is a schematic diagram of data reading and storage when r memories time-division multiplex 1 base-r butterfly computing unit to merge r sets of N/(vr)-point FFT results;
Fig. 6c is a schematic diagram of data reading and storage when v memory groups time-division multiplex 1 base-v butterfly computing unit to merge v sets of N/v-point FFT results.
Detailed description
The present invention is described in further detail below with reference to the drawings and a specific example:
(1) Theoretical derivation of the parallel FFT computation
1.1 Decomposition of the N-point FFT
Performing an N-point FFT on the received data sequence x(n) yields X(k), computed as follows:
$$X(k)=\sum_{n=0}^{N-1}x(n)\,W_N^{nk} \tag{1}$$
where $W_N=e^{-j2\pi/N}$, $k=0,1,\dots,N-1$ and $n=0,1,\dots,N-1$.
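Define v as a positive integer that divides N. Then n and k can be decomposed as:
$$\begin{cases} n=n_1 v+n_2,\\ k=k_1+(N/v)\,k_2, \end{cases} \tag{2}$$
where $n_1=0,1,\dots,N/v-1$, $n_2=0,1,\dots,v-1$, $k_1=0,1,\dots,N/v-1$ and $k_2=0,1,\dots,v-1$. Formula (2) maps the one-dimensional indices n and k in the interval [0, N-1] to the two-dimensional vectors $(n_1,n_2)$ and $(k_1,k_2)$, with $n_1,k_1\in[0,(N/v)-1]$ and $n_2,k_2\in[0,v-1]$. Formula (1) can then be rewritten as:
$$X(k)=X\big(k_1+(N/v)k_2\big)=\sum_{n_2=0}^{v-1}\left[W_N^{n_2 k_1}X'(k_1,n_2)\right]W_v^{n_2 k_2} \tag{3}$$
where $X'(k_1,n_2)=\mathrm{FFT}[x'(n_1,n_2),\,N/v]$ is the (N/v)-point FFT over $n_1$ of $x'(n_1,n_2)=x(n_1 v+n_2)$.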
Formula (3) shows that an N-point FFT can be decomposed into v-point FFTs and N/v-point FFTs. In the same way, the N/v-point FFT can be further decomposed into FFTs with fewer points. Define r as a positive integer that divides N; then $n_1$ and $k_1$ can be decomposed as:
$$\begin{cases} n_1=n'_1 r+n'_2,\\ k_1=k'_1+[N/(vr)]\,k'_2, \end{cases} \tag{4}$$
where $n'_1=0,1,\dots,N/(vr)-1$, $n'_2=0,1,\dots,r-1$, $k'_1=0,1,\dots,N/(vr)-1$ and $k'_2=0,1,\dots,r-1$. Formula (4) maps the one-dimensional indices $n_1$ and $k_1$ in the interval [0, N/v-1] to two-dimensional vectors, with $n'_1,k'_1\in[0,N/(vr)-1]$ and $n'_2,k'_2\in[0,r-1]$.
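Using the decomposition of formula (4), $X'(k_1,n_2)$ can be arranged as follows:
$$X'(k_1,n_2)=X'\big(k'_1+[N/(vr)]k'_2,\,n_2\big)=\sum_{n'_2=0}^{r-1}\left[W_{N/v}^{n'_2 k'_1}X''(k'_1,n'_2,n_2)\right]W_r^{n'_2 k'_2} \tag{5}$$
where $X''(k'_1,n'_2,n_2)=\mathrm{FFT}[x''(n'_1,n'_2,n_2),\,N/(vr)]$ is the N/(vr)-point FFT over $n'_1$.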
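One level of this decomposition can be checked numerically. The sketch below is illustrative only and assumes the index mapping $n=n_1 v+n_2$, $k=k_1+(N/v)k_2$ (the same pattern that formula (4) applies one level down with N/v and r in place of N and v). A direct O(N²) DFT stands in for the FFT, and the function names are hypothetical.

```python
import cmath


def dft(x):
    """Direct DFT: X(k) = sum_n x(n) * W_N^(n*k), with W_N = exp(-2j*pi/N)."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_pts)
                for n in range(n_pts))
            for k in range(n_pts)]


def dft_two_level(x, v):
    """N-point DFT built from v transforms of N/v points plus a v-point merge."""
    n_pts = len(x)
    assert n_pts % v == 0, "v must divide N"
    q = n_pts // v  # N/v points per sub-sequence
    # X'(k1, n2): (N/v)-point DFT of the sub-sequence x(n1*v + n2)
    x_prime = [dft(x[n2::v]) for n2 in range(v)]
    out = [0j] * n_pts
    for k2 in range(v):
        for k1 in range(q):
            acc = 0j
            for n2 in range(v):
                twiddle = cmath.exp(-2j * cmath.pi * n2 * k1 / n_pts)  # W_N^(n2*k1)
                merge = cmath.exp(-2j * cmath.pi * n2 * k2 / v)        # W_v^(n2*k2)
                acc += x_prime[n2][k1] * twiddle * merge
            out[k1 + q * k2] = acc  # k = k1 + (N/v)*k2
    return out
```

For any v that divides the length, `dft_two_level(x, v)` should agree with `dft(x)` up to floating-point rounding.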
1.2 Parallel computation of the N-point FFT
Based on the decomposition of the N-point FFT derived above, the present invention uses the parallel fast Fourier transform processing flow shown in Fig. 1 to perform an N-point FFT on the input data sequence x(n) and obtain X(k):
(b1) Apply first-level grouping to the received data sequence x(n): divide the N data points into v first-level data blocks, each containing N/v data points. The $n_1$-th data point of the $n_2$-th first-level data block is $x'(n_1,n_2)=x(n_1 v+n_2)$, where $n_1=0,1,\dots,N/v-1$ and $n_2=0,1,\dots,v-1$; $N=r^S$, $v=r^Z$, $Z=0,1,\dots,S-2$, and S and r are integers;
(b2) Apply second-level grouping to each first-level data block obtained in step (b1): divide each first-level data block into r secondary data blocks, each containing N/(vr) data points. The $n'_1$-th data point of the $n'_2$-th secondary data block of the $n_2$-th first-level data block is $x''(n'_1,n'_2,n_2)=x'(n'_1 r+n'_2,n_2)$, where $n'_1=0,1,\dots,N/(vr)-1$, $n'_2=0,1,\dots,r-1$ and $n_2=0,1,\dots,v-1$;
(b3) Perform an N/(vr)-point FFT on the N/(vr) data points in each secondary data block. The FFT result for the $n'_2$-th secondary data block of the $n_2$-th first-level data block is $X''(k'_1,n'_2,n_2)=\sum_{n'_1=0}^{N/(vr)-1} x''(n'_1,n'_2,n_2)\,W_{N/(vr)}^{n'_1 k'_1}$, where $k'_1=0,1,\dots,N/(vr)-1$, $n'_2=0,1,\dots,r-1$, $n_2=0,1,\dots,v-1$ and $W_{N/(vr)}=e^{-j2\pi vr/N}$;
(b4) Merge the FFT results of the r secondary data blocks in each first-level data block to obtain the FFT result of each first-level data block. The FFT result of the N/v data points in the $n_2$-th first-level data block is $X'(k_1,n_2)=\sum_{n'_2=0}^{r-1}\left[W_{N/v}^{n'_2 k'_1}X''(k'_1,n'_2,n_2)\right]W_r^{n'_2 k'_2}$, where $k_1=k'_1+[N/(vr)]k'_2$, $k'_2=0,1,\dots,r-1$ and $n_2=0,1,\dots,v-1$;
(b5) Merge the FFT results of the v first-level data blocks to obtain the N-point FFT result of the data sequence x(n): $X(k)=\sum_{n_2=0}^{v-1}\left[W_N^{n_2 k_1}X'(k_1,n_2)\right]W_v^{n_2 k_2}$, where $k=k_1+(N/v)k_2$, $k_2=0,1,\dots,v-1$, $k=0,1,\dots,N-1$ and $W_N=e^{-j2\pi/N}$.
In the above flow, step (b3) uses base-r FFT calculation to obtain the FFT result of the N/(vr) data points in each secondary data block. In steps (b4) and (b5), $X'(k_1,n_2)$ is obtained from $X''(k'_1,n'_2,n_2)$, and X(k) is then obtained from $X'(k_1,n_2)$. This comprises one stage of base-r FFT on N/v points and one stage of base-v FFT on N points. In other words, step (b4) uses base-r FFT calculation to merge the FFT results of the r secondary data blocks in each first-level data block, and step (b5) uses base-v FFT calculation to merge the FFT results of the v first-level data blocks.
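The five steps can be traced end to end in software. The following is a minimal sketch under stated assumptions, not the hardware method itself: a direct DFT stands in for the base-r FFT of each secondary data block, the merge stages use the twiddle factors that follow from the index mappings $k_1=k'_1+[N/(vr)]k'_2$ and $k=k_1+(N/v)k_2$, and all function names are hypothetical.

```python
import cmath


def w(size, exponent):
    """Twiddle factor W_size^exponent with W_size = exp(-2j*pi/size)."""
    return cmath.exp(-2j * cmath.pi * exponent / size)


def dft(x):
    """Direct DFT, standing in for the base-r FFT of one secondary data block."""
    n_pts = len(x)
    return [sum(x[n] * w(n_pts, n * k) for n in range(n_pts))
            for k in range(n_pts)]


def parallel_fft_flow(x, r, z):
    """Follow steps (b1)-(b5) for N = r**S points with v = r**z."""
    n_pts, v = len(x), r ** z
    q = n_pts // v  # N/v points per first-level data block
    p = q // r      # N/(vr) points per secondary data block
    assert p >= 1 and p * r * v == n_pts
    # (b1) first-level grouping: x'(n1, n2) = x(n1*v + n2)
    level1 = [x[n2::v] for n2 in range(v)]
    # (b2) second-level grouping: x''(n1', n2', n2) = x'(n1'*r + n2', n2)
    level2 = [[blk[m::r] for m in range(r)] for blk in level1]
    # (b3) independent N/(vr)-point transforms of the v*r secondary data blocks
    x2 = [[dft(sub) for sub in blk] for blk in level2]
    # (b4) merge r secondary results into X'(k1, n2), with k1 = k1' + p*k2'
    x1 = [[sum(x2[n2][m][k1 % p] * w(q, m * (k1 % p)) * w(r, m * (k1 // p))
               for m in range(r))
           for k1 in range(q)]
          for n2 in range(v)]
    # (b5) merge v first-level results into X(k), with k = k1 + q*k2
    return [sum(x1[n2][k % q] * w(n_pts, n2 * (k % q)) * w(v, n2 * (k // q))
                for n2 in range(v))
            for k in range(n_pts)]
```

The transforms in step (b3) do not depend on one another, which is what allows them to run on parallel butterfly units; the sketch simply evaluates them one after another.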
(2) Design of the parallel FFT computation structure
Following steps (b1) and (b2) above, the data sequence x(n) of $N=r^S$ points (S an integer) is divided into vr secondary data blocks, and base-r FFT calculation can then be used to obtain the FFT result of the N/(vr) data points in each secondary data block. Because the vr secondary data blocks are independent of one another, they can be transformed in parallel, which speeds up the $N=r^S$-point FFT of x(n). The present invention further proposes a time-division multiplexing method for the base-r butterfly unit, in which the r secondary data blocks of one first-level data block share 1 base-r butterfly unit to compute their FFTs. The whole system therefore needs only v base-r butterfly units to compute the FFTs of the vr secondary data blocks in parallel, after which one stage of base-r FFT and one stage of base-v FFT yield the FFT result of the $N=r^S$ data points.
2.1 Storage allocation of the input data
The present invention uses vr dual-port memories of N/(vr) points each to hold the vr secondary data blocks produced by steps (b1) and (b2). The vr dual-port memories are divided into v memory groups, corresponding to the v first-level data blocks. Each memory group contains r dual-port memories, corresponding to the r secondary data blocks of one first-level data block.
The present invention uses an S-digit base-r counter to obtain the storage address of each data point in the sequence x(n), that is, to determine the storage address Add_ID(n) at which the n-th data point x(n) is stored in the Ram_Id(n)-th memory of the Group_Id(n)-th memory group, n = 0, 1, ..., N-1, as follows:
The base-r representation of the counter value n is $(a_{S-1}a_{S-2}\dots a_{Z+1}a_Z a_{Z-1}\dots a_1 a_0)_r$, where $a_0$ to $a_{S-1}$ are digits 1 to S of that representation and each digit ranges from 0 to r-1. The base-r value of the memory group index Group_Id(n) is then $(a_{Z-1}\dots a_1 a_0)_r$, the base-r value of the memory index Ram_Id(n) is $(a_Z)_r$, and the base-r value of the storage address Add_ID(n) is $(a_{S-1}a_{S-2}\dots a_{Z+1})_r$.
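The digit split described above amounts to integer division and remainder by powers of r. The following minimal sketch illustrates it; the function name `storage_location` is a hypothetical choice, and the three returned values follow the Group_Id, Ram_Id and Add_ID definitions of the text.

```python
def storage_location(n, r, s, z):
    """Split the S-digit base-r count value n into (Group_Id, Ram_Id, Add_ID)."""
    assert 0 <= n < r ** s and 0 <= z <= s - 2
    v = r ** z
    group_id = n % v        # low Z digits:  a_{Z-1} ... a_0
    ram_id = (n // v) % r   # single digit:  a_Z
    add_id = n // (v * r)   # high digits:   a_{S-1} ... a_{Z+1}
    return group_id, ram_id, add_id
```

For example, with r = 3, S = 3 and Z = 1, the count value 14 is $(112)_3$, so the data point goes to memory group 2, memory 1, address 1. Because the three fields together use every digit of n exactly once, no two data points share a location.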
2.2 Parallel N/(vr)-point FFT computation of the r secondary data blocks
With the input data storage allocation above, each secondary data block is stored in its own memory, and the N/(vr) data points in each memory undergo an independent FFT calculation. The following describes how the data access addresses in each memory are generated during the parallel FFT computation, and how r secondary data blocks time-division multiplex one base-r butterfly unit.
2.2.1 Generation of the data access addresses for the N/(vr) data points in each memory
The N/(vr) data points in each memory undergo an independent FFT calculation. With N' = N/(vr), the FFT of the N'-point data sequence x(n') is defined as follows:
$$X(k')=\sum_{n'=0}^{N'-1}x(n')\,W_{N'}^{n'k'} \tag{6}$$
In the following, the time-domain address index n' and the frequency-domain address index k' are each written as an M-digit base-r number, formula (7),
where M = S-z-1. Rewriting formula (6) in base r according to formula (7) gives formula (8).
Formula (8) shows that the N'-point FFT calculation can be decomposed into M iterations, the m-th of which is given by formula (9),
where m = 1, 2, ..., M, together with the phase term of the twiddle factor. Formula (9) shows that in the m-th iteration the data access address consists of three parts, Addr(m, 0), Addr(m, 1) and Addr(m, 2), formula (10),
where Addr(m, 1) = $k_{m-1}$.
Define a butterfly set as the largest set of consecutive butterfly units whose twiddle factors differ. From formula (10), Addr(m, 0) is the index of the non-overlapping butterfly sets in stage m, Addr(m, 1) is the index of the operand within each butterfly unit, and Addr(m, 2) is the index of the butterfly unit within each butterfly set. The data access address of stage m can be regarded as a shift-and-add combination of these three parts, so the present invention uses an M-digit base-r counter $C=(c_{M-1}c_{M-2}\dots c_1 c_0)_r$ and generates the data access address by taking slices of C.
When the FFT is computed on the N/(vr) data points in each memory, the butterfly sets, the butterfly units within each butterfly set, and the r operands within each butterfly unit must each be processed in sequence. Accordingly, $(c_{M-1}c_{M-2}\dots c_{M-m+2}c_{M-m+1})_r$ is mapped to the index Addr(m, 0) of the non-overlapping butterfly sets in stage m, $(c_0)_r$ is mapped to the index Addr(m, 1) of the operand within each butterfly unit, and $(c_{M-m-1}c_{M-m-2}\dots c_1)_r$ is mapped to the index Addr(m, 2) of the butterfly unit within each butterfly set. The relation between the access address of each stage and the count value C is given by formula (11), which expresses the stage-m access address AcesAddr(m).
The generation of the data access addresses for each FFT stage in each memory is shown in Fig. 2.
2.2.2 Calculation process of the base-r butterfly computing unit
To perform the N'-point FFT calculation on a single dual-port memory, the base-r butterfly computing unit must carry out two operations: first, multiplication by the twiddle factors; second, multiplication by the r-point DFT matrix. The former can be done by a pipelined complex multiplier as the data are read sequentially from the memory. The latter can be done by delaying the operands in registers so that the r operands fetched one after another enter the butterfly unit in parallel.
The calculation process of the whole base-r butterfly computing unit is as follows:
(A) Data are fetched over the memory read bus Data_load at the data access addresses derived in the previous section. Every r data are the r operands of one butterfly unit and are labelled Op(i) in order. As each datum is fetched it is multiplied by its twiddle factor, and the result is labelled Op'(i), where i = 0, 1, ..., r-1;
(B) Denote the r input registers of the butterfly unit Bf_reg_in(i) and the output registers Bf_reg_out(i). Each time r data have been taken from the bus, set Bf_reg_in(i) = Op'(i) and perform the butterfly calculation. The butterfly unit computes in a parallel pipelined manner, so after each calculation it has r-1 idle cycles. The result is denoted Op''(i), and Bf_reg_out(i) = Op''(i) is set;
(C) The r data in Bf_reg_out(i) are assigned in turn to the memory write bus Data_store, which completes the calculation of this butterfly unit.
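Steps (A) to (C) can be summarized in a few lines of arithmetic. This is a hedged sketch rather than the patented circuit: the twiddle factors are taken as an input because their exponents depend on the stage and address, which the text above does not spell out, and the function name `base_r_butterfly` is hypothetical.

```python
import cmath


def base_r_butterfly(ops, twiddles):
    """One base-r butterfly: twiddle multiplication, then an r-point DFT."""
    r = len(ops)
    assert len(twiddles) == r
    # Step (A): Op'(i) = Op(i) * twiddle(i), applied as each operand is read
    op1 = [op * tw for op, tw in zip(ops, twiddles)]
    # Step (B): Op''(k) = sum_i Op'(i) * W_r^(i*k), the r-point DFT matrix product
    op2 = [sum(op1[i] * cmath.exp(-2j * cmath.pi * i * k / r) for i in range(r))
           for k in range(r)]
    # Step (C): the r results go back to the r addresses the operands came from
    return op2
```

With r = 2 and unit twiddle factors this reduces to the familiar sum and difference: operands 1 and 2 give 3 and -1 (up to rounding).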
The timing of this process is shown in Fig. 3. It shows that during a single base-r FFT calculation, the butterfly unit is idle for r-1 of every r clock cycles.
2.2.3 FFT calculation of the secondary data blocks with a time-division multiplexed base-r butterfly computing unit
In step (b3) of the N-point FFT calculation, the base-r FFT calculation must be performed on vr secondary data blocks. If each secondary data block used its own base-r butterfly unit, vr base-r butterfly units would be needed to compute them in parallel. However, because the base-r butterfly unit computes in a parallel pipelined manner, it is idle for r-1 of every r clock cycles when processing a single data block, so one butterfly unit can be used to transform r sets of N'-point data. Only the time at which fetching starts for each N'-point data set needs to be controlled to time-division multiplex the butterfly unit. Denote the r operands of the f-th (f = 0, 1, ..., r-1) N'-point data set Op_f(i) (i = 0, 1, ..., r-1), and the result after reading from the memory bus and multiplying by the twiddle factor Op_f'(i). The operands of the r N'-point data sets are input to the butterfly unit in turn. The butterfly result is denoted Op_f''(i) and is stored in the corresponding memory according to the value of f. The timing obtained with this time-division multiplexing method is shown in Fig. 4.
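The staggering idea can be illustrated with a toy schedule. The cycle model below is an assumption made for illustration and is not taken from the timing diagrams: data set f starts reading at cycle f, reads one operand per cycle, and uses the shared butterfly unit for one cycle once its r operands are in. The function name is hypothetical.

```python
def butterfly_schedule(r, butterflies_per_stream):
    """Return {cycle: (data set f, butterfly index j)} for r staggered data sets."""
    schedule = {}
    for f in range(r):                   # data set f starts reading at cycle f
        for j in range(butterflies_per_stream):
            first_read = f + j * r       # r operands are read serially, one per cycle
            fire = first_read + r        # butterfly runs once all r operands are in
            assert fire not in schedule  # no two data sets need the unit at once
            schedule[fire] = (f, j)
    return schedule
```

Under this model a single data set occupies the butterfly unit on one cycle in every r, while r staggered data sets occupy it on every cycle from cycle r onward, so the unit's idle slots are filled without any two data sets colliding.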
Therefore, to improve the resource utilization of the system, the present invention uses v base-r butterfly units to perform the base-r FFT calculation on the vr secondary data blocks, with the r secondary data blocks in each first-level data block sharing 1 base-r butterfly computing unit through a time-division multiplexing system. As shown in Fig. 5, the base-r butterfly unit time-division multiplexing system used by the present invention comprises r serial-to-parallel conversion modules, a first-level gating control unit, a base-r butterfly computing unit, a second-level gating control unit and r parallel-to-serial conversion modules, wherein:
The r serial-to-parallel conversion modules are in one-to-one correspondence with the r secondary data blocks of one first-level data block. They read data from the memories of the r secondary data blocks, giving r serial data streams of N/(vr) data points each. Each serial-to-parallel conversion module then converts its serial stream of N/(vr) data points into r parallel data streams of N/(vr²) data points each.
The first-level gating control unit gates the parallel data output by the r serial-to-parallel conversion modules. Each time it selects the r parallel data streams output by 1 of the serial-to-parallel conversion modules and passes them to the base-r butterfly computing unit.
The base-r butterfly computing unit receives the r parallel data streams selected by the first-level gating control unit, performs the base-r FFT calculation, and outputs r parallel results to the second-level gating control unit.
The second-level gating control unit gates among the r parallel-to-serial conversion modules and passes the r parallel FFT results output by the base-r butterfly computing unit to 1 of them. The index of the selected parallel-to-serial conversion module matches the index of the serial-to-parallel conversion module selected by the first-level gating control unit.
The r parallel-to-serial conversion modules are in one-to-one correspondence with the r secondary data blocks of one first-level data block. The parallel-to-serial conversion module selected by the second-level gating control unit receives the r parallel FFT results, converts them into 1 serial data stream, and stores that stream in the memory of the corresponding secondary data block. The storage positions are the same positions from which the serial-to-parallel conversion module read the data, which realizes in-place storage.
During the time-division multiplexed calculation above, data are read from and stored to each memory as shown in Fig. 6a.
2.2.4 Merging of the FFT results with a time-division multiplexed butterfly computing unit
In steps (b4) and (b5), $X'(k_1,n_2)$ is obtained from $X''(k'_1,n'_2,n_2)$, and X(k) is then obtained from $X'(k_1,n_2)$. This comprises one stage of base-r FFT on N/v points and one stage of base-v FFT on N points. In other words, step (b4) uses base-r FFT calculation to merge the FFT results of the r secondary data blocks in each first-level data block, and step (b5) uses base-v FFT calculation to merge the FFT results of the v first-level data blocks. To improve computational efficiency, the time-division multiplexing system of Fig. 5 is also used when merging these FFT results.
When the N/(vr)-point FFT is performed on the N/(vr) data points in each secondary data block in step (b3), the memory groups are processed as shown in Fig. 6a. When base-r FFT calculation merges the FFT results of the r secondary data blocks in each first-level data block in step (b4), the memory groups are processed as shown in Fig. 6b. When base-v FFT calculation merges the FFT results of the v first-level data blocks in step (b5), the memory groups are processed as shown in Fig. 6c.
Embodiment:
In the present embodiment, 4 radix-2 butterfly units are used to compute the FFT of 1024 data points, that is: N=1024, r=2, v=4, S=10, Z=2.
According to the processing method of the present invention, the calculation steps are as follows:
(1) Input data storage allocation
In the present embodiment, 8 dual-port memories of 128 points each are used to store the input data. These 8 memories are divided into 4 memory groups; each memory group includes 2 memories, and each memory has 128 storage locations.
The 1024 input data points are first addressed sequentially, so that the index n of the datum x(n) is written in binary as $(n_9 n_8 \dots n_1 n_0)_2$. The n-th datum is stored in memory Ram_Id(n) = $(n_2)_2$ of memory group Group_Id(n) = $(n_1 n_0)_2$, and its storage address within that memory is $(n_9 n_8 \dots n_4 n_3)_2$.
With the storage allocation method above, the distribution of the 1024 data points over the 8 memories is as shown in Table 1.
Table 1 Input data in each memory
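The allocation rule can be written as a small Python helper. This is a sketch only: the function name `storage_location` is hypothetical, and the parameters default to the embodiment's values (r = 2, S = 10, Z = 2). It splits the base-r digits of n into the group field (lowest Z digits), the memory field (digit Z) and the address field (remaining high digits).

```python
def storage_location(n, r=2, S=10, Z=2):
    """Return (group_id, ram_id, address) for input index n."""
    digits = []
    for _ in range(S):            # base-r digits a_0 .. a_{S-1}, least significant first
        digits.append(n % r)
        n //= r

    def value(ds):
        """Value of a digit list given least significant digit first."""
        return sum(d * r ** i for i, d in enumerate(ds))

    group_id = value(digits[:Z])      # (a_{Z-1} ... a_1 a_0)_r
    ram_id = digits[Z]                # (a_Z)_r
    address = value(digits[Z + 1:])   # (a_{S-1} ... a_{Z+1})_r
    return group_id, ram_id, address
```

For instance, n = 13 is 0000001101 in binary, so it goes to group 1, memory 1, address 1. Because the three fields together use every digit of n exactly once, the 1024 indices map to 1024 distinct locations, 128 per memory.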
(2) Conflict-free address access to realize 8 parallel 128-point FFT calculations
After the data have been stored in the respective memories, 128-point radix-2 FFT processing is started on the data in all 8 memories simultaneously. For the 128-point radix-2 FFT, the data access addresses during the 7 stages of FFT calculation are generated by taking bit fields of a 7-bit binary counter $C = (c_6 c_5 \dots c_1 c_0)_2$. The data access addresses for each FFT stage are as shown in Table 2.
Using the time-division multiplexing system shown in Figure 5, the data of every 2 memories share 1 radix-2 butterfly computing unit, so that the 8 parallel 128-point FFT calculations are completed with 4 radix-2 butterfly computing units. The whole calculation takes 924 clock cycles.
Table 2 Data access addresses of each memory in each FFT stage
(3) First-stage result merging
According to formula (5), one stage of 256-point radix-2 FFT calculation is performed on each memory group. Because there are 4 radix-2 butterfly units, and the 2 operands of each butterfly reside in 2 different memories, 8 data can be accessed from the 8 memories at the same time, so this calculation completes in only 128 clock cycles.
(4) Second-stage result merging
According to formula (3), one stage of radix-4 FFT calculation is applied to the 4 groups of 256-point data, which yields the 1024-point FFT result. At this stage the operands of each butterfly belong to different memory groups, so the 4 operands can be accessed concurrently. This calculation therefore needs only 256 clock cycles.
With the processing steps above, the parallel computation of the 1024-point radix-2 FFT needs 1308 clock cycles in total (924 + 128 + 256), which effectively reduces the computation time relative to the prior art.
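The cycle budget of the embodiment can be tallied in a few lines of Python. This is a hedged sketch rather than a timing model: the 924-cycle figure for step (2) is taken from the text as given, the count for step (3) follows from the butterfly count and the 4 available units, and the count for step (4) assumes that one radix-4 butterfly (4 operands) completes per clock cycle, as the mention of 4 concurrently accessed operands suggests.

```python
N, r, v = 1024, 2, 4
butterfly_units = 4

# Step (2): 8 parallel 128-point FFTs. Quoted in the text; not derived here.
block_fft_cycles = 924

# Step (3): each of the v groups runs one radix-2 stage over N/v points,
# i.e. (N/v)/2 butterflies per group, and 4 butterflies complete per cycle.
merge1_cycles = v * (N // v // 2) // butterfly_units

# Step (4): one radix-4 stage over N points is N/4 radix-4 butterflies.
# ASSUMPTION: one radix-4 butterfly completes per clock cycle.
merge2_cycles = N // 4

total_cycles = block_fft_cycles + merge1_cycles + merge2_cycles
print(merge1_cycles, merge2_cycles, total_cycles)  # prints: 128 256 1308
```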
The above is only one specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the protection scope of the present invention.
Content not described in detail in this specification belongs to the well-known technology of those skilled in the art.

Claims (4)

1. A parallel fast Fourier transform processing method, characterized by comprising the following steps:
(1) performing first-level grouping on the received data sequence x(n), i.e. dividing the N data into v first-level data blocks, each first-level data block including $N/v$ data; wherein the $n_1$-th datum in the $n_2$-th first-level data block is $x'(n_1, n_2) = x(n_1 v + n_2)$, $n_1 = 0, 1, \dots, \frac{N}{v} - 1$, $n_2 = 0, 1, \dots, v-1$; $N = r^S$, $v = r^Z$, $Z = 0, 1, \dots,$ or $S-2$, and S and r are integers;
(2) performing second-level grouping on each first-level data block obtained in step (1), i.e. dividing each first-level data block into r second-level data blocks, each second-level data block including $N/(vr)$ data; wherein the $n'_1$-th datum in the $n'_2$-th second-level data block of the $n_2$-th first-level data block is $x''(n'_1, n'_2, n_2) = x'(n'_1 r + n'_2, n_2)$, $n'_1 = 0, 1, \dots, \frac{N}{vr} - 1$, $n'_2 = 0, 1, \dots, r-1$, $n_2 = 0, 1, \dots, v-1$;
(3) performing an $N/(vr)$-point FFT calculation on the $N/(vr)$ data in each second-level data block; wherein the FFT result of the $N/(vr)$ data in the $n'_2$-th second-level data block of the $n_2$-th first-level data block is $X''(k'_1, n'_2, n_2) = \sum_{n'_1=0}^{\frac{N}{vr}-1} x''(n'_1, n'_2, n_2)\, W_{N/(vr)}^{k'_1 n'_1}$, where $k'_1 = 0, 1, \dots, \frac{N}{vr} - 1$, $n'_2 = 0, 1, \dots, r-1$, $n_2 = 0, 1, \dots, v-1$, and $W_{N/(vr)} = e^{-j2\pi / \frac{N}{vr}}$;
(4) merging the FFT results of the r second-level data blocks in each first-level data block to obtain the FFT result of each first-level data block; wherein the FFT result of the $N/v$ data in the $n_2$-th first-level data block is $X'(k_1, n_2) = \sum_{n'_2=0}^{r-1} \left\{ X''\!\left(k_1 - \frac{N}{vr} n'_2,\, n'_2,\, n_2\right) \right\} W_{N/v}^{k_1 n'_2}$, where $k_1 = 0, 1, \dots, \frac{N}{v} - 1$, $n_2 = 0, 1, \dots, v-1$, and $W_{N/v} = e^{-j2\pi / \frac{N}{v}}$;
(5) merging the FFT results of the v first-level data blocks to obtain the N-point FFT result $X(k)$ of the data sequence x(n), where $k = 0, 1, \dots, N-1$ and $W_N = e^{-j2\pi/N}$.
2. The parallel fast Fourier transform processing method according to claim 1, characterized in that: in step (2), vr dual-port memories of $N/(vr)$ points each are used to store the data of the vr second-level data blocks; the vr dual-port memories are divided into v memory groups, corresponding to the v first-level data blocks; each memory group includes r dual-port memories, corresponding to the r second-level data blocks in one first-level data block; an S-digit base-r counter is used to obtain the storage address of each datum in the sequence x(n), i.e. to determine that the n-th datum in the sequence x(n) is stored at storage address Add_ID(n) in the Ram_Id(n)-th memory of the Group_Id(n)-th memory group, n = 0, 1, ..., N-1; the specific implementation is as follows:
The base-r value of the count n of the counter is $(a_{S-1} a_{S-2} \dots a_{Z+1} a_Z a_{Z-1} \dots a_1 a_0)_r$, where $a_0$ to $a_{S-1}$ are the 1st to S-th digits of the base-r value and each digit ranges from 0 to r-1; the base-r value of the memory group number Group_Id(n) is then $(a_{Z-1} \dots a_1 a_0)_r$, the base-r value of the memory number Ram_Id(n) is $(a_Z)_r$, and the base-r value of the storage address Add_ID(n) is $(a_{S-1} a_{S-2} \dots a_{Z+1})_r$.
3. The parallel fast Fourier transform processing method according to claim 1, characterized in that: in step (3), a radix-r FFT calculation is used to obtain the FFT result of the $N/(vr)$ data in each second-level data block; in step (4), a radix-r FFT calculation is used to merge the FFT results of the r second-level data blocks in each first-level data block; in step (5), a radix-v FFT calculation is used to merge the FFT results of the v first-level data blocks.
4. The parallel fast Fourier transform processing method according to claim 1, characterized in that: in step (3), v radix-r butterfly units are used to perform radix-r FFT calculation on the vr second-level data blocks, and the r second-level data blocks in each first-level data block share 1 radix-r butterfly computing unit through a time-division multiplexing system; the time-division multiplexing system includes r serial-to-parallel conversion modules, a first-stage gating control unit, a radix-r butterfly computing unit, a second-stage gating control unit and r parallel-to-serial conversion modules, wherein:
r serial-to-parallel conversion modules: in one-to-one correspondence with the r second-level data blocks in a first-level data block; data are read from the memories of the r second-level data blocks respectively, giving r serial data streams, each serial data stream including $N/(vr)$ data points; each serial-to-parallel conversion module then performs serial-to-parallel conversion on its serial data stream, converting the $N/(vr)$ serial data points into r-way parallel data, each parallel path including $N/(vr^2)$ data points;
First-stage gating control unit: performs a gating operation on the parallel data output by the r serial-to-parallel conversion modules, each time selecting the r-way parallel data output by 1 of the serial-to-parallel conversion modules, and then outputs that r-way parallel data to the radix-r butterfly computing unit;
Radix-r butterfly computing unit: receives the r-way parallel data, performs the radix-r FFT calculation, and outputs the r-way parallel results to the second-stage gating control unit;
Second-stage gating control unit: gates among the r parallel-to-serial conversion modules, outputting the received r-way parallel FFT results to 1 of the parallel-to-serial conversion modules; the index of the selected parallel-to-serial conversion module is the same as the index of the serial-to-parallel conversion module selected by the first-stage gating control unit;
r parallel-to-serial conversion modules: in one-to-one correspondence with the r second-level data blocks in a first-level data block; the parallel-to-serial conversion module selected by the second-stage gating control unit receives the r-way parallel FFT results, converts them into 1 serial data stream, and saves that serial data into the memory of the corresponding second-level data block; the storage locations are the same as the locations from which the serial-to-parallel conversion module read the data, i.e. in-place storage is realized.
CN201610052233.5A 2016-01-26 2016-01-26 A kind of parallel Fast Fourier Transform processing method Active CN105718424B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610052233.5A CN105718424B (en) 2016-01-26 2016-01-26 A kind of parallel Fast Fourier Transform processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610052233.5A CN105718424B (en) 2016-01-26 2016-01-26 A kind of parallel Fast Fourier Transform processing method

Publications (2)

Publication Number Publication Date
CN105718424A true CN105718424A (en) 2016-06-29
CN105718424B CN105718424B (en) 2018-11-02

Family

ID=56155063

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610052233.5A Active CN105718424B (en) 2016-01-26 2016-01-26 A kind of parallel Fast Fourier Transform processing method

Country Status (1)

Country Link
CN (1) CN105718424B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant