CN1477802A - Method for defining parallel dual butterfly computation fast Fourier transform processor structure - Google Patents

Publication number
CN1477802A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA031415407A
Other languages
Chinese (zh)
Other versions
CN1259782C (en)
Inventor
田继锋
姜海宁
宋文涛
罗汉文
张海滨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University
Priority to CN 03141540
Publication of CN1477802A
Application granted
Publication of CN1259782C
Anticipated expiration
Expired - Fee Related (current legal status)

Abstract

The invention improves the butterfly unit, the core unit of a fast Fourier transform (FFT) processor, to obtain a parallel dual-butterfly processing method. First, parallel butterfly processing is applied: the part of the butterfly computation that needs only adders runs in parallel with the part that needs both multipliers and adders, giving a parallel butterfly structure. Then dual-butterfly processing is applied: the inputs of the butterfly unit are grouped according to their corresponding twiddle factors, and the input data are handled by one of two butterfly processing units depending on whether their butterfly computation requires a multiplier. The result is a parallel dual-butterfly FFT processor structure.

Description

Method for determining the structure of a parallel dual-butterfly fast Fourier transform processor
Technical field
The present invention relates to a method for determining the structure of a fast Fourier transform processor, in particular a parallel dual-butterfly fast Fourier transform processor, and belongs to the field of information technology.
Background art
Orthogonal frequency division multiplexing (OFDM) is a multi-carrier technique. It divides a wideband channel into several orthogonal narrowband subchannels and transmits information on all subchannels simultaneously, which improves channel utilization. Adding a cyclic prefix reduces intersymbol interference and strengthens the system's resistance to multipath interference. In recent years OFDM has been applied more and more widely because of its performance. It has been used successfully in digital television and digital audio broadcasting, the WLAN standards IEEE 802.11a and HIPERLAN-2 adopt it as the physical-layer interface, and ongoing research on B3G (beyond-3G) systems also tends toward OFDM. During the development of OFDM, replacing the original modulation of each individual subcarrier with the fast Fourier transform (FFT) greatly reduced the implementation complexity of OFDM systems. However, as the processing speed required by OFDM systems keeps rising, implementing a high-speed FFT processor has in turn become a bottleneck for the further development of OFDM.
High-speed FFT processor implementations can be divided by architecture into the following types: single-memory, dual-memory, pipelined, and parallel. The conventional single-memory and dual-memory architectures occupy few hardware resources, but their throughput is low and they need a higher clock frequency. A literature search on ways to raise FFT speed found the paper by Byung S. Son et al., "High-speed FFT Processor for OFDM Systems", IEEE International Symposium on Circuits and Systems, 2002, vol. 3, pp. 281-284. Building on the conventional single-memory architecture, that paper proposes a single-memory grouping scheme. The scheme markedly raises the processing speed of the FFT processor and uses little memory, but it increases addressing complexity and the number of complex multipliers, and therefore occupies more hardware resources.
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art by providing a method for determining the structure of a parallel dual-butterfly fast Fourier transform processor that is easy to implement, low in cost, economical in hardware resources, and fast, so that it meets the requirements OFDM communication systems place on the FFT processor.
The invention is achieved by the following technical solution. The butterfly unit, the core unit of the fast Fourier transform processor, is improved to obtain a parallel dual-butterfly processing method. First, parallel butterfly processing is performed: within the butterfly computation, the processing that contains only adders and the processing that contains both multipliers and adders are carried out in parallel, giving a parallel butterfly structure. Then dual-butterfly processing is performed: the inputs of the butterfly unit are grouped according to their corresponding twiddle factors, and the input data are assigned to one of two different butterfly processing units according to whether their butterfly computation contains a multiplier. This yields the parallel dual-butterfly fast Fourier transform processor structure.
The method is described in more detail below.
1. Parallel butterfly processing
The butterfly unit is the core unit of the fast Fourier transform processor. A radix-4 decimation-in-frequency butterfly unit first performs a 4-point discrete Fourier transform and then multiplies the results by the twiddle factors:
$$A_1 = (A+C)+(B+D), \quad B_1 = (A-C)-j(B-D),$$
$$C_1 = (A+C)-(B+D), \quad D_1 = (A-C)+j(B-D) \tag{1}$$
$$A' = A_1, \quad B' = B_1 \times W_N^{p}, \quad C' = C_1 \times W_N^{2p}, \quad D' = D_1 \times W_N^{3p} \tag{2}$$
where $W_N = e^{-j2\pi/N}$, A, B, C, D are the inputs of the butterfly unit, and A', B', C', D' are its outputs. Equation (2) shows that the output A' can be obtained with adders alone, so the processing that produces A' is carried out in parallel with the processing that produces the other three outputs B', C', D'. That is, one 4-point discrete Fourier transform unit containing only adders is added to the original butterfly unit.
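Equations (1) and (2) can be checked numerically. The following is a minimal software sketch, not the patented hardware: the function name `radix4_dif_butterfly` and its argument order are illustrative assumptions, and the adder-only path and the multiplier path are simply computed one after the other instead of in parallel.

```python
import cmath


def radix4_dif_butterfly(a, b, c, d, p, n):
    """Radix-4 decimation-in-frequency butterfly per equations (1) and (2).

    a, b, c, d are the complex inputs, p selects the twiddle factors and
    n is the transform length N. The name and signature are hypothetical.
    """
    # Equation (1): 4-point DFT. Only additions and subtractions are needed;
    # multiplying by +/-j is a swap of real and imaginary parts in hardware.
    a1 = (a + c) + (b + d)
    b1 = (a - c) - 1j * (b - d)
    c1 = (a + c) - (b + d)
    d1 = (a - c) + 1j * (b - d)

    # Equation (2): twiddle multiplication with W_N = exp(-j*2*pi/N).
    def w(k):
        return cmath.exp(-2j * cmath.pi * k / n)

    # A' needs no multiplier at all, which is what allows the adder-only
    # path to run alongside the multiplier path.
    return a1, b1 * w(p), c1 * w(2 * p), d1 * w(3 * p)
```

With p = 0 all twiddle factors equal 1, so the four outputs reduce to a plain 4-point DFT of the inputs.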
To use the hardware resources more efficiently, pipelining is introduced into the parallel butterfly structure. The pipeline has four steps:
Step 1: read data from memory. Four complex data are read from the dual-port RAM over two ports in 3 clocks.
Step 2: perform the 4-point discrete Fourier transform. To fit the pipeline, a three-state state machine controls the flow: state S1 produces $A_1$ and $B_1$, state S2 produces $C_1$, and state S3 produces $D_1$.
Step 3: perform the complex multiplication of each result by its twiddle factor. A complex multiplier is built from 3 real multipliers and 5 real adders: 3 parallel adders come before the 3 parallel real multipliers and 2 parallel adders come after them. This step takes 3 clocks in total.
Step 4: write the results to memory. Four data are written over two ports in 3 clocks. The whole pipeline is 12 clocks long.
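The text of step 3 does not spell out which 3-multiplier decomposition is used. One well-known factorization that matches the stated layout (3 adders, then 3 multipliers, then 2 adders) is sketched below; treating it as the patent's circuit is an assumption, and the function name `complex_mul_3m5a` is hypothetical.

```python
def complex_mul_3m5a(x_re, x_im, w_re, w_im):
    """Multiply (x_re + j*x_im) by (w_re + j*w_im) with 3 real multiplies.

    Assumed decomposition; the source only states the operator counts.
    """
    # Three pre-adders, mutually independent, so they can run in parallel.
    s1 = x_re + x_im
    s2 = w_im - w_re
    s3 = w_re + w_im

    # Three real multipliers, also mutually independent.
    k1 = w_re * s1
    k2 = x_re * s2
    k3 = x_im * s3

    # Two post-adders produce the real and imaginary parts.
    return k1 - k3, k1 + k2


# (3 + 4j) * (5 - 2j) = 23 + 14j
print(complex_mul_3m5a(3, 4, 5, -2))  # prints (23, 14)
```

Compared with the direct form (4 real multipliers and 2 adders), this trades one multiplier for three extra adders, which fits the document's emphasis on saving multiplier resources.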
2. Dual-butterfly processing
By equation (2), each butterfly computation uses 4 twiddle factors, all determined by a single value of p, so the four inputs of each butterfly computation correspond to one specific p. The four input data of a butterfly unit associated with a given p are defined as a butterfly-unit data group. The group associated with p = 0 is called a zero data group; all others are non-zero data groups. Different data groups receive different butterfly processing. The butterfly computation of a non-zero data group contains multipliers and uses the parallel butterfly processing described above. The butterfly computation of a zero data group contains only adders and uses several 4-point discrete Fourier transforms; this is called simple butterfly processing. Running parallel butterfly processing and simple butterfly processing in parallel gives the parallel dual-butterfly structure. The concrete steps are as follows:
Step 1: input and output data handling. In the parallel butterfly structure each butterfly computation reads 4 data in 3 clocks, so within those 3 clocks one port of the dual-port RAM reads 3 data and the other port reads 1. The second port is therefore idle for 2 of the 3 clocks, and these two clocks are used to transfer the data of zero data groups for simple butterfly processing.
Step 2: pipelining of the simple butterfly processing unit. To keep the transfer of non-zero and zero data groups synchronized, a pipeline delay matching that of parallel butterfly processing is introduced into simple butterfly processing: reading data from memory takes 6 clocks, the simple butterfly computation takes 3 clocks, and writing data to memory takes 6 clocks.
Step 3: handling of the last stage. A butterfly-based FFT processor works in several stages. In the last stage all input data are zero data groups, so all of them can undergo simple butterfly processing, making full use of both ports of the dual-port RAM to transfer data in parallel. The pipeline length in this stage is 7 clocks.
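The port sharing described in steps 1 and 2 can be illustrated with a small read schedule. This is only a sketch: the function name `port_schedule` is hypothetical, and which of the three clocks the second port uses for its single non-zero-group datum is an assumption, since the text does not say.

```python
def port_schedule(num_periods):
    """Per-clock reads on the two RAM ports over 3-clock butterfly periods.

    'N' marks a datum of a non-zero data group, 'Z' a datum of a zero
    data group. The slot order on port B is an assumption.
    """
    port_a, port_b = [], []
    for _ in range(num_periods):
        port_a += ["N", "N", "N"]  # 3 of the 4 data of a non-zero group
        port_b += ["N", "Z", "Z"]  # the 4th datum, then 2 otherwise idle clocks
    return port_a, port_b


port_a, port_b = port_schedule(2)
for clock, (a, b) in enumerate(zip(port_a, port_b), start=1):
    print(f"clock {clock}: port A reads {a}, port B reads {b}")
```

Over two periods (6 clocks) the schedule delivers 8 non-zero-group data, that is two complete groups, and 4 zero-group data, that is one complete zero data group. This is consistent with the 6-clock read time given for simple butterfly processing in step 2.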
The invention has substantive features and represents notable progress. It introduces a parallel structure and pipelined processing into the conventional dual-memory architecture to obtain a fast Fourier transform processor with a parallel dual-butterfly structure, which effectively raises the processing speed of the processor. The added parallel structure contains only a few simple adders. Compared with the method cited in the background section it occupies fewer hardware resources, multipliers in particular, and so better resolves the conflict between processing speed and hardware consumption in a fast Fourier transform processor.
Brief description of the drawings
Fig. 1 is a schematic diagram of parallel dual-butterfly processing.
Embodiment
With reference to Fig. 1, the following embodiment is given in accordance with the invention:
The method is applied to the design of the 64-point FFT processor of the WLAN standard IEEE 802.11a, taking the timing parameter of IEEE 802.11a ($T_{FFT} = 3.2$ μs) as the requirement.
Step 1: parallel butterfly processing. A 64-point radix-4 decimation-in-frequency FFT processor works in 3 stages, each processing 64 data, and each butterfly computation reads 4 data in 3 clocks. Each stage therefore takes $n_{one\_stage} = 64/4 \times 3 + 12 = 60$ clocks (16 butterflies at 3 clocks each plus the 12-clock pipeline length), and one 64-point FFT takes $n_{FFT} = n_{one\_stage} \times 3 = 60 \times 3 = 180$ clocks in total. With a 60 MHz system clock, the 64-point FFT processor completes one transform in 180 / 60 MHz = 3 μs < 3.2 μs. With parallel butterfly processing alone, the resulting FFT processor already meets the protocol requirement.
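The clock count of this first step follows a simple pattern: a stage with N/4 butterflies issues one butterfly every 3 clocks and then drains a 12-clock pipeline. A short sketch of that arithmetic follows; the function name and parameter names are illustrative assumptions, while the numeric values are those given in the text.

```python
def parallel_butterfly_clocks(n_points=64, clocks_per_butterfly=3, pipeline_len=12):
    """Clock count of a radix-4 FFT using parallel butterfly processing only."""
    # Number of radix-4 stages: log4(n_points).
    stages, size = 0, n_points
    while size > 1:
        size //= 4
        stages += 1
    # Per stage: N/4 butterflies at 3 clocks each, plus the pipeline length.
    per_stage = (n_points // 4) * clocks_per_butterfly + pipeline_len
    return stages, per_stage, stages * per_stage


stages, per_stage, total = parallel_butterfly_clocks()
time_us = total / 60.0  # a 60 MHz clock gives 60 clocks per microsecond
print(stages, per_stage, total, time_us)  # prints 3 60 180 3.0
```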
Step 2: dual-butterfly processing. The data of each stage are grouped, and different data groups receive different butterfly processing, as shown in Fig. 1. The first stage has 15 non-zero data groups, which undergo parallel butterfly processing, and 1 zero data group, which undergoes simple butterfly processing, so the first stage takes 15 × 4 × 3/4 + 12 = 57 clocks. The second stage has 12 non-zero data groups and 4 zero data groups and takes 12 × 4 × 3/4 + 12 = 48 clocks. The third stage is the last stage; it consists entirely of zero data groups, all of which undergo simple butterfly processing, so it takes 64/2 + 7 = 39 clocks. The 64-point FFT processor therefore completes one transform in 57 + 48 + 39 = 144 clocks, which at a 60 MHz system clock is 144 / 60 MHz = 2.4 μs.
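The group counts and clock figures of this second step can be reproduced with the sketch below. It assumes the usual radix-4 decimation-in-frequency indexing, in which butterfly i of a sub-transform uses p proportional to i, so that only butterfly 0 of each sub-transform is a zero data group. It also assumes, as in the 64-point example, that all zero-group transfers fit into the idle port clocks. The function names are hypothetical.

```python
def stage_groups(n_points=64):
    """Return (non_zero, zero) data-group counts for each radix-4 DIF stage."""
    result = []
    blocks, size = 1, n_points  # sub-transforms per stage and their length
    while size >= 4:
        per_block = size // 4  # butterflies in each sub-transform
        zero = blocks  # only butterfly 0 of each sub-transform has p = 0
        non_zero = blocks * (per_block - 1)
        result.append((non_zero, zero))
        blocks *= 4
        size //= 4
    return result


def dual_butterfly_clocks(n_points=64, pipeline_len=12, last_pipeline_len=7):
    """Per-stage clock counts under the assumptions stated above."""
    groups = stage_groups(n_points)
    clocks = []
    for non_zero, _zero in groups[:-1]:
        # A non-zero group reads 4 data in 3 clocks; zero groups use the
        # otherwise idle port clocks and add no time (assumption).
        clocks.append(non_zero * 4 * 3 // 4 + pipeline_len)
    # Last stage: only zero groups, two data per clock over both RAM ports.
    clocks.append(n_points // 2 + last_pipeline_len)
    return groups, clocks


groups, clocks = dual_butterfly_clocks()
print(groups)  # prints [(15, 1), (12, 4), (0, 16)]
print(clocks, sum(clocks))  # prints [57, 48, 39] 144
```

Dividing the 144 clocks by the 60 MHz clock rate gives the 2.4 μs quoted in the text.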
Parallel dual-butterfly processing not only meets the 802.11a requirement on the FFT processor comfortably but also speeds up the FFT processor considerably (144 clocks instead of 180). In addition, both parallel butterfly processing and dual-butterfly processing only add a few simple adders to the conventional dual-memory architecture, so the extra hardware cost is small.

Claims (5)

1. A method for determining the structure of a parallel dual-butterfly fast Fourier transform processor, characterized in that the butterfly unit, the core unit of the fast Fourier transform processor, is improved to obtain a parallel dual-butterfly processing method: first, parallel butterfly processing is performed, in which the processing within the butterfly computation that contains only adders and the processing that contains both multipliers and adders are carried out in parallel, giving a parallel butterfly structure; then dual-butterfly processing is performed, in which the inputs of the butterfly unit are grouped according to their corresponding twiddle factors and the input data are assigned to one of two butterfly processing units according to whether their butterfly computation contains a multiplier, giving the parallel dual-butterfly fast Fourier transform processor structure.
2. The method for determining the structure of a parallel dual-butterfly fast Fourier transform processor according to claim 1, characterized in that the parallel butterfly processing is as follows:
the butterfly unit is the core unit of the fast Fourier transform processor; a radix-4 decimation-in-frequency butterfly unit first performs a 4-point discrete Fourier transform and then multiplies the results by the twiddle factors,
$$A_1 = (A+C)+(B+D), \quad B_1 = (A-C)-j(B-D),$$
$$C_1 = (A+C)-(B+D), \quad D_1 = (A-C)+j(B-D) \tag{1}$$
$$A' = A_1, \quad B' = B_1 \times W_N^{p}, \quad C' = C_1 \times W_N^{2p}, \quad D' = D_1 \times W_N^{3p} \tag{2}$$
where $W_N = e^{-j2\pi/N}$, A, B, C, D are the inputs of the butterfly unit, and A', B', C', D' are its outputs; the processing that produces the output A' is carried out in parallel with the processing that produces the other three outputs B', C', D'.
3. The method for determining the structure of a parallel dual-butterfly fast Fourier transform processor according to claim 1 or 2, characterized in that the utilization of hardware resources is improved by introducing pipelining into the parallel butterfly structure, the pipeline having four steps:
Step 1: read data from memory, four complex data being read from the dual-port RAM over two ports in 3 clocks;
Step 2: perform the 4-point discrete Fourier transform, the flow being controlled by a state machine in which state S1 produces $A_1$ and $B_1$, state S2 produces $C_1$, and state S3 produces $D_1$;
Step 3: perform the complex multiplication of each result by its twiddle factor, a complex multiplier being built from 3 real multipliers and 5 real adders, with 3 parallel adders before the 3 parallel real multipliers and 2 parallel adders after them, this step taking 3 clocks in total;
Step 4: write the results to memory, four data being written over two ports in 3 clocks, the whole pipeline being 12 clocks long.
4. The method for determining the structure of a parallel dual-butterfly fast Fourier transform processor according to claim 1, characterized in that the dual-butterfly processing is as follows:
the four inputs of each butterfly computation correspond to one specific value of p; the four input data of a butterfly unit associated with a given p are defined as a butterfly-unit data group; the group associated with p = 0 is defined as a zero data group and all others as non-zero data groups; the butterfly computation of a non-zero data group contains multipliers and uses parallel butterfly processing; the butterfly computation of a zero data group contains only adders and uses simple butterfly processing; parallel butterfly processing and simple butterfly processing are carried out in parallel, giving the parallel dual-butterfly structure.
5. The method for determining the structure of a parallel dual-butterfly fast Fourier transform processor according to claim 1 or 4, characterized in that the dual-butterfly processing has the following steps:
Step 1: input and output data handling; in the parallel butterfly structure each butterfly computation reads 4 data in 3 clocks; within those 3 clocks one port of the dual-port RAM reads 3 data and the other port reads 1 datum, so that the second port is idle for 2 of the 3 clocks, and these two clocks are used to transfer the data of zero data groups for simple butterfly processing;
Step 2: pipelining of the simple butterfly processing unit; a pipeline delay matching that of parallel butterfly processing is introduced into simple butterfly processing, namely 6 clocks to read data from memory, 3 clocks for the simple butterfly computation, and 6 clocks to write data to memory, which keeps the transfer of non-zero and zero data groups synchronized;
Step 3: handling of the last stage; a butterfly-based FFT processor works in several stages; in the last stage all input data are zero data groups and all undergo simple butterfly processing, making full use of both ports of the dual-port RAM to transfer data in parallel, the pipeline length in this stage being 7 clocks.
CN 03141540 2003-07-10 2003-07-10 Method for defining parallel dual butterfly computation fast Fourier transform processor structure Expired - Fee Related CN1259782C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 03141540 CN1259782C (en) 2003-07-10 2003-07-10 Method for defining parallel dual butterfly computation fast Fourier transform processor structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 03141540 CN1259782C (en) 2003-07-10 2003-07-10 Method for defining parallel dual butterfly computation fast Fourier transform processor structure

Publications (2)

Publication Number Publication Date
CN1477802A (en) 2004-02-25
CN1259782C (en) 2006-06-14

Family

ID=34155340

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 03141540 Expired - Fee Related CN1259782C (en) 2003-07-10 2003-07-10 Method for defining parallel dual butterfly computation fast Fourier transform processor structure

Country Status (1)

Country Link
CN (1) CN1259782C (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100390782C (en) * 2005-07-15 2008-05-28 北京大学深圳研究生院 Real-time fast Fourier transform circuit
CN101277283B (en) * 2007-03-28 2010-10-20 中国科学院微电子研究所 Fast Flourier transformation butterfly type unit
CN101617306B (en) * 2005-04-12 2012-02-01 Nxp股份有限公司 Device for Fast fourier transform operation
CN101764778B (en) * 2009-10-09 2012-12-19 重庆唐大科技有限公司 Base band processor and base band processing method

Also Published As

Publication number Publication date
CN1259782C (en) 2006-06-14

Similar Documents

Publication Publication Date Title
CN1109991C (en) Pipelined fast fourier transform processor
Jo et al. New continuous-flow mixed-radix (CFMR) FFT processor using novel in-place strategy
CN101154215B (en) Fast Fourier transform hardware structure based on three cubed 2 frequency domain sampling
CN103970718A (en) Quick Fourier transformation implementation device and method
CN112465110A (en) Hardware accelerator for convolution neural network calculation optimization
Wang et al. Novel memory reference reduction methods for FFT implementations on DSP processors
CN102214159A (en) Method for realizing 3780-point fast Fourier transform/inverse fast Fourier transform (FFT/IFFT) and processor thereof
CN105095152B (en) A kind of 128 configurable point FFT devices
CN1808419A (en) Real-time fast Fourier transform circuit
CN1259782C (en) Method for defining parallel dual butterfly computation fast Fourier transform processor structure
CN102170276B (en) Up-sampling filtering method for ultrasonic signal processing
CN103345379B (en) A kind of complex multiplier and its implementation
CN112559954B (en) FFT algorithm processing method and device based on software-defined reconfigurable processor
CN1348141A (en) Discrete 3780-point Fourier transformation processor system and its structure
CN103262067B (en) A kind of data processing method, data processing equipment and communication system
CN105975436A (en) IP circuit universal in SoC system and capable of being configured with accelerating unit
CN201993753U (en) Radix-4 butterfly unit circuit applied to FFT/IFFT
CN101764778B (en) Base band processor and base band processing method
CN1858998A (en) No multiplication realizing method for digital audio frequency filter
CN108628805A (en) A kind of butterfly processing element and processing method, fft processor of low-power consumption
Tsai et al. Power-efficient continuous-flow memory-based FFT processor for WiMax OFDM mode
CN207529364U (en) A kind of parallel processor array structure
Zhang et al. Accelerating the data shuffle operations for FFT algorithms on SIMD DSPs
Tian et al. Efficient algorithms of FFT butterfly for OFDM systems
CN110069746A (en) A kind of IFFT processing unit applied to point-variable in TD-LTE

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20060614