Summary of the invention
This application discloses a 1536-point FFT processing method and related devices, to solve the problem in the prior art that 1536-point FFT processing cannot ensure high data throughput, high computational precision, and low resource consumption at the same time.
An embodiment of the present application discloses a 1536-point FFT processing method, comprising:
A 512-point FFT processing unit obtains 1536 input samples, converts the 1536 samples to block floating point, divides them into three sample data groups of 512 samples each, performs FFT processing on the 512 samples of each sample data group to obtain a corresponding processed sample data group, and sends the three resulting processed sample data groups to an improved radix-3 unit;
wherein performing the FFT processing includes performing an overflow check;
The improved radix-3 unit receives the three input processed sample data groups and processes three processed samples at a time; after 512 such passes, it obtains and outputs the FFT results of all 1536 samples;
wherein the three processed samples come from the three processed sample data groups respectively.
Preferably, performing FFT processing on the 512 samples of each sample data group to obtain the corresponding processed sample data group comprises:
using a ping-pong RAM to reorder and store the three sample data groups of 512 samples each;
obtaining the three sample data groups stored in the ping-pong RAM and performing S stages of radix-2 butterfly operations on each sample data group;
while performing the butterfly operations stage by stage on one sample data group, performing, according to the application scenario, an overflow check on the first data sequence obtained at every butterfly stage or at only some of the stages, and truncating according to the check result to obtain a corresponding second data sequence;
storing the first data sequence obtained at each of the first S-1 butterfly stages, or the second data sequence obtained through the overflow check, in the dual-port RAM corresponding to that stage; and, after format conversion and reordering of the first data sequence obtained at stage S or of the second data sequence obtained through the overflow check, obtaining the processed sample data group corresponding to the sample data group currently undergoing butterfly processing, the processed sample data group containing the processed sample corresponding to each sample;
wherein the first butterfly stage operates on the sample data in the ping-pong RAM, and the second through S-th butterfly stages each operate on the data sequence input to the preceding stage's dual-port RAM in the preceding clock cycle.
Preferably, performing the overflow check on the first data sequence obtained at every butterfly stage or at only some of the stages, and truncating according to the check result to obtain the corresponding second data sequence, comprises:
while performing the butterfly operations stage by stage on one sample data group, selecting, according to the current application scenario, either to perform the overflow check on the first data sequence obtained at every butterfly stage or to perform it only on the first data sequences obtained at some of the stages;
when the check finds that the first data sequence is two bits wider than the input data sequence of the butterfly operation, truncating the current first data sequence according to the value of the three most significant bits of the output, to obtain the corresponding second data sequence.
Preferably, the process in which the improved radix-3 unit receives the three input processed sample data groups and processes three processed samples at a time until, after 512 passes, it obtains and outputs the FFT results of all 1536 samples comprises:
receiving the three input processed sample data groups and storing them, according to the corresponding address relationship, in three corresponding ping-pong RAMs respectively;
when each ping-pong RAM stores its processed sample data group, writing the processed samples of that group one after another into the two relatively independent storage areas of the ping-pong RAM, wherein two processed samples are written initially and, once the processed sample in one storage area has been read, the next processed sample to be stored overwrites that storage area, and so on until all 512 processed samples have been read;
performing 512 reads of the three ping-pong RAMs, each read fetching three processed samples in total from the three ping-pong RAMs, and performing three-way FFT processing on the three processed samples simultaneously to obtain three FFT results;
wherein the first-way FFT processing uses multipliers and adders to process the three processed samples, based on a twiddle factor and its corresponding fast Fourier transform formula [twiddle factor and formula not reproduced in the source text], to obtain the first FFT result;
the second-way FFT processing uses multipliers and adders to process the three processed samples, based on a twiddle factor and its corresponding fast Fourier transform formula [twiddle factor and formula not reproduced in the source text], to obtain the second FFT result;
the third-way FFT processing uses multipliers and adders to process the three processed samples, based on a twiddle factor and its corresponding fast Fourier transform formula [twiddle factor and formula not reproduced in the source text], to obtain the third FFT result;
storing all 1536 FFT results obtained after the 512 passes in a ping-pong RAM according to address, and then outputting them.
A 1536-point FFT processing apparatus, comprising:
a 512-point FFT processing unit, configured to obtain 1536 input samples, convert the 1536 samples to block floating point, divide them into three sample data groups of 512 samples each, perform FFT processing on the 512 samples of each sample data group to obtain a corresponding processed sample data group, and send the three resulting processed sample data groups to an improved radix-3 processing unit; wherein performing the FFT processing includes performing an overflow check;
an improved radix-3 processing unit, configured to receive the three input processed sample data groups and process three processed samples at a time until, after 512 passes, the FFT results of all 1536 samples are obtained and output; wherein the three processed samples come from the three processed sample data groups respectively.
Preferably, the 512-point FFT processing unit comprises:
a first data format conversion unit, configured to obtain the 1536 input samples, convert their data format to block floating point, and divide them into three sample data groups of 512 samples each;
a ping-pong RAM, configured to reorder and store the three sample data groups of 512 samples each that have been format-converted and divided by the first data format conversion unit;
S butterfly processing units, configured to obtain the three sample data groups stored in the ping-pong RAM and perform S stages of radix-2 butterfly operations on each sample data group;
wherein each butterfly processing unit comprises a radix-2 butterfly unit, an overflow-prevention unit, and a dual-port RAM;
the radix-2 butterfly unit is configured to perform a radix-2 butterfly operation on an input data sequence to obtain a first data sequence; wherein the first butterfly stage takes the sample data in the ping-pong RAM as its input data sequence, and the second through S-th butterfly stages each operate on the data sequence input to the preceding stage's dual-port RAM in the preceding clock cycle;
the overflow-prevention unit is configured, when the application scenario selects an overflow check after this butterfly stage, to obtain the first data sequence produced by the radix-2 butterfly unit at this stage, perform the overflow check on it, and truncate according to the check result to obtain the corresponding second data sequence;
the dual-port RAM is configured to store the first data sequence on which no overflow check was performed, or the second data sequence obtained through the overflow check;
wherein the first data sequence obtained by each of the first S-1 butterfly processing units, or the second data sequence obtained through the overflow check, is stored in the dual-port RAM of the respective butterfly processing unit;
a second data format conversion unit, configured to format-convert and reorder the first data sequence obtained at stage S, or the second data sequence obtained through the overflow check, so as to obtain and output the processed sample data group corresponding to the sample data group currently undergoing butterfly processing, the processed sample data group containing the processed sample corresponding to each sample;
a control module, configured to control the first data format conversion unit, the ping-pong RAM, the butterfly processing units, and the second data format conversion unit to perform the corresponding operations above.
Preferably, the overflow-prevention unit operates as follows:
when an overflow check is selected after this butterfly stage, the overflow-prevention unit obtains the first data sequence produced by the radix-2 butterfly unit at this stage and performs the overflow check on it; when the check finds that the first data sequence is two bits wider than the input data sequence of the butterfly operation, the unit truncates the current first data sequence according to the value of the three most significant bits of the output, to obtain the corresponding second data sequence.
Preferably, the improved radix-3 processing unit comprises:
a write control module, configured to receive the three processed sample data groups and store them, according to the corresponding address relationship, in three corresponding ping-pong RAMs respectively;
three ping-pong RAMs, each having two relatively independent storage areas into which the processed samples of one processed sample data group are written one after another, wherein two processed samples are written initially and, once the processed sample in one storage area has been read, the next processed sample to be stored overwrites that storage area, and so on until all 512 processed samples have been read;
a read control module, configured to perform 512 reads of the three ping-pong RAMs, each read fetching three processed samples in total from the three ping-pong RAMs and feeding them to three-way processing units;
among the three-way processing units, a first-way processing unit, configured to process the three processed samples with multipliers and adders, based on a twiddle factor and its fast Fourier transform formula [twiddle factor and formula not reproduced in the source text], to obtain the first FFT result;
a second-way processing unit, configured to process the three processed samples with multipliers and adders, based on a twiddle factor and its fast Fourier transform formula [twiddle factor and formula not reproduced in the source text], to obtain the second FFT result;
a third-way processing unit, configured to process the three processed samples with multipliers and adders, based on a twiddle factor and its fast Fourier transform formula [twiddle factor and formula not reproduced in the source text], to obtain the third FFT result;
an output unit, configured to store in a ping-pong RAM, according to address, all 1536 FFT results produced by the three-way processing units once the read control module has completed the 512 reads, and to output them.
An FFT processor, comprising the 1536-point FFT processing apparatus described above;
the 1536-point FFT processing apparatus is configured to obtain 1536 input samples through a 512-point FFT processing unit, convert the 1536 samples to block floating point, divide them into three sample data groups of 512 samples each, perform FFT processing on the 512 samples of each sample data group to obtain a corresponding processed sample data group, and send the three resulting processed sample data groups to an improved radix-3 unit; the improved radix-3 unit reads and processes three processed samples at a time until, after 512 passes, the FFT results of all 1536 samples are obtained and output;
wherein performing the FFT processing includes performing an overflow check, and the three processed samples come from the three processed sample data groups respectively.
An OFDM system, comprising the FFT processor described above.
The embodiments of the present application disclose a 1536-point FFT processing method and related devices. The application uses block floating point and pipelining: the 1536 samples are divided into three sample data groups, and FFT processing is then performed on the 512 samples of each group. The critical processing path is split into relatively independent processing paths so that each butterfly stage can be time-multiplexed, and overflow checks are performed, which greatly improves both data processing speed and precision. The radix-3 module is further optimized: the improved radix-3 processing unit processes three processed samples at a time and outputs three FFT results simultaneously, which not only saves storage but also greatly increases data throughput and reduces system power consumption.
Detailed description of the embodiments
LTE: Long Term Evolution;
LTE-A: Long Term Evolution Advanced;
FPGA: Field-Programmable Gate Array;
FFT: Fast Fourier Transform;
IFFT: Inverse Fast Fourier Transform;
DFT: Discrete Fourier Transform;
IDFT: Inverse Discrete Fourier Transform;
IP: Intellectual Property;
IP core: a logic block or data block used in an application-specific integrated circuit (ASIC) or a programmable logic device (FPGA);
3GPP: Third Generation Partnership Project;
OFDM: Orthogonal Frequency Division Multiplexing;
MIMO: Multiple Input Multiple Output;
eNB: evolved NodeB.
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in this application without creative effort fall within the scope of protection of this application.
As noted in the background, existing FFT processor hardware cannot fully meet the demand for high computational precision and low resource consumption while also satisfying the system's high data throughput. This application therefore discloses a new processing scheme for the 1536-point FFT, namely an FPGA-based 1536-point FFT processing method and related devices, which ensures high computational precision and low resource consumption while meeting the system's data throughput. In other words, the balance among high data throughput, computational precision, and resource consumption can be adjusted flexibly for different application scenarios. The main approaches are described in detail in the following embodiments.
Embodiment one
Fig. 1 is a flowchart of the 1536-point FFT processing method disclosed in the present application. The method is based on an FPGA and mainly comprises the following steps:
Step S101, a 512-point FFT processing unit obtains 1536 input samples, converts the 1536 samples to block floating point, divides them into three sample data groups of 512 samples each, performs FFT processing on the 512 samples of each sample data group to obtain a corresponding processed sample data group, and sends the three resulting processed sample data groups to an improved radix-3 unit; wherein performing the FFT processing includes performing an overflow check;
In performing step S101, block floating point and pipelining are used. After the 1536 samples are converted to block floating point, they are divided into three sample data groups of 512 samples each, and FFT processing is then performed on the 512 samples of each group. The critical processing path is split into relatively independent processing paths so that each butterfly stage can be time-multiplexed, and the data (samples) being processed flow between the butterfly stages, which greatly increases the data processing speed. In addition, while FFT processing is performed on the 512 samples of each group, the overflow check can be performed at some or all stages depending on the application scenario, which greatly improves data precision.
Step S102, the improved radix-3 unit receives the three input processed sample data groups and processes three processed samples at a time; after 512 such passes, it obtains and outputs the FFT results of all 1536 samples;
In step S102, the improved radix-3 processing unit performs radix-3 processing on three processed samples at a time, and these three processed samples come from the three processed sample data groups respectively.
In performing step S102, the processed samples of the 1536 samples produced by step S101 are obtained. During processing in the improved radix-3 processing unit, one processed sample is read from each of the three processed sample data groups at a time; that is, each pass reads three processed samples in total, one from each group, performs radix-3 processing on them, and can output three FFT results. After 512 passes over the three processed sample data groups, the FFT results corresponding to all 1536 samples are obtained and output.
With the approach disclosed in Embodiment One, the application uses block floating point and pipelining: the 512-point FFT processing unit divides the 1536 samples into three sample data groups of 512 samples each and then performs FFT processing on the 512 samples of each group. The critical processing path is split into relatively independent processing paths so that each butterfly stage can be time-multiplexed, and overflow checks are performed, which greatly improves both data processing speed and precision. Furthermore, the improved radix-3 processing unit processes three processed samples at a time and can output three FFT results simultaneously, which not only saves storage but also greatly increases data throughput and reduces system power consumption. This approach also allows the balance among high data throughput, computational precision, and resource consumption to be adjusted flexibly for different application scenarios, further optimizing 1536-point FFT processing.
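By way of illustration only, the decomposition of Embodiment One can be sketched in floating-point Python as three power-of-two sub-FFTs followed by one radix-3 pass per output triple. The text does not state how the 1536 samples are assigned to the three groups or give the twiddle factors, so the interleaved grouping `x[r::3]`, the factors `exp(-2j*pi*r*k/N)`, and the function names `fft_radix2`, `fft_3x_pow2`, and `dft` are assumptions of this sketch rather than the disclosed hardware design.

```python
import cmath

def fft_radix2(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fft_3x_pow2(x):
    """FFT of length N = 3 * 2**p: three sub-FFTs, then N/3 radix-3 passes."""
    n = len(x)
    m = n // 3
    # Assumed grouping: group r holds samples x[r], x[r+3], x[r+6], ...
    groups = [fft_radix2(x[r::3]) for r in range(3)]
    w3 = cmath.exp(-2j * cmath.pi / 3)
    out = [0j] * n
    for k in range(m):  # 512 passes when N = 1536
        # One processed sample from each group, weighted by its twiddle factor.
        a = groups[0][k]
        b = groups[1][k] * cmath.exp(-2j * cmath.pi * k / n)
        c = groups[2][k] * cmath.exp(-2j * cmath.pi * 2 * k / n)
        # Each pass yields three FFT results.
        out[k] = a + b + c
        out[k + m] = a + w3 * b + w3 * w3 * c
        out[k + 2 * m] = a + w3 * w3 * b + w3 * c
    return out

def dft(x):
    """Direct DFT, used only as a reference for small lengths."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]
```

Calling `fft_3x_pow2` on a list of 1536 complex values performs 512 radix-3 passes and returns 1536 results; for short lengths such as 48 the output can be compared against `dft`.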
Embodiment two
Based on the FPGA-based 1536-point FFT processing method disclosed in Embodiment One above, step S101 shown in Fig. 1 (dividing the 1536 samples into three sample data groups of 512 samples each, performing FFT processing on the 512 samples of each sample data group to obtain a corresponding processed sample data group, and sending the three resulting processed sample data groups) is implemented as follows, mainly comprising the following steps:
Step S201, obtain the 1536 input samples, convert their data format to block floating point, and divide them into three groups of data with 512 samples each.
Step S202, use a ping-pong RAM to reorder and store the three sample data groups of 512 samples each obtained by the division;
In step S202, the ping-pong RAM stores the three sample data groups obtained in step S201, each containing 512 samples. In practice, the groups are stored and reordered in sequence at the corresponding addresses, and the sample data within each group are sorted in ascending order.
Step S203, obtain the three sample data groups stored in the ping-pong RAM and perform S stages of radix-2 butterfly operations on each sample data group;
Step S204, while performing the butterfly operations stage by stage on one sample data group, perform, according to the application scenario, an overflow check on the first data sequence obtained at every butterfly stage or at only some of the stages, and truncate according to the check result to obtain the corresponding second data sequence;
In step S204, the overflow check on the first data sequence need not be performed at every stage. Depending on the current application scenario, the stages after which the first data sequence is checked can be selected flexibly. In this way the balance among high data throughput, computational precision, and resource consumption can be adjusted for different application scenarios, further optimizing 1536-point FFT processing.
Step S205, store the first data sequence obtained at each of the first S-1 butterfly stages and/or the second data sequence obtained through the overflow check in the dual-port RAM corresponding to that stage; after format conversion and reordering of the first data sequence obtained at stage S or of the second data sequence obtained through the overflow check, obtain the processed sample data group corresponding to the sample data group currently undergoing butterfly processing, the processed sample data group containing the processed sample corresponding to each sample;
wherein the first butterfly stage operates on the sample data in the ping-pong RAM, and the second through S-th butterfly stages each operate on the data sequence input to the preceding stage's dual-port RAM in the preceding clock cycle;
In step S205, for each of the first S-1 butterfly stages, either the first data sequence obtained without an overflow check or the second data sequence obtained through the overflow check is stored in the dual-port RAM corresponding to that stage.
Step S206, send the three processed sample data groups obtained after the butterfly operations.
The process of steps S203 to S206 is described below, taking the butterfly operations on one sample data group as an example:
First, the first-stage radix-2 butterfly operation is performed with the sample data of one sample data group in the ping-pong RAM as input; the butterfly operator for the radix-2 butterfly operation is shown in Fig. 2. Next, if the current application scenario calls for an overflow check at this stage, the first data sequence obtained from the first-stage radix-2 butterfly operation is checked for overflow and truncated according to the check result, yielding the second data sequence. The second data sequence is then stored in the dual-port RAM corresponding to the first-stage radix-2 butterfly operation. The second-stage radix-2 butterfly operation is then performed on the data sequence input in the preceding clock cycle to the dual-port RAM corresponding to the first stage. Likewise, after the second-stage radix-2 butterfly operation, if the current application scenario does not call for an overflow check, the first data sequence obtained from the second stage is stored in the dual-port RAM corresponding to the second stage for use by the subsequent third-stage radix-2 butterfly operation.
And so on up to the S-th radix-2 butterfly stage; the first data sequence of stage S on which no overflow check was performed, or the second data sequence obtained through the overflow check, is format-converted and then reordered.
After the butterfly operations and overflow checks above have been performed on every sample in the three sample data groups, the three resulting processed sample data groups are output.
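As a non-limiting sketch, the stage-by-stage flow of steps S203 to S206 can be modeled in Python with a set that names the stages followed by an overflow check. Several details are assumptions of this sketch and are not taken from the text: the bit-reversed input order standing in for the reordering in the ping-pong RAM, the single shared block exponent, the rounding of twiddle products to integers, and the names `bit_reverse` and `staged_fft`. Stages that are not checked simply keep wider words here.

```python
import cmath

def bit_reverse(seq):
    """Reorder seq (length a power of two, at least 2) into bit-reversed index order."""
    bits = len(seq).bit_length() - 1
    return [seq[int(format(i, "0{}b".format(bits))[::-1], 2)]
            for i in range(len(seq))]

def staged_fft(samples, m, check_stages):
    """Radix-2 FFT over integer (re, im) pairs with a shared block exponent.

    check_stages holds the 1-based stage numbers followed by an overflow
    check; after a checked stage every component fits in m signed bits.
    Returns (data, exponent) with true_result ~= data * 2**exponent.
    """
    data = bit_reverse([complex(re, im) for re, im in samples])
    n = len(data)
    limit = 1 << (m - 1)
    exponent = 0
    for stage in range(1, n.bit_length()):
        half = 1 << (stage - 1)
        for start in range(0, n, 2 * half):
            for j in range(half):
                w = cmath.exp(-2j * cmath.pi * j / (2 * half))
                t = w * data[start + j + half]
                t = complex(round(t.real), round(t.imag))  # keep integer-valued words
                a = data[start + j]
                data[start + j] = a + t          # first data sequence
                data[start + j + half] = a - t
        if stage in check_stages:
            # Overflow check: find how many low bits must be dropped.
            peak = max(max(abs(v.real), abs(v.imag)) for v in data)
            shift = 0
            while peak / (1 << shift) >= limit:
                shift += 1
            # Truncation: second data sequence, tracked by the block exponent.
            data = [complex(int(v.real) >> shift, int(v.imag) >> shift)
                    for v in data]
            exponent += shift
    return data, exponent
```

In this sketch, passing every stage number keeps all stored words within `m` bits at the cost of truncation error, while passing an empty set keeps full precision and lets the word width grow, which mirrors the precision and resource trade-off described above.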
It should be noted that "first" and "second" are used herein only to distinguish the operations the data have undergone. For example, every data sequence obtained from a radix-2 butterfly operation is called a "first data sequence" in this application, but this does not mean that the first data sequences obtained at different stages are identical; the same applies to "second data sequence". The terms merely distinguish a sequence obtained from a butterfly operation from one obtained after an overflow check.
In performing the butterfly operations above, specifically, while the butterfly operations are performed stage by stage on one sample data group, whether to perform the overflow check on the first data sequence obtained at each butterfly stage is selected according to the application scenario, and truncation is performed according to the check result to obtain the corresponding second data sequence. The detailed process comprises:
while performing the butterfly operations stage by stage on one sample data group, selecting, according to the current application scenario, either to perform the overflow check on the first data sequence obtained at every butterfly stage or to perform it only on the first data sequences obtained at some of the stages;
when the check finds that the first data sequence is two bits wider than the input data sequence of the butterfly operation, truncating the current first data sequence according to the value of the three most significant bits of the output, to obtain the corresponding second data sequence.
In the butterfly process above, data flow between the RAMs and the butterfly operations of the successive stages. Weighing data precision against resource consumption, the embodiment performs the overflow check and truncation on the data produced by the radix-2 butterfly operation as required by the application scenario. As shown in Fig. 2, for signed input data, the maximum value after butterfly processing is less than 4 times the value before processing, so the application applies truncation after the overflow check.
For example, if the input data are M bits wide, they become M+2 bits wide after the butterfly operation. According to the value of the three most significant output bits (M+1:M-1), namely 000/111, 001/110, or 01x/10x (x denotes 0 or 1), the input data of the next butterfly stage are taken as bits M-1:0 (no overflow), bits M:1 (one-bit overflow), or bits M+1:2 (two-bit overflow), respectively.
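The three-bit rule can be illustrated with a small Python function operating on one signed word. The function name `truncate_after_butterfly` and the returned `shift` value are hypothetical, and the sketch assumes that M bits are kept in every case, so the two-bit overflow case keeps bits M+1:2. How a shift is shared across a block of samples is not specified in the text and is left out here.

```python
def truncate_after_butterfly(value, m):
    """Reduce a signed (m+2)-bit butterfly output to m bits.

    Returns (truncated_value, shift), where shift is the number of
    low-order bits dropped (0, 1 or 2).
    """
    width = m + 2
    raw = value & ((1 << width) - 1)   # two's-complement bit pattern
    top3 = raw >> (m - 1)              # bits m+1 .. m-1
    if top3 in (0b000, 0b111):
        shift = 0                      # no overflow: keep bits m-1 .. 0
    elif top3 in (0b001, 0b110):
        shift = 1                      # one-bit overflow: keep bits m .. 1
    else:
        shift = 2                      # 01x / 10x: keep bits m+1 .. 2
    kept = (raw >> shift) & ((1 << m) - 1)
    if kept >= 1 << (m - 1):           # reinterpret the m-bit field as signed
        kept -= 1 << m
    return kept, shift
```

For instance, with `m = 8` the value 200 does not fit in 8 signed bits; its top three bits are 001, so one low-order bit is dropped and the function returns `(100, 1)`.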
In the 512-point FFT processing within the 1536-point FFT processing disclosed in the above embodiment, block floating point and pipelining are used: the 1536 samples are divided into three sample data groups of 512 samples each, and FFT processing is then performed on the 512 samples of each group. The critical processing path is split into relatively independent processing paths so that each butterfly stage can be time-multiplexed, and overflow checks are performed, which greatly improves both data processing speed and precision. In addition, storing data in a ping-pong RAM combined with pipelining further increases data throughput. Moreover, because the radix-2 butterfly operation used is an in-place (same-address) operation, using dual-port RAM for storage between butterfly stages further saves chip resources. In the subsequent radix-3 processing, three processed samples are processed at a time and three FFT results can be output simultaneously, which not only saves storage but also greatly increases data throughput and reduces system power consumption. This approach also allows the balance among high data throughput, computational precision, and resource consumption to be adjusted flexibly for different application scenarios, further optimizing 1536-point FFT processing.
Embodiment three
This embodiment is based on the 1536-point FFT processing method disclosed in Embodiments one and two of the present application, which is implemented on an FPGA. Step S102 shown in Figure 1, in which the subsequent radix-3 unit receives the three input processed sample data groups and processes three processed sample data items each time until 512 iterations are complete, then obtains and outputs the FFT results of all 1536 samples, is implemented as follows and mainly comprises the following steps:
Step S301: receive the three input processed sample data groups and, according to the corresponding address relationship, store the three groups in three corresponding ping-pong RAMs, respectively;
Step S302: when each ping-pong RAM stores its corresponding processed sample data group, the processed sample data items of that group are written in sequence into the two relatively independent storage areas of the ping-pong RAM;
In step S302, for a given ping-pong RAM, two processed sample data items are stored initially; after the item in one storage area has been read, the next item to be stored overwrites that storage area, and so on until all 512 processed sample data items have been read;
Specifically, for a processed sample data group comprising 512 processed sample data items, according to the address relationship, the first item is stored in the first storage area of the corresponding ping-pong RAM and the second item in the second storage area. After the first item has been read (that is, processed), the next item to be stored, namely the third item, overwrites the area holding the first item; in other words, it is stored in the first storage area of this ping-pong RAM. After the second item has been read (processed), the next item to be stored, namely the fourth item, overwrites the area holding the second item; in other words, it is stored in the second storage area of this ping-pong RAM. This continues until all 512 processed sample data items have been read (processed).
By using ping-pong RAM, the embodiment of the present application distributes the input data stream across two relatively independent storage areas; by alternating reads and writes between the two areas, the data can be transferred as a continuous stream.
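The alternating read and overwrite pattern of one ping-pong RAM can be sketched in Python as follows. This is a behavioural illustration only, not a hardware description: the function name `ping_pong_stream` and the write log are hypothetical, and each storage area is modelled as holding a single item, as in the description above.

```python
_EMPTY = object()  # marker for a storage area that holds no unread item


def ping_pong_stream(items):
    """Pass items through two storage areas; return (read order, write log)."""
    source = iter(items)
    areas = [_EMPTY, _EMPTY]   # the two relatively independent storage areas
    writes = []                # (area index, item) for every store
    reads = []                 # items in the order they are read

    def write(area):
        item = next(source, _EMPTY)
        areas[area] = item
        if item is not _EMPTY:
            writes.append((area, item))

    write(0)                   # first item goes to the first area
    write(1)                   # second item goes to the second area
    area = 0
    while areas[area] is not _EMPTY:
        reads.append(areas[area])  # read (process) the stored item
        write(area)                # the next item overwrites the area just read
        area ^= 1                  # switch to the other area
    return reads, writes
```

Calling `ping_pong_stream(range(512))` returns the 512 items in their original order, and the write log shows the third item landing in the first area and the fourth item in the second area, matching the sequence described above.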
Step S303: perform 512 reads on the three ping-pong RAMs, each time reading three processed sample data items in total from the three ping-pong RAMs; process them based on the twiddle factor and the corresponding fast Fourier transform formulas (1), (2) and (3) respectively, and then combine the results through multipliers and adders to obtain three FFT results;
In step S303, fast Fourier transform formulas (1), (2) and (3) are obtained based on the twiddle factor, specifically:
With k = 0, 1, 2, ..., 511, formula (2) is:
Similarly, formula (3) is:
where N = 1536 in formulas (1), (2) and (3).
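For reference, a standard radix-3 decimation-in-time decomposition that is consistent with the description of step S303 can be written as below. It is a sketch under stated assumptions and not a reproduction of formulas (1) to (3): it assumes that the three groups are formed by taking every third input sample, and the symbols F_0, F_1, F_2 (the three 512-point FFT outputs) and W_3 are introduced here only for illustration.

```latex
\begin{aligned}
F_r(k)    &= \sum_{n=0}^{511} x(3n+r)\, W_{512}^{nk}, \qquad r = 0, 1, 2,\\
X(k)      &= F_0(k) + W_N^{k}\, F_1(k) + W_N^{2k}\, F_2(k),\\
X(k+512)  &= F_0(k) + W_3\, W_N^{k}\, F_1(k) + W_3^{2}\, W_N^{2k}\, F_2(k),\\
X(k+1024) &= F_0(k) + W_3^{2}\, W_N^{k}\, F_1(k) + W_3\, W_N^{2k}\, F_2(k),
\end{aligned}
\qquad
W_N = e^{-j 2\pi / N},\quad W_3 = e^{-j 2\pi / 3} = W_N^{512},\quad N = 1536,\quad k = 0, 1, \ldots, 511.
```

Under these assumptions the twiddle factors are indexed only by k = 0, 1, ..., 511, and each value of k yields three outputs from one triple F_0(k), F_1(k), F_2(k).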
Step S304: after the 512 iterations are complete, store all 1536 FFT results in a ping-pong RAM according to address and then output them.
Note that in step S304 the FFT results need to be reordered before they are output.
It follows from step S303 that the 1536 samples output by the embodiment of the present application require only one set of twiddle factors, with k taking 512 values. While the 512 reads of the three ping-pong RAMs are being completed, three processed sample data items in total are read from the three ping-pong RAMs each time, and three FFT paths process these three items simultaneously to obtain three FFT results.
The first FFT path processes the three processed sample data items, based on the twiddle factor and the corresponding fast Fourier transform formula, through multipliers and adders to obtain the first FFT result;
The second FFT path processes the three processed sample data items, based on the twiddle factor and the corresponding fast Fourier transform formula, through multipliers and adders to obtain the second FFT result;
The third FFT path processes the three processed sample data items, based on the twiddle factor and the corresponding fast Fourier transform formula, through multipliers and adders to obtain the third FFT result;
Processing the samples with fast Fourier transform formulas (1), (2) and (3) disclosed above not only saves memory but also allows three FFT results (three data streams) to be output simultaneously according to the three transform formulas, which greatly increases the data output rate and reduces system power consumption.
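The three-path combination can be illustrated with a small Python sketch that produces three outputs per index k and checks them against a direct DFT. It uses a standard radix-3 decimation-in-time combination as a stand-in; this is an assumption and may differ in form from formulas (1) to (3). The function names are hypothetical, a direct DFT replaces the 512-point radix-2 FFT for brevity, and the code works for any length that is a multiple of 3 so that it can be checked quickly on short inputs.

```python
import cmath


def dft(x):
    """Direct DFT, used here as a stand-in for the 512-point FFT stage."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]


def radix3_combine(f0, f1, f2):
    """Combine three M-point spectra into one 3M-point spectrum."""
    m = len(f0)
    n = 3 * m
    w3 = cmath.exp(-2j * cmath.pi / 3)           # cube root of unity
    out = [0j] * n
    for k in range(m):                           # one triple is read per k
        w = cmath.exp(-2j * cmath.pi * k / n)    # twiddle factor for index k
        b = w * f1[k]
        c = w * w * f2[k]
        out[k] = f0[k] + b + c                       # first path
        out[k + m] = f0[k] + w3 * b + w3 * w3 * c    # second path
        out[k + 2 * m] = f0[k] + w3 * w3 * b + w3 * c  # third path
    return out


def fft_3m(x):
    """3M-point transform: split into three groups, transform, then combine."""
    f0, f1, f2 = (dft(x[r::3]) for r in range(3))  # assumed grouping: every third sample
    return radix3_combine(f0, f1, f2)
```

With a 1536-sample input, `radix3_combine` runs its loop 512 times and writes three results per iteration, which mirrors the 512 reads and three simultaneous outputs described above. The sketch writes results directly to their final positions, so the separate reordering step is not modelled.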
Embodiment four
Corresponding to the 1536-point FFT processing method disclosed in Embodiments one to three of the present application, the embodiment of the present application also discloses a 1536-point FFT processing apparatus, which is likewise implemented on an FPGA and whose structure is described below.
Figure 3 is a schematic structural diagram of the 1536-point FFT processing apparatus disclosed in Embodiment four of the present application, which mainly comprises:
A 512-point FFT processing unit 1, configured to obtain the 1536 input samples, convert the 1536 samples to block floating point, divide them into three sample data groups of 512 samples each, perform FFT processing on the 512 samples of each sample data group to obtain the corresponding processed sample data group, and send the three resulting processed sample data groups to the subsequent radix-3 processing unit; wherein the FFT processing includes performing an overflow check;
A subsequent radix-3 processing unit 2, configured to receive the three input processed sample data groups and perform radix-3 processing on three processed sample data items each time until 512 iterations are complete, then obtain and output the FFT results of all 1536 samples; wherein the three processed sample data items come from the three processed sample data groups, one from each.
The specific operation of the 512-point FFT processing unit 1 and the subsequent radix-3 processing unit 2 corresponds to Embodiment one of the present application; the descriptions may be cross-referenced and are not repeated here.
The structure of the 512-point FFT processing unit 1 disclosed above is shown in Figure 4 and mainly comprises:
A first data format conversion unit 11, configured to obtain the 1536 input samples, convert the data format of the 1536 samples to block floating point, and divide the 1536 samples into three sample data groups of 512 samples each;
A ping-pong RAM (labelled 12 in the figure), configured to rearrange and store the three sample data groups of 512 samples each that result from the format conversion and division performed by the first data format conversion unit 11;
S butterfly operation units, configured to obtain the three sample data groups stored in the ping-pong RAM and perform S stages of radix-2 butterfly operations on each sample data group,
wherein each butterfly operation unit comprises a radix-2 butterfly unit 13, an anti-overflow unit 14 and a dual-port RAM (labelled 15 in the figure);
The radix-2 butterfly unit 13 is configured to perform a radix-2 butterfly operation on an input data sequence to obtain a first data sequence; wherein the first butterfly stage operates on the sample data in the ping-pong RAM as its input data sequence; the second to (S-1)-th butterfly stages operate, in one clock cycle, on the input data sequence from the dual-port RAM of the preceding stage and output a data sequence to the dual-port RAM of the next stage in the following clock cycle; and the S-th butterfly stage operates, in one clock cycle, on the input data sequence from the dual-port RAM of the preceding stage;
Note that the core of the radix-2 butterfly unit 13 consists of multipliers. However, from relation (5):
A + B*j = (a + b*j) * (c + d*j)
it follows that:
A = ac - bd = (c - d)*b + (a - b)*c
B = ad + bc = (c + d)*a - (a - b)*c        (5)
Since the product (a - b)*c is shared by both expressions, the radix-2 butterfly unit 13 can eliminate one multiplier at the cost of additional adders, which saves multiplier resources inside the chip and further increases the speed of the butterfly operation.
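The multiplier saving can be checked with a short Python sketch: a complex product normally needs four real multiplications, whereas sharing the term (a - b)*c brings this down to three. The function name `complex_mul_3` is hypothetical and integers stand in for fixed-point values.

```python
def complex_mul_3(a, b, c, d):
    """Compute (a + b*j) * (c + d*j) with three real multiplications."""
    shared = (a - b) * c           # multiplication 1, reused by both parts
    real = (c - d) * b + shared    # multiplication 2: equals a*c - b*d
    imag = (c + d) * a - shared    # multiplication 3: equals a*d + b*c
    return real, imag
```

For example, `complex_mul_3(3, 4, 5, 6)` returns `(-9, 38)`, which agrees with (3 + 4j) * (5 + 6j) = -9 + 38j.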
The anti-overflow unit 14 is configured, when the application scenario selects an overflow check after the butterfly operation of this stage, to obtain the first data sequence produced by the radix-2 butterfly unit in this stage, perform the overflow check on it, and perform a truncation operation according to the check result to obtain the corresponding second data sequence;
Note that the anti-overflow unit 14 is specifically configured, when an overflow check is selected after the butterfly operation of this stage, to obtain the first data sequence produced by the radix-2 butterfly unit in this stage and perform the overflow check on it; when the check finds that the bit width of the first data sequence is two bits greater than that of the input data sequence of the butterfly operation, the unit truncates the current first data sequence according to the value of the top three output bits to obtain the corresponding second data sequence.
The dual-port RAM (labelled 15 in the figure) is configured to store the first data sequence, on which no overflow check has been performed, or the second data sequence obtained through the overflow check; wherein the first data sequences obtained in the first S-1 butterfly operation units, or the second data sequences obtained through the overflow check, are stored in the dual-port RAM of the respective butterfly operation unit;
A second data format conversion unit 16, configured to perform data format conversion and reordering on the data of the first data sequence obtained in the S-th butterfly stage, or of the second data sequence obtained through the overflow check, and then to obtain and output the processed sample data group corresponding to the sample data group currently undergoing butterfly processing, the processed sample data group comprising the processed sample data item corresponding to each sample;
A control module (not shown in full in the figure; only the twiddle factor unit 17 and the address generator 18 involved are shown), configured to control, together with the twiddle factor unit 17 and the address generator 18, the first data format conversion unit, the ping-pong RAM, the butterfly operation units and the second data format conversion unit to perform the corresponding operations described above.
For the operation of each unit in the 512-point FFT processing unit 1, see the related description in Embodiment two of the present application. The two are consistent, so the details are not repeated here.
In the 512-point FFT processing within the 1536-point FFT processing disclosed above, block floating point and pipelining are used: the 1536 samples are divided into three sample data groups of 512 samples each, and FFT processing is then performed on the 512 samples of each group. The critical processing path is split into relatively independent processing paths, so that each butterfly stage can be time-multiplexed, and an overflow check is performed, which considerably improves both the processing speed and the precision of the data. In addition, ping-pong RAM storage combined with pipelining further increases data throughput. Moreover, because the radix-2 butterfly operation used is a same-address (in-place) operation, using dual-port RAM for storage between butterfly stages further saves chip resources.
The structure of the subsequent radix-3 processing unit 2 disclosed above is shown in Figure 5 and mainly comprises:
A write control module 21, configured to receive the three processed sample data groups and store them, according to the corresponding address relationship, in three corresponding ping-pong RAMs respectively, the three ping-pong RAMs forming a storage unit 22;
Each of the three ping-pong RAMs has two relatively independent storage areas, into which the processed sample data items of one processed sample data group are written in sequence; two items are stored initially, and after the item in one storage area has been read, the next item to be stored overwrites that storage area, and so on until all 512 processed sample data items have been read;
Note that the storage areas of a ping-pong RAM may be built from multiple ICs with independent memory banks, which gives greater flexibility in structure, speed, capacity and so on.
A read control module 23, configured to perform 512 reads on the three ping-pong RAMs, each time reading three processed sample data items in total from the three ping-pong RAMs and inputting them to a three-path processing unit 24;
In the three-path processing unit 24, the first path processing unit is configured to process the three processed sample data items, based on the twiddle factor and the corresponding fast Fourier transform formula, through multipliers and adders to obtain the first FFT result;
The second path processing unit is configured to process the three processed sample data items, based on the twiddle factor and the corresponding fast Fourier transform formula, through multipliers and adders to obtain the second FFT result;
The third path processing unit is configured to process the three processed sample data items, based on the twiddle factor and the corresponding fast Fourier transform formula, through multipliers and adders to obtain the third FFT result;
An output unit (not shown), configured to store in a ping-pong RAM, according to address, all 1536 FFT results obtained through the three-path processing unit after the read control module has completed 512 reads, and to output them after reordering.
For the operation of each unit in the subsequent radix-3 processing unit 2, see the related description in Embodiment three of the present application. The two are consistent, so the details are not repeated here.
In the subsequent radix-3 processing unit, three processed sample data items are processed each time as described above and three FFT results can be output simultaneously, which not only saves storage but also greatly increases data throughput and reduces system power consumption. This approach also allows a flexible trade-off among high data throughput, computational precision and resource consumption according to the application scenario, further optimizing the 1536-point FFT processing.
Likewise, based on the 1536-point FFT processing apparatus disclosed in the above embodiment, the present application also discloses an FFT processor comprising the 1536-point FFT processing apparatus, and an OFDM system comprising the FFT processor.
Specifically, in the 1536-point FFT processing apparatus, the 512-point FFT processing unit obtains the 1536 input samples, converts them to block floating point, divides them into three sample data groups of 512 samples each, performs FFT processing on the 512 samples of each sample data group to obtain the corresponding processed sample data group, and sends the three resulting processed sample data groups to the subsequent radix-3 unit; the subsequent radix-3 unit reads and processes three processed sample data items each time until 512 iterations are complete, then obtains and outputs the FFT results of all 1536 samples;
wherein an overflow check is performed during the FFT processing, and the three processed sample data items come from the three processed sample data groups, one from each.
In summary, the embodiment of the present application uses block floating point and pipelining: the 1536 samples are divided into three sample data groups of 512 samples each, and FFT processing is then performed on the 512 samples of each group. The critical processing path is split into relatively independent processing paths, so that each butterfly stage can be time-multiplexed, and an overflow check is performed, which considerably improves both the processing speed and the precision of the data. Furthermore, the subsequent radix-3 processing unit processes three processed sample data items each time and can output three FFT results simultaneously, which not only saves storage but also greatly increases data throughput and reduces system power consumption. A flexible trade-off among high data throughput, computational precision and resource consumption can also be made according to the application scenario.
The embodiments in this specification are described progressively; each embodiment focuses on its differences from the others, and identical or similar parts may be cross-referenced. Since the system disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, see the description of the method.
Specific examples are used herein to explain the principles and implementations of the present application; the description of the above embodiments is only intended to help in understanding the method of the present application and its core idea. Meanwhile, a person of ordinary skill in the art may, in accordance with the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.