CN113569190B - Fast Fourier transform twiddle factor computing system and method - Google Patents
- Publication number
- CN113569190B (application CN202110751901.4A)
- Authority
- CN
- China
- Prior art keywords
- memory
- twiddle
- datablock
- fast fourier
- twiddle factor
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
The invention discloses a fast Fourier transform twiddle factor computing system and method in the field of communication systems, aiming to solve the technical problem that, under different fftSize, the same memory is prone to read-write conflicts within the same clock cycle (clk). In the fast Fourier transform twiddle factor calculation method, the memory is divided into a plurality of blocks by a memory division method, different datablocks are allocated to different memory blocks, and the twiddle factors needed for the butterfly operation on the datablocks are calculated by a twiddle factor parallel construction method; the memory is a device for temporarily storing the data of each stage, and a datablock is a data block. The invention provides division schemes that satisfy the FFT scenarios of the various fftSize of an NR system, provides the corresponding read-write methods, and verifies them in various scenarios, so that memory read-write conflicts are completely avoided; in addition, only one set of twiddle factors is stored, and a plurality of twiddle factors can be constructed in parallel with the help of an auxiliary small table, which saves storage space while meeting the requirements of a parallel algorithm.
Description
Technical Field
The invention relates to a fast Fourier transform twiddle factor computing system and a method, belonging to the technical field of communication systems.
Background
The FFT (fast Fourier transform) is an efficient algorithm for computing the DFT and is widely used in digital signal processing systems, for example for analyzing signal spectrum characteristics and, in 5G wireless communication systems, for transform precoding, OFDM modulation, PRACH waveform generation, and processing at the receiving end. It takes N input values at a time and outputs N transformed values, where N is the number of points, called the FFT size. For example, a 5G system needs to support all FFTs with N = 12 × rbNum, where 0 < rbNum ≤ 273, and FFTs as large as N = 48 × 4096 may even be used in the PRACH module.
The 5G system needs to support a high data throughput rate, so the throughput rate of the FFT module is an important design index. When the platform clock frequency is limited, the throughput rate has to be improved by parallel computing or pipelining. In addition, because multiple values of N must be supported, some of them large, a hardware implementation should store the data and the twiddle factors in memory rather than in registers. Moreover, the Cooley-Tukey algorithm has inherent data dependencies and a corresponding addressing pattern: parallel computation needs to read and write several data items in the same clock cycle (clk), whereas a single memory can be read or written only once per clk. This is the difficulty in implementing a parallel algorithm.
In the prior art, under different N = fftSize (i.e., the amount of input or output data of the FFT), the same memory is prone to read-write conflicts within the same clk. For this reason, a fast Fourier transform twiddle factor computing system and method are proposed.
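To make the conflict concrete, the following minimal sketch models a memory split into single-port banks and counts how many of the addresses read by one butterfly fall into the same bank. It is an illustration only, not the scheduling algorithm of the invention: the function names, the naive mapping `addr % num_banks`, and the stride-based addressing are all assumptions introduced here.

```python
from collections import Counter


def butterfly_addresses(base, stride, radix):
    """Addresses read by one radix-`radix` butterfly whose inputs are `stride` apart."""
    return [base + i * stride for i in range(radix)]


def max_bank_load(addresses, num_banks):
    """Largest number of addresses mapped to one bank under the naive addr % num_banks rule."""
    load = Counter(addr % num_banks for addr in addresses)
    return max(load.values())


# A radix-4 butterfly with stride 4 on a 4-bank memory: every input maps to bank 0,
# so the four reads cannot be served in one clk.
conflicting = butterfly_addresses(base=0, stride=4, radix=4)
print(conflicting, max_bank_load(conflicting, num_banks=4))

# The same butterfly with stride 1 spreads over all four banks.
spread = butterfly_addresses(base=0, stride=1, radix=4)
print(spread, max_bank_load(spread, num_banks=4))
```

A bank load greater than 1 means the accesses would need more than one clk, which is the situation a division scheme has to rule out for every stage and every fftSize.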
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a fast Fourier transform twiddle factor computing system and method, which solve the technical problem that, under different fftSize, the same memory is prone to read-write conflicts within the same clk.
In order to achieve the above purpose, the invention is realized by adopting the following technical scheme:
In a first aspect, the present invention provides a fast Fourier transform twiddle factor calculation method, in which a memory is divided into a plurality of blocks by a memory division method, different datablocks are allocated to different memory blocks, and the twiddle factors required for performing the butterfly operation on the datablocks are calculated by a twiddle factor parallel construction method;
The memory is a device for temporarily storing the data of each stage, and a datablock is a data block.
Further, the memory division method includes:
According to the read-write data rule determined by the scheduling algorithm, a computer search program is written to search for a memory division scheme, and the memory is divided into a plurality of blocks according to that scheme, the blocks providing the allocation space for the different datablocks.
Further, the twiddle factor parallel construction method comprises the following steps:
Storing twiddle factors to form a large table, introducing an auxiliary small table, and calculating a plurality of twiddle factors by using the large table and the small table together; wherein the large table stores part of the twiddle factors of the twiddle factor table, and its size is proportional to fftSize; the small table stores the phase interval rule of the twiddle factor table under a specific scheduling algorithm.
In a second aspect, the present invention provides a fast Fourier transform twiddle factor computing system, said system comprising:
A datablock allocation module: a memory partitioning module used for partitioning the memory into a plurality of partitions and allocating different datablocks to different memory partitions;
A twiddle factor calculation module: a twiddle factor parallel construction module used for calculating the twiddle factors required for performing the butterfly operation on the datablocks.
Furthermore, the memory partitioning module is configured to search for a memory partitioning scheme using a computer search program written according to the read-write data rule determined by the scheduling algorithm, and to partition the memory into a plurality of partitions according to that scheme, the partitions providing the allocation space for the different datablocks.
Further, the twiddle factor parallel construction module is used for storing twiddle factors to form a large table, introducing an auxiliary small table, and calculating a plurality of twiddle factors by using the large table and the small table together; wherein the large table stores part of the twiddle factors of the twiddle factor table, and its size is proportional to fftSize; the small table stores the phase interval rule of the twiddle factor table under a specific scheduling algorithm.
In a third aspect, the present invention provides a fast Fourier transform twiddle factor calculation apparatus, comprising a processor and a storage medium;
The storage medium is used for storing instructions;
The processor is operative according to the instructions to perform the steps of the fast fourier transform twiddle factor calculation method according to any of the above.
In a fourth aspect, the present invention provides a computer readable storage medium having stored thereon a computer program, characterized in that the program when executed by a processor implements the steps of the fast fourier transform twiddle factor calculation method according to any of the above.
Compared with the prior art, the invention has the beneficial effects that:
According to the internal read-write rules of the corresponding scheduling algorithm, a search program for the memory division scheme is written and an exhaustive search is performed, so that the number of blocks is as small as possible while guaranteeing that no read-write conflict occurs in any scenario. A division scheme that satisfies the FFT scenarios of the various fftSize of an NR system was successfully found, the corresponding read-write method is provided, and verification was carried out in various scenarios, so that memory read-write conflicts can be completely avoided. In addition, only one set of twiddle factors is stored (a large table whose size is proportional to fftSize), and a plurality of twiddle factors can be constructed in parallel with the help of an auxiliary small table (this table depends only on the numbers of butterfly points of two adjacent stages and is independent of fftSize, so its storage cost is very low), which saves storage space while meeting the requirements of a parallel algorithm.
Drawings
Fig. 1 is a diagram illustrating datablock allocation according to a first embodiment of the present invention.
Detailed Description
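The Cooley-Tukey algorithm is an FFT algorithm that decomposes an FFT with a very large N into a combination of several shorter FFTs. For example, for $N = N_1 N_2$, the DFT $X[k] = \sum_{n=0}^{N-1} x[n] W_N^{nk}$, with $W_N = e^{-j 2\pi / N}$, can be decomposed into:
$$X[N_1 k_2 + k_1] = \sum_{n_2=0}^{N_2-1} \left[ \left( \sum_{n_1=0}^{N_1-1} x[N_2 n_1 + n_2] \, W_{N_1}^{n_1 k_1} \right) W_N^{n_2 k_1} \right] W_{N_2}^{n_2 k_2}$$
where $0 \le n_1 \le N_1 - 1$; $0 \le n_2 \le N_2 - 1$; $n = N_2 n_1 + n_2$; $k = N_1 k_2 + k_1$.
Similarly, if $N = N_1 N_2 N_3$, the same decomposition is applied twice, giving three stages of short DFTs with a twiddle factor multiplication between adjacent stages.
Here $W_N^{n_2 k_1}$ is a twiddle factor; twiddle factors are typically generated offline and pre-stored in a hardware implementation.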
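The two-factor decomposition with the index mapping n = N2·n1 + n2, k = N1·k2 + k1 can be checked numerically. The sketch below is a plain software illustration under the usual forward-DFT sign convention W_N = exp(-2πj/N), which is an assumption here; the function names are hypothetical and nothing in it reflects the hardware scheduling of the invention.

```python
import cmath


def dft(x):
    """Direct O(N^2) DFT using W_N = exp(-2j*pi/N) (sign convention assumed)."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def cooley_tukey_two_factor(x, n1_len, n2_len):
    """DFT of length N = N1*N2 with n = N2*n1 + n2 and k = N1*k2 + k1."""
    n_pts = n1_len * n2_len
    assert len(x) == n_pts
    # Stage 1: for each n2, a length-N1 DFT over n1, followed by the twiddle W_N^(n2*k1).
    inner = [[0j] * n1_len for _ in range(n2_len)]
    for n2 in range(n2_len):
        column = dft([x[n2_len * n1 + n2] for n1 in range(n1_len)])
        for k1 in range(n1_len):
            twiddle = cmath.exp(-2j * cmath.pi * n2 * k1 / n_pts)
            inner[n2][k1] = column[k1] * twiddle
    # Stage 2: for each k1, a length-N2 DFT over n2; the result goes to index N1*k2 + k1.
    out = [0j] * n_pts
    for k1 in range(n1_len):
        row = dft([inner[n2][k1] for n2 in range(n2_len)])
        for k2 in range(n2_len):
            out[n1_len * k2 + k1] = row[k2]
    return out


# Compare against the direct DFT for N = 3 * 4.
signal = [complex(i, -0.5 * i) for i in range(12)]
reference = dft(signal)
decomposed = cooley_tukey_two_factor(signal, 3, 4)
print(max(abs(a - b) for a, b in zip(reference, decomposed)))
```

The only place where fftSize-dependent twiddle factors appear is the multiplication between the two stages, which is why their storage and parallel generation are treated separately from the short DFTs.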
The invention is further described below with reference to the accompanying drawings. The following examples are only for more clearly illustrating the technical aspects of the present invention, and are not intended to limit the scope of the present invention.
Embodiment one:
A fast Fourier transform twiddle factor calculation method, in which a memory is divided into a plurality of blocks by a memory division method, different datablocks are allocated to different memory blocks, and the twiddle factors required for performing the butterfly operation on the datablocks are calculated by a twiddle factor parallel construction method. The memory division method comprises: writing a computer search program that searches for a memory division scheme according to the read-write data rule determined by the scheduling algorithm, and dividing the memory into a plurality of blocks according to that scheme, the blocks providing the allocation space for the different datablocks. The twiddle factor parallel construction method comprises: storing twiddle factors to form a large table, introducing an auxiliary small table, and calculating a plurality of twiddle factors by using the large table and the small table together; the large table stores part of the twiddle factors of the twiddle factor table, and its size is proportional to fftSize; the small table stores the phase interval rule of the twiddle factor table under a specific scheduling algorithm. Here the memory is a device for temporarily storing the data of each stage, and a datablock is a data block.
Specifically, in the design stage, a search program for the memory division scheme is written according to the internal read-write rules of the corresponding scheduling algorithm, and an exhaustive search is performed so that the number of blocks is as small as possible while guaranteeing that no read-write conflict occurs in any scenario. The method successfully finds division schemes that satisfy the FFT scenarios of the various fftSize of an NR system, gives the corresponding read-write methods, and verifies them in various scenarios, so that memory read-write conflicts can be completely avoided.
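The kind of exhaustive search described above can be sketched as follows. This is a toy model under stated assumptions, not the search program of the invention: the access pattern (a hypothetical two-stage radix-r FFT of size r², one butterfly per clk), the candidate family `bank = (addr + addr // divisor) % num_banks`, and all function names are introduced here purely for illustration.

```python
def access_groups(n_pts, radix):
    """Address groups touched in one clk by a hypothetical two-stage radix-`radix` FFT."""
    assert n_pts == radix * radix
    # Stage 1 reads inputs that are `radix` apart; stage 2 reads adjacent inputs.
    stage1 = [[b + radix * i for i in range(radix)] for b in range(radix)]
    stage2 = [[radix * b + j for j in range(radix)] for b in range(radix)]
    return stage1 + stage2


def bank_of(addr, num_banks, divisor):
    """Candidate skewed mapping from address to memory block (an assumed family)."""
    return (addr + addr // divisor) % num_banks


def is_conflict_free(groups, num_banks, divisor):
    """True if, in every group, all addresses land in different memory blocks."""
    for group in groups:
        banks = {bank_of(addr, num_banks, divisor) for addr in group}
        if len(banks) != len(group):
            return False
    return True


def search_partition(groups, n_pts):
    """Try block counts from smallest to largest and return the first conflict-free candidate."""
    smallest = max(len(group) for group in groups)
    for num_banks in range(smallest, n_pts + 1):
        for divisor in range(1, n_pts + 1):
            if is_conflict_free(groups, num_banks, divisor):
                return num_banks, divisor
    return None


groups = access_groups(16, 4)
print(search_partition(groups, 16))
```

Because block counts are tried in increasing order, the first hit uses as few blocks as this candidate family allows; a real search would have to cover every supported fftSize and both the read and the write pattern of the chosen scheduling algorithm.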
Because the several twiddle factors required for each butterfly operation are uniformly spaced, each required twiddle factor can be constructed from two parts: one part is obtained from the large table, and the other part is obtained by looking up the small table, whose entries are stored in advance. Because D2 and D2next are both basic butterfly sizes, with D2 ≤ 8 and D2next ≤ 8, which are small compared with fftSize, the storage space of the small table is much smaller than that of the large table.
Only one large table is used to store part of the twiddle factor table, together with one small table; by using the two tables together, a plurality of twiddle factors can be generated in parallel without conflict, which effectively reduces the storage cost of the twiddle factors. If the memory is not divided, the design can only be implemented with registers, which is inefficient, and when fftSize is large the chip area of a register-based implementation is much larger than that of a memory-based one. If there is no efficient way to construct twiddle factors in parallel, multiple sets of twiddle factors need to be stored; the storage overhead is proportional to fftSize and is therefore large when fftSize is large.
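One way to picture the cooperation of the two tables is sketched below. The exact table contents did not survive in this text, so the layout here is an assumption: the large table holds W_N^p for p < N/S (its size grows with fftSize), the small table holds W_S^m for m < S (its size depends only on S, assumed small, e.g. at most D2 × D2next), and a butterfly needs the uniformly spaced factors W_N^(base + m·N/S). All names are hypothetical.

```python
import cmath


def build_tables(n_pts, small_size):
    """large[p] = W_N^p for p < N/small_size; small[m] = W_small_size^m (assumed layout)."""
    assert n_pts % small_size == 0
    step = n_pts // small_size
    large = [cmath.exp(-2j * cmath.pi * p / n_pts) for p in range(step)]
    small = [cmath.exp(-2j * cmath.pi * m / small_size) for m in range(small_size)]
    return large, small, step


def twiddles_for_butterfly(large, small, step, base, count):
    """Return W_N^(base + m*step) for m = 0..count-1 using a single large-table read."""
    assert 0 <= base < step and count <= len(small)
    w_base = large[base]  # the only access to the large table
    # W_N^(m*step) equals W_S^m, so the remaining factors come from the small table.
    return [w_base * small[m] for m in range(count)]


large, small, step = build_tables(n_pts=64, small_size=8)
factors = twiddles_for_butterfly(large, small, step, base=3, count=8)
expected = [cmath.exp(-2j * cmath.pi * (3 + 8 * m) / 64) for m in range(8)]
print(max(abs(a - b) for a, b in zip(factors, expected)))
```

In this model the large table is read once per butterfly and the remaining factors are obtained by independent multiplications, so they can be produced in parallel without a second access to the same memory.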
Embodiment two:
A fast Fourier transform twiddle factor computing system, the system comprising:
A datablock allocation module: a memory partitioning module used for partitioning the memory into a plurality of partitions and allocating different datablocks to different memory partitions;
A twiddle factor calculation module: a twiddle factor parallel construction module used for calculating the twiddle factors required for performing the butterfly operation on the datablocks.
It should be noted that the memory partitioning module is configured to search for a memory partitioning scheme using a computer search program written according to the read-write data rule determined by the scheduling algorithm, and to partition the memory into a plurality of partitions according to that scheme, the partitions providing the allocation space for the different datablocks; the twiddle factor parallel construction module is used for storing twiddle factors to form a large table, introducing an auxiliary small table, and calculating a plurality of twiddle factors by using the large table and the small table together; wherein the large table stores part of the twiddle factors of the twiddle factor table, and its size is proportional to fftSize; the small table stores the phase interval rule of the twiddle factor table under a specific scheduling algorithm.
Embodiment three:
The embodiment of the invention also provides a fast Fourier transform twiddle factor calculation apparatus, which comprises a processor and a storage medium;
The storage medium is used for storing instructions;
the processor is operative to perform the steps of the fast fourier transform twiddle factor calculation method according to embodiment one.
Embodiment four:
the present invention provides a computer-readable storage medium having stored thereon a computer program, characterized in that the program when executed by a processor implements the steps of the fast fourier transform twiddle factor calculation method in embodiment one.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.
Claims (6)
1. A fast Fourier transform twiddle factor calculation method, characterized in that a memory is divided into a plurality of blocks by a memory division method, different datablocks are allocated to different memory blocks, and the twiddle factors required for the butterfly operation on the datablocks are calculated by a twiddle factor parallel construction method;
wherein the memory is a device for temporarily storing the data of each stage, and a datablock is a data block;
the twiddle factor parallel construction method comprises the following steps:
Storing twiddle factors to form a large table, introducing an auxiliary small table, and calculating a plurality of twiddle factors by using the large table and the small table together; wherein the large table is used for storing part of the twiddle factors of the twiddle factor table, and the size of the large table is proportional to fftSize; the small table is used for storing the phase interval rule of the twiddle factor table under a specific scheduling algorithm;
Wherein the large table storage section includes The small table is stored in parts of
2. The fast Fourier transform twiddle factor calculation method as recited in claim 1, wherein the memory division method comprises:
According to the read-write data rule determined by the scheduling algorithm, a computer search program is written to search a memory division scheme, and the memory is divided into a plurality of blocks according to the memory division scheme, wherein the blocks are used for providing allocation spaces for different datablock.
3. A fast Fourier transform twiddle factor computing system, said system comprising:
A datablock allocation module: a memory partitioning module used for partitioning the memory into a plurality of partitions and allocating different datablocks to different memory partitions;
A twiddle factor calculation module: a twiddle factor parallel construction module used for calculating the twiddle factors required for performing the butterfly operation on the datablocks;
the twiddle factor parallel construction module is used for storing twiddle factors to form a large table, introducing an auxiliary small table, and calculating a plurality of twiddle factors by using the large table and the small table together; wherein the large table is used for storing part of the twiddle factors of the twiddle factor table, and the size of the large table is proportional to fftSize; the small table is used for storing the phase interval rule of the twiddle factor table under a specific scheduling algorithm;
Wherein the large table storage section includes The small table is stored in parts of
4. The fast Fourier transform twiddle factor computing system according to claim 3, wherein the memory partitioning module is configured to search for a memory partitioning scheme using a computer search program written according to the read-write data rule determined by a scheduling algorithm, and to partition the memory into a plurality of partitions according to the memory partitioning scheme, the plurality of partitions being configured to provide allocation space for different datablocks.
5. A fast fourier transform twiddle factor computing device, comprising a processor and a storage medium;
The storage medium is used for storing instructions;
The processor is operative according to the instructions to perform the steps of the fast fourier transform twiddle factor calculation method according to any of claims 1-2.
6. A computer readable storage medium having stored thereon a computer program, characterized in that the program when executed by a processor realizes the steps of the fast fourier transform twiddle factor calculation method according to any of claims 1-2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110751901.4A CN113569190B (en) | 2021-07-02 | Fast Fourier transform twiddle factor computing system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113569190A (en) | 2021-10-29 |
CN113569190B (en) | 2024-06-04 |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20040026910A (en) * | 2002-09-26 | 2004-04-01 | 엘지전자 주식회사 | Fast fourier transform apparatus and method thereof |
CN101504638A (en) * | 2009-03-19 | 2009-08-12 | 北京理工大学 | Point-variable assembly line FFT processor |
CN101630308A (en) * | 2008-07-16 | 2010-01-20 | 财团法人交大思源基金会 | Design and addressing method for any point number quick Fourier transformer based on memory |
CN102209962A (en) * | 2008-09-10 | 2011-10-05 | 先进汽车技术有限公司合作研究中心 | Method and device for computing matrices for discrete fourier transform (dft) coefficients |
CN102411491A (en) * | 2011-12-31 | 2012-04-11 | 中国科学院自动化研究所 | Data access method and device for parallel FFT (Fast Fourier Transform) computation |
CN103218348A (en) * | 2013-03-29 | 2013-07-24 | 北京创毅视讯科技有限公司 | Method and system for processing fast Fourier transform |
CN105893328A (en) * | 2016-04-19 | 2016-08-24 | 南京亚派科技股份有限公司 | Cooley-Tukey-based fast Fourier transform (FFT) algorithm |
CN113569189A (en) * | 2021-07-02 | 2021-10-29 | 星思连接(上海)半导体有限公司 | Fast Fourier transform calculation method and device |
CN113591022A (en) * | 2021-07-02 | 2021-11-02 | 星思连接(上海)半导体有限公司 | Read-write scheduling processing method and device capable of decomposing data |
Non-Patent Citations (5)
Title |
---|
A Hardware-Efficient and Reconfigurable UFMC Transmitter Architecture With its FPGA Prototype;Vikas Kumar 等;《IEEE Embedded Systems Letters》;20201201;第12卷(第4期);109-112 * |
High-throughput and compact FFT architectures using the Good-Thomas and Winograd algorithms;Nikhilesh Bhagat 等;《IET Communications》;20180419;第12卷(第8期);1011-1018 * |
一种在多核嵌入式平台上实现FFT的快速并行算法;彭自然 等;《计算机应用研究》;20171128;第34卷(第1期);3242-3246 * |
基于改进旋转因子的高性能FFT硬件设计;骆阳 等;《浙江大学学报(工学版)》;20210615;第55卷(第6期);1199-1207 * |
基于颗粒度可重构架构的并行FFT算法实现;曹鹏 等;《东南大学学报(自然科学版)》;20131120;第43卷(第6期);1174-1179 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102375805B (en) | Vector processor-oriented FFT (Fast Fourier Transform) parallel computation method based on SIMD (Single Instruction Multiple Data) | |
CN108170639B (en) | Tensor CP decomposition implementation method based on distributed environment | |
US9740493B2 (en) | System and method of loop vectorization by compressing indexes and data elements from iterations based on a control mask | |
DiMarco et al. | Performance impact of dynamic parallelism on different clustering algorithms | |
CN111159235A (en) | Data pre-partition method and device, electronic equipment and readable storage medium | |
CN111240744B (en) | Method and system for improving parallel computing efficiency related to sparse matrix | |
CN116401258B (en) | Data indexing method, data query method and corresponding devices | |
Liroz-Gistau et al. | Dynamic workload-based partitioning for large-scale databases | |
US8990281B2 (en) | Techniques for improving the efficiency of mixed radix fast fourier transform | |
Bisson et al. | A GPU implementation of the sparse deep neural network graph challenge | |
CN110806942B (en) | Data processing method and device | |
CN109460406A (en) | A kind of data processing method and device | |
Niederhagen et al. | Implementing Joux-Vitse’s Crossbred Algorithm for Solving Systems over on GPUs | |
CN113569190B (en) | Fast Fourier transform twiddle factor computing system and method | |
US9582474B2 (en) | Method and apparatus for performing a FFT computation | |
US20100179978A1 (en) | Fft-based parallel system with memory reuse scheme | |
US20150331634A1 (en) | Continuous-flow conflict-free mixed-radix fast fourier transform in multi-bank memory | |
CN113569190A (en) | Fast Fourier transform rotation factor calculation system and method | |
EP2538345A1 (en) | Fast fourier transform circuit | |
CN116129325A (en) | Urban treatment image target extraction method and device and application thereof | |
CN113656507B (en) | Method and device for executing transaction in block chain system | |
CN114116012B (en) | Method and device for realizing vectorization of FFT code bit reverse order algorithm based on shuffle operation | |
CN113569189B (en) | Fast Fourier transform calculation method and device | |
CN110955380B (en) | Access data generation method, storage medium, computer device and apparatus | |
CN106469134A (en) | A kind of data conflict-free access method for fft processor |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |