CN113569190B - Fast Fourier transform twiddle factor computing system and method - Google Patents

Info

Publication number
CN113569190B
CN113569190B CN202110751901.4A CN202110751901A CN113569190B CN 113569190 B CN113569190 B CN 113569190B CN 202110751901 A CN202110751901 A CN 202110751901A CN 113569190 B CN113569190 B CN 113569190B
Authority
CN
China
Prior art keywords
memory
twiddle
datablock
fast fourier
twiddle factor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110751901.4A
Other languages
Chinese (zh)
Other versions
CN113569190A (en)
Inventor
黄勇富
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xingsi Connection Shanghai Semiconductor Co ltd
Original Assignee
Xingsi Connection Shanghai Semiconductor Co ltd
Application filed by Xingsi Connection Shanghai Semiconductor Co ltd
Priority to CN202110751901.4A
Publication of CN113569190A
Application granted
Publication of CN113569190B

Abstract

The invention discloses a fast Fourier transform (FFT) twiddle factor computing system and method in the field of communication systems, aiming to solve the technical problem that the same memory is prone to read-write conflicts within the same clock cycle (clk) under different fftSize values. In the method, a memory is divided into a plurality of blocks by a memory partitioning method, different datablocks are allocated to different memory blocks, and the twiddle factors required for performing butterfly operations on the datablocks are calculated by a twiddle factor parallel construction method; here the memory is a device for temporarily storing the data of each stage, and a datablock is a data block. The invention finds partitioning schemes that satisfy the FFT scenarios of the various fftSize values of an NR system, provides the corresponding read-write methods, and verifies them under the various scenarios, so that memory read-write conflicts are completely avoided. In addition, only one group of twiddle factors is stored, and a plurality of twiddle factors can be constructed in parallel with the help of an auxiliary small table, which both saves storage space and meets the requirements of a parallel algorithm.

Description

Fast Fourier transform twiddle factor computing system and method
Technical Field
The invention relates to a fast Fourier transform twiddle factor computing system and a method, belonging to the technical field of communication systems.
Background
The FFT (fast Fourier transform) is an efficient algorithm for computing the DFT and is widely used in digital signal processing systems, for example for analyzing signal spectrum characteristics and, in 5G wireless communication systems, for transform precoding, OFDM modulation, PRACH waveform generation, and receiver-side processing. It takes N input values at a time and outputs N transformed values, where N is the number of points, called fftSize. For example, a 5G system needs to support all FFTs with N = 12 × rbnum, where 0 < rbnum ≤ 273, and FFTs as large as N = 48 × 4096 may even be used in the PRACH module.
A 5G system needs to support high data throughput, so the throughput of the FFT module is an important design metric. When the platform clock frequency is limited, throughput has to be improved through parallel computing or pipelining. In addition, because multiple values of N must be supported, some of which may be large, a hardware implementation should store the data and twiddle factors in memory rather than in registers. Furthermore, the Cooley-Tukey algorithm has inherent data dependencies and a corresponding addressing pattern: parallel computation must read and write several data items in the same clk, whereas a given memory can be read and written only once in the same clk. This is the difficulty in implementing the parallel algorithm.
In the prior art, the same memory is prone to read-write conflicts in the same clk under different values of N = fftSize (i.e., the amount of data input to or output from the FFT). For this reason, a fast Fourier transform twiddle factor computing system and method are proposed.
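To make the conflict concrete, the following minimal Python sketch assumes a memory split into banks by simple address interleaving (`bank = address % num_banks`), with each bank serving one access per clk. The interleaving rule, the function names, and the radix-4 access pattern are illustrative assumptions and are not taken from the patent.

```python
def banks_hit(addresses, num_banks):
    """Bank index of each address under simple interleaving (assumed rule)."""
    return [addr % num_banks for addr in addresses]


def has_conflict(addresses, num_banks):
    """True if two addresses accessed in the same clk land in the same bank."""
    banks = banks_hit(addresses, num_banks)
    return len(set(banks)) < len(banks)


def butterfly_addresses(base, stride, radix):
    """Addresses read by one butterfly of the given radix and stride."""
    return [base + i * stride for i in range(radix)]


if __name__ == "__main__":
    # Radix-4 butterflies at three hypothetical stages of a 64-point FFT.
    for stride in (1, 4, 16):
        addrs = butterfly_addresses(base=0, stride=stride, radix=4)
        print(stride, banks_hit(addrs, 4), has_conflict(addrs, 4))
```

With four banks, a stride of 1 spreads the four reads over four different banks, while strides of 4 and 16 send all four reads to bank 0, so under this assumed mapping the butterfly cannot fetch its inputs in a single clk.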
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a fast Fourier transform twiddle factor computing system and method that solve the technical problem that the same memory is prone to read-write conflicts in the same clk under different fftSize values.
To achieve the above purpose, the invention adopts the following technical solutions:
In a first aspect, the present invention provides a fast Fourier transform twiddle factor calculation method, in which a memory is divided into a plurality of blocks by a memory partitioning method, different datablocks are allocated to different memory blocks, and the twiddle factors required for performing butterfly operations on the datablocks are calculated by a twiddle factor parallel construction method;
The memory is a device for temporarily storing the data of each stage, and a datablock is a data block.
Further, the memory partitioning method includes:
According to the read-write data rule determined by the scheduling algorithm, a computer search program is written to search for a memory partitioning scheme, and the memory is divided into a plurality of blocks according to that scheme, the blocks providing allocation space for the different datablocks.
Further, the twiddle factor parallel construction method comprises the following steps:
Storing twiddle factors to form a large table, introducing an auxiliary small table, and calculating a plurality of twiddle factors by using the large table and the small table together; wherein the large table stores part of the twiddle factors of the twiddle factor table, and its size is proportional to fftSize; the small table stores the phase-interval rule of the twiddle factor table under a specific scheduling algorithm.
In a second aspect, the present invention provides a fast Fourier transform twiddle factor computing system, the system comprising:
A datablock allocation module, which uses a memory partitioning module to divide the memory into a plurality of blocks and allocates different datablocks to different memory blocks;
A twiddle factor calculation module, which uses a twiddle factor parallel construction module to calculate the twiddle factors required for performing butterfly operations on the datablocks.
Further, the memory partitioning module is configured to write a computer search program, according to the read-write data rule determined by the scheduling algorithm, to search for a memory partitioning scheme, and to divide the memory into a plurality of blocks according to that scheme, the blocks providing allocation space for the different datablocks.
Further, the twiddle factor parallel construction module is configured to store twiddle factors to form a large table, introduce an auxiliary small table, and calculate a plurality of twiddle factors by using the large table and the small table together; wherein the large table stores part of the twiddle factors of the twiddle factor table, and its size is proportional to fftSize; the small table stores the phase-interval rule of the twiddle factor table under a specific scheduling algorithm.
In a third aspect, the present invention provides a fast Fourier transform twiddle factor computing device, comprising a processor and a storage medium;
The storage medium is used for storing instructions;
The processor operates according to the instructions to perform the steps of the fast Fourier transform twiddle factor calculation method according to any of the above.
In a fourth aspect, the present invention provides a computer-readable storage medium having stored thereon a computer program, characterized in that the program, when executed by a processor, implements the steps of the fast Fourier transform twiddle factor calculation method according to any of the above.
Compared with the prior art, the invention has the beneficial effects that:
A search program for memory partitioning schemes is written according to the internal read-write rule of the corresponding scheduling algorithm, and the candidate schemes are searched exhaustively so that the number of blocks is as small as possible while guaranteeing that no read-write conflict occurs in any scenario. In this way, partitioning schemes that satisfy the FFT scenarios of the various fftSize values of an NR system were found, the corresponding read-write methods are provided, and verification under the various scenarios shows that memory read-write conflicts can be completely avoided. In addition, only one group of twiddle factors is stored (a large table whose size is proportional to fftSize), and a plurality of twiddle factors can be constructed in parallel with the help of an auxiliary small table (this table depends only on the numbers of butterfly points of two adjacent stages and is independent of fftSize, so its memory cost is very low). This both saves storage space and meets the requirements of a parallel algorithm.
Drawings
Fig. 1 is a diagram illustrating datablock allocation according to a first embodiment of the present invention.
Detailed Description
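The Cooley-Tukey algorithm is an FFT algorithm that decomposes an FFT with a large N into a combination of several shorter FFTs. For example, if $N = N_1 N_2$, the Cooley-Tukey algorithm decomposes the DFT into:
$$X[N_1 k_2 + k_1] = \sum_{n_2=0}^{N_2-1} \left[ W_N^{n_2 k_1} \left( \sum_{n_1=0}^{N_1-1} x[N_2 n_1 + n_2]\, W_{N_1}^{n_1 k_1} \right) \right] W_{N_2}^{n_2 k_2}$$
where $0 \le n_1 \le N_1 - 1$; $0 \le n_2 \le N_2 - 1$; $n = N_2 n_1 + n_2$; $k = N_1 k_2 + k_1$.
Similarly, if $N = N_1 N_2 N_3$, the Cooley-Tukey algorithm can be applied again to one of the shorter FFTs, giving a decomposition into three stages.
Here $W_N^{n_2 k_1}$, with $W_N = e^{-j 2\pi / N}$, is a twiddle factor, typically pre-generated offline and pre-stored in a hardware implementation.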
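The two-factor decomposition can be checked numerically. The sketch below is a plain-Python illustration, not the patented hardware method: it uses the index maps n = N2·n1 + n2 and k = N1·k2 + k1 given above, computes the inner N1-point DFTs, multiplies by the twiddle factor exp(-j·2π·n2·k1/N), and then computes the outer N2-point DFTs. The function names and the sign convention of the exponent are assumptions made for the example.

```python
import cmath


def w(m, e):
    """Complex exponential exp(-j*2*pi*e/m), the usual DFT kernel (assumed sign)."""
    return cmath.exp(-2j * cmath.pi * e / m)


def dft(x):
    """Direct O(N^2) DFT used as the reference result."""
    n = len(x)
    return [sum(x[t] * w(n, t * k) for t in range(n)) for k in range(n)]


def cooley_tukey_two_factor(x, n1, n2):
    """DFT of length n1*n2 built from n1-point and n2-point DFTs."""
    n = n1 * n2
    assert len(x) == n
    out = [0j] * n
    for k1 in range(n1):
        # Inner n1-point DFT for every n2 (index b), followed by the twiddle factor.
        inner = [
            sum(x[n2 * a + b] * w(n1, a * k1) for a in range(n1)) * w(n, b * k1)
            for b in range(n2)
        ]
        # Outer n2-point DFT over the twiddled values.
        for k2 in range(n2):
            out[n1 * k2 + k1] = sum(inner[b] * w(n2, b * k2) for b in range(n2))
    return out


if __name__ == "__main__":
    data = [complex(t, -t % 3) for t in range(12)]
    ref = dft(data)
    got = cooley_tukey_two_factor(data, 3, 4)
    print(max(abs(a - b) for a, b in zip(ref, got)))
```

The printed maximum deviation from the direct DFT should be at the level of floating-point rounding error.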
The invention is further described below with reference to the accompanying drawings. The following examples are only for more clearly illustrating the technical aspects of the present invention, and are not intended to limit the scope of the present invention.
Embodiment one:
A fast Fourier transform twiddle factor calculation method divides a memory into a plurality of blocks by a memory partitioning method, allocates different datablocks to different memory blocks, and calculates the twiddle factors required for performing butterfly operations on the datablocks by a twiddle factor parallel construction method. The memory partitioning method comprises: writing a computer search program, according to the read-write data rule determined by the scheduling algorithm, to search for a memory partitioning scheme, and dividing the memory into a plurality of blocks according to that scheme, the blocks providing allocation space for the different datablocks. The twiddle factor parallel construction method comprises: storing twiddle factors to form a large table, introducing an auxiliary small table, and calculating a plurality of twiddle factors by using the large table and the small table together; wherein the large table stores part of the twiddle factors of the twiddle factor table, and its size is proportional to fftSize; the small table stores the phase-interval rule of the twiddle factor table under a specific scheduling algorithm. Here the memory is a device for temporarily storing the data of each stage, and a datablock is a data block.
Specifically, in the design stage, a search program for memory partitioning schemes is written according to the internal read-write rule of the corresponding scheduling algorithm, and the candidate schemes are searched exhaustively so that the number of blocks is as small as possible while guaranteeing that no read-write conflict occurs in any scenario. Partitioning schemes that satisfy the FFT scenarios of the various fftSize values of an NR system were found in this way, the corresponding read-write methods are given, and verification under the various scenarios shows that memory read-write conflicts can be completely avoided.
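The text does not spell out the scheduling rule or the candidate schemes that the search program examines, so the following Python sketch only illustrates the general idea of such a search under stated assumptions. It assumes an in-place mixed-radix schedule in which each butterfly reads addresses that differ in a single mixed-radix digit, and it tries two hypothetical bank mappings (plain modulo interleaving and a digit-sum mapping) for an increasing number of banks, returning the first combination that is conflict-free in every scenario. All names and both mappings are illustrative and are not taken from the patent.

```python
def digits(addr, radices):
    """Mixed-radix digits of addr, least-significant digit first."""
    out = []
    for r in radices:
        out.append(addr % r)
        addr //= r
    return out


def butterfly_groups(radices):
    """Yield address groups read together (assumed in-place schedule)."""
    n = 1
    weights = []
    for r in radices:
        weights.append(n)
        n *= r
    for stage, r in enumerate(radices):
        for base in range(n):
            if digits(base, radices)[stage] == 0:
                yield [base + i * weights[stage] for i in range(r)]


def modulo_map(addr, radices, banks):
    """Hypothetical mapping 1: plain address interleaving."""
    return addr % banks


def digit_sum_map(addr, radices, banks):
    """Hypothetical mapping 2: sum of mixed-radix digits modulo bank count."""
    return sum(digits(addr, radices)) % banks


def conflict_free(mapping, radices, banks):
    """True if no butterfly group touches the same bank twice."""
    for group in butterfly_groups(radices):
        hit = {mapping(a, radices, banks) for a in group}
        if len(hit) < len(group):
            return False
    return True


def search_partition(scenarios, max_banks=16):
    """Smallest bank count and mapping name that works for all scenarios."""
    candidates = [("modulo", modulo_map), ("digit_sum", digit_sum_map)]
    for banks in range(1, max_banks + 1):
        for name, mapping in candidates:
            if all(conflict_free(mapping, r, banks) for r in scenarios):
                return banks, name
    return None


if __name__ == "__main__":
    # Each scenario is one fftSize given by its radix decomposition.
    print(search_partition([(4, 4, 4), (3, 4)]))
```

For the two example scenarios (fftSize 64 as 4·4·4 and fftSize 12 as 3·4), fewer than four banks cannot serve a radix-4 butterfly, plain interleaving over four banks fails for the stride-4 stage, and the digit-sum mapping over four banks is conflict-free. A real search would also have to model simultaneous writes and several butterflies per clk.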
Because the several twiddle factors required for each butterfly operation are uniformly spaced in phase, each of them can be constructed from two parts: one part is obtained from the large table, and the other is obtained by looking up the small table, whose contents are stored in advance. Because D 2 and D 2next are both basic butterfly factors (the numbers of butterfly points of two adjacent stages), with D 2 ≤ 8 and D 2next ≤ 8, they are small compared with fftSize, so the small table needs far less memory than the large table.
Only one large table is used to store part of the twiddle factor table, and a second, small table stores the phase-interval rule; by using the two tables together, a plurality of twiddle factors can be generated in parallel without conflict, which effectively reduces the storage cost of the twiddle factors. Without the memory partitioning method, the design could only be implemented with registers, which is inefficient, and when fftSize is large the chip area of a register-based design is far greater than that of a memory-based design. Without an efficient way of constructing twiddle factors in parallel, multiple sets of twiddle factors would have to be stored; the storage overhead is proportional to fftSize and therefore becomes large when fftSize is large.
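The exact contents of the two tables are given by formulas that are not reproduced in this text, so the following Python sketch shows only one plausible way a large table and a small table can cooperate. It assumes that the twiddle factors needed by one butterfly have exponents base + i·step·(N/P), where P = D2·D2next; under that assumption the large table holds exp(-j·2π·b/N) for b < N/P (size proportional to fftSize) and the small table holds exp(-j·2π·q/P) for q < P (at most 64 entries, independent of fftSize). The table layout, the exponent pattern, and all names are hypothetical.

```python
import cmath


def build_tables(n, d2, d2next):
    """Build the assumed large table (N/P entries) and small table (P entries)."""
    p = d2 * d2next
    assert n % p == 0, "this sketch assumes P divides fftSize"
    big = [cmath.exp(-2j * cmath.pi * b / n) for b in range(n // p)]
    small = [cmath.exp(-2j * cmath.pi * q / p) for q in range(p)]
    return big, small


def twiddles_for_butterfly(big, small, base, step, count):
    """Twiddles with exponents base + i*step*(N/P), i = 0..count-1.

    Each value is one large-table entry times one small-table entry, so the
    `count` products are independent and could be formed in parallel.
    """
    p = len(small)
    return [big[base] * small[(i * step) % p] for i in range(count)]


def direct_twiddle(n, exponent):
    """Reference value exp(-j*2*pi*exponent/N) computed directly."""
    return cmath.exp(-2j * cmath.pi * exponent / n)


if __name__ == "__main__":
    n, d2, d2next = 64, 4, 4
    big, small = build_tables(n, d2, d2next)
    got = twiddles_for_butterfly(big, small, base=3, step=1, count=d2next)
    ref = [direct_twiddle(n, 3 + i * (n // (d2 * d2next))) for i in range(d2next)]
    print(len(big), len(small), max(abs(a - b) for a, b in zip(got, ref)))
```

In this example a 64-point transform needs a 4-entry large table and a 16-entry small table; a single large-table read yields all four uniformly spaced twiddle factors of the butterfly, so no second read of the same memory is needed in that clk.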
Embodiment two:
A fast Fourier transform twiddle factor computing system, the system comprising:
A datablock allocation module, which uses a memory partitioning module to divide the memory into a plurality of blocks and allocates different datablocks to different memory blocks;
A twiddle factor calculation module, which uses a twiddle factor parallel construction module to calculate the twiddle factors required for performing butterfly operations on the datablocks.
It should be noted that the memory partitioning module is configured to write a computer search program, according to the read-write data rule determined by the scheduling algorithm, to search for a memory partitioning scheme, and to divide the memory into a plurality of blocks according to that scheme, the blocks providing allocation space for the different datablocks; the twiddle factor parallel construction module is configured to store twiddle factors to form a large table, introduce an auxiliary small table, and calculate a plurality of twiddle factors by using the large table and the small table together; wherein the large table stores part of the twiddle factors of the twiddle factor table, and its size is proportional to fftSize; the small table stores the phase-interval rule of the twiddle factor table under a specific scheduling algorithm.
Embodiment three:
This embodiment of the invention also provides a fast Fourier transform twiddle factor computing device, which comprises a processor and a storage medium;
The storage medium is used for storing instructions;
The processor operates according to the instructions to perform the steps of the fast Fourier transform twiddle factor calculation method according to embodiment one.
Embodiment four:
This embodiment of the invention provides a computer-readable storage medium having stored thereon a computer program, characterized in that the program, when executed by a processor, implements the steps of the fast Fourier transform twiddle factor calculation method in embodiment one.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.

Claims (6)

1. A fast Fourier transform twiddle factor calculation method, characterized in that a memory is divided into a plurality of blocks by a memory partitioning method, different datablocks are allocated to different memory blocks, and the twiddle factors required for performing butterfly operations on the datablocks are calculated by a twiddle factor parallel construction method;
wherein the memory is a device for temporarily storing the data of each stage, and a datablock is a data block;
the twiddle factor parallel construction method comprises the following steps:
storing twiddle factors to form a large table, introducing an auxiliary small table, and calculating a plurality of twiddle factors by using the large table and the small table together; wherein the large table stores part of the twiddle factors of the twiddle factor table, and its size is proportional to fftSize; the small table stores the phase-interval rule of the twiddle factor table under a specific scheduling algorithm;
Wherein the large table storage section includes The small table is stored in parts of
2. The fast Fourier transform twiddle factor calculation method according to claim 1, wherein the memory partitioning method comprises:
according to the read-write data rule determined by the scheduling algorithm, writing a computer search program to search for a memory partitioning scheme, and dividing the memory into a plurality of blocks according to that scheme, the blocks providing allocation space for the different datablocks.
3. A fast fourier transform twiddle factor computing system, said system comprising:
a datablock allocation module, which uses a memory partitioning module to divide the memory into a plurality of blocks and allocates different datablocks to different memory blocks;
a twiddle factor calculation module, which uses a twiddle factor parallel construction module to calculate the twiddle factors required for performing butterfly operations on the datablocks;
the twiddle factor parallel construction module is configured to store twiddle factors to form a large table, introduce an auxiliary small table, and calculate a plurality of twiddle factors by using the large table and the small table together; wherein the large table stores part of the twiddle factors of the twiddle factor table, and its size is proportional to fftSize; the small table stores the phase-interval rule of the twiddle factor table under a specific scheduling algorithm;
Wherein the large table storage section includes The small table is stored in parts of
4. The fast Fourier transform twiddle factor computing system according to claim 3, wherein the memory partitioning module is configured to write a computer search program, according to the read-write data rule determined by a scheduling algorithm, to search for a memory partitioning scheme, and to divide the memory into a plurality of blocks according to that scheme, the blocks providing allocation space for the different datablocks.
5. A fast fourier transform twiddle factor computing device, comprising a processor and a storage medium;
The storage medium is used for storing instructions;
The processor is operative according to the instructions to perform the steps of the fast fourier transform twiddle factor calculation method according to any of claims 1-2.
6. A computer readable storage medium having stored thereon a computer program, characterized in that the program when executed by a processor realizes the steps of the fast fourier transform twiddle factor calculation method according to any of claims 1-2.
CN202110751901.4A 2021-07-02 Fast Fourier transform twiddle factor computing system and method Active CN113569190B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110751901.4A CN113569190B (en) 2021-07-02 Fast Fourier transform twiddle factor computing system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110751901.4A CN113569190B (en) 2021-07-02 Fast Fourier transform twiddle factor computing system and method

Publications (2)

Publication Number Publication Date
CN113569190A CN113569190A (en) 2021-10-29
CN113569190B CN113569190B (en) 2024-06-04

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20040026910A (en) * 2002-09-26 2004-04-01 엘지전자 주식회사 Fast fourier transform apparatus and method thereof
CN101504638A (en) * 2009-03-19 2009-08-12 北京理工大学 Point-variable assembly line FFT processor
CN101630308A (en) * 2008-07-16 2010-01-20 财团法人交大思源基金会 Design and addressing method for any point number quick Fourier transformer based on memory
CN102209962A (en) * 2008-09-10 2011-10-05 先进汽车技术有限公司合作研究中心 Method and device for computing matrices for discrete fourier transform (dft) coefficients
CN102411491A (en) * 2011-12-31 2012-04-11 中国科学院自动化研究所 Data access method and device for parallel FFT (Fast Fourier Transform) computation
CN103218348A (en) * 2013-03-29 2013-07-24 北京创毅视讯科技有限公司 Method and system for processing fast Fourier transform
CN105893328A (en) * 2016-04-19 2016-08-24 南京亚派科技股份有限公司 Cooley-Tukey-based fast Fourier transform (FFT) algorithm
CN113569189A (en) * 2021-07-02 2021-10-29 星思连接(上海)半导体有限公司 Fast Fourier transform calculation method and device
CN113591022A (en) * 2021-07-02 2021-11-02 星思连接(上海)半导体有限公司 Read-write scheduling processing method and device capable of decomposing data

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
A Hardware-Efficient and Reconfigurable UFMC Transmitter Architecture With its FPGA Prototype;Vikas Kumar 等;《IEEE Embedded Systems Letters》;20201201;第12卷(第4期);109-112 *
High-throughput and compact FFT architectures using the Good-Thomas and Winograd algorithms;Nikhilesh Bhagat 等;《IET Communications》;20180419;第12卷(第8期);1011-1018 *
一种在多核嵌入式平台上实现FFT的快速并行算法;彭自然 等;《计算机应用研究》;20171128;第34卷(第1期);3242-3246 *
基于改进旋转因子的高性能FFT硬件设计;骆阳 等;《浙江大学学报(工学版)》;20210615;第55卷(第6期);1199-1207 *
基于颗粒度可重构架构的并行FFT算法实现;曹鹏 等;《东南大学学报(自然科学版)》;20131120;第43卷(第6期);1174-1179 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant