WO2006014528A1 - A method of and apparatus for implementing fast orthogonal transforms of variable size - Google Patents
A method of and apparatus for implementing fast orthogonal transforms of variable sizeInfo
- Publication number
- WO2006014528A1 PCT/US2005/024063 (US2005024063W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- butterfly
- unit
- reconfigurable
- stage
- architecture
- Prior art date
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/141—Discrete Fourier transforms
- G06F17/142—Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
Definitions
- the disclosure relates to a system for and method of providing on-line reconfigurability of hardware so as to allow implementation of orthogonal transforms of vectors of varying size, such as FFT/IFFT (Inverse FFT) transforms, Walsh-Hadamard transforms, etc. including combinations of more than one type of such transform.
- FFT Fast Fourier Transform
- IFFT Inverse FFT
- FFT/IFFT can be performed using a FFT block, by conjugating the input and output of the FFT and dividing the output by the size of the processed vectors. Hence the same hardware can be used for both FFT and IFFT.
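As a minimal sketch of this conjugation trick (illustrative Python, not part of the patent), one FFT routine can serve both directions:

```python
import cmath

def fft(x):
    # textbook recursive radix-2 DIT FFT; len(x) must be a power of two
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def ifft(x):
    # IFFT reusing the same FFT block: conjugate the input and the output,
    # then divide by the size N of the processed vector
    n = len(x)
    return [v.conjugate() / n for v in fft([v.conjugate() for v in x])]
```

Conjugating input and output and dividing by N is exactly the reuse the passage describes, so no separate IFFT datapath is needed.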
- Several standard implementations of FFT/IFFT are known, some of which provide reconfigurability.
- One standard FFT/IFFT implementation uses FFT kernel arithmetic.
- N-point DFT discrete Fourier transform
- DIF decimation-in-frequency
- DIT decimation-in-time
- the FFT can be viewed as a shuffle-exchange interconnecting network of butterfly blocks, which varies with the size of the FFT, thus making it difficult to support flexibility of the most energy-efficient fully-parallel implementation.
- the signal flow graph can be directly mapped onto hardware. For instance, for a 16-point FFT there is a total of 32 butterfly units, interconnected in the manner shown by the trellis in Figure 2.
- In general, the N-point FFT requires (N/2)·log₂N butterfly units. This maximally parallel architecture has the potential for high performance and low power consumption; however, it bears the high cost of large silicon area, especially for large FFT sizes.
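The butterfly-count arithmetic can be sketched as follows (illustrative Python, not from the patent):

```python
import math

def butterfly_count(n):
    # a fully parallel N-point radix-2 FFT needs (N/2) * log2(N) butterfly units
    return (n // 2) * int(math.log2(n))
```

For a 16-point FFT this gives 32 units, matching the count cited above for the trellis of Figure 2.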
- each complex addition is composed of two real additions, which expand the input word-length by 1 bit.
- Each complex multiplication is composed of four real multiplications and two real additions.
- a real multiplication doubles the input word-length.
- the output word-length is either increased to (M+1) bits, or the output needs to be truncated or rounded to M bits. If truncation is performed, the most significant bit of the output is simply discarded, by limiting the values to the maximum values that can be described by M bits.
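A sketch of the word-length limiting step, assuming the "truncation" described is saturation of the grown (M+1)-bit result back to the signed M-bit range (names and the saturation reading are this sketch's assumptions, not the patent's wording):

```python
def saturate_to_m_bits(value, m):
    # discard the MSB growth by clipping to the signed m-bit range:
    # values outside [-2^(m-1), 2^(m-1) - 1] are held at the extremes
    lo, hi = -(1 << (m - 1)), (1 << (m - 1)) - 1
    return min(max(value, lo), hi)
```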
- the pipeline-based implementation needs more clock cycles per FFT frame than the column-based approach, since the pipelined approach implements a full FFT frame in N clock cycles (when using a radix-2 based butterfly architecture), while the column approach needs only log₂N clock cycles (when using a radix-2 based butterfly architecture) due to its iterative time-multiplexed structure.
- the clock number for processing an FFT frame is not an obstacle since the data is inserted in a serial manner, frame by frame, and the number of clock cycles per frame is transformed into a constant initial delay, while the throughput remains high.
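The cycle counts being compared can be sketched as (radix-2 case, illustrative function names):

```python
def pipeline_cycles(n):
    # pipelined architecture: one full FFT frame per N clock cycles
    # (serial input; the N cycles become a constant initial delay)
    return n

def column_cycles(n):
    # column architecture: one column of butterflies per cycle, log2(N) cycles
    return n.bit_length() - 1  # log2(n) for power-of-two n
```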
- the single-path delay feedback (SDF) implementation uses memory more efficiently by storing the butterfly outputs in feedback shift registers or FIFO's 46 (their sizes are given in Figure 4, in the example the lengths of the registers are 8, 4, 2, and 1, correspondingly).
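The feedback-register sizing follows directly from the stage index; a sketch (illustrative Python):

```python
def sdf_fifo_depths(n):
    # single-path delay feedback: the feedback FIFO of stage s holds
    # N / 2^(s+1) samples, so a 16-point pipeline uses depths 8, 4, 2, 1
    depths = []
    d = n // 2
    while d >= 1:
        depths.append(d)
        d //= 2
    return depths
```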
- a single data stream passes the multiplier at every stage.
- the hybrid approach combines benefits of the column and feedback approaches. It uses elements of the feedback approach to save memory, and the column stages are used for better hardware utilization. Use of the column stage butterfly units of 4 bits' width can be combined with employing a greater BUS width and proper reconfigurable multipliers. The architecture can also be converted to one with an exact BUS width necessary for high space utilization and algorithmic efficiency.
- FIG. 5 A popular architecture for running an iterative process is shown in Figure 5.
- This FFT implementation utilizes a single butterfly unit 50.
- the single butterfly unit design is mainly focused on optimizing a scheduling and memory access scheme, i.e., providing a pipeline approach when implementing each of the stages by reusing the same butterfly unit, time-multiplexed in an iterative way.
- the Spiffee processor (see, for example, B. M. Baas, "A Low-power, High-performance, 1024-point FFT Processor," IEEE Journal of Solid-State Circuits, March 1999) is an example of using a cached-memory architecture, including RAM 52 and multiplier 56, to exploit the regular memory access pattern of an FFT algorithm in order to achieve low power consumption.
- the processor, shown as controller 54, can be programmed to perform any length of FFT, but certain features, such as the cache sizes provided by RAM 52, are optimized only for a certain FFT size, and this approach operates at very low speeds because of the N clock cycles needed for the computation of an FFT frame through the full implementation of the pipeline algorithm, yielding a constant initial delay. This means that due to the iterative time-multiplexing of the stages by the reused butterfly unit 50, the full frame needs to be computed (requiring N clock cycles when using a radix-2 based butterfly unit) before the processor can begin to handle the next FFT frame.
- Most of the FFT accelerators implemented in advanced DSPs and chips are based on Radix-2 or Radix-4 FFT processors. They have limited usage (only for FFT transforms), very low speed utilization, and suffer from the need for a high clock-rate design.
- the architecture can implement any kind of filter or correlation function with high efficiency. This is achieved by using the multiplier of the last stage of an FFT transform for multiplication by a filter coefficient (time domain multiplication) followed by an IFFT, as best seen in Figure 6 at 60. It is also efficient in implementing any sub-product of an FFT/IFFT, e.g. Discrete Cosine/Sine Transforms (DCT and DST), and any algorithms which are a combination of the above-mentioned algorithms, like filtering using cascaded FFT and IFFT algorithms (which can be used also for equalization, prediction, interpolation and computing correlations).
- DCT and DST Discrete Cosine / Sine Transforms
- the radix-2² algorithm is of particular interest. It has the same multiplicative complexity as the radix-4 and split-radix algorithms, while retaining a regular radix-2 butterfly structure. This spatial regularity provides a great structural advantage over other algorithms for VLSI implementation.
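The multiplicative savings of the radix-2² decomposition rest on the twiddle identity W_N^(N/4) = -j, so W_N^(k+N/4) = -j·W_N^k. A quick numerical check (illustrative Python):

```python
import cmath

def twiddle(n, k):
    # W_N^k = exp(-2*pi*j*k / N)
    return cmath.exp(-2j * cmath.pi * k / n)

# factoring W_N^k out of the pair (W_N^k, W_N^(k+N/4)) leaves only the
# trivial coefficients 1 and -j in the butterfly
n, k = 16, 3
lhs = twiddle(n, k + n // 4)
rhs = -1j * twiddle(n, k)
```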
- Figure 7 illustrates a trellis representing such a coefficient rearrangement (in parallel form): for any two butterfly coefficients W_N^k and W_N^(k+N/4), W_N^k is factored out and forwarded to the next stage, which leaves the coefficients 1 and -j in the corresponding positions. After performing this coefficient rearrangement over all the coefficient pairs, one stage is left without non-trivial multiplication. [0035] Hybrid pipeline / Multiplex approach:
- FIG. 8A A number of pipelined FFT architectures have been proposed over the last decade. Since the spatial regularity of the signal flow graph is preserved in pipelined architectures, they are highly modular and scalable.
- the shuffle network 80 is implemented through a single-path delay feedback depicted in Figure 8A, where the data is processed between stages 82 in a single path and feedback FIFO registers 84 are used to store new inputs and intermediate results.
- the basic idea behind this scheme is to store the data and scramble it so that the next stage can receive data in the correct order.
- the FIFO registers 84 are filled with the first half of the inputs, the last half of the previous results are shifted out to the next stage. During this time, the operational elements are bypassed.
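A behavioral sketch of one such radix-2 single-path delay-feedback stage (twiddles omitted; structure and names are illustrative, not the patent's RTL):

```python
def sdf_stage(stream, depth):
    # during the first half of each block the FIFO fills with new inputs while
    # the previous block's results shift out (operational elements bypassed);
    # during the second half the butterfly emits sums to the next stage and
    # feeds differences back into the FIFO
    fifo = [0] * depth
    out = []
    for i, x in enumerate(stream):
        if (i // depth) % 2 == 0:
            out.append(fifo[0])            # bypass: shift stored results out
            fifo = fifo[1:] + [x]          # store the new input
        else:
            a = fifo[0]
            out.append(a + x)              # sum goes straight to the next stage
            fifo = fifo[1:] + [a - x]      # difference is fed back
    return out
```

After the initial `depth`-sample delay, the stage emits the sums followed by the differences of each input block, which is the scrambling the next stage expects.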
- Figure 1 is an illustration of a FFT butterfly computation trellis
- Figure 2 is an illustration of a decimation-in- frequency 16-point FFT trellis
- Figure 3 is an illustration of a Column-based 16-point FFT trellis;
- Figure 5 is an illustration of a block diagram of an architecture for implementing a simple Radix-2 FFT processor
- Figure 7 is an illustration of a trellis of a multiplication elimination technique through coefficient rearrangement
- Figure 8 is an illustration of a trellis, block diagram and packet diagram of a pipelined implementation of a shuffle-exchange interconnect transformer
- Figure 9 is an illustration of a matrix operation for use in a radix-4 butterfly architecture in accordance with one aspect of the method and system of the present disclosure
- Figure 10 is an illustration of a radix-2² stage trellis in accordance with one aspect of the method and system of the present disclosure
- Figure 11 is an illustration of a block diagram of an architecture of a reconfigurable Radix-2² stage butterfly arrangement in accordance with one aspect of the method and system of the present disclosure
- Figure 17 is an illustration of a block diagram of an architecture of providing a reconfigurable MF-I core processor in accordance with one aspect of the method and system of the present disclosure.
- Figure 18 is an illustration of a block diagram of an architecture of providing a reconfigurable MF-I core processor in accordance with one aspect of the method and system of the present disclosure
- Figure 19 is a block diagram of a communication system configured to comprise a transformer of any of the type described herein.
- the following disclosure describes a method of and system for implementing orthogonal transforms, such as Fast Fourier Transforms (FFTs) of vectors having varying size (real and complex vectors).
- FFTs Fast Fourier Transforms
- Adaptive algorithms are implemented where the size of the transform can be determined on line and is dependent on the input to the algorithm. Examples of such adaptive algorithms are (1) FFTs, (2) inverse FFTs (IFFTs), (3) any sub-products of FFTs and IFFTs, e.g. Discrete Cosine/Sine Transforms (DCT and DST), (4) Walsh-Hadamard transforms and any of their sub-products, e.g. CDMA DSSS spreading/de-spreading core algorithms, and any combination of the algorithms mentioned above.
- IFFTs inverse FFT
- DCT and DST Discrete Cosine/Sine Transforms
- Walsh-Hadamard transforms and any its' sub-products e.g. CDMA,
- the method and system can also be used for filtering and other functions, such as achieved when cascading FFT and IFFT algorithms (which in turn can be used also for equalization, Hilbert transforms, predictions, interpolations and correlations).
- the method and system allows implementation of FFT/IFFT and all the above-mentioned algorithms with high efficiency and in a wide range of parameters by fast on-line reconfiguration of hardware. It provides a significant decrease in the amount of hardware in devices which are intended for parallel or serial implementation of several FFT transforms or algorithms mentioned above of different sizes.
- the disclosed approach is to modify an orthogonal transform processor so as to provide a simplified interconnection structure that makes it easy to achieve flexibility by adapting to the length of the FFT vectors and sizing the memory accordingly, e.g., changing the length of the shift registers (or FIFO's), modifying the interconnecting buses as needed, and providing simple multiplexing of I/O blocks.
- with a clock frequency at the input sample rate, the entire range of FFTs can be accommodated by either direct mapping to hardware and disabling unnecessary blocks for the shorter-length FFTs, or by folding the processing stages and time-sharing the hardware for the longer (but lower symbol rate) cases.
- This architecture does not need buffering or serial-to-parallel conversion.
- the architecture can be implemented using Radix-2, Radix-2², Radix-2³, Radix-4, Radix-8, or a similar format.
- the radix-4 butterfly (without the twiddle coefficients' multipliers) can also be represented as a matrix operation, as shown in Figure 9, and implemented as shown by the trellis in Figure 10.
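A sketch of that 4-point kernel (illustrative Python; the matrix of Figure 9 is not reproduced in this extraction, but a twiddle-free radix-4 DFT butterfly has only entries from {1, -1, j, -j}):

```python
def radix4_butterfly(x):
    # 4-point DFT kernel without twiddle multipliers; every coefficient is
    # one of {1, -1, j, -j}, so no general multiplier is needed
    a, b, c, d = x
    return [a + b + c + d,
            a - 1j * b - c + 1j * d,
            a - b + c - d,
            a + 1j * b - c - 1j * d]
```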
- An embodiment of a reconfigurable radix-2² stage implementation comprises an input multiplexer 111, two stages of butterfly units 110a and 110b, two feedback memories 112a and 112b, only one general multiplier 114, one cross-junction (with sign-inversion capability) block 116, and a controller 118.
- the block 116 is used to switch between IFFT and FFT processing, thus eliminating the need for a multiplier at the output of the butterfly unit 110a.
- the size of the usable memory of memories 112a and 112b can be modified by the controller 118 to accommodate the length of the FFT being processed.
- the length of the transform vectors can be detected by detector 117 and determined by controller 118.
- memory 119 is provided for storing coefficients for use by the multiplier 114 for each stage of calculation.
- the controller 128 provides an input to set the size of each of the memories, in this case shift registers 124 for each stage.
- the multiplexer 121 is also set to provide the desired sequential inputs to the input of the butterfly unit 122a of the first stage.
- the multipliers 126a, 126b and 126c are separately positioned at the output of each of the first three stages, with the final stage not requiring one. As seen, the multipliers 126a and 126c multiply the output of the stages to which they are coupled by the imaginary unit "j".
- Figure 13 An alternative embodiment is shown in Figure 13 which incorporates an architecture for carrying out an iterative process.
- the output of the multiplier 130b provides feedback, as well as the output of the transformation processor.
- the output of the multiplexer 131 is provided to the input of the butterfly unit 132a.
- the latter provides feedback to the memory (e.g., shift register 134a) and an output to the "j" multiplier 136a.
- the output of the "j" multiplier 136a is applied to the input of the butterfly unit 132b.
- controller 138 controls the size of the memories 134 depending on the stage of the processing. When the signal vectors are first received, the registers 134a and 134b are set at "8" and "4" respectively, and the signals are processed through the two stages. The output of the processor is disabled, and the output of the second-stage butterfly unit 132b is applied through the feedback path to the input of the butterfly unit 132a. During the next iteration, the memories are set by the controller to "2" and "1". The signals are then serially processed through to the output of the second butterfly unit 132b. The output of the processor is then enabled, and the feedback path disabled, so that the output of the processor is provided at 139.
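The controller's memory-resizing schedule can be sketched as (illustrative Python; function and parameter names are this sketch's, not the patent's):

```python
def iteration_fifo_sizes(n, stages_in_hw=2):
    # folding log2(n) radix-2 stages onto a two-stage core: the per-stage
    # FIFO depths n/2, n/4, ..., 1 are taken stages_in_hw at a time,
    # one group per iteration through the feedback path
    depths = [n >> (s + 1) for s in range(n.bit_length() - 1)]
    return [depths[i:i + stages_in_hw]
            for i in range(0, len(depths), stages_in_hw)]
```

For a 16-point FFT this yields the "8 and 4" then "2 and 1" settings described above.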
- the architecture can be iterative or a mixture of pipeline / iterative or parallel.
- Figure 14 can be modified to an iterative or a mixed pipeline/iterative or parallel architecture.
- the "Radix 4" Walsh Spreading/De-spreading butterfly unit can be represented as a matrix operation as follows:
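The matrix itself does not survive this extraction; as a sketch, the standard 4-point Walsh-Hadamard kernel (assumed here in natural/Hadamard order) is:

```python
# 4x4 Walsh-Hadamard kernel: the radix-4 butterfly with every coefficient
# replaced by +/-1, so only trivial multipliers are required
H4 = [
    [1,  1,  1,  1],
    [1, -1,  1, -1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
]

def walsh_spread_4(x):
    return [sum(h * v for h, v in zip(row, x)) for row in H4]
```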
- since the radix-4 transform is a complex operation, one obtains two independent Walsh spreading/de-spreading processes for the real vectors, since the trivial multiplications by ±1 do not interchange the I and Q signals. Therefore, this feature can be used for implementing, for example, a two-finger RAKE receiver or a complex Walsh spreading/de-spreading function as in the new WCDMA standards.
- the complex multipliers now can be used in implementation of filters in the frequency domain for randomizing/de-randomizing the Walsh sequence with quasi-random sequences with very high efficiency when dealing with CDMA modulation/demodulation of several codes together, i.e. for a heavy data load (as can be seen in the CDMA/WCDMA standards).
- the efficiency is achieved due to the fact that one needs to multiply the modulated data only once (for all the codes) and not every code is multiplied separately.
- Figure 15 illustrates a trellis of an example of an embodiment of the transformation of a radix-4 stage to Walsh spreading/de-spreading function when Twiddle multipliers for randomizing Walsh codes are used in the beginning and the end of a parallel architecture.
- complex multipliers can be used as explained above, e.g. for implementation of filters in the frequency domain, or for randomization/de- randomization of the Walsh sequences with quasi-random sequences. Efficiency is achieved due to the fact that one needs to multiply the modulated data only once (for all the codes), and thus each code need not be multiplied separately.
- a "bank" of small radix-2² butterfly units of 4 bits' width can be combined to form a wider-BUS radix-2², with each of the small radixes connected to a reconfigurable, controlled "bank" of RAMs that can be combined/split.
- Reconfigurable multipliers for BUS splitting can also be implemented based on the above methodology, using a reconfigurable "processing" core with very high utilization and low power consumption for any length of IFFT/FFT/filter/correlator and Walsh-Hadamard transformations, or any sub-product of them, e.g. a CDMA DSSS core or even a DDS frequency filter, with any BUS width necessary, when several algorithms can run in any configuration, including a variety of parallel/pipeline/iterative algorithmic architecture schemes.
- FIG. 17 shows an example of a reconfigurable MF-I core for processing FFT/IFFT vectors.
- the current approach includes modification of the basic FFT processor by using a simplified interconnection structure. This allows flexibility in adjusting for the size of the FFT simply by changing the length of the shift registers (or FIFO's) of the memory, changing the bus sizes as needed, and simple multiplexing of the I/O blocks. With a clock frequency at the input sample rate, the entire range of the FFT's can be accommodated by either direct mapping to hardware and disabling unnecessary blocks for the shorter length FFT's, or by folding the processing stages and time ⁇ sharing the hardware for the longer (but slower symbol rate) cases. This architecture does not require buffering or serial-to- parallel conversion.
- the radix-4 (without the twiddle coefficients' multipliers) can be represented also as a matrix operation as seen in Figure 9.
- the corresponding butterfly structure is presented in Figure 10.
- a radix-2² stage implementation will need two stages of butterfly units with only one general multiplier and one cross junction (also needed for IFFT/FFT changing) with sign inversion, thus eliminating the need for a multiplier at the output of the first butterfly unit.
- the corresponding structure is presented in Figure 11.
- the corresponding multistage implementation (cf. Figure 4) of the Radix-2² implementation of a 16-point FFT is given in Figure 12.
- replacing the non-trivial multipliers is all that is necessary for implementation of the trivial multipliers needed for Walsh spreading/despreading, with the ability to change between FFT/IFFT and to multiply by -j.
- the only extra requirement for the hardware is in a controller for managing and controlling the operation of the processor.
- the "Radix 4" Walsh spreading/despreading butterfly can be also represented as a matrix operation as shown below:
- since the radix-4 transform is a complex operation, one gets two independent Walsh spreading/despreading processes for real vectors (since the trivial multiplications by ±1 do not interchange the I and Q signals).
- This aspect is useful in implementing a two fingers' RAKE receiver or a complex Walsh Spreading/De-spreading processor as is provided for in the new WCDMA standards.
- the complex multipliers now can be used for implementing such configurations as filters in the frequency domain for randomizing/de-randomizing the Walsh sequence with quasi-random sequences with very high efficiency when dealing with CDMA modulation/demodulation of several codes together, i.e. for a heavy data load (as can be seen in the CDMA/WCDMA standards).
- the efficiency is achieved due to the fact that one needs to multiply the modulated data only once (for all the codes), and not every code is multiplied separately.
- Figure 15 presents a transformation of radix-4 stage to a Walsh spreading/de-spreading function when Twiddle multipliers for randomizing the Walsh codes are needed (beginning/end) in a parallel architecture.
- the twiddle multipliers need to be changed to "1"s only.
- the example of 16 chips' Walsh spreading/despreading sequences for modulation/demodulation processing is shown in Figure 16.
- the complex multipliers can be used as explained above, e.g. for implementing filters in the frequency domain, or for randomization/de-randomization of the Walsh sequences with quasi-random sequences. The efficiency is achieved due to the fact that one needs to multiply the modulated data only once (for all the codes). Each code is not required to be multiplied separately.
- the general architecture of the reconfigurable device for implementing the general orthogonal transforms is summarily shown in Figure 18 for the case of Radix-2^x butterfly transforms.
- the computation unit can be implemented by use of Radix-2, Radix-2², Radix-2³, Radix-4, Radix-8, etc., butterfly units.
- the device preferably comprises a reconfigurable RAM cluster and a reconfigurable BUS multiplexer block 180, computation unit 182 comprising one or more butterfly units, reconfigurable multipliers block 184, controlling and storage unit 186 and detector 188.
- the unit 186 modifies the coefficients of the multipliers in the butterfly units of unit 182 according to the transform (the corresponding coefficients may take on the values {-1, 1, j, -j}).
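A sketch of such a coefficient-configured butterfly (illustrative Python; the coefficient set {-1, 1, j, -j} is the one named above):

```python
def butterfly(a, b, w):
    # radix-2 butterfly whose multiplier coefficient w is set by the
    # controller; w in {-1, 1, 1j, -1j} covers both the trivial FFT twiddles
    # and Walsh-Hadamard (+/-1) stages without a general multiplier
    return a + w * b, a - w * b
```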
- the result of the operation by unit 182 is stored in the registers of the unit 180 (which is also controlled by unit 186).
- the size of the registers is changed from stage to stage.
- a part of the stored data is inserted into the reconfigurable multipliers block 184, data is multiplied by coefficients established by the controlling and storage unit 186, according to the stage and the algorithm.
- the result of the multiplication is stored in block 180.
- a multiplexer of block 180 is used for multiplexing the stored data. It will be evident that as few as one butterfly unit and one multiplier can be used for each stage, and that the one butterfly unit and multiplier can be reused for each stage by simply reconfiguring the hardware.
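A software analogue of reusing one butterfly unit, time-multiplexed across all stages (an illustrative sketch, not the patent's hardware):

```python
import cmath

def butterfly(a, b, w):
    # the single butterfly "unit", reused for every stage and position
    return a + b, (a - b) * w          # DIF radix-2 butterfly

def iterative_fft(x):
    # all log2(N) stages are driven through the one butterfly above,
    # reconfigured each pass only by the twiddle coefficient w
    x = list(map(complex, x))
    n = len(x)
    span = n // 2
    while span >= 1:
        for start in range(0, n, 2 * span):
            for k in range(span):
                w = cmath.exp(-2j * cmath.pi * k / (2 * span))
                i, j = start + k, start + k + span
                x[i], x[j] = butterfly(x[i], x[j], w)
        span //= 2
    bits = n.bit_length() - 1          # undo the bit-reversed output order
    return [x[int(format(i, '0%db' % bits)[::-1], 2)] for i in range(n)]
```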
- an embodiment of an integrated chip made to comply with the foregoing chip architecture requirements will comprise the following basic functional components:
- CPU 190 is preferably a relatively small computer processing unit needed for (a) controlling the configware part of the device, i.e., net bus 192, I/O block 194, RAM block 196, megafunction block(s) 198, interconnect block 200, flash memory block 202 and clock 204; and (b) fixing the configuration of the megafunction block(s) 198, as well as the bus 192, I/O block 194, RAM block 196, interconnect block 200, flash memory block 202 and clock 204, depending upon the protocol of the signals being processed by the chip.
- CPU 190 can also help by computing minor and simple assignments or tasks, and configuring the bus that is used to interconnect the megafunctions and the I/O block.
- the net bus 192 is reconfigurable depending on the protocol.
- I/O block 194 is preferably a configurable I/O block that connects the chip with the outside world. Its tasks include receiving the "compiled software" of the application algorithm, and receiving input data and delivering output-processed data.
- RAM 196 is a random access memory preferably configured to store the "compiled software instructions", and to cache and buffer data.
- Megafunctions block 198 is preferably configured to include the major application functions of two or more applications, i.e., protocols, which are processed by computing each domain of the application functions as one function with extraordinary efficiency. In the present case, the megafunction block 198 is configured to include one or more of the orthogonal transforms, or any combination thereof, described herein.
- Interconnect block 200 preferably includes the reconfigurable net bus, which connects all the components of the chip including the CPU 190, I/O block 194, RAM 196, Megafunctions block 198, and Flash Memory 202 and Clock block 204.
- the interconnect block can also be configured to perform minor and simple assignments or tasks, preferably in extra memory.
- flash memory 202 preferably serves to store data as the chip runs through its programs. Flash memory is preferably in the form of EEPROM that allows multiple memory locations to be erased or written in one programming operation, so that it can operate at higher effective speeds when the systems using it read and write to different locations at the same time. It should be appreciated that for less complex operations, other types of memory could be used.
- Information is preferably stored in the flash memory on a silicon chip in a way that does not need power to maintain the information in the chip. Consequently, power to the chip can be withdrawn and the information retained in flash memory without consuming any power.
- flash memory offers fast read access times and solid-state shock resistance, making flash memory particularly desirable in applications such as data storage on battery-powered devices like cellular phones and PDAs.
- the architecture thus described can be implemented as an integrated circuit.
- the architecture is believed to be adaptable for any type of orthogonal signaling in which the vectors can vary in size (both real and complex vectors).
- orthogonal signaling can contain, but is not restricted to, FFT transforms, inverse FFT transforms (IFFT) or any of their sub-products, like Discrete Cosine/Sine Transforms (DCT and DST), Walsh-Hadamard transforms or any of their sub-products, like CDMA DSSS spreading/de-spreading, and any algorithm which is a combination of two or more of these algorithms, and such other functionality as, for example, filtering by using concatenation of FFT and IFFT transforms, which can be used also for equalization, Hilbert transforms, predictions, interpolations, correlations, etc.
- IFFT inverse FFT transforms
- DCT and DST Discrete Cosine/Sine Transforms
- Walsh-Hadamard transforms or any its sub-product like CDMA DSSS Spreading
Landscapes
- Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Algebra (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Discrete Mathematics (AREA)
- Complex Calculations (AREA)
Abstract
Description
Claims
Priority Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2007520491A JP2008506191A (en) | 2004-07-08 | 2005-07-08 | Method and apparatus for performing variable size fast orthogonal transform |
KR1020077003027A KR101162649B1 (en) | 2004-07-08 | 2005-07-08 | A method of and apparatus for implementing fast orthogonal transforms of variable size |
EP05768342A EP1769391A1 (en) | 2004-07-08 | 2005-07-08 | A method of and apparatus for implementing fast orthogonal transforms of variable size |
AU2005269896A AU2005269896A1 (en) | 2004-07-08 | 2005-07-08 | A method of and apparatus for implementing fast orthogonal transforms of variable size |
CA002563450A CA2563450A1 (en) | 2004-07-08 | 2005-07-08 | A method of and apparatus for implementing fast orthogonal transforms of variable size |
IL180586A IL180586A0 (en) | 2004-07-08 | 2007-01-08 | A method of and apparatus for implementing fast orthogonal transforms of variable size |
Applications Claiming Priority (12)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US58639004P | 2004-07-08 | 2004-07-08 | |
US58638904P | 2004-07-08 | 2004-07-08 | |
US58639104P | 2004-07-08 | 2004-07-08 | |
US58635304P | 2004-07-08 | 2004-07-08 | |
US60/586,389 | 2004-07-08 | ||
US60/586,353 | 2004-07-08 | ||
US60/586,391 | 2004-07-08 | ||
US60/586,390 | 2004-07-08 | ||
US60425804P | 2004-08-25 | 2004-08-25 | |
US60/604,258 | 2004-08-25 | ||
US11/071,340 US7568059B2 (en) | 2004-07-08 | 2005-03-03 | Low-power reconfigurable architecture for simultaneous implementation of distinct communication standards |
US11/071,340 | 2005-03-03 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2006014528A1 true WO2006014528A1 (en) | 2006-02-09 |
Family
ID=35787416
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2005/024063 WO2006014528A1 (en) | 2004-07-08 | 2005-07-08 | A method of and apparatus for implementing fast orthogonal transforms of variable size |
Country Status (6)
Country | Link |
---|---|
EP (1) | EP1769391A1 (en) |
JP (1) | JP2008506191A (en) |
KR (1) | KR101162649B1 (en) |
AU (1) | AU2005269896A1 (en) |
CA (1) | CA2563450A1 (en) |
WO (1) | WO2006014528A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010026211A1 (en) * | 2008-09-05 | 2010-03-11 | Commissariat A L'energie Atomique | Digital processing device for fourier transform and filtering with finite impulse response |
KR101275087B1 (en) | 2011-10-28 | 2013-06-17 | (주)에프씨아이 | Ofdm receiver |
EP2696294A4 (en) * | 2011-04-07 | 2017-03-01 | ZTE Microelectronics Technology Co., Ltd | Method and device of supporting arbitrary replacement among multiple data units |
CN113111300A (en) * | 2020-01-13 | 2021-07-13 | 上海大学 | Fixed point FFT implementation architecture with optimized resource consumption |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2009110022A1 (en) * | 2008-03-03 | 2009-09-11 | 富士通株式会社 | Wireless communication device |
WO2013042249A1 (en) * | 2011-09-22 | 2013-03-28 | 富士通株式会社 | Fast fourier transform circuit |
JPWO2013042249A1 (en) * | 2011-09-22 | 2015-03-26 | 富士通株式会社 | Fast Fourier transform circuit |
JP5954415B2 (en) * | 2012-07-18 | 2016-07-20 | 日本電気株式会社 | FFT circuit |
US9934199B2 (en) | 2013-07-23 | 2018-04-03 | Nec Corporation | Digital filter device, digital filtering method, and storage medium having digital filter program stored thereon |
WO2015087495A1 (en) | 2013-12-13 | 2015-06-18 | 日本電気株式会社 | Digital filter device, digital filter processing method, and storage medium having digital filter program stored thereon |
GB2548908B (en) * | 2016-04-01 | 2019-01-30 | Advanced Risc Mach Ltd | Complex multiply instruction |
KR102155770B1 (en) * | 2018-11-27 | 2020-09-14 | 한국항공대학교산학협력단 | Scalable fast Fourier transform apparatus and method based on twice perfect shuffle network for radar applications |
US20230029006A1 (en) * | 2020-02-06 | 2023-01-26 | Mitsubishi Electric Corporation | Complex multiplication circuit |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6061705A (en) * | 1998-01-21 | 2000-05-09 | Telefonaktiebolaget Lm Ericsson | Power and area efficient fast fourier transform processor |
US6237012B1 (en) * | 1997-11-07 | 2001-05-22 | Matsushita Electric Industrial Co., Ltd. | Orthogonal transform apparatus |
US20030055861A1 (en) * | 2001-09-18 | 2003-03-20 | Lai Gary N. | Multipler unit in reconfigurable chip |
US6735167B1 (en) * | 1999-11-29 | 2004-05-11 | Fujitsu Limited | Orthogonal transform processor |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH02504682A (en) * | 1987-08-21 | 1990-12-27 | コモンウエルス・サイエンティフィック・アンド・インダストリアル・リサーチ・オーガニゼイション | Conversion processing circuit |
US5293330A (en) * | 1991-11-08 | 1994-03-08 | Communications Satellite Corporation | Pipeline processor for mixed-size FFTs |
WO1997019412A1 (en) * | 1995-11-17 | 1997-05-29 | Teracom Svensk Rundradio | Improvements in or relating to real-time pipeline fast fourier transform processors |
US6003056A (en) * | 1997-01-06 | 1999-12-14 | Auslander; Lewis | Dimensionless fast fourier transform method and apparatus |
JP3846197B2 (en) * | 2001-01-19 | 2006-11-15 | ソニー株式会社 | Arithmetic system |
JP4546711B2 (en) * | 2002-10-07 | 2010-09-15 | パナソニック株式会社 | Communication device |
2005
- 2005-07-08 CA CA002563450A patent/CA2563450A1/en not_active Abandoned
- 2005-07-08 AU AU2005269896A patent/AU2005269896A1/en not_active Abandoned
- 2005-07-08 WO PCT/US2005/024063 patent/WO2006014528A1/en active Application Filing
- 2005-07-08 EP EP05768342A patent/EP1769391A1/en not_active Withdrawn
- 2005-07-08 JP JP2007520491A patent/JP2008506191A/en active Pending
- 2005-07-08 KR KR1020077003027A patent/KR101162649B1/en not_active IP Right Cessation
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010026211A1 (en) * | 2008-09-05 | 2010-03-11 | Commissariat A L'energie Atomique | Digital processing device for fourier transform and filtering with finite impulse response |
FR2935819A1 (en) * | 2008-09-05 | 2010-03-12 | Commissariat Energie Atomique | DIGITAL PROCESSING DEVICE FOR FOURIER TRANSFORMATION AND FINITE IMPULSE RESPONSE FILTERING |
EP2696294A4 (en) * | 2011-04-07 | 2017-03-01 | ZTE Microelectronics Technology Co., Ltd | Method and device of supporting arbitrary replacement among multiple data units |
KR101275087B1 (en) | 2011-10-28 | 2013-06-17 | (주)에프씨아이 | Ofdm receiver |
CN113111300A (en) * | 2020-01-13 | 2021-07-13 | 上海大学 | Fixed point FFT implementation architecture with optimized resource consumption |
Also Published As
Publication number | Publication date |
---|---|
JP2008506191A (en) | 2008-02-28 |
AU2005269896A1 (en) | 2006-02-09 |
EP1769391A1 (en) | 2007-04-04 |
CA2563450A1 (en) | 2006-02-09 |
KR101162649B1 (en) | 2012-07-06 |
KR20070060074A (en) | 2007-06-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7870176B2 (en) | Method of and apparatus for implementing fast orthogonal transforms of variable size | |
KR101162649B1 (en) | A method of and apparatus for implementing fast orthogonal transforms of variable size | |
He et al. | A new approach to pipeline FFT processor | |
Uzun et al. | FPGA implementations of fast Fourier transforms for real-time signal and image processing | |
Wold et al. | Pipeline and parallel-pipeline FFT processors for VLSI implementations | |
JP4163178B2 (en) | Optimized discrete Fourier transform method and apparatus using prime factorization algorithm | |
CN113568851B (en) | Method for accessing memory and corresponding circuit | |
EP1546863B1 (en) | Computationally efficient mathematical engine | |
Revanna et al. | A scalable FFT processor architecture for OFDM based communication systems | |
US6658441B1 (en) | Apparatus and method for recursive parallel and pipelined fast fourier transform | |
Joshi | FFT architectures: a review | |
CN100547580C | Method of and apparatus for implementing fast orthogonal transforms of variable size | |
EP1076296A2 (en) | Data storage for fast fourier transforms | |
Hassan et al. | Implementation of a reconfigurable ASIP for high throughput low power DFT/DCT/FIR engine | |
US6330580B1 (en) | Pipelined fast fourier transform processor | |
Vergara et al. | A 195K FFT/s (256-points) high performance FFT/IFFT processor for OFDM applications | |
Adiono et al. | 64-point fast efficient FFT architecture using radix-2³ single path delay feedback | |
Ward et al. | Bit-level systolic array implementation of the Winograd Fourier transform algorithm | |
Rawski et al. | Distributed arithmetic based implementation of Fourier transform targeted at FPGA architectures | |
Guo | An efficient parallel adder based design for one dimensional discrete Fourier transform | |
Chang et al. | Hardware-efficient implementations for discrete function transforms using LUT-based FPGAs | |
Tang et al. | A new memory reference reduction method for FFT implementation on DSP | |
KR100416641B1 (en) | The Calculation Methods and Cricuits for High-Speed FFT on Programmable Processors | |
Zhou et al. | A Coarse-Grained Dynamically Reconfigurable Processing Array (RPA) for Multimedia Application | |
CA2451167A1 (en) | Pipelined fft processor with memory address interleaving |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
WWE | Wipo information: entry into national phase |
Ref document number: 2005768342 Country of ref document: EP Ref document number: 2005269896 Country of ref document: AU Ref document number: 2563450 Country of ref document: CA |
|
ENP | Entry into the national phase |
Ref document number: 2005269896 Country of ref document: AU Date of ref document: 20050708 Kind code of ref document: A |
|
WWP | Wipo information: published in national office |
Ref document number: 2005269896 Country of ref document: AU |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2007520491 Country of ref document: JP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 200580023094.3 Country of ref document: CN Ref document number: 180586 Country of ref document: IL Ref document number: 86/CHENP/2007 Country of ref document: IN |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
WWW | Wipo information: withdrawn in national office |
Ref document number: DE |
|
WWE | Wipo information: entry into national phase |
Ref document number: 1020077003027 Country of ref document: KR |
|
WWP | Wipo information: published in national office |
Ref document number: 2005768342 Country of ref document: EP |