JPS59189474A

JPS59189474A - High-speed Fourier transformation operating device

Info

Publication number: JPS59189474A
Application number: JP58065130A
Authority: JP
Inventors: Hideo Nagai; 秀夫長井
Original assignee: Yokogawa Medical Systems Ltd
Current assignee: GE Healthcare Japan Corp
Priority date: 1983-04-13
Filing date: 1983-04-13
Publication date: 1984-10-27
Also published as: JPH0230062B2

Abstract

PURPOSE: To shorten the time required for the operation by providing two high-speed access memories of minimum capacity and skillfully controlling the input/output memory access of a large-capacity memory. CONSTITUTION: One data point is taken from each of the small blocks obtained by quartering each block, and the data are written into a high-speed access memory HSM1. The next four complex data are then written from the large-capacity memory MS into a memory HSM2 while, at the same time, the fast Fourier transform (FFT) operation is performed on HSM1 and the result is written back to HSM1. Next, the FFT operation is performed on HSM2 and the result is written back to HSM2; at the same time, the four complex data of HSM1 are written to MS, and the next data are then written from MS into HSM1. The FFT operation is then performed on HSM1 and the result is written back to HSM1; at the same time, the data of HSM2 are written to MS, and the next data are then read from MS.

Description

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to an FFT arithmetic device that performs the fast Fourier transform (hereinafter abbreviated as FFT). Conventionally, in a general-purpose pipeline array processor (hereinafter abbreviated as AP), data memory access normally involves a time delay, which increases the delay before additions, subtractions, and multiplications complete.

Reading data from (or writing data to) the memory that stores the input (or output) data of an operation involves a delay. In an FFT operation, which requires large amounts of data to be read or written continuously, the computing (processing) capability therefore ends up limited to the speed of data exchange with the memory.

For example, because each point is represented by two values, a real part and an imaginary part, reading and writing two points of complex data takes 4 cycles for reading and 4 cycles for writing, 8 cycles in total. In this case, even if the 4 multiplications and 6 additions could all be started within 6 cycles by running multiplication and addition in parallel, the 8 cycles of memory access determine the speed of the FFT operation. In other words, the FFT operation speed is limited by the number of memory accesses.
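The cycle counts in this example can be tallied with a short sketch. It is an illustration only: the function name, the one-word-per-cycle memory, and the rule that the multiplier and the adder each start one operation per cycle are assumptions made here, not details taken from the patent.

```python
def butterfly_costs(points=2, words_per_point=2):
    """Tally memory cycles and arithmetic starts for one radix-2 butterfly."""
    reads = points * words_per_point    # real and imaginary word of each point
    writes = points * words_per_point   # results overwrite the same points
    memory_cycles = reads + writes      # assumed: one word transferred per cycle
    multiplies = 4                      # one complex product = 4 real multiplications
    additions = 2 + 4                   # 2 inside the product, 4 for the complex sum and difference
    # Assumed: multiplier and adder run in parallel, one new operation per cycle each.
    arithmetic_cycles = max(multiplies, additions)
    return memory_cycles, multiplies, additions, arithmetic_cycles


print(butterfly_costs())  # (8, 4, 6, 6)
```

Under these assumptions the memory needs 8 cycles while the arithmetic needs only 6, so the memory side sets the pace.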

In view of these points, an object of the present invention is to provide an FFT device that has, in addition to a low-speed, large-capacity memory for storing the FFT input/output data, two high-speed access memories of minimum capacity, and that shortens the time required for the operation by skillfully controlling the input and output memory addresses of the large-capacity memory.

The present invention is described in detail below with reference to the drawings. First, the algorithm of the method of the present invention is explained. Fig. 1 shows the signal flow of the method of the present invention, namely the computation method and the data flow for an example with 16 data points, the DIT method, in-place processing, and natural-order input. Fig. 2 shows the details of the signal flow for an example with 2·ND data points per block, the DIT method, in-place processing, and natural-order input.

The basics of the FFT algorithm in the method of the present invention are as follows. This FFT is a base-2 FFT with N complex data points, in-place processing, natural-order input data, and the DIT method. The computation procedure is given separately for the cases $N = 2^{2\gamma}$ and $N = 2^{2\gamma+1}$.

(a) Case $N = 2^{2\gamma}$

(1) Set $N = 2^{2\gamma}$, $ND = N/2$, $\ell = 1$.

(2) Set $K = 0$, $M = 0$.

(3) For $k = K, K+1, K+2, \ldots, K + ND/2 - 1$, perform the following operations.

$A = C_{\ell-1}(k) + C_{\ell-1}(k+ND)\cdot W(M)$ (1-1)
$C = C_{\ell-1}(k) - C_{\ell-1}(k+ND)\cdot W(M)$ (1-2)
$B = C_{\ell-1}(k+\tfrac{1}{2}ND) + C_{\ell-1}(k+\tfrac{3}{2}ND)\cdot W(M)$ (2-1)
$D = C_{\ell-1}(k+\tfrac{1}{2}ND) - C_{\ell-1}(k+\tfrac{3}{2}ND)\cdot W(M)$ (2-2)
$C_{\ell}(k) = A + B\cdot W(2M)$ (3-1)
$C_{\ell}(k+\tfrac{1}{2}ND) = A - B\cdot W(2M)$ (3-2)
$C_{\ell}(k+ND) = C + D\cdot W(2M+2)$ (4-1)
$C_{\ell}(k+\tfrac{3}{2}ND) = C - D\cdot W(2M+2)$ (4-2)

Here $W(i)$ is defined in terms of $P = \mathrm{BitReverse}(i)$, with $j^2 = -1$.

(4) Set $M = M + 2$, $K = K + 2\cdot ND$; if $K < N$, return to (3).

(5) Set $ND = ND/4$, $\ell = \ell + 1$; if $\ell \le \gamma$, return to (2).

(6) For $k = 0, 1, \ldots, N-1$ (reordering the output data array into normal order), set $q = \mathrm{BitReverse}(k)$; if $q > k$, exchange $C_{\ell}(k)$ and $C_{\ell}(q)$.

(b) Case $N = 2^{2\gamma+1}$

(1) Set $N = 2^{2\gamma+1}$, $ND = N/2$, $\ell = 1$.

(2) Set $K = 0$, $M = 0$.

(3) For $k = K, K+1, K+2, \ldots, K + ND/2 - 1$, perform the following operations.

$A = C_{\ell-1}(k) + C_{\ell-1}(k+ND)\cdot W(M)$
$C = C_{\ell-1}(k) - C_{\ell-1}(k+ND)\cdot W(M)$
$B = C_{\ell-1}(k+ND/2) + C_{\ell-1}(k+\tfrac{3}{2}ND)\cdot W(M)$
$D = C_{\ell-1}(k+ND/2) - C_{\ell-1}(k+\tfrac{3}{2}ND)\cdot W(M)$
$C_{\ell}(k) = A + B\cdot W(2M)$
$C_{\ell}(k+ND/2) = A - B\cdot W(2M)$
$C_{\ell}(k+ND) = C + D\cdot W(2M+2)$
$C_{\ell}(k+\tfrac{3}{2}ND) = C - D\cdot W(2M+2)$

Here $W(i)$ is defined in terms of $P = \mathrm{BitReverse}(i)$, with $j^2 = -1$.

(4) Set $M = M + 2$, $K = K + 2\cdot ND$; if $K < N$, return to (3).

(5) Set $ND = ND/4$, $\ell = \ell + 1$; if $\ell \le \gamma$, return to (2).

(6) For $k = 0, 1, 2, \ldots, N/2 - 1$, perform the following operations.

$C_{\ell}(2k) = C_{\ell-1}(2k) + C_{\ell-1}(2k+1)\cdot W(2k)$ (5-1)
$C_{\ell}(2k+1) = C_{\ell-1}(2k) - C_{\ell-1}(2k+1)\cdot W(2k)$ (5-2)

(7) For $k = 0, 1, 2, \ldots, N-1$ (reordering the data array into normal order), set $q = \mathrm{BitReverse}(k)$; if $q > k$, exchange $C_{\ell}(k)$ and $C_{\ell}(q)$.

In addition, the method of the present invention takes one data point from each of the four small blocks obtained by quartering each block, and performs the processing of two loops of an ordinary FFT on these four points at once, so that the number of accesses to the large-capacity memory is halved.
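The two procedures can be sketched in software as follows. This is a minimal sketch, not the patent's microprogram. It assumes that the twiddle factor is W(i) = exp(-2πj·P/N) with P = BitReverse(i) taken over log2(N) bits, because the exact definition (including the sign of the exponent) is not legible in the source. The names `fft_two_loops`, `bit_reverse`, and `naive_dft` are hypothetical.

```python
import cmath


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft_two_loops(x):
    """In-place DIT FFT, natural-order input, two radix-2 loops per pass."""
    n = len(x)
    bits = n.bit_length() - 1
    assert n == 1 << bits, "N must be a power of two"
    c = [complex(v) for v in x]

    def w(i):
        # Assumed twiddle definition: exponent is the bit-reversed index.
        return cmath.exp(-2j * cmath.pi * bit_reverse(i, bits) / n)

    nd = n // 2                                 # half the block length
    for _ in range(bits // 2):                  # gamma passes, two loops each
        m = 0
        for big_k in range(0, n, 2 * nd):       # one block per iteration
            h = nd // 2                         # small-block length
            for k in range(big_k, big_k + h):
                a = c[k] + c[k + nd] * w(m)
                cc = c[k] - c[k + nd] * w(m)
                b = c[k + h] + c[k + nd + h] * w(m)
                d = c[k + h] - c[k + nd + h] * w(m)
                c[k] = a + b * w(2 * m)
                c[k + h] = a - b * w(2 * m)
                c[k + nd] = cc + d * w(2 * m + 2)
                c[k + nd + h] = cc - d * w(2 * m + 2)
            m += 2
        nd //= 4
    if bits % 2:                                # N = 2^(2*gamma+1): one loop left
        for k in range(n // 2):
            t = c[2 * k + 1] * w(2 * k)
            c[2 * k], c[2 * k + 1] = c[2 * k] + t, c[2 * k] - t
    for k in range(n):                          # bit-reversal reordering
        q = bit_reverse(k, bits)
        if q > k:
            c[k], c[q] = c[q], c[k]
    return c


def naive_dft(x):
    """Direct O(N^2) DFT with the same sign convention, used only as a check."""
    n = len(x)
    return [sum(x[p] * cmath.exp(-2j * cmath.pi * p * k / n) for p in range(n))
            for k in range(n)]
```

Each pass of the inner loop touches four points once and applies two radix-2 loops to them, which is where the halving of large-memory accesses comes from. Comparing `fft_two_loops(x)` with `naive_dft(x)` for a few power-of-two lengths is a quick way to check the assumed twiddle convention.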

Figs. 3 and 4 show the relationship among loops, blocks, and small blocks. Fig. 3 shows the case where data consisting only of real parts, or only of imaginary parts, is stored contiguously, and Fig. 4 shows the case where data consisting of a real part and an imaginary part is stored contiguously.

Fig. 5 shows an embodiment of an FFT device for executing the algorithm described above. In Fig. 5, RF1 and RF2 are register files, each consisting of a plurality of registers, which store the read addresses and write addresses of the large-capacity memory MS. RI1 and RI2 are registers that supply the increment of the read address and the increment of the store address. SEL1 and SEL2 are selectors that each select one of the inputs a1 and a2, and AD1 and AD2 are adders that add addresses. AD1 adds the outputs of RF1 and SEL1 and supplies the result, via the selector SEL3, to the address register MA of MS.

The output of AD1 is fed back to the corresponding register of RF1 and becomes the next address data. AD2 adds the outputs of RF2 and SEL2 and supplies the result to MA via SEL3. The output of AD2 is stored in the corresponding register of RF2 and becomes the next address data.

HSM1 and HSM2 are high-speed, small-capacity memories. They are used to store the data read from MS, the intermediate results of the FFT operation, and the results of the FFT operation.

HSM1 and HSM2 can read and write data simultaneously at the minimum cycle time (the read address and the write address do not necessarily coincide).

TBM is a data memory (a RAM is used) that stores the constants of the FFT operation, such as W(M).

ADD is an adder that adds its two inputs A1 and A2 and can supply the result to HSM1, HSM2, and the A2 input of ADD. One of the outputs of HSM1, HSM2, and MUL is selected for the A1 input of ADD. One of the outputs of HSM1, HSM2, and ADD itself is selected for the A2 input of ADD. MUL is a multiplier that multiplies its two inputs M1 and M2 and can supply the result to HSM1, HSM2, and the A1 input of ADD. The output of TBM is supplied to the M1 input of MUL.

One of the outputs of HSM1 and HSM2 is selected for the M2 input of MUL.

IF is the interface section to external devices (a CPU, a data acquisition device, and so on).

Data, address information, control signals, and the like are exchanged through it. Data transfer with the outside takes place between MS and the external device.

CS is a memory that stores the microprogram for the FFT operation, and CTL is the control section that operates according to the microprogram. ADD, MUL, the read and write operations of HSM1 and HSM2, the read operation of TBM, and so on can operate in parallel.

The operation of the present invention in this configuration is described next.

(1) Preparation for each loop. The FFT input data is stored in MS beforehand.

RF1 and RF2 are register files of 8 registers each. The initial values RF1(j) and RF2(j) (j = 0, 1, ..., 7) of the registers are set as follows (RF1 is for reading, RF2 is for writing).

RBA: start address of the real data. IBA: start address of the imaginary data. ND: half the number of data points in a block.

The initial values of RI1 and RI2 are set as follows (an example for the case where the real data and the imaginary data are stored separately).

$RI1 = 1$ (9-1)
$RI2 = \tfrac{3}{2}ND + 1$ (9-2)
$W(M) = W(0) = 1$ (10)

(2) Write into HSM1 the 4 complex data corresponding to (RBA, IBA); (RBA + ND/2, IBA + ND/2); (RBA + ND, IBA + ND); (RBA + 3ND/2, IBA + 3ND/2) (MS → HSM1).

(3) Write the next 4 complex data from MS into HSM2 (MS → HSM2).

At the same time, perform the FFT operation on HSM1 (HSM1 → HSM1; item (3) of algorithm (a)).

(4) Perform the following operations in parallel.

(i) Store the 4 complex data of HSM1 at the addresses RF2(j) + RI2 for the first data of a block, and RF2(j) + RI1 for data other than the first data of a block (HSM1 → MS). Then read into HSM1 the 4 complex data at the addresses RF1(j) + RI2 for the first data of a block, and RF1(j) + RI1 for data other than the first data of a block (MS → HSM1).

(ii) Perform the FFT operation on the 4 complex data of HSM2 (HSM2 → HSM2; item (3) of algorithm (a)).

(5) Perform the following operations in parallel.

(i) Store the 4 complex data of HSM2 at the addresses RF2(j) + RI2 for the first data of a block, and RF2(j) + RI1 for data other than the first data of a block (HSM2 → MS). Then read into HSM2 the 4 complex data at the addresses RF1(j) + RI2 for the first data of a block, and RF1(j) + RI1 otherwise (MS → HSM2).

(ii) Perform the FFT operation on the 4 complex data of HSM1 (HSM1 → HSM1; item (3) of algorithm (a)).

(6) Repeat (4) and (5) above until all FFT operations for one block are finished. After this processing is finished, set M = M + 2; if the processing of all blocks in this loop has not finished, return to the operation of (5) again.

(7) Store the 4 complex data of HSM2 in MS.

After the processing of one loop is finished, set ND = ND/4; if the processing of all loops has not finished, return to (1).

Steps (1) to (7) above are an example of an algorithm that uses the device of Fig. 5 to realize a high-speed FFT operation exceeding the memory access limit.
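The alternation between the two high-speed memories in these steps can be sketched as a simple schedule. This is a hypothetical illustration: the slot model (one transfer phase and one compute phase per slot) and the name `ping_pong_schedule` are assumptions made here, and the real device is sequenced by its microprogram.

```python
def ping_pong_schedule(num_groups):
    """List, slot by slot, which buffer exchanges data with MS and which is computed on.

    Group g (one set of 4 complex data) is loaded in slot g, transformed in
    slot g + 1, and stored back to MS in slot g + 2. Even-numbered groups use
    HSM1 and odd-numbered groups use HSM2.
    """
    def buf(g):
        return "HSM1" if g % 2 == 0 else "HSM2"

    slots = []
    for t in range(num_groups + 2):
        transfer = []
        if 0 <= t - 2 < num_groups:             # store the result computed one slot earlier
            transfer.append(("store", buf(t - 2), t - 2))
        if t < num_groups:                      # then load the next group into the same buffer
            transfer.append(("load", buf(t), t))
        compute = (buf(t - 1), t - 1) if 0 <= t - 1 < num_groups else None
        slots.append({"transfer": transfer, "compute": compute})
    return slots


for slot in ping_pong_schedule(4):
    print(slot)
```

In every slot the buffer being transferred differs from the buffer being computed on, so under this model the MS traffic and the FFT arithmetic never contend for the same high-speed memory.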

Control of the number of data processing operations (n per block), control of the number of loops, and the like are performed by CTL according to the microprogram in CS.

The register files RF1 and RF2 store the read addresses or write addresses of the 4 points (8 data words); after each reference, the contents of the register are updated to the next reference address. RI2 is used when calculating the address of the first data of a block (read address or write address), and RI1 is used when calculating all other data addresses.
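The address update can be sketched as follows. This is a sketch under several assumptions: only the 4 real-part addresses are modeled (the device keeps 8, covering real and imaginary words), the increments are taken as RI1 = 1 and RI2 = (3/2)·ND + 1 because these values reproduce the four-point index pattern of the algorithm, and each register is assumed to be preset so that adding RI2 yields the first address. The name `address_stream` is hypothetical.

```python
def address_stream(rba, nd, n):
    """Yield the 4 real-part addresses referenced at each step of one loop.

    rba: start address of the real data; nd: half the block length (even);
    n: total number of complex points.
    """
    ri1 = 1                       # increment inside a small block
    ri2 = 3 * nd // 2 + 1         # assumed increment to the first data of the next block
    half = nd // 2                # small-block length = steps per block
    # Assumed preset: register j holds (first address of point j) - RI2.
    rf = [rba + j * half - ri2 for j in range(4)]
    for step in range(n // 4):
        inc = ri2 if step % half == 0 else ri1
        rf = [r + inc for r in rf]            # adder output fed back to the register file
        yield tuple(rf)


print(list(address_stream(rba=0, nd=8, n=16)))
# [(0, 4, 8, 12), (1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15)]
```

With these increments a single adder and two increment registers are enough to walk through every block, since the jump from the last point of one block to the first point of the next is the same for all four registers.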

The present invention is not limited to the embodiment described above; the extensions and modifications listed below are also possible.

(1) Extensions and modifications of the algorithm, such as an FFT arithmetic device based on an algorithm that divides one block into 8 (16, 32, ...) parts and processes 3 (4, 5, ...) loops at once.

(2) Unification or separation of the lines, buses, and so on for control signals, data, and addresses, and different connections, couplings, and wiring between the units.

(3) Speeding up the processing by using a plurality of adders (adder-subtractors), for example a butterfly arithmetic unit, or a plurality of multipliers.

(4) Sharing and substitution of units:
(i) sharing AD1 and AD2 (SEL3 becomes unnecessary);
(ii) sharing TBM with HSM1 or HSM2;
(iii) using CTL alone, without CS.

(5) Combining and merging of units:
(i) including (SEL3 and) MA in MS (the case where (SEL3 and) MA apparently do not exist);
(ii) including SEL1 in AD1;
(iii) including SEL2 in AD2;
(iv) merging RI1 and RI2 (SEL1 and SEL2 become unnecessary);
(v) merging RF1 and RF2.

(6) Others:
- providing two each of RI1 and RI2;
- implementing RI1 and RI2 as ROM (a plurality of them);
- providing a plurality of RI1 and RI2;
- implementing TBM as ROM;
- the number of registers in RF1 and RF2 is not limited to 8.

As described above, the present invention provides the following effects.

(1) A high-speed FFT operation exceeding the memory access limit can be realized.

(2) A simple and widely applicable base-2 FFT can be realized.

(3) An inexpensive, economical device that does not use a butterfly arithmetic unit or the like can be built.

(4) The device can be built without losing its character as a general-purpose array processor.

[Brief explanation of drawings]

Figs. 1 and 2 show the signal flow of the present invention, Figs. 3 and 4 show the relationship among loops, blocks, and small blocks, and Fig. 5 is a block diagram of the main part of an embodiment of the FFT arithmetic device of the present invention.

RF1, RF2: register files; RI1, RI2: registers; SEL1, SEL2, SEL3: selectors; AD1, AD2: adders; MA: address register; MS: large-capacity memory; HSM1, HSM2: high-speed small-capacity memories; TBM: data memory; ADD: adder; MUL: multiplier; IF: interface.

Claims

[Claims]

(1) An FFT arithmetic device using an FFT arithmetic method characterized in that the input data is divided into blocks, each block is further divided into a group of small blocks, and FFT operations for a plurality of loops are performed among the data groups extracted from the small block groups, thereby reducing the number of memory accesses, reducing the delay in memory access, and enabling high-speed operation.

(2) The FFT arithmetic device according to claim 1, characterized in that double high-speed access memories are used as buffers and, while the input data address group and the output data address group of the small blocks in the low-speed large-capacity memory are controlled, the FFT operation for the plurality of loops, performed on one data group using a high-speed memory, is carried out in parallel with the storing of the previous FFT operation result into the low-speed memory and the subsequent reading of the next data group from the low-speed memory into the high-speed memory.