JPH05266060A

JPH05266060A - Matrix arithmetic circuit

Info

Publication number: JPH05266060A
Application number: JP6381992A
Authority: JP
Inventors: Eiji Morimatsu; 映史森松; Tadami Kono; 忠美河野
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1992-03-19
Filing date: 1992-03-19
Publication date: 1993-10-15

Abstract

PURPOSE: To execute matrix arithmetic in arbitrary arithmetic and output scan directions without enlarging the hardware scale. CONSTITUTION: The reading rate of either the input matrix data stored in a data memory 1 or the coefficient matrix data stored in a coefficient memory 2 is made n times as high as the serial output rate, and the reading rate of the other is made equal to it, so that the matrix arithmetic is executed with one multiplier and the output matrix data of the result are output serially. The circuit is provided with an address generating circuit 3 that generates a data memory address and a coefficient memory address. In accordance with an arithmetic direction instruction, which indicates whether the coefficient matrix data are multiplied from the right or from the left of the input matrix data, and an output scan direction instruction, which indicates whether data are output by horizontal scan or vertical scan, the address generating circuit selects which of the input matrix data and the coefficient matrix data is read at the n-times rate and which at the equal rate, and selects whether the address order runs horizontally or vertically through the matrix.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a matrix operation circuit, and more particularly to a matrix operation circuit that multiplies two sets of matrix data.

[0002] In digital signal processing apparatus and the like, matrix operations are among the most widely used operations. Various circuits can be considered for realizing such a matrix operation, but it is desirable to keep the number of multipliers small, because multipliers are large in hardware scale.

[0003]

[Prior Art] FIG. 9 shows an example of such a matrix operation; for simplicity, a 2 × 2 matrix is used here. Multiplying the input digital data, represented by matrix X, by the digital data represented by a transform coefficient matrix C yields the transformed output data matrix Y. Each of the matrix elements x₁₁, x₁₂, x₂₁, x₂₂, c₁₁, c₁₂, c₂₁, c₂₂, y₁₁, y₁₂, y₂₁, y₂₂ represents one item of digital data (expressed in, for example, 8 bits).

[0004] Several circuits can be considered for realizing this matrix operation. FIG. 10 shows an example that needs only a small number of multipliers, which are large in hardware scale. In the figure, 1 is a data memory, normally provided outside the matrix operation circuit 10, that stores input matrix data X (hereinafter sometimes simply called "data"); 2 is a coefficient memory, provided inside the matrix operation circuit 10, that stores coefficient matrix data (hereinafter sometimes simply called "coefficients"); 21 and 22 are address generators that generate and supply the respective addresses for data memory 1 and coefficient memory 2; 11 is a multiplier that multiplies the data from data memory 1 by the data from coefficient memory 2; 12 and 13 are adders that add the multiplication result of multiplier 11 to the output values of registers 14 and 15, which hold the values from the preceding operation; 16 and 17 are registers that hold the output values of registers 14 and 15, respectively; and 18 is a selector that serially outputs one of the output values held in registers 16 and 17.

[0005] FIG. 11 shows an operation time chart of this matrix operation circuit 10. Address generator 21 supplies predetermined addresses to memory 1, so that input data X are read out serially in the order x₁₁ → x₁₂ → x₂₁ → x₂₂, as illustrated. At the same time, address generator 22 supplies predetermined addresses to the internal memory 2, so that coefficient data C are read out in the order c₁₁ → c₁₂ → c₂₁ → c₂₂ at twice that speed, as illustrated.

[0006] The output data of memories 1 and 2 are then multiplied together by multiplier 11 in accordance with the timing at which input data X change. By way of adder 12 and register 14, register 16 outputs the data shown in the figure once every two input data periods. At the same time, by way of adder 13 and register 15, register 17 outputs the data shown in the figure, likewise once every two input data periods but delayed from register 16 by one coefficient data period.

[0007] The output data of registers 16 and 17 are selected alternately by selector 18 at the timing of the input data, yielding the data shown in FIG. 9: y₁₁ (= x₁₁c₁₁ + x₁₂c₂₁), y₁₂ (= x₁₁c₁₂ + x₁₂c₂₂), y₂₁ (= x₂₁c₁₁ + x₂₂c₂₁), and y₂₂ (= x₂₁c₁₂ + x₂₂c₂₂).

[0008] In this way, by processing coefficient data C at twice the processing speed of input data X, the matrix operation is realized with a single multiplier.
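The timing described in paragraphs [0005] to [0008] can be illustrated with a short Python sketch. It is only a behavioural model under stated assumptions, not the circuit itself: the function name, the flat row-major memory layout, and the formula used for the coefficient address are hypothetical, since the specification only says that address generators 21 and 22 supply "predetermined addresses".

```python
def conventional_matmul_2x2(x, c):
    """Serially compute Y = X * C for 2x2 matrices with one multiplier.

    x, c: flat row-major lists [m11, m12, m21, m22] (assumed layout).
    Returns the serial output in the order y11, y12, y21, y22.
    """
    out = []
    acc_a = acc_b = 0                      # running sums (registers 14 and 15)
    for data_addr in range(4):             # data memory: one read per slot
        xv = x[data_addr]
        for half in range(2):              # coefficient memory: two reads per slot
            # Assumed address rule giving c11, c12 | c21, c22 | c11, c12 | c21, c22
            coef_addr = ((data_addr & 1) << 1) | half
            product = xv * c[coef_addr]    # the single multiplier 11
            if half == 0:
                acc_a += product           # adder 12 -> register 14
            else:
                acc_b += product           # adder 13 -> register 15
        if data_addr & 1:                  # one row of X has been consumed
            out += [acc_a, acc_b]          # registers 16, 17 -> selector 18
            acc_a = acc_b = 0
    return out
```

For example, with X = [[1, 2], [3, 4]] and C = [[5, 6], [7, 8]], `conventional_matmul_2x2([1, 2, 3, 4], [5, 6, 7, 8])` returns `[19, 22, 43, 50]`, which is X·C in horizontal scan order.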

[0009]

[Problems to Be Solved by the Invention] The conventional example described above has the following problems. First, because address generation is fixed, there is no freedom in the operation direction, that is, in whether the input data are multiplied from the right or from the left of the coefficient data. To multiply from the opposite direction with the same address generation, the data arrangement in the input data memory and the data arrangement in the internal coefficient memory must both be changed in advance to transposed form. Second, because of the internal hardware configuration, the scan order in which the data are output is restricted. Specifically, as shown in FIG. 12(a), when input data X are multiplied from the left of coefficient data C, the result must be output in horizontal scan; when input data X are multiplied from the right of coefficient data C, as in FIG. 12(b), the matrix operation output is limited to vertical scan, as illustrated. If the scan order of the downstream data processing differs from this, an external buffer for scan conversion must be provided, which increases the hardware scale.
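The two restrictions can be made concrete with a small Python sketch. It models the fixed prior-art circuit as a function that always computes (data memory) × (coefficient memory) and emits the result row by row; the helper names are hypothetical and the model ignores timing. Reversing the operation direction then requires transposed contents in both memories, and the result arrives in vertical-scan order, consistent with the identity (C·X)ᵀ = Xᵀ·Cᵀ.

```python
def matmul(a, b):
    """Reference product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def fixed_circuit(data_mem, coef_mem):
    """Assumed model of the fixed behaviour: data * coefficient, row by row."""
    product = matmul(data_mem, coef_mem)
    return [v for row in product for v in row]

X = [[1, 2], [3, 4]]
C = [[5, 6], [7, 8]]

# Direct use: X * C arrives in horizontal scan.
xc_horizontal = fixed_circuit(X, C)

# To obtain C * X, both memories must first be reloaded with transposed
# contents; the serial stream is then C * X in vertical-scan order.
cx_stream = fixed_circuit(transpose(X), transpose(C))
cx = matmul(C, X)
cx_vertical = [cx[i][j] for j in range(2) for i in range(2)]
```

Here `cx_stream` and `cx_vertical` are both `[23, 31, 34, 46]`, so a downstream stage that expects C·X in horizontal scan would need the external scan-conversion buffer mentioned above.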

[0010] Accordingly, an object of the present invention is to make it possible, in a matrix operation circuit that performs a matrix operation on input matrix data stored in a data memory and coefficient matrix data stored in a coefficient memory and processed at twice the speed of the input matrix data, and that serially outputs the resulting output matrix data, to execute the matrix operation in any operation direction and any output scan direction without increasing the hardware scale.

[0011]

[Means for Solving the Problems] FIG. 1 shows in principle a matrix operation circuit according to the present invention for solving the above problems. The circuit is characterized by an address generating circuit 3 that, in accordance with an operation direction instruction indicating whether coefficient matrix data C are multiplied from the right or from the left of input matrix data X and an output scan direction instruction indicating whether output is by horizontal scan or by vertical scan, selects whether the processing speed of the coefficient matrix data C relative to the input matrix data X is doubled or halved, selects whether the address order runs in the horizontal or the vertical direction of the matrix, and generates the respective read addresses for the data memory 1 and the coefficient memory 2.

[0012]

[Operation] In the present invention shown in FIG. 1, when the address generating circuit 3 is given the operation direction, indicating whether coefficient matrix data C are multiplied from the right or from the left of input matrix data X, and the output scan direction, indicating whether output is by horizontal scan or by vertical scan, it first selects (determines) whether the processing speed of the coefficient matrix data C relative to the input matrix data X is to be doubled or halved.

[0013] The reason is as follows. As described with reference to FIG. 10, in a case such as data X × coefficient C with horizontal output scan, the processing speed of coefficient matrix data C must be set to twice that of input matrix data X so that the matrix operation can be performed with one multiplier. Consequently, for the address generating circuit 3 to produce the four patterns shown in FIGS. 2(a) and 2(b) by changing the operation direction and the output scan direction as described above, the processing speed of coefficient matrix data C relative to input matrix data X must be made either double or half.

[0014] In addition to setting these processing speeds, the address generating circuit 3 generates the respective read addresses for data memory 1 and coefficient memory 2. To do so, it selects, in accordance with the output scan direction instruction, whether the address order runs in the horizontal or the vertical direction of the matrix, as shown in FIG. 2(b), and outputs the addresses accordingly.

[0015] In this way, a matrix operation can be executed with a small hardware scale, without preparing different data and coefficient arrangements for each operation direction and output scan direction.

[0016]

[Embodiments] FIG. 3 shows an embodiment in which the address generating circuit 3 shown in FIG. 1 is incorporated into a matrix operation circuit 10 having one multiplier, like that shown in FIG. 10. A specific embodiment of this address generating circuit 3 is shown in FIG. 4.

[0017] In FIG. 4, 31 is a counter that serves as an address generator and produces a 2-bit output (0, 1, 2, 3) for specifying each element of the 2 × 2 input matrix data X and coefficient matrix data C shown in FIG. 2. 32 is a counter that likewise serves as an address generator and produces a 1-bit output (0, 1). The LSB of the 2-bit output of counter 31 is attached on the MSB side of the 1-bit output of counter 32 to form a 2-bit output. Counters 31 and 32 are made 2 bits and 1 bit wide in this way to reduce the number of bits in counter 32; counter 32 may instead be a 2-bit counter for specifying each element of the 2 × 2 input matrix data X and coefficient matrix data C.
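The two counter outputs can be sketched in Python as follows. This is an illustrative model only; the function name and the notion of a "half-slot" (one period of the faster counter) are assumptions, and the 2:1 count rate is taken from the operating description of the embodiment, where counter 32 counts twice as fast as counter 31.

```python
def counter_outputs(half_slots=8):
    """Return (upper, lower): the two 2-bit values formed from counters 31 and 32.

    A half-slot is one period of the faster counter 32; counter 31
    advances once every two half-slots.
    """
    upper, lower = [], []
    for t in range(half_slots):
        cnt31 = (t // 2) % 4           # 2-bit counter 31: 0, 1, 2, 3 at the slow rate
        cnt32 = t % 2                  # 1-bit counter 32: 0, 1 at twice that rate
        upper.append(cnt31)
        # Composite value: MSB = LSB of counter 31, LSB = counter 32
        lower.append(((cnt31 & 1) << 1) | cnt32)
    return upper, lower
```

Over eight half-slots, `upper` is `[0, 0, 1, 1, 2, 2, 3, 3]` and `lower` is `[0, 1, 2, 3, 0, 1, 2, 3]`: both streams step through 0, 1, 2, 3, but the composite one does so at twice the speed while needing only one extra counter bit.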

[0018] Further, 33 is a switch that receives the two 2-bit outputs from the counter 31 side and the counter 32 side and selects whether to pass them straight (hereinafter "through") or to exchange them (hereinafter "cross"). 34 and 35 are switches that each receive one 2-bit output of switch 33, select whether that 2-bit signal is passed through or crossed, and supply it to data memory 1 and coefficient memory 2 shown in FIG. 3, respectively. 36 is a control signal generator that generates the control signals for switches 33 to 35 in accordance with the instructions on the operation direction and the output scan direction.
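The through/cross behaviour of the three switches can be sketched in Python. The function names are hypothetical; the routing (the counter 31 side reaches switch 34 and data memory 1 when switch 33 is through) and the meaning of "cross" for switches 34 and 35 (exchanging the MSB and LSB of the 2-bit address) follow the description of the embodiment.

```python
def switch33(upper, lower, cross):
    """Switch 33: returns (input of switch 34, input of switch 35).

    upper is the value from the counter 31 side, lower the value from the
    counter 32 side. Through keeps them in place; cross exchanges them.
    """
    return (lower, upper) if cross else (upper, lower)

def bit_switch(addr, cross):
    """Switches 34/35: optionally exchange the MSB and LSB of a 2-bit address."""
    if not cross:
        return addr
    return ((addr & 1) << 1) | ((addr >> 1) & 1)
```

Applying `bit_switch(a, True)` to the addresses 0, 1, 2, 3 gives 0, 2, 1, 3. For a 2 × 2 matrix stored row by row, that turns a horizontal read order into a vertical one.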

[0019] FIGS. 5 to 8 show operation time charts of the matrix operation circuit 10 when this address generating circuit 3 is used. They are described in order below.

[0020] (1) Data × coefficient with vertical output scan (see FIG. 5): In this case the coefficient is located on the right side of the data, so this corresponds to the example of FIG. 2(a). First, because the processing speed of coefficient data C and the processing speed of input data X are in a double or half relationship, as also described with reference to FIG. 10, the count speed of counter 32 is set to twice the count speed of counter 31, as illustrated. This relationship also holds in the examples of FIGS. 6 to 8.

[0021] As described above, the 2-bit output of counter 31 is applied unchanged to the upper input terminal of switch 33 in the order of count values 0, 1, 2, 3. On the counter 32 side, the 1 bit from counter 32 serves as the LSB and the LSB of counter 31 serves as the MSB; the resulting 2 bits are applied to the lower input terminal of switch 33, likewise in the order of count values 0, 1, 2, 3, but at twice the speed.

[0022] Since the data × coefficient operation direction with vertical output scan has been specified as described above, the control signal supplied from control signal generator 36 to switch 33 is as shown in Table 1 below.

[0023]

[Table 1]

[0024] That is, switch 33 is controlled by control signal generator 36 so as to cross. As shown in FIG. 5, the 2-bit output from the counter 31 side is therefore routed to switch 35 and the 2-bit output from the counter 32 side to switch 34, exchanged with each other, each in the order of count values 0, 1, 2, 3.

[0025] Switches 34 and 35 are also controlled by control signal generator 36 in accordance with Table 1. In this example, the 2-bit signal passes through both switches 34 and 35 with its MSB and LSB exchanged (crossed). As shown in FIG. 5, the data address therefore takes the values 0, 2, 1, 3 in that order, and the coefficient address takes the same values 0, 2, 1, 3 at half the speed.

[0026] By supplying this data address from the address generating circuit 3 to memory 1 (see FIG. 3), input data X are read out serially in the order x₁₁ → x₂₁ → x₁₂ → x₂₂ (vertical scan), as illustrated. At the same time, by supplying the coefficient address to memory 2 (see FIG. 3), coefficient data C are read out in the order c₁₁ → c₂₁ → c₁₂ → c₂₂ (vertical scan) at half the speed of input data X, as illustrated.

[0027] The subsequent processing is the same as that described for FIG. 10. Finally, as shown in FIG. 5, selector 18 serially outputs, in order, y₁₁ (= x₁₁c₁₁ + x₁₂c₂₁), y₂₁ (= x₂₁c₁₁ + x₂₂c₂₁), y₁₂ (= x₁₁c₁₂ + x₁₂c₂₂), and y₂₂ (= x₂₁c₁₂ + x₂₂c₂₂).

[0028] (2) Coefficient × data with horizontal output scan (see FIG. 6): In this case the coefficient comes on the left side of the data, so this corresponds to the example of FIG. 2(b). According to Table 1, switch 33 is set to cross and switches 34 and 35 are set to through. As illustrated, both the data address and the coefficient address follow the order 0, 1, 2, 3, and the data address is output at twice the speed of the coefficient address.

[0029] By supplying this data address from the address generating circuit 3 to memory 1, input data X are read out serially in the order x₁₁ → x₁₂ → x₂₁ → x₂₂ (horizontal scan), as illustrated. At the same time, by supplying the coefficient address to memory 2, coefficient data C are read out in the order c₁₁ → c₁₂ → c₂₁ → c₂₂ (horizontal scan) at half the speed of input data X, as illustrated.

[0030] The subsequent processing is the same as that described for FIG. 10. Finally, as shown in FIG. 6, selector 18 serially outputs, in order, z₁₁ (= x₁₁c₁₁ + x₂₁c₁₂), z₁₂ (= x₁₂c₁₁ + x₂₂c₁₂), z₂₁ (= x₁₁c₂₁ + x₂₁c₂₂), and z₂₂ (= x₁₂c₂₁ + x₂₂c₂₂).

[0031] (3) Coefficient × data with vertical output scan (see FIG. 7): In this case the coefficient again comes on the left side of the data, so this also corresponds to the example of FIG. 2(b). According to Table 1, switch 33 is set to through and switches 34 and 35 are set to cross. As illustrated, both the data address and the coefficient address follow the order 0, 2, 1, 3, and the data address is output at half the speed of the coefficient address.

[0032] By supplying this data address from the address generating circuit 3 to memory 1, input data X are read out serially in the order x₁₁ → x₂₁ → x₁₂ → x₂₂ (vertical scan), as illustrated. At the same time, by supplying the coefficient address to memory 2, coefficient data C are read out in the order c₁₁ → c₂₁ → c₁₂ → c₂₂ (vertical scan) at twice the speed of input data X, as illustrated.

[0033] The subsequent processing is the same as that described for FIG. 10. Finally, as shown in FIG. 7, selector 18 serially outputs, in order, z₁₁ (= x₁₁c₁₁ + x₂₁c₁₂), z₂₁ (= x₁₁c₂₁ + x₂₁c₂₂), z₁₂ (= x₁₂c₁₁ + x₂₂c₁₂), and z₂₂ (= x₁₂c₂₁ + x₂₂c₂₂).

[0034] (4) Data × coefficient with horizontal output scan (see FIG. 8): This case is the same as the conventional example of FIG. 10. The coefficient comes on the right side of the data, so this corresponds to the example of FIG. 2(a). According to Table 1, switch 33 is set to through, and switches 34 and 35 are also set to through. As illustrated, both the data address and the coefficient address follow the order 0, 1, 2, 3, and the data address is output at half the speed of the coefficient address.

[0035] By supplying this data address from the address generating circuit 3 to memory 1, input data X are read out serially in the order x₁₁ → x₁₂ → x₂₁ → x₂₂ (horizontal scan), as illustrated. At the same time, by supplying the coefficient address to memory 2, coefficient data C are read out in the order c₁₁ → c₁₂ → c₂₁ → c₂₂ (horizontal scan) at twice the speed of input data X, as illustrated.

[0036] Finally, as shown in FIG. 8, selector 18 serially outputs, in order, y₁₁ (= x₁₁c₁₁ + x₁₂c₂₁), y₁₂ (= x₁₁c₁₂ + x₁₂c₂₂), y₂₁ (= x₂₁c₁₁ + x₂₂c₂₁), and y₂₂ (= x₂₁c₁₂ + x₂₂c₂₂).
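The four cases (1) to (4) can be checked with a compact behavioural model in Python. It is a sketch under stated assumptions rather than the circuit itself: the dictionary `MODES` restates the switch settings given in the text for each case, the mode labels and function names are hypothetical, matrices are flat row-major lists, and the pipeline delays of registers 14 to 17 are ignored so that only the order of the outputs is modelled.

```python
# Switch settings per case as stated in the description:
# (switch 33 cross?, switches 34/35 cross?)
MODES = {
    ("data*coef", "vertical"):   (True,  True),    # case (1), FIG. 5
    ("coef*data", "horizontal"): (True,  False),   # case (2), FIG. 6
    ("coef*data", "vertical"):   (False, True),    # case (3), FIG. 7
    ("data*coef", "horizontal"): (False, False),   # case (4), FIG. 8
}

def swap_bits(a):
    """Exchange MSB and LSB of a 2-bit address (cross setting of switches 34/35)."""
    return ((a & 1) << 1) | (a >> 1)

def run(x, c, direction, scan):
    """x, c: flat row-major 2x2 lists. Returns the four serial outputs."""
    cross33, cross_bits = MODES[(direction, scan)]
    out, acc = [], [0, 0]                  # acc models the two running sums
    for t in range(8):                     # eight half-slots = eight multiplications
        cnt31, cnt32 = (t // 2) % 4, t % 2
        slow, fast = cnt31, ((cnt31 & 1) << 1) | cnt32
        data_addr, coef_addr = (fast, slow) if cross33 else (slow, fast)
        if cross_bits:
            data_addr, coef_addr = swap_bits(data_addr), swap_bits(coef_addr)
        acc[cnt32] += x[data_addr] * c[coef_addr]    # the single multiplier 11
        if t % 4 == 3:                     # each sum now holds two products
            out += acc                     # emitted via registers 16/17 and selector 18
            acc = [0, 0]
    return out
```

With X = [[1, 2], [3, 4]] and C = [[5, 6], [7, 8]], `run([1, 2, 3, 4], [5, 6, 7, 8], "data*coef", "vertical")` returns `[19, 43, 22, 50]`, that is, y₁₁, y₂₁, y₁₂, y₂₂, and `run([1, 2, 3, 4], [5, 6, 7, 8], "coef*data", "horizontal")` returns `[23, 34, 31, 46]`, that is, z₁₁, z₁₂, z₂₁, z₂₂. The memory contents are the same in every mode; only the address sequences change.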

[0037] Although a matrix operation circuit for 2 × 2 matrices has been taken as an example here for the sake of explanation, the present invention can be applied to matrix operations of any size on the same principle.
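One possible way to carry the same idea over to n × n matrices is sketched below in Python. The specification does not describe the n × n hardware, so everything here is an assumption made for illustration: the slow address is treated as a (row, column) pair, the fast address reuses the slow column index as its row index and adds a counter running n times as fast, the cross setting exchanges row and column, and n accumulators are used in place of the two of the 2 × 2 case.

```python
def run_nxn(x, c, n, direction, scan):
    """x, c: flat row-major n*n lists. Returns the n*n serial outputs."""
    # Same selections as the 2x2 switch settings, expressed on (row, column) pairs.
    swap_streams = (direction == "data*coef") == (scan == "vertical")  # role of switch 33
    swap_row_col = scan == "vertical"                                  # role of switches 34/35
    out, acc = [], [0] * n                  # n accumulators (assumed)
    for t in range(n ** 3):                 # n^3 multiplications in total
        fast = t % n                        # counts 0..n-1, n times per slow step
        slow_hi, slow_lo = divmod((t // n) % (n * n), n)
        slow_idx = (slow_hi, slow_lo)       # slow address as (row, column)
        fast_idx = (slow_lo, fast)          # fast address reuses the slow low digit
        data_idx, coef_idx = (fast_idx, slow_idx) if swap_streams else (slow_idx, fast_idx)
        if swap_row_col:
            data_idx, coef_idx = data_idx[::-1], coef_idx[::-1]
        acc[fast] += x[data_idx[0] * n + data_idx[1]] * c[coef_idx[0] * n + coef_idx[1]]
        if t % (n * n) == n * n - 1:        # n products summed into each accumulator
            out += acc
            acc = [0] * n
    return out
```

For n = 2 this reduces to the address sequences of the embodiment; for larger n it reads one matrix n times as fast as the other, which matches the "n times" wording of the abstract.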

[0038]

[Effects of the Invention] As described above, the matrix operation circuit according to the present invention is provided with an address generating circuit that, in accordance with an operation direction instruction indicating whether the coefficient matrix data are multiplied from the right or from the left of the input matrix data and an output scan direction instruction indicating whether output is by horizontal scan or by vertical scan, selects whether the processing speed of the coefficient matrix data relative to the input matrix data is doubled or halved, selects whether the address order runs in the horizontal or the vertical direction of the matrix, and generates the data memory address and the coefficient memory address. A matrix operation circuit with a flexible input/output interface can therefore be realized by adding only a simple circuit.

[Brief description of drawings]

FIG. 1 is a block diagram showing in principle a matrix operation circuit according to the present invention.

FIG. 2 is a diagram showing the output order of the matrix operation circuit according to the present invention.

FIG. 3 is a block diagram showing an embodiment of the matrix operation circuit according to the present invention.

FIG. 4 is a block diagram showing an embodiment of the address generating circuit used in the matrix operation circuit according to the present invention.

FIG. 5 is a time chart showing operation (No. 1) of the embodiment of the matrix operation circuit according to the present invention.

FIG. 6 is a time chart showing operation (No. 2) of the embodiment of the matrix operation circuit according to the present invention.

FIG. 7 is a time chart showing operation (No. 3) of the embodiment of the matrix operation circuit according to the present invention.

FIG. 8 is a time chart showing operation (No. 4) of the embodiment of the matrix operation circuit according to the present invention.

FIG. 9 is a diagram for explaining a matrix operation in general.

FIG. 10 is a block diagram showing a conventional example.

FIG. 11 is a time chart showing the operation of the conventional example.

FIG. 12 is a diagram showing the output order of the conventional example.

[Explanation of symbols]

1: data memory
2: coefficient memory
3: address generating circuit
In the figures, the same reference numerals denote the same or corresponding parts.

─────────────────────────────────────────────────────

[Procedure Amendment]

[Submission date] September 4, 1992

[Procedure Amendment 1]

[Document to be amended] Specification

[Item to be amended] Claims

[Method of amendment] Change

[Content of amendment]

[Claims]

[Procedure Amendment 2]

[Document to be amended] Specification

[Item to be amended] 0008

[Method of amendment] Change

[Content of amendment]

[0008] In this way, by reading coefficient data C from the memory at twice the reading speed of input data X, the matrix operation is realized with a single multiplier.

[Procedure Amendment 3]

[Document to be amended] Specification

[Item to be amended] 0010

[Method of amendment] Change

[Content of amendment]

[0010] Accordingly, an object of the present invention is to make it possible, in a matrix operation circuit that reads either the input matrix data stored in a data memory or the coefficient matrix data stored in a coefficient memory at n times the speed of the serial output data, performs the matrix operation using only one multiplier, and serially outputs the resulting output matrix data, to execute the matrix operation in any operation direction and any output scan direction without increasing the hardware scale.

[Procedure Amendment 4]

[Document to be amended] Specification

[Item to be amended] 0011

[Method of amendment] Change

[Content of amendment]

[0011]

[Means for Solving the Problems] FIG. 1 shows in principle a matrix operation circuit according to the present invention for solving the above problems. The circuit is characterized by an address generating circuit 3 that, in accordance with an operation direction instruction indicating whether coefficient matrix data C are multiplied from the right or from the left of input matrix data X and an output scan direction instruction indicating whether output is by horizontal scan or by vertical scan, selects whether the reading speed of the coefficient matrix data C is made n times or equal to the serial output speed of the output matrix data, likewise selects whether the reading speed of the input matrix data (X) is made n times or equal, selects whether the address output order runs in the horizontal or the vertical direction of the matrix, and generates the respective read addresses for the data memory 1 and the coefficient memory 2.

[Procedure Amendment 5]

[Document to be amended] Specification

[Item to be amended] 0012

[Method of amendment] Change

[Content of amendment]

[0012]

[Operation] In the present invention shown in FIG. 1, when the address generating circuit 3 is given the operation direction, indicating whether coefficient matrix data C are multiplied from the right or from the left of input matrix data X, and the output scan direction, indicating whether output is by horizontal scan or by vertical scan, it first selects (determines) which of the input matrix data X and the coefficient matrix data C is read at n times the speed and which at the same speed.

[Procedure Amendment 6]

[Document to be amended] Specification

[Item to be amended] 0013

[Method of amendment] Change

[Content of amendment]

[0013] The reason is as follows. As explained with reference to FIG. 10, in a case such as data X × coefficient C with horizontal output scan, the read rate of the coefficient matrix data C must be set to n times the read rate of the input matrix data X in order to execute the matrix operation with a single multiplier. Consequently, for the address generation circuit 3 to produce the four patterns shown in FIGS. 2(a) and 2(b) by changing the operation direction and the output scan direction as described above, the read rate of either the coefficient matrix data C or the input matrix data X must be made n times.
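The read-rate relationship described above can be illustrated with a small software model. The sketch below is not the circuit of the invention: it assumes that the vector reused across one output row or column is loaded once into a holding register (an assumption made only for this illustration), and the names `serial_matmul`, `coeff_on_right`, and `horizontal_scan` are hypothetical. It counts how often each memory is read while a single multiply-accumulate path produces the outputs serially.

```python
def serial_matmul(X, C, coeff_on_right=True, horizontal_scan=True):
    """Model of an n x n matrix product computed with one multiplier.

    coeff_on_right=True computes X * C, otherwise C * X.
    horizontal_scan=True emits outputs row by row, otherwise column by column.
    Returns (serial output list, read counts per memory).
    Assumption: the reused vector is held in a register after one load.
    """
    n = len(X)
    left_name, left = ("X", X) if coeff_on_right else ("C", C)
    right_name, right = ("C", C) if coeff_on_right else ("X", X)
    reads = {"X": 0, "C": 0}
    outputs = []
    for outer in range(n):
        # Load the vector that stays fixed for the next n outputs.
        if horizontal_scan:
            held = [left[outer][k] for k in range(n)]   # row of left matrix
            reads[left_name] += n
        else:
            held = [right[k][outer] for k in range(n)]  # column of right matrix
            reads[right_name] += n
        for inner in range(n):
            acc = 0
            for k in range(n):
                if horizontal_scan:
                    other = right[k][inner]
                    reads[right_name] += 1
                else:
                    other = left[inner][k]
                    reads[left_name] += 1
                acc += held[k] * other  # the single multiplier
            outputs.append(acc)
    return outputs, reads


X = [[1, 2], [3, 4]]
C = [[5, 6], [7, 8]]
print(serial_matmul(X, C, coeff_on_right=True, horizontal_scan=True))
print(serial_matmul(X, C, coeff_on_right=True, horizontal_scan=False))
```

For n = 2, the first call reads C eight times and X four times for four outputs, that is, C at n times the output rate and X at the output rate. Switching to vertical scan reverses the roles, which is why the faster memory has to be selectable.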

[Procedure Amendment 7]

[Document to be amended] Specification

[Item to be amended] 0020

[Method of amendment] Change

[Content of amendment]

[0020] (1) Data × coefficient with vertical output scan (see FIG. 5): In this case the coefficient is located on the left side of the data, which corresponds to the example of FIG. 2(a). First, as also explained with reference to FIG. 10, the processing speeds of the coefficient data C and the input data X are in a relationship of twice or equal to the serial output rate, so the count rate of the counter 32 is set to twice the count rate of the counter 31, as illustrated. This relationship also applies in the examples of FIGS. 6 to 8.

[Procedure Amendment 8]

[Document to be amended] Specification

[Item to be amended] 0025

[Method of amendment] Change

[Content of amendment]

[0025] These switches 34 and 35 are also controlled by the control signal generator 36 in accordance with Table 1 above. In this example, therefore, both switches 34 and 35 pass the 2-bit signal with its MSB side and LSB side interchanged (crossed). As shown in FIG. 5, the data address takes the values "0", "2", "1", "3" of the 2-bit binary number in that order, and the coefficient address, at the same rate as the serial output, likewise takes the values "0", "2", "1", "3" in that order.
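The crossed connection can be checked with a few lines of code. The sketch below is only an illustration of interchanging the MSB and LSB of a 2-bit count; the function name `swap_msb_lsb` is hypothetical, and the code does not model the switches 34 and 35 or Table 1 themselves.

```python
def swap_msb_lsb(addr):
    """Interchange the two bits of a 2-bit address (crossed connection)."""
    msb = (addr >> 1) & 1
    lsb = addr & 1
    return (lsb << 1) | msb


# A 2-bit counter counts 0, 1, 2, 3; after the bit swap the address
# order becomes 0, 2, 1, 3.
print([swap_msb_lsb(count) for count in range(4)])
```

For a 2 × 2 matrix stored row by row, the swapped sequence visits the elements column by column, so the swap corresponds to switching the address order between the horizontal and vertical directions of the matrix.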

[Procedure Amendment 9]

[Document to be amended] Specification

[Item to be amended] 0038

[Method of amendment] Change

[Content of amendment]

[0038]

[Effects of the Invention] As described above, the matrix operation circuit according to the present invention is provided with an address generation circuit that generates the data memory address and the coefficient memory address. In accordance with an operation direction instruction indicating whether n × n coefficient matrix data is multiplied from the right or from the left of n × n input matrix data, and an output scan direction instruction indicating whether the output is produced by horizontal scan or vertical scan, the address generation circuit selects whether the read rate of the input matrix data or the coefficient matrix data is n times the serial output rate or equal to it, and selects whether the address order follows the horizontal or the vertical direction of the matrix. A matrix operation circuit with a flexible input/output interface can therefore be realized by adding only a simple circuit.

Claims

[Claims]

1. A matrix operation circuit that executes a matrix operation on input matrix data (X) stored in a data memory (1) and coefficient matrix data (C) stored in a coefficient memory (2), and serially outputs the output matrix data of the operation result, characterized in that an address generation circuit (3) is provided which, in accordance with an operation direction instruction indicating whether the coefficient matrix data (C) is multiplied from the right or from the left of the input matrix data (X) and an output scan direction instruction indicating whether the output is produced by horizontal scan or vertical scan, generates the respective read addresses for the data memory (1) and the coefficient memory (2) by doubling or halving the processing speed of the coefficient matrix data (C) relative to the input matrix data (X) and by selecting whether the order of the addresses follows the horizontal direction or the vertical direction of the matrix.