JPS60201472A - Matrix product computing device - Google Patents

Matrix product computing device

Info

Publication number
JPS60201472A
Authority
JP
Japan
Prior art keywords
operand
product
matrix
column
row
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP5790184A
Other languages
Japanese (ja)
Inventor
Akira Sawada
明 澤田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Nippon Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp, Nippon Electric Co Ltd
Priority to JP5790184A
Publication of JPS60201472A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization

Abstract

PURPOSE: To obtain a matrix product computing device that computes a matrix product in a short time by arranging, in a matrix, operators that each obtain the product of two operands and accumulate it, and by connecting the operand registers of the operators as shift registers. CONSTITUTION: Product adders are arranged in a matrix; the operand registers RA of each row are connected as a shift register, and the operand registers RB of each column are connected as a shift register. When an operand arrives from a data bus DB, a clock signal (CA or CB) is given to the corresponding row or column, the content of the data bus is stored in the operand register of the first column or first row, the operand stored there so far moves to the operand register of the next stage, and subsequent operands move similarly. A flip-flop FFA or FFB in the product adder is set at the same time. When a clock Cop is output after operands have been fed to the required rows and columns, the arithmetic is performed only in the product adders whose flip-flops FFA and FFB are both set.

Description

DETAILED DESCRIPTION OF THE INVENTION

(1) Field of the invention
The present invention relates to a device that assists a central processing unit by computing matrix products at high speed.

(2) Description of the prior art
To obtain the product of an (m, n) matrix and an (n, t) matrix, a conventional device performed the calculation $\sum_{j=1}^{n} a_{ij} b_{jk}$ m·t times. With this scheme, computing power can be raised by using many multiply-accumulate units, each of which computes $a_{ij} b_{jk}$ and accumulates the total. However, once the number of multiply-accumulate units becomes large, the data bus can no longer supply operands fast enough, and the calculation speed is in effect limited by the capacity of the data bus.

In this case the time needed to supply all the operands becomes the calculation time, so the following relationship holds:

$$T_0 = \frac{2\,n\,m\,t}{M} \qquad (1)$$

where $T_0$ is the calculation time in seconds and $M$ is the number of operands the data bus can supply per unit time.

For example, when obtaining the product of two (100, 100) matrices with M = 100M words/second, $T_0$ is 20 ms, and the calculation cannot be made any faster however many multiply-accumulate units are provided. Multiplexing the bus raises the operand supply capacity, but it causes main memory contention and complicates control between the buses, so the degree of multiplexing cannot be raised very far.
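As a quick check of equation (1), the following Python sketch evaluates the bus-bound time for the (100, 100) example. The function name `bus_bound_time` and its parameter names are illustrative and do not appear in the patent.

```python
def bus_bound_time(m, n, t, words_per_second):
    """Bus-limited calculation time T0 = 2*n*m*t / M from equation (1), in seconds."""
    # Each of the m*t result elements needs n products, and every product
    # consumes two operands that must cross the data bus.
    return 2 * n * m * t / words_per_second


t0 = bus_bound_time(100, 100, 100, 100e6)  # M = 100M words/second
print(f"{t0 * 1e3:.1f} ms")  # prints 20.0 ms
```

The result depends only on the matrix dimensions and the bus rate, which is why adding multiply-accumulate units does not help in the conventional arrangement.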

(3) Object of the invention
An object of the present invention is to obtain a matrix product computing device that can compute a matrix product in a short calculation time, is easy to control, and requires only a small number of multiply-accumulate units.

(4) Constitution of the invention
According to the present invention, there is obtained a matrix product computing device in which arithmetic units, each of which obtains the product of two operands and accumulates the total, are arranged in a grid, and the corresponding operand registers of the arithmetic units are connected to one another as shift registers.

(5) Embodiments of the invention
Next, the present invention is described in more detail with reference to the drawings.

FIG. 1 shows an example of the internal configuration of the multiply-accumulate unit that is a component of the matrix product computing device of the present invention. The unit consists of an operand register RA, which on clock CA stores the contents of internal data bus DAI carrying information from the preceding stage or a data source; an operand register RB, which likewise on clock CB stores the contents of internal data bus DBI carrying information from the preceding stage or another data source; and an arithmetic section OP, which is activated by clocks CA, CB, and Cop and performs multiplication and accumulation.

Within the arithmetic section OP, clock CA drives flip-flop FFA and clock CB drives flip-flop FFB. The outputs of FFA and FFB are ANDed with clock Cop, and the control section generates a control signal from the result. In response to that control signal, the contents of operand register RA are obtained via internal data bus DAO and the contents of operand register RB via internal data bus DBO. A multiplier multiplies the two, an adder adds the product to the output of the result register, which holds the result of the previous operation, and the sum is stored anew in the result register.
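The behavior of a single unit can be sketched in Python as follows. This is a simplified software model, not the circuit itself: the class name `MacCell`, the method names, the zero initial register contents, and the choice to clear both flip-flops at the end of every Cop pulse are all assumptions made for illustration.

```python
class MacCell:
    """Hypothetical software model of one multiply-accumulate unit (FIG. 1)."""

    def __init__(self):
        self.ra = 0        # operand register RA
        self.rb = 0        # operand register RB
        self.ffa = False   # flip-flop FFA, set by clock CA
        self.ffb = False   # flip-flop FFB, set by clock CB
        self.result = 0    # result register (assumed cleared at start)

    def clock_ca(self, dai):
        """Clock CA: latch bus DAI into RA and return the old RA (bus DAO)."""
        old, self.ra, self.ffa = self.ra, dai, True
        return old

    def clock_cb(self, dbi):
        """Clock CB: latch bus DBI into RB and return the old RB (bus DBO)."""
        old, self.rb, self.ffb = self.rb, dbi, True
        return old

    def clock_cop(self):
        """Clock Cop: accumulate RA*RB only when FFA and FFB are both set."""
        if self.ffa and self.ffb:
            self.result += self.ra * self.rb
        # Assumption: both flip-flops are cleared at the end of every cycle.
        self.ffa = self.ffb = False


cell = MacCell()
cell.clock_ca(3)
cell.clock_cop()      # FFB is not set, so nothing is accumulated
cell.clock_ca(3)
cell.clock_cb(4)
cell.clock_cop()      # both flip-flops set: result becomes 3 * 4
print(cell.result)    # prints 12
```

Returning the old register value from `clock_ca` and `clock_cb` is one way to model the outgoing buses DAO and DBO, so that a neighboring unit can latch it on the same clock.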

FIG. 2 shows an embodiment of the present invention in which the multiply-accumulate units of FIG. 1 are arranged in a grid. The operand registers RA of each row are wired together as a shift register, and the operand registers RB of each column are likewise wired together as a shift register. Accordingly, clocks CA and CB are supplied separately for each row and each column, while clock Cop is supplied in common to all multiply-accumulate units.

The device operates as follows. When an operand arrives from data bus DB, a clock signal (CA or CB) is issued to the corresponding row or column. The contents of the data bus are stored in the operand register of the first column or first row, the operand previously stored there moves to the operand register of the next stage, and the operands further along shift in the same way. At the same time, flip-flop FFA or FFB in the multiply-accumulate unit is set. When clock Cop is issued after operands have been sent to the required rows and columns, only those multiply-accumulate units in which flip-flops FFA and FFB are both set perform the operation. After the operation, flip-flops FFA and FFB are reset. This sequence of operations constitutes one cycle, and the matrix product is obtained by executing the required number of cycles.

Operands are sent in the following order. Operand register RA11 is sent the n elements of the first row of matrix A followed by t - 1 zeros, in order. Operand register RA21 is sent the n elements of the second row of A followed by t - 1 zeros, in order, starting from the second cycle. Each subsequent row is likewise sent its n elements and t - 1 zeros, delayed by one further cycle. Similarly, the n elements of each column of B followed by m - 1 zeros are sent, with each column offset by one cycle from the previous one. When everything has been sent in this way, the product has been obtained. The number of cycles required to obtain the product is therefore n + (m - 1) + (t - 1).
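The operand schedule described above can be checked with a small cycle-by-cycle simulation. The Python sketch below is a software model under stated assumptions, not the patented circuit: the function name `simulate_array` is hypothetical, all registers are assumed to start at zero, and the flip-flops are modeled as per-cycle "row clocked" and "column clocked" flags that are cleared after every cycle.

```python
def simulate_array(A, B):
    """Simulate the grid of multiply-accumulate units for A (m x n) times B (n x t)."""
    m, n, t = len(A), len(A[0]), len(B[0])
    # Row i of A is followed by t-1 zeros; column k of B by m-1 zeros.
    row_stream = [list(A[i]) + [0] * (t - 1) for i in range(m)]
    col_stream = [[B[j][k] for j in range(n)] + [0] * (m - 1) for k in range(t)]

    RA = [[0] * t for _ in range(m)]    # operand registers RA (assumed cleared)
    RB = [[0] * t for _ in range(m)]    # operand registers RB (assumed cleared)
    acc = [[0] * t for _ in range(m)]   # result registers
    cycles = n + (m - 1) + (t - 1)

    for c in range(cycles):
        row_clocked = [False] * m       # models FFA for every unit in a row
        col_clocked = [False] * t       # models FFB for every unit in a column
        for i in range(m):              # row i starts one cycle after row i-1
            s = c - i
            if 0 <= s < len(row_stream[i]):
                RA[i] = [row_stream[i][s]] + RA[i][:-1]   # shift along the row
                row_clocked[i] = True
        for k in range(t):              # column k starts one cycle after column k-1
            s = c - k
            if 0 <= s < len(col_stream[k]):
                for i in range(m - 1, 0, -1):             # shift down the column
                    RB[i][k] = RB[i - 1][k]
                RB[0][k] = col_stream[k][s]
                col_clocked[k] = True
        for i in range(m):              # clock Cop: gated multiply-accumulate
            for k in range(t):
                if row_clocked[i] and col_clocked[k]:
                    acc[i][k] += RA[i][k] * RB[i][k]
    return acc, cycles


A = [[1, 2], [3, 4], [5, 6], [7, 8]]   # (4, 2) matrix
B = [[1, 2, 3], [4, 5, 6]]             # (2, 3) matrix
product, cycles = simulate_array(A, B)
print(product, cycles)
```

In this model the unit in row i and column k sees operand pair j at cycle i + k + j (counting from zero), so matching elements of A and B always meet in the same unit, and the last pair is processed in cycle (m - 1) + (t - 1) + (n - 1). For the (4, 2) by (2, 3) example this gives 7 cycles.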

FIG. 3 shows a concrete example of operation: the calculation for a (4, 2) matrix A and a (2, 3) matrix B. Each square of the inner 4 × 3 grid corresponds to a multiply-accumulate unit, and inside each square the operation of that unit is shown for each elapsed cycle. The outer squares show the order in which the operands are sent.

Next, the device of this embodiment is compared with a conventional device for the case in which m·t multiply-accumulate units are used. The number of operands required per cycle is 2·m·t in the conventional device, but only m + t in the device of this embodiment. On the other hand, the number of cycles required to obtain the matrix product is n in the conventional device and n + (m - 1) + (t - 1) in this embodiment, so the total number of operands to be supplied is a small fraction of the conventional amount. For example, it is 0.28 times for two (10, 10) matrices and about 0.03 times for two (100, 100) matrices. Conversely, for the same bus capacity, the number of arithmetic units can be increased.
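The quoted figures of 0.28 and about 0.03 can be reproduced if the comparison is read as the ratio of total operands transferred over the bus, that is, operands per cycle multiplied by the number of cycles. That reading is an assumption, not a formula stated in the text, and the function name `operand_traffic_ratio` is hypothetical.

```python
def operand_traffic_ratio(m, n, t):
    """Ratio of total bus operands: grid device vs. conventional (assumed reading)."""
    conventional = 2 * m * t * n                    # 2*m*t operands per cycle, n cycles
    proposed = (m + t) * (n + (m - 1) + (t - 1))    # m+t operands per cycle
    return proposed / conventional


print(round(operand_traffic_ratio(10, 10, 10), 2))      # prints 0.28
print(round(operand_traffic_ratio(100, 100, 100), 2))   # prints 0.03
```

Under this reading the ratio shrinks roughly in proportion to 1/n for square matrices, which matches the observation that the bus becomes less of a constraint as the matrices grow.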

As described above, the device is less constrained by the bus, so the number of arithmetic units can be increased substantially and the calculation speed can be improved.

Furthermore, for the same number of arithmetic units, the bus is occupied less, so it can be used for other data transfers, which improves the processing capacity of the central processing unit.

[Brief description of the drawings]

FIG. 1 is a block diagram showing an example of a multiply-accumulate unit used in an embodiment of the present invention, FIG. 2 is a block diagram showing an embodiment of the present invention, and FIG. 3 is a diagram showing the operation when obtaining the matrix product of a (4, 2) matrix and a (2, 3) matrix.

DAI, DBI: internal data buses from the preceding stage; DAO, DBO: internal data buses to the next stage; CA, CB: clock pulses; Cop: operation timing clock pulse; DB: connection to the data bus.

Claims (1)

[Claims]
A matrix product computing device characterized in that arithmetic units, each of which obtains the product of two operands and accumulates the total, are arranged in a grid, and the two operand registers of each arithmetic unit are elements constituting shift registers for each row and each column of the grid, respectively.
JP5790184A 1984-03-26 1984-03-26 Matrix product computing device Pending JPS60201472A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP5790184A JPS60201472A (en) 1984-03-26 1984-03-26 Matrix product computing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP5790184A JPS60201472A (en) 1984-03-26 1984-03-26 Matrix product computing device

Publications (1)

Publication Number Publication Date
JPS60201472A true JPS60201472A (en) 1985-10-11

Family

ID=13068895

Family Applications (1)

Application Number Title Priority Date Filing Date
JP5790184A Pending JPS60201472A (en) 1984-03-26 1984-03-26 Matrix product computing device

Country Status (1)

Country Link
JP (1) JPS60201472A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109376113A (en) * 2016-11-03 2019-02-22 北京中科寒武纪科技有限公司 SLAM arithmetic unit and method

Similar Documents

Publication Publication Date Title
US9384168B2 (en) Vector matrix product accelerator for microprocessor integration
JP3244506B2 (en) Small multiplier
JPS6217770B2 (en)
JPS6132437Y2 (en)
US3535498A (en) Matrix of binary add-subtract arithmetic units with bypass control
US4769780A (en) High speed multiplier
US4910700A (en) Bit-sliced digit-serial multiplier
CN113032723B (en) Matrix multiplier realizing method and matrix multiplier device
EP4318275A1 (en) Matrix multiplier and method for controlling matrix multiplier
JPH036546B2 (en)
JPH06502265A (en) Calculation circuit device for matrix operations in signal processing
JPH0477932B2 (en)
CN115408061B (en) Hardware acceleration method, device, chip and storage medium for complex matrix operation
JPS60201472A (en) Matrix product computing device
JPH07107664B2 (en) Multiplication circuit
JP3227538B2 (en) Binary integer multiplier
JPH05197525A (en) Method and circuit for negating operand
JP2600591B2 (en) Multiplier
JPS6259828B2 (en)
JPH05324694A (en) Reconstitutable parallel processor
SU594502A1 (en) Conveyer-type multiplier
JPH05101031A (en) Neural network device
JPH04364525A (en) Parallel arithmetic unit
SU868767A1 (en) Device for computing polynomials
JP2696903B2 (en) Numerical calculator