JPH06215019A

JPH06215019A - Arithmetic unit and parallel arithmetic unit system

Info

Publication number: JPH06215019A
Application number: JP5023346A
Authority: JP
Inventors: Hiroyuki Fujita; 裕之藤田
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1993-01-18
Filing date: 1993-01-18
Publication date: 1994-08-05

Abstract

PURPOSE: To avoid collisions on a data bus of data output from the respective parts, and delays in processing, and to accelerate repeated product-sum operations by improving the utilization efficiency of the respective parts. CONSTITUTION: A latch 12a is arranged before a multiplier 12b. The multiplier 12b supplies the product of the output of the latch 12a and data from an input terminal 13 to a multiplexer 12c. The product passes through the multiplexer 12c and a latch 12d to an arithmetic and logic circuit 12e arranged in series with the multiplier. The arithmetic and logic circuit 12e calculates the sum of products and outputs the result from an output terminal 14. The latch 12a, multiplexer 12c, and arithmetic and logic circuit 12e are controlled by control signals from input terminals 16, 17, and 18, respectively.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to an arithmetic unit that performs product-sum operations at high speed on supplied data, and to a parallel arithmetic system formed by connecting a plurality of such arithmetic units in parallel.

[0002]

[Prior Art] In general, an arithmetic unit that performs product-sum operations is roughly composed of, for example, a memory 60, a multiplier 61, and an arithmetic logic circuit (hereinafter, ALU) 62, as shown in FIG. 4. Each of these units is connected to a data bus DB and inputs and outputs data via that bus. In particular, a digital signal processor (DSP) circuit with this configuration repeatedly performs the product-sum operation on input data supplied via the data bus DB to carry out, for example, filter processing.

[0003]

[Problems to Be Solved by the Invention] Consider using the arithmetic unit shown in FIG. 4 to evaluate the linear function y = ax + b, where a and b are coefficients and y is the output obtained when a variable x is input. The coefficient a is first supplied from the memory 60 to the multiplier 61 via the data bus DB. The memory 60 then supplies the variable x to the multiplier 61 as the multiplicand via the data bus DB. The memory 60 supplies the coefficient b to the ALU 62 via the data bus DB. The multiplier 61 supplies the product ax, the result of multiplying the coefficient a by the variable x, to the ALU 62 via the data bus DB. The ALU 62 arithmetically adds the supplied product ax and the coefficient b to generate the result ax + b, which is written to the variable y in the memory 60 via the data bus DB.
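The bus traffic in this conventional sequence can be made concrete with a small sketch. The Python below is only an illustrative model, not part of the disclosure: the function name `bus_based_linear` and the transfer counter are hypothetical, and the model simply counts one bus transfer for each of the five movements listed above (a, x, b, the product ax, and the result y).

```python
def bus_based_linear(a, b, xs):
    """Evaluate y = a*x + b for each x the way FIG. 4 is described:
    every operand and every result crosses the shared data bus DB."""
    bus_transfers = 0
    ys = []
    for x in xs:
        bus_transfers += 1          # memory 60 -> multiplier 61: coefficient a
        bus_transfers += 1          # memory 60 -> multiplier 61: variable x
        bus_transfers += 1          # memory 60 -> ALU 62: coefficient b
        product = a * x
        bus_transfers += 1          # multiplier 61 -> ALU 62: product ax
        y = product + b
        bus_transfers += 1          # ALU 62 -> memory 60: result y
        ys.append(y)
    return ys, bus_transfers
```

Under this counting, every evaluation costs five transfers on the single bus, which is the contention the following paragraphs describe.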

[0004] However, when input data is to be filtered at high speed, the arithmetic unit must perform the above operation repeatedly. Because all data input and output during the operation passes through the data bus DB, data output from the individual units may collide on the data bus DB (a so-called bus lock), or other processing such as data transfers may be delayed until the operation finishes.

[0005] When such a bus lock or the like occurs, the arithmetic unit can no longer use its units efficiently, and operations are no longer processed at high speed.

[0006] The present invention was made in view of these circumstances. Its object is to provide an arithmetic unit and a parallel arithmetic unit system that avoid collisions on the data bus of data output from the individual units, avoid processing delays, and raise the utilization efficiency of each unit so that, for example, repeated product-sum operations can be accelerated.

[0007]

[Means for Solving the Problems] An arithmetic unit according to the present invention repeatedly performs product-sum operations on supplied data and solves the above problem by comprising: data holding means for temporarily holding data supplied from outside; multiplication means that uses the output of the data holding means as a multiplication coefficient and multiplies that coefficient by a multiplicand supplied from outside; and arithmetic logic means that stores data supplied from outside in a register and performs arithmetic and logical operations, the multiplication means and the arithmetic logic means being connected in series.

[0008] Here, the arithmetic unit is provided with a multiplexer for switching the input data supplied to the arithmetic logic means. In this arithmetic unit, address calculating means calculates the address of a local memory provided outside the arithmetic unit, and the local memory reads out data according to the address supplied from the address calculating means and sends it into the processor.

[0009] The coefficient holding means and the arithmetic adder operate in response to control signals supplied from, for example, a microprogram memory.

[0010] A parallel arithmetic unit system according to the present invention is a parallel arithmetic system formed by connecting a plurality of arithmetic units in parallel, and solves the above problem by using, as each arithmetic unit, an arithmetic unit comprising: data holding means for temporarily holding data supplied from outside; multiplication means that uses the output of the data holding means as a multiplication coefficient and multiplies that coefficient by a multiplicand supplied from outside; and arithmetic logic means that stores data supplied from outside in a register and performs arithmetic and logical operations, the multiplication means and the arithmetic logic means being connected in series.

[0011]

[Operation] In the arithmetic unit according to the present invention, the coefficient holding means and the arithmetic adder are connected in series, each is operated by a control signal according to a program, and data is read out and sent into the processor, so that collisions in data input and output can be avoided.

[0012] In the parallel arithmetic unit system according to the present invention, a plurality of arithmetic units of the above configuration are connected in parallel so that collisions in data input and output are avoided.

[0013]

[Embodiment] An embodiment of the arithmetic unit according to the present invention is described below with reference to the drawings.

[0014] FIG. 1 shows a processor operated according to a microprogram, which is used to perform code conversion, data input, and the like efficiently. The processor comprises a program controller (PC) 10, a microprogram memory (MPM) 11, and an arithmetic section 12.

[0015] The program controller 10 generates addresses under the control of the microprogram memory 11 and supplies them to the microprogram memory 11. The microprogram memory 11 stores the microprogram. The microprogram consists of instructions that control the program controller 10 and the arithmetic section 12, together with numerical data NUM given to the arithmetic section 12. The microprogram memory 11 supplies the microprogram instructions that control the arithmetic section 12, and the numerical data NUM, to the arithmetic section 12.

[0016] Data is input to the arithmetic section 12 from outside via an input terminal 13. The arithmetic section 12 performs, for example, a product-sum operation on this data and outputs the result from an output terminal 14.

[0017] In this processor, the arithmetic unit of the present invention is applied to the arithmetic section 12. FIG. 2 shows the schematic configuration of the arithmetic unit. Parts common to the figures carry the same reference numerals. The arithmetic unit mainly comprises a latch 12a as the data holding means, a multiplier 12b as the multiplication means, and an arithmetic logic circuit 12e as the arithmetic logic means.

[0018] The latch 12a serves as the data holding means that temporarily holds the numerical data NUM supplied from outside via an input terminal 15.

[0019] The multiplier 12b receives the output of the latch 12a as the multiplication coefficient. It also receives, as the multiplicand, data supplied from outside via the input terminal 13. A multiplier normally has no data holding function. As the multiplication means, the multiplier 12b takes two inputs, the multiplication coefficient NUM and the multiplicand, and outputs a single product. The multiplier 12b sends its output to a multiplexer 12c.

[0020] The numerical data NUM supplied from the input terminal 15 is also input to the multiplexer 12c. The multiplexer 12c selects one of its two inputs according to a control signal and supplies it to the arithmetic logic circuit 12e via a latch 12d. The arithmetic logic circuit 12e stores externally supplied data in a register and performs arithmetic and logical operations. It outputs, from the output terminal 14, the result of the operation performed in response to the control signal supplied from an input terminal 18.

[0021] Next, operation control of this arithmetic unit is described with reference to FIG. 1. First, the coefficient b is supplied as the numerical data NUM to the latch 12a and the multiplexer 12c via the input terminal 15. Because the data holding signal supplied from an input terminal 16 is not enabled, the latch 12a does not hold this numerical data NUM. The multiplexer 12c, under the control signal supplied via an input terminal 17, selects the one of its two inputs that carries the coefficient b.

[0022] Next, the coefficient a is supplied as the numerical data NUM to the latch 12a and the multiplexer 12c via the input terminal 15. The data holding signal is now enabled, and the latch 12a captures the coefficient a. At this time, the multiplexer 12c supplies the numerical data NUM = b to the arithmetic logic circuit 12e via the latch 12d. Under the control signal supplied via the input terminal 18, the arithmetic logic circuit 12e stores the numerical data NUM = b in its built-in RAM.

[0023] The multiplier 12b takes in data x1 via the input terminal 13. It multiplies the previously captured coefficient a by the data x1 and outputs the product to the multiplexer 12c. Under the control signal supplied from the input terminal 17, the multiplexer 12c selects the multiplier 12b side and supplies the product a × x1 to the arithmetic logic circuit 12e via the latch 12d.

[0024] The arithmetic logic circuit 12e takes in the product a × x1 under the control signal supplied via the input terminal 18. It adds the product a × x1 to the coefficient b stored in the RAM, completing the product-sum operation. The result is output via the output terminal 14. Meanwhile, the multiplier 12b takes in the next data x2 via the input terminal 13. As described above, the multiplexer 12c selects the multiplication result under the control signal and supplies it to the arithmetic logic circuit 12e via the latch 12d. The arithmetic logic circuit 12e computes the sum of the product a × x2 and the coefficient b and outputs the result from the output terminal 14.
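The control sequence of paragraphs [0021] to [0024] can be traced with a small clocked model. The sketch below is a simplified reading of FIG. 2, not the disclosed circuit: the class `ArithmeticUnit`, the method `clock`, and the argument names `hold_enable`, `select`, and `alu_op` are hypothetical stand-ins for the control signals at input terminals 16, 17, and 18, and the model assumes the ALU acts on whatever latch 12d captured on the previous clock.

```python
class ArithmeticUnit:
    """Toy model: latch 12a -> multiplier 12b -> multiplexer 12c -> latch 12d -> ALU 12e."""

    def __init__(self):
        self.latch_12a = 0   # multiplication coefficient (a)
        self.latch_12d = 0   # value waiting at the ALU input
        self.alu_ram = 0     # ALU-internal storage for coefficient b

    def clock(self, num=0, x=0, hold_enable=False, select="num", alu_op="nop"):
        out = None
        # ALU 12e works on the value latched on the previous clock.
        if alu_op == "store":
            self.alu_ram = self.latch_12d
        elif alu_op == "add":
            out = self.latch_12d + self.alu_ram
        # Multiplier 12b is combinational: coefficient in latch 12a times input x.
        product = self.latch_12a * x
        # Multiplexer 12c picks NUM or the product; latch 12d captures the choice.
        self.latch_12d = num if select == "num" else product
        # Latch 12a captures NUM only while the data holding signal is enabled.
        if hold_enable:
            self.latch_12a = num
        return out


def run_linear(a, b, xs):
    """Drive the unit through the sequence of paragraphs [0021] to [0024]."""
    unit = ArithmeticUnit()
    unit.clock(num=b, select="num")                       # b passes the multiplexer
    unit.clock(num=a, hold_enable=True, alu_op="store")   # a latched, b stored in RAM
    outputs = []
    alu_op = "nop"
    for x in list(xs) + [0]:          # one extra clock drains the last product
        out = unit.clock(x=x, select="mul", alu_op=alu_op)
        if out is not None:
            outputs.append(out)
        alu_op = "add"
    return outputs
```

In this model no value travels over a shared bus: each clock moves data one step forward from the input terminals toward the output terminal.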

[0025] In this way, the arithmetic unit makes the circuit configuration and operation for the product-sum operation unidirectional, eliminating data exchange over a data bus and thereby avoiding data collisions. Collisions are likewise avoided when the coefficients a and b needed for the product-sum operation are input, by providing the latch 12a and using the selection made by the multiplexer 12c.

[0026] In this product-sum operation, the coefficients a and b supplied from outside are fixed and do not change during the operation, so they need to be supplied to the multiplier 12b and the arithmetic logic circuit 12e only once. Once they have been stored in the respective RAMs, each data item x supplied in sequence via the input terminal 13 is multiplied in the multiplier 12b and summed in the arithmetic logic circuit 12e (a × x + b), and the result is output. An arithmetic unit performing the product-sum operation in this way pipelines the processing after the coefficients a and b have been stored.
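The pipelining described here can be illustrated with a two-stage cycle count. This is a sketch under stated assumptions rather than the disclosed timing: the generator name `pipelined_mac` is hypothetical, the coefficients are taken as already loaded, and each stage (multiply, then add) is assumed to take one clock.

```python
def pipelined_mac(a, b, xs):
    """Yield (cycle, y) pairs for y = a*x + b with a multiply stage
    followed by an add stage, one clock per stage."""
    stage_latch = None      # product waiting between multiplier and ALU
    cycle = 0
    for x in list(xs) + [None]:     # the trailing None drains the pipeline
        cycle += 1
        if stage_latch is not None:
            # Add stage: finishes the item multiplied on the previous clock.
            yield cycle, stage_latch + b
        # Multiply stage: starts the next item in the same clock.
        stage_latch = a * x if x is not None else None
```

With n inputs the last result appears on clock n + 1, so after the first clock one result is produced per clock.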

[0027] A more specific, preferred embodiment is described next with reference to FIG. 3. FIG. 3 is a block diagram of an arithmetic processing block that applies the arithmetic unit of the present invention to image processing by high-speed computation. Arranged around the arithmetic processing block 20 are a video input bus (VIBus), which carries one line of video data supplied from a frame memory (not shown); a video output bus (VOBus); a work memory input bus (WIBus), for writing one line of data to a work memory; and a work memory output bus (WOBus), for reading one line of data from the work memory. Data taken from these buses is input to and output from the arithmetic processing block 20 via latches 21 to 24, respectively. A video temporary memory (VTM) 25 and a write temporary memory (WTM) 26 are provided as memories for temporarily storing data. The video temporary memory (VTM) 25 and the write temporary memory (WTM) 26 are local memories provided outside the arithmetic unit.

[0028] In the arithmetic processing block 20, for example, one line of image data x is supplied via an input/output interface 26a and a buffer 27 to multiplexers 28, 29, and 30. The multiplexer 28 selects one of the numerical data NUM, the image data x, and so on, and outputs it to a register R1. The register R1 supplies this data to a coefficient arithmetic logic circuit (CALU) 31.

[0029] The coefficient arithmetic logic circuit 31 serves as address calculating means: it calculates the address of a coefficient memory 34, a local memory provided outside the arithmetic unit that stores, for example, multiplication coefficients. The coefficient arithmetic logic circuit 31 supplies the address to the coefficient memory 34 via a register R2, a buffer 32, and an output interface 33. This local memory is the coefficient memory (Coefficient Memory) 34. The coefficient memory 34 stores the multiplication coefficients as coefficient data (Data), for example as a data table based on a function such as the sine function. The coefficient memory 34 supplies the coefficient data to the multiplexers 29 and 30 via an input interface 35 and a buffer 36.
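A coefficient table addressed by a separately computed address might look like the following sketch. Everything numeric here is an assumption made for illustration: the source does not give the table size, the scaling, or the address arithmetic, so `TABLE_SIZE`, `SCALE`, `build_sine_table`, `coefficient_address`, and `fetch_coefficient` are hypothetical.

```python
import math

TABLE_SIZE = 256   # assumed number of table entries
SCALE = 32767      # assumed scaling to a signed 16-bit range


def build_sine_table(size=TABLE_SIZE, scale=SCALE):
    """Coefficient data: one period of a sine function, scaled to integers."""
    return [round(scale * math.sin(2 * math.pi * k / size)) for k in range(size)]


def coefficient_address(base, index, size=TABLE_SIZE):
    """Stand-in for the address calculation done by CALU 31 (wraps around the table)."""
    return (base + index) % size


def fetch_coefficient(table, base, index):
    """Stand-in for the coefficient memory 34 returning Data for a computed address."""
    return table[coefficient_address(base, index, len(table))]
```

Separating the address computation from the multiply-add path is what lets the coefficient fetch proceed without occupying the multiplier or the ALU.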

[0030] The multiplexers 29 and 30 receive the coefficient address, the numerical data NUM, the image data x, the coefficient data, and data from other sections, and output the data selected according to the program to latches 37 and 38, respectively. The multiplexer 30 also supplies its selected data to a multiplexer 41.

[0031] The multiplexer 29 selects, for example, the image data x, and the multiplexer 30 selects, according to the program, the coefficient data (Data) output from the coefficient memory 34; they output these to the latches 37 and 38, respectively. Each of the latches 37 and 38 can handle, for example, 16-bit data. The latches 37 and 38 supply the data selected by the multiplexers 29 and 30 to a multiplier 39.

[0032] Here, when the coefficient memory 34 outputs the coefficient data for the coefficient b according to the program and the multiplexer 41 is switched to select this data from the multiplexer 30, the coefficient b is supplied to an arithmetic logic circuit (XALU) 43 via a register R3. The arithmetic logic circuit 43 has a built-in register capable of storing data, and it stores the supplied coefficient b in that register.

[0033] Next, the multiplexer 30 selects the coefficient a output from the coefficient memory 34 and supplies it to the latch 38. As noted above, a multiplier generally has no register. The latch keeps holding the supplied coefficient data until the data is changed. The multiplier 39 outputs the product of the coefficient a and the image data x. The multiplier 39 is organized into an upper 16 bits (MSP) and a lower 16 bits (LSP). It supplies the upper 16 bits to a rounding unit (RND) 40, which rounds the data, and supplies the lower 16 bits to the multiplexer 41. The rounding unit 40 also supplies its rounding result for the upper 16 bits to the multiplexer 41.
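The split of the product into MSP and LSP can be sketched as follows. The source does not state the operand format or the rounding rule, so this example assumes unsigned 16-bit operands and round-to-nearest driven by the top bit of the LSP; the function names `split_product` and `round_msp` are hypothetical.

```python
def split_product(a, x):
    """Multiply two 16-bit values (assumed unsigned) and split the 32-bit
    product into its upper (MSP) and lower (LSP) 16-bit halves."""
    product = ((a & 0xFFFF) * (x & 0xFFFF)) & 0xFFFFFFFF
    msp = product >> 16
    lsp = product & 0xFFFF
    return msp, lsp


def round_msp(msp, lsp):
    """Assumed rounding rule: add 1 to the MSP when the top bit of the LSP is set.
    The result is masked to 16 bits, so a carry out of 0xFFFF wraps in this sketch."""
    return (msp + (lsp >> 15)) & 0xFFFF
```

For example, `split_product(0x1234, 0x0100)` gives an MSP of `0x0012` and an LSP of `0x3400`, and rounding leaves the MSP unchanged because the top bit of the LSP is clear.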

[0034] The multiplexer 41 selects, according to the program, one of the numerical data NUM, the rounding result, and the output of the multiplexer 30, and outputs it to the register R3. The register R3 feeds the data back to the multiplexers 29 and 30 and also outputs it to the arithmetic logic circuit (XALU) 43. As described above, the arithmetic logic circuit 43 has a built-in register, in which the coefficient b is stored. The arithmetic logic circuit 43 adds the output of the register R3 to the coefficient b and outputs the result to a register R4. When performing the product-sum operation, the arithmetic logic circuit 43 adds the product a × x to the coefficient b to obtain the product-sum result y.

[0035] The register R4 supplies its output data to a multiplexer 47 and to a clipping (CLIP) section 44. When the supplied data overflows or underflows, the clipping section 44 replaces it with a predetermined maximum or minimum value and outputs that value. The clipping section 44 supplies its output data to the multiplexer 47 and to an absolute value section 45. The absolute value section 45 performs an absolute value operation on the supplied data and outputs the result to the multiplexer 47.
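The clipping and absolute value stages can be sketched as a saturating step followed by a magnitude step. The limits are assumptions: the source says only "a predetermined maximum or minimum value", so the signed 16-bit bounds `CLIP_MAX` and `CLIP_MIN`, and the function names, are chosen here purely for illustration.

```python
CLIP_MAX = 32767     # assumed predetermined maximum (signed 16-bit)
CLIP_MIN = -32768    # assumed predetermined minimum (signed 16-bit)


def clip(value, lo=CLIP_MIN, hi=CLIP_MAX):
    """Replace an overflowing or underflowing value with the limit it crossed."""
    if value > hi:
        return hi
    if value < lo:
        return lo
    return value


def absolute_value(value):
    """Magnitude of the (already clipped) value."""
    return -value if value < 0 else value


def post_process(value):
    """Return the three candidates offered to multiplexer 47:
    the raw R4 value, the clipped value, and its absolute value."""
    clipped = clip(value)
    return value, clipped, absolute_value(clipped)
```

Offering all three candidates to a final multiplexer mirrors the description, where the program decides which of them reaches the register R5.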

[0036] The multiplexer 47 selects one of the outputs of a stack memory (STACK) 46, the register R4, the clipping section 44, and the absolute value section 45, supplies it to the multiplexers 29, 30, and 48, and outputs it to a register R5. The register R5 outputs, for example, the product-sum result y via a buffer 49 and the input/output interface 26a. The register R5 also outputs the product-sum result y via a buffer 50 and an input/output interface 51.

[0037] The multiplexer 48 supplies the selected output data to an arithmetic logic circuit (TALU) 52 via a register R6. The arithmetic logic circuit 52 is used as address calculating means that calculates addresses of the local memories provided outside the arithmetic unit, and performs the computation needed for addressing. The arithmetic logic circuit 52 supplies its output via a register R7 to an output interface 53 and to the stack 46. The address data output from the arithmetic logic circuit 52 consists of 17 words. This address data is supplied as address data to the video temporary memory (VTM) 25 and the write temporary memory (WTM) 26. Once the coefficients have been held in the latch and the arithmetic logic circuit, they remain fixed until the coefficient data is changed, so the arithmetic processing block 20 can pipeline the product-sum operation on image data x supplied in sequence.

[0038] With this configuration, the latches 37 and 38 placed before the multiplier 39 hold the coefficient data, and the product computed by the multiplier 39 is supplied to the serially arranged arithmetic logic circuit 43 for the product-sum operation, so that data flows in one direction. The arithmetic processing block 20 can therefore speed up repeated operations by preventing, for example, suspension of the data supply until an operation finishes, bus locks, and the like.

[0039] Configuring the block and controlling its operation in this way also makes the program easy to manage.

[0040] Furthermore, a parallel computer, that is, a parallel arithmetic unit system formed by connecting a plurality of arithmetic units in parallel, uses an arithmetic unit of the above configuration as each of its arithmetic units and performs high-speed computation by parallel processing. Applying this arithmetic unit to a parallel computer in which a plurality of processors are arranged in parallel allows operations to be accelerated efficiently.
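As a software analogy for the parallel arrangement, several independent product-sum units can each work on their own data stream. This sketch is only illustrative: the source does not describe how work is divided among the units, so the per-unit coefficient pairs, the one-stream-per-unit split, and the names `mac_unit` and `parallel_mac` are assumptions. It uses Python's standard `concurrent.futures.ThreadPoolExecutor`.

```python
from concurrent.futures import ThreadPoolExecutor


def mac_unit(a, b, xs):
    """One arithmetic unit: fixed coefficients a and b, streamed inputs xs."""
    return [a * x + b for x in xs]


def parallel_mac(coefficients, streams):
    """Run one unit per (a, b) pair, each on its own input stream.
    Results come back in the same order as the units were given."""
    workers = max(1, len(coefficients))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(mac_unit, a, b, xs)
            for (a, b), xs in zip(coefficients, streams)
        ]
        return [future.result() for future in futures]
```

Because each unit holds its own coefficients and receives its own stream, the units share no intermediate values, which is the property the text relies on to avoid collisions.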

[0041]

[Effects of the Invention] According to the present invention, an arithmetic unit that repeatedly performs product-sum operations on supplied data comprises data holding means for temporarily holding data supplied from outside, multiplication means that uses the output of the data holding means as a multiplication coefficient and multiplies that coefficient by a multiplicand supplied from outside, and arithmetic logic means that stores data supplied from outside in a register and performs arithmetic and logical operations, with the multiplication means and the arithmetic logic means connected in series. This prevents failures such as bus locks caused by data collisions when, for example, data is exchanged, so operations can be accelerated safely.

[0042] When the arithmetic unit further has address calculating means that calculates the address of a local memory provided outside the arithmetic unit, and the multiplication means operates on data read out according to the address from the address calculating means, the data needed for processing can be supplied quickly, which contributes to higher speed.

[0043] Using the above arithmetic units, connected in parallel, in a parallel arithmetic system formed by connecting a plurality of arithmetic units in parallel avoids collisions in data input and output and allows high-speed operations to be performed more efficiently.

[Brief description of drawings]

FIG. 1 is a block diagram schematically showing the configuration of a processor operated according to a microprogram used to perform code conversion, data input, and the like efficiently.

FIG. 2 is a schematic block diagram showing the configuration of an embodiment in which the arithmetic unit according to the present invention is applied to the arithmetic section of the processor shown in FIG. 1.

FIG. 3 is a block diagram of an arithmetic processing block that, as a more specific and preferred embodiment, applies the arithmetic unit of the present invention to image processing by high-speed computation.

FIG. 4 is a block diagram showing the schematic configuration of a conventional arithmetic unit.

[Explanation of symbols]

10 ... Program controller
11 ... Program memory
12 ... Arithmetic section
13, 15 to 18 ... Input terminals
14 ... Output terminal
12a, 12d ... Latches
12b ... Multiplier
12c ... Multiplexer
12e, 52 ... Arithmetic logic circuits
31 ... Coefficient arithmetic logic circuit

Claims

[Claims]

1. An arithmetic unit that repeatedly performs product-sum operations on supplied data, comprising: data holding means for temporarily holding data supplied from outside; multiplication means that uses the output of the data holding means as a multiplication coefficient and multiplies the multiplication coefficient by a multiplicand supplied from outside; and arithmetic logic means that stores data supplied from outside in a register and performs arithmetic and logical operations, wherein the multiplication means and the arithmetic logic means are connected in series.

2. The arithmetic unit according to claim 1, further comprising address calculating means for calculating an address of a local memory provided outside the arithmetic unit, wherein the multiplication means operates on data read out according to the address from the address calculating means.

3. A parallel arithmetic system formed by connecting a plurality of arithmetic units in parallel, wherein the arithmetic unit according to claim 1 is used as each of the arithmetic units.