JP2012069081A

JP2012069081A - Operational circuit

Info

Publication number: JP2012069081A
Application number: JP2010215779A
Authority: JP
Inventors: Tsuguchika Tabaru; 司睦田原
Original assignee: Fujitsu Semiconductor Ltd
Current assignee: Fujitsu Semiconductor Ltd
Priority date: 2010-09-27
Filing date: 2010-09-27
Publication date: 2012-04-05
Anticipated expiration: 2030-09-27
Also published as: JP5668391B2

Abstract

PROBLEM TO BE SOLVED: To provide an arithmetic circuit that can compute, at high speed and in few processing cycles, the operation result for input data in which an arbitrary value has been updated. SOLUTION: The arithmetic circuit includes: a first register that holds a first value having N elements; a second register that holds a second value having N elements; an output register that holds the product-sum operation value of the first and second values; a first subtractor that subtracts, from one input element of the first value, the corresponding element of the first value held in the first register; a multiplier that multiplies the output of the first subtractor by the element of the second value in the second register that corresponds to the input element of the first value; and an adder that adds the output of the multiplier to the product-sum operation value in the output register and outputs the result to the output register.

Description

The present invention relates to an arithmetic circuit.

In recent years, product-sum operations have been used frequently in vector calculations for the coordinate transformations performed when drawing images, in digital filtering of audio data, and in similar processing. A product-sum operation obtains a sum of products, that is, it successively adds the results of multiplications. For example, in an FIR (finite impulse response) filter, which is one kind of digital filter operation, the product-sum operation multiplies image or audio data by coefficients and accumulates the products. In such digital filter processing, speeding up the product-sum operation speeds up the processing as a whole.

In digital filter processing of this kind, some of the image or audio data values are updated while the coefficients are not. The arithmetic circuit therefore does not need to redo the product-sum operation over all of the input data. A conventional arithmetic circuit, however, performs the product-sum operation over all of the input data, so the operation takes time. An arithmetic circuit such as that of Patent Document 1 has therefore been proposed.

Patent Document 1: Japanese Patent No. 2530916

However, although the arithmetic circuit of Patent Document 1 can obtain the product-sum operation value when data at a fixed position in the input data is updated, it cannot obtain the product-sum operation value when arbitrary data is updated.

Besides product-sum operations, processors also perform various operations that combine addition, subtraction, multiplication, logical AND, logical OR, and so on. Circuits for such other combined operations likewise take time, because when an arbitrary value in the input data is updated they redo the operation over all of the input data.

An object of the present invention is therefore to provide an arithmetic circuit that can compute, at high speed and in few processing cycles, the operation result for input data in which an arbitrary value has been updated.

A first aspect is an arithmetic circuit including:
a first register that holds a first value having N elements;
a second register that holds a second value having N elements;
an output register that holds a product-sum operation value obtained by a product-sum operation on the first value and the second value;
a first subtractor that subtracts, from one input element of the first value, the element of the first value in the first register that corresponds to that element;
a multiplier that multiplies the output of the first subtractor by the element of the second value in the second register that corresponds to the input element of the first value; and
an adder that adds the output of the multiplier to the product-sum operation value in the output register and outputs the result to the output register.

According to the first aspect, the operation result for input data in which an arbitrary value has been updated can be computed at high speed in few processing cycles.

FIG. 1 shows an example of a system having a dedicated product-sum operation circuit.
FIG. 2 shows an example of a product-sum operation unit for Equation 1.
FIG. 3 shows an example of a circuit that connects the product-sum operation unit of FIG. 2 to a bus.
FIG. 4 shows operation waveforms of the product-sum operation unit when a_0 to a_7 are updated.
FIG. 5 shows operation waveforms of the product-sum operation circuit when only a_2 is updated.
FIG. 6 shows an example of the product-sum operation unit of the present embodiment for Equation 2.
FIG. 7 shows a circuit that connects the product-sum operation unit of FIG. 6 to a bus.
FIG. 8 shows operation waveforms of the product-sum operation unit of FIG. 6 when a_0 to a_7 are updated.
FIG. 9 shows operation waveforms of the product-sum operation unit of FIG. 6 when only a_2 is updated.
FIG. 10 shows an example of the product-sum operation unit of the present embodiment for Equation 3.
FIG. 11 shows an example of the product-sum operation unit of the present embodiment for Equation 4.
FIG. 12 shows an example of the product-sum operation unit of the fourth embodiment.
FIG. 13 shows the first arithmetic circuit of the fifth embodiment.
FIG. 14 shows the second arithmetic circuit of the fifth embodiment.
FIG. 15 shows a circuit that connects the circuits of FIG. 13 and FIG. 14 to a bus.
FIG. 16 shows an example of the circuit of the present embodiment for Equation 7.

Embodiments of the present invention are described below with reference to the drawings. The technical scope of the present invention is not limited to these embodiments, however, and extends to the matters described in the claims and their equivalents.

FIG. 1 shows an example of a system 10 having a dedicated product-sum operation circuit. The system 10 includes a CPU 11, a DMAC 12, a memory 13, a dedicated product-sum operation circuit 14 (hereinafter, the product-sum operation unit), and other hardware 15. The CPU 11 sends instructions read from the memory 13 to the product-sum operation circuit 14 via a bus 16, and the product-sum operation circuit 14 sends the operation result to the CPU 11 via the bus 16. The DMAC 12 controls data transfers that bypass the CPU 11, such as transfers between the other hardware 15 and the memory 13.

Specifically, the CPU 11 exchanges data with the product-sum operation unit 14 by reading from and writing to the addresses assigned to the unit. For example, to write data to the product-sum operation unit 14, the CPU 11 issues the address of the unit, the write data, a timing signal, and a write mode signal on the bus 16. The product-sum operation unit 14 detects these signals, determines that the issued address is its own, and takes in the data from the bus 16. To read data from the product-sum operation unit 14, the CPU 11 issues the address of the unit and a timing signal on the bus 16 and takes in from the bus 16 the data supplied by the unit.

The product-sum operation circuit 14 generally calculates the product-sum operation value according to the following Equation 1. In Equation 1, S is the product-sum operation value of the values a and b.

The value a has n elements a_0, a_1, ..., a_{n-1}, and the value b has n elements b_0, b_1, ..., b_{n-1}. In Equation 1, the products a_i × b_i of the i-th elements (0 ≤ i ≤ n-1) of the two values are added successively to give the product-sum operation value S.
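The image of Equation 1 is not reproduced in this text. From the definitions above it can be reconstructed as follows; this is an inference from the surrounding description, not a copy of the original equation image.

```latex
S = \sum_{i=0}^{n-1} a_i \times b_i
```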

FIG. 2 shows an example of a product-sum operation unit for Equation 1 with n = 8. In this unit, the elements a_0 to a_7 of the value a are input to registers R00 to R07, and the elements b_0 to b_7 of the value b are input to registers R10 to R17. All or some of the elements of a and b are supplied as the input signal input and written to the corresponding registers. The operation is then performed on all of the element data of a and b held in the registers, and the product-sum operation value S is output to the output register OUT.

In the product-sum operation unit of FIG. 2, each element to be operated on (input update data) is first supplied as the input signal input, and the register corresponding to that data responds to one of the write signals write_0_0 to write_1_7 and stores it. Once all of the input update data has been written, the input signal start is applied to the counter CNT and the delay unit DELA. In response to start, the counter CNT outputs a value that increments from 0 at each rising clock edge to the comparator COM and the selectors SELA and SELB.

In response to the select signal select1 (0 to 7) from the counter CNT, the selectors SELA and SELB each output the data held in the register corresponding to select1 to the multiplier MUL. For example, when select1 is 0, SELA outputs the data in register R00 and SELB outputs the data in register R10. The multiplier MUL multiplies the pair of corresponding elements of a and b that it receives together and outputs the products one after another to the adder ADD, which adds each product to its own previous output.

Meanwhile, the comparator COM outputs a signal to the delay unit DELB when select1 from the counter CNT is 7. DELB delays that signal until the products of all pairs have been added and then outputs the write signal write to the output register OUT. The delay unit DELA delays the input signal start so that it arrives just before the first product is added and outputs the reset signal reset to the adder ADD, initializing it to 0 in advance.
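The control flow of the FIG. 2 unit described above can be sketched as a small functional model. This is only an illustration under simplifying assumptions: it is not cycle-accurate, the delay units DELA and DELB are represented only by the order of operations, and the function name mac_full is hypothetical.

```python
def mac_full(a_regs, b_regs):
    """Functional (not cycle-accurate) model of the FIG. 2 unit.

    a_regs models registers R00..R07 and b_regs models R10..R17.
    """
    n = len(a_regs)
    assert n > 0 and n == len(b_regs)
    acc = 0      # adder ADD, initialized to 0 by the reset signal from DELA
    out = None   # output register OUT, written once at the end
    for select1 in range(n):                  # counter CNT counts 0, 1, ..., n-1
        product = a_regs[select1] * b_regs[select1]  # SELA and SELB feed MUL
        acc += product                        # ADD adds the product to its previous output
        if select1 == n - 1:                  # comparator COM detects the last index (7 when n = 8)
            out = acc                         # write to OUT, timed by DELB in the real circuit
    return out
```

Every call walks all n register pairs, which is the cost the later embodiment avoids.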

FIG. 3 shows an example of a circuit that connects the product-sum operation unit of FIG. 2 to a bus. Here the bus 16 of FIG. 1 consists of a control bus CB, a data bus DB, and an address bus AB. When the product-sum operation unit of FIG. 2 is accessed, the address assigned to the unit is output on the address bus AB, and a signal indicating when a valid address is present is output on the control bus CB.

The gate G1 detects the valid address, and the comparators C00 to C17 each compare the detected address with the address assigned to the corresponding register (registers R00 to R07). On a match, the write signal write_*_* (write_0_0 to write_1_7) for that address is sent to the product-sum operation unit of FIG. 2. The comparator C20 compares the detected address with the address for issuing the input signal start and, on a match, sends start to the product-sum operation unit of FIG. 2.

The gate G4 detects, from the control bus CB, the signal indicating when valid data is present on the data bus DB and outputs the data on the data bus DB as the input signal input. The gate G2 detects the valid address, and the comparator C30 compares the detected address with the address of the output register OUT of the product-sum operation unit. On a match, the output value output of the output register OUT is driven onto the data bus DB through the gate G3.
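The address comparison performed by the comparators C00 to C17 and C20 can be pictured with a short sketch. The address map here is entirely hypothetical (the source gives no addresses): it assumes a base address BASE, one address per register in the order R00..R07 then R10..R17, and the start address immediately after them.

```python
BASE = 0x1000  # hypothetical base address, not taken from the source


def decode_write(addr, n=8, base=BASE):
    """Map a bus write address to the signal it would raise.

    Returns (bank, index) for write_<bank>_<index>, the string "start"
    for the start address, or None when the address is not ours.
    """
    offset = addr - base
    if 0 <= offset < 2 * n:
        # bank 0 -> registers R00..R07, bank 1 -> registers R10..R17
        return divmod(offset, n)
    if offset == 2 * n:
        return "start"   # comparator C20 in FIG. 3
    return None          # no comparator matches
```

In the circuit the comparisons happen in parallel, one comparator per register; the sequential checks above are only a convenience of the model.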

FIG. 4 shows operation waveforms of the product-sum operation unit of FIG. 2 when a_0 to a_7 are updated. In this example, b_0 to b_7 are already held in registers R10 to R17. p_i is the product a_i × b_i of each pair of elements, and S_i is the product-sum operation value up to the i-th pair.

In the waveform diagram of FIG. 4, a_0 is first supplied as the input signal input and, in response to the write signal write_0_0, is written to register R00 at the next rising clock edge. In the same way, a_1 to a_7 are supplied in turn as input and written to registers R01 to R07. At the rising clock edge following the input of a_7, the input signal start is output to the counter CNT.

In response to start, the counter CNT outputs values incrementing from 0 to 7, one per rising clock edge, to the selectors SELA and SELB and the comparator COM. First, in response to select1 = 0, SELA selects the data a_0 in register R00 and SELB selects the data b_0 in register R10, and both are output to the multiplier MUL. At the next rising clock edge, the multiplier MUL multiplies a_0 by b_0 and outputs the product p_0 = a_0 × b_0 to the adder ADD.

The delay unit DELA responds to the start signal by outputting the reset signal reset to the adder ADD, initializing it to 0 before the first product p_0 arrives. The adder ADD therefore first outputs S_0 = 0 + p_0, the sum of the initial value 0 and the product p_0 from the multiplier MUL. In the same way, at each subsequent clock the adder ADD adds the multiplier output p_i = a_i × b_i to its own output S_{i-1} from the previous clock and outputs S_i = S_{i-1} + p_i. When the output of the counter CNT reaches 7, the comparator COM outputs a comparison signal to the delay unit DELB, which outputs the write signal write to the output register at the time when all of the products p_0 to p_7 have been added.
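The sequence of adder outputs in FIG. 4 follows the recurrence S_0 = 0 + p_0 and S_i = S_{i-1} + p_i. A minimal sketch of that trace is given below; the function name partial_sums is hypothetical and the model ignores clock timing.

```python
from itertools import accumulate


def partial_sums(a, b):
    """Return [S_0, S_1, ..., S_{n-1}] where p_i = a_i * b_i and S_i = S_{i-1} + p_i."""
    products = [x * y for x, y in zip(a, b)]   # p_i, one product per clock from MUL
    return list(accumulate(products))          # running sums as they leave adder ADD
```

The last entry of the returned list corresponds to S_7 when n = 8, the value written to the output register.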

As described above, the product-sum operation unit of FIG. 2 starts the product-sum operation only after all of the input update data a_0 to a_7 has been stored in the registers, so obtaining the result takes time. The unit of FIG. 2 required 20 cycles from the start of input of a_0 to a_7 until the product-sum operation value S_7 was calculated. The case where only one of a_0 to a_7 is updated is described next.

FIG. 5 shows operation waveforms of the product-sum operation circuit of FIG. 2 when only a_2 among a_0 to a_7 is updated. In this example, b_0 to b_7 are already held in registers R10 to R17, and a_0 to a_7 other than a_2 are already held in registers R00 to R07 other than R02. p_i and S_i have the same meaning as in the waveform diagram of FIG. 4.

In the waveform diagram of FIG. 5, a_2 is first supplied as the input signal input and, in response to the write signal write_0_2, is written to register R02 at the next rising clock edge. At the rising clock edge following the input of a_2, the input signal start is output to the counter CNT, which outputs values incrementing from 0 to the selectors SELA and SELB and the comparator COM. As in FIG. 4, SELA and SELB select the pairs of data held in the registers corresponding to select1 values 0 to 7 and output them in turn to the multiplier MUL, which outputs the product p_i (= a_i × b_i) of each pair in turn to the adder ADD. The adder ADD adds the products p_i of all pairs in turn and outputs the product-sum operation value S_7 to the output register OUT.

Thus, even when only a_2 among a_0 to a_7 is updated, the product-sum operation unit of FIG. 2 redoes the product-sum operation over all of the data held in the registers. And because it redoes the operation over all of the data, it starts only after all of the data has been written to the registers.

The product-sum operation unit of FIG. 2 therefore needs both the transfer time until all of the input update data is stored in the registers and the time to multiply and add all of the data, so obtaining the product-sum operation value takes time. Even when only a_2 is updated, the unit of FIG. 2 required 13 cycles from the start of input of a_2 until the product-sum operation value S_7 was calculated.

<First Embodiment>
The arithmetic circuit of this embodiment therefore includes a first register that holds a first value having N elements, a second register that holds a second value having N elements, and an output register that holds the product-sum operation value of the first and second values. It further includes a subtractor that subtracts, from one input element of the first value, the element of the first value in the first register that corresponds to that element; a multiplier that multiplies the output of the subtractor by the element of the second value in the second register that corresponds to the input element of the first value; and an adder that adds the output of the multiplier to the product-sum operation value in the output register and outputs the result to the output register.

The product-sum operation unit of this embodiment calculates the product-sum operation value according to the following Equation 2. In Equation 2, the value a' = (a_0', a_1', a_2', ..., a_{n-1}') is the previous data a = (a_0, a_1, a_2, ..., a_{n-1}), the old value used in the previous product-sum operation, with its j-th element updated. The elements of a and a' other than the j-th are unchanged and identical. S is the product-sum operation value of a and b (hereinafter, the previous product-sum operation value), and S' is the product-sum operation value of a' and b (hereinafter, the updated product-sum operation value).

In Equation 2, the updated product-sum operation value S' is obtained by adding the difference between S' and S to the previous product-sum operation value S. Specifically, S' is the previous value S minus the product a_j × b_j of the j-th elements of a and b, plus the product a_j' × b_j of the j-th elements of a' and b: S' = S - (a_j × b_j) + (a_j' × b_j). This simplifies to S' = S + (a_j' - a_j) × b_j. The updated product-sum operation value S' is thus calculated by adding the difference value (a_j' - a_j) × b_j to the previous product-sum operation value S.
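The image of Equation 2 is not reproduced in this text. The derivation described above can be written out as follows; this is a reconstruction from the prose, using the fact that a' and a differ only in the j-th element.

```latex
\begin{aligned}
S' &= \sum_{i=0}^{n-1} a_i' \times b_i \\
   &= \sum_{i=0}^{n-1} a_i \times b_i \;-\; a_j \times b_j \;+\; a_j' \times b_j \\
   &= S + (a_j' - a_j) \times b_j
\end{aligned}
```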

Comparing Equation 1 with Equation 2, when only some elements of a value are updated, Equation 2 requires less computation than Equation 1. Equation 2 can therefore calculate the updated product-sum operation value S' in fewer processing cycles than Equation 1.
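The difference in work between the two equations can be illustrated in software. The sketch below is an assumption-laden illustration, not the patented circuit: the names dot and update_a are hypothetical, and the sample data is made up.

```python
def dot(a, b):
    """Equation 1: full product-sum over all n pairs."""
    return sum(x * y for x, y in zip(a, b))


def update_a(S, a, b, j, new_aj):
    """Equation 2: return S' after replacing a[j] with new_aj.

    Uses one subtraction, one multiplication, and one addition,
    regardless of n. Also stores the new element in a.
    """
    S_new = S + (new_aj - a[j]) * b[j]
    a[j] = new_aj
    return S_new


a = [1, 2, 3, 4, 5, 6, 7, 8]      # illustrative data, not from the source
b = [8, 7, 6, 5, 4, 3, 2, 1]
S = dot(a, b)                      # previous product-sum operation value
S_updated = update_a(S, a, b, 2, 10)
```

After the call, S_updated equals the value a full recomputation with dot(a, b) would give, while the full recomputation needs n multiplications and n additions.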

FIG. 6 shows an example of the product-sum operation unit of this embodiment for Equation 2 with n = 8. As in the unit of FIG. 2, the elements a_0' to a_7' of the value a' are stored in registers R00 to R07, and the elements b_0 to b_7 of the value b are stored in registers R10 to R17. The unit of this embodiment holds the previous data a_0 to a_7 and b_0 to b_7 in registers R00 to R17 and the previous product-sum operation value S in the output register.

In the product-sum operation unit of FIG. 6, the update pair number s10 identifies the pair to which the input update data supplied as the input signal input belongs. The logical OR s0 of write_0_0 to write_0_7 indicates whether there is input update data to be stored in one of registers R00 to R07: when s0 is at H level, input update data is stored in one of R00 to R07, and when it is at L level, none is stored in any of them. Likewise, the logical OR s1 of write_1_0 to write_1_7 indicates whether there is input update data to be stored in one of registers R10 to R17.
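How s0, s1, and s10 relate to the individual write signals can be sketched as follows. The encoding is an assumption made for illustration: each group of write signals is modeled as a list of booleans, and the function name decode_update is hypothetical.

```python
def decode_update(write_0, write_1):
    """Derive (s0, s1, s10) from the two groups of write signals.

    write_0 models write_0_0..write_0_7 and write_1 models write_1_0..write_1_7.
    """
    s0 = any(write_0)   # logical OR of write_0_0..write_0_7
    s1 = any(write_1)   # logical OR of write_1_0..write_1_7
    # The text states that data for either a or b is input, never both at once.
    assert not (s0 and s1)
    if s0:
        s10 = write_0.index(True)
    elif s1:
        s10 = write_1.index(True)
    else:
        s10 = None      # no update in this cycle
    return s0, s1, s10
```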

In the product-sum operation unit of FIG. 6, input update data for either the value a or the value b is supplied as the input signal input. A difference value is computed from the input update data input and the previous data that corresponds to it, and this difference value is added to the previous product-sum operation value to give the updated product-sum operation value.

First, when data is supplied as the input signal input, the previous data held in the registers corresponding to the update pair number s10 is output before the input update data is written: the selector SEL1 outputs it to the delay unit DEL2 and the subtractor SUB1, and the selector SEL2 outputs it to the delay unit DEL3 and the subtractor SUB2. At the same time, the register corresponding to the input update data responds to one of the write signals write_0_0 to write_1_7, and the input update data is written to it.

The subtractor SUB1 then subtracts the previous data output by the selector SEL1 from the input update data input and outputs the result to the selector SEL3. The delay unit DEL2 delays the previous data output by SEL1 to match the timing of the SUB1 output and outputs it to SEL3. The delay unit DEL1 delays the logical OR s0 of write_0_0 to write_0_7 to match the timing of the SUB1 output and outputs it to SEL3 as the select signal select3. SEL3 outputs to the multiplier MUL1 the output of the subtractor SUB1 when select3 is at H level, and the previous data of the updated pair output by the delay unit DEL2 when select3 is at L level. The selector SEL4 operates in the same way.

As noted above, data for either the value a or the value b is input. When data of the value a is updated, for example, the logical OR s0 of write_0_0 to write_0_7 is at H level and the logical OR s1 of write_1_0 to write_1_7 is at L level. In this case the multiplier MUL1 multiplies the difference (a_j' - a_j) from the subtractor SUB1, selected by SEL3 on the a side, by the previous data b_j from the delay unit DEL3, selected by SEL4 on the b side. When data of the value b is updated, s0 is at L level and s1 is at H level, and MUL1 multiplies the previous data a_j from the delay unit DEL2, selected by SEL3, by the difference (b_j' - b_j) from the subtractor SUB2, selected by SEL4.
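The operand selection by SEL3 and SEL4 can be sketched as a pure function. This is a simplified model under stated assumptions: timing and the delay units are ignored, H and L levels are modeled as True and False, and the function name delta_term is hypothetical.

```python
def delta_term(s0, s1, inp, prev_a, prev_b):
    """Return the difference value that MUL1 sends to ADD1.

    s0 is True when an element of a is being updated, s1 when an element
    of b is. inp is the input update data; prev_a and prev_b are the
    previous data a_j and b_j of the updated pair.
    """
    if s0 == s1:
        raise ValueError("exactly one of s0 and s1 must be at H level")
    # SEL3: SUB1 output (inp - a_j) when s0 is H, else previous a_j via DEL2
    operand_a = inp - prev_a if s0 else prev_a
    # SEL4: SUB2 output (inp - b_j) when s1 is H, else previous b_j via DEL3
    operand_b = inp - prev_b if s1 else prev_b
    return operand_a * operand_b   # MUL1
```

With a_j = 3 and b_j = 4, updating a_j to 10 gives (10 - 3) × 4, while updating b_j to 10 gives 3 × (10 - 4); in both cases the result is the change in the product of the pair.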

そして、乗算器ＭＵＬ１は、乗算結果を前回の積和演算値からの差分値として加算器ＡＤＤ１に出力する。この差分値は、上述した式２における「（ａ_ｊ´−ａ_ｊ）×ｂ_ｊ」（値「ａ」のデータが更新される場合）に対応する。続いて、加算器ＡＤＤ１は、当該差分値と、出力レジスターＯＵＴが保持する前回の積和演算値とを加算して、出力レジスターＯＵＴに出力する。ただし、連続するクロックサイクルで入力更新データが入力される場合、前回の積和演算値を出力レジスターから入力すると、次の加算器ＡＤＤ１の演算に間に合わない。そこで、連続するクロックサイクルで更新データが入力される場合、選択器ＳＥＬ５は、出力レジスターＯＵＴではなく加算器ＡＤＤ１からの出力を直接加算器ＡＤＤ１に入力する。 Then, the multiplier MUL1 outputs the multiplication result to the adder ADD1 as a difference value from the previous product-sum operation value. This difference value corresponds to “(a _j ′ −a _j ) × b _j ” (when the data of the value “a” is updated) in the above-described Expression 2. Subsequently, the adder ADD1 adds the difference value and the previous product-sum operation value held in the output register OUT, and outputs the result to the output register OUT. However, when input update data is input in successive clock cycles, if the previous product-sum operation value is input from the output register, it will not be in time for the next adder ADD1. Therefore, when update data is input in successive clock cycles, the selector SEL5 inputs the output from the adder ADD1 directly to the adder ADD1 instead of the output register OUT.

点線で囲んだ連続入力検出回路ＥＣ１は、連続するクロックサイクルで入力更新データが発生したか否かを判定する回路である。連続入力検出回路ＥＣ１は、遅延器ＤＥＬ６及び論理積器ＡＮＤ１によって、Ｈレベルのwrite_0_0からwrite_0_7の論理和s0またはwrite_1_0からwrite_1_7の論理和s1が連続することが検出されると、Ｈレベルのセレクト信号select5を出力する。つまり、連続するクロックサイクルで入力更新データが発生する場合はＨレベルのセレクト信号select5を、そうでない場合はＬレベルのセレクト信号select5を選択器ＳＥＬ５に出力する。そして、選択器ＳＥＬ５は、セレクト信号select5がＨレベルの場合は加算器ＡＤＤ１からの前回の出力を、セレクト信号select5がＬレベルの場合は出力レジスターＯＵＴからの出力を加算器ＡＤＤ１に入力する。 The continuous input detection circuit EC1, enclosed by a dotted line, determines whether input update data has occurred in consecutive clock cycles. Using the delay unit DEL6 and the AND circuit AND1, the circuit EC1 outputs an H-level select signal select5 when it detects that s0 (the logical sum of write_0_0 to write_0_7) or s1 (the logical sum of write_1_0 to write_1_7) is at the H level in consecutive cycles. That is, it outputs an H-level select signal select5 to the selector SEL5 when input update data occurs in consecutive clock cycles, and an L-level select signal select5 otherwise. The selector SEL5 then inputs to the adder ADD1 the previous output of the adder ADD1 when the select signal select5 is at the H level, and the output of the output register OUT when the select signal select5 is at the L level.
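The behavior of the continuous input detection circuit described above can be sketched in Python as follows. This is a minimal behavioral model, not the circuit itself: the function name `select5_stream` is hypothetical, each list entry stands for the logical sum of s0 and s1 in one clock cycle, and the delay unit DEL6 is assumed to start at the L level.

```python
def select5_stream(update_flags):
    """Yield select5 per cycle: True only when this cycle and the
    previous cycle both carried input update data."""
    previous = False  # assumed reset state of the delay unit DEL6 (L level)
    for flag in update_flags:
        # AND1: current update flag AND the flag delayed by one cycle
        yield flag and previous
        previous = flag


# Updates arrive in cycles 0, 1 and 3; only cycle 1 follows another update.
print(list(select5_stream([True, True, False, True])))
```

When select5 is True, the adder would take its own previous output as the running sum; otherwise it would read the output register OUT.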

そして、遅延器ＤＥＬ５は、データが入力されたことを表す論理和器ＬＤ１のＨレベルの信号を積和演算値が算出されるタイミングに遅延させ、出力レジスターＯＵＴに書き込み信号writeを出力する。なお、図６の積和演算器において、入力レジスター（レジスターＲ００〜Ｒ１７）と出力レジスターＯＵＴは、初期状態では「０」に初期化されているものとする。そして、これらのレジスターが一旦初期化された後、入力レジスターが保持する各データの積和演算値が出力レジスターに保持される間は、再度初期化が行われる必要はない。 Then, the delay unit DEL5 delays the H level signal of the OR circuit LD1 indicating that data has been input to the timing at which the product-sum operation value is calculated, and outputs the write signal write to the output register OUT. In the product-sum calculator of FIG. 6, it is assumed that the input registers (registers R00 to R17) and the output register OUT are initialized to “0” in the initial state. After these registers are once initialized, it is not necessary to perform initialization again while the product-sum operation value of each data held in the input register is held in the output register.

このように、図６の積和演算器は、前回の積和演算値に対して、更新された任意の要素データ（入力更新データ）に基づいて前回の積和演算値との差分値（（ａ_ｊ´−ａ_ｊ）×ｂ_ｊ）を加算することによって、更新後の積和演算値を算出する。このため、図６の積和演算器は、演算対象の全ての値の要素データを積和演算し直す必要がなく、また、全ての入力更新データが対応するレジスターに蓄えられるのを待たずに演算を開始することができる。これにより、図６の積和演算器は、図２の積和演算器に対して、少ない処理サイクルで積和演算値を算出することができる。 In this way, the product-sum operation unit of FIG. 6 calculates the updated product-sum operation value by adding to the previous product-sum operation value the difference ((a_j′ − a_j) × b_j) derived from whichever element data has been updated (the input update data). The unit therefore does not need to recompute the product-sum over the element data of every value, and it can start computing without waiting for all input update data to be stored in the corresponding registers. As a result, the product-sum operation unit of FIG. 6 can calculate the product-sum operation value in fewer processing cycles than the product-sum operation unit of FIG. 2.
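The incremental update summarized above can be illustrated with a small software model. This is a sketch under stated assumptions rather than a description of the hardware: the class name `IncrementalMac` and its methods are hypothetical, timing and pipelining are ignored, and all registers start at 0 as the text assumes.

```python
class IncrementalMac:
    """Software model of a product-sum value kept up to date by differences."""

    def __init__(self, n=8):
        self.a = [0] * n   # models registers R00..R07
        self.b = [0] * n   # models registers R10..R17
        self.out = 0       # models the output register OUT

    def update(self, which, j, value):
        """Write one element of 'a' or 'b' and add only the difference to OUT."""
        if which == "a":
            delta = (value - self.a[j]) * self.b[j]   # (a_j' - a_j) * b_j
            self.a[j] = value
        elif which == "b":
            delta = self.a[j] * (value - self.b[j])   # a_j * (b_j' - b_j)
            self.b[j] = value
        else:
            raise ValueError("which must be 'a' or 'b'")
        self.out += delta
        return self.out

    def full_recompute(self):
        """Reference result: the full product-sum over all elements."""
        return sum(x * y for x, y in zip(self.a, self.b))


mac = IncrementalMac(n=3)
for j, (x, y) in enumerate(zip([1, 2, 3], [4, 5, 6])):
    mac.update("a", j, x)
    mac.update("b", j, y)
mac.update("a", 1, 7)  # change a single element
print(mac.out, mac.full_recompute())
```

Each call touches one multiplication and one addition regardless of n, which is the property the text attributes to the unit of FIG. 6.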

ところで、図６の積和演算器は、図２の積和演算器と同様に、入力レジスター及び出力レジスターを有する。ただし、図２の積和演算器は、入力レジスター及び出力レジスターがない場合でも演算可能であるのに対し、本実施の形態例の積和演算器は、入力レジスター及び出力レジスターがない場合は演算できない。というのも、図２の積和演算器では、入力更新データは入力レジスターを介さずに選択器ＳＥＬＡ、ＳＥＬＢに出力されてもよく、加算器ＡＤＤの出力は出力レジスターＯＵＴを介さずに出力されてもよい。それに対し、本実施の形態例の積和演算器は、前回データ及び前回の積和演算値を演算に用いるため、それらのデータを予め保持しておく入力レジスター及び出力レジスターは必要不可欠である。 By the way, the product-sum operation unit in FIG. 6 has an input register and an output register in the same manner as the product-sum operation unit in FIG. However, the product-sum operation unit of FIG. 2 can perform the operation even when there is no input register and output register, whereas the product-sum operation unit of the present embodiment performs the operation when there is no input register and output register. Can not. This is because, in the product-sum calculator of FIG. 2, the input update data may be output to the selectors SELA and SELB without passing through the input register, and the output of the adder ADD is output without passing through the output register OUT. May be. On the other hand, since the product-sum operation unit of the present embodiment uses the previous data and the previous product-sum operation value for the operation, the input register and the output register that hold the data in advance are indispensable.

図７は、図６の積和演算器をバスに接続する回路を表す図である。同図において、図３のバス接続回路と同じ部分については、同じ引用番号が付与されている。本実施の形態例における積和演算器のバス接続回路は、さらに、３つの信号（write_0_0からwrite_0_7の論理和s0、write_1_0からwrite_1_7の論理和s1、updateする組の番号s10）を生成して図６の演算回路に出力する。 FIG. 7 is a diagram showing a circuit that connects the product-sum operation unit of FIG. 6 to a bus. In the figure, parts that are the same as in the bus connection circuit of FIG. 3 are given the same reference numbers. The bus connection circuit of the product-sum operation unit in the present embodiment additionally generates three signals (s0, the logical sum of write_0_0 to write_0_7; s1, the logical sum of write_1_0 to write_1_7; and s10, the number of the set to be updated) and outputs them to the arithmetic circuit of FIG. 6.

図７のバス接続回路において、論理和器Ｌ１０は、ライト信号write_0_0〜write_0_7の論理和に基づいてwrite_0_0からwrite_0_7の論理和s0を生成する。具体的に、ライト信号write_0_0〜write_0_7のいずれかの信号がＨレベルの場合、write_0_0からwrite_0_7の論理和s0はＨレベルとなる。同様にして、論理和器Ｌ２０は、ライト信号write_1_0〜write_1_7の論理和に基づいてwrite_1_0からwrite_1_7の論理和s1を生成する。また、論理和器Ｌ００〜Ｌ０７は、各論理和器に対応する組の信号（例えば、論理和器Ｌ００の場合、write_0_0、write_1_0）の論理和をそれぞれエンコーダーＥ１に出力する。エンコーダーＥ１は、Ｈレベルの信号を出力する論理和器を、updateする組の番号s10として数値化し、図６の積和演算器に出力する。 In the bus connection circuit of FIG. 7, the OR circuit L10 generates s0 as the logical sum of the write signals write_0_0 to write_0_7. Specifically, when any of the write signals write_0_0 to write_0_7 is at the H level, s0 is at the H level. Similarly, the OR circuit L20 generates s1 as the logical sum of the write signals write_1_0 to write_1_7. The OR circuits L00 to L07 each output to the encoder E1 the logical sum of the signals of the corresponding set (for example, write_0_0 and write_1_0 in the case of the OR circuit L00). The encoder E1 converts the position of the OR circuit that outputs an H-level signal into the number s10 of the set to be updated, and outputs it to the product-sum operation unit of FIG. 6.
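The signal generation described above can be sketched as a short Python function. This is an illustrative model only: the function name `bus_signals` is hypothetical, the write signals are represented as lists of booleans, and it is assumed that at most one set is written per cycle (the text does not say how the encoder E1 behaves otherwise, so the sketch simply reports the lowest active set).

```python
def bus_signals(write_0, write_1):
    """Derive (s0, s1, s10) from the per-register write signals.

    write_0[i] models write_0_i and write_1[i] models write_1_i.
    """
    s0 = any(write_0)  # OR circuit L10
    s1 = any(write_1)  # OR circuit L20
    # OR circuits L00..L07: one per set, combining write_0_i and write_1_i
    per_set = [w0 or w1 for w0, w1 in zip(write_0, write_1)]
    # Encoder E1: index of the active set, or None when nothing is written
    s10 = per_set.index(True) if any(per_set) else None
    return s0, s1, s10


write_0 = [False] * 8
write_0[2] = True  # a write to register R02
print(bus_signals(write_0, [False] * 8))
```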

図８は、入力更新データとして「ａ_０´〜ａ_７´」が更新される場合の図６の積和演算器における動作波形を表す図である。ここでは入力更新データ「ａ_０´〜ａ_７´」と前回データである「ｂ_０〜ｂ_７」との積和演算が行われる。同図の例において、レジスターＲ００〜Ｒ０７には前回データ「ａ_０〜ａ_７」が、レジスターＲ１０〜Ｒ１７には前回データ「ｂ_０〜ｂ_７」が、出力レジスターには前回の積和演算値「Ｓ_７ ^８」が予め保持される。また、「ｄ_ｉ」は入力更新データ「ａ_ｉ´」から当該入力更新データに対応する前回データ「ａ_ｉ」の減算値「ｄ_ｉ（＝ａ_ｉ´−ａ_ｉ）」であり、「ｑ_ｉ」は入力更新データに対応する組の値「ｂ_ｉ」の要素データと減算値「ｄ_ｉ」との乗算値「ｑ_ｉ（＝ｄ_ｉ（＝ａ_ｉ´−ａ_ｉ）×ｂ_ｉ）」である。そして、「Ｓ_７ ^８−ｉ」は、「ｉ」番目の組までの積和演算値を表す。 FIG. 8 is a diagram showing operation waveforms in the product-sum operation unit of FIG. 6 when “a_0′ to a_7′” are updated as input update data. Here, a product-sum operation is performed on the input update data “a_0′ to a_7′” and the previous data “b_0 to b_7”. In the example of the figure, the registers R00 to R07 hold the previous data “a_0 to a_7”, the registers R10 to R17 hold the previous data “b_0 to b_7”, and the output register holds the previous product-sum operation value “S_7^8” in advance. “d_i” is the subtraction value “d_i (= a_i′ − a_i)” obtained by subtracting the previous data “a_i” corresponding to the input update data from the input update data “a_i′”, and “q_i” is the multiplication value “q_i (= d_i (= a_i′ − a_i) × b_i)” of the subtraction value “d_i” and the element data “b_i” of the set corresponding to the input update data. “S_7^(8−i)” represents the product-sum operation value up to the i-th set.

レジスターＲ００には、予め前回データ「ａ_０」が保持される。そして、入力信号inputとして「ａ_０´」が入力されると、選択器ＳＥＬ１は、入力更新データ「ａ_０´」に対応するupdateする組の番号s10（＝０）に基づいて、レジスターＲ００が予め保持する前回データ「ａ_０」を次のクロックの立ち上がりタイミングで遅延器ＤＥＬ２と減算器ＳＵＢ１に出力する。同時に、ライト信号write_0_0に応答して、レジスターＲ００に「ａ_０´」が書き込まれる。 The register R00 holds the previous data “a_0” in advance. When “a_0′” is input as the input signal input, the selector SEL1, based on the number s10 (= 0) of the set to be updated that corresponds to the input update data “a_0′”, outputs the previous data “a_0” held in the register R00 to the delay unit DEL2 and the subtractor SUB1 at the next rising edge of the clock. At the same time, “a_0′” is written to the register R00 in response to the write signal write_0_0.

減算器ＳＵＢ１は、入力更新データ「ａ_０´」から、選択器ＳＥＬ１から出力された前回データ「ａ_０」を減算し、減算値「ｄ_０（＝ａ_０´−ａ_０）」を選択器ＳＥＬ３に出力する。また、入力更新データ「ａ_０´」はレジスターＲ００に対応するためwrite_0_0からwrite_0_7の論理和s0はＨレベルとなり、遅延器ＤＥＬ１はＨレベルのセレクト信号select3を選択器ＳＥＬ３に出力する。そのため、選択器ＳＥＬ３は、Ｈレベルのセレクト信号select3に基づいて、減算器ＳＵＢ１からの出力「ｄ_０」を選択し乗算器ＭＵＬ１に出力する。 The subtractor SUB1 subtracts the previous data “a_0” output from the selector SEL1 from the input update data “a_0′”, and outputs the subtraction value “d_0 (= a_0′ − a_0)” to the selector SEL3. Because the input update data “a_0′” corresponds to the register R00, s0 (the logical sum of write_0_0 to write_0_7) is at the H level, and the delay unit DEL1 outputs an H-level select signal select3 to the selector SEL3. The selector SEL3 therefore selects the output “d_0” of the subtractor SUB1 based on the H-level select signal select3 and outputs it to the multiplier MUL1.

一方、選択器ＳＥＬ２は、updateする組の番号s10（＝０）に基づいて、レジスターＲ１０に保持された前回データ「ｂ_０」を選択して減算器ＳＵＢ２と遅延器ＤＥＬ３に出力する。この場合、入力更新データ「ａ_０´」はレジスターＲ１０〜Ｒ１７には対応しないため、選択器ＳＥＬ４にＬレベルのセレクト信号select4が出力され、選択器ＳＥＬ４は、セレクト信号select4に基づいて遅延器ＤＥＬ３からの前回データ「ｂ_０」を選択し乗算器ＭＵＬ１に出力する。 On the other hand, the selector SEL2 selects the previous data “b ₀ ” held in the register R10 based on the number s10 (= 0) of the set to be updated, and outputs it to the subtractor SUB2 and the delay device DEL3. In this case, since the input update data “a ₀ ′” does not correspond to the registers R10 to R17, the L level select signal select4 is output to the selector SEL4, and the selector SEL4 receives the delay device DEL3 based on the select signal select4. The previous data “b ₀ ” from is selected and output to the multiplier MUL1.

そして、次のクロックの立ち上がりタイミングで、乗算器ＭＵＬ１は、選択器ＳＥＬ３から出力された減算値「ｄ_０」と、選択器ＳＥＬ４から出力された前回データ「ｂ_０」とを乗算し、乗算した値「ｑ_０（＝ｄ_０（＝ａ_０´−ａ_０）×ｂ_０）」を加算器ＡＤＤ１に出力する。この例では複数のデータａ_０´〜ａ_７´が連続するクロックサイクルで入力されるものの入力更新データ「ａ_０´」は最初の入力である。そのため、連続入力検出回路ＥＣ１はＬレベルのセレクト信号select5を選択器ＳＥＬ５に出力し、選択器ＳＥＬ５は、出力レジスターＯＵＴから出力される前回の積和演算値「Ｓ_７ ^８」を加算器ＡＤＤ１に入力する。 Then, at the next clock rising timing, the multiplier MUL1 multiplies the subtraction value “d ₀ ” output from the selector SEL3 by the previous data “b ₀ ” output from the selector SEL4. The value “q ₀ (= d ₀ (= a ₀ ′ −a ₀ ) × b ₀ )” is output to the adder ADD1. In this example, although a plurality of data a ₀ ′ to a ₇ ′ are input in successive clock cycles, the input update data “a ₀ ′” is the first input. Therefore, the continuous input detection circuit EC1 outputs an L-level select signal select5 to the selector SEL5, and the selector SEL5 outputs the previous product-sum operation value “S ₇ ⁸ ” output from the output register OUT to the adder ADD1. input.

次のクロックの立ち上がりタイミングで、加算器ＡＤＤ１は、乗算器ＭＵＬ１から入力された前回の積和演算結果からの差分値「ｑ_０」と前回の積和演算値「Ｓ_７ ^８」とを加算し、加算値「Ｓ_７ ^７（＝Ｓ_７ ^８＋ｑ_０）」を出力レジスターＯＵＴに出力する。これにより、出力レジスターＯＵＴに入力更新データ「ａ_０´」が反映された積和演算値「Ｓ_７ ^７」が書き込まれる。 At the next clock rise timing, the adder ADD1 adds the difference value “q ₀ ” from the previous product-sum operation result input from the multiplier MUL1 and the previous product-sum operation value “S ₇ ⁸ ”. The added value “S ₇ ⁷ (= S ₇ ⁸ + q ₀ )” is output to the output register OUT. As a result, the product-sum operation value “S ₇ ⁷ ” reflecting the input update data “a ₀ ′” is written to the output register OUT.

「ａ_０´」に続いて入力される入力更新データ「ａ_１´〜ａ_７´」についても同様である。ただし、「ａ_１´〜ａ_７´」の演算時は、連続クロックサイクルで入力更新データが発生する場合に該当するため、加算器ＡＤＤ１は、入力更新データ「ａ_ｉ´（ｉ＝１〜７）」に基づく差分値「ｑ_ｉ」と、加算器ＡＤＤ１から出力された前回の積和演算値「Ｓ_７ ^８−ｉ」とを加算し、出力レジスターＯＵＴに出力する。そして、遅延器ＤＥＬ５は、それぞれの入力更新データが反映された積和演算値が出力レジスターＯＵＴに出力される都度、出力レジスターＯＵＴに書き込み信号writeを出力する。 The same applies to the input update data “a ₁ ′ to a ₇ ′” input subsequent to “a ₀ ′”. However, since the calculation of “a ₁ ′ to a ₇ ′” corresponds to the case where the input update data is generated in continuous clock cycles, the adder ADD1 uses the input update data “a _i ′ (i = 1 to 7). The difference value “q _i ” based on “)” and the previous product-sum operation value “S ₇ ^8-i ” output from the adder ADD 1 are added and output to the output register OUT. The delay device DEL5 outputs a write signal “write” to the output register OUT each time a product-sum operation value reflecting each input update data is output to the output register OUT.

このように、本実施の形態例の積和演算器は、複数のデータを連続して入力する場合であっても、入力と演算のパイプライン処理によりそれぞれの入力更新データに基づく差分値を前回の積和演算値に順次加算することによって、それぞれ入力更新データに対応する更新後の積和演算値をその都度算出する。このため、本実施の形態例の積和演算器は、複数の入力更新データ「ａ_０´〜ａ_７´」がレジスターに蓄えられるのを待たずに演算を開始することができ、全ての入力更新データがレジスターに蓄えられるまでのデータ転送時間を要しない。これにより、図２の積和演算器が、積和演算値の演算にデータ「ａ_０´〜ａ_７´」が入力され始めてから２０サイクル要するのに対し、本実施の形態例における積和演算器は１２サイクルで積和演算値を算出することができる。 In this way, even when a plurality of data items are input consecutively, the product-sum operation unit of the present embodiment pipelines input and computation, sequentially adding the difference value for each input update data to the previous product-sum operation value and thereby calculating the updated product-sum operation value for each input update data as it arrives. The product-sum operation unit of the present embodiment can therefore start computing without waiting for the input update data “a_0′ to a_7′” to be stored in the registers, and it does not incur the data transfer time needed to store all input update data in the registers. As a result, whereas the product-sum operation unit of FIG. 2 requires 20 cycles from when the data “a_0′ to a_7′” start to be input, the product-sum operation unit of the present embodiment can calculate the product-sum operation value in 12 cycles.
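The sequence of intermediate results in the walk-through of FIG. 8 can be reproduced with a short Python sketch. It models only the arithmetic, not the clock-level pipeline: the function name `running_sums` is hypothetical, and the numeric values below are arbitrary illustration data, not values from the source.

```python
def running_sums(a_old, b, a_new):
    """Return the product-sum value after each successive update of 'a'.

    Starts from the previous product-sum of a_old and b, then adds
    q_i = (a_i' - a_i) * b_i for each updated element in turn.
    """
    s = sum(x * y for x, y in zip(a_old, b))  # previous product-sum value
    sums = []
    for old, new, b_i in zip(a_old, a_new, b):
        d_i = new - old        # subtractor SUB1
        q_i = d_i * b_i        # multiplier MUL1
        s += q_i               # adder ADD1 with the forwarded running value
        sums.append(s)
    return sums


print(running_sums([1, 1], [2, 3], [4, 5]))
```

The last entry equals the product-sum of the fully updated data, while every earlier entry is already a valid result for the elements updated so far.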

図９は、入力更新データとして「ａ_２´」のみが更新される場合の図６の積和演算器における動作波形を表す図である。図８と同様にして、レジスターＲ００〜Ｒ０７には前回データ「ａ_０〜ａ_７」が、レジスターＲ１０〜Ｒ１７には前回データ「ｂ_０〜ｂ_７」が、出力レジスターには前回の積和演算値「Ｓ_７」が予め保持される。「ｄ_ｉ」、「ｑ_ｉ」については図８と同様であり、「Ｓ_７」は前回の積和演算値を、「Ｓ_７´」は更新後の積和演算値を表す。 FIG. 9 is a diagram showing operation waveforms in the product-sum operation unit of FIG. 6 when only “a_2′” is updated as input update data. As in FIG. 8, the registers R00 to R07 hold the previous data “a_0 to a_7”, the registers R10 to R17 hold the previous data “b_0 to b_7”, and the output register holds the previous product-sum operation value “S_7” in advance. “d_i” and “q_i” are the same as in FIG. 8; “S_7” represents the previous product-sum operation value and “S_7′” represents the updated product-sum operation value.

入力信号inputとして「ａ_２´」が入力されると、選択器ＳＥＬ１は、入力更新データ「ａ_２´」に対応するupdateする組の番号s10（＝２）に基づいて、レジスターＲ０２が予め保持する前回データ「ａ_２」を次のクロックの立ち上がりタイミングで遅延器ＤＥＬ２と減算器ＳＵＢ１に出力する。同時に、ライト信号write_０_２に応答して、レジスターＲ０２に「ａ_２´」が書き込まれる。 When “a ₂ ′” is input as the input signal input, the selector SEL 1 holds in advance the register R 02 based on the number s 10 (= 2) of the set to be updated corresponding to the input update data “a ₂ ′”. The previous data “a ₂ ” to be output is output to the delay unit DEL2 and the subtracter SUB1 at the rising timing of the next clock. At the same time, “a ₂ ′” is written to the register R02 in response to the write signal write_0_2.

減算器ＳＵＢ１は、入力更新データ「ａ_２´」から、選択器ＳＥＬ１から出力された前回データ「ａ_２」を減算し、減算値「ｄ_２（＝ａ_２´−ａ_２）」を選択器ＳＥＬ３に出力する。そして、選択器ＳＥＬ３は、遅延器ＤＥＬ１からのＨレベルのセレクト信号select3に基づいて、減算器ＳＵＢ１からの出力「ｄ_２」を選択し乗算器ＭＵＬ１に出力する。一方、選択器ＳＥＬ４は、遅延器ＤＥＬ４からのＬレベルのセレクト信号select4に基づいて、遅延器ＤＥＬ３からの前回データ「ｂ_２」を選択し乗算器ＭＵＬ１に出力する。 The subtractor SUB1 subtracts the previous data “a ₂ ” output from the selector SEL1 from the input update data “a ₂ ′”, and selects the subtraction value “d ₂ (= a ₂ ′ −a ₂ )”. Output to SEL3. Then, the selector SEL3 selects the output “d ₂ ” from the subtractor SUB1 based on the H level select signal select3 from the delay device DEL1, and outputs it to the multiplier MUL1. On the other hand, the selector SEL4 selects the previous data “b ₂ ” from the delay device DEL3 based on the L level select signal select4 from the delay device DEL4, and outputs it to the multiplier MUL1.

そして、次のクロックの立ち上がりタイミングで、乗算器ＭＵＬ１は、選択器ＳＥＬ３から出力された減算値「ｄ_２」と、選択器ＳＥＬ４から出力された前回データ「ｂ_２」とを乗算し、乗算値「ｑ_２（＝ｄ_２（＝ａ_２´−ａ_２）×ｂ_２）」を加算器ＡＤＤ１に出力する。この例は、連続するクロックサイクルで入力更新データが発生しない場合に該当するため、選択器ＳＥＬ５は出力レジスターＯＵＴから出力される前回の積和演算値「Ｓ_７」を加算器ＡＤＤ１に入力する。そして、加算器ＡＤＤ１は、次のクロックの立ち上がりタイミングで、前回の積和演算値「Ｓ_７」と乗算器ＭＵＬ１からの出力「ｑ_２」とを加算し、加算値「Ｓ_７´（＝Ｓ_７＋ｑ_２）」を出力レジスターＯＵＴに出力する。 Then, at the next rising edge of the clock, the multiplier MUL1 multiplies the subtraction value “d_2” output from the selector SEL3 by the previous data “b_2” output from the selector SEL4, and outputs the multiplication value “q_2 (= d_2 (= a_2′ − a_2) × b_2)” to the adder ADD1. Because this example is a case in which input update data does not occur in consecutive clock cycles, the selector SEL5 inputs the previous product-sum operation value “S_7” from the output register OUT to the adder ADD1. At the next rising edge of the clock, the adder ADD1 adds the previous product-sum operation value “S_7” and the output “q_2” of the multiplier MUL1, and outputs the sum “S_7′ (= S_7 + q_2)” to the output register OUT.

このように、本実施の形態例の積和演算器は、演算対象の全ての値の要素データを積和演算し直す必要がなく、入力更新データ「ａ_２´」がレジスターＲ０２に蓄えられるのを待たずに演算を開始することができるため、特に、一部の要素「ａ₂´」だけが更新された場合に、より少ない処理サイクルで積和演算値を算出することができる。これにより、図２の積和演算器が積和演算値の演算にデータ「ａ_２´」が入力され始めてから１３サイクル要していたのに対し、本実施の形態例における積和演算器は５サイクルで積和演算値を算出することができる。 As described above, the product-sum operation unit of the present embodiment does not need to perform the product-sum operation again on the element data of all values to be calculated, and the input update data “a ₂ ′” is stored in the register R02. Since the calculation can be started without waiting for the calculation, the product-sum calculation value can be calculated with fewer processing cycles, particularly when only a part of the element “a ₂ ′” is updated. As a result, the product-sum operation unit in FIG. 2 takes 13 cycles after the data “a ₂ ′” starts to be input for the calculation of the product-sum operation value. The product-sum operation value can be calculated in 5 cycles.

従って、本実施の形態例の積和演算器は、演算対象の全ての要素データがレジスターに蓄えられるまでのデータ転送時間、及び、全ての要素データの演算時間を要しないことにより、図２の積和演算器に対してより少ない処理サイクルで積和演算値を算出することができる。 Therefore, the product-sum calculator of the present embodiment does not require the data transfer time until all the element data to be calculated are stored in the register and the calculation time of all the element data. The product-sum operation value can be calculated with fewer processing cycles than the product-sum operation unit.

＜第２の実施の形態例＞ <Second Embodiment>
第１の実施の形態例では、２つの値「ａ」「ｂ」の積和演算値を算出する積和演算器について述べたが、第２の実施の形態例では、さらに、値「ｃ」を加えた値「ａ」「ｂ」「ｃ」の積和演算値を演算する積和演算器について述べる。第２の実施の形態例の積和演算器は、次の式３に基づいて積和演算値を算出する。 The first embodiment described a product-sum operation unit that calculates the product-sum operation value of the two values “a” and “b”. The second embodiment describes a product-sum operation unit that calculates the product-sum operation value of the three values “a”, “b”, and “c”, in which a value “c” is added. The product-sum operation unit of the second embodiment calculates the product-sum operation value based on the following Equation 3.

式３における、値「ａ」、値「ａ´」及び、値「ｂ」については、式２と同様である。ただし、式３において、「Ｓ」は「ａ」と「ｂ」と「ｃ」の積和演算値（以下、前回の積和演算値）であり、「Ｓ´」は「ａ´」と「ｂ」と「ｃ」との積和演算値（以下、更新後の積和演算値）を表す。式３において、更新後の積和演算値「Ｓ´」は、前回の積和演算値「Ｓ」に対して、差分値「（ａ_ｊ´−ａ_ｊ）×ｂ_ｊ×ｃ_ｊ」が加算されることによって算出される。 The value “a”, the value “a ′”, and the value “b” in Equation 3 are the same as those in Equation 2. However, in Equation 3, “S” is the product-sum operation value of “a”, “b”, and “c” (hereinafter, the previous product-sum operation value), and “S ′” is “a ′” and “ The product-sum operation value of “b” and “c” (hereinafter, the product-sum operation value after update) is represented. In Equation 3, the updated product-sum operation value “S ′” is obtained by adding the difference value “(a _j ′ −a _j ) × b _j × c _j ” to the previous product-sum operation value “S”. To be calculated.
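Equation 3 itself is not reproduced in this text. Based on the description above, it can presumably be written as follows. This is a reconstruction from the surrounding definitions, assuming elements indexed 0 to n−1 and an update of only the j-th element of “a”, and it is not a copy of the original formula.

```latex
\begin{align}
S  &= \sum_{i=0}^{n-1} a_i \, b_i \, c_i \\
S' &= \sum_{i=0}^{n-1} a_i' \, b_i \, c_i
    = S + (a_j' - a_j)\, b_j \, c_j
\end{align}
```

The second equality holds because a_i′ = a_i for every i other than j, so all other terms of the two sums cancel.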

図１０は、式３において「ｎ＝８」とした場合の本実施の形態例における積和演算器の一例を表す図である。図６と同様に、同図の積和演算器では、値「ａ」の各要素ａ_０〜ａ_７はレジスターＲ００〜Ｒ０７に、値「ｂ」の各要素ｂ_０〜ｂ_７はレジスターＲ１０〜Ｒ１７に、値「ｃ」の各要素ｃ_０〜ｃ_７はレジスターＲ２０〜Ｒ２７に格納される。また、本実施の形態例における積和演算器は、入力レジスター（レジスターＲ００〜Ｒ２７）に前回データを、出力レジスターＯＵＴに前回の積和演算値を保持する。 FIG. 10 is a diagram showing an example of the product-sum operation unit of the present embodiment for “n = 8” in Equation 3. As in FIG. 6, in the product-sum operation unit of FIG. 10 the elements a_0 to a_7 of the value “a” are stored in the registers R00 to R07, the elements b_0 to b_7 of the value “b” in the registers R10 to R17, and the elements c_0 to c_7 of the value “c” in the registers R20 to R27. The product-sum operation unit of the present embodiment holds the previous data in the input registers (registers R00 to R27) and the previous product-sum operation value in the output register OUT.

また、図１０の積和演算器において、図６と同じ部分については同様の引用番号が付与される。同図の積和演算器は、さらに、レジスターＲ２０〜Ｒ２７、write_2_0からwrite_2_7の論理和s2、選択器ＳＥＬ６、ＳＥＬ７、減算器ＳＵＢ３、遅延器ＤＥＬ７、ＤＥＬ８を有する。write_2_0からwrite_2_7の論理和s2はレジスターＲ２０〜Ｒ２７のいずれかに格納される入力更新データの有無を表す。また、同図の積和演算器では、入力信号inputとして、値「ａ」「ｂ」「ｃ」のいずれかの更新データが入力される。 In the product-sum calculator of FIG. 10, the same reference numbers are assigned to the same parts as those of FIG. The product-sum operation unit shown in the figure further includes registers R20 to R27, logical sum s2 of write_2_0 to write_2_7, selectors SEL6 and SEL7, subtractor SUB3, delay units DEL7 and DEL8. The logical sum s2 from write_2_0 to write_2_7 represents the presence or absence of input update data stored in any of the registers R20 to R27. In addition, in the product-sum operation unit of the same figure, update data of values “a”, “b”, and “c” is input as the input signal input.

図１０の積和演算器の動作は、図６と同様である。図１０の積和演算器では、値「ａ」「ｂ」「ｃ」のいずれかのデータが入力されるため、例えば、値「ａ´」のデータが入力された場合、write_0_0からwrite_0_7の論理和s0はＨレベル、write_1_0からwrite_1_7の論理和s1、及びwrite_2_0からwrite_2_7の論理和s2はＬレベルとなる。従って、乗算器ＭＵＬ１は、値「ａ」に係る選択器ＳＥＬ３による減算器ＳＵＢ１からの減算値（ａ_ｊ´−ａ_ｊ）と、値「ｂ」に係る選択器ＳＥＬ４による遅延器ＤＥＬ３からの前回データ（ｂ_ｊ）と、値「ｃ」に係る選択器ＳＥＬ７による遅延器ＤＥＬ７からの前回データ（ｃ_ｊ）とを乗算する。 The operation of the product-sum operation unit of FIG. 10 is the same as that of FIG. 6. In the product-sum operation unit of FIG. 10, data for one of the values “a”, “b”, and “c” is input. When data of the value “a′” is input, for example, s0 (the logical sum of write_0_0 to write_0_7) is at the H level, while s1 (the logical sum of write_1_0 to write_1_7) and s2 (the logical sum of write_2_0 to write_2_7) are at the L level. Accordingly, the multiplier MUL1 multiplies together the subtraction value (a_j′ − a_j) from the subtractor SUB1, selected by the selector SEL3 for the value “a”, the previous data (b_j) from the delay unit DEL3, selected by the selector SEL4 for the value “b”, and the previous data (c_j) from the delay unit DEL7, selected by the selector SEL7 for the value “c”.

そして、乗算器ＭＵＬ１は、乗算結果を前回の積和演算値からの差分値として加算器ＡＤＤ１に出力する。この差分値は、上述した式３における「（ａ_ｊ´−ａ_ｊ）×ｂ_ｊ×ｃ_ｊ」に対応する。そして、加算器ＡＤＤ１は、当該差分値と、出力レジスターＯＵＴまたは前回の加算器ＡＤＤ１からの出力のいずれかを加算し、更新後の積和演算値として出力レジスターＯＵＴに出力する。 Then, the multiplier MUL1 outputs the multiplication result to the adder ADD1 as a difference value from the previous product-sum operation value. This difference value corresponds to “(a _j ′ −a _j ) × b _j × c _j ” in Expression 3 described above. The adder ADD1 adds the difference value and either the output register OUT or the previous output from the adder ADD1, and outputs the result as an updated product-sum operation value to the output register OUT.
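The three-operand update described above can be sketched in Python. This is a behavioral illustration under assumptions: the function name `update_three` and the dictionary layout are hypothetical, exactly one of the three values is updated per call as the text states, and the numbers are arbitrary example data.

```python
def update_three(values, out, which, j, new):
    """Update element j of values[which] and return the new product-sum.

    values maps 'a', 'b', 'c' to equal-length lists; out is the previous
    product-sum value (the content of the output register OUT).
    """
    delta = new - values[which][j]       # subtractor for the updated value
    for name, vec in values.items():
        if name != which:
            delta *= vec[j]              # previous data of the other two values
    values[which][j] = new               # register write
    return out + delta                   # adder ADD1


vals = {"a": [1, 2], "b": [3, 4], "c": [5, 6]}
out = sum(x * y * z for x, y, z in zip(vals["a"], vals["b"], vals["c"]))
out = update_three(vals, out, "a", 1, 5)
print(out)
```

The same loop would extend to four or more values by adding further entries to the dictionary.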

このように、本実施の形態例の積和演算器は、値「ａ」「ｂ」に、さらに値「ｃ」を加えた３つの値の積和演算値についても、図６の積和演算器と同様に、入力と演算のパイプライン処理により、前回の積和演算値に対して、更新された任意の要素データ（入力更新データ）に基づく前回の積和演算値との差分値を加算することによって、更新後の積和演算値を算出する。このため、積和演算器は、演算対象の全ての値「ａ」「ｂ」「ｃ」の要素データを積和演算し直す必要がなく、また、全ての入力更新データが対応するレジスターに蓄えられるのを待たずに演算を開始することができる。 As described above, the product-sum operation unit of the present embodiment also uses the product-sum operation of FIG. 6 for the product-sum operation value of three values obtained by adding the value “c” to the values “a” and “b”. In the same way as the unit, the difference between the previous product-sum operation value based on the updated arbitrary element data (input update data) is added to the previous product-sum operation value by pipeline processing of input and operation By doing so, an updated product-sum operation value is calculated. For this reason, the product-sum operation unit does not need to perform the product-sum operation again on the element data of all the values “a”, “b”, and “c” to be calculated, and stores all the input update data in the corresponding registers. The computation can be started without waiting for it.

これにより、本実施の形態例の積和演算器は、３つの値の積和演算値を算出する場合についても、演算対象の全ての要素データがレジスターに蓄えられるまでのデータ転送時間、及び、全ての要素データの演算時間を要しないことにより、より少ない処理サイクルで積和演算値を算出することができる。なお、本実施の形態例では３つの値の積和演算を行う場合について述べたが、本実施の形態例の積和演算器は、４つ以上の値の積和演算を行う場合にも有効である。 Thereby, the product-sum operation unit of the present embodiment also calculates the data transfer time until all the element data to be calculated are stored in the register, even when calculating the product-sum operation value of the three values, and By not requiring calculation time for all the element data, the product-sum operation value can be calculated with fewer processing cycles. In the present embodiment, the case where the product-sum operation of three values is performed has been described. However, the product-sum operation unit of the present embodiment is also effective when performing the product-sum operation of four or more values. It is.

＜第３の実施の形態例＞ <Third Embodiment>
第１の実施の形態例では、値「ａ」「ｂ」のいずれかの要素データが入力される積和演算器について述べたが、第３の実施の形態例では、値「ａ」「ｂ」の同組の要素データが同時に入力される積和演算器について述べる。 The first embodiment described a product-sum operation unit to which element data of either the value “a” or the value “b” is input. The third embodiment describes a product-sum operation unit to which element data of the same set of the values “a” and “b” are input simultaneously.

第３の実施の形態例の積和演算器は、次の式４に基づいて積和演算値を算出する。式４において、値「ａ」、「ａ´」は式２と同様であり、値「ｂ´＝（ｂ_０´，ｂ_１´，ｂ_２´，…，ｂ_ｎ−１´）」は前回積和演算された古い値「ｂ＝（ｂ_０，ｂ_１，ｂ_２，…，ｂ_ｎ−１）」である前回データに対してｊ番目の要素が更新されているものとする。そして、「Ｓ」は「ａ」と「ｂ」の積和演算値（以下、前回の積和演算値）であり、「Ｓ´」は「ａ´」と「ｂ´」の積和演算値（以下、更新後の積和演算値）を表す。 The product-sum operation unit of the third embodiment calculates the product-sum operation value based on the following Equation 4. In Equation 4, the values “a” and “a′” are the same as in Equation 2, and the value “b′ = (b_0′, b_1′, b_2′, …, b_(n−1)′)” is obtained by updating the j-th element of the previous data, that is, the old value “b = (b_0, b_1, b_2, …, b_(n−1))” used in the previous product-sum operation. “S” is the product-sum operation value of “a” and “b” (hereinafter, the previous product-sum operation value), and “S′” is the product-sum operation value of “a′” and “b′” (hereinafter, the updated product-sum operation value).

式４では、式２と同様に、更新後の積和演算値「Ｓ´」を、前回の積和演算値「Ｓ」に、「Ｓ´」と「Ｓ」の差分値を加算することによって求める。具体的に、更新後の積和演算値「Ｓ´」は、前回の積和演算値「Ｓ」から値「ａ」「ｂ」のｊ番目の要素の乗算値「ａ_ｊ×ｂ_ｊ」を減算し、値「ａ´」「ｂ´」のｊ番目の要素の乗算値「ａ_ｊ´×ｂ_ｊ´」を加算した値（Ｓ´＝Ｓ−（ａ_ｊ×ｂ_ｊ）＋（ａ_ｊ´×ｂ_ｊ´））である。従って、更新後の積和演算値「Ｓ´」は、前回の積和演算値「Ｓ」に差分値「−（ａ_ｊ×ｂ_ｊ）＋（ａ_ｊ´×ｂ_ｊ´）」が加算されることにより算出される。 In Equation 4, as in Equation 2, the updated product-sum operation value “S ′” is added to the previous product-sum operation value “S” by adding the difference value between “S ′” and “S”. Ask. Specifically, the updated product-sum operation value “S ′” is obtained by multiplying the multiplication value “a _j × b _j ” of the j-th element of the values “a” and “b” from the previous product-sum operation value “S”. A value obtained by subtracting and adding the multiplication value “a _j ′ × b _j ′” of the j-th element of the values “a ′” and “b ′” (S ′ = S− (a _j × b _j ) + (a _j '× b _j ')). Accordingly, the updated product-sum operation value “S ′” is obtained by adding the difference value “− (a _j × b _j ) + (a _j ′ × b _j ′)” to the previous product-sum operation value “S”. Is calculated.
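Equation 4 itself is not reproduced in this text. From the description above, it can presumably be written as follows. This is a reconstruction under the assumption that only the j-th elements of “a” and “b” change and that elements are indexed 0 to n−1. It is not a copy of the original formula.

```latex
\begin{align}
S  &= \sum_{i=0}^{n-1} a_i \, b_i \\
S' &= \sum_{i=0}^{n-1} a_i' \, b_i'
    = S - (a_j \times b_j) + (a_j' \times b_j')
\end{align}
```

When b_j′ = b_j, the added difference reduces to (a_j′ − a_j) × b_j, which is the difference value used in the first embodiment.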

式１と式４とを比較すると、値の一部の要素が更新される場合、式１に対して式４の演算量は少ない。従って、式４は、式１に対してより少ない処理サイクルで更新後の積和演算値「Ｓ´」を算出することができる。 Comparing Equation 1 with Equation 4, when only some elements of the values are updated, Equation 4 requires less computation than Equation 1. Equation 4 can therefore yield the updated product-sum operation value “S′” in fewer processing cycles than Equation 1.

図１１は、式４において「ｎ＝８」とした場合の本実施の形態例における積和演算器の一例を表す図である。図６の積和演算器と同様に、同図の積和演算器では、値「ａ」の各要素ａ_０〜ａ_７はレジスターＲ００〜Ｒ０７に格納され、値「ｂ」の各要素ｂ_０〜ｂ_７はレジスターＲ１０〜Ｒ１７に格納される。また、同図の積和演算器では、各入力レジスターに前回データを、出力レジスターＯＵＴに前回の積和演算値を保持する。また、同図の積和演算器には、値「ａ」「ｂ」の同じ組の更新後のデータが同時に入力されるため、値「ａ」に対応する入力信号input_0と、値「ｂ」に対応する入力信号input_1とが入力される。 FIG. 11 is a diagram showing an example of the product-sum operation unit of the present embodiment for “n = 8” in Equation 4. As in the product-sum operation unit of FIG. 6, in the product-sum operation unit of FIG. 11 the elements a_0 to a_7 of the value “a” are stored in the registers R00 to R07, and the elements b_0 to b_7 of the value “b” are stored in the registers R10 to R17. The unit holds the previous data in each input register and the previous product-sum operation value in the output register OUT. Because updated data for the same set of the values “a” and “b” are input simultaneously, the unit receives an input signal input_0 corresponding to the value “a” and an input signal input_1 corresponding to the value “b”.

図１１の積和演算器において、入力信号input_0として、値「ａ」に係る入力更新データが入力されると、入力更新データを書き込む前に、選択器ＳＥＬ１はupdateする組の番号s10に対応するレジスターが予め保持する前回データを選択器ＳＥＬ８と乗算器ＭＵＬ３とに出力する。同時に、入力更新データに対応するレジスターがライト信号write_0_0〜write_0_7に応答して、対応するレジスターＲ００〜Ｒ０７に入力更新データが書き込まれる。 In the product-sum calculator of FIG. 11, when input update data related to the value “a” is input as the input signal input_0, the selector SEL1 corresponds to the number s10 of the set to be updated before writing the input update data. The previous data held in the register in advance is output to the selector SEL8 and the multiplier MUL3. At the same time, the registers corresponding to the input update data are written in the corresponding registers R00 to R07 in response to the write signals write_0_0 to write_0_7.

同様にして、入力信号input_0と同じ組のデータであって、値「ｂ」に係るデータが入力信号input_１として入力されると、入力更新データを書き込む前に、選択器ＳＥＬ２はupdateする組の番号s10に対応するレジスターが予め保持する前回データを選択器ＳＥＬ９と乗算器ＭＵＬ３とに出力する。同時に、入力更新データに対応するレジスターがライト信号write_１_0〜write_１_7に応答して、対応するレジスターＲ１０〜Ｒ１７に入力更新データが書き込まれる。 Similarly, when the data of the same set as the input signal input_0 and the data related to the value “b” is input as the input signal input_1, the selector SEL2 updates the number of the set to be updated before writing the input update data. The previous data held in advance in the register corresponding to s10 is output to the selector SEL9 and the multiplier MUL3. At the same time, the registers corresponding to the input update data are written to the corresponding registers R10 to R17 in response to the write signals write_1_0 to write_1_7.

Next, based on the select signal select8 (the logical OR s0 of write_0_0 to write_0_7), the selector SEL8 selects either the input update data input_0 or the corresponding previous data output from the selector SEL1, and outputs it to the multiplier MUL2. Specifically, SEL8 outputs input_0 when select8 is at the H level, and the previous data from SEL1 when select8 is at the L level. The selector SEL9 operates in the same way.

As noted above, the data for the same set of the values “a” and “b” are input simultaneously in this embodiment, so the logical OR s0 of write_0_0 to write_0_7 and the logical OR s1 of write_1_0 to write_1_7 go to the H level at the same time. In this case, the multiplier MUL2 outputs the product of the two input update data input_0 and input_1, (a_j′ × b_j′), and the multiplier MUL3 outputs the product of the corresponding previous data, (a_j × b_j), each to the subtractor SUB4.

The subtractor SUB4 then subtracts the product of the previous data (a_j × b_j) output by MUL3 from the product of the two input update data (a_j′ × b_j′) output by MUL2, and outputs the result to the adder ADD1 as the difference from the previous product-sum operation result. This difference corresponds to the term “−(a_j × b_j) + (a_j′ × b_j′)” in Equation 4 above. The adder ADD1 adds the difference to either the value of the output register OUT or its own previous output, and outputs the sum to the output register OUT as the updated product-sum operation value.
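The data path just described can be modelled in a few lines of software. The following is a minimal sketch, not the patented circuit: the function name `update_pair` and the list-based registers are assumptions made for illustration, and pipeline timing is ignored.

```python
def update_pair(a, b, s, j, a_new, b_new):
    """Update element pair j of a and b in place; return the new product-sum.

    Hypothetical software model of FIG. 11 (names are illustrative only).
    """
    old_product = a[j] * b[j]      # MUL3: product of the previous data
    new_product = a_new * b_new    # MUL2: product of the input update data
    a[j], b[j] = a_new, b_new      # register write (R0j, R1j)
    diff = new_product - old_product   # SUB4: -(a_j*b_j) + (a_j'*b_j')
    return s + diff                # ADD1: previous product-sum plus difference


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [8, 7, 6, 5, 4, 3, 2, 1]
s = sum(x * y for x, y in zip(a, b))      # full product-sum, computed once
s = update_pair(a, b, s, 3, 10, -2)       # update pair j = 3 only
```

After the call, `s` equals the product-sum recomputed from scratch over the updated lists, but only two multiplications were needed instead of eight.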

In this way, when an arbitrary set of element data (input update data) of the values “a” and “b” is updated simultaneously, the product-sum operation unit of this embodiment pipelines input and computation and successively obtains the updated product-sum operation value by adding, to the previous product-sum operation value, the difference caused by the input update data. The unit therefore does not need to recompute the product-sum over all sets of element data, and it can start computing without waiting for all input update data to be stored in the corresponding registers.

As a result, when an arbitrary set of element data of the values “a” and “b” is updated simultaneously, the product-sum operation unit of this embodiment needs neither the data transfer time for storing all element data in the registers nor the computation time for all element data, and can therefore calculate the product-sum operation value in fewer processing cycles.

<Fourth embodiment>
The first embodiment described a product-sum operation unit with a single set of input registers and an output register. The fourth embodiment describes a product-sum operation unit that holds multiple sets of input registers and output registers (hereinafter, register sets). For example, the unit has m sets of input registers (registers R00_i to R17_i, 0 ≤ i ≤ m−1) and output registers (OUT_i). An arithmetic section shared by the register sets performs the product-sum operation on the values “a” and “b” of each register set.

FIG. 12 illustrates an example of the product-sum operation unit of the fourth embodiment. The unit in this figure additionally has an update register set number s20, which identifies the register set subject to the product-sum operation. The updated data for either the value “a” or the value “b” of the register set designated by s20 is input as the input signal input.

In the product-sum operation unit of FIG. 12, when data arrives as the input signal input, the register set subject to the product-sum operation is selected based on the update register set number s20. The previous data held in registers R00_i to R17_i of the designated register set is output through the selectors SEL1 and SEL2 to the delay units DEL2 and DEL3 and the subtractors SUB1 and SUB2, and the input update data is written into the corresponding register. From there, processing up to the computation of the difference value by the multiplier MUL1 is the same as in the product-sum operation unit of FIG. 6.

Next, the adder ADD1 adds the difference value output from the multiplier MUL1 to the previous product-sum operation value selected according to the select signal select11 output by the continuous input detection circuit EC2. EC2 determines not only whether data was input in consecutive clock cycles but also whether the update register set number s20 is the same as in the preceding clock cycle. When both conditions hold, EC2 outputs an H-level select signal select11 to the selector SEL11; when either or both do not hold, it outputs an L-level select11.

When select11 is at the H level, the selector SEL11 feeds the output of the adder ADD1 back into ADD1. When select11 is at the L level, SEL11 feeds ADD1 the previous product-sum operation value held in the output register OUT_i designated by the update register set number s20 supplied from the delay unit DEL9. In other words, ADD1 takes as the previous product-sum operation value its own previous output when input update data occur in consecutive clock cycles with the same register set number as in the preceding cycle, and the output of the output register OUT_i for the current register set number otherwise.

The delay unit DEL10 delays the update register set number s20 until the product-sum operation value for the input update data has been calculated, then selects the output register OUT_i designated by s20 and outputs the write signal write to it.
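As a rough software analogue of the fourth embodiment, the sketch below keeps m register sets and one shared update routine. The class name `MultiSetMac` and its method signature are hypothetical. Because the model is sequential, the forwarding performed by EC2 and SEL11 is not needed and is not modelled; each update simply reads the stored OUT_i.

```python
class MultiSetMac:
    """Hypothetical model: m register sets sharing one arithmetic section."""

    def __init__(self, sets):
        # sets: list of (a, b) pairs, one pair of vectors per register set
        self.a = [list(a) for a, _ in sets]
        self.b = [list(b) for _, b in sets]
        # OUT_i holds the product-sum of register set i
        self.out = [sum(x * y for x, y in zip(a, b)) for a, b in sets]

    def update(self, set_no, which, j, value):
        """Write `value` to element j of 'a' or 'b' in register set `set_no`."""
        a, b = self.a[set_no], self.b[set_no]
        if which == "a":
            diff = (value - a[j]) * b[j]   # SUB1 then MUL1
            a[j] = value
        else:
            diff = a[j] * (value - b[j])   # SUB2 then MUL1
            b[j] = value
        self.out[set_no] += diff           # ADD1, written back to OUT_i
        return self.out[set_no]


mac = MultiSetMac([([1, 2], [3, 4]), ([5, 6], [7, 8])])
mac.update(1, "a", 0, 0)   # only register set 1 is touched
```

Only the product-sum of the addressed register set changes; the other sets keep their stored values.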

In this way, the product-sum operation unit of this embodiment has multiple register sets, and even when it calculates the product-sum operation value of the register set to which an arbitrary updated element (input update data) belongs, it pipelines input and computation and obtains the updated product-sum operation value of that register set by adding, to the previous product-sum operation value, the difference caused by the input update data. The unit therefore does not need to recompute the product-sum over all element data, and it can start computing without waiting for all input update data to be stored in the corresponding registers.

As a result, the product-sum operation unit of this embodiment needs neither the data transfer time for storing all element data in the registers nor the computation time for all element data, and can therefore calculate the product-sum operation value of the values “a” and “b” of each register set quickly, in fewer processing cycles.

<Fifth embodiment>
FIGS. 13 and 14 illustrate an example of the product-sum operation unit of the fifth embodiment. This embodiment has the arithmetic circuit of FIG. 13 and m instances of the arithmetic circuit of FIG. 14. The circuit of FIG. 13 outputs the signals data_0 and data_i to each circuit of FIG. 14. The signal data_0 is the previous data of the first register set (registers R00 to R07) corresponding to the input update data, and data_i is the previous data of the second register set (registers R10_i to R17_i, 1 ≤ i ≤ m) corresponding to the input update data.

The fifth embodiment describes a product-sum operation unit that has one first register set (registers R00 to R07), m second register sets (registers R10_i to R17_i), and output registers OUT_i. For example, the unit takes the product-sum of each of the m second register sets with the first register set, which is shared by all of them, and outputs the m resulting product-sum operation values to the output registers OUT_i. This unit is useful, for example, for computing the product of a matrix and a vector. In that case, the first register set corresponds to the vector and the m second register sets correspond to an m × 8 matrix.

Equation 5 expresses the matrix-vector product when the second register sets form the m × 8 matrix (b_{1,0}, b_{1,1}, …, b_{m,7}) and the first register set forms the vector (a_0, a_1, …, a_7). In Equation 5, each row of the matrix is combined with the vector as “(b_{i,0} × a_0) + (b_{i,1} × a_1) + … + (b_{i,7} × a_7)”, and the result is calculated as output_i.
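The equation itself is not reproduced in this text. Based on the description above, it presumably has the following form (a reconstruction, with k introduced here as the column index):

$$
\begin{pmatrix} \mathrm{output}_1 \\ \vdots \\ \mathrm{output}_m \end{pmatrix}
=
\begin{pmatrix}
b_{1,0} & b_{1,1} & \cdots & b_{1,7} \\
\vdots  & \vdots  &        & \vdots  \\
b_{m,0} & b_{m,1} & \cdots & b_{m,7}
\end{pmatrix}
\begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_7 \end{pmatrix},
\qquad
\mathrm{output}_i = \sum_{k=0}^{7} b_{i,k}\, a_k \quad (1 \le i \le m)
$$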

FIG. 13 shows the arithmetic circuit that outputs the previous data data_0 and data_i of each register set corresponding to the input update data. A value for either the first register set (registers R00 to R07) or a second register set (registers R10_i to R17_i) is input to this circuit. When data arrives as the input signal input, the selectors SELm0 to SELmm output, before the input update data is written, the previous data data_0 and data_i held in the register of each register set identified by the update set number s10. At the same time, the register corresponding to the input update data responds to the write signals write_0_0 to write_m_7, and the input update data is written into it.

FIG. 14 shows the arithmetic circuit that calculates the product-sum operation value of the first register set and one of the m second register sets. For a matrix-vector product, for example, this circuit computes the product of one row of the matrix and the vector. The logical OR s0 of write_0_0 to write_0_7 indicates whether input update data is to be stored in any of registers R00 to R07, and the logical OR si of write_i_0 to write_i_7 indicates whether input update data is to be stored in any of registers R10_i to R17_i.

In the arithmetic circuit of FIG. 14, the circuit first receives, in addition to the input update data input, the previous data data_0 of the first register set and the previous data data_i of the second register set, both corresponding to the input update data. For example, when data for the first register set is input, the logical OR s0 of write_0_0 to write_0_7 is at the H level and the logical OR si of write_i_0 to write_i_7 is at the L level. In this case, the selector SEL3 outputs to the multiplier MUL1 the value supplied by the subtractor SUB1, namely the input update data input minus the previous data data_0, and the selector SEL4 outputs to MUL1 the previous data data_i of the second register set supplied by the delay unit DEL3.

The multiplier MUL1 multiplies the inputs from the selectors SEL3 and SEL4 and outputs the product to the adder ADD1 as the difference value. The adder ADD1 then adds the difference value to either the value of the output register OUT_i or its own previous output, and outputs the sum to OUT_i as the updated product-sum operation value.
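For the matrix-vector use case, the behaviour of the m row circuits can be sketched as follows. This is an illustrative model under assumptions: the class `MatVecUnit` and its method names are hypothetical, rows are indexed from 0 rather than 1, and it is assumed that an update to a matrix element affects only the row circuit whose si signal is asserted.

```python
class MatVecUnit:
    """Hypothetical model of FIGS. 13/14: shared vector, m matrix rows."""

    def __init__(self, matrix, vector):
        self.b = [list(row) for row in matrix]   # second register sets
        self.a = list(vector)                    # shared first register set
        # OUT_i: dot product of row i with the vector
        self.out = [sum(x * y for x, y in zip(row, self.a)) for row in self.b]

    def update_vector(self, k, value):
        """Vector element a_k changes: every row circuit adds its difference."""
        diff = value - self.a[k]                 # SUB1: input - data_0
        self.a[k] = value
        for i, row in enumerate(self.b):
            self.out[i] += diff * row[k]         # MUL1 and ADD1 of row circuit i

    def update_matrix(self, i, k, value):
        """Matrix element b_{i,k} changes: only row circuit i is affected."""
        self.out[i] += self.a[k] * (value - self.b[i][k])
        self.b[i][k] = value


unit = MatVecUnit([[1, 2], [3, 4]], [5, 6])
unit.update_vector(0, 7)      # one multiply-add per row
unit.update_matrix(1, 1, 0)   # one multiply-add in row 1 only
```

A vector update costs one multiply-add per row, and a matrix update costs one multiply-add in total, instead of a full m × 8 recomputation.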

FIG. 15 shows a circuit that connects the circuits of FIGS. 13 and 14 to a bus. Parts identical to the bus connection circuit of FIG. 7 carry the same reference numerals. The bus connection circuit of this embodiment additionally outputs, for each of the m second register sets (1 ≤ i ≤ m), the logical OR si of write_i_0 to write_i_7 to the corresponding arithmetic circuit (FIG. 14). FIG. 15 omits the circuits for s1 to s(m−1) and shows the circuit for sm as representative. The OR gates L00 to L07 each output to the encoder E2 the logical OR of the write signals write_0_i, write_1_i, …, write_m_i of the set assigned to that gate. The encoder E2 converts the OR gate that outputs an H-level signal into the numeric update set number s10 and outputs it to the arithmetic circuit of FIG. 13.

The comparator C30 checks whether the detected address is one of the addresses of the m output registers OUT_i (1 ≤ i ≤ m) of FIG. 14 and, on a match, outputs a read signal to the gate G3. The AND gate AND2 outputs to the selector SEL20 the select signal select20, which selects the output register OUT_i corresponding to the detected address. Specifically, with consecutive addresses assigned to the output registers OUT_i, AND2 outputs as select20 the logical AND of, for example, a predetermined number of low-order digits of the address and a value that holds “1” in each of those digits. Based on select20, SEL20 outputs the value output_i held in the output register OUT_i corresponding to the detected address to the data bus DB through the gate G3.
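The address decoding described here amounts to a range check followed by a bit mask. The sketch below is only an illustration: the function name, the base address, the one-address-per-register layout, the alignment of the base to the mask width, and the zero-based select value are all assumptions, none of which are specified in the text.

```python
def decode_output_select(address, base, count, low_bits):
    """Return the select value for an output register, or None if no match.

    Hypothetical parameters: `base` is the address of the first output
    register, `count` the number of output registers, `low_bits` the number
    of low-order address bits used for selection.
    """
    if not (base <= address < base + count):   # comparator C30: address match
        return None
    mask = (1 << low_bits) - 1                 # value holding 1 in each low bit
    return address & mask                      # AND2: select20


select20 = decode_output_select(0x104, base=0x100, count=8, low_bits=3)
```

With these assumed values, address 0x104 selects register index 4, and an address outside the range produces no read.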

In this way, the product-sum operation unit of this embodiment has multiple sets of second registers and output registers together with a shared first register set, and it is also effective for calculating the product-sum operation values of the multiple sets affected by an arbitrary updated element (input update data). By pipelining input and computation, the unit adds to each of the previous product-sum operation values the difference caused by the input update data, and thereby obtains the updated product-sum operation values of those sets. The unit therefore does not need to recompute the product-sum over all element data, and it can start computing without waiting for all input update data to be stored in the corresponding registers.

As a result, when calculating the product-sum operation values of multiple second register sets with a shared first register set, the product-sum operation unit of this embodiment needs neither the data transfer time for storing all element data in the registers nor the computation time for all element data, and can therefore calculate the product-sum operation value of each set in fewer processing cycles. The unit can thus, for example, compute the product of a matrix and a vector at high speed.

<Sixth embodiment>
The first through fifth embodiments described product-sum operation units that calculate a product-sum operation value. The arithmetic circuit of the present invention is not limited to product-sum operations, however, and is also effective for other operations. The sixth embodiment therefore describes an arithmetic circuit for an operation other than the product-sum operation.

This embodiment describes, as an example, an AND-XOR operation circuit that takes the logical AND of each pair of elements of the value “a = (a_0, a_1, a_2, …, a_{n−1})” and the value “b = (b_0, b_1, b_2, …, b_{n−1})” and combines the results by exclusive OR. An AND-XOR operation circuit generally calculates the AND-XOR operation value based on Equation 6.

In Equation 6, the function f returns the logical AND, and “A” is the value obtained by exclusive-ORing the logical ANDs of all pairs of elements of the values “a” and “b”. A circuit based on Equation 6 had to redo the logical AND and exclusive OR for all elements of “a” and “b” even when, for example, only some elements of “a” were updated.
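Equation 6 is not reproduced in this text. From the definitions of f and “A” given above, it presumably reads as follows (a reconstruction, with k introduced here as the element index):

$$
A = f(a_0, b_0) \oplus f(a_1, b_1) \oplus \cdots \oplus f(a_{n-1}, b_{n-1})
  = \bigoplus_{k=0}^{n-1} f(a_k, b_k),
\qquad f(x, y) = x \wedge y
$$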

The arithmetic circuit of this embodiment therefore obtains the AND-XOR operation value based on Equation 7. In Equation 7, the value “a′ = (a_0′, a_1′, a_2′, …, a_{n−1}′)” is the previous data, that is, the old value “a = (a_0, a_1, a_2, …, a_{n−1})” used in the previous operation, with its j-th element updated. “A” is the AND-XOR operation value of the values “a” and “b” (hereinafter, the previous AND-XOR operation value), and “A′” is the AND-XOR operation value of the values “a′” and “b” (hereinafter, the updated AND-XOR operation value).

In Equation 7, the updated AND-XOR operation value “A′” is obtained by exclusive-ORing the previous value “A” with the exclusive OR of f(a_j, b_j), the logical AND of the j-th elements of “a” and “b”, and f(a_j′, b_j), the logical AND of the j-th elements of “a′” and “b”: A′ = A ^ f(a_j, b_j) ^ f(a_j′, b_j). Because logical AND distributes over exclusive OR, this simplifies to A′ = A ^ f(a_j ^ a_j′, b_j). The updated value “A′” is therefore the exclusive OR of the previous value “A” and f(a_j ^ a_j′, b_j).
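Written out as a derivation, and assuming f(x, y) = x ∧ y as stated for Equation 6, the relation described for Equation 7 is:

$$
\begin{aligned}
A' &= A \oplus f(a_j, b_j) \oplus f(a_j', b_j) \\
   &= A \oplus \bigl((a_j \wedge b_j) \oplus (a_j' \wedge b_j)\bigr) \\
   &= A \oplus \bigl((a_j \oplus a_j') \wedge b_j\bigr) \\
   &= A \oplus f(a_j \oplus a_j',\, b_j)
\end{aligned}
$$

The first XOR with f(a_j, b_j) removes the old term from A, since x ⊕ x = 0, and the second adds the new term.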

Comparing Equations 6 and 7, Equation 7 requires less computation than Equation 6 when only some elements of a value are updated. Equation 7 can therefore produce the updated AND-XOR operation value “A′” in fewer processing cycles than Equation 6.

FIG. 16 illustrates an example of the circuit of this embodiment for the case n = 8 in Equation 7. In this circuit, the elements a_0 to a_7 of the value “a” are stored in registers R00 to R07, and the elements b_0 to b_7 of the value “b” are stored in registers R10 to R17. As before, registers R00 to R17 hold the previous data and the output register holds the previous AND-XOR operation value. The updated data for either the value “a” or the value “b” is input as the input signal input.

Comparing Equation 7 for the circuit of FIG. 16, “A ^ f(a_j ^ a_j′, b_j)”, with Equation 2 for the product-sum operation unit of FIG. 6, “S + (a_j′ − a_j) × b_j”, the logical AND of Equation 7 corresponds to the multiplication of Equation 2, and the exclusive OR of Equation 7 corresponds to the addition and subtraction of Equation 2. Accordingly, relative to the product-sum operation unit of FIG. 6, the circuit of FIG. 16 has exclusive-OR gates XOR1 and XOR2 in place of the subtractors SUB1 and SUB2, an AND gate AND3 in place of the multiplier MUL1, and an exclusive-OR gate XOR3 in place of the adder ADD1.

In the circuit of FIG. 16, when, for example, data of the value “a′” (a_j′) arrives as the input signal input, the exclusive-OR gate XOR1 computes the exclusive OR of the input update data input and the corresponding previous data (a_j) output from the selector SEL1, and outputs the result (a_j ^ a_j′) to the selector SEL3. Based on the H-level select signal select3 from the delay unit DEL1, the selector SEL3 selects the output of XOR1 and passes it to the AND gate AND3. Meanwhile, based on the L-level select signal select4, the selector SEL4 selects the input from the delay unit DEL3, namely the previous data (b_j) corresponding to the input update data, and passes it to AND3.

Next, the AND gate AND3 computes the logical AND of the output of XOR1 selected by SEL3 (a_j ^ a_j′) and the previous data from DEL3 selected by SEL4 (b_j), and outputs the result to the exclusive-OR gate XOR3. This output corresponds to “f(a_j ^ a_j′, b_j)” in Equation 7 above. XOR3 then computes the exclusive OR of the output of AND3 and either the value of the output register OUT or its own previous output, and outputs the result to the output register OUT as the updated AND-XOR operation value.
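The XOR1, AND3, XOR3 path can be checked against a full recomputation with a short model. This is a sketch for illustration: the function names `and_xor` and `update_a` are hypothetical, elements are modelled as Python integers used as bit vectors, and only the case where an element of “a” is updated is shown.

```python
from functools import reduce


def and_xor(a, b):
    """Full AND-XOR value: XOR of the bitwise AND of every element pair."""
    return reduce(lambda acc, pair: acc ^ (pair[0] & pair[1]), zip(a, b), 0)


def update_a(a, b, acc, j, a_new):
    """Update a[j] in place and return the new AND-XOR value incrementally."""
    diff = a[j] ^ a_new    # XOR1: previous data XOR input update data
    term = diff & b[j]     # AND3: f(a_j ^ a_j', b_j)
    a[j] = a_new           # register write
    return acc ^ term      # XOR3: combine with the previous AND-XOR value


a = [0b1100, 0b1010]
b = [0b1111, 0b0110]
acc = and_xor(a, b)                    # previous AND-XOR value
acc = update_a(a, b, acc, 1, 0b0101)   # update element j = 1 only
```

After the update, `acc` matches `and_xor(a, b)` computed from scratch on the modified list.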

In this way, when obtaining the AND-XOR operation value of the values “a” and “b”, the arithmetic circuit of this embodiment likewise pipelines input and computation and derives the updated AND-XOR operation value from the previous one. The circuit therefore does not need to recompute over all element data, and it can start computing without waiting for all input update data to be stored in the corresponding registers. As a result, it needs neither the data transfer time for storing all element data in the registers nor the computation time for all element data, and can calculate the AND-XOR operation value in fewer processing cycles.

The arithmetic circuit of the present invention is thus also effective for operations other than the product-sum operation. Such an arithmetic circuit can be generalized as follows.

First, the arithmetic circuit has registers that hold a first value (a) and a second value (b), each having N elements, and a register that holds an operation result value (S, A) obtained by applying a “second operation” (addition/subtraction, exclusive OR) to the N first-operation values produced by applying a “first operation” (multiplication, logical AND) to the first and second values.

The arithmetic circuit also has a first arithmetic unit that applies the second operation (addition/subtraction, exclusive OR) to one input element of the first value (a_j′, the input update data) and the stored element of the first value corresponding to it (a_j), and a second arithmetic unit that applies the first operation (multiplication, logical AND) to the output of the first arithmetic unit and the element of the second value (b_j) corresponding to the input element. It further has a third arithmetic unit that applies the second operation to the output of the second arithmetic unit and the operation result value (S, A) in the output register, and outputs the result to the output register.

The operation performed by the second arithmetic unit satisfies the distributive law with respect to (3), where (3) is the "second operation (addition/subtraction, exclusive OR)" applied to (1) and (2). Here, (1) is the value that cancels, under the second operation, the result of the "first operation (multiplication, logical AND)" on the element of the first value in the first register (a_j) corresponding to the input element of the first value and the element of the second value (b_j) corresponding to that element. (2) is the result of the "first operation (multiplication, logical AND)" on the input element of the first value (a_j') and the element of the second value (b_j) corresponding to that element.

The generalized configuration above is now related to the first embodiment and the sixth embodiment. In the product-sum operation unit of the first embodiment, the first operation is multiplication and the second operation is addition/subtraction. The operation result value "S" is the product-sum operation value obtained by multiplying the corresponding element data of the value "a" and the value "b" and then adding the products.

For (1), the result of multiplying "the element of the first value in the first register (a_j) corresponding to the input element of the first value" by "the element of the second value (b_j) corresponding to that element" is a_j × b_j. The value that cancels a_j × b_j under addition/subtraction is the value that gives 0 when added to a_j × b_j, namely -(a_j × b_j). For (2), the result of multiplying "the input element of the first value (a_j')" by "the element of the second value (b_j) corresponding to that element" is a_j' × b_j, so (3) is -(a_j × b_j) + (a_j' × b_j).

The operation of the second arithmetic unit in the first embodiment is (a_j' - a_j) × b_j, and this operation satisfies the distributive law with respect to (3), -(a_j × b_j) + (a_j' × b_j); that is, the two expressions are equal. The arithmetic circuit of the first embodiment therefore corresponds to the configuration above.
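As a minimal sketch of the update just described (in Python, with hypothetical function names `full_product_sum` and `update_product_sum` that do not appear in the specification), the stored value S can be corrected with one subtraction, one multiplication and one addition when a single element a_j changes. The final assertion checks that the result agrees with a full recomputation.

```python
def full_product_sum(a, b):
    """Recompute S = sum of a[i] * b[i] over all N elements."""
    return sum(x * y for x, y in zip(a, b))


def update_product_sum(s, a, b, j, a_new):
    """Return the updated S when a[j] is replaced by a_new.

    The three steps mirror the subtractor, multiplier and adder in the text.
    """
    diff = a_new - a[j]   # first subtractor: a_j' - a_j
    prod = diff * b[j]    # multiplier: (a_j' - a_j) * b_j
    a[j] = a_new          # the first register now holds the new element
    return s + prod       # adder: S + (a_j' - a_j) * b_j


a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
s = full_product_sum(a, b)              # 5 + 12 + 21 + 32 = 70
s = update_product_sum(s, a, b, 2, 10)  # 70 + (10 - 3) * 7 = 119
assert s == full_product_sum(a, b)
```

The update touches only index j, so its cost does not depend on N, whereas the full recomputation visits every element.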

Next, in the product-sum operation unit of the sixth embodiment, the first operation is logical AND and the second operation is exclusive OR. The operation result value "A" is the AND/XOR operation value obtained by taking the logical AND of the corresponding element data of the value "a" and the value "b" and then taking the exclusive OR of those results.

For (1), the result of the logical AND of "the element of the first value in the first register (a_j) corresponding to the input element of the first value" and "the element of the second value (b_j) corresponding to that element" is f(a_j, b_j). The value that cancels f(a_j, b_j) under exclusive OR is the value whose exclusive OR with f(a_j, b_j) is 0. Because the exclusive OR of two equal values is 0, the value that cancels f(a_j, b_j) is the same value, f(a_j, b_j). For (2), the result of the logical AND of "the input element of the first value (a_j')" and "the element of the second value (b_j) corresponding to that element" is f(a_j', b_j), so (3) is (f(a_j, b_j)) ^ (f(a_j', b_j)).

The operation of the second arithmetic unit in the sixth embodiment is f(a_j ^ a_j', b_j), and this operation satisfies the distributive law with respect to (3), (f(a_j, b_j)) ^ (f(a_j', b_j)); that is, the two expressions are equal. The arithmetic circuit of the sixth embodiment therefore corresponds to the configuration above.
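The same update can be sketched for the AND/XOR case. This is an illustrative Python example under the assumption that each element is a non-negative integer treated as a bit vector; the names `full_and_xor` and `update_and_xor` are hypothetical and not taken from the specification.

```python
from functools import reduce
from operator import xor


def full_and_xor(a, b):
    """Recompute A = XOR over all i of (a[i] AND b[i])."""
    return reduce(xor, (x & y for x, y in zip(a, b)), 0)


def update_and_xor(acc, a, b, j, a_new):
    """Return the updated A when a[j] is replaced by a_new."""
    diff = a[j] ^ a_new   # first arithmetic unit: a_j ^ a_j'
    term = diff & b[j]    # second arithmetic unit: f(a_j ^ a_j', b_j)
    a[j] = a_new          # the first register now holds the new element
    return acc ^ term     # third arithmetic unit: XOR into the stored result


a = [0b1100, 0b1010, 0b0111]
b = [0b1111, 0b0110, 0b0101]
acc = full_and_xor(a, b)                    # 0b1011
acc = update_and_xor(acc, a, b, 1, 0b0100)  # 0b1011 ^ 0b0110 = 0b1101
assert acc == full_and_xor(a, b)
```

Because XOR is its own inverse, the "cancelling" value and the original value coincide, which is why a single XOR of the old and new elements is enough.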

The product-sum operation unit of the first embodiment and the arithmetic circuit of the sixth embodiment are generalized as described above. The arithmetic circuit of the present invention is therefore also effective for other arithmetic circuits that fit the generalized configuration, and such circuits can likewise obtain, at high speed and in fewer processing cycles, the operation result for input data in which an arbitrary value has been updated.
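One possible way to express the generalized configuration in software is sketched below in Python. The class name `IncrementalAccumulator` and its parameters `op1`, `op2`, `cancel` and `identity` are assumptions made for illustration only; in particular, passing the cancelling operation as a function on a single element is a simplification that relies on `op1` distributing over `op2`.

```python
import operator
from functools import reduce


class IncrementalAccumulator:
    """Holds the op2-reduction of op1(a[i], b[i]) and updates it per element.

    Assumes op1 distributes over op2 and that cancel(x) is the inverse of x
    under op2 (negation for addition, x itself for exclusive OR).
    """

    def __init__(self, a, b, op1, op2, cancel, identity):
        self.a, self.b = list(a), list(b)
        self.op1, self.op2, self.cancel = op1, op2, cancel
        self.result = reduce(op2, (op1(x, y) for x, y in zip(a, b)), identity)

    def update_a(self, j, a_new):
        delta = self.op2(a_new, self.cancel(self.a[j]))  # first arithmetic unit
        term = self.op1(delta, self.b[j])                # second arithmetic unit
        self.result = self.op2(term, self.result)        # third arithmetic unit
        self.a[j] = a_new
        return self.result


# Product-sum instance: first operation is multiplication, second is addition.
mac = IncrementalAccumulator([1, 2, 3], [4, 5, 6],
                             operator.mul, operator.add, operator.neg, 0)
mac_result = mac.update_a(0, 7)    # 32 + (7 - 1) * 4 = 56

# AND/XOR instance: first operation is AND, second is exclusive OR.
ax = IncrementalAccumulator([0b11, 0b01], [0b10, 0b11],
                            operator.and_, operator.xor, lambda x: x, 0)
ax_result = ax.update_a(1, 0b10)   # 0b11 ^ ((0b10 ^ 0b01) & 0b11) = 0
```

Both instances reuse the same three-step update; only the pair of operations and the cancelling function differ.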

The embodiments described above are summarized in the following appendices.

(Appendix 1)
An arithmetic circuit comprising:
a first register that holds a first value having N elements;
a second register that holds a second value having N elements;
an output register that holds a product-sum operation value obtained by a product-sum operation on the first value and the second value;
a first subtractor that subtracts, from one input element of the first value, the element of the first value in the first register corresponding to that element;
a multiplier that multiplies the output of the first subtractor by the element of the second value in the second register corresponding to the input element of the first value; and
an adder that adds the output of the multiplier to the product-sum operation value in the output register and outputs the result to the output register.

(Appendix 2)
The arithmetic circuit according to Appendix 1, further comprising
a first selector that selects the element of the first value in the first register corresponding to the input element of the first value and outputs it to the first subtractor.

(Appendix 3)
The arithmetic circuit according to Appendix 1 or 2, further comprising
a second subtractor that subtracts, from one input element of the second value, the element of the second value in the second register corresponding to that element,
wherein the multiplier multiplies the output of the second subtractor by the element of the first value in the first register corresponding to the input element of the second value.

(Appendix 4)
The arithmetic circuit according to Appendix 3, wherein,
when an element of the first value is input, the output of the first subtractor is selected and output to the multiplier, and when an element of the second value is input, the element of the first value in the first register corresponding to that element is selected and output to the multiplier.

(Appendix 5)
The arithmetic circuit according to Appendix 1 or 2, further comprising
a third register that holds a third value having N elements,
wherein the output register holds a product-sum operation value obtained by a product-sum operation on the third value in addition to the first value and the second value, and
the multiplier multiplies the output of the first subtractor by the element of the second value in the second register corresponding to the input element of the first value and, in addition, by the element of the third value in the third register.

(Appendix 6)
The arithmetic circuit according to Appendix 1 or 2, wherein
the first register holds a plurality of sets of the first value,
the second register holds a plurality of sets of the second value,
the output register holds the product-sum operation value for the first and second values of each of the plurality of sets,
the first subtractor subtracts, from one input element of the first value of a first set among the plurality of sets, the element of the first value in the first register of the first set corresponding to that element,
the multiplier multiplies the output of the first subtractor by the element of the second value in the second register of the first set corresponding to the element of the first value of the first set, and
the adder adds the output of the multiplier to the product-sum operation value in the output register of the first set.

(Appendix 7)
The arithmetic circuit according to Appendix 1 or 2, wherein
the second register holds a plurality of sets of the second value,
the output register holds the product-sum operation value of the first value and each of the plurality of sets of the second value,
the multiplier multiplies, for each of the plurality of sets, the output of the first subtractor by the element of the second value in the second register of that set corresponding to the input element of the first value, and
the adder adds, for each of the plurality of sets, the output of the multiplier to the product-sum operation value in the output register of that set and outputs the result to that output register.

(Appendix 8)
An arithmetic circuit comprising:
a first register that holds a first value having N elements;
a second register that holds a second value having N elements;
an output register that holds a product-sum operation value obtained by a product-sum operation on the first value and the second value;
a first multiplier that multiplies one input element of the first value by the corresponding input element of the second value;
a second multiplier that multiplies the element of the first value in the first register corresponding to the input element of the first value by the element of the second value in the second register corresponding to the input element of the second value;
a subtractor that subtracts the output of the second multiplier from the output of the first multiplier; and
an adder that adds the output of the subtractor to the product-sum operation value in the output register and outputs the result to the output register.

(Appendix 9)
An arithmetic circuit comprising:
a first register that holds a first value having N elements;
a second register that holds a second value having N elements;
an output register that holds an operation result value obtained by applying a first operation to the first value and the second value to produce N first-operation values and then applying a second operation to those values;
a first arithmetic unit that performs the second operation on one input element of the first value and the element of the first value in the first register corresponding to that element;
a second arithmetic unit that performs the first operation on the output of the first arithmetic unit and the element of the second value in the second register corresponding to the input element of the first value; and
a third arithmetic unit that performs the second operation on the output of the second arithmetic unit and the operation result value in the output register and outputs the result to the output register,
wherein the operation performed by the second arithmetic unit satisfies the distributive law with respect to the second operation applied to (i) a value that cancels, under the second operation, the result of the first operation on the element of the first value in the first register corresponding to the input element of the first value and the element of the second value in the second register corresponding to that element, and (ii) the result of the first operation on the input element of the first value and the element of the second value in the second register corresponding to that element.

(Appendix 10)
The arithmetic circuit according to Appendix 9, wherein
the first operation is logical AND and the second operation is exclusive OR.

(Appendix 11)
The arithmetic circuit according to Appendix 9, wherein
the first operation is multiplication and the second operation is addition/subtraction.

11: CPU, 12: DMAC, 13: memory, 14: product-sum operation unit, 15: other hardware

Claims

An arithmetic circuit comprising:
a first register that holds a first value having N elements;
a second register that holds a second value having N elements;
an output register that holds a product-sum operation value obtained by a product-sum operation on the first value and the second value;
a first subtractor that subtracts, from one input element of the first value, the element of the first value in the first register corresponding to that element;
a multiplier that multiplies the output of the first subtractor by the element of the second value in the second register corresponding to the input element of the first value; and
an adder that adds the output of the multiplier to the product-sum operation value in the output register and outputs the result to the output register.

The arithmetic circuit according to claim 1, further comprising
a first selector that selects the element of the first value in the first register corresponding to the input element of the first value and outputs it to the first subtractor.

The arithmetic circuit according to claim 1 or 2, further comprising
a second subtractor that subtracts, from one input element of the second value, the element of the second value in the second register corresponding to that element,
wherein the multiplier multiplies the output of the second subtractor by the element of the first value in the first register corresponding to the input element of the second value.

An arithmetic circuit comprising:
a first register that holds a first value having N elements;
a second register that holds a second value having N elements;
an output register that holds a product-sum operation value obtained by a product-sum operation on the first value and the second value;
a first multiplier that multiplies one input element of the first value by the corresponding input element of the second value;
a second multiplier that multiplies the element of the first value in the first register corresponding to the input element of the first value by the element of the second value in the second register corresponding to the input element of the second value;
a subtractor that subtracts the output of the second multiplier from the output of the first multiplier; and
an adder that adds the output of the subtractor to the product-sum operation value in the output register and outputs the result to the output register.

An arithmetic circuit comprising:
a first register that holds a first value having N elements;
a second register that holds a second value having N elements;
an output register that holds an operation result value obtained by applying a first operation to the first value and the second value to produce N first-operation values and then applying a second operation to those values;
a first arithmetic unit that performs the second operation on one input element of the first value and the element of the first value in the first register corresponding to that element;
a second arithmetic unit that performs the first operation on the output of the first arithmetic unit and the element of the second value in the second register corresponding to the input element of the first value; and
a third arithmetic unit that performs the second operation on the output of the second arithmetic unit and the operation result value in the output register and outputs the result to the output register,
wherein the operation performed by the second arithmetic unit satisfies the distributive law with respect to the second operation applied to (i) a value that cancels, under the second operation, the result of the first operation on the element of the first value in the first register corresponding to the input element of the first value and the element of the second value in the second register corresponding to that element, and (ii) the result of the first operation on the input element of the first value and the element of the second value in the second register corresponding to that element.