JPH1173409A

JPH1173409A - Device and method for computing product sum

Info

Publication number: JPH1173409A
Application number: JP23431697A
Authority: JP
Inventors: Hideyuki Fujishima; 秀幸藤嶋
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1997-08-29
Filing date: 1997-08-29
Publication date: 1999-03-16

Abstract

PROBLEM TO BE SOLVED: To provide a product-sum arithmetic unit and a product-sum computing method capable of efficiently processing three-dimensional graphics data and image encoding data supplied in an arbitrary ratio. SOLUTION: The arithmetic unit has a graphics input buffer 1 that simultaneously outputs at most four graphics data items, an image encoding input buffer 2 that simultaneously outputs at most two rows of 8×8 pixel data, a transposition buffer 4 that simultaneously writes at most two columns of 8×8 pixel data and simultaneously outputs at most two rows of 8×8 pixel data, a graphics output buffer 5 to which at most four graphics data items are written simultaneously, an image encoding output buffer 6 to which at most two rows of 8×8 pixel data are written simultaneously, and four product-sum computing elements 3a to 3d, each capable of both floating-point and integer product-sum operations.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

TECHNICAL FIELD OF THE INVENTION: The present invention relates to a product-sum operation device and a product-sum operation method for three-dimensional graphics and for moving-picture encoding and decoding, both of which are regarded as important in multimedia signal processing.

[0002]

DESCRIPTION OF THE RELATED ART: In recent years, the field of multimedia signal processing has developed remarkably, and substantial development resources have been invested in it. In particular, a great deal of hardware has been developed to perform the product-sum operations required for natural-image encoding and for three-dimensional graphics processing. Natural-image encoding generally relies on DCT-based compression algorithms, whose computation consists mainly of multiplying a 4×4 matrix of integer parameters by a 4-element column vector. Three-dimensional graphics, which uses a homogeneous coordinate system, requires multiplying a 4×4 matrix of floating-point parameters by a 4-element column vector.
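The matrix-vector product described above reduces to four independent four-term product-sums, one per output element. The following is a minimal sketch of that idea; the function name `mat_vec_4x4` and all coefficient values are illustrative assumptions, not taken from the patent.

```python
def mat_vec_4x4(matrix, vector):
    """Multiply a 4x4 matrix by a 4-element column vector.

    Each output element is one four-term product-sum.
    """
    result = []
    for row in matrix:
        acc = 0
        for coeff, x in zip(row, vector):
            acc += coeff * x  # one multiply-accumulate step
        result.append(acc)
    return result


# Integer case (illustrative coefficients, not real DCT constants).
int_matrix = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]]
pixels = [10, 20, 30, 40]
int_result = mat_vec_4x4(int_matrix, pixels)

# Floating-point case: translate a vertex by (1, 2, 3) in homogeneous coordinates.
translate = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]
vertex = [0.5, -1.0, 2.0, 1.0]
moved = mat_vec_4x4(translate, vertex)
```

The same loop serves both data types, which is the property the shared product-sum hardware relies on: only the arithmetic inside the multiply-accumulate step differs.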

[0003] A conventional product-sum operation unit, which forms part of a conventional product-sum operation device, is described below. FIG. 7 is a block diagram of a floating-point product-sum operation unit as an example of a conventional product-sum operation unit. In FIG. 7, 31 is a multiplicand input register that stores the multiplicand of a floating-point operation separated into an exponent and a mantissa; 32 is a multiplier input register that, like the multiplicand input register 31, stores the multiplier separated into an exponent and a mantissa; 13 is an exponent adder that receives the exponent stored in the multiplicand input register 11 and the exponent stored in the multiplier input register 12 and adds them; 14 is a partial product generator that receives the mantissa stored in the multiplicand input register and the mantissa stored in the multiplier input register 12 and generates the partial products of the multiplication; 15 is an exponent comparator that receives the result of the exponent addition in the exponent adder 13 and the operation result stored in the output register 22 (described later) and compares the two exponents by computing their difference; 16 is a partial product adder that adds the partial products supplied by the partial product generator 14 to obtain a partial product sum (that is, the product of the multiplicand mantissa and the multiplier mantissa); 17 and 18 are selectors that each receive the partial product sum from the partial product adder 16 and the mantissa from the output register 22 and select between them according to the comparison result (that is, the exponent difference) of the exponent comparator 15; 19 is a shifter that shifts the data output by the selector 17 according to the comparison result of the exponent comparator 15; 20 is an adder that receives the output data of the shifter 19 and the output data of the selector 18 and adds the mantissas; 21 is a normalizer that receives the exponent sum output by the exponent adder 13 and the mantissa sum output by the adder 20 and normalizes the floating-point number; and 22 is an output register that holds the normalized floating-point number output by the normalizer 21, separated into an exponent and a mantissa.

[0004] The operation of the conventional product-sum operation unit configured as described above is explained with reference to FIG. 7. FIG. 8 is a flowchart illustrating the operation of the conventional floating-point product-sum operation unit.

[0005] In FIG. 8, the multiplicand and the multiplier are first input to the multiplicand input register 11 and the multiplier input register 12, respectively (S11). Next, the exponent adder 13 reads the exponent stored in the multiplicand input register 11 and the exponent stored in the multiplier input register 12 and adds them (S12). At the same time, the partial product generator 14 receives the mantissa stored in the multiplicand input register 11 and the mantissa stored in the multiplier input register 12 and generates the partial products for the multiplication (S12). Next, the partial product adder 16 adds all the partial products generated by the partial product generator 14 to obtain the partial product sum (S13). Meanwhile, the exponent comparator 15 computes, as the exponent difference, the difference between the exponent output by the exponent adder 13 and the exponent held in the output register 22 as the running sum, and compares the two (S13).

[0006] Next, if the comparison in step S13 shows that the exponent output by the exponent adder 13 is the larger (S14), the selector 17 selects the mantissa from the output register 22, which has the smaller exponent, the shifter 19 shifts it according to the exponent difference output by the exponent comparator 15 (S15), and the selector 18 selects the partial product sum output by the partial product adder 16. If the exponent from the output register 22 is the larger (S14), the selector 17 selects the partial product sum output by the partial product adder 16, the shifter 19 shifts it according to the exponent difference output by the exponent comparator 15 (S16), and the selector 18 selects the mantissa of the floating-point number with the larger exponent.

[0007] Next, the adder 20 adds the mantissa output by the shifter 19 and the mantissa output by the selector 18 (S17). The exponent output by the exponent adder 13 and the mantissa output by the adder 20 are then input to the normalizer 21 and normalized (S18), and the exponent and mantissa output by the normalizer 21 are stored in the output register 22 (S19).
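The flow of steps S11 to S19 can be sketched in software as follows. This is a simplified model under stated assumptions, not the circuit itself: mantissas are signed Python integers, the 24-bit mantissa width is assumed, zero is represented by an arbitrary very small exponent, and the larger of the two exponents is passed to the normalizer. All function names are hypothetical.

```python
import math

MANT_BITS = 24  # assumed mantissa width; not specified in the text


def to_fp(x):
    """Split a Python float into an integer mantissa and an exponent."""
    frac, exp = math.frexp(x)
    return int(frac * (1 << MANT_BITS)), exp - MANT_BITS


def from_fp(fp):
    mantissa, exponent = fp
    return math.ldexp(mantissa, exponent)


def normalize(mantissa, exponent):
    """Bring |mantissa| into [2**(MANT_BITS-1), 2**MANT_BITS)."""
    if mantissa == 0:
        return 0, -1000  # arbitrary small exponent standing in for zero
    while abs(mantissa) >= (1 << MANT_BITS):
        mantissa >>= 1
        exponent += 1
    while abs(mantissa) < (1 << (MANT_BITS - 1)):
        mantissa <<= 1
        exponent -= 1
    return mantissa, exponent


def conventional_mac(acc, multiplicand, multiplier):
    """One accumulate step: acc + multiplicand * multiplier."""
    acc_m, acc_e = acc
    (a_m, a_e), (b_m, b_e) = multiplicand, multiplier
    prod_e = a_e + b_e          # exponent adder (S12)
    prod_m = a_m * b_m          # partial products generated and added (S12, S13)
    diff = prod_e - acc_e       # exponent comparator (S13)
    if diff >= 0:               # the product has the larger exponent (S14)
        acc_m >>= diff          # shift the accumulator mantissa (S15)
        big_e = prod_e
    else:                       # the accumulator has the larger exponent (S14)
        prod_m >>= -diff        # shift the product mantissa (S16)
        big_e = acc_e
    total = acc_m + prod_m      # mantissa adder (S17)
    return normalize(total, big_e)  # normalize and store (S18, S19)


def accumulate(pairs):
    """Run the accumulate step once per (multiplicand, multiplier) pair."""
    acc = (0, -1000)
    for a, b in pairs:
        acc = conventional_mac(acc, to_fp(a), to_fp(b))
    return from_fp(acc)


total = accumulate([(1.5, 2.0), (0.25, 8.0), (-3.0, 0.5), (10.0, 0.125)])
```

Note that the compare, select, shift, and normalize steps all sit inside the per-term loop, which is the long computation path the description goes on to criticize.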

[0008] A conventional product-sum operation device generally includes both the floating-point product-sum operation unit described above and an ordinary integer product-sum operation unit, and uses a dedicated unit for each of the integer and floating-point product-sum operations required for multimedia processing.

[0009]

PROBLEMS TO BE SOLVED BY THE INVENTION: However, when the conventional product-sum operation device described above multiplies the 4×4 matrices and 4-element column vectors used in image encoding and three-dimensional graphics, it uses the integer product-sum operation unit for image encoding and the floating-point product-sum operation unit for three-dimensional graphics. When the workload consists mostly of image data, the floating-point unit is therefore hardly used, and the overall throughput is capped by the integer unit. Conversely, when three-dimensional graphics data dominates, the overall throughput is limited by the floating-point unit. In addition, because a conventional floating-point product-sum operation unit must support general product-sum operations with an arbitrary number of terms, it needs comparison and selection after the partial products are added. This lengthens the computation path, so raising the operating speed costs a large amount of hardware and power. Furthermore, data for image encoding and data for three-dimensional graphics differ greatly in precision, so when the floating-point unit is shared for image-encoding operations, much of its circuitry sits idle and is wasted.

[0010] Such a product-sum operation device and method are required to process data efficiently even when data for three-dimensional graphics (hereinafter, "three-dimensional graphics data") and data for image encoding (hereinafter, "image encoding data") are supplied in an arbitrary ratio, to perform product-sum operations efficiently on integers of differing precision, and to shorten the computation path so that operation is fast and requires little hardware.

[0011] An object of the present invention is to provide a product-sum operation device that can process data efficiently even when three-dimensional graphics data and image encoding data are supplied in an arbitrary ratio, can perform product-sum operations efficiently on integers of differing precision, and can shorten the computation path so that it is fast and requires less hardware; and to provide a product-sum operation method in which processing is performed efficiently even when three-dimensional graphics data and image encoding data are supplied in an arbitrary ratio, product-sum operations are performed efficiently on integers of differing precision, and the computation path is shortened so that operation is fast and requires less hardware.

[0012]

MEANS FOR SOLVING THE PROBLEMS: To solve the above problems, the product-sum operation device of the present invention comprises: a graphics input buffer capable of simultaneously outputting at most four items of three-dimensional graphics data; an image encoding input buffer capable of simultaneously outputting at most two rows of 8×8 pixel data; a transposition buffer capable of simultaneously writing at most two columns of 8×8 pixel data and simultaneously outputting at most two rows of 8×8 pixel data; a graphics output buffer to which at most four items of three-dimensional graphics data can be written simultaneously; an image encoding output buffer to which at most two rows of 8×8 pixel data can be written simultaneously; and four product-sum operation units, each capable of both floating-point and integer product-sum operations, which receive graphics data and image encoding data from the graphics input buffer and the image encoding input buffer and output graphics data and pixel data.

[0013] This yields a product-sum operation device that can process data efficiently even when three-dimensional graphics data and image encoding data are supplied in an arbitrary ratio.

[0014] The product-sum operation method of the present invention for solving the above problems uses four product-sum operation units, each capable of both floating-point and integer product-sum operations, and comprises: an input step of inputting a multiplicand and a multiplier; a maximum exponent detection step of adding exponents and detecting the maximum exponent; a mantissa partial product generation step of generating partial products of floating-point mantissas; an integer partial product generation step of generating the partial products of two pairs of low-precision integers separately in an upper bit region and a lower bit region; a partial product addition step of adding the generated partial products to obtain a partial product sum; an exponent difference calculation step of calculating the exponent difference between the detected maximum exponent and the exponent corresponding to the generated partial products; a shift step of shifting the partial product sum according to the calculated exponent difference; an accumulation step of accumulating the shifted partial product sums to obtain an accumulated value; a normalization step of normalizing according to the maximum exponent and the accumulated value; and a holding step of holding the output data of the normalization step. In a floating-point product-sum operation, the input step and the addition and detection step (that is, the maximum exponent detection step) are repeated four times, and then the mantissa partial product generation step, the partial product addition step, the exponent difference calculation step, the shift step, and the accumulation step are repeated four times. In an integer product-sum operation, the input step is repeated four times, and then the integer partial product generation step, the partial product addition step, and the accumulation step are repeated four times.

[0015] This yields a product-sum operation method in which processing is performed efficiently even when three-dimensional graphics data and image encoding data are supplied in an arbitrary ratio, and in which the computation path is shortened so that operation is fast and requires less hardware.

[0016]

DESCRIPTION OF THE PREFERRED EMBODIMENTS: The invention according to claim 1 comprises: a graphics input buffer capable of simultaneously outputting at most four items of three-dimensional graphics data; an image encoding input buffer capable of simultaneously outputting at most two rows of 8×8 pixel data; a transposition buffer capable of simultaneously writing at most two columns of 8×8 pixel data and simultaneously outputting at most two rows of 8×8 pixel data; a graphics output buffer to which at most four items of three-dimensional graphics data can be written simultaneously; an image encoding output buffer to which at most two rows of 8×8 pixel data can be written simultaneously; and four product-sum operation units, each capable of both floating-point and integer product-sum operations, which receive graphics data and image encoding data from the graphics input buffer and the image encoding input buffer and output graphics data and pixel data. When three-dimensional graphics data and image encoding data are supplied in an arbitrary ratio, the functions of the four product-sum operation units are switched so that processing is performed efficiently.

[0017] The invention according to claim 2 is the invention of claim 1 in which, during floating-point operation, each product-sum operation unit aligns digits using the maximum exponent of the four product terms and, during integer operation, separates the upper bits and lower bits of the partial product adder in the partial product addition for the multiplier, feeds different values into the upper bits and the lower bits, and performs the required sign extension. When three-dimensional graphics data and image encoding data, which differ greatly in precision, are supplied in an arbitrary ratio, the functions of the four product-sum operation units are switched so that processing is performed efficiently.

[0018] The invention according to claim 3 is the invention of claim 1 in which each product-sum operation unit comprises: a multiplicand input register and a multiplier input register that receive four pairs of floating-point numbers in succession; an exponent adder that adds the exponents of each pair as the four pairs are input; an exponent buffer that finds and temporarily stores the maximum of those exponents; a partial product generator that receives the mantissas of the four pairs one pair at a time and generates the partial products for the multiplication; a partial product adder that adds the partial products output by the partial product generator to obtain a partial product sum; a subtractor that subtracts the exponent corresponding to the generated partial products from the maximum exponent to obtain an exponent difference; a shifter that shifts the partial product sum output by the partial product adder toward the less significant digits according to the exponent difference; an accumulator that totals the partial product sums output in turn by the shifter to obtain an accumulated value; a normalizer that performs normalization according to the maximum exponent and the accumulated value output by the accumulator; and an output register that holds the output data of the normalizer. Because the maximum exponent is found while the multiplicands and multipliers are being input and the shift is applied before accumulation, no selection or comparison is needed inside the product-sum loop, so processing is fast and requires little hardware.
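The four-term scheme described above, in which the maximum exponent is found during input and each product is shifted before accumulation, can be sketched as follows. This is an illustrative software model only: the 24-bit mantissa width, the signed integer mantissas, and every function name are assumptions, not details from the patent.

```python
import math

MANT_BITS = 24  # assumed mantissa width; not specified in the text


def to_fp(x):
    """Split a Python float into an integer mantissa and an exponent."""
    frac, exp = math.frexp(x)
    return int(frac * (1 << MANT_BITS)), exp - MANT_BITS


def from_fp(fp):
    mantissa, exponent = fp
    return math.ldexp(mantissa, exponent)


def normalize(mantissa, exponent):
    """Bring |mantissa| into [2**(MANT_BITS-1), 2**MANT_BITS)."""
    if mantissa == 0:
        return 0, 0
    while abs(mantissa) >= (1 << MANT_BITS):
        mantissa >>= 1
        exponent += 1
    while abs(mantissa) < (1 << (MANT_BITS - 1)):
        mantissa <<= 1
        exponent -= 1
    return mantissa, exponent


def four_term_product_sum(pairs):
    """Sum of products; pairs holds ((a_m, a_e), (b_m, b_e)) tuples."""
    # Input phase: add the exponents of each pair and keep the largest.
    prod_exps = [a_e + b_e for (_, a_e), (_, b_e) in pairs]
    max_exp = max(prod_exps)                 # exponent buffer
    acc = 0
    for ((a_m, _), (b_m, _)), e in zip(pairs, prod_exps):
        product = a_m * b_m                  # partial product generator and adder
        shift = max_exp - e                  # subtractor: exponent difference
        acc += product >> shift              # shifter, then accumulator
    return normalize(acc, max_exp)           # one normalization at the end


terms = [(1.5, 2.0), (0.25, 8.0), (-3.0, 0.5), (10.0, 0.125)]
result = from_fp(four_term_product_sum([(to_fp(a), to_fp(b)) for a, b in terms]))
```

In this model the loop body contains only a multiply, a subtract, a shift, and an add; the comparison and selection of the conventional design are replaced by the single `max` taken during input.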

[0019] The invention according to claim 4 is the invention of claim 3 in which, during integer operation, the partial product generator feeds two pairs of input integers separately into the upper bit region and the lower bit region of the partial product adder. For the pair assigned to the upper bit region, the multiplicand is sign-extended with the portion corresponding to the lower bit region masked; for the other pair, assigned to the lower bit region, sign extension is performed with the portion corresponding to the upper bit region masked. Product-sum operations are thus performed efficiently even on integers whose precision differs greatly from that of floating-point numbers.
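One way to picture the split adder is a single wide word whose upper and lower halves each accumulate an independent signed product-sum. The sketch below models that idea in software; the 8-bit operand width, the 20-bit field width, and all names are assumptions chosen for illustration, and the real circuit may differ in how it masks and extends signs.

```python
OP_BITS = 8      # assumed width of each low-precision integer operand
FIELD_BITS = 20  # assumed width of each half of the shared adder
FIELD_MASK = (1 << FIELD_BITS) - 1


def partial_products(multiplicand, multiplier):
    """Two's-complement partial products, each confined to one field."""
    extended = multiplicand & FIELD_MASK       # sign-extend, masked to the field
    bits = multiplier & ((1 << OP_BITS) - 1)
    rows = []
    for i in range(OP_BITS):
        if (bits >> i) & 1:
            row = (extended << i) & FIELD_MASK
            if i == OP_BITS - 1:
                row = -row & FIELD_MASK        # the sign bit has negative weight
            rows.append(row)
    return rows


def split_add(x, y):
    """Add two packed words with the carry chain cut between the fields."""
    lo = ((x & FIELD_MASK) + (y & FIELD_MASK)) & FIELD_MASK
    hi = ((x >> FIELD_BITS) + (y >> FIELD_BITS)) & FIELD_MASK
    return (hi << FIELD_BITS) | lo


def packed_product_sum(upper_pairs, lower_pairs):
    """Accumulate two independent product-sums in one packed word."""
    acc = 0
    for (a_hi, b_hi), (a_lo, b_lo) in zip(upper_pairs, lower_pairs):
        for row in partial_products(a_hi, b_hi):
            acc = split_add(acc, row << FIELD_BITS)   # upper bit region
        for row in partial_products(a_lo, b_lo):
            acc = split_add(acc, row)                 # lower bit region
    return acc


def unpack(word):
    """Return the (upper, lower) fields as signed integers."""
    def signed(v):
        return v - (1 << FIELD_BITS) if v >> (FIELD_BITS - 1) else v
    return signed(word >> FIELD_BITS), signed(word & FIELD_MASK)


upper = [(3, -4), (5, 6), (-7, 2), (1, 1)]
lower = [(-1, -1), (2, 3), (-8, 9), (10, -10)]
sums = unpack(packed_product_sum(upper, lower))
```

Masking each partial product to its own field is what keeps the sign extension of one operand from spilling into the other half of the word.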

[0020] The invention according to claim 5 is a product-sum operation method that uses four product-sum operation units, each capable of both floating-point and integer product-sum operations, and comprises: an input step of inputting a multiplicand and a multiplier; a maximum exponent detection step of adding exponents and detecting the maximum exponent; a mantissa partial product generation step of generating partial products of floating-point mantissas; an integer partial product generation step of generating the partial products of two pairs of low-precision integers separately in an upper bit region and a lower bit region; a partial product addition step of adding the generated partial products to obtain a partial product sum; an exponent difference calculation step of calculating the exponent difference between the detected maximum exponent and the exponent corresponding to the generated partial products; a shift step of shifting the partial product sum according to the calculated exponent difference; an accumulation step of accumulating the shifted partial product sums to obtain an accumulated value; a normalization step of normalizing according to the maximum exponent and the accumulated value; and a holding step of holding the output data of the normalizer. In a floating-point product-sum operation, the input step and the addition and detection step (that is, the maximum exponent detection step) are repeated four times, and then the mantissa partial product generation step, the partial product addition step, the exponent difference calculation step, the shift step, and the accumulation step are repeated four times. In an integer product-sum operation, the input step is repeated four times, and then the integer partial product generation step, the partial product addition step, and the accumulation step are repeated four times. When three-dimensional graphics data and image encoding data are supplied in an arbitrary ratio, the functions of the four product-sum operation units are switched so that processing is performed efficiently, and the computation path (critical path) is shortened so that operation is fast and requires less hardware.

[0021] The invention according to claim 6 is the invention of claim 5 in which, in the integer partial product generation step during integer operation, partial products are generated by feeding two pairs of input integers separately into the upper bit region and the lower bit region of the partial product adder. For the pair assigned to the upper bit region, the multiplicand is sign-extended with the portion corresponding to the lower bit region masked; for the other pair, assigned to the lower bit region, sign extension is performed with the portion corresponding to the upper bit region masked. Product-sum operations are thus performed efficiently even on integers whose precision differs greatly from that of floating-point numbers.

[0022] An embodiment of the present invention is described below with reference to FIGS. 1 to 5. (Embodiment) FIG. 1 is a block diagram of a product-sum operation device according to the embodiment of the present invention. In FIG. 1, 1 is a graphics input buffer, 2 is an image encoding input buffer, 3a to 3d are product-sum operation units, 4 is a transposition buffer, 5 is a graphics output buffer, and 6 is an image encoding output buffer.

[0023] The functions of the product-sum operation device configured as described above are explained next. The graphics input buffer 1 supplies three-dimensional graphics data and can output four items of vertex data simultaneously. The image encoding input buffer 2 supplies image encoding data (that is, pixel data) and can output two rows of 8×8 pixel data simultaneously. Buffers 1 and 2 are each connected to every product-sum operation unit 3a to 3d. Each product-sum operation unit 3a to 3d selects one of the graphics input buffer 1, the image encoding input buffer 2, and the transposition buffer 4, and receives its input data from the selected buffer. Likewise, each unit selects one of the graphics output buffer 5, the image encoding output buffer 6, and the transposition buffer 4, and writes its output data to the selected buffer. The transposition buffer 4 can output two rows of 8×8 pixel data simultaneously and can likewise accept two columns simultaneously. The graphics output buffer 5 can write four inputs simultaneously, and the image encoding output buffer 6 can write two inputs simultaneously.

[0024] Next, the operation of the product-sum operation device according to the embodiment is described with reference to FIG. 2. FIGS. 2(a) and 2(b) and FIGS. 3(a) and 3(b) illustrate the operation of the device. This embodiment assumes that three-dimensional graphics data and image encoding data are supplied in an arbitrary ratio. Under this assumption, the functions of the four product-sum operation units 3a to 3d are set in the ratios 4:0, 3:1, 2:2, and 0:4, and each configuration is shown to be workable.
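As a rough illustration of setting the unit functions by ratio, the sketch below maps a count of graphics units to a mode per unit. The rule that graphics work takes the lowest-lettered units first, the mode labels, and the function name `assign_units` are all assumptions made for this example.

```python
UNITS = ("3a", "3b", "3c", "3d")


def assign_units(graphics_units):
    """Return a mode per unit for a graphics:image ratio of g:(4 - g).

    Assumption for illustration: graphics work takes the first units in order.
    """
    if not 0 <= graphics_units <= len(UNITS):
        raise ValueError("graphics_units must be between 0 and 4")
    return {
        unit: "float" if index < graphics_units else "integer"
        for index, unit in enumerate(UNITS)
    }


three_to_one = assign_units(3)  # the 3:1 configuration
```

A controller built this way would simply call `assign_units` again whenever the mix of incoming data changes.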

[0025] When only three-dimensional graphics data is processed, functions are assigned to the product-sum operation units 3a to 3d in the 4:0 combination, as shown in FIG. 2(a). The graphics input buffer 1 then supplies four data items simultaneously to the four units 3a to 3d, and each unit performs the operations for three-dimensional graphics. The results are written simultaneously to the graphics output buffer 5.

[0026] When three-dimensional graphics data and image encoding data are mixed, the assignment of the product-sum calculators 3a to 3d is changed as required. When the assignment is 3:1, as shown in FIG. 2(b), the graphics input buffer 1 outputs vertex data (graphics data) to the three product-sum calculators 3a to 3c, and the image encoding input buffer 2 supplies data to the one product-sum calculator 3d. For the graphics data, each of the product-sum calculators 3a to 3c performs its calculation as in the 4:0 case and writes the result to the graphics output buffer 5. For the image encoding data, the product-sum calculator 3d performs a one-dimensional DCT operation, and the result is stored in the transposition buffer 4. After the one-dimensional DCT operation for eight rows is completed, the transposed data is read from the transposition buffer 4, the one-dimensional DCT operation is performed for eight columns, and the results are sequentially written to the image encoding output buffer 6.
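The row pass, transposition, and column pass described above can be sketched in software as follows. This is only an illustration under stated assumptions: the text does not give the DCT coefficients or the number format, so the sketch uses an orthonormal floating-point DCT-II, and the names `dct_1d`, `transpose`, and `dct_2d` are hypothetical. Whether the hardware restores the original orientation after the column pass is also not stated; the final transpose here is an assumption made for readability.

```python
import math

N = 8  # block size from the text (8 x 8 pixel data)


def dct_1d(row):
    """Orthonormal DCT-II of one length-N row, written as N product-sum loops."""
    out = []
    for k in range(N):
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        acc = 0.0
        for n, x in enumerate(row):
            # One multiply-accumulate per input sample.
            acc += x * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
        out.append(scale * acc)
    return out


def transpose(block):
    """Stand-in for the transposition buffer: written by rows, read by columns."""
    return [list(col) for col in zip(*block)]


def dct_2d(block):
    row_pass = [dct_1d(r) for r in block]        # 1-D DCT on the eight rows
    transposed = transpose(row_pass)             # read back transposed
    col_pass = [dct_1d(r) for r in transposed]   # 1-D DCT on the eight columns
    return transpose(col_pass)                   # assumed: restore orientation
```

For a constant block the only nonzero output is the DC term, which is a quick way to check that both passes and the transposition are wired together correctly.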

[0027] When the assignment of the product-sum calculators 3a to 3d is 2:2, as shown in FIG. 3(a), the graphics input buffer 1 supplies data to the two product-sum calculators 3a and 3b. The image encoding input buffer 2 supplies data to the product-sum calculator 3d, and the data output from the product-sum calculator 3d is written to the transposition buffer 4. The data written to the transposition buffer 4 is input to the product-sum calculator 3c, and the result is written to the image encoding output buffer 6.

[0028] For the graphics data, each of the product-sum calculators 3a and 3b performs its calculation as in the 4:0 case and writes the result to the graphics output buffer 5. For the image encoding data, one product-sum calculator 3d performs the one-dimensional DCT operation and stores the result in the transposition buffer 4. After the one-dimensional DCT operation for one block is completed, the other product-sum calculator 3c reads data from the transposition buffer 4 and starts the one-dimensional DCT operation on the column components. The results of this one-dimensional DCT operation are sequentially written to the image encoding output buffer 6. While this column-component operation is in progress, the product-sum calculator 3d simultaneously executes the one-dimensional DCT operation on the row components of the next block.
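The overlap between the two image-encoding calculators in the 2:2 case can be pictured as a two-stage pipeline. The sketch below is a hypothetical schedule, not something taken from the figures: it assumes each pass takes one time slot per block and that the transposition buffer can be read for one block while the next block is being written, which the text does not state explicitly. The function name `schedule_2_2` and the dictionary keys are illustrative.

```python
def schedule_2_2(num_blocks):
    """Per time slot, the block index handled by each image-encoding calculator.

    Calculator 3d runs the row pass; calculator 3c runs the column pass on the
    block that 3d finished in the previous slot. None means the unit is idle.
    """
    slots = []
    for t in range(num_blocks + 1):
        row_block = t if t < num_blocks else None  # 3d: row components
        col_block = t - 1 if t >= 1 else None      # 3c: column components
        slots.append({"3d_row_pass": row_block, "3c_column_pass": col_block})
    return slots
```

With three blocks, calculator 3c is idle only in the first slot and calculator 3d only in the last, so in steady state both units are busy in every slot.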

[0029] When all the data is image encoding data, the assignment of the product-sum calculators 3a to 3d is set to 0:4, as shown in FIG. 3(b). In this case, the image encoding input buffer 2 outputs the row-component data, divided into even rows and odd rows, to the two product-sum calculators 3b and 3d. Each of these two product-sum calculators 3b and 3d performs the one-dimensional DCT operation, and the results are stored in the transposition buffer 4. After the one-dimensional DCT operation on the row components of one block is completed, the remaining two product-sum calculators 3a and 3c read data from the transposition buffer 4, start the one-dimensional DCT operation on the column components for the even rows and the odd rows respectively, and write the results to the image encoding output buffer 6.

[0030] In this way, processing can be performed efficiently when three-dimensional graphics data and image encoding data are supplied at an arbitrary ratio.
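As a compact summary of the four configurations walked through above, the role of each calculator can be written as a lookup table keyed by (graphics units, image-encoding units). This is a sketch for the reader's orientation only: the table name `ALLOCATION`, the role strings, and the helper `units_with_role` are hypothetical, and only the four ratios actually described in the text are listed.

```python
# Role of each product-sum calculator per configuration, as described in the text.
ALLOCATION = {
    (4, 0): {"3a": "graphics", "3b": "graphics", "3c": "graphics", "3d": "graphics"},
    (3, 1): {"3a": "graphics", "3b": "graphics", "3c": "graphics",
             "3d": "dct_rows_then_columns"},
    (2, 2): {"3a": "graphics", "3b": "graphics",
             "3c": "dct_columns", "3d": "dct_rows"},
    (0, 4): {"3a": "dct_columns", "3b": "dct_rows",
             "3c": "dct_columns", "3d": "dct_rows"},
}


def units_with_role(ratio, role):
    """Names of the calculators assigned the given role for a given ratio."""
    return [unit for unit, r in ALLOCATION[ratio].items() if r == role]
```

For example, `units_with_role((0, 4), "dct_rows")` returns the two calculators that receive the even and odd rows from the image encoding input buffer 2.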

[0031] FIG. 4 is a block diagram showing the product-sum calculators 3a, 3b, 3c, and 3d that constitute the product-sum operation device of FIG. 1. In FIG. 4, 31 is a multiplicand input register, 32 is a multiplier input register, 33 is an exponent adder, 34 is a partial product generator, 35 is an exponent buffer, 36 is a partial product adder, 37 is a subtractor, 38 is a shifter, 39 is an accumulator, 40 is a normalizer, and 41 is an output register.

[0032] The functions of the product-sum calculator configured as described above will now be described. The multiplicand input register 31 stores the multiplicands. It is written from outside, and each value is stored separately as an exponent and a mantissa. Specifically, the multiplicands are the elements of a four-element column vector. The multiplier input register 32 stores the multipliers: the values of one row of the matrix parameters, each stored separately as an exponent and a mantissa. Specifically, the multipliers are the elements of a 4 × 4 matrix, input one row at a time. The exponent adder 33 adds the exponent stored in the multiplicand input register 31 and the exponent stored in the multiplier input register 32.

[0033] The partial product generator 34 generates partial products for multiplication from the mantissa stored in the multiplicand input register 31 and the mantissa stored in the multiplier input register 32. The exponent buffer 35 stores the exponents output from the exponent adder 33 while determining the maximum exponent among them. The partial product adder 36 adds the partial products input from the partial product generator 34 and outputs a partial product addition value. The subtractor 37 receives the maximum exponent output from the exponent buffer 35 and the exponent corresponding to the partial product currently being calculated, and outputs their difference (the exponent difference). The shifter 38 shifts the mantissa product (the partial product addition value) output from the partial product adder 36 according to the exponent difference output from the subtractor 37. The accumulator 39 adds the mantissa output from the shifter 38 to the mantissa output from the accumulator 39 itself. The normalizer 40 performs normalization on the maximum exponent output from the exponent buffer 35 and the mantissa output from the accumulator 39. The output register 41 stores the normalized value output from the normalizer 40 separately as an exponent and a mantissa.

[0034] The operation of the product-sum calculator of FIG. 4, which has the configuration and functions described above, will be described with reference to FIG. 5. FIG. 5 is a flowchart showing the operation of the product-sum calculator of FIG. 4.

[0035] In FIG. 5, first, values are input to the exponent 0 and mantissa 0 storage locations of the multiplicand input register 31 and, likewise, to the exponent 0 and mantissa 0 storage locations of the multiplier input register 32 (S1, input step). Next, the exponent adder 33 adds exponent 0 of the multiplicand input register 31 and exponent 0 of the multiplier input register 32. The exponent buffer 35 compares this sum with the current maximum exponent and stores the larger of the two as the maximum exponent (S2, maximum exponent detection step).

[0036] Steps 1 and 2 are repeated four times, so that the maximum exponent storage location holds the largest of the exponents corresponding to the four partial products.

[0037] Next, the partial product generator 34 generates partial products from the mantissa 0 values input to the multiplicand input register 31 and the multiplier input register 32 (S3, partial product generation step), and all of these partial products are added to give a partial product addition value (S4, partial product addition step). This partial product addition value is output from the partial product adder 36. Next, the subtractor 37 obtains the exponent difference, that is, the difference between the maximum exponent obtained in step 2 and the exponent corresponding to the partial product currently being calculated (S5, exponent difference calculation step). According to this exponent difference, the shifter 38 shifts the partial product addition value output from the partial product adder 36 to the right, toward the lower-order digits (S6, shift step). The accumulator 39 then accumulates the partial product addition value that was shifted and aligned in step 6 (S7, accumulation step).

[0038] Steps 3 to 7 are repeated four times. The sum of the four partial product addition values calculated in this way is normalized by the normalizer 40 (S8, normalization step) and stored in the output register 41 (S9, storage step).
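Steps S1 to S9 can be mirrored in a short software model. This is a sketch under stated assumptions rather than the circuit itself: the mantissa width `MANT_BITS = 24` is assumed because the text gives no width, the names `split` and `product_sum` are hypothetical, inputs are expected to be exactly representable in that width, and normalization (S8) is delegated to `math.ldexp` instead of being modeled bit by bit.

```python
import math

MANT_BITS = 24  # assumed mantissa width; not specified in the text


def split(x):
    """Return (integer mantissa, exponent) with x == mant * 2**(exp - MANT_BITS)."""
    m, e = math.frexp(x)  # x == m * 2**e with 0.5 <= |m| < 1 (or m == 0)
    return int(m * (1 << MANT_BITS)), e


def product_sum(multiplicands, multipliers):
    pairs = [(split(a), split(b)) for a, b in zip(multiplicands, multipliers)]

    # S1-S2, once per pair: add the two exponents and track the largest sum.
    exps = [ea + eb for (_, ea), (_, eb) in pairs]
    max_exp = max(exps)

    # S3-S7, once per pair: multiply mantissas, align to max_exp, accumulate.
    acc = 0
    for ((ma, _), (mb, _)), e in zip(pairs, exps):
        product = ma * mb        # S3-S4: mantissa product (sum of partial products)
        diff = max_exp - e       # S5: exponent difference, always >= 0
        acc += product >> diff   # S6-S7: right shift, then accumulate

    # S8-S9: rescale by the maximum exponent; ldexp performs the normalization.
    return math.ldexp(acc, max_exp - 2 * MANT_BITS)
```

Because the maximum exponent is known before the accumulation loop starts, the loop body contains only a subtraction, a shift, and an addition, with no comparison or operand swap. Note that Python's `>>` rounds negative values toward minus infinity, which may differ from the rounding used in hardware.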

[0039] FIG. 6 is a block diagram showing the partial product generator 34 of the product-sum calculator of FIG. 4. In FIG. 6, 341 is an input register that stores the mantissa of the multiplicand, 342 is an input register that stores the mantissa of the multiplier, 343 is a combinational circuit that receives values from the input registers 341 and 342 and outputs values to the partial product register 344 described below, and 344 is a partial product register that stores the values output from the combinational circuit 343.

[0040] The operation of the partial product generator 34 configured as described above, when two multiplications are performed simultaneously, will now be described. (Equation 1) shows the logical expressions implemented by the combinational circuit 343 of FIG. 6.

【００４１】[0041]

【数１】 (Equation 1)

[0042] In (Equation 1), the multiplier and the multiplicand are each assumed to be 32 bits for floating-point numbers and 16 bits for integer operations. "&" denotes the bitwise logical AND of two vectors, and "&&" denotes the logical AND of each bit of a vector with a 1-bit binary value, the result being a vector. Here, for 0 ≤ n, m < 31, p(n) is a 32-bit vector, P(n,m) is the m-th bit of the n-th vector, pp(n) is the n-th 63-bit partial product to be output, PP(n,m) is the m-th bit of the n-th partial product, par(m) is the m-th bit of the multiplier, var is the multiplicand, and var(m) is the m-th bit of the multiplicand.

[0043] k denotes the number of bits of the input var: k = 32 for floating-point operations and k = 16 for integer operations. lwb and upb are constants that take different values for integer and floating-point operations: lwb = 0x0000FFFF and upb = 0xFFFF0000 for integer operations, and lwb = 0xFFFFFFFF and upb = 0xFFFFFFFF for floating-point operations. The partial product generator 34 continuously feeds the values held in the input registers 341 and 342 to the combinational circuit 343 and stores the output of the combinational circuit 343 in the partial product register 344 at a predetermined timing.

[0044] The operation when two multiplications are performed simultaneously using this partial product generator 34 will be described further. The mantissa area of each of the multiplicand input register 31 and the multiplier input register 32 in FIG. 4 is divided into upper bits and lower bits, and a different variable is input to each part. In the example of FIG. 6, A × A' and B × B' are calculated simultaneously. The mantissas divided into upper and lower bits in this way are input to the partial product generator 34, where the combinational circuit 343 generates the partial products. Partial products 0 to 15 of the partial product register 344 hold the partial products for B × B' on the lower-bit side, and partial products 16 to 31 hold the partial products for A × A' on the upper-bit side. Because integer multiplication is performed here, the blocks of FIG. 4 that handle the exponents of floating-point operations do not function, and the shifter 38 therefore passes values unchanged to the accumulator 39. In this way, two multiplications and the sum of their four partial products can be calculated simultaneously. Clearly, this feature of the partial product generator 34 can also be realized with known means of reducing the number of partial products.
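The packed integer mode can be illustrated with a small model of the 32-row partial product array. Since (Equation 1) itself is not reproduced in this text, the following is an assumed reading, not the patent's logic equations: it uses the integer-mode masks lwb and upb quoted in the description, treats all operands as unsigned 16-bit values (the sign extension mentioned in the text is omitted), and the names `pack`, `partial_products`, and `dual_multiply` are hypothetical.

```python
LWB = 0x0000FFFF  # integer-mode mask for the lower field (value from the text)
UPB = 0xFFFF0000  # integer-mode mask for the upper field (value from the text)


def pack(hi, lo):
    """Place two unsigned 16-bit values in the upper and lower halves of a word."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def partial_products(var, par):
    """32 partial products for packed multiplicand var and packed multiplier par."""
    rows = []
    for n in range(32):
        # Rows 0-15 serve B x B' (lower field); rows 16-31 serve A x A' (upper field).
        mask = LWB if n < 16 else UPB
        bit = (par >> n) & 1
        rows.append(((var & mask) << n) if bit else 0)
    return rows


def dual_multiply(a, a2, b, b2):
    """Return (A x A', B x B') from one pass through the shared row array."""
    total = sum(partial_products(pack(a, b), pack(a2, b2)))
    return total >> 32, total & 0xFFFFFFFF
```

Masking the multiplicand per row keeps the two products in disjoint bit fields of the sum: rows 0 to 15 contribute only to bits 0 to 31, and rows 16 to 31 only to bits 32 and above, so `dual_multiply(3, 5, 7, 11)` yields the pair (15, 77).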

[0045] As described above, according to the present embodiment, four product-sum calculators capable of both floating-point and integer product-sum operations are provided, so that when three-dimensional graphics data and image encoding data are supplied at an arbitrary ratio, the functions of the four product-sum calculators can be switched to process the data efficiently. In addition, because the maximum exponent is obtained when the multiplicands and multipliers are input and the shift is performed before accumulation, no selection or comparison is needed inside the product-sum loop, so that high-speed processing is possible with little hardware. Furthermore, during integer operations the partial product generator 34 inputs two pairs of input integers separately to the upper bit region and the lower bit region of the partial product adder. For the multiplicand of the pair assigned to the upper bit region, the portion corresponding to the lower bit region is masked and sign extension is performed; for the pair assigned to the lower bit region, the portion corresponding to the upper bit region is masked and sign extension is performed. As a result, product-sum operations can be performed efficiently even on integers whose precision differs greatly from that of floating-point numbers.

【００４６】[0046]

[Effects of the Invention] As described above, the product-sum operation device according to claim 1 comprises: a graphics input buffer having a function of simultaneously outputting at most four items of three-dimensional graphics data; an image encoding input buffer having a function of simultaneously outputting at most two rows of 8 × 8 pixel data; a transposition buffer having a function of simultaneously writing at most two columns of 8 × 8 pixel data and a function of simultaneously outputting at most two rows of 8 × 8 pixel data; a graphics output buffer having a function of simultaneously writing at most four items of three-dimensional graphics data; an image encoding output buffer having a function of simultaneously writing at most two rows of 8 × 8 pixel data; and four product-sum calculators that are capable of both floating-point and integer product-sum operations, receive graphics data and image encoding data from the graphics input buffer and the image encoding input buffer, and output graphics data and pixel data. This provides the advantageous effect that, when three-dimensional graphics data and image encoding data are supplied at an arbitrary ratio, the functions of the four product-sum calculators can be switched to process the data efficiently.

[0047] According to the invention of claim 2, in the invention of claim 1, the product-sum calculator aligns digits according to the maximum exponent of the four product terms during floating-point operations and, during integer operations, separates the upper bits and the lower bits of the partial product adder in the partial product addition for the multiplier, inputs different values to the upper bits and the lower bits, and performs the required sign extension. This provides the advantageous effect that, when three-dimensional graphics data and image encoding data, which differ greatly in precision, are supplied at an arbitrary ratio, the functions of the four product-sum calculators are switched and the data is processed efficiently.

[0048] According to the invention of claim 3, in the invention of claim 1, the product-sum calculator comprises: a multiplicand input register and a multiplier input register to which four sets of floating-point numbers are input in succession; an exponent adder that adds the exponents of each set as the four sets of floating-point numbers are input; an exponent buffer that determines and temporarily stores the maximum of the exponents; a partial product generator that receives the mantissas of the four sets of floating-point numbers one set at a time and generates partial products for multiplication; a partial product adder that adds the partial products output from the partial product generator to obtain a partial product addition value; a subtractor that subtracts the exponent corresponding to the generated partial product from the maximum exponent to obtain an exponent difference; a shifter that shifts the partial product addition value output from the partial product adder toward the lower-order digits according to the exponent difference; an accumulator that sums the partial product addition values sequentially output from the shifter to obtain an accumulated value; a normalizer that performs normalization according to the maximum exponent and the accumulated value output from the accumulator; and an output register that holds the output data of the normalizer. Because the maximum exponent is obtained when the multiplicands and multipliers are input and the shift is performed before accumulation, no selection or comparison is needed inside the product-sum loop, which provides the advantageous effect that high-speed processing is possible with little hardware.

[0049] According to the invention of claim 4, in the invention of claim 3, the partial product generator, during integer operations, inputs two pairs of input integers separately to the upper bit region and the lower bit region of the partial product adder. For the multiplicand of the pair assigned to the upper bit region, the portion corresponding to the lower bit region is masked and sign extension is performed; for the other pair, assigned to the lower bit region, the portion corresponding to the upper bit region is masked and sign extension is performed. This provides the advantageous effect that product-sum operations can be performed efficiently even on integers whose precision differs greatly from that of floating-point numbers.

[0050] The invention of claim 5 is a product-sum operation method using four product-sum calculators capable of both floating-point and integer product-sum operations, comprising: an input step of inputting a multiplicand and a multiplier; a maximum exponent detection step of adding exponents and detecting the maximum exponent; a mantissa partial product generation step of generating partial products of floating-point mantissas; an integer partial product generation step of generating the partial products of two pairs of low-precision integers separately in an upper bit region and a lower bit region; a partial product addition step of adding the generated partial products to obtain a partial product addition value; an exponent difference calculation step of calculating an exponent difference, which is the difference between the detected maximum exponent and the exponent corresponding to the generated partial product; a shift step of shifting the partial product addition value based on the calculated exponent difference; an accumulation step of accumulating the shifted partial product addition values to obtain an accumulated value; a normalization step of performing normalization according to the maximum exponent and the accumulated value; and a holding step of holding the output data of the normalizer. In a floating-point product-sum operation, the input step and the maximum exponent detection step are repeated four times, and then the mantissa partial product generation step, the partial product addition step, the exponent difference calculation step, the shift step, and the accumulation step are repeated four times. In an integer product-sum operation, the input step is repeated four times, and then the integer partial product generation step, the partial product addition step, and the accumulation step are repeated four times. This provides the advantageous effects that, when three-dimensional graphics data and image encoding data are supplied at an arbitrary ratio, the functions of the four product-sum calculators are switched and the data is processed efficiently, and that the computation path (critical path) is shortened, giving high speed with less hardware.

[0051] According to the invention of claim 6, in the invention of claim 5, in the integer partial product generation step, during integer operations, two pairs of input integers are input separately to the upper bit region and the lower bit region of the partial product adder. For the multiplicand of the pair assigned to the upper bit region, the portion corresponding to the lower bit region is masked and sign extension is performed; for the other pair, assigned to the lower bit region, the portion corresponding to the upper bit region is masked and sign extension is performed, and the partial products are generated in this way. This provides the advantageous effect that product-sum operations are performed efficiently even on integers whose precision differs greatly from that of floating-point numbers.

[Brief description of the drawings]

FIG. 1 is a block diagram showing a product-sum operation device according to Embodiment 1 of the present invention.

FIG. 2 (a) and (b) are explanatory diagrams of the operation of the product-sum operation device according to Embodiment 1 of the present invention.

FIG. 3 (a) and (b) are explanatory diagrams of the operation of the product-sum operation device according to Embodiment 1 of the present invention.

FIG. 4 is a block diagram showing a product-sum calculator constituting the product-sum operation device of FIG. 1.

FIG. 5 is a flowchart showing the operation of the product-sum calculator of FIG. 4.

FIG. 6 is a block diagram showing the partial product generator of the product-sum calculator of FIG. 4.

FIG. 7 is a block diagram showing a floating-point product-sum calculator as a conventional product-sum calculator.

FIG. 8 is a flowchart explaining the operation of the conventional floating-point product-sum calculator.

[Explanation of symbols]

1: Graphics input buffer
2: Image encoding input buffer
3a, 3b, 3c, 3d: Product-sum calculators
4: Transposition buffer
5: Graphics output buffer
6: Image encoding output buffer
31: Multiplicand input register
32: Multiplier input register
33: Exponent adder
34: Partial product generator
35: Exponent buffer
36: Partial product adder
37: Subtractor
38: Shifter
39: Accumulator
40: Normalizer
41: Output register
341, 342: Input registers
343: Combinational circuit
344: Partial product register

Claims

[Claims]

1. A product-sum operation device comprising: a graphics input buffer having a function of simultaneously outputting at most four items of three-dimensional graphics data; an image encoding input buffer having a function of simultaneously outputting at most two rows of 8 × 8 pixel data; a transposition buffer having a function of simultaneously writing at most two columns of 8 × 8 pixel data and a function of simultaneously outputting at most two rows of 8 × 8 pixel data; a graphics output buffer having a function of simultaneously writing at most four items of three-dimensional graphics data; an image encoding output buffer having a function of simultaneously writing at most two rows of 8 × 8 pixel data; and four product-sum calculators that are capable of both floating-point product-sum operations and integer product-sum operations, receive graphics data and image encoding data from the graphics input buffer and the image encoding input buffer, and output graphics data and pixel data.

2. The product-sum operation device according to claim 1, wherein each multiply-accumulate unit, during a floating-point operation, aligns digits to the largest exponent among the four product terms, and, during an integer operation, separates the partial product adder into upper bits and lower bits in the partial product addition for the multiplier, inputs different numerical values to the upper bits and the lower bits, and performs the necessary sign extension.

3. The product-sum operation device according to claim 1, wherein each multiply-accumulate unit comprises: a multiplicand input register and a multiplier input register that successively receive four pairs of floating-point numbers; an exponent adder that adds the exponents of each pair as the four pairs of floating-point numbers are input; an exponent buffer that determines and temporarily stores the maximum of the added exponents; a partial product generator that receives the mantissas of the four pairs of floating-point numbers one pair at a time and generates partial products for multiplication; a partial product adder that adds the partial products output from the partial product generator to obtain a partial product addition value; a subtractor that subtracts the exponent corresponding to the generated partial product from the maximum exponent to obtain an exponent difference; a shifter that shifts the partial product addition value output from the partial product adder toward the lower-order digits according to the exponent difference; an accumulator that accumulates the partial product addition values sequentially output from the shifter to obtain an accumulated value; a normalizer that performs normalization according to the maximum exponent and the accumulated value output from the accumulator; and an output register that holds the output data of the normalizer.

4. The product-sum operation device according to claim 3, wherein, during an integer operation, the partial product generator inputs two pairs of input integers separately into an upper bit area and a lower bit area of the partial product adder, sign-extends the multiplicand of the pair of input integers assigned to the upper bit area with the portion corresponding to the lower bit area masked, and sign-extends the other pair of input integers assigned to the lower bit area with the portion corresponding to the upper bit area masked.

5. A product-sum operation method using four multiply-accumulate units capable of performing both a floating-point product-sum operation and an integer product-sum operation, the method comprising: an input step of inputting a multiplicand and a multiplier; a maximum exponent detection step of adding exponents and detecting the maximum exponent; a mantissa partial product generation step of generating partial products of floating-point mantissas; an integer partial product generation step of generating low-precision partial products of two pairs of integers separately in an upper bit area and a lower bit area; a partial product addition step of adding the generated partial products to obtain a partial product addition value; an exponent difference calculation step of calculating an exponent difference, which is the difference between the detected maximum exponent and the exponent corresponding to the generated partial product; a shift step of shifting the partial product addition value based on the calculated exponent difference; an accumulation step of accumulating the shifted partial product addition values to obtain an accumulated value; a normalization step of performing normalization according to the maximum exponent and the accumulated value; and a holding step of holding the normalized output data; wherein, in a floating-point operation, the input step and the maximum exponent detection step are repeated four times, and the mantissa partial product generation step, the partial product addition step, the exponent difference calculation step, the shift step, and the accumulation step are repeated four times; and, in an integer operation, the input step is repeated four times, and the integer partial product generation step, the partial product addition step, and the accumulation step are repeated four times.

6. The product-sum operation method according to claim 5, wherein, in the integer partial product generation step during an integer operation, two pairs of input integers are input separately into an upper bit area and a lower bit area of a partial product adder, the multiplicand of the pair of input integers assigned to the upper bit area is sign-extended with the portion corresponding to the lower bit area masked, the other pair of input integers assigned to the lower bit area is sign-extended with the portion corresponding to the upper bit area masked, and the partial products are generated accordingly.
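
The two operating modes recited in the claims can be illustrated with a small software model. The following Python sketch is not the claimed circuit; it only mimics the arithmetic under stated assumptions. The function names (`fp_dot4`, `normalize`, `int_dot4_packed`, `to_signed`), the 24-bit mantissa width, and the 16-bit width of each bit area are hypothetical choices made for illustration and do not appear in the specification. Floating-point operands are modelled as (mantissa, exponent) pairs with value mantissa × 2^exponent.

```python
MANT_BITS = 24  # assumed mantissa width (illustrative only)
LANE_BITS = 16  # assumed width of each bit area of the adder (illustrative only)


def normalize(acc, exp, bits=MANT_BITS):
    """Shift acc so its magnitude occupies exactly `bits` bits, adjusting exp."""
    if acc == 0:
        return 0, 0
    sign = -1 if acc < 0 else 1
    mag = abs(acc)
    while mag >= (1 << bits):        # too wide: drop low-order bits
        mag >>= 1
        exp += 1
    while mag < (1 << (bits - 1)):   # too narrow: shift left
        mag <<= 1
        exp -= 1
    return sign * mag, exp


def fp_dot4(a, b):
    """Four-term product-sum of (mantissa, exponent) pairs, aligned to the
    largest product exponent before accumulation."""
    # Input and maximum exponent detection: add exponents, keep the largest.
    exps = [ea + eb for (_, ea), (_, eb) in zip(a, b)]
    max_exp = max(exps)
    acc = 0
    for (ma, _), (mb, _), e in zip(a, b, exps):
        product = ma * mb        # mantissa product (partial products summed)
        diff = max_exp - e       # exponent difference
        acc += product >> diff   # shift toward lower digits, then accumulate
    return normalize(acc, max_exp)


def to_signed(x, bits):
    """Interpret the low `bits` bits of x as a two's-complement integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def int_dot4_packed(a_hi, b_hi, a_lo, b_lo):
    """Two independent integer product-sums carried in the upper and lower
    bit areas of one packed accumulator word."""
    mask = (1 << LANE_BITS) - 1
    acc = 0  # layout: [upper area | lower area]
    for ah, bh, al, bl in zip(a_hi, b_hi, a_lo, b_lo):
        # Masking keeps each sum inside its own area, so no carry crosses over.
        hi = ((acc >> LANE_BITS) + ah * bh) & mask
        lo = ((acc & mask) + al * bl) & mask
        acc = (hi << LANE_BITS) | lo
    return to_signed(acc >> LANE_BITS, LANE_BITS), to_signed(acc, LANE_BITS)
```

In this model, bits shifted out during alignment are simply discarded, so a term whose exponent is much smaller than the maximum contributes little or nothing to the sum; how the actual hardware handles such bits is not modelled here. The integer function models only the effect of keeping the two bit areas separate, not the masking and sign extension of individual partial products.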