JP2003029960A

JP2003029960A - Elimination of rounding step in short path of floating point adder

Info

Publication number: JP2003029960A
Application number: JP2002167379A
Authority: JP
Inventors: Ajay Naini; Atul Dhablania; Warren James
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2001-06-07
Filing date: 2002-06-07
Publication date: 2003-01-31

Abstract

PROBLEM TO BE SOLVED: To provide a dual concurrent pipeline floating point adder unit shortening arithmetic delay time in a short path. SOLUTION: The device is provided with two concurrent data paths which are the short path and a long path. In the case that a floating point arithmetic operation is a subtraction operation and an exponent difference between two operands is 0, or in the case that the floating point arithmetic operation is the subtraction operation, the exponent difference is 1 and the mantissa of the operand having a larger exponent is within the range of a predetermined number, the short path is used so as to generate the result of the floating point arithmetic operation. In the case that the floating point arithmetic operation is addition operation, or in the case that the floating point arithmetic operation is the subtraction operation and the exponent difference is larger than 1, or in the case that the floating point arithmetic operation is the subtraction operation, the exponent difference is 1 and the mantissa of the operand having the larger exponent is within the range of a different predetermined number, the long path is used so as to generate the result of the floating point arithmetic operation.

Description

Detailed Description of the Invention

[0001]

FIELD OF THE INVENTION The present invention relates to techniques for designing a floating-point adder unit in a computer processor.

[0002]

PRIOR ART Computer processors typically include a floating-point adder for adding or subtracting two numbers (operands) in floating-point representation. In floating-point representation, a number is expressed in the form $\pm m \times R^{e}$, where $m$ is the mantissa, $R$ is the radix (or base), and $e$ is the exponent. Typically, in a floating-point adder the radix is implicitly defined and a fixed number of bits is reserved for each floating-point number. One of these bits is reserved for the sign of the number, a predetermined number of bits is reserved for the exponent, and a predetermined number of bits is reserved for the mantissa. The number of mantissa bits determines the precision of the floating-point number, and the number of exponent bits determines the range of representable values. A fixed-bit format is therefore a trade-off between precision and range.

[0003] A non-zero floating-point number can be normalized by adjusting the mantissa and exponent values so that there is exactly one non-zero digit to the left of the decimal point in the mantissa. Consequently, the leading 1 bit can be implicitly defined and hidden, which provides one extra bit of data or, equivalently, doubles the representable mantissas. In addition, the radix is typically also implicitly defined and need not be represented in hardware. For example, in the 32-bit single-precision format of the IEEE-754 standard, which is incorporated herein by reference, a normalized floating-point number $N = \pm m \times R^{e}$ can be stored as:

[Equation 1]

where $S$ is the value of the sign bit ($S$ is 0 for a positive number and 1 for a negative number), $E$ is an 8-bit value representing the exponent $e$ in excess-127 form ($e$ is in the range -126 to +127), that is, $E = e + 127$, and $F$ is a 23-bit value representing the fractional part of $m$. Therefore, $N = (-1)^{S} \times 1.F \times 2^{E-127}$.
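The field definitions above can be checked with a short sketch. It uses Python's standard `struct` module to obtain the single-precision bit pattern; the function names `decode_single` and `value_of` are illustrative and do not come from the patent, and only normalized numbers are handled.

```python
import struct

def decode_single(x: float):
    """Split x (rounded to IEEE-754 single precision) into the fields S, E, F."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = bits >> 31            # sign bit S
    e = (bits >> 23) & 0xFF   # 8-bit biased exponent, E = e + 127
    f = bits & 0x7FFFFF       # 23-bit fraction F (hidden leading 1 not stored)
    return s, e, f

def value_of(s: int, e: int, f: int) -> float:
    """Rebuild N = (-1)^S * 1.F * 2^(E-127) for a normalized number."""
    return (-1) ** s * (1 + f / 2**23) * 2.0 ** (e - 127)

s, e, f = decode_single(-6.5)   # -6.5 = -1.625 * 2^2
print(s, e, hex(f), value_of(s, e, f))
```

Zero, subnormals, infinities, and NaNs use reserved values of E and are deliberately left out of this sketch.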

[0004] A floating-point addition operation typically includes (1) an exponent subtraction step to determine the amount of shift required to align the mantissas of the two operands, (2) an alignment step to align the decimal points of the two operands by right-shifting the mantissa of the smaller operand, (3) a mantissa addition or mantissa subtraction step for the actual arithmetic operation, (4) a conversion step to determine the sign of the resulting number, (5) a leading-one detection step to determine the amount of left or right shift required to normalize the resulting number, (6) a post-normalization step to normalize the resulting number, and, in some cases, (7) a rounding step when the number of digits in the resulting number exceeds the total number of digits allowed by the particular format. If these many steps are performed sequentially, the floating-point adder unit can be slow.

[0005] A floating-point adder unit can be improved by using dual parallel pipeline paths. See Nhon T. Quach and Michael J. Flynn, “An improved algorithm for high speed floating-point addition”, Stanford Technical Report CSL-TR-90-442. FIG. 1 shows a conventional floating-point adder unit 100 having two concurrent pipeline paths, a short path 101 and a long path 102, configured to perform floating-point operations concurrently. Although the floating-point adder unit of FIG. 1 has a potential speed advantage, it also requires a relatively complex hardware implementation for the dual concurrent pipeline paths.

[0006]

[Table 1]

[0007] Typically, in the conventional floating-point adder unit of FIG. 1, each pipeline path is configured so that it requires a mantissa shift in only one direction. Referring to FIG. 1, the short path is used for effective subtraction operations when the difference between the exponents of the two floating-point operands is 0 or 1. The long path is used for all addition operations, and for subtraction operations when the exponent difference of the two floating-point operands is greater than 1. This is summarized in Table I.

[0008] In the long path, the mantissas are aligned by right-shifting the smaller operand according to the exponent difference. An addition result in the long path may require rounding, but because it has no leading zeros, no left shift for post-normalization is needed. A subtraction result in the long path has at most one leading zero, so it may require rounding and a left shift of at most one position.

[0009]

[Table 2]

[0010] Because the short path is used for subtraction operations when the exponent difference of the two operands is 0 or 1, mantissa alignment is limited to one bit, but a left shifter may be needed for normalization after the subtraction. Furthermore, in the short path some subtraction operations with an exponent difference of 1 may require rounding of the final result; this rounding is typically performed by an incrementer after the normalization operation. This is summarized in Table II.

[0011] Typically, guard, round, and/or sticky bits, as described in the IEEE-754 standard, are used to round the result of a floating-point operation. Referring now to Table II, when the exponent difference is zero, the mantissas of the two operands need not be aligned. If the result of the mantissa subtraction is less than 1, a normalization step is performed using a normalizing left shifter. No rounding of the result is needed after this normalization step, because the guard, round, and/or sticky bits described in the IEEE-754 standard are empty. When the exponent difference of the two operands is 1, the mantissas of the two operands must be aligned by right-shifting the mantissa of the smaller operand by one position. Moreover, because the guard bit may be non-empty after the right shift, a rounding step may be required after the mantissa subtraction. If the guard bit has a value of 1, a rounding operation must be performed to obtain an IEEE-754 compliant result.

[0012] As an illustration, consider the following two operands, which have an exponent difference of 1 and a precision of 24 bits:

$A = 1.11000000000000000000000 \times 2^{0}$ and $B = 1.00000000000000000000001 \times 2^{-1}$

These two operands are subtracted in the short path, which aligns the mantissas by right-shifting the smaller operand B by one position:

Mantissa A: 1.11000000000000000000000
Mantissa B: 0.100000000000000000000001

Subtracting B from A yields a number that requires rounding to stay within 24-bit precision. The unrounded result of subtracting B from A is

A - B = 1.001111111111111111111111

where the last bit is the guard bit. To comply with the IEEE-754 standard, a rounding step may be required for certain rounding modes when the guard bit has a value of 1, so that the rounded result of A - B is

1.01000000000000000000000
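The arithmetic of this example can be reproduced exactly with rational numbers. The sketch below assumes that A is binary 1.11 and that B has only the last of its 23 fraction bits set, and it assumes round-to-nearest-even as the rounding mode (the text only says "certain rounding modes"). The helper names are illustrative.

```python
from fractions import Fraction

a = Fraction(0b111, 4)                          # 1.11 (binary) * 2^0 = 1.75
b = (1 + Fraction(1, 2**23)) * Fraction(1, 2)   # 1.00...01 (binary) * 2^-1
diff = a - b                                    # exact, unrounded difference

def to_binary(x: Fraction, frac_bits: int) -> str:
    """Format a non-negative x with exactly frac_bits binary fraction digits."""
    scaled = x * 2**frac_bits
    assert scaled.denominator == 1, "not representable with that many bits"
    digits = format(int(scaled), "b").rjust(frac_bits + 1, "0")
    return digits[:-frac_bits] + "." + digits[-frac_bits:]

def round_half_even(x: Fraction, frac_bits: int) -> Fraction:
    """Round x to frac_bits fraction bits; round() on a Fraction rounds ties to even."""
    return Fraction(round(x * 2**frac_bits), 2**frac_bits)

print(to_binary(diff, 24))                       # 24 fraction bits; the last one is the guard bit
print(to_binary(round_half_even(diff, 23), 23))  # result rounded back to 23 fraction bits
```

The unrounded difference needs 24 fraction bits, one more than the format holds, which is why the conventional short path has to round.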

[0013] The rounding step in the short path is typically performed by an incrementer. This incrementer is undesirable because it can make the delay through the short path greater than the delay through the long path. In addition, implementing the incrementer requires additional hardware (for example, logic gates).

[0014]

PROBLEM TO BE SOLVED BY THE INVENTION Accordingly, there is a need for an improved dual concurrent pipeline floating-point adder technique.

[0015]

MEANS FOR SOLVING THE PROBLEM The apparatus and method of the present invention perform a floating-point operation involving at least two operands in floating-point representation. The apparatus includes two concurrent data paths, a short path and a long path. The short path is used to produce the result of the floating-point operation when the operation is a subtraction and the difference between the exponents of the two operands (the “exponent difference”) is 0, or when the operation is a subtraction, the exponent difference is 1, and the mantissa of the operand having the larger exponent is within a predetermined range. The long path is used to produce the result when the operation is an addition, or when the operation is a subtraction and the exponent difference is greater than 1, or when the operation is a subtraction, the exponent difference is 1, and the mantissa of the operand having the larger exponent is within a different predetermined range. When this logic is used to select the data path for a floating-point operation, the short path does not require a means, such as an incrementer, for post-subtraction normalization.

[0016]

DETAILED DESCRIPTION OF THE INVENTION The following detailed description is based on the IEEE-754 standard and includes many specific details of the floating-point representation formats in that standard in order to provide a thorough understanding of the invention. However, those skilled in the art will appreciate that the invention may be practiced outside the scope of the IEEE-754 standard and/or without these specific details.

[0017] The present invention includes systems, methods, and algorithms for a floating-point adder. Some details of a preferred embodiment of the invention are described in the paper by Ajay Naini, Atul Dhablania, Warren James, and Debjit Das Sarma, “1-Ghz HAL SPARC 64 Dual Floating-point Unit With RAS Features”, Proceedings of the 15th IEEE Symposium on Computer Arithmetic, 11-13 June 2001, Vail, Colorado, pp. 173-183, which is incorporated herein by reference.

[0018] FIG. 2 is a flow chart illustrating a process 200 of the present invention for performing an addition or subtraction operation on two floating-point operands [Equation 2]. Process 200 includes an alignment step 210 and three concurrent sub-processes: a short path process 230, a long path process 240, and a selection process 220. In alignment step 210, the two operands are received and aligned according to the IEEE double-precision format. Process 200 further includes a result selection step 250 for selecting the result from the short path process 230 or the long path process 240 based on the decision of the selection process 220.

[0019] In one embodiment of the invention, the selection process determines, based on the set of criteria shown in Table III, whether to select the result produced by the short path process 230 or by the long path process 240 for an addition or subtraction operation. Referring to Table III, the result from the short path process 230 is selected in result selection step 250 when (1) the operation is a subtraction and the difference between the exponents of operands A and B is 0, or (2) the operation is a subtraction, the difference between the exponents of operands A and B is 1, and the mantissa of the operand having the larger exponent is less than 1.5.

[0020]

[Table 3]

[0021] Referring again to Table III, the result from the long path process 240 is selected in result selection step 250 when (1) the operation is an addition, (2) the operation is a subtraction and the difference between the exponents of operands A and B is greater than 1, or (3) the operation is a subtraction, the difference between the exponents of operands A and B is 1, and the mantissa of the operand having the larger exponent is greater than or equal to 1.5.

[0022]

[Table 4]

[0023] In another embodiment, the selection process determines, according to the criteria shown in Table IV, whether to select the result from the short path process 230 or from the long path process 240. As shown in Table IV, the result from the short path process 230 is selected in result selection step 250 when (1) the operation is a subtraction and the difference between the exponents of operands A and B is 0, or (2) the operation is a subtraction, the difference between the exponents of operands A and B is 1, and the mantissa of the operand having the larger exponent is less than or equal to 1.5.

[0024] As also shown in Table IV, the result from the long path process 240 is selected in result selection step 250 when (1) the operation is an addition, (2) the operation is a subtraction and the difference between the exponents of operands A and B is greater than 1, or (3) the operation is a subtraction, the difference between the exponents of operands A and B is 1, and the mantissa of the operand having the larger exponent is greater than 1.5.

[0025] In one embodiment of the invention, selection process 220 includes a step for determining whether the floating-point operation is an addition or a subtraction. In response to a determination that the operation is an addition, selection process 220 decides to select the result from the long path process 240 and sends this decision to result selection step 250. In response to a determination that the operation is a subtraction, selection process 220 proceeds to determine the exponent difference of the two operands. In response to a determination that the exponent difference is greater than 1, process 200 decides to select the result from the long path process 240 and sends this decision to result selection step 250. In response to a determination that the exponent difference is 0, process 200 decides to select the result from the short path process 230 and sends this decision to result selection step 250. In response to a determination in step 225 that the exponent difference is 1, process 200 proceeds to determine whether the mantissa of the operand having the larger exponent is less than 1.5 (or less than or equal to 1.5 if the selection rules of Table IV are used). In response to a determination that this mantissa is less than 1.5 (or less than or equal to 1.5 if the selection rules of Table IV are used), process 200 decides to select the result from the short path process 230 and sends this decision to result selection step 250. Conversely, in response to a determination that the mantissa of the operand having the larger exponent is greater than or equal to 1.5 (or greater than 1.5 if the selection rules of Table IV are used), process 200 decides to select the result from the long path process 240 and sends this decision to result selection step 250.
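The decision sequence described above can be written as a small function. This is a minimal software sketch, not the hardware of the patent: the function name `select_path`, its parameters, and the `inclusive` flag (which switches from the Table III rule to the Table IV rule) are assumptions made for illustration, and `is_subtraction` is taken to mean an effective subtraction.

```python
def select_path(is_subtraction: bool, exp_a: int, exp_b: int,
                mant_a: float, mant_b: float,
                inclusive: bool = False) -> str:
    """Return "short" or "long" for normalized operands with mantissas in [1, 2).

    inclusive=False follows the Table III rule (mantissa < 1.5);
    inclusive=True follows the Table IV rule (mantissa <= 1.5).
    """
    if not is_subtraction:
        return "long"                       # every addition uses the long path
    diff = abs(exp_a - exp_b)
    if diff > 1:
        return "long"
    if diff == 0:
        return "short"
    # Exponent difference is exactly 1: inspect the operand with the larger exponent.
    big = mant_a if exp_a > exp_b else mant_b
    use_short = big <= 1.5 if inclusive else big < 1.5
    return "short" if use_short else "long"

print(select_path(True, 6, 5, 1.25, 1.9))   # short
print(select_path(True, 6, 5, 1.5, 1.9))    # long
```

In the described design both paths run concurrently and this decision only steers result selection step 250, so it is off the critical path of either data path.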

[0026] The selection rules shown in Table III are easier to implement than those shown in Table IV. When the selection rules of Table III are used, only the two most significant bits (MSBs) of the mantissa of the operand having the larger exponent need to be examined in step 227 to determine whether that mantissa has a value less than 1.5 (meaning that the result from the short path should be selected) or a value greater than or equal to 1.5 (meaning that the result from the long path should be selected).
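Why two bits suffice: a normalized mantissa has the binary form 1.xxx..., and 1.5 is binary 1.1, so the mantissa is at least 1.5 exactly when the bit after the leading 1 is set. The sketch below assumes the mantissa is held as an integer of `width` bits with the leading 1 included (53 for double precision); the function name is illustrative.

```python
def at_least_one_and_a_half(significand: int, width: int = 53) -> bool:
    """True when a normalized significand (leading 1 included) is >= 1.5."""
    top_two = significand >> (width - 2)    # the two most significant bits
    assert top_two & 0b10, "expects a normalized significand (leading 1 set)"
    return top_two == 0b11                  # 11 -> >= 1.5, 10 -> < 1.5

print(at_least_one_and_a_half(0b1011, width=4))   # 1.375 -> False
print(at_least_one_and_a_half(0b1100, width=4))   # 1.5   -> True
```

The Table IV rule (less than or equal to 1.5) would additionally have to confirm that every lower bit is zero when the top bits are 11, which is consistent with the statement that Table III is easier to implement.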

[0027] FIG. 3 is a flow chart showing the short path process 230 in more detail. At the beginning of short path process 230, two sets of steps are performed concurrently with each other, and both sets are followed by the left shift for normalization step 324. The first set of steps begins at step 310, in which the two least significant exponent bits of operand A are compared with the two least significant exponent bits of operand B. In response to a determination that these bits differ, an alignment and swap step 314 follows, in which the operand having the smaller exponent (as determined from the two least significant exponent bits) is right-shifted by one bit to align the mantissas of the two operands, and the two operands may be swapped so that the mantissa of the operand having the smaller exponent is subtracted from the mantissa of the operand having the larger exponent. If, on the other hand, step 310 determines that the two least significant exponent bits of operand A are identical to those of operand B, a mantissa comparison step 316 is performed to determine which mantissa is larger, followed by a swap step 318 in which the operands are swapped so that the smaller mantissa is subtracted from the larger mantissa. Either the alignment and swap step 314 or the swap step 318 is followed by a mantissa subtraction step 320, in which the mantissa of the smaller operand is subtracted from the mantissa of the larger operand.
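Two exponent bits are enough in step 310 because the short path result is only used when the exponents differ by at most 1, and differences of -1, 0, and +1 are all distinct modulo 4. A minimal sketch of that comparison follows; the function name and the returned labels are assumptions for illustration.

```python
def order_from_low_bits(exp_a: int, exp_b: int) -> str:
    """Classify operands whose exponents differ by at most 1,
    looking only at the two least significant exponent bits."""
    delta = ((exp_a & 0b11) - (exp_b & 0b11)) & 0b11   # difference modulo 4
    if delta == 0:
        return "equal"       # compare mantissas next (step 316)
    if delta == 1:
        return "a_larger"    # right-shift B by one bit (step 314)
    if delta == 3:
        return "b_larger"    # right-shift A by one bit (step 314)
    raise ValueError("exponents differ by more than 1; not a short path case")

print(order_from_low_bits(1023, 1024))   # b_larger
```

If the true exponent difference is larger than 1 the classification can be wrong, which is harmless in the described design because the long path result is selected in that case.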

[0028] The second set of steps includes a leading-zero prediction step 312 for predicting the leading zeros in the final result of subtraction step 320, and a leading-zero counting step 322 for counting and encoding the predicted leading zeros. In leading-zero prediction step 312, an algorithm is used to generate two leading-zero vectors $Z_1$ and $Z_2$. Leading-zero vector $Z_1$ is generated on the assumption that the mantissa of A is subtracted from the mantissa of B, and leading-zero vector $Z_2$ is generated on the assumption that the mantissa of B is subtracted from the mantissa of A. Accordingly, the second set of steps further includes a leading-zero vector selection step 315, which uses the results of steps 310 and 316 to determine the larger operand and selects the leading-zero vector generated on the assumption that the mantissa of the smaller operand is subtracted from the mantissa of the larger operand.

[0029] For example, if step 310 or step 316 determines that operand A is the larger operand, leading-zero vector $Z_2$ is selected in leading-zero vector selection step 315. In one embodiment of the invention, the mantissas of operands A and B each have 52 bits, for example $m_A = (a_{51} a_{50} \ldots a_0)$ and $m_B = (b_{51} b_{50} \ldots b_0)$. Assuming that $m_A$ and $m_B$ are aligned, the algorithm for generating $Z_2$ performs the two's complement subtraction by inverting B, so that

[Equation 3]

where $\cdot$, $+$, and [Equation 4] denote the AND operator, the OR operator, and the exclusive-OR operator, respectively, and an overline denotes the NOT operator. The prediction algorithm above, which does not use a full carry chain, produces a prediction vector $Z_2$ that has one fewer leading zero than the final result of subtraction step 320. The short path process therefore further includes a leading-zero correction step 330 to correct this error.

[0030] In leading-zero counting step 322, the leading zeros in the selected leading-zero vector are counted and encoded. In parallel, the mantissa subtraction is computed in subtraction step 320. Because steps 312, 315, and 322 predict the number of leading zeros, and hence the amount of left shift for normalization, and are performed in parallel with subtraction step 320, the left shift for normalization step 324 can be performed immediately after subtraction step 320 according to the predicted shift amount. During the left shift for normalization step 324, the result from subtraction step 320 is left-shifted according to the predicted number of leading zeros. A leading-zero correction step 330 follows the left shift for normalization step 324 to correct any error resulting from the leading-zero prediction algorithm of leading-zero prediction step 312.
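The normalize-then-correct sequence can be sketched in software as follows. The prediction equation itself is not reproduced in this text, so the sketch takes the predicted count as an input and assumes only that it is either exact or one too small; the names `count_leading_zeros` and `normalize` are illustrative.

```python
def count_leading_zeros(value: int, width: int) -> int:
    """Number of zero bits above the most significant 1 in a width-bit field."""
    return width - value.bit_length()

def normalize(diff: int, predicted_lz: int, width: int):
    """Left-shift by the predicted count (step 324), then apply a one-position
    correction if the leading 1 has not reached the top bit (step 330)."""
    mask = (1 << width) - 1
    shifted = (diff << predicted_lz) & mask
    shift = predicted_lz
    if diff != 0 and not (shifted >> (width - 1)) & 1:
        shifted = (shifted << 1) & mask     # prediction was one short
        shift += 1
    return shifted, shift

diff = 0b00010110                           # three leading zeros in 8 bits
print(normalize(diff, 2, 8))                # under-predicted by one, corrected
print(normalize(diff, 3, 8))                # exact prediction, no correction
```

Both calls return the same normalized value and total shift, which is the point of the correction step: the shifter can start from a prediction computed in parallel with the subtraction and still end at the right place.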

[0031] The short path does not require a rounding step, because the result of a subtraction performed in the short path always has a mantissa value less than or equal to 1. For example, when the exponent difference is 1 and the mantissa of the larger operand is less than 1.5, the minuend is in the range [1, 1.5) and the subtrahend, shifted right by one bit, is in the range [0.5, 1), which gives a result in the range (0, 1). When the result is less than 1, the left shift for normalization step 324 moves the guard bit to at least the least significant bit (LSB) position. Rounding is therefore unnecessary in the short path.
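The claim can be checked exhaustively for a small mantissa width. The sketch below assumes p-bit mantissas stored as integers in [2^(p-1), 2^p) and an exponent difference of exactly 1; the function name is illustrative. The exact difference is computed with one extra low-order bit (the guard position), and rounding is needed only if that difference does not fit in p significant bits.

```python
def short_path_needs_rounding(m_big: int, m_small: int, p: int) -> bool:
    """m_big belongs to the operand with the larger exponent (difference 1).
    Working in units one bit finer makes the 1-bit right shift of m_small exact."""
    diff = 2 * m_big - m_small
    extra = diff.bit_length() - p           # bits beyond p significant bits
    return extra > 0 and (diff & ((1 << extra) - 1)) != 0

p = 8
one, one_and_a_half, two = 1 << (p - 1), 3 << (p - 2), 1 << p

# Table III rule: larger-exponent mantissa in [1, 1.5) never needs rounding.
print(any(short_path_needs_rounding(a, b, p)
          for a in range(one, one_and_a_half)
          for b in range(one, two)))
# With a mantissa of 1.75 the guard bit can survive, so rounding may be needed.
print(short_path_needs_rounding(0b11100000, 0b10000001, p))
```

The first line prints `False` and the second prints `True`, matching the split between short path and long path cases. A mantissa of exactly 1.5 (the extra case admitted by Table IV) also never needs rounding, because the largest possible difference is then an exact power of two.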

[0032] FIG. 4 is a flow chart showing the long path process 240 in more detail. Referring to FIG. 4, in response to the long path process being selected in step 210, the lower-order bits of e^A - e^B and the lower-order bits of e^B - e^A are calculated in parallel in steps 410a and 410b, respectively. The lower-order bits of e^A - e^B, such as its two least significant bits, are used in step 412a to partially right-shift the mantissa of operand B by 0, 1, 2, or 3 bits, while the full result of e^A - e^B is calculated in parallel in step 411a. At the same time, the lower-order bits of e^B - e^A, such as its two least significant bits, are used in step 412b to partially right-shift the mantissa of operand A by 0, 1, 2, or 3 bits, while the full result of e^B - e^A is calculated in parallel in step 411b. Based on the full result of e^A - e^B or e^B - e^A, the smaller of operands A and B is selected in step 414, and the partially right-shifted smaller operand is selected in step 416. In step 416, the partially right-shifted smaller operand may also be swapped with the other operand so that the smaller operand is in the subtrahend position for a subtraction operation. Step 416 is followed by step 420, in which the smaller operand may be further right-shifted according to the full result of e^A - e^B or e^B - e^A, which is always positive. Because data is shifted right out of the mantissa of the smaller operand, rounding information is calculated in parallel in step 428, which performs rounding logic in accordance with the IEEE-754 standard.
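
The two-stage alignment described above can be sketched in Python as follows. This is a behavioral illustration under stated assumptions, not the patented circuit: the function name `align_long_path` and the use of plain integers for mantissas and exponents are hypothetical, and the bits shifted out (which feed the rounding logic of step 428) are simply discarded here.

```python
def align_long_path(man_a, man_b, exp_a, exp_b):
    """Return (larger_mantissa, aligned_smaller_mantissa) using a two-stage shift."""
    d_ab = exp_a - exp_b              # steps 410a/411a: e^A - e^B
    d_ba = exp_b - exp_a              # steps 410b/411b: e^B - e^A
    # Speculative partial shifts by 0..3 bits, driven by the two low bits of each
    # difference. Python's & on a negative int follows two's complement, so this
    # matches the low bits a hardware subtractor would produce.
    part_b = man_b >> (d_ab & 3)      # step 412a
    part_a = man_a >> (d_ba & 3)      # step 412b
    if d_ab >= 0:                     # step 414: B has the smaller exponent
        larger, partial, full = man_a, part_b, d_ab
    else:                             # A has the smaller exponent
        larger, partial, full = man_b, part_a, d_ba
    # Step 420: finish the shift using the remaining high-order bits of the difference.
    smaller = partial >> (full & ~3)
    return larger, smaller
```

The point of the split is that the partial shift can start as soon as the low bits of the exponent difference are known, without waiting for the full subtraction; the final result equals a single right shift by the full difference.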

[0033] In parallel with step 428, mantissa addition or subtraction is performed in steps 426 and 427. Proper IEEE-754 rounding requires the calculation of A+B, A+B+1, and A+B+2. This calculation can be implemented using only two parallel steps 424 and 426, one generating A+B and the other A+B+2, because A+B+1 can be derived from these two results; this derivation is performed in step 430. In step 430, a first-level result selection is also performed to select a result for the long path process from among the A+B, A+B+1, and A+B+2 results, and the selected result is normalized and/or rounded based on the rounding information from step 428.
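
The text does not spell out how A+B+1 is derived from the other two sums. One common way, shown here as an assumption rather than as the method of step 430, is to take the upper bits from A+B or A+B+2 depending on the parity of A+B and to force the LSB. The function name `sum_plus_one` is hypothetical.

```python
def sum_plus_one(s0, s2):
    """Derive A+B+1 from s0 = A+B and s2 = A+B+2 without a third adder."""
    if s0 & 1 == 0:
        return s0 | 1     # even sum: A+B+1 only sets the LSB of A+B
    return s2 & ~1        # odd sum: upper bits of A+B+2, LSB cleared
```

Only a multiplexer and one forced bit are needed, which is why two adders suffice for the three candidate results.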

[0034] In the present invention, subtraction operations with an exponent difference of 1 in which the mantissa of the larger operand has a magnitude of 1.5 or more are performed in the long path. As described above, in the long path the mantissa of the smaller operand is aligned by right-shifting. If the exponent difference is 1 and the mantissa of the larger operand has a value greater than 1.5, the subtraction is restricted to a minuend in the range [1.5, 2) and a subtrahend in the range [0.5, 1), and the result falls within the range (0.5, 2). Therefore, a right shift after the addition is not needed in the long path, and the potential single-bit left shift may be handled during the first stage of result selection, step 430.
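
The range claim can be verified by brute force for a small mantissa width. The sketch below is only a check of the arithmetic under assumed encodings: `normalization_shift` is a hypothetical helper, and mantissas are p-bit integers with the hidden bit included.

```python
from fractions import Fraction

def normalization_shift(man_a, man_b, p):
    """Left shifts needed to normalize A - B when exp(A) = exp(B) + 1.

    man_a and man_b are p-bit integer mantissas representing values in [1, 2).
    """
    a = Fraction(man_a, 2 ** (p - 1))        # mantissa of the larger operand
    b = Fraction(man_b, 2 ** (p - 1)) / 2    # smaller operand after the 1-bit right shift
    result = a - b                           # always positive, since a >= 1 > b
    shift = 0
    while result < 1:
        result *= 2
        shift += 1
    return shift
```

For every larger mantissa of at least 1.5 the function returns 0 or 1, matching the single-bit left shift mentioned above. By contrast, `normalization_shift(32, 63, 6)` (mantissa 1.0 against 1.984375) needs 6 shifts, the kind of massive cancellation that the short path's leading zero logic is built for.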

[0035] FIG. 5 is a schematic block diagram showing the overall structure of a floating point adder unit 500 that implements the process 200 described above. Referring to FIG. 5, the floating point adder unit 500 includes an alignment module 510 that implements the alignment step 210 of process 200. The alignment module 510 receives operands A and B as inputs for addition or subtraction and aligns the operands according to the IEEE double-precision format.

[0036] The floating point adder unit 500 has two parallel pipeline paths, a short path and a long path. In the short path, the floating point adder unit 500 includes an exponent-difference-by-one predictor 514-1 connected to the alignment module 510 for implementing step 310 of process 200. In one embodiment of the present invention, the exponent-difference-by-one predictor 514-1 determines whether the exponent difference between the two operands is zero by examining only the two least significant exponent bits of each operand. The floating point adder unit 500 further includes a first swap module 514 connected to the alignment module 510 and the exponent-difference-by-one predictor 514-1. The first swap module 514 includes circuit elements that implement step 314 of process 200; that is, in response to the predictor 514-1 determining that the exponent difference for operands A and B is not zero, the first swap module 514 aligns the mantissas of the two operands by right-shifting the mantissa of the operand having the smaller exponent (as determined by examining only the two least significant exponent bits of each operand) by one bit, and swaps the two operands if the operand having the smaller exponent happens to be in the minuend position. The floating point adder unit 500 further includes an operand comparison module 518-1 connected to the alignment module 510, and a second swap module 518 connected to the first swap module 514 and the operand comparison module 518-1. If the predictor 514-1 determines that the exponent difference between the two operands is zero, the operand comparison module 518-1 and the second swap module 518 respond by performing steps 316 and 318 of process 200; that is, the operand comparison module 518-1 compares the mantissas of the two operands to determine which operand is larger, and if the larger operand happens to be the subtrahend, the second swap module 518 swaps the operands so that the larger operand is moved to the minuend position.
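
A predictor that looks only at the two least significant exponent bits can be modeled as a difference modulo 4. The following Python sketch assumes that interpretation; the name `predict_exp_diff` and the return convention are hypothetical and are not given in the patent.

```python
def predict_exp_diff(exp_a, exp_b):
    """Predict e^A - e^B from the two low exponent bits of each operand.

    Returns 0, +1 or -1, or None when the low bits rule out a short-path case.
    The prediction is only meaningful if the true difference is in {-1, 0, +1}.
    """
    d = ((exp_a & 3) - (exp_b & 3)) & 3    # difference modulo 4
    return {0: 0, 1: 1, 3: -1}.get(d)      # d == 2 cannot be 0 or +/-1
```

The prediction aliases for larger differences (for example, `predict_exp_diff(8, 4)` returns 0), but in those cases the short path result is not used, because the final choice between the paths is made from the full exponent difference.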

[0037] Further in the short path, the floating point adder unit 500 includes a leading zero prediction module 512 connected to the alignment module 510, and a selection module 515 connected to the exponent-difference-by-one predictor 514-1, the operand comparison module 518-1, and the leading zero prediction module 512. The leading zero prediction module 512 performs step 312 of process 200, and the selection module 515 performs step 315 of process 200. The leading zero prediction module 512 includes conventional logic circuitry configured to perform the logical operations dictated by the leading zero prediction algorithm for the leading zero prediction step 310 of process 200, as described above.

[0038] Further in the short path, the floating point adder unit 500 includes a first adder 520 connected to the second swap module in pipeline stage 1, a leading zero counter 522 connected to the selection module 515, and a left shifter 524 in the short path connected to the first adder and the leading zero counter. The first adder 520 includes logic circuitry configured to perform the mantissa subtraction step 320 of process 200. The leading zero counter performs the leading zero counting step 322 of process 200, and the left shifter 524 performs the left shift for the normalization step 324 of process 200.

[0039] Further in the short path, the floating point adder unit 500 includes a leading zero correction module 530 connected to the left shifter 524 in pipeline stage 2. The leading zero correction module performs the leading zero correction step 330 of process 200.

[0040] In the long path, the floating point adder unit 500 includes a second adder 511a and a third adder 511b, both connected to the selection and alignment module 510. The second adder 511a receives operands A and B from the alignment module 510 and performs step 410a of process 200, which calculates the lower-order bits of e^A - e^B, and step 411a, which calculates the remaining bits of the full result of e^A - e^B. The third adder 511b receives operands A and B from the alignment module 510 and performs step 410b of process 200, which calculates the lower-order bits of e^B - e^A, and step 411b, which calculates the remaining bits of the full result of e^B - e^A.

[0041] In the long path, the floating point adder unit 500 further includes a first right shifter 513a connected to the alignment module 510 and the second adder 511a, and a second right shifter 513b connected to the selection and alignment module 510 and the third adder 511b. The first right shifter 513a receives operands A and B from the selection and alignment module 510, receives the lower-order bits of e^A - e^B from the second adder 511a, and performs step 412a, partially right-shifting the mantissa of operand B by 0, 1, 2, or 3 bits based on the lower-order bits of e^A - e^B. Similarly, the second right shifter 513b receives operands A and B from the selection and alignment module 510, receives the lower-order bits of e^B - e^A from the third adder 511b, and performs step 412b, partially right-shifting the mantissa of operand A by 0, 1, 2, or 3 bits based on the lower-order bits of e^B - e^A.

[0042] Further in the long path, the floating point adder unit 500 includes an operand selection module 517 connected to the second adder 511a and the third adder 511b, and a selection/swap module 516 connected to the first right shifter 513a, the second right shifter 513b, and the operand selection module 517. The operand selection module 517 performs step 414 of process 200; that is, it selects the smaller of operands A and B based on the result of the second adder 511a and/or the third adder 511b. This selection is output to the selection/swap module 516, which performs step 416 of process 200 by selecting the partially right-shifted mantissa of the smaller operand from between the partially right-shifted mantissas of A and B. Furthermore, if the operation is a subtraction and the smaller operand happens to be the minuend, the selection/swap module 516 swaps the two operands so that the smaller operand is moved to the subtrahend position.

[0043] Further in the long path, the floating point adder unit 500 includes a third right shifter 521 connected to the operand selection module 514 and the selection/swap module 516, a 3-2 carry-save adder (CSA) 523 connected to the third right shifter 521, a fourth adder 525 connected to the 3-2 CSA 523, and a fifth adder 526 connected to the third right shifter 521. The third right shifter 521 performs the right shift step 420 of process 200, shifting the partially shifted mantissa of the smaller operand based on the higher-order bits of the result of the second adder 511a or the third adder 511b. The fourth adder 525 performs step 424 of process 200, calculating A+B+2, and the fifth adder 526 performs step 426 of process 200, calculating A+B. The 3-2 CSA is used to generate A+B+2 because the LSB position differs between double precision and single precision.
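
A 3-2 carry-save adder reduces three operands to a sum word and a carry word, which a conventional adder then combines. The Python sketch below illustrates how a constant 2, placed relative to a movable LSB position, could be folded into A+B this way. The names `csa_3_2` and `sum_plus_two` and the `lsb_pos` parameter are assumptions for the example, as is the idea that the single-precision LSB sits 29 bit positions above the double-precision LSB in a shared 53-bit datapath.

```python
def csa_3_2(x, y, z):
    """3-2 carry-save adder: returns (sum_word, carry_word) with x + y + z == sum + carry."""
    sum_word = x ^ y ^ z                              # bitwise sum without carries
    carry_word = ((x & y) | (x & z) | (y & z)) << 1   # majority bits become carries
    return sum_word, carry_word

def sum_plus_two(a, b, lsb_pos=0):
    """Compute A + B + 2, where 2 is measured in units of the LSB at bit lsb_pos."""
    two = 2 << lsb_pos            # the constant moves with the precision's LSB position
    s, c = csa_3_2(a, b, two)     # modeled on CSA 523
    return s + c                  # carry-propagate addition, modeled on adder 525
```

Calling `sum_plus_two(a, b)` gives a + b + 2 for an LSB at bit 0, while `sum_plus_two(a, b, lsb_pos=29)` adds the same two units at the higher LSB position.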

[0044] In the long path, the floating point adder unit 500 further includes a rounding logic module 528 connected to the second right shifter 520. Because data is shifted right out of the mantissa of the smaller operand in the second right shifter 520, rounding information including the LSB+1 bit, the LSB, the guard bit, the round bit, and the sticky bit, as described in the IEEE-754 standard, is calculated in parallel by the rounding logic module 528.
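
As a rough illustration of what such rounding information looks like, the Python sketch below extracts guard, round, and sticky bits from the bits lost in a right shift and applies round-to-nearest-even. It is a simplified model under stated assumptions: the function names are hypothetical, only one of the four IEEE-754 rounding modes is shown, and the LSB+1 bit and overflow handling mentioned in the text are omitted.

```python
def shift_with_grs(mantissa, shift):
    """Right-shift a mantissa and return (kept, guard, round_bit, sticky)."""
    kept = mantissa >> shift
    extended = (mantissa << 2) >> shift      # keep two extra bits below the LSB
    guard = (extended >> 1) & 1              # first bit shifted out
    round_bit = extended & 1                 # second bit shifted out
    lost = (mantissa << 2) & ((1 << shift) - 1)
    sticky = int(lost != 0)                  # OR of every bit below the round bit
    return kept, guard, round_bit, sticky

def round_nearest_even(kept, guard, round_bit, sticky):
    """Increment when above the halfway point, or exactly halfway with an odd LSB."""
    return kept + (guard & (round_bit | sticky | (kept & 1)))
```

For example, `shift_with_grs(0b1011011, 4)` returns `(5, 1, 0, 1)`, and rounding that to nearest even gives 6, consistent with 91/16 = 5.6875.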

[0045] Further in the long path, the floating point adder unit 500 includes a first-stage result selection module 531 connected to the fourth adder 524, the fifth adder 526, and the rounding logic module 528. The first-stage result selection module 531 performs the first-stage result selection step 430 of process 200. FIG. 6A is a table containing the result selection criteria and the rounding and normalization algorithm used by the first-stage result selection module 531 for addition operations. FIG. 6B is a table containing the result selection criteria and the rounding and normalization algorithm used by the first-stage result selection module 531 for subtraction operations.

[0046] As shown in FIG. 6A, the rounding and normalization performed by the first-stage result selection module is based on the overflow bit, which is the most significant bit (MSB) of the selected result, the round bit, and the bit one position above the LSB. The round bit indicates a conditional increment of the pre-normalized result in order to perform IEEE-754 compliant rounding for all rounding modes, and is calculated, in accordance with the IEEE-754 standard, from the rounding mode, the sign, the guard bit, the round bit, the sticky bit, and/or the overflow bit.

[0047] Referring to the table of FIG. 6B, a subtraction result in the range (0.5, 2) requires a one-bit left shift for normalization if the result is less than 1. In that case, the rounding position changes from the LSB to the guard bit. These two possible positions can be handled using two fill bits, at the LSB+1 position and the LSB position, combined with the A+B and A+B+2 results. For example, if the LSB and the guard bit are both 1, rounding up occurs and is reflected by selecting A+B+2 in the higher-order bits, zeroing the LSB in the process. Further details of how the fill bits are combined with the A+B and A+B+2 results to produce the long path result are given in FIG. 6B. The rounding algorithms shown in the tables of FIG. 6A and FIG. 6B support all four rounding modes described in the IEEE-754 standard.

[0048] Referring to FIG. 5, the floating point adder unit 500 further includes a selection logic module 540 that is connected to the alignment module and configured to perform the selection process 220; that is, it includes logic circuitry configured to determine whether to select the result from the short path or from the long path based on the selection criteria in Table III or Table IV. The floating point adder unit 500 further includes a result selection module 550 connected to the left shifter 524 in the short path, the first-stage result selection module 531 in the long path, and the selection logic module 540. The result selection module 550 is configured to perform the result selection step 250 of process 200; that is, it includes logic circuitry configured to select between the result from the short path and the result from the long path based on the decision from the selection logic module 540.
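
The path decision can be summarized in a few lines of Python. This sketch follows the criteria stated in the abstract and claims rather than Tables III and IV, which are not reproduced in this section. The function name `select_path`, the string return values, and the `short_includes_threshold` flag (covering the variant in which a mantissa of exactly 1.5 goes to the short path) are assumptions for illustration.

```python
def select_path(is_subtraction, exp_a, exp_b, man_a, man_b,
                short_includes_threshold=False):
    """Return "short" or "long" for operands with mantissas in [1, 2)."""
    if not is_subtraction:
        return "long"                  # every addition uses the long path
    diff = abs(exp_a - exp_b)
    if diff == 0:
        return "short"                 # massive cancellation is possible
    if diff > 1:
        return "long"                  # at most a one-bit normalization shift
    # Exponent difference is exactly 1: decide on the larger operand's mantissa.
    larger_man = man_a if exp_a > exp_b else man_b
    if short_includes_threshold:
        return "short" if larger_man <= 1.5 else "long"
    return "short" if larger_man < 1.5 else "long"
```

For instance, `select_path(True, 10, 9, 1.25, 1.9)` yields `"short"`, while the same call with a larger-operand mantissa of 1.75 yields `"long"`.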

[0049] EFFECTS OF THE INVENTION: The present invention reduces hardware cost by eliminating the incrementer in pipeline stage 3 of the short path of the conventional adder shown in FIG. 1.

[0050] The present invention further eliminates the time delay associated with the rounding step in the short path of a conventional floating point adder.

[Brief description of drawings]

FIG. 1 is a block diagram of a conventional floating point adder unit having parallel pipeline paths.

FIG. 2 is a flow chart showing a process for performing an addition or subtraction operation on two floating point operands according to one embodiment of the present invention.

FIG. 3 is a flow chart showing the short path process within the process for performing an addition or subtraction operation on two floating point operands according to one embodiment of the present invention.

FIG. 4 is a flow chart showing the long path process within the process for performing an addition or subtraction operation on two floating point operands according to one embodiment of the present invention.

FIG. 5 is a schematic block diagram of a floating point adder unit having two parallel data paths according to the present invention.

FIG. 6(A) is a table containing the criteria for selecting an addition result in the long path of the floating point adder unit according to the present invention, and FIG. 6(B) is a table containing the criteria for selecting a subtraction result in the long path of the floating point adder unit according to the present invention.

[Explanation of symbols]

200: process
210: alignment step
220: selection process
230: short path process
240: long path process
250: result selection step
312: leading zero prediction step
314: alignment and swap step
315: leading zero vector selection step
316: mantissa comparison step
318: swap step
320: mantissa subtraction step
322: leading zero calculation step
324: normalization step
330: leading zero correction step

Continued from front page: (72) Inventor Atul Dhablania, 5817 Glenniegles Drive, San Jose, California 95138, United States. (72) Inventor Warren James, 591 Pamler Avenue, San Jose, California 95128, United States. F-term (reference): 5B016 AA01 AA02 BA04 BB02 CA01 CD01

Claims

[Claims]

1. A method of selecting a path for a subtraction operation involving two operands, each having a mantissa and an exponent, in a floating point adder having a long path and a short path, the method comprising: selecting the long path in response to the difference between the exponents of the two operands being greater than 1; selecting the short path in response to the difference between the exponents of the two operands being zero; selecting the short path in response to the difference between the exponents of the two operands being 1 and the mantissa of the operand having the larger exponent being within a first predetermined range of numbers; and selecting the long path in response to the difference between the exponent of the larger operand and the exponent of the smaller operand being 1 and the mantissa of the larger operand being within a second predetermined range of numbers.

2. The method of claim 1, wherein the first predetermined range of numbers consists of numbers less than 1.5, and the second predetermined range of numbers consists of numbers greater than or equal to 1.5.

3. The method of claim 1, wherein the first predetermined range of numbers consists of numbers less than or equal to 1.5, and the second predetermined range of numbers consists of numbers greater than 1.5.

4. A method of selecting, in a floating point adder unit having two parallel data paths, a short path and a long path, each producing a result for a floating point operation involving two operands each having a mantissa and an exponent, between the result produced by the short path and the result produced by the long path, the method comprising: selecting the result produced by the long path in response to the floating point operation being an addition operation; selecting the result produced by the long path in response to the floating point operation being a subtraction operation and the difference between the exponents of the two operands being greater than 1; selecting the result produced by the short path in response to the floating point operation being a subtraction operation and the difference between the exponents of the two operands being zero; selecting the result produced by the short path in response to the floating point operation being a subtraction operation, the difference between the exponents of the two operands being 1, and the mantissa of the operand having the larger exponent being within a first predetermined range of numbers; and selecting the result produced by the long path in response to the floating point operation being a subtraction operation, the difference between the exponents of the two operands being 1, and the mantissa of the operand having the larger exponent being within a second predetermined range of numbers.

5. The method of claim 4, wherein the first predetermined range of numbers consists of numbers less than 1.5, and the second predetermined range of numbers consists of numbers greater than or equal to 1.5.

6. The method of claim 4, wherein the first predetermined range of numbers consists of numbers less than or equal to 1.5, and the second predetermined range of numbers consists of numbers greater than 1.5.

7. A floating point adder unit comprising a short path and a long path, wherein said short path includes no means for rounding the subtraction result.

8. A floating point adder unit for performing a floating point operation on two operands, comprising: two parallel data paths, a short path and a long path, each receiving said two operands and each producing a possible result for said floating point operation on said two operands; and a selection logic module operating in parallel with each of the data paths, including logic circuitry configured to determine, using the method of any of claims 3, 4 or 5, whether to select the possible result from the short path or the possible result from the long path as the result of the floating point operation.

9. The floating point adder unit of claim 7, further comprising a result selection module connected to said short path, said long path, and said selection logic module, the result selection module including logic circuitry configured to select, as the result of said floating point operation, between the possible result from the short path and the possible result from the long path based on said decision made by said selection logic module.

10. The floating point adder unit of claim 8, wherein the short path does not include means for rounding the subtraction result.