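WO2013145276A1 - Arithmetic processing unit and method for controlling arithmetic processing unit - Google Patents
Arithmetic processing unit and method for controlling arithmetic processing unit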
- Publication number
- WO2013145276A1 (PCT/JP2012/058646)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- coefficient
- arithmetic processing
- input data
- instruction
- data
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/544—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
- G06F7/556—Logarithmic or exponential functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/483—Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2207/00—Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F2207/483—Indexing scheme relating to group G06F7/483
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2207/00—Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F2207/556—Indexing scheme relating to group G06F7/556
Definitions
- the present invention relates to an arithmetic processing unit and a control method for the arithmetic processing unit.
- the coefficient calculation process that precedes the Taylor series operation of the exponential function is implemented in software. It combines multiple conventional instructions to transfer data between floating-point registers and integer registers and to perform many operations on the integer arithmetic units, such as arithmetic and shift operations. For this reason, a large number of instructions are required to process the entire exponential function calculation, which squeezes instruction issue throughput and lowers performance.
- an arithmetic processing device that can execute a Taylor series operation at high speed has been proposed: a set of coefficient tables that store coefficient data for Taylor series operations of mathematical functions is stored in a dedicated memory, and the coefficient data required for a Taylor series operation are read directly from the coefficient table and supplied to the floating-point multiply-add operator.
- an arithmetic processing unit has also been proposed that provides a dedicated trigonometric function auxiliary instruction, which determines the Taylor series expansion function and calculates the input argument to that function before the Taylor series operation of the trigonometric function is executed (for example, refer to Patent Document 2).
- the present invention aims to speed up the calculation of exponential functions.
- an arithmetic processing device includes: an exponent generation unit that generates, based on a first part of the input data, the exponent part of a coefficient expressed in floating-point number format, the coefficient being the one obtained when an exponential function is decomposed into a series operation and a coefficient for the series operation; a storage unit that stores the mantissa part of the coefficient; a constant generation unit that reads constant data corresponding to a second part of the input data from the storage unit; and a selection unit that selects and outputs the constant data from the constant generation unit when the instruction to be executed is a coefficient calculation instruction that calculates the coefficient of the exponential function.
- FIG. 1 is a diagram illustrating a configuration example of an arithmetic processing device according to an embodiment of the present invention.
- FIG. 2 is a diagram illustrating an example of a constant table.
- FIG. 3 is a diagram for explaining a coefficient calculation process in the present embodiment.
- FIG. 4 is a diagram illustrating another configuration example of the arithmetic processing device according to the present embodiment.
- FIG. 1 is a diagram showing a configuration example of an arithmetic processing unit connected to a memory (main memory) as a main storage device according to an embodiment of the present invention.
- the arithmetic processing device according to the present embodiment is connected to a memory (main memory) 11 outside the arithmetic processing device, and includes a cache memory 12 that stores a part of the data in the main memory 11, a renaming register 13, a register file 14, and bypass data 15.
- the arithmetic processing apparatus according to the present embodiment includes multiplexers 16 to 18, 23, a first arithmetic unit 19, and a second arithmetic unit 20.
- the information processing apparatus includes at least an arithmetic processing unit and a memory 11.
- the register file 14 includes all the registers used when the arithmetic units 19 and 20 execute arithmetic operations.
- the renaming register 13 is provided to eliminate anti-dependences and output dependences between operand data.
- the bypass data 15 is data (operation result data) used in bypassing for eliminating a data hazard in the instruction pipeline of the arithmetic processing unit.
- the register value stored in the entry of the renaming register 13 is moved to the register file 14 at the time of retirement.
- the multiplexer 16 receives three types of operand data: data output from the register file 14, data output from the renaming register 13, and bypass data 15. The multiplexer 16 selects any one of the three types of operand data and outputs it as source data rs1. Multiplexers 17 and 18 receive three types of operand data as in multiplexer 16, and output selected operand data as source data rs2 and rs3.
- the first arithmetic unit 19 is a floating-point multiply-accumulate arithmetic unit. Using the source data rs1, rs2, and rs3 input from the multiplexers 16, 17, and 18, it performs a product-sum operation that adds the value of rs3 to the product of the value of rs1 and the value of rs2. For example, the first arithmetic unit 19 performs a Taylor series operation by performing product-sum operations using the input source data rs1, rs2, and rs3.
- the second arithmetic unit 20 is an arithmetic unit that performs arithmetic processing related to an exponential function operation auxiliary instruction (fexpad).
- the second arithmetic unit 20 uses the source data rs2 input from the multiplexer 17 to perform a coefficient calculation process, which obtains the coefficient when the exponential function is decomposed into a Taylor series operation and a coefficient for the Taylor series operation.
- the Taylor series operation of that decomposition is performed by, for example, the first arithmetic unit 19.
- the second arithmetic unit 20 performs the coefficient calculation process using the source data rs2, but this is an example.
- the second computing unit 20 may perform the coefficient calculation process using the source data rs1, or may perform the coefficient calculation process using the source data rs3.
- the exponential function operation auxiliary instruction is an instruction that performs the operation {1'b0, sdat[16:6], Texp[sdat[5:0]][51:0]}, where sdat is the input source data.
- {1'b0, sdat[16:6], Texp[sdat[5:0]][51:0]} conforms to the data format of IEEE 754 double-precision floating-point numbers. That is, bit 63, the sign part (sign bit), is set to “0”, and bits 62 to 52, the exponent part (exponent bits), are set to bits 16 to 6 of the source data sdat.
- bits 51 to 0, the mantissa part (mantissa bits), are set to bits 51 to 0 of the data extracted from the constant table Texp at the index indicated by bits 5 to 0 of the source data sdat.
- the constant table Texp is provided as a constant table 21 included in the second computing unit 20.
- the constant table Texp is a 64-entry constant table storing a value of (2 ** (i / 64)) according to the data format of IEEE 754 double-precision floating-point number. “**” indicates a power and i is an integer in the range of 0 to 63.
- the value of (2 ** (i / 64)) for i = 0 to 63 is expressed in the IEEE 754 double-precision floating-point data format as shown in FIG.
- in that representation, the sign part s and the exponent part e are the same regardless of the value of i.
- the constant table Texp should store at least the value fi of the mantissa part f of the value of (2 ** (i / 64)). By storing only the mantissa part instead of all the values of (2 ** (i / 64)), the storage capacity required for the constant table Texp can be reduced.
- the instruction type code 22 is input to the multiplexer 23 as the selection signal SEL.
- the multiplexer 23 outputs either the output of the first calculator 19 or the output of the second calculator 20 in accordance with the selection signal SEL.
- when the instruction type code indicates an exponential function operation auxiliary instruction (fexpad), the value of the selection signal SEL is set to “1”, so that the multiplexer 23 selects and outputs the output of the second arithmetic unit 20.
- when the instruction type code does not indicate an exponential function operation auxiliary instruction (fexpad), the value of the selection signal SEL is set to “0”, so that the multiplexer 23 selects and outputs the output of the first arithmetic unit 19.
- exp (x) is decomposed into Taylor series calculation and coefficients for Taylor series calculation as follows.
- exp (y2) can be calculated by a Taylor series operation because sufficient accuracy is obtained with a finite order, and (2 ** z) is calculated as a coefficient. That is, in the calculation of the exponential function exp (x), exp (y2) is calculated by the Taylor series operation in the first arithmetic unit 19, and (2 ** z) is calculated by the coefficient calculation process in the second arithmetic unit 20.
- zi = int (x / log (2) * 64 + bias * 64 + 0.5).
- int (x) represents a value when the value x is rounded down to an integer.
- the value expressed as follows corresponds to the value of p + bias.
- the mantissa part of the coefficient 2 ** z is obtained by extracting the data from the constant table by the index. Therefore, the coefficient 2 ** z can be calculated by the operation {1'b0, sdat[16:6], Texp[sdat[5:0]][51:0]}.
- zi is stored from the floating-point register into memory by the instruction C1, and the value stored in memory is loaded into an integer register as zii by the instruction C2.
- by the instruction C3, a bitwise AND operation between zii and the value 63 is performed, and the result is assigned to Texpe.
- the instruction C4 shifts Texpe left by 3 bits and assigns the result to Texpo. This is because addresses must be specified at 8-byte intervals when referring to the table.
- by the instruction C5, the table is referenced using the address obtained by adding the offset Texpo to the base address Texpb of the memory area in which the table is stored, and the data read from the table is assigned to p2zi.
- the value 2047 is assigned to p2zmm by the instruction C6.
- in this way, 11 bits of mask data are created and assigned to p2zmm.
- the instruction C7 shifts p2zmm left by 6 bits.
- by the instruction C8, a bitwise AND operation between zii and p2zmm is performed and the result is assigned to p2zm. The instruction C9 then shifts p2zm left by 46 bits, which gives the exponent part of the coefficient.
- by the instruction C10, a bitwise OR operation is performed on the exponent part of the coefficient thus obtained and p2zi, the data read from the table, and the result is assigned to p2zi. Thereby, the value of the coefficient 2 ** z for the Taylor series operation is obtained.
- p2z [63] is “0”
- p2z [62:52] is zi [16:6]
- p2z [51:0] is the data output from the constant table 21 according to zi [5:0] (the mantissa part of 2 ** (i / 64)). By concatenating the bits in this way, the coefficient calculation process for the coefficient 2 ** z for the Taylor series operation is performed.
- the coefficient calculation process for the Taylor series operation can be performed with one exponential function operation auxiliary instruction (fexpad), which eliminates eleven assembler instructions compared with the conventional method.
- the coefficient calculation process for the Taylor series operation in the exponential function calculation can be performed with one instruction, and the exponential function calculation can be speeded up. Therefore, the instruction throughput in the arithmetic processing unit can be improved and the performance can be improved.
- the only circuits newly added to a general arithmetic processing apparatus in order to execute the operation of the exponential function operation auxiliary instruction (fexpad) are the second arithmetic unit 20 and the multiplexer 23. Therefore, adding only a small amount of circuitry speeds up the coefficient calculation process when the exponential function is decomposed into a Taylor series operation and a coefficient for the Taylor series operation, and thus speeds up the exponential function calculation.
- the calculation performance of the coefficient calculation process when the exponential function is decomposed into a Taylor series operation and a coefficient for the Taylor series operation is improved by a factor of 9 (when there are two pipelines of integer arithmetic units and two pipelines of floating-point arithmetic units). Furthermore, the conventional method must execute load/store instructions for the table reference in the coefficient calculation process, so a cache miss may occur. The present embodiment needs no such memory access, so the calculation performance of the coefficient calculation process for the Taylor series operation is improved by a factor of 9 or more.
- the arithmetic processing apparatus is not limited to the configuration shown in FIG. 1.
- the arithmetic processing apparatus may also be a SIMD (Single Instruction stream-Multiple Data stream) type arithmetic processing apparatus.
- FIG. 4 shows a 2 SIMD arithmetic processing unit as an example.
- the first arithmetic processing unit includes a renaming register 13A, a register file 14A, bypass data 15A, multiplexers 16A to 18A, 23A, a first arithmetic unit 19A, and a second arithmetic unit 20A.
- the second arithmetic processing unit includes a renaming register 13B, a register file 14B, bypass data 15B, multiplexers 16B to 18B, 23B, a first arithmetic unit 19B, and a second arithmetic unit 20B.
- the arithmetic processing apparatus is configured from these two units, and the same arithmetic processing is executed in parallel by the first arithmetic processing unit and the second arithmetic processing unit with one instruction for two pieces of data.
- FIG. 4 illustrates a 2 SIMD arithmetic processing apparatus, but a configuration such as 4 SIMD or 8 SIMD is also possible by providing further arithmetic processing units.
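The coefficient calculation described in the embodiment above can be modeled in software. The following Python sketch is only an illustration under stated assumptions, not the circuit itself: the names `TEXP`, `make_sdat`, and `coefficient_from_sdat` are hypothetical, the table is filled with Python's own `2.0 ** (i / 64)` values, and the bias is assumed to be 1023 (the IEEE 754 double-precision exponent bias).

```python
import math
import struct

BIAS = 1023                      # assumed: IEEE 754 double-precision exponent bias
MANTISSA_MASK = (1 << 52) - 1    # low 52 bits of a double


def double_to_bits(value: float) -> int:
    """Reinterpret a double as a 64-bit unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def bits_to_double(bits: int) -> float:
    """Reinterpret a 64-bit unsigned integer as a double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


# 64-entry constant table: only the mantissa part of 2**(i/64) is kept.
TEXP = [double_to_bits(2.0 ** (i / 64)) & MANTISSA_MASK for i in range(64)]


def make_sdat(x: float) -> int:
    """zi = int(x / log(2) * 64 + bias * 64 + 0.5), rounded down."""
    return math.floor(x / math.log(2) * 64 + BIAS * 64 + 0.5)


def coefficient_from_sdat(sdat: int) -> float:
    """Model of {1'b0, sdat[16:6], Texp[sdat[5:0]][51:0]}."""
    exponent = (sdat >> 6) & 0x7FF   # sdat[16:6] becomes bits 62..52
    index = sdat & 0x3F              # sdat[5:0] selects the table entry
    return bits_to_double((exponent << 52) | TEXP[index])


if __name__ == "__main__":
    for x in (0.0, 1.0, math.log(2)):
        print(x, coefficient_from_sdat(make_sdat(x)))
```

For x = 0 the model yields the coefficient 1.0, and for x = log(2) it yields 2.0, since only the exponent field changes while the table index stays at 0.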
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Analysis (AREA)
- Computational Mathematics (AREA)
- Computing Systems (AREA)
- Mathematical Optimization (AREA)
- General Engineering & Computer Science (AREA)
- Nonlinear Science (AREA)
- Complex Calculations (AREA)
- Advance Control (AREA)
Abstract
Description
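Unit) or a similar arithmetic processing device generally computes functions such as the exponential function by means of a Taylor series operation. To obtain sufficient accuracy when the Taylor series operation for the exponential function is truncated at some finite order, the exponential function must be decomposed into a coefficient and a Taylor series operation, that is, a Taylor series computation that converges to the given accuracy at a finite order.
In the following description, "**" denotes exponentiation, "!" denotes factorial, and "*" denotes multiplication. "log2()" denotes the base-2 logarithm, and "log()" denotes the logarithm to base e (Napier's number).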
exp(x)
=(2**log2(e))**x
=(2**(1/log(2)))**x
=2**(x/log(2))
=2**(y+z)
=(2**y)*(2**z)
=exp(log(2**y))*(2**z)
=exp(y*log(2))*(2**z)
=exp(y2)*(2**z)
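As a numerical check of the derivation above, the following Python sketch splits x/log(2) into y + z and recombines exp(y2) * (2 ** z). It is an illustration under assumptions that the derivation itself does not state: z is taken as the multiple of 1/64 nearest to x/log(2) (matching a 64-entry table), the Taylor series is truncated at order 6, and the function names are hypothetical.

```python
import math


def split_exponent(x: float, steps: int = 64):
    """Return (y2, z) with x / log(2) = y + z and y2 = y * log(2)."""
    t = x / math.log(2)
    z = math.floor(t * steps + 0.5) / steps   # nearest multiple of 1/steps
    y2 = (t - z) * math.log(2)                # small remainder for the series
    return y2, z


def exp_by_decomposition(x: float, order: int = 6) -> float:
    """exp(x) = exp(y2) * (2 ** z), with exp(y2) from a truncated Taylor series."""
    y2, z = split_exponent(x)
    acc = 1.0
    # Horner form of 1 + y2/1! + y2**2/2! + ... + y2**order/order!
    for k in range(order, 0, -1):
        acc = acc * y2 / k + 1.0
    return acc * 2.0 ** z


if __name__ == "__main__":
    for x in (-1.0, 0.5, 10.0):
        print(x, exp_by_decomposition(x), math.exp(x))
```

Because |y2| stays below roughly log(2)/128 in this sketch, a low-order series is enough, which is the reason for separating out the coefficient 2 ** z.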
stdf zi,[] …(instruction C1)
ldx [],zii …(instruction C2)
and zii,63,Texpe …(instruction C3)
sllx Texpe,3,Texpo …(instruction C4)
ldx [Texpb+Texpo],p2zi …(instruction C5)
mov 2047,p2zmm …(instruction C6)
sllx p2zmm,6,p2zmm …(instruction C7)
and zii,p2zmm,p2zm …(instruction C8)
sllx p2zm,46,p2zm …(instruction C9)
or p2zi,p2zm,p2zi …(instruction C10)
stdx p2zi,[] …(instruction C11)
lddf [],p2z …(instruction C12)
fexpad zi,p2z …(instruction I1)
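The integer steps of the conventional sequence (instructions C3 to C10 in the listing above) can be traced in a short Python sketch and compared with the single bit concatenation that the instruction I1 stands for. This is an illustrative model under assumptions: the in-memory table is assumed to hold 8-byte entries containing only the mantissa bits of 2 ** (i / 64) (otherwise the OR in C10 would not yield the intended exponent), the base address Texpb is taken as offset 0, and the Python function names are hypothetical. The register transfers C1, C2, C11, and C12 are omitted.

```python
import struct

MASK64 = (1 << 64) - 1
MANTISSA_MASK = (1 << 52) - 1


def mantissa_bits(i: int) -> int:
    """Low 52 bits of the IEEE 754 double 2**(i/64)."""
    raw = struct.unpack(">Q", struct.pack(">d", 2.0 ** (i / 64)))[0]
    return raw & MANTISSA_MASK


# Assumed memory image of the table: 64 entries of 8 bytes each.
TABLE = b"".join(struct.pack(">Q", mantissa_bits(i)) for i in range(64))


def software_coefficient_bits(zii: int) -> int:
    """Trace of the integer instructions C3 to C10."""
    texpe = zii & 63                                   # C3: table index
    texpo = (texpe << 3) & MASK64                      # C4: byte offset (8-byte entries)
    p2zi = struct.unpack_from(">Q", TABLE, texpo)[0]   # C5: load from Texpb + Texpo
    p2zmm = 2047                                       # C6: 11-bit mask
    p2zmm = (p2zmm << 6) & MASK64                      # C7: align mask with zii[16:6]
    p2zm = zii & p2zmm                                 # C8: extract exponent bits
    p2zm = (p2zm << 46) & MASK64                       # C9: move them to bits 62..52
    p2zi = p2zi | p2zm                                 # C10: merge exponent and mantissa
    return p2zi


def concatenated_bits(zii: int) -> int:
    """Direct model of {1'b0, zi[16:6], Texp[zi[5:0]][51:0]}."""
    return (((zii >> 6) & 0x7FF) << 52) | mantissa_bits(zii & 0x3F)


if __name__ == "__main__":
    zii = (1023 << 6) | 32
    print(hex(software_coefficient_bits(zii)), hex(concatenated_bits(zii)))
```

Both functions produce the same 64-bit pattern for every 17-bit input in this model, which is what allows the multi-instruction sequence to be replaced by one instruction.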
12 Cache memory
13 Renaming register
14 Register file
15 Bypass data
16, 17, 18, 23 Multiplexer
19 Product-sum arithmetic unit
20 Arithmetic unit
21 Constant table
22 Instruction type code
Claims (6)
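- An arithmetic processing device comprising:
an exponent generation unit that generates, based on a first part of input data, the exponent part of a coefficient expressed in floating-point number format, the coefficient being obtained when an exponential function is decomposed into a series operation and a coefficient for the series operation;
a storage unit that stores the mantissa part of the coefficient;
a constant generation unit that reads, from the storage unit, constant data corresponding to a second part of the input data; and
a selection unit that selects and outputs the constant data from the constant generation unit when the instruction to be executed is a coefficient calculation instruction that calculates the coefficient of the exponential function.
- The arithmetic processing device according to claim 1, wherein the storage unit stores, as the constant data, the mantissa part of the value of (2**(i/(2**bit width of the second part))) (** denotes exponentiation) expressed in floating-point number format, corresponding to the value i (i is a natural number) indicated by the second part of the input data.
- The arithmetic processing device according to claim 2, wherein the first part of the input data is the (n+11)th to (n+1)th bits of the input data (n is a natural number), and the second part of the input data is the nth to 0th bits of the input data.
- The arithmetic processing device according to claim 3, wherein the (n+11)th to (n+1)th bits of the input data are used as the exponent part of the coefficient expressed in floating-point number format, and the constant data obtained by referring to the storage unit with the nth to 0th bits of the input data is used as the mantissa part of the coefficient expressed in floating-point number format.
- The arithmetic processing device according to any one of claims 1 to 4, further comprising a product-sum arithmetic unit that performs a product-sum operation using the input data, wherein, when the instruction to be executed is an instruction other than the coefficient calculation instruction, the product-sum operation result from the product-sum arithmetic unit, which is the result of the product-sum operation using the input data, is selected and output.
- A method for controlling an arithmetic processing device, wherein:
an exponent generation unit of the arithmetic processing device generates, based on a first part of input data, the exponent part of a coefficient expressed in floating-point number format, the coefficient being obtained when an exponential function is decomposed into a series operation and a coefficient for the series operation;
a constant generation unit of the arithmetic processing device reads constant data corresponding to a second part of the input data from a storage unit that stores the mantissa part of the coefficient; and
a selection unit of the arithmetic processing device selects and outputs the constant data from the constant generation unit when the instruction to be executed is a coefficient calculation instruction that calculates the coefficient of the exponential function.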
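The claims describe the bit split in terms of a parameter n rather than the fixed 6-bit index of the embodiment. A minimal Python sketch of that parameterization follows. It is an assumption-laden illustration: the function name `coefficient_general` is hypothetical, the table entry is computed on the fly with Python's `**` operator instead of being read from a storage unit, and the IEEE 754 double-precision layout is assumed.

```python
import struct

MANTISSA_MASK = (1 << 52) - 1


def coefficient_general(data: int, n: int) -> float:
    """Exponent from bits (n+11)..(n+1), mantissa selected by bits n..0."""
    width = n + 1                                # bit width of the second part
    index = data & ((1 << width) - 1)            # second part: bits n..0
    exponent = (data >> width) & 0x7FF           # first part: bits (n+11)..(n+1)
    # Mantissa part of 2**(i / 2**width), computed here instead of stored.
    entry = struct.unpack("<Q", struct.pack("<d", 2.0 ** (index / (1 << width))))[0]
    bits = (exponent << 52) | (entry & MANTISSA_MASK)
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


if __name__ == "__main__":
    # n = 5 corresponds to the 64-entry table of the embodiment.
    print(coefficient_general((1023 << 6) | 32, 5))
    # n = 7 would correspond to a 256-entry table.
    print(coefficient_general((1023 << 8) | 128, 7))
```

Both calls select the midpoint of their respective tables, so they print the same value, 2 ** 0.5.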
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2014507237A JP5794385B2 (ja) | 2012-03-30 | 2012-03-30 | Arithmetic processing unit and method for controlling arithmetic processing unit |
EP12872640.3A EP2833258B1 (en) | 2012-03-30 | 2012-03-30 | Arithmetic processing unit and method for controlling arithmetic processing unit |
CN201280071614.8A CN104169866B (zh) | 2012-03-30 | 2012-03-30 | Arithmetic processing unit and method for controlling arithmetic processing unit |
PCT/JP2012/058646 WO2013145276A1 (ja) | 2012-03-30 | 2012-03-30 | Arithmetic processing unit and method for controlling arithmetic processing unit |
US14/479,392 US9477442B2 (en) | 2012-03-30 | 2014-09-08 | Processor and control method of processor |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2012/058646 WO2013145276A1 (ja) | 2012-03-30 | 2012-03-30 | Arithmetic processing unit and method for controlling arithmetic processing unit |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/479,392 Continuation US9477442B2 (en) | 2012-03-30 | 2014-09-08 | Processor and control method of processor |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2013145276A1 (ja) | 2013-10-03 |
Family
ID=49258639
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2012/058646 WO2013145276A1 (ja) | 2012-03-30 | 2012-03-30 | Arithmetic processing unit and method for controlling arithmetic processing unit |
Country Status (5)
Country | Link |
---|---|
US (1) | US9477442B2 (ja) |
EP (1) | EP2833258B1 (ja) |
JP (1) | JP5794385B2 (ja) |
CN (1) | CN104169866B (ja) |
WO (1) | WO2013145276A1 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3118737A1 (en) | 2015-07-16 | 2017-01-18 | Fujitsu Limited | Arithmetic processing device and method of controlling arithmetic processing device |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108595147B (zh) * | 2018-01-02 | 2021-03-23 | 上海兆芯集成电路有限公司 | Microprocessor with series operation execution circuit |
JP6933810B2 (ja) * | 2018-01-24 | 2021-09-08 | 富士通株式会社 | Arithmetic processing unit and control method for arithmetic processing unit |
US11372621B2 (en) | 2020-06-04 | 2022-06-28 | Apple Inc. | Circuitry for floating-point power function |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH11259675A (ja) * | 1998-03-06 | 1999-09-24 | Nec Corp | Exponential function calculation method and apparatus for speeding up light source calculation in a three-dimensional graphics processing device |
US20020087608A1 (en) * | 1996-12-17 | 2002-07-04 | Rarick Leonard D. | Apparatus for computing transcendental functions quickly |
US20040010532A1 (en) * | 2002-07-09 | 2004-01-15 | Silicon Integrated Systems Corp. | Apparatus and method for computing a logarithm of a floating-point number |
JP2008234076A (ja) | 2007-03-16 | 2008-10-02 | Fujitsu Ltd | Arithmetic processing device |
JP2011013728A (ja) | 2009-06-30 | 2011-01-20 | Fujitsu Ltd | Arithmetic processing device |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000099493A (ja) * | 1998-09-18 | 2000-04-07 | Fuji Xerox Co Ltd | Error function calculation device |
US7509363B2 (en) * | 2001-07-30 | 2009-03-24 | Ati Technologies Ulc | Method and system for approximating sine and cosine functions |
US7143126B2 (en) * | 2003-06-26 | 2006-11-28 | International Business Machines Corporation | Method and apparatus for implementing power of two floating point estimation |
US9128790B2 (en) * | 2009-01-30 | 2015-09-08 | Intel Corporation | Digital signal processor having instruction set with an exponential function using reduced look-up table |
-
2012
- 2012-03-30 WO PCT/JP2012/058646 patent/WO2013145276A1/ja active Application Filing
- 2012-03-30 CN CN201280071614.8A patent/CN104169866B/zh active Active
- 2012-03-30 JP JP2014507237A patent/JP5794385B2/ja active Active
- 2012-03-30 EP EP12872640.3A patent/EP2833258B1/en active Active
-
2014
- 2014-09-08 US US14/479,392 patent/US9477442B2/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020087608A1 (en) * | 1996-12-17 | 2002-07-04 | Rarick Leonard D. | Apparatus for computing transcendental functions quickly |
JPH11259675A (ja) * | 1998-03-06 | 1999-09-24 | Nec Corp | Exponential function calculation method and apparatus for speeding up light source calculation in a three-dimensional graphics processing device |
US20040010532A1 (en) * | 2002-07-09 | 2004-01-15 | Silicon Integrated Systems Corp. | Apparatus and method for computing a logarithm of a floating-point number |
JP2008234076A (ja) | 2007-03-16 | 2008-10-02 | Fujitsu Ltd | Arithmetic processing device |
JP2011013728A (ja) | 2009-06-30 | 2011-01-20 | Fujitsu Ltd | Arithmetic processing device |
Non-Patent Citations (2)
Title |
---|
See also references of EP2833258A4 * |
TANG: "Table-Driven Implementation of the Exponential Function in IEEE Floating-Point Arithmetic", ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, vol. 15, no. 2, June 1989 (1989-06-01), pages 144 - 157, XP055152038 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3118737A1 (en) | 2015-07-16 | 2017-01-18 | Fujitsu Limited | Arithmetic processing device and method of controlling arithmetic processing device |
US10037188B2 (en) | 2015-07-16 | 2018-07-31 | Fujitsu Limited | Arithmetic processing device and method of controlling arithmetic processing device |
Also Published As
Publication number | Publication date |
---|---|
US20140379772A1 (en) | 2014-12-25 |
JPWO2013145276A1 (ja) | 2015-08-03 |
CN104169866A (zh) | 2014-11-26 |
JP5794385B2 (ja) | 2015-10-14 |
US9477442B2 (en) | 2016-10-25 |
CN104169866B (zh) | 2017-08-29 |
EP2833258A4 (en) | 2015-03-18 |
EP2833258A1 (en) | 2015-02-04 |
EP2833258B1 (en) | 2016-09-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
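KR102447636B1 (ko) | Apparatus and method for performing arithmetic operations to accumulate floating-point numbers | |
JP2018500635A (ja) | Data processing apparatus and method using programmable significance data | |
JP6933810B2 (ja) | Arithmetic processing unit and control method for arithmetic processing unit | |
JP5640081B2 (ja) | Integer multiply and multiply-add operations with saturation | |
JP5794385B2 (ja) | Arithmetic processing unit and method for controlling arithmetic processing unit | |
JP5304483B2 (ja) | Arithmetic processing device | |
JP4476210B2 (ja) | Data processing apparatus and method for obtaining an initial estimate of the result value of a reciprocal operation | |
JP5193358B2 (ja) | Polynomial data processing operation | |
US10776207B2 (en) | Load exploitation and improved pipelineability of hardware instructions | |
TWI653577B (zh) | Apparatus integrating arithmetic and logic processing | |
Schulte et al. | Floating-point division algorithms for an x86 microprocessor with a rectangular multiplier | |
TW201905845A (zh) | Method and apparatus for computing full-precision and partial-precision values | |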
Fiolhais et al. | An efficient exact fused dot product processor in FPGA | |
Jaiswal et al. | Taylor series based architecture for quadruple precision floating point division | |
EP3118737B1 (en) | Arithmetic processing device and method of controlling arithmetic processing device | |
JP6604393B2 (ja) | ベクトルプロセッサ、演算実行方法、プログラム | |
Rudnicki et al. | FPGA implementation of the multiplication operation in multiple-precision arithmetic | |
Kakde et al. | FPGA implementation of 128-bit fused multiply add unit for crypto processors | |
Ravi et al. | Analysis and study of different multipliers to design floating point MAC units for digital signal processing applications | |
EP2884403A1 (en) | Apparatus and method for calculating exponentiation operations and root extraction | |
CN114327360A (zh) | Arithmetic unit, floating-point number calculation method, apparatus, chip and computing device | |
Jaiswal et al. | Taylor series based architecture for Quadruple Precision | |
JPH01300338A (ja) | Floating-point multiplier | |
Amaricai et al. | FPGA implementations of low precision floating point multiply-accumulate | |
Raut et al. | Floating-Point Multiplier for DSP Using Vertically and Crosswise Algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 12872640 Country of ref document: EP Kind code of ref document: A1 |
|
REEP | Request for entry into the european phase |
Ref document number: 2012872640 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2012872640 Country of ref document: EP |
|
ENP | Entry into the national phase |
Ref document number: 2014507237 Country of ref document: JP Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |