GB2522194A - Multiply adder - Google Patents

Info

Publication number
GB2522194A
Authority
GB
United Kingdom
Prior art keywords
value
exponent
product
mantissa
exponent value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB1400644.9A
Other versions
GB2522194B (en)
GB201400644D0 (en)
Inventor
David Raymond Lutz
Neil Burgess
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ARM Ltd
Original Assignee
ARM Ltd
Advanced Risc Machines Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ARM Ltd, Advanced Risc Machines Ltd
Priority to GB1400644.9A (GB2522194B)
Publication of GB201400644D0
Priority to US14/566,981 (US9696964B2)
Priority to CN201510005354.XA (CN104778028B)
Priority to KR1020150001770A (KR102318494B1)
Publication of GB2522194A
Application granted
Publication of GB2522194B
Status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/5443Sum of products
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50Adding; Subtracting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/483Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52Multiplying; Dividing

Abstract

A floating point multiply add circuit (24) includes a multiplier (26) and an adder (28) for computing A+(B*C). The input operands A, B and C together with the result value all have a normal exponent value range, such as a range consistent with the IEEE Standard 754. The product value is passed from the multiplier (26) to the adder (28) with an extended exponent value range that extends lower than the normal exponent value range. Shifters (Fig 4, 48, 50) within the adder can take account of the extended exponent value range of the product as necessary in order to bring the result value back into the normal exponent value range. The product may be unrounded. Count-leading-zero circuitry may also be used to count the leading zeros in the mantissas of B and C, with each count used to left shift the corresponding mantissa if the count is greater than zero. The adder may comprise a shifter to align the addend with the product. The multiplier may generate a flag to indicate that the product has an exponent value lower than the exponent value range.

Description

MULTIPLY ADDER
This disclosure relates to the field of data processing systems. More particularly, this disclosure relates to multiply add arithmetic within data processing systems.
It is known to provide data processing systems with arithmetic circuitry that performs multiply add operations of the form A+(B*C), where A, B and C are all floating point numbers.
The input operands and the output results will typically have an expected format specifying an exponent value and a mantissa value for the floating point number concerned. The number of bits used to represent the floating point number will place a constraint upon the range of possible exponent values that are supported for a given floating point number format. One example of such floating point number formats is given in the IEEE Standard 754.
Viewed from one aspect the present disclosure provides apparatus for performing an arithmetic operation A + (B * C), where A, B and C are floating point numbers each having an exponent value within an exponent value range and a mantissa value, said apparatus comprising: a multiplier configured to multiply B and C to generate a product having a product mantissa value and a product exponent value; and an adder configured to add A and said product to generate a result value; wherein said multiplier is configured to generate said product exponent value passed to said adder with an extended exponent value range that extends to lower values than said exponent value range; and said adder is configured to receive said product exponent value with said extended exponent value range and to generate said result value with a result exponent within said exponent value range.
The present technique recognises that while the input operands A, B and C to a multiply add operation together with the result value may all have exponents within an exponent value range, it is possible to use an extended exponent value range for the product exponent which is passed between the multiplier and the adder. This extended exponent value range extends lower than the (normal) exponent value range thereby increasing the number of ways in which subnormal floating point product values may be represented. Accordingly, a requirement to shift the product value so that the product exponent falls within the (normal) exponent value range may be avoided and the consequent time taken to perform such a shift also avoided. This increases the speed with which a multiply add operation may be performed. The technique recognises that the adder will typically already include shifters for aligning the operand A and the product as part of the add operation and accordingly any additional shift arising due to the product exponent lying outside of the (normal) exponent value range may be accommodated within the shift operation performed in the adder without introducing extra processing delay.
The product passed from the multiplier to the adder may be unrounded. Accordingly, the mantissa (or fraction) passed from the multiplier to the adder will include more bits than are available to represent the mantissa within the inputs or the outputs, but are required to achieve the desired level of accuracy within the results being calculated. This form of multiply add circuitry is a fused multiply adder.
In order to efficiently deal with subnormal input operands to the multiplier (i.e. floating point numbers with a magnitude such that, with the smallest exponent value which can be represented, the mantissa value starts with one or more zeros rather than the normally assumed leading "1" at the head of the mantissa) some embodiments are such that said multiplier comprises: first count-leading-zero circuitry configured to determine a count value CLZB of a number of leading zeros in a mantissa value of B; a first shifter configured to left shift said mantissa of B by CLZB places to form a shifted mantissa of B if CLZB is greater than zero; second count-leading-zero circuitry configured to determine a count value CLZC of a number of leading zeros in a mantissa value of C; and a second shifter configured to left shift said mantissa of C by CLZC places to form a shifted mantissa of C if CLZC is greater than zero.
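By way of illustration only, the count-leading-zero and left-shift behaviour described above may be modelled in software as follows. This is a minimal sketch and not the circuitry of any embodiment: the 53-bit width (a double precision mantissa including its leading bit) and the function names are assumptions introduced for the example.

```python
from typing import Tuple

MANT_BITS = 53  # assumed width: double precision mantissa including the leading bit


def count_leading_zeros(mantissa: int, width: int = MANT_BITS) -> int:
    """Count the zero bits above the most significant set bit of a width-bit field."""
    if not 0 <= mantissa < (1 << width):
        raise ValueError("mantissa does not fit in the given width")
    return width - mantissa.bit_length()


def normalise(mantissa: int, width: int = MANT_BITS) -> Tuple[int, int]:
    """Left shift the mantissa by its leading-zero count if that count is greater than zero."""
    clz = count_leading_zeros(mantissa, width)
    if clz > 0:
        mantissa <<= clz
    # The count is returned as well, since it is later subtracted from the exponent.
    return mantissa, clz
```

For a mantissa that already has its leading bit set the count is zero and the shift is bypassed, mirroring the "if CLZB is greater than zero" condition.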
In some embodiments the multiplier may be configured to form the product exponent as a sum of at least an exponent value of B, an exponent value of C, -CLZB and -CLZC. Accordingly the product exponent value with its extended exponent value range may take account of any left shifts which have been performed upon the mantissa values of B and C. In some embodiments an overflow value may be added into the product exponent to take account of an overflow from the assumed MSB position when the product of the mantissa of B and the mantissa of C is calculated.
The adder may include an adder shifter responsive to the exponent value of A and the product exponent to perform a shift operation of at least one of the mantissa of A and the product mantissa to align these in magnitude before the addition is performed. This adder shifter is responsive to the extended exponent value range of the product exponent in determining the shifts to be performed to the mantissa of A and the product mantissa.
In some embodiments the multiplier may be configured to generate an out-of-range exponent flag signal which is sent to the adder to indicate that the product exponent has a value lower than the (normal) exponent value range. Such a flag signal may be used to switch in any additional processing required to handle the product exponent if this falls outside of the (normal) exponent value range.
While not restricted to such use, the present technique may be employed within systems in which the exponent value range is in accordance with IEEE Standard 754 and the extended exponent value range includes negative exponent values. It will be appreciated that the particular ranges will depend upon the precision of the floating point numbers being represented, e.g. single precision or double precision.
Viewed from another aspect the present disclosure provides apparatus for performing an arithmetic operation A + (B * C), where A, B and C are floating point numbers each having an exponent value within an exponent value range and a mantissa value, said apparatus comprising: multiplier means for multiplying B and C to generate a product having a product mantissa value and a product exponent value; and adder means for adding A and said product to generate a result value; wherein said multiplier means generates said product exponent value passed to said adder with an extended exponent value range that extends to lower values than said exponent value range; and said adder means receives said product exponent value with said extended exponent value range and generates said result value with a result exponent within said exponent value range.
Viewed from a further aspect the present disclosure provides a method of performing an arithmetic operation A + (B * C), where A, B and C are floating point numbers each having an exponent value within an exponent value range and a mantissa value, said method comprising the steps of: multiplying B and C to generate a product having a product mantissa value and a product exponent value; and adding A and said product to generate a result value; wherein said step of multiplying generates said product exponent value passed to an adder with an extended exponent value range that extends to lower values than said exponent value range; and said step of adding receives said product exponent value with said extended exponent value range and generates said result value with a result exponent within said exponent value range.
Embodiments of the disclosure will now be described, by way of example only, with reference to the accompanying drawings in which:
Figure 1 schematically illustrates a data processing system including a processor having a floating point pipeline which includes multiply adder circuitry;
Figure 2 schematically illustrates multiply adder circuitry and the format of data values at various points;
Figure 3 schematically illustrates a portion of a multiplier;
Figure 4 schematically illustrates a portion of an adder; and
Figure 5 is a flow diagram schematically illustrating the operation of the multiplier of Figure 3.
Figure 1 schematically illustrates a data processing apparatus 2 in the form of a processor 4 coupled to a memory 6. The memory 6 stores a program 8 and data 10. The program 8 comprises program instructions which, when executed by the processor 4, manipulate the data 10. The program instructions may include floating point program instructions. These floating point program instructions may include multiply add instructions. The floating point instructions operate upon floating point numbers comprising an exponent value and a mantissa value. These values may be represented in accordance with the IEEE Standard 754. It will be appreciated that the use of other floating point standards is also possible and that the present techniques are not limited to use with the IEEE Standard 754. A mantissa value when the exponent is within the normal exponent value range will include an implied "1" at its most significant bit position. Accordingly, the data actually manipulated and stored will be the fractional part of the mantissa and the leading "1" will be assumed. In the case of a subnormal number where the exponent value is out of range (the exponent is at its minimum value), the assumed leading "1" will not be present and instead there will be a variable number of leading "0" values. The leading value of the mantissa will be assumed to be a "0" and the number of zeros leading the fractional value may be counted in order to determine the effective exponent value of the floating point number concerned.
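The implied leading "1" and the leading zeros of a subnormal number may be observed with a short software sketch such as the following. It assumes the IEEE Standard 754 double precision layout (1 sign bit, 11 exponent bits, 52 fraction bits); the function name and the returned tuple are illustrative choices, and infinities and not-a-number values are deliberately not handled.

```python
import struct

EXP_BITS = 11   # double precision exponent field width
FRAC_BITS = 52  # double precision stored fraction width


def decode_double(x: float):
    """Return (sign, effective biased exponent, 53-bit mantissa, leading zeros)."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = bits >> 63
    biased_exp = (bits >> FRAC_BITS) & ((1 << EXP_BITS) - 1)
    fraction = bits & ((1 << FRAC_BITS) - 1)
    if biased_exp == 0:
        # Zero or subnormal: no implied "1"; the minimum exponent (biased 1) applies.
        mantissa = fraction
        effective_exp = 1
    else:
        # Normal number: restore the implied leading "1" above the stored fraction.
        mantissa = (1 << FRAC_BITS) | fraction
        effective_exp = biased_exp
    leading_zeros = (FRAC_BITS + 1) - mantissa.bit_length()
    return sign, effective_exp, mantissa, leading_zeros
```

For example, the smallest positive subnormal double, 5e-324, decodes to a mantissa of 1 with 52 leading zeros, whereas 1.0 decodes to a mantissa with its leading bit set and no leading zeros.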
The processor 4 of Figure 1 includes several execution pipelines including a load store pipeline 12, an integer pipeline 14, a SIMD pipeline 16 and a floating point pipeline 18. Program instructions fetched from the memory 6 by a fetch stage 20 are passed to an issue stage 22 where they are issued into an appropriate one of the execution pipelines 12, 14, 16, 18. The floating point pipeline 18 includes circuitry for performing a multiply add operation upon floating point numbers as will be described further below.
The number of processing cycles taken to perform a multiply add operation may be an important performance characteristic. Some embodiments may advantageously reduce the number of processing cycles taken to perform a multiply add operation.
Figure 2 schematically illustrates multiply add circuitry 24 including a multiplier 26 and an adder 28. The input operands to the multiply add circuitry 24 are A, B and C. Each of these input operands may have the IEEE Standard 754 format and accordingly have an exponent value range dependent upon the precision of the number concerned (i.e. the normal exponent value range). The multiplier 26 performs a multiplication of the operands B and C to produce a product. This product is passed to the adder 28 where it is added to the operand A. The output from the adder 28 is a result value which also has the IEEE Standard 754 format.
In accordance with the present techniques, the output from the multiplier 26 to the adder 28 (i.e. the product) is an unrounded value and has an extended exponent value range. This extended exponent value range extends lower (e.g. to negative exponent values) than the exponent value range employed for A, B, C and the result value. Adapting the multiplier 26 to generate a product with such an extended exponent value range and the adder 28 to receive the product with such an extended exponent value range avoids any need to manipulate the product back into a form having the exponent value range (normal exponent value range) between the multiplier 26 and the adder 28. Avoiding this additional manipulation speeds up the operation of the multiply add circuitry 24.
Figure 3 schematically illustrates the multiplier 26. There are three execution stages E1, E2 and E3. In the first stage E1, count-leading-zero circuitry 30, 32 respectively counts the number of leading zeros in the mantissas of the operands B and C. Shifters 34, 36, 38 then shift these mantissa values (and derivatives thereof required for methods adding larger and/or signed multiples of the base product, e.g. Booth multipliers) to form the outputs from the stage E1. Left shifts will be applied when the mantissa concerned is subnormal in order to align the mantissas for B and C before the Booth multiplication operation is performed. These applied shifts are held in the count leading zero values determined and are used to form the product exponent value for the product value which is passed to the adder 28. In particular, the count-leading-zero circuitry 30 determines a count leading zero value CLZB for the mantissa of B. The count-leading-zero circuitry 32 determines a count leading zero value CLZC for the mantissa of C. The second stage E2 within the multiplier 26 performs the Booth multiplication and generates two 107-bit partial product values D and E which are supplied to the third stage E3.
The third stage E3 performs a bit addition of these partial products with an adder 40. The resulting product mantissa is an unrounded value, consistent with the operation of the fused multiply add circuitry described herein. The product mantissa is output from the multiplier 26 provided that an exception condition, such as an infinity, a not-a-number or a condition code failure, does not occur. If any of these conditions does arise, then a special value is output from the multiplier 26 instead of the product mantissa.
The exponent value of the operand B, the exponent value of the operand C, the CLZB value and the CLZC value are used by the multiplier 26 to form the product exponent value which has an extended exponent value range (extends lower than the normal exponent value range) and that is passed to the adder 28. An adder 42 within the multiplier 26 performs a sum of the above exponent inputs together with a value indicating whether an overflow occurred when the product mantissa was calculated so as to form the product exponent. The adder 42 thus performs a sum of the exponent of B, the exponent of C, -CLZB, -CLZC and an overflow value.
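The sum performed by the adder 42 may be written compactly as follows. The symbols E_P, E_B, E_C and v are notation introduced here for illustration; the description states only that the product exponent is the sum of at least these terms, so any further term (such as an adjustment for an exponent bias) is not shown.

```latex
E_P = E_B + E_C - \mathrm{CLZ}_B - \mathrm{CLZ}_C + v, \qquad v \in \{0, 1\}
```

As a purely hypothetical numerical instance, two subnormal operands with E_B = E_C = 1, CLZB = 3, CLZC = 5 and no overflow (v = 0) give E_P = 1 + 1 - 3 - 5 + 0 = -6, a negative value which can be carried only because the product exponent has the extended exponent value range.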
Figure 4 schematically illustrates the adder 28 formed of three stages E4, E5 and E6. The adder 28 receives the input operand A in the IEEE Standard 754 format with the normal exponent value range as well as the product output from the multiplier 26 which is unrounded and which has the extended exponent value range. A negative flag value NF is also passed between the multiplier 26 and the adder 28 to indicate that the product exponent lies within a region lower than that represented by the normal exponent value range. This negative flag may be used to control manipulation of the product exponent in a manner that is consistent with it representing negative values when the normal exponent value range is assumed to be a positive value.
Within the stage E4 a leading zero prediction circuit 44 determines whether or not the sum of the mantissa of A and the product mantissa will have any leading zeros. This together with the exponent value for A and the product exponent value (including negative flag) are supplied to alignment control circuitry 46 which determines any shifts to be applied to the mantissa of A and the product mantissa before they are added.
Shifters 48, 50 within stage E5 will apply shifts as determined by the alignment control circuitry 46 to form an aligned value of A and an aligned product value which can then be supplied to a bit adder 52 in the stage E6 of the adder 28 to form the result mantissa value. It will be appreciated that the shifters 48 and 50 are controlled by the alignment control circuitry 46 which itself is responsive to the product exponent value having the extended exponent value range.
Accordingly, the shifters 48, 50 can be controlled to perform any required shift necessary to bring the result value back into the normal exponent value range as may be required to generate an IEEE Standard 754 compliant result value. There is no need to bring the exponent value of the product passed from the multiplier 26 to the adder 28 back into the normal exponent value range of the IEEE Standard 754 format as any necessary adjustment can be made in the shifts performed by the shifters 48, 50 within the adder 28 without incurring an additional time penalty.
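The way in which a single alignment step can absorb a product exponent lying below the normal range may be illustrated with the following sketch. It is an idealised model only: values are taken to be mantissa * 2**exponent with integer mantissas, Python's unbounded integers stand in for the fixed-width shifters 48, 50, and the function name is hypothetical. Real hardware would use bounded shifts and track discarded bits for rounding.

```python
def align(mant_a: int, exp_a: int, mant_p: int, exp_p: int):
    """Scale both mantissas to the smaller exponent so they can be added directly.

    exp_p may be negative (extended range); no separate step is needed to bring
    it back into the normal range before alignment.
    """
    common = min(exp_a, exp_p)
    aligned_a = mant_a << (exp_a - common)
    aligned_p = mant_p << (exp_p - common)
    return aligned_a, aligned_p, common


# Example: A = 3 * 2**5 and a product of 1 * 2**-2 with a negative exponent.
a, p, e = align(3, 5, 1, -2)
total = (a + p) * 2.0 ** e  # 96.25, the same as 3 * 2**5 + 2**-2
```

The negative product exponent simply changes the shift distance computed for A; no additional normalising pass over the product is involved.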
Compared with a standard adder, the present technique sends the adder one extra bit (NF) indicating that the product exponent is to be treated as a negative number (i.e. what would otherwise look like a large exponent is in fact a very small exponent). The alignment control circuitry 46 receives the negative flag value NF and treats the exponent value accordingly.
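One possible reading of this, offered only as an assumption, is that the exponent field carries the low-order bits of a two's complement value and NF acts as the sign bit above it. Under that assumption, and with an 11-bit field width chosen to match double precision, the interpretation may be sketched as follows; the function names are hypothetical and the disclosure does not specify this encoding.

```python
EXP_WIDTH = 11  # assumed exponent field width (double precision)


def encode_product_exponent(exp: int, width: int = EXP_WIDTH):
    """Pack a possibly negative exponent into (field, nf)."""
    nf = 1 if exp < 0 else 0
    # Masking wraps a small negative value around to a large field value.
    field = exp & ((1 << width) - 1)
    return field, nf


def decode_product_exponent(field: int, nf: int, width: int = EXP_WIDTH) -> int:
    """Recover the signed exponent: NF set means the field is really negative."""
    return field - (nf << width)
```

With these assumptions an exponent of -6 is carried as the field value 2042 with NF set, which would look like a large exponent if NF were ignored.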
Figure 5 is a flow diagram schematically illustrating the operation of the multiplier 26.
Figure 5 shows the processing occurring in a serial manner. It will be appreciated that in practice the multiplier circuitry 26 may perform various of these operations in parallel or in a different order.
It will also be appreciated that in other embodiments the roles of "1"s and "0"s may be reversed in a manner that will be understood by those in this field to operate in a similar manner.
At step 54 a count of leading zeros for B mantissa is made and the value CLZB is set accordingly. Step 56 determines whether the count leading zero value for the mantissa B is greater than zero. If the count leading zero value is greater than zero, then step 58 left shifts the B mantissa by the CLZB value. If the CLZB value is zero, then step 58 is bypassed.
At step 60 the leading zeros of the C mantissa are counted and used to set the CLZC value.
Step 62 determines whether the CLZC value is greater than zero. If the CLZC value is greater than zero, then step 64 serves to left shift the C mantissa by a number of places corresponding to the CLZC value. If the determination at step 62 is that the CLZC value is not greater than zero, then step 64 is bypassed.
At step 66 a multiply of the B mantissa and C mantissa is performed and any overflow is detected. At step 68 the product exponent is formed as the sum of the exponent of B, the exponent of C, -CLZB, -CLZC and a value of +1 if an overflow at step 66 was detected. If the product exponent so calculated is negative, then a negative flag NF is set to signal this to the adder 28. At step 70 the product mantissa, the product exponent and the negative flag are output to the adder 28.
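The serial flow of steps 54 to 70 may be modelled in software as in the sketch below. Several points are assumptions made for the example rather than features taken from the description: mantissas are 53-bit integers including the leading bit, exponents are double precision biased values, a single bias is subtracted as one of the additional terms permitted by "at least", and the overflow is detected from the bit length of the exact product. The function name is hypothetical.

```python
MANT_BITS = 53  # assumed: double precision mantissa width including the leading bit
BIAS = 1023     # assumed: double precision exponent bias


def multiplier_flow(mant_b: int, exp_b: int, mant_c: int, exp_c: int, bias: int = BIAS):
    """Return (product mantissa, product exponent, negative flag NF)."""
    # Steps 54 to 58: count leading zeros of the B mantissa; left shift if CLZB > 0.
    clz_b = MANT_BITS - mant_b.bit_length()
    if clz_b > 0:
        mant_b <<= clz_b
    # Steps 60 to 64: the same for the C mantissa.
    clz_c = MANT_BITS - mant_c.bit_length()
    if clz_c > 0:
        mant_c <<= clz_c
    # Step 66: multiply the mantissas; overflow means the product occupies all
    # 2 * MANT_BITS bits, i.e. one bit more than the assumed MSB position.
    product = mant_b * mant_c
    overflow = 1 if product.bit_length() == 2 * MANT_BITS else 0
    # Step 68: form the product exponent (the bias subtraction is an assumption).
    exp_p = exp_b + exp_c - bias - clz_b - clz_c + overflow
    nf = 1 if exp_p < 0 else 0
    # Step 70: outputs passed to the adder; the product mantissa is left unrounded.
    return product, exp_p, nf
```

For instance, multiplying the smallest subnormal (mantissa 1, biased exponent 1) by 1.5 (mantissa 3 << 51, biased exponent 1023) yields a product exponent of -51 with NF set, while 1.5 * 1.5 overflows the assumed MSB position and yields a product exponent of 1024 with NF clear.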
Although illustrative embodiments of the invention have been described in detail herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various changes and modifications can be effected therein by one skilled in the art without departing from the scope and spirit of the invention as defined by the appended claims.

Claims (11)

  1. Apparatus for performing an arithmetic operation A + (B * C), where A, B and C are floating point numbers each having an exponent value within an exponent value range and a mantissa value, said apparatus comprising: a multiplier configured to multiply B and C to generate a product having a product mantissa value and a product exponent value; and an adder configured to add A and said product to generate a result value; wherein said multiplier is configured to generate said product exponent value passed to said adder with an extended exponent value range that extends to lower values than said exponent value range; and said adder is configured to receive said product exponent value with said extended exponent value range and to generate said result value with a result exponent within said exponent value range.
  2. Apparatus as claimed in claim 1, wherein said product passed from said multiplier to said adder is unrounded.
  3. Apparatus as claimed in any one of claims 1 and 2, wherein said multiplier comprises first count-leading-zero circuitry configured to determine a first count value of a number of leading zeros in a mantissa value of B; a first shifter configured to left shift said mantissa of B by a number of places equal to said first count value to form a shifted mantissa of B if said first count value is greater than zero; second count-leading-zero circuitry configured to determine a second count value of a number of leading zeros in a mantissa value of C; and a second shifter configured to left shift said mantissa of C by a number of places equal to said second count value to form a shifted mantissa of C if said second count value is greater than zero.
  4. Apparatus as claimed in claim 3, wherein said multiplier is configured to form said product exponent as a sum of at least an exponent value of B, an exponent value of C, minus said first count value and minus said second count value.
  5. Apparatus as claimed in any one of the preceding claims, wherein said adder comprises an adder shifter responsive to an exponent value of A and said product exponent to perform a shift operation upon at least one of a mantissa of A and said product mantissa to align in magnitude said mantissa of A and said product mantissa.
  6. Apparatus as claimed in any one of the preceding claims, wherein said multiplier is configured to generate an out-of-range exponent flag signal sent to said adder to indicate that said product exponent has a value lower than said exponent value range.
  7. Apparatus as claimed in any one of the preceding claims, wherein said exponent value range is in accordance with IEEE Standard 754 and said extended exponent value range includes negative exponent values.
  8. Apparatus for performing an arithmetic operation A + (B * C), where A, B and C are floating point numbers each having an exponent value within an exponent value range and a mantissa value, said apparatus comprising: multiplier means for multiplying B and C to generate a product having a product mantissa value and a product exponent value; and adder means for adding A and said product to generate a result value; wherein said multiplier means generates said product exponent value passed to said adder with an extended exponent value range that extends to lower values than said exponent value range; and said adder means receives said product exponent value with said extended exponent value range and generates said result value with a result exponent within said exponent value range.
  9. A method of performing an arithmetic operation A + (B * C), where A, B and C are floating point numbers each having an exponent value within an exponent value range and a mantissa value, said apparatus comprising the steps of: multiplying B and C to generate a product having a product mantissa value and a product exponent value; and adding A and said product to generate a result value; wherein said step of multiplying generates said product exponent value passed to an adder with an extended exponent value range that extends to lower values than said exponent value range; and said step of adding receives said product exponent value with said extended exponent value range and generates said result value with a result exponent within said exponent value range.
  10. Apparatus for performing an arithmetic operation substantially as hereinbefore described with reference to the accompanying drawings.
  11. A method of performing an arithmetic operation substantially as hereinbefore described with reference to the accompanying drawings.
GB1400644.9A 2014-01-15 2014-01-15 Multiply adder Active GB2522194B (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
GB1400644.9A GB2522194B (en) 2014-01-15 2014-01-15 Multiply adder
US14/566,981 US9696964B2 (en) 2014-01-15 2014-12-11 Multiply adder
CN201510005354.XA CN104778028B (en) 2014-01-15 2015-01-06 Adder and multiplier
KR1020150001770A KR102318494B1 (en) 2014-01-15 2015-01-07 Multiply adder

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB1400644.9A GB2522194B (en) 2014-01-15 2014-01-15 Multiply adder

Publications (3)

Publication Number Publication Date
GB201400644D0 GB201400644D0 (en) 2014-03-05
GB2522194A GB2522194A (en) 2015-07-22
GB2522194B GB2522194B (en) 2021-04-28

Family

ID=50238976

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1400644.9A Active GB2522194B (en) 2014-01-15 2014-01-15 Multiply adder

Country Status (4)

Country Link
US (1) US9696964B2 (en)
KR (1) KR102318494B1 (en)
CN (1) CN104778028B (en)
GB (1) GB2522194B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9703531B2 (en) * 2015-11-12 2017-07-11 Arm Limited Multiplication of first and second operands using redundant representation
WO2017166026A1 (en) * 2016-03-28 2017-10-05 武汉芯泰科技有限公司 Multiplier-accumulator, multiplier-accumulator array and digital filter
US10402168B2 (en) * 2016-10-01 2019-09-03 Intel Corporation Low energy consumption mantissa multiplication for floating point multiply-add operations
CN107168678B (en) * 2017-05-09 2020-10-27 清华大学 Multiply-add computing device and floating-point multiply-add computing method
US11200186B2 (en) 2018-06-30 2021-12-14 Intel Corporation Apparatuses, methods, and systems for operations in a configurable spatial accelerator
CN109634558B (en) * 2018-12-12 2020-01-14 上海燧原科技有限公司 Programmable mixed precision arithmetic unit
US11256476B2 (en) 2019-08-08 2022-02-22 Achronix Semiconductor Corporation Multiple mode arithmetic circuit
US11907713B2 (en) 2019-12-28 2024-02-20 Intel Corporation Apparatuses, methods, and systems for fused operations using sign modification in a processing element of a configurable spatial accelerator
CN112558918B (en) * 2020-12-11 2022-05-27 北京百度网讯科技有限公司 Multiply-add operation method and device for neural network

Citations (5)

Publication number Priority date Publication date Assignee Title
US20070074008A1 (en) * 2005-09-28 2007-03-29 Donofrio David D Mixed mode floating-point pipeline with extended functions
US20100063987A1 (en) * 2008-09-09 2010-03-11 International Business Machines Corporation Supporting multiple formats in a floating point processor
US20100125621A1 (en) * 2008-11-20 2010-05-20 Advanced Micro Devices, Inc. Arithmetic processing device and methods thereof
US20110072066A1 (en) * 2009-09-21 2011-03-24 Arm Limited Apparatus and method for performing fused multiply add floating point operation
US20130007075A1 (en) * 2011-06-29 2013-01-03 Advanced Micro Devices, Inc. Methods and apparatus for compressing partial products during a fused multiply-and-accumulate (fmac) operation on operands having a packed-single-precision format

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
US5761105A (en) * 1995-09-26 1998-06-02 Advanced Micro Devices, Inc. Reservation station including addressable constant store for a floating point processing unit
US5768169A (en) * 1995-10-02 1998-06-16 Intel Corporation Method and apparatus for improved processing of numeric applications in the presence of subnormal numbers in a computer system
CN100555212C (en) * 2007-07-18 2009-10-28 中国科学院计算技术研究所 Carry calibration device for a floating-point dual MAC and its multiplication CSA compression tree
US8244789B1 (en) * 2008-03-14 2012-08-14 Altera Corporation Normalization of floating point operations in a programmable integrated circuit device
CN102339217B (en) * 2010-07-27 2014-09-10 中兴通讯股份有限公司 Fusion processing device and method for floating-point number multiplication-addition device
US8965945B2 (en) * 2011-02-17 2015-02-24 Arm Limited Apparatus and method for performing floating point addition

Also Published As

Publication number Publication date
US20150199173A1 (en) 2015-07-16
GB2522194B (en) 2021-04-28
GB201400644D0 (en) 2014-03-05
KR20150085471A (en) 2015-07-23
CN104778028B (en) 2019-06-07
US9696964B2 (en) 2017-07-04
CN104778028A (en) 2015-07-15
KR102318494B1 (en) 2021-10-28

Similar Documents

Publication Publication Date Title
US9696964B2 (en) Multiply adder
US9841948B2 (en) Microarchitecture for floating point fused multiply-add with exponent scaling
US8965945B2 (en) Apparatus and method for performing floating point addition
KR20190090817A (en) Apparatus and method for performing arithmetic operations to accumulate floating point numbers
US9483232B2 (en) Data processing apparatus and method for multiplying floating point operands
JP6415236B2 (en) Apparatus and system including floating point addition unit, and floating point addition method
US9823897B2 (en) Apparatus and method for floating-point multiplication
US10019228B2 (en) Accuracy-conserving floating-point value aggregation
CN107025091B (en) Binary fused multiply-add floating point calculation
US20180203667A1 (en) Fused-multiply-add floating-point operations on 128 bit wide operands
US20080301213A1 (en) Division with rectangular multiplier supporting multiple precisions and operand types
KR102412746B1 (en) Apparatus and method for performing floating-point square root operation
CN106250098B (en) Apparatus and method for controlling rounding when performing floating point operations
US10489115B2 (en) Shift amount correction for multiply-add
US9836279B2 (en) Apparatus and method for floating-point multiplication
US10459689B2 (en) Calculation of a number of iterations
US9280316B2 (en) Fast normalization in a mixed precision floating-point unit
EP2884403A1 (en) Apparatus and method for calculating exponentiation operations and root extraction
JP2010049614A (en) Computer