WO2014183195A1 - Decimal floating-point fused multiplier-adder - Google Patents

Info

Publication number
WO2014183195A1
Authority
WO
WIPO (PCT)
Prior art keywords
digit
significand
node
operand
zero
Application number
PCT/CA2014/000420
Other languages
French (fr)
Inventor
Seokbum KO
Liu HAN
Original Assignee
University Of Saskatchewan
Application filed by University Of Saskatchewan
Publication of WO2014183195A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/483 Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/491 Computations with decimal numbers radix 12 or 20
    • G06F7/4915 Multiplying; Dividing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/5443 Sum of products
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2207/00 Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F2207/491 Indexing scheme relating to groups G06F7/491 - G06F7/4917
    • G06F2207/4911 Decimal floating-point representation

Definitions

  • TITLE DECIMAL FLOATING-POINT FUSED MULTIPLIER-ADDER
  • the present subject-matter relates to a decimal floating-point fused multiplier-adder, and more particularly to a decimal floating-point fused multiplier-adder using redundant internal encoding for improved performance.
  • decimal floating-point has now been included in IEEE standard 754-2008.
  • Fused multiplication-addition merges the rounding operations of at least one multiplication function with at least one addition function.
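  • The effect of merging the two roundings can be illustrated in software. The following is a minimal sketch, not the hardware described here: it uses Python's standard `decimal` module, assumes a 16-digit precision (matching the 16-digit significands mentioned later), and the helper name `fused_vs_unfused` is hypothetical.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN


def fused_vs_unfused(x, y, z, precision=16):
    """Compute x*y + z with two roundings (unfused) and with one (fused)."""
    ctx = Context(prec=precision, rounding=ROUND_HALF_EVEN)
    # Unfused: the product is rounded to `precision` digits before the addition.
    unfused = ctx.add(ctx.multiply(x, y), z)
    # Fused: the exact product is added to z and only the final sum is rounded.
    fused = ctx.fma(x, y, z)
    return unfused, fused


x = Decimal("1.000000000000001")
z = Decimal("-1.000000000000002")
unfused, fused = fused_vs_unfused(x, x, z)
print(unfused, fused)
```

  • In this example the exact product is 1.000000000000002000000000000001. The unfused path rounds away the trailing 1E-30 before the addition, so the sum cancels to zero, while the fused path keeps it.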
  • a leading non-zero digit detection module which includes a leading non-zero detector for receiving an operand having digits in a signed digit-set having a range of [m, n], m ≥ -8, n ≤ 8, ABS(m - n) > 9.
  • the detector is adapted to detect an initial position of the leading non-zero digit of the operand.
  • the leading non-zero digit detection module further includes a leading non-zero digit corrector for selectively correcting the initial position of the leading non-zero digit by at most one position in the less significant direction based on pattern analysis of the digits of the operand. For example, the leading non-zero digit corrector corrects the initial position of the leading non-zero digit to the next less significant digit if:
  • the value of the initial leading non-zero digit is 1 and the value of the next non-zero less significant digit after the initial leading non-zero digit is negative; or the value of the initial leading non-zero digit is -1 and the value of the next non-zero less significant digit after the initial leading non-zero digit is positive.
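  • The detection and correction rule above can be modelled in a few lines. This is a behavioural sketch only, assuming the operand is given as a Python list of signed digits with the most significant digit first; the function names are hypothetical and the parallel hardware structure is not modelled.

```python
def effective_leading_position(digits):
    """Return the index (0 = most significant) of the effective leading
    non-zero digit of a signed-digit operand, or None if all digits are zero."""
    # Step 1: initial detection, ignoring the sign of each digit.
    initial = next((i for i, d in enumerate(digits) if d != 0), None)
    if initial is None:
        return None
    # Step 2: inspect the next non-zero digit on the less significant side.
    lead = digits[initial]
    following = next((d for d in digits[initial + 1:] if d != 0), None)
    needs_correction = following is not None and (
        (lead == 1 and following < 0) or (lead == -1 and following > 0)
    )
    # The correction moves the position by at most one digit.
    return initial + 1 if needs_correction else initial


def signed_digits_to_int(digits):
    """Evaluate a most-significant-first signed-digit string as an integer."""
    value = 0
    for d in digits:
        value = value * 10 + d
    return value


print(effective_leading_position([0, 1, -3, 2]), signed_digits_to_int([0, 1, -3, 2]))
```

  • For the digits 0, 1, -3, 2 the value is 100 - 30 + 2 = 72, so the effective leading non-zero digit sits one position to the right of the initial detection.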
  • the embodiments described herein provide in another aspect a combined rounder-conversion module for processing a signed operand having a sign and a significand having l digits in a signed digit-set having a range of [m, n], m ≥ -8, n ≤ 8, ABS(m - n) > 9.
  • the combined rounder-conversion module includes an inverter for selectively inverting the l - 1 most significant digits of the significand based on the sign of the operand to output a bit-inverted intermediate; a calculation unit for determining the propagation bits of the l - 2 most significant digits of the significand; a calculation unit for determining the generation bits of the l - 2 most significant digits of the significand; a rounding increment generation unit for determining an increment value based on at least the sign of the operand, the least significant digit of the significand, and a sticky digit representing values of one or more digits less significant than the least significant digit of the significand; a negative carry generation unit for determining a negative carry signal based on the sign of the operand, the increment value, the value of the second least significant digit of the significand, the propagation bits of the l - 2 most significant digits, and the generation bits of the l - 2 most significant digits; a correction signal generator for
  • the embodiments described herein provide in yet another aspect a decimal floating-point fused multiplier-adder for carrying out addition and multiplication operations on a first operand, a second operand and a third operand.
  • the multiplier-adder includes a multiplication module for multiplying a first significand of the first operand and a second significand of the second operand to output an intermediate product; a pre-alignment module for shifting a third significand of the third operand to output a pre-aligned addend; an addition module for adding the intermediate product and the pre-aligned addend to output an intermediate sum; and a post-alignment module for shifting the intermediate sum based on the shifting of the third significand of the third operand and a preferred exponent to output a post-aligned sum.
  • the decimal floating-point fused multiplier-adder includes the leading non-zero digit detection module described herein. According to one exemplary embodiment, the decimal floating-point fused multiplier-adder includes the combined rounder-conversion module described herein.
  • Figure 1 illustrates a schematic diagram of an exemplary decimal floating-point multiplier-adder
  • Figure 2 illustrates a detailed schematic diagram of an exemplary decimal floating-point multiplier-adder
  • Figure 3 illustrates a schematic diagram of an exemplary multiplier with redundant internal encodings
  • Figure 4 illustrates a schematic circuit diagram of an exemplary multiplier
  • Figure 5 illustrates a schematic diagram of an exemplary alignment of the intermediate product and the pre-aligned addend
  • Figure 6A illustrates a schematic diagram of an exemplary pre-alignment module of the exemplary decimal floating-point multiplier-adder
  • Figure 6B illustrates a schematic diagram of a hardware implementation of an exemplary digit shifter
  • Figure 7A illustrates a schematic diagram of a first exemplary case of digit shifting
  • Figure 7B illustrates a schematic diagram of a second exemplary case of digit shifting
  • Figure 7C illustrates a schematic diagram of a third exemplary case of digit shifting
  • Figure 7D illustrates a schematic diagram of a fourth exemplary case of digit shifting
  • Figure 7E illustrates a schematic diagram of a fifth exemplary case of digit shifting
  • Figure 8 illustrates a schematic diagram of an exemplary leading non-zero digit detection unit
  • Figure 9 illustrates a schematic diagram of an exemplary leading non-zero digit corrector implemented in hardware
  • Figure 10 illustrates a schematic diagram of a portion of an exemplary post-alignment module
  • Figure 11 illustrates a schematic diagram of an exemplary combined rounder-conversion module.
  • inventions are described herein with reference to various algorithms, modules, methods, calculation units, circuits and architectures. It will be understood that such algorithms, modules, methods, calculation units, circuits and architectures can be implemented in hardware or machine, such as in electrical and/or electronic circuits, according to various methods known in the art.
  • embodiments described herein may be implemented on or embedded within a microchip, microprocessor, co-processor, programmable logic, field programmable gate array (FPGA), central processing unit (CPU), graphics processing unit (GPU), accelerated processing unit (APU), system-on-chip (SOC) and/or application specific integrated circuits (ASICs).
  • the co-processor can be coupled to or integrated with a processing unit in which certain operations required by the processing unit can be offloaded to the co-processor.
  • the term "significant digit" as used herein refers to a digit in a string of digits representing a number, wherein the digits are positioned within the string according to significance. Typically, digits positioned to the left of a particular digit are more significant and digits positioned to the right of a particular digit are less significant. For example in a number "892", the leftmost hundreds digit 8 is more significant than the middle tens digit 9 and the rightmost ones digit 2 is less significant than the middle tens digit 9.
  • the "left" direction as used herein with reference to significant digits of a number means the direction of more significant digits.
  • the "right" direction as used herein with reference to significant digits of a number means the direction of less significant digits.
  • significant figures refers to the digits in a string of digits that contribute to precision. Typically, the number of significant figures in a number will be defined, such as according to a standard such as IEEE 754-2008.
  • Referring to FIG. 1, therein illustrated is a schematic diagram of an exemplary decimal floating-point fused multiplier-adder 100.
  • the floating-point multiplier-adder 100 includes a DPD decoder 102, which receives a first operand 104, second operand 106, and third operand 108.
  • the DPD decoder 102 further decodes the first operand 104 into a first significand, a first sign bit and a first exponent, the second operand 106 into a second significand, a second sign bit and a second exponent, and the third operand 108 into a third significand, a third sign bit and a third exponent.
  • the first operand 104 and the second operand 106 are multiplicands and the third operand 108 is an addend.
  • each significand has a defined length of significant digits.
  • each of the first significand, the second significand and the third significand has a length of n digits.
  • the first significand, the second significand, and the third significand are each represented by a 16-digit string.
  • the decimal floating-point fused multiplier-adder 100 further includes a multiplier 112 for carrying out unsigned multiplication of the first significand and the second significand.
  • the multiplier 112 outputs an intermediate product.
  • the decimal floating-point fused multiplier-adder 100 further includes a pre-alignment module that includes a pre-alignment calculation unit 116 for determining a direction and an amount of shifting of the third significand.
  • the pre-alignment module also includes a digit shifting unit 120 for shifting the third significand according to the amount of shifting determined by the pre-alignment calculation unit 116.
  • the pre-alignment module outputs a pre-aligned addend CZ_sh.
  • the decimal floating-point fused multiplier-adder 100 further includes a decimal carry free adder 124 for adding the intermediate product outputted from the multiplier 112 with the pre-aligned addend CZ_sh outputted from the pre-alignment module.
  • the decimal carry free adder 124 outputs an intermediate sum sum1 128.
  • the decimal floating-point fused multiplier-adder 100 further includes a post-alignment module for shifting the intermediate sum sum1 128 according to a preferred exponent to be achieved and the number of digits in the intermediate sum sum1 128.
  • the decimal floating-point multiplier-adder 100 includes a digit detection unit 132 for detecting leading zeros and trailing zeros of the intermediate sum sum1 128.
  • the decimal floating-point multiplier-adder 100 further includes a calculation unit 136 for determining the amount of shifting of the intermediate sum sum1 128.
  • the decimal floating-point multiplier-adder 100 further includes a right shifter 140 for shifting the intermediate sum sum1 128 by the amount determined by the calculation unit 136.
  • the post-alignment module outputs a post-aligned sum sum2 144 having a defined length.
  • the length of the post-aligned sum sum2 144 is approximately equal to the length n of the significands of the input operands.
  • the decimal floating-point fused multiplier-adder 100 further includes a combined rounder-conversion module 148 for rounding the post-aligned sum sum2 144 to the desired number of significant figures and for converting the post-aligned sum to a digit set of [0,9].
  • the combined rounder-conversion module 148 outputs an unprocessed final result 152.
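  • The conversion half of this step, from signed digits to the conventional digit set [0,9], can be sketched as a sequential reference model. This is an assumption-laden illustration: it propagates a borrow digit by digit in software, does not perform any rounding, and does not reflect the parallel structure of the module 148. The function name is hypothetical.

```python
def to_conventional_digits(signed_digits_lsb_first):
    """Convert signed digits (least significant first) into digits in [0, 9].

    Returns (digits, carry_out). A carry_out of -1 means the value is
    negative and the digits hold its ten's-complement form.
    """
    digits, carry = [], 0
    for d in signed_digits_lsb_first:
        # Floor division keeps each output digit in [0, 9]; a negative
        # digit borrows from the next more significant position.
        carry, digit = divmod(d + carry, 10)
        digits.append(digit)
    return digits, carry


# 1*1000 - 3*100 + 4*10 - 4 = 736, written least significant digit first.
print(to_conventional_digits([-4, 4, -3, 1]))
```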
  • the decimal floating-point multiplier-adder 100 further includes a postprocessing module 156 and a DPD encoder 160, which together process the unprocessed final result 152 and encode it along with a calculated sign and exponent value to output a processed output 164.
  • Referring to FIG. 2, therein illustrated is a detailed schematic diagram of the decimal floating-point multiplier-adder 100 according to various exemplary embodiments.
  • the multiplier 112 receives the first significand and second significand as input.
  • An intermediate product is outputted from the multiplier 112. Where both the first significand and the second significand have a length of n-digits, the length of the intermediate product can have a maximum of 2n + 1 digits.
  • the intermediate product has a length of 33 4-bit digits.
  • the intermediate sum is in a digit-set having a range of [0,9].
  • the multiplier 112 is a multiplier with redundant internal encodings. That is, during multiplication, the first significand and second significand are represented in an alternative digit-set other than the digit-set having a range of [0,9].
  • the intermediate product outputted by the multiplier 112 has a digit-set in the range of [m, n], m ≥ -8, n ≤ 8, ABS(m - n) > 9.
  • Han and Ko, "High-Speed Parallel Decimal Multiplication with Redundant Internal Encodings", IEEE Transactions on Computers, Vol. 62, No. 5, which is hereby incorporated by reference, describes a suitable multiplier with redundant internal encodings.
  • Referring to FIG. 3, therein illustrated is a schematic diagram of an exemplary multiplier 112 with redundant internal encodings.
  • the exemplary multiplier 112 includes a partial product generation unit 204, a signed digit recoder 208, a selector 212 and a partial product reduction unit 216.
  • the partial product generation unit 204 generates partial products equal to 1X multiple, 2X multiple, 3X multiple, 4X multiple and a 5X multiple of the first significand that are free of carry propagation.
  • the 1X-5X multiples of the first significand can be represented in a signed digit-set having a range of [m, n], m ≥ -8, n ≤ 8, ABS(m - n) > 9.
  • Table 1 represents exemplary calculations of the 1X-5X multiples based on the first significand having digits in the range of [0,9]. It will be appreciated that the 1X-5X multiples are represented in a signed digit-set having a range equal to or smaller than [-8,7]. Therefore, the digits can still be represented in 4 bits.
  • the signed digit recoder 208 recodes the second significand into a recoded significand having a digit-set in the range of [-5,5]. By recoding the second significand in this digit-set, each digit of the recoded significand can be used to determine which of the multiples 1X-5X generated by the partial product generation unit 204 is to be selected for the addition of the partial products.
  • Table 1 represents exemplary Vj recoded digit outputs of the recoded significand based on the input BCD operands of the second significand, wherein W_i represents the residual digit that has the same weight as the current BCD digit, and T_{i+1} and K_{i+2} are the transfer digits to the next two more significant positions, which have 10 times and 100 times the weight of the current BCD digit.
  • the selector 212 receives the value of a digit of the recoded significand and selects the appropriate partial product 1X-5X based on the received value. Where the recoded digit is negative, each of the bits of the selected partial product is inverted by an inverter 220 to obtain the negative of the selected partial product.
  • the partial products reduction unit 216 adds the selected partial products in order to calculate the intermediate product.
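  • The recode, select and reduce flow can be sketched at the integer level. The recoding below is a common textbook carry-free scheme (a digit of 5 or more emits a transfer of 1 and becomes the digit minus 10); it is used here as a stand-in and is not the recoding table of this document, which uses two transfer digits. Function names are hypothetical, and plain Python integers replace the carry-free multiples and the reduction tree.

```python
def recode_to_signed_digits(bcd_digits_lsb_first):
    """Recode digits in [0, 9] into digits in [-5, 5] (least significant first)."""
    recoded, transfer_in = [], 0
    for x in bcd_digits_lsb_first:
        # The outgoing transfer depends only on the current digit, so all
        # positions could be recoded in parallel without carry propagation.
        transfer_out = 1 if x >= 5 else 0
        recoded.append(x - 10 * transfer_out + transfer_in)
        transfer_in = transfer_out
    recoded.append(transfer_in)  # extra most significant digit
    return recoded


def multiply_by_selection(multiplicand, multiplier_digits_lsb_first):
    """Multiply by selecting among the 0X..5X multiples for each recoded digit."""
    multiples = [k * multiplicand for k in range(6)]  # 0X, 1X, ..., 5X
    product = 0
    for position, y in enumerate(recode_to_signed_digits(multiplier_digits_lsb_first)):
        partial = multiples[abs(y)]
        if y < 0:
            partial = -partial  # negative recoded digit: negate the multiple
        product += partial * 10 ** position
    return product


print(recode_to_signed_digits([9, 9]))                 # 99 recoded
print(multiply_by_selection(1234, [8, 7, 6, 5]))       # 1234 * 5678
```

  • Because every recoded digit has magnitude at most 5, only the 1X to 5X multiples ever need to be generated, which is the motivation for recoding the second significand.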
  • the intermediate product is in a signed digit-set having a range equal to or smaller than [-8,7] or [-6,6] according to Table 1.
  • Referring to FIG. 4, therein illustrated is a schematic diagram of a hardware implemented equivalent of the exemplary multiplier of Figure 3.
  • a pre-alignment module 240 of the exemplary decimal floating-point multiplier-adder 100 includes a first adder 242, a second adder 246, a left-shifting module 248, a right-shifting module 250, a selection generation unit 252 and a selector 254.
  • the first adder 242 and second adder 246 calculate intermediate signals based on the values of the first exponent, the second exponent, and the third exponent.
  • the intermediate signals correspond to the amount of shifting of the third significand based on different cases of the relationship between first operand, the second operand, and the third operand.
  • the left-shifting module 248 and right-shifting module 250 receive one or more of the intermediate signals from the first adder 242 and the second adder 246 and respectively shift the third significand of the third operand (addend) by the amount defined in the intermediate signals.
  • the selection generation unit 252 determines which of the different cases of the relationship between the first operand, the second operand, and the third operand is present based on values of the first exponent, the second exponent, the third exponent, and the position of the leading non-zero digit of the third significand (addend).
  • the output of the selection generation unit 252 represents which of the various cases is present.
  • the selector 254 receives the output of the selection generation unit 252 and selects one of the outputs of the left-shifting module 248 and right-shifting module 250 as the pre-aligned addend CZ_sh.
  • the selector 254 further includes an inverter for inverting the bits of the third significand prior to outputting the pre-aligned addend CZ_sh if the third operand is negative (sign of the third operand is negative).
  • the inverter can be implemented as an array of XOR gates. Inverting the bits of the third significand achieves one's complement of every digit of the third significand.
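  • A conditional inverter built from XOR gates can be modelled as follows. This is a minimal sketch assuming each digit is held as a 4-bit value (an integer 0 to 15); the function name is hypothetical. XOR with an all-ones mask flips every bit, which for a 4-bit value n yields 15 - n, the one's complement.

```python
def conditional_invert(nibbles, negative):
    """Invert every bit of each 4-bit digit when `negative` is true."""
    # The sign bit is replicated across all four bit positions, as an
    # array of XOR gates sharing one control input would do.
    mask = 0b1111 if negative else 0b0000
    return [n ^ mask for n in nibbles]


print(conditional_invert([0, 9, 5], True))
print(conditional_invert([0, 9, 5], False))
```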
  • An operation mode EOP signal can be determined according to the sign of the third operand.
  • the multiplication and addition of the three operands is carried out free of shifting of the first and second operands prior to the multiplication. Accordingly, only the third significand of the third operand is shifted.
  • the shifting of the significand of the third operand is carried out so that a corresponding exponent for the shifted third significand equals the exponent of the intermediate product (the sum of the first exponent and the second exponent), thereby ensuring that the addition in the adder 124 is carried out on two intermediate operands having the same exponent.
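  • The alignment principle can be sketched with unbounded integers. This is a simplified model under stated assumptions: the shift window is not limited in width (the hardware uses a fixed-width window and records overflow), digits shifted out on the less significant side are reduced to a single boolean sticky flag, and the names `pre_align`, `ec` and `ep` are hypothetical.

```python
def pre_align(addend_significand, ec, ep):
    """Shift the addend so that its exponent equals the product exponent.

    ec: exponent of the third operand (addend)
    ep: exponent of the intermediate product (first + second exponent)
    Returns (aligned_significand, sticky).
    """
    diff = ec - ep
    if diff >= 0:
        # Addend exponent is larger: shift toward more significant digits.
        return addend_significand * 10 ** diff, False
    # Addend exponent is smaller: shift toward less significant digits and
    # remember whether any non-zero digit was shifted out.
    quotient, remainder = divmod(addend_significand, 10 ** (-diff))
    return quotient, remainder != 0


print(pre_align(123, 2, 0))      # 123 x 10^2 aligned to exponent 0
print(pre_align(12345, -2, 0))   # 12345 x 10^-2 aligned to exponent 0
```

  • Only the addend is shifted, so in hardware this step has no dependency on the multiplication result.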
  • the third significand may have to be shifted in the more significant direction by at most 2n + 1 digits. Similarly, the third significand may have to be shifted in the less significant direction by at most n digits. Accordingly, the range of shifting of the third significand has a width of 4n + 2 digits, and the pre-aligned addend CZ_sh also has a width of 4n + 2 digits. Since the pre-aligned addend CZ_sh is wider than the third significand, digits of the pre-aligned addend CZ_sh that are not occupied by the shifted third significand are padded with 0.
  • the third exponent of the third operand is significantly greater than the exponent of the intermediate product (the sum of the first exponent and the second exponent).
  • the difference between the third exponent and the exponent of the intermediate product is greater than an amount the third significand can be shifted in the more significant direction without overflowing.
  • the third significand is shifted in the more significant direction by an amount corresponding to the length 2n + 1 plus the amount of leading zeros in the third significand. Since there will be overflow of the most significant digits of the third significand, this fact is recorded and the amount of the overflow is also recorded.
  • the third exponent of the third operand (addend) is greater than the exponent of the intermediate product by a difference that does not exceed the maximum possible amount of shifting of the third significand in the more significant direction without overflowing. Accordingly, the third significand can be shifted in the more significant direction by an amount equal to the difference between the third exponent and the exponent of the intermediate product.
  • the third exponent of the third operand (addend) is less than the exponent of the intermediate product by a difference that does not exceed the maximum possible amount of shifting of the third significand in the less significant direction without overflowing. Accordingly, the third significand can be shifted in the less significant direction by an amount equal to the difference between the third exponent and the exponent of the intermediate product.
  • the third exponent of the third operand is significantly less than the exponent of the intermediate product.
  • the difference between the exponent of the intermediate product and the third exponent is greater than the amount the third significand can be shifted in the less significant direction without overflowing.
  • the third significand is shifted in the less significant direction by an amount corresponding to the length 2n. Since there will be overflow of the least significant digits of the third significand, this fact is recorded and the amount of the overflow is also recorded.
  • the amount of shifting for each of the four cases described can be determined according to:
  • the pre-alignment module 240 only shifts the third significand.
  • the multiplication at the multiplier 112 is carried out on the first significand and the second significand free of any shifting of the first and second significands.
  • the multiplication at the multiplier 112 and the pre-alignment of the third significand at the pre-alignment module 240 can be carried out in parallel, thereby achieving a savings in time and an improvement in speed.
  • the pre-alignment module 240 further includes a miscellaneous signals generation unit 256 for generating at least an operation mode EOP, the exponent value Exp1 corresponding to the exponent value of the pre-aligned addend, and a first sticky digit Sticky1 for tracking the value of bits shifted out of range in the less significant direction.
  • Referring to FIG. 5, therein illustrated is an exemplary alignment of the intermediate product with the pre-aligned addend CZ_sh. Since the third significand can be shifted in both the more significant direction and the less significant direction, the pre-aligned addend CZ_sh is wider than the third significand and the intermediate product. Accordingly, the exponent value Exp1 corresponding to the exponent value of the pre-aligned addend CZ_sh is different from the exponent of the intermediate product EP.
  • the exponent value Exp1 of the pre-aligned addend CZ_sh is less than the exponent of the intermediate product EP due to the n digits of the pre-aligned addend CZ_sh that are provided for shifting of the third significand in the less significant direction.
  • the exponent value Exp1 is n less than the exponent of the intermediate product EP.
  • RSHOR(CZ) means the bit-by-bit OR of all right shifted digits out of the third significand.
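  • The RSHOR operation can be sketched as below. This is an illustrative model, assuming each digit of the third significand is stored as a 4-bit value and given most significant digit first; the function name and argument layout are hypothetical.

```python
from functools import reduce
from operator import or_


def rshor(digits_msb_first, shift):
    """Bit-by-bit OR of the digits shifted out by a right shift of `shift` digits."""
    if shift <= 0:
        return 0  # nothing is shifted out
    # The `shift` least significant digits fall off the right-hand end.
    shifted_out = digits_msb_first[max(0, len(digits_msb_first) - shift):]
    return reduce(or_, shifted_out, 0)


print(rshor([1, 2, 3, 4], 2))  # OR of digits 3 and 4
print(rshor([1, 2, 0, 0], 2))  # only zeros are shifted out
```

  • A non-zero result indicates that some non-zero digit was lost on the less significant side, which is the information the rounding step needs.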
  • Referring to FIG. 6A, therein illustrated is a schematic diagram of a hardware implementation of an exemplary pre-alignment module 240.
  • the first adder 242 is implemented as a binary prefix tree adder to determine the amount of shifting Lsa1 in the more significant direction.
  • the second adder 246 is implemented as a second binary prefix tree adder to determine the amount of shifting Rsa1 in the less significant direction.
  • the left and right shifting amounts Lsa1 and Rsa1 are calculated simultaneously by the two binary prefix tree adders 242, 246. For example, since the maximum amount of shifting in the left direction or the right direction is constant, only the lower bits of the outputs of the two adders 242, 246 are fed into the shifters.
  • the number of leading zeros in the addend LZD(CZ) is not determined before determining the amount of shifting in the more significant direction Lsa1.
  • the selection signal outputted by the selection generator 252 can be determined from whether the third exponent is greater than or less than the exponent of the intermediate product and the value of the overflow signal OV.
  • Referring to FIG. 6B, therein illustrated is a schematic diagram of a hardware implementation of an exemplary left-shifter 248 of the pre-alignment module. Since the widths of the inputted third significand (n digits) and the output of the shifter 248 (4n + 2 digits) are different, it is possible to reduce the hardware cost of the shifter compared to a typical digit-shifter. In the example illustrated in Figure 6B, a simplified model of the proposed left shifter is shown shifting a one-bit input x to the left. Since the less significant bits of the result are obtained earlier than the more significant bits in the binary adder, the multiplexers for shifting fewer digits are placed at the top of the shifter. It will be understood that a symmetrical structure can be used for a right shifter. In comparison to a typical shifter having the same width on both input and output, the exemplary shifter uses approximately 37% fewer multiplexers.
  • the adder 124 includes a correction digit generation unit 280, a first adder 282 and a second adder 284.
  • For example, Han et al., "Non-speculative Decimal Signed Digit Adder", Circuits and Systems (ISCAS), 2011 IEEE International Symposium, which is hereby incorporated by reference, describes a suitable adder that can be appropriately modified for inclusion in the exemplary decimal floating-point multiplier-adder 100.
  • the adder 124 receives as its input the intermediate product outputted from the multiplier 112 and the pre-aligned addend CZ_sh. For each digit of the intermediate product and the pre-aligned addend, the first adder 282 determines a first temporary sum W_i. As described in Han et al., the correction digit generation unit 280 calculates for each digit of the intermediate product and the pre-aligned addend a transfer digit for the next most significant digit T_{i+1} and a complement digit based on the transfer digit from the next less significant digit.
  • the first adder 282 and the correction digit generation unit 280 are adapted for the fact that the intermediate product is in a digit-set having a range of [m, n], m ≥ -8, n ≤ 8, ABS(m - n) > 9.
  • the temporary sum W_i and the transfer digit for the next most significant digit T_{i+1} can be determined according to Table 2:
  • the adder module 124 outputs the intermediate sum sum1 128, which has a digit-set in the range of [-8,7].
  • a value of an exponent corresponding to the intermediate sum sum1 128 is equal to the exponent of the intermediate product EP.
  • the intermediate product is in a digit-set having the specific range of [-8,7] or smaller
  • the intermediate sum sum1 128 is also in a digit-set in the range of [-8,7].
  • the digit-set [-8,7] can be represented in the same number of bits (4-bits) as the digit-set [0,9].
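  • The idea of carry-free addition (each output digit depends only on its own position and the neighbouring less significant one) can be sketched as follows. The transfer thresholds below are an illustrative assumption chosen so that a digit in [-8,7] plus a digit in [0,9] yields digits within [-8,7]; they are not the transfer table of this document or of Han et al. A negative addend is not handled, and all names are hypothetical.

```python
def transfer_and_interim(position_sum):
    """Split a per-position sum s into (t, w) with s == 10*t + w.

    Assumed thresholds: for s in [-8, 16] they keep w in [-6, 3] and t in
    [-1, 2], so w plus any incoming transfer stays within [-7, 5].
    """
    if position_sum >= 14:
        transfer = 2
    elif position_sum >= 4:
        transfer = 1
    elif position_sum >= -6:
        transfer = 0
    else:
        transfer = -1
    return transfer, position_sum - 10 * transfer


def carry_free_add(x_digits, y_digits):
    """Add two equal-length digit lists (least significant digit first)."""
    pairs = [transfer_and_interim(x + y) for x, y in zip(x_digits, y_digits)]
    transfers = [0] + [t for t, _ in pairs]   # transfer into each position
    interims = [w for _, w in pairs] + [0]    # interim sum of each position
    # Each result digit needs only its interim sum and one incoming
    # transfer, so no carry ripples across the word.
    return [w + t for w, t in zip(interims, transfers)]


def to_int(digits_lsb_first):
    return sum(d * 10 ** i for i, d in enumerate(digits_lsb_first))


x = [7, -8, 3]   # 3*100 - 8*10 + 7 = 227
y = [9, 0, 5]    # 509
z = carry_free_add(x, y)
print(z, to_int(z))
```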
  • the exemplary decimal floating-point multiplier-adder 100 includes a post-alignment module 300 that receives the intermediate sum sum1 128 outputted by the adder 124.
  • the post-alignment module 300 includes an intermediate signal generator 132, a leading non-zero digit detection module 304, a trailing zero digit detection module 306, the post-alignment calculation unit 136 and the right shifter unit 140.
  • a final result should have a preferred exponent EC where possible.
  • the post-alignment module 300 determines an amount of shifting of the intermediate sum suml 128 that would either allow the unprocessed final result 152 to achieve the preferred exponent EC or, where the preferred exponent EC cannot be achieved, approach the preferred exponent EC.
  • the intermediate signal generator 132 calculates a plurality of intermediate values which are used to determine the required amount of shifting of the intermediate sum suml 128.
  • a first intermediate value DIFF pre corresponds to the difference between the preferred exponent EC and the exponent Expl of the intermediate product.
  • DIFFpre > ° corresponds to a situation where shifting in the less significant direction of the intermediate sum suml 128 is required in order to achieve the preferred exponent EC.
  • DIFFabs is defined as the absolute value of the difference between the exponents of the intermediate product and the preferred exponent.
  • DIFFpre ≤ 0 corresponds to a situation where the preferred exponent EC cannot be achieved. Accordingly, the amount of shifting in the less significant direction of the intermediate sum suml 128 depends on the position of the leading non-zero digit of the intermediate sum suml 128. In this situation, LZD(CZ) + 3n + 1 corresponds to the maximum possible amount of shifting in the more significant direction digits of the third significand in the pre-alignment module 240. Since left overflow happens in this case, DIFFabs is always larger than LSD(CZ) + 2n + 1. Thus DIFFpre is less than n, and the analysis of shifting is similar to the case where DIFFpre > 0.
  • DIFFabs = |EP - EC|
  • the leading non-zero digit detection module 304 is operable to determine the effective position of the leading non-zero digit of the intermediate sum suml 128.
  • the position of the leading non-zero digit detection module 304 is also operable to determine the number of leading zero digits in the intermediate sum suml 128 before the effective position of the leading non-zero digit. These values are later used to determine the amount of shifting of the intermediate sum suml 128. According to various exemplary embodiments, any leading non-zero digit detection module 304 known in the art may be used.
  • Effective position of the leading non-zero digit herein refers to the position of the digit that corresponds to a most-significant non-zero digit when taking into account the signed digit-set of the intermediate sum suml 128.
  • the interspersion of positive and negative digits can result in a particular numeric value being represented using a greater number of non-zero digits than necessary.
  • the digits following the most-significant non-zero digit must be analyzed to determine whether correction is needed in order to determine the effective position of the leading non-zero digit.
  • the trailing zero detection module 306 determines the amount of trailing zeros in the intermediate sum suml 128. This value is later used to determine the amount of shifting of the intermediate sum suml 128. According to various exemplary embodiments, any trailing zeros detection module 306 known in the art may be used.
  • the post-alignment calculation unit 136 is operable to determine an amount rsal of the shifting of the intermediate sum suml 128 in the less significant direction by the shifter unit 140.
  • the post-alignment calculation unit 136 receives the outputs from the leading non-zero-digit detection module 304, the output from the trailing zero detection module 306, and the DIFF pre intermediate value outputted by intermediate signals generation unit 132. Shifting of the intermediate sum suml 128 is carried out in order to achieve the preferred exponent EC or to achieve a result that is close to the preferred exponent EC.
  • FIG. 7A therein illustrated is a schematic diagram of a first case according to which an amount of the shifting in the less significant direction of the intermediate sum suml 128 is to be determined.
  • the number of digits between the effective position of the leading non-zero digit (LOP') and the number of trailing zeros (TZD) is less than n digits.
  • the difference between the exponent of the intermediate product EP and the preferred exponent EC (DIFF pre ) is less than the number of trailing zeros of the intermediate sum suml 128. Accordingly, to obtain the post-aligned sum 144, the intermediate sum suml 128 is shifted in the less significant direction by an amount equal to DIFF pre .
  • the intermediate sum suml 128 can be exactly represented in the n digits of the post-aligned sum 144.
  • the post-aligned sum 144 is equal to the intermediate sum suml 128 right shifted by an amount equal to DIFF pre .
  • the preferred exponent EC is achieved while all the digits between the effective position of the leading non-zero digit and the trailing zeros are retained in the post-aligned sum 144.
  • FIG. 7B therein illustrated is a schematic diagram of a second case according to which an amount of shifting in the less significant direction of the intermediate sum suml 128 is to be determined.
  • the number of digits between the effective position of the leading non-zero digit (position of the leading non-zero digit) and the trailing zeros of the intermediate sum suml 128 is less than DIFFpre . Accordingly, not all of the digits between the position of the leading non-zero digit and the trailing zeros are initially retained in the post-aligned sum 144. In such cases, to obtain the post-aligned sum, the intermediate sum suml 128 is shifted further towards the less significant direction so that the n most significant digits after the effective position of the leading non-zero digit (including the leading non zero) are retained.
  • FIG. 7C therein illustrated is a schematic diagram of a third case according to which an amount of shifting in the less significant direction of the intermediate sum suml 128 is to be determined.
  • the difference between the exponent of the intermediate product and the preferred exponent (DIFF pre ) is greater than the number of trailing zeros of the intermediate sum suml 128.
  • the intermediate sum suml 128 can be shifted in the less significant direction by an amount that is less than or equal to the number of trailing zeros.
  • the intermediate sum suml 128 is shifted by an amount equal to the number of trailing zeros to obtain the post-aligned sum 144.
  • the preferred exponent cannot be reached and the adjusted exponent (Exp2) is less than and closest to the preferred exponent. Furthermore, in the first case illustrated in FIG. 7A, the second case illustrated in FIG. 7B and the third case illustrated in FIG. 7C, all of the significant figures of the intermediate sum suml 128 between the effective leading non-zero digit and the trailing zeros are retained after the shifting in the less significant direction.
  • FIG. 7D therein illustrated is a schematic diagram of a fourth case according to which an amount of shifting in the less significant direction of the intermediate sum suml 128 is to be determined.
  • the number of significant digits between the effective position of the leading non-zero digit (leading one position) and trailing zeros is greater than n. Accordingly, not all of the significant digits of the intermediate sum suml 128 can be retained within the post-aligned sum 144.
  • the intermediate sum suml 128 is shifted in the less significant direction by an amount so that the n most significant digits after the effective position of the leading non-zero digit (including the leading non zero digit) are retained in the post-aligned sum 144.
  • This amount of shifting in the less significant direction corresponds to a difference between the effective position of the leading non-zero digit and the size n of the post-aligned sum 144.
  • FIG. 7E therein illustrated is a schematic diagram of a fifth case according to which an amount of shifting in the less significant direction of the intermediate sum suml 128 is to be determined.
  • DIFF pre is negative
  • the preferred exponent is smaller than the exponent of the intermediate product. Accordingly, to achieve the preferred exponent, the intermediate sum suml 128 should be shifted in the more significant direction.
  • because the size of the post-aligned sum 144 is smaller than that of the intermediate sum suml 128, the intermediate sum suml 128 cannot be shifted in the more significant direction.
  • the intermediate sum suml 128 is shifted in the less significant direction by an amount so that the n most significant digits after the effective position of the leading non-zero digit (including the leading non zero digit) are retained.
  • This amount of shifting in the less significant direction corresponds to a difference between the effective position of the leading non-zero digit and the size n of the post-aligned sum.
  • the exponent of the final result (Exp2) is updated according to the amount of the shifting of the intermediate sum suml 128.
  • the shifting in the less significant direction is a right shift and the amount of the shifting (rsa2) according to which of the described five cases is applicable can be determined according to:
  • LOP' is an effective position of the leading non-zero digit.
  • the right shifter unit 140 receives as its input the intermediate sum suml 128 outputted by the addition module 124 and right shifts the intermediate sum suml 128 according to the amount rsa2 determined by the post-alignment calculation unit 136.
  • the post-alignment module further includes a sticky digit generator 320 and a second right shifting module 324.
  • the sticky digit sticky2 is used to track the values of one or more digits of the intermediate sum suml 128 that are lost due to the right shifting, but which may also be required for modules downstream of the post-alignment module 300.
  • the post-alignment module 300 further includes a second miscellaneous signal generation module 328.
  • a sign of the final result and an exponent of the final result 152 are updated in the second miscellaneous signal generation module 328.
  • the leading non-zero digit detection unit 400 consists essentially of a simple leading non-zero digit detector 404 and a leading one corrector 408.
  • the position of the leading non-zero digit detection unit 700 receives an operand 410 having digits in a signed digit-set having a range of [m, n], m ≥ -8, n ≤ 8, ABS(m - n) > 9.
  • the operand 410 consists of a significand; a sign bit or exponent is not required. Where a sign bit or exponent is included, it does not need to be considered in the leading non-zero digit detection.
  • the signed digit-set in the range of [m, n] can be a decimal redundant encoding of digits in the range [0,9].
  • the position of the simple leading non-zero digit detector 404 is enabled to detect an initial position of the leading non-zero digit in the string of digits of the operand 410.
  • the initial position of the leading non-zero digit corresponds to most significant non-zero digit and is typically the left-most non-zero digit in the string of digits of the significand.
  • the leading non-zero digit detector 404 can be implemented according to any method known in the art.
  • the initial position of the leading non-zero digit detected by the simple leading non-zero digit detector 404 may not correspond to the effective most significant non-zero digit. Concluding that the initial position of the leading non-zero digit is the effective position of the leading non-zero digit can lead to the misinterpretation of which digits of the operand 410 are significant figures.
  • next non-zero less significant digit of the opposite sign either subtracts from (where the initial leading non-zero digit equals 1 and the next non-zero less significant digit is negative) or adds to (where the initial leading non-zero digit equals -1 and the next non-zero less significant digit is positive) the leading non-zero digit, thereby causing the leading 1 or -1 digit to be converted to 0.
  • the effective position of the leading non-zero digit is found at a position of a digit less significant than the initial leading non-zero digit detected by the simple leading non-zero digit detector 404.
  • each of the -9 digits will also be converted to 0.
  • the significand "19982345", in which the digits 9, 9 and 8 are negative digits, becomes "00022345" when converted (10000000 - 9980000 + 2345 = 22345). It will be appreciated that whereas the initial position of the leading non-zero digit is detected as the most significant digit having the value 1, the converted significand "00022345" has a position of the leading non-zero digit at its fourth most significant digit (the leftmost 2). Similarly, the significand formed by a leading digit -1 followed by the positive digits "9982345" has the value -17655 and becomes "00017655" in magnitude.
  • the leading non-zero digit corrector 408 is operable to selectively correct the initial position of the leading non-zero digit by at most one digit in the less significant direction based on pattern analysis of the digits of the significand operand.
  • TABLE 3 shows all the possible string patterns for the input operand 410 that need to be considered for making a decision as to whether or not the initial position of the leading non-zero digit should be corrected.
  • the initial position of the leading non-zero digit corresponds to the effective position of the leading non-zero digit.
  • the position of the next less significant digit of the initial leading non-zero digit is the effective position of the leading non-zero digit.
  • the number of leading zeros k is increased (i.e. the initial position of the initial leading non-zero digit is to be corrected by one position in the less significant direction) in only two situations. These situations arise either when the value of the initial leading non-zero digit is 1 and the value of the next non-zero less significant digit after the initial leading non-zero digit is negative; or when the value of the initial leading non-zero digit is -1 and the value of the next non-zero less significant digit after the initial leading non-zero is positive. In these two situations, the effective position of the leading non-zero digit is one position in the less significant direction than the position of the initial leading non-zero digit.
  • the leading non-zero digit corrector 408 corrects the position of the initial leading non-zero detected by the simple leading non-zero digit detector 404 to the position one digit over in the less significant direction. In all other situations, the position of the initial leading non-zero digit detected by the simple leading non-zero detector 404 is the effective position and does not need to be corrected by the leading non-zero digit corrector 408.
  • each node of the tree structure has a left branch input and a right branch input, and provides an output based on the left branch input value and the right branch input value.
  • the root node 412 has its left branch input 414 the most significant digit of the operand 410 and has as its right branch input 416 the second most significant digit of the operand 410 and has an output 417.
  • a child node 418 has as its left branch input 420 the output of its parent node and has as its right branch input 422 the next less significant digit of the digit that is the right branch of its parent node.
  • the child node further has an output 424.
  • the child node 418 has as its parent node the root node 412.
  • the leaf node 426 has as its left branch input 428 the output of its parent node and has as its right branch input 430 the least significant digit of the operand 410.
  • the output value 432 of the leaf node 426 indicates whether or not the initial position of the leading non-zero digit detected by the simple leading non-zero digit detector 404 should be corrected.
  • the output is determined according to the following equations:
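  • $node(d) = p^{+}$ if $p^{+}_{l} \cdot z_{r} + z_{l} \cdot p^{+}_{r}$
  • $node(d) = p^{o}$ if $z_{l} \cdot p^{o}_{r} + p^{o}_{l} \cdot z_{r}$
  • $node(d) = z$ if $z_{l} \cdot z_{r}$
  • $node(d) = n$ if $n_{l} + z_{l} \cdot n_{r}$
  • $node(d) = y$ if $y_{l} + z_{l} \cdot y_{r} + p^{o}_{l} \cdot n_{r}$; or
  • $node(d) = n^{-}$ if $n^{-}_{l} \cdot z_{r} + z_{l} \cdot n^{-}_{r}$
  • $node(d) = n^{o}$ if $z_{l} \cdot n^{o}_{r} + n^{o}_{l} \cdot z_{r}$
  • $node(d) = z$ if $z_{l} \cdot z_{r}$
  • $node(d) = y$ if $y_{l} + z_{l} \cdot y_{r} + n^{o}_{l} \cdot p_{r}$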
  • node(d) denotes the output of any particular node, and wherein if the output of the leaf node is equal to y, the leading one corrector 408 corrects the initial position of the leading non-zero digit to the next less significant digit.
  • the equations can be implemented using combinational logic within each node.
  • each node of the first tree structure corresponding to a positive initial leading non-zero digit determines a node output based on the equations:
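  • $node(d) = p^{o}$ if $z_{l} \cdot p^{o}_{r} + p^{o}_{l} \cdot z_{r}$
  • $node(d) = z$ if $z_{l} \cdot z_{r}$
  • $node(d) = n^{-}$ if $n^{-}_{l} \cdot z_{r} + z_{l} \cdot n^{-}_{r}$
  • $node(d) = n^{o}$ if $z_{l} \cdot n^{o}_{r} + n^{o}_{l} \cdot z_{r}$
  • $node(d) = z$ if $z_{l} \cdot z_{r}$
  • $node(d) = p$ if $p_{l} + z_{l} \cdot p_{r}$
  • $node(d) = y$ if $y_{l} + z_{l} \cdot y_{r} + n^{o}_{l} \cdot p_{r}$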
  • the output of either one of the root nodes of the first tree structure and the second tree structure is equal to y, then the position of the initial leading non-zero digit should be corrected.
  • the outputs of the root nodes of the two tree structures can be passed through an OR gate.
  • the initial position of the leading non-zero digit LOP is outputted from the simple leading nonzero digit detector 404.
  • the output LOP is fed via a first path through a decrementer 436 (LOP - 1) to a selector 440 and via a second path directly to the selector 440.
  • LOP - 1 decrementer 436
  • the first path corresponds to when the initial position is to be corrected
  • the second path corresponds to when the initial position is the effective position and does not need to be corrected.
  • the correct value between the two paths is selected by the selector 440 based on the output of the leading non-zero digit corrector 408.
  • the output of the selector is the effective position LOP' of the leading non-zero digit of the operand 410.
  • FIG. 10 therein illustrated is an exemplary portion of the post-alignment module 300 being used in conjunction with a simple leading non-zero detector 404 and a leading non-zero digit corrector 408 as described with reference to the exemplary leading non-zero detection unit 400.
  • Both the simple leading non-zero detector 404 and the leading non-zero digit corrector 408 have as their input operand 410 the intermediate sum suml 128 outputted from the adder module 124.
  • the output LOP of the leading non-zero detector 404 is the initial position of the leading non-zero digit.
  • the amount of the shifting in the less significant direction rsa2 of the intermediate sum suml 128 is the difference between the effective position of the leading non-zero digit and the size n of the post-aligned sum 144.
  • the intermediate sum suml 128 has digits in a signed digit-set having a range of [m, n], m ≥ -8, n ≤ 8, ABS(m - n) > 9
  • the initial position of the leading non-zero digit of the significand is corrected by at most one position in the less significant direction. Accordingly, calculating the difference between the effective position of the leading non-zero digit (LOP') and the size n of the post-aligned sum can be divided into two situations.
  • the position of the initial non-zero digit LOP does not need to be corrected by the leading non-zero digit corrector 408.
  • the size n of the post-aligned sum 144 equals 16
  • LOP' - 16 = LOP - 16.
  • a first decision module 450 applies the equations for determining rsal for the first situation where the initial position of the leading non-zero digit LOP does not need to be corrected.
  • the first decision module 450 includes a first subtraction module 454 that calculates the difference between the position of the leading non-zero digit and the size n of the post-aligned sum 144 (ex: 16 digits) as LOP - 16.
  • a second decision module 460 applies the equations for determining rsal for the second situation where the initial position of the leading non-zero digit LOP is corrected by one position in the less significant direction.
  • the second decision module 460 includes a second subtraction module 464 that calculates the difference between the initial position of the leading non-zero digit and the size n of the post-aligned sum 144 (ex: 16 digits) as LOP - 15.
  • the exemplary post-alignment calculation unit 448 further includes a selector
  • the use of a separate simple leading non-zero digit detector 404 and leading non-zero digit corrector 408 allows for determination of the amount of right shifting rsal and the determination of the correction of the initial position of the leading non-zero digit to be carried out in parallel. This achieves a time saving in the post-alignment module 448.
  • the first decision module 450 and the second decision module 460 both apply the equations for determining rsal for the two possible cases of the initial digit correction at the same time as the leading non-zero digit corrector 408 determines whether correction is required.
  • FIG. 11 therein illustrated is an exemplary combination rounder-conversion module 900 for rounding an input operand formed of a sign digit sign 902 and a significand in 903 having l digits (in[l - 1:0]) in a signed digit-set having a range of [m, n], m > -9, n ≤ 8, ABS(m - n) > 9.
  • the l - 1 most significant digits are significant figures, which are to be rounded according to the least significant digit of the significand (in[0]), herein referred to as the rounding digit.
  • the sign digit signl 902 corresponds to a sign of the input operand when taking into account prior redundant encoding of an initial operand initially represented in an unsigned digit set into a signed digit-set.
  • the combination rounder-conversion module 900 also receives a sticky digit stickyl, which represents values of less significant digits of the rounding digit.
  • the sticky digit can be representative of the value of the next non-zero less significant digit of the rounding digit.
  • the sticky digit can be represented in a digit-set having a range that is smaller than the range of digit-set of the significand.
  • the sticky digit can be represented in two bits to denote whether the next non-zero less significant digit is positive, negative or equal to 0.
  • rounding of significant figures by the rounding digit depends on both the value of the next non-zero less significant digit of the rounding digit and on the sign of the operand.
  • the rounding of the significant figures can further depend on the sign of the significand (as denoted by the sign of the position of the leading non-zero digit of the significand). Rounding is furthermore always based on the value of the rounding digit.
  • the exemplary rounder-conversion module further includes a calculation unit for determining the propagation bits of the l - 2 most significant digits of the significand and a calculation unit for determining the generation bits of the l - 2 most significant digits of the significand.
  • the calculation units for generating the propagation bits of the l - 2 most significant digits of the significand and the generation bits of the l - 2 most significant digits of the significand are implemented within a single unit 908.
  • the single unit is an l - 2 bit prefix tree structure, however various other known methods of propagation bit and generation bit calculation may be used.
  • the exemplary rounder-conversion module further includes a rounding increment generation unit for determining an increment value RD inc .
  • RD inc is the value by which the l - 1 most significant digits of the significand, representing l - 1 significant figures, should be incremented.
  • TABLE 4 provides a complete set of possible rounding increments based on the various combinations for a digit-set in the range of [-8,7] for various modes of rounding.
  • RD denotes the value of the Rounding Digit
  • SD denotes the value of the Sticky Digit
  • x denotes don't care
  • LE denotes that the least significant figure is even.
  • the sticky digit is equal to -1 if the next non-zero less significant digit of the rounding digit is negative, the sticky digit is equal to 1 if the next non-zero less significant digit of the rounding digit is positive, and the sticky digit is equal to 0 if all less significant digits of the rounding digit are equal to 0.
  • the exemplary rounder-conversion module 900 further includes a negative carry generation unit for determining a negative carry signal. Whether a negative carry will arise depends on the least significant figure of the significand, which corresponds to the digit in[1]. Whether a negative carry will arise further depends on the least significant figure of the significand as incremented by the increment value RD inc . Whether a negative carry will arise further depends on the sign of the operand, Sign2.
  • the negative carry generation unit can further generate the remainder of the negative carry signal and further generate a complete carry signal C.
  • the rest of the negative carry can be determined based on the determined l - 2 propagation bits and the determined l - 2 generation bits.
  • the remainder of the negative carry signal can be determined according to the equation:
  • VQ_ 1:0 0 1 -2:O&(P.-2;O
  • NC ⁇ O ⁇ ) and the complete carry signal C can be determined according to:
  • the negative carry generation unit 912 is implemented with a negative carry signal least significant digit generator 912 that is discrete from a complete carry signal generator 916.
  • the negative carry signal least significant digit generator 912 can be implemented in combinational logic according to known methods.
  • the carry signal generator 916 can be implemented in combinational logic according to known methods.
  • implementing the negative carry signal least significant digit generator 912 separately allows the determination of the least significant negative carry digit NC[0] to be carried out in parallel with the generating of the l - 2 propagation bits and the generating of the l - 2 generation bits in the prefix tree structure 908.
  • the outputs of the negative carry signal least significant digit generator 912 and the prefix tree structure 908 can be readily combined in the complete carry signal generator 916.
  • the exemplary rounder-conversion module 900 further includes a correction signal generator 920 for generating a correction signal Cor2.
  • the correction signal Cor2 represents the amount of correction of the significant figure digits of the significand in based on the rounding increment, the selective negation of the significand (i.e. sign of the operand Sign2), and the complete carry signal C. Taking into account various combinations of the complete carry signal C and the rounding increment, according to one exemplary embodiment, it has been discovered that the following equations provide a complete set of possible values for the correction signal Cor2.
  • the exemplary rounder-conversion module 900 further includes an adder 924 for adding digits of the correction signal Cor2 with corresponding digits of the selectively negated l - 1 most significant digits of the significand in outputted as the inverted intermediate.
  • the resulting sum is a rounded and digit-set converted result representing a final result.
  • the resulting sum is in the conventional BCD digit-set [0,9] and is the final result 152.
  • the adder 924 is a carry look ahead array, however it will be understood that any other suitable adder known in the art may be used to add the correction signal Cor2 with the inverted intermediate sum.
  • the exemplary rounder-conversion module is free of positive carry propagation. That is, the module will not experience positive carry propagation. Only negative carry (borrow) propagation is experienced.
  • According to various exemplary embodiments, the rounder-conversion module can be used in any design for carrying out arithmetic operations wherein an intermediate operand having the same properties as the operand is generated.
  • the combined rounder-conversion module 900 can be included in the decimal floating-point multiplier adder 100.
  • the significand in of the input operand for the combined rounder-conversion module 900 is the post-aligned sum sum2 144.
  • the sign of the input operand signl is the sign of the initial leading non-zero digit of the intermediate sum suml 128.
  • the sticky digit stickyl of the input operand for the combined rounder-conversion module 900 represents values of digits that overflow due to shifting of the intermediate sum suml in the less significant direction.
  • the output CR 152 of the adder array 924 is the significand portion of the final result.
  • the rounder conversion module 900 can further include a sign generation module 930 for determining a sign SR of the final result.
  • sign SR is equal to signF and is determined based on the sign of the input operand signl and the sign of the first operand SX and the sign of the second operand SY of the decimal floating-point fused multiplier adder 100.
  • the rounder conversion module 900 can further include an exponent generation module 934 for determining an exponent ER of the final result.
  • ER is equal to the sum of the exponent expl of the intermediate sum suml and the amount of shifting rsa2 in the post-alignment module 300.
  • the significand CR of the final result, exponent ER of the final result, and sign SR of the final result are provided to the post processor 156 and DPD Encoder 160 to compute the processed output 164.
  • the sticky digit generator 320 for determining sticky digit sticky2 is included as part of the exemplary post-alignment module 300.
  • the sticky digit generator 320 is implemented as two prefix tree structures for determining a value p and a value z.
  • the detection algorithm for the sticky digit is similar to the carry propagation process.
  • the sticky digit can be represented in 2 bits.
  • n 4.
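The five post-alignment cases described in the fragments above (FIG. 7A to FIG. 7E) can be summarised in a small behavioural sketch. The Python below is an illustration only, not the circuit or the equation of the specification: the function name, the counting of LOP' as a 1-based digit count from the least significant digit, and the collapse of the five cases into one max/min expression are all assumptions made for this example.

```python
def right_shift_amount(diff_pre, lop_eff, tzd, n=16):
    """Hypothetical model of the post-alignment right-shift amount.

    diff_pre : preferred exponent minus exponent of the intermediate product
    lop_eff  : effective leading non-zero digit position LOP', counted here
               (assumption) as the number of digits from the least
               significant digit up to and including that digit
    tzd      : number of trailing zero digits of the intermediate sum
    n        : number of digits in the post-aligned sum
    """
    # Shift wanted to reach the preferred exponent, limited so that no
    # non-zero digit is pushed out (cases of FIG. 7A and FIG. 7C).
    toward_preferred = min(diff_pre, tzd)
    # Minimum shift needed so the leading digit fits in n digits
    # (cases of FIG. 7B, FIG. 7D and FIG. 7E).
    to_fit = lop_eff - n
    # A right shifter cannot shift by a negative amount.
    return max(toward_preferred, to_fit, 0)


print(right_shift_amount(diff_pre=3, lop_eff=10, tzd=5))   # 3  (FIG. 7A style)
print(right_shift_amount(diff_pre=9, lop_eff=12, tzd=4))   # 4  (FIG. 7C style)
print(right_shift_amount(diff_pre=2, lop_eff=30, tzd=1))   # 14 (FIG. 7D style)
print(right_shift_amount(diff_pre=-2, lop_eff=20, tzd=0))  # 4  (FIG. 7E style)
```

In this model the exponent of the result would be the exponent of the intermediate product plus the returned shift amount, which matches the description of the exponent being updated according to the amount of shifting.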

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Nonlinear Science (AREA)
  • Complex Calculations (AREA)

Abstract

A decimal floating-point fused multiplier-adder (DFMA) for carrying out addition and multiplication operations on a first operand, a second operand and a third operand includes a multiplication module for multiplying a first significand of the first operand and a second significand of the second operand to output an intermediate product. The DFMA further includes a pre-alignment module for shifting a third significand of the third operand to output a pre-aligned addend, an addition module for adding the intermediate product and the pre-aligned addend to output an intermediate sum, and a post-alignment module for shifting the intermediate sum based on the shifting of the third significand of the third operand and a preferred exponent to output a post-aligned sum. The post-alignment module of the DFMA may also include a leading non-zero digit detection that receives the intermediate sum as its operand. The post-alignment module of the DFMA may also include a combined rounder-conversion module.

Description

TITLE: DECIMAL FLOATING-POINT FUSED MULTIPLIER-ADDER
FIELD
[0001] The present subject-matter relates to a decimal floating-point fused multiplier-adder, and more particularly to a decimal floating-point fused multiplier-adder using redundant internal encoding for improved performance.
INTRODUCTION
[0002] Decimal floating-point representation of fractions has been shown to be more accurate and more precise than binary floating-point arithmetic in some specific applications, such as financial computing, banking, and billing systems. Decimal floating-point has now been included in IEEE standard 754-2008. Fused multiplication-addition merges the rounding operations of at least one multiplication function with at least one addition function.
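The effect of merging the two rounding steps can be seen with Python's standard `decimal` module, whose `Context.fma` computes the product without intermediate rounding. This is only an illustration of the arithmetic behaviour, not of the hardware described here; the 4-digit precision and the operand values are arbitrary choices made for the example.

```python
from decimal import Decimal, Context, ROUND_HALF_EVEN

# Deliberately tiny precision so the difference is easy to see.
ctx = Context(prec=4, rounding=ROUND_HALF_EVEN)

x, y, z = Decimal("1.234"), Decimal("5.678"), Decimal("-7.006")

# Fused: the exact product 7.006652 is added to z, then rounded once.
fused = ctx.fma(x, y, z)

# Separate: the product is first rounded to 7.007, then the sum is rounded.
separate = ctx.add(ctx.multiply(x, y), z)

print(fused)     # 0.000652
print(separate)  # 0.001
```

The separate multiply-then-add loses the low-order digits of the product before the cancellation takes place, whereas the fused operation keeps them.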
[0003] A. Akkas and M. J. Schulte, "A decimal floating-point fused multiply-add unit with a novel decimal leading-zero anticipator" in 22nd IEEE International Conference on Application-specific Systems, Architectures and Processors, Sep. 2011, describes a DFP-FMA design that uses a previously published parallel fixed-point decimal multiplier for multiplication and a Kogge-Stone parallel prefix adder for decimal addition.
SUMMARY
[0004] The embodiments described herein provide in one aspect a leading non-zero digit detection module, which includes a leading non-zero detector for receiving an operand having digits in a signed digit-set having a range of [m, n], m ≥ -8, n ≤ 8, ABS(m - n) > 9. The detector is adapted to detect an initial position of the leading non-zero digit of the operand. The leading non-zero digit detection module further includes a leading non-zero digit corrector for selectively correcting the initial position of the leading non-zero digit by at most one position in the less significant direction based on pattern analysis of the digits of the operand. For example, the leading non-zero digit corrector corrects the initial position of the leading non-zero digit to the next less significant digit if:
the value of the initial leading non-zero digit is 1 and the value of the next non-zero less significant digit after the initial leading non-zero digit is negative; or the value of the initial leading non-zero digit is -1 and the value of the next non-zero less significant digit after the initial leading non-zero is positive.
[0005] The embodiments described herein provide in another aspect a combined rounder-conversion module for processing a signed operand having a sign and a significand having l digits in a signed digit-set having a range of [m, n], m > -9, n ≤ 8, ABS(m - n) ≥ 9. The combined rounder-conversion module includes an inverter for selectively inverting the l - 1 most significant digits of the significand based on the sign of the operand to output a bit-inverted intermediate; a calculation unit for determining the propagation bits of the l - 2 most significant digits of the significand; a calculation unit for determining the generation bits of the l - 2 most significant digits of the significand; a rounding increment generation unit for determining an increment value based on at least the sign of the operand, the least significant digit of the significand, and a sticky digit representing values of one or more digits less significant than the least significant digit of the significand; a negative carry generation unit for determining a negative carry signal based on the sign of the operand, the increment value, the value of the second least significant digit of the significand, the propagation bits of the l - 2 most significant digits, and the generation bits of the l - 2 most significant digits; a correction signal generator for generating a correction signal based on the negative carry signal; and an adder for adding the bit-inverted intermediate with the correction signal to output a rounded-converted result.
[0006] The embodiments described herein provide in yet another aspect a decimal floating-point fused multiplier-adder for carrying out addition and multiplication operations on a first operand, a second operand and a third operand. The multiplier-adder includes a multiplication module for multiplying a first significand of the first operand and a second significand of the second operand to output an intermediate product; a pre-alignment module for shifting a third significand of the third operand to output a pre-aligned addend; an addition module for adding the intermediate product and the pre-aligned addend to output an intermediate sum; and a post-alignment module for shifting the intermediate sum based on the shifting of the third significand of the third operand and a preferred exponent to output a post-aligned sum. According to one exemplary embodiment, the decimal floating-point fused multiplier-adder includes the leading non-zero digit detection module described herein. According to one exemplary embodiment, the decimal floating-point fused multiplier-adder includes the combined rounder-conversion module described herein.
DRAWINGS
[0007] For a better understanding of the embodiments described herein and to show more clearly how they may be carried into effect, reference will now be made, by way of example only, to the accompanying drawings which show at least one exemplary embodiment, and in which:
[0008] Figure 1 illustrates a schematic diagram of an exemplary decimal floating-point multiplier-adder;
[0009] Figure 2 illustrates a detailed schematic diagram of an exemplary decimal floating-point multiplier-adder;
[0010] Figure 3 illustrates a schematic diagram of an exemplary multiplier with redundant internal encodings;
[0011] Figure 4 illustrates a schematic circuit diagram of an exemplary multiplier;
[0012] Figure 5 illustrates a schematic diagram of an exemplary alignment of the intermediate product and the pre-aligned addend;
[0013] Figure 6A illustrates a schematic diagram of an exemplary pre-alignment module of the exemplary decimal floating-point multiplier-adder;
[0014] Figure 6B illustrates a schematic diagram of a hardware implementation of an exemplary digit shifter;
[0015] Figure 7A illustrates a schematic diagram of a first exemplary case of digit shifting;
[0016] Figure 7B illustrates a schematic diagram of a second exemplary case of digit shifting;
[0017] Figure 7C illustrates a schematic diagram of a third exemplary case of digit shifting;
[0018] Figure 7D illustrates a schematic diagram of a fourth exemplary case of digit shifting;
[0019] Figure 7E illustrates a schematic diagram of a fifth exemplary case of digit shifting;
[0020] Figure 8 illustrates a schematic diagram of an exemplary leading non-zero digit detection unit;
[0021] Figure 9 illustrates a schematic diagram of an exemplary leading non-zero digit corrector implemented in hardware;
[0022] Figure 10 illustrates a schematic diagram of a portion of an exemplary post-alignment module; and
[0023] Figure 11 illustrates a schematic diagram of an exemplary combined rounder-conversion module.
DESCRIPTION OF VARIOUS EMBODIMENTS
[0024] It will be appreciated that, for simplicity and clarity of illustration, where considered appropriate, reference numerals may be repeated among the figures to indicate corresponding or analogous elements or steps. In addition, numerous specific details are set forth in order to provide a thorough understanding of the exemplary embodiments described herein. However, it will be understood by those of ordinary skill in the art that the embodiments described herein may be practiced without these specific details. In other instances, well-known methods, procedures and components have not been described in detail so as not to obscure the embodiments described herein. Furthermore, this description is not to be considered as limiting the scope of the embodiments described herein in any way but rather as merely describing the implementation of the various embodiments described herein.
[0025] The exemplary embodiments are described herein with reference to various algorithms, modules, methods, calculation units, circuits and architectures. It will be understood that such algorithms, modules, methods, calculation units, circuits and architectures can be implemented in hardware or machine, such as in electrical and/or electronic circuits, according to various methods known in the art. For example, and without limitation, embodiments described herein may be implemented on or embedded within a microchip, microprocessor, co-processor, programmable logic, field programmable gate array (FPGA), central processing unit (CPU), graphics processing unit (GPU), accelerated processing unit (APU), system-on-chip (SOC) and/or application specific integrated circuits (ASICs). For example, where the embodiments are implemented as a co-processor, the co-processor can be coupled to or integrated with a processing unit in which certain operations required by the processing unit can be offloaded to the co-processor.
[0026] The term "significant digit" as used herein refers to a digit in a string of digits representing a number, wherein the digits are positioned within the string according to significance. Typically, digits positioned to the left of a particular digit are more significant and digits positioned to the right of a particular digit are less significant. For example in a number "892", the leftmost hundreds digit 8 is more significant than the middle tens digit 9 and the rightmost ones digit 2 is less significant than the middle tens digit 9. The "left" direction as used herein with reference to significant digits of a number means the direction of more significant digits. The "right" direction as used herein with reference to significant digits of a number means the direction of less significant digits.
[0027] The term "significant figures" as used herein refers to the digits in a string of digits that contribute to precision. Typically, the number of significant figures in a number will be defined, such as according to a standard such as IEEE 754-2008.
[0028] Referring now to Figure 1, therein illustrated is a schematic diagram of an exemplary decimal floating-point fused multiplier-adder 100. The floating-point multiplier-adder 100 includes a DPD decoder 102, which receives a first operand 104, second operand 106, and third operand 108. The DPD decoder 102 further decodes the first operand 104 into a first significand, a first sign bit and a first exponent, the second operand 106 into a second significand, a second sign bit and a second exponent, and the third operand 108 into a third significand, a third sign bit and a third exponent. According to various exemplary embodiments, the first operand 104 and the second operand 106 are multiplicands and the third operand 108 is an addend. According to various exemplary embodiments, each significand has a defined length of significant digits. For example, each of the first significand, the second significand and the third significand has a length of n digits. For example, the first significand, the second significand, and the third significand are each represented by a 16-digit string.
[0029] The decimal floating-point fused multiplier-adder 100 further includes a multiplier 112 for carrying out unsigned multiplication of the first significand and the second significand. The multiplier 112 outputs an intermediate product.
[0030] The decimal floating-point fused multiplier-adder 100 further includes a pre-alignment module that includes a pre-alignment calculation unit 116 for determining a direction and an amount of shifting of the third significand. The pre-alignment module also includes a digit shifting unit 120 for shifting the third significand according to the amount of shifting determined by the pre-alignment calculation unit 116. The pre-alignment module outputs a pre-aligned addend CZsh.
[0031] The decimal floating-point fused multiplier-adder 100 further includes a decimal carry free adder 124 for adding the intermediate product outputted from the multiplier 112 with the pre-aligned addend CZsh outputted from the pre-alignment module. The decimal carry free adder 124 outputs an intermediate sum sum1 128.
[0032] The decimal floating-point fused multiplier-adder 100 further includes a post-alignment module for shifting the intermediate sum sum1 128 according to a preferred exponent to be achieved and the number of digits in the intermediate sum sum1 128. The decimal floating-point multiplier-adder 100 includes a digit detection unit 132 for detecting leading zeros and trailing zeros of the intermediate sum sum1 128. The decimal floating-point multiplier-adder 100 further includes a calculation unit 136 for determining the amount of shifting of the intermediate sum sum1 128. The decimal floating-point multiplier-adder 100 further includes a right shifter 140 for shifting the intermediate sum sum1 128 by the amount determined by the calculation unit 136. The post-alignment module outputs a post-aligned sum sum2 144 having a defined length. Typically, the length of the post-aligned sum sum2 144 is approximately equal to the length n of the significands of the input operands.
[0033] The decimal floating-point fused multiplier-adder 100 further includes a combined rounder-conversion module 148 for rounding the post-aligned sum sum2 144 to the desired number of significant figures and for converting the post-aligned sum to a digit set of [0,9]. The combined rounder-conversion module 148 outputs an unprocessed final result 152.
[0034] The decimal floating-point multiplier-adder 100 further includes a postprocessing module 156 and a DPD encoder 160, which together process the unprocessed final result 152 and encode it along with a calculated sign and exponent value to output a processed output 164.
[0035] Referring now to Figure 2, therein illustrated is a schematic diagram of a detailed decimal floating-point multiplier-adder 100 according to various exemplary embodiments. The multiplier 112 receives the first significand and second significand as input. An intermediate product is outputted from the multiplier 112. Where both the first significand and the second significand have a length of n digits, the intermediate product can have a maximum length of 2n + 1 digits. For example, according to various industry standards such as IEEE 754-2008, where the first significand and the second significand both have a length of 16 4-bit digits, the intermediate product has a length of 33 4-bit digits. According to various exemplary embodiments, the intermediate sum is in a digit-set having a range of [0,9].
[0036] According to various exemplary embodiments, the multiplier 112 is a multiplier with redundant internal encodings. That is, during multiplication, the first significand and second significand are represented in an alternative digit-set other than the digit-set having a range of [0,9]. For example, the intermediate product outputted by the multiplier 112 has a digit-set in the range of [m, n], m ≥ -8, n < 8, ABS(m - n) ≥ 9. For example, Han and Ko, High-Speed Parallel Decimal Multiplication with Redundant Internal Encodings, IEEE Transactions on Computers, Vol. 62, No. 5, which is hereby incorporated by reference, describes a suitable multiplier with redundant internal encodings.
[0037] Referring now to Figure 3, therein illustrated is a schematic diagram of an exemplary multiplier 112 with redundant internal encodings. The exemplary multiplier 112 includes a partial product generation unit 204, a signed digit recoder 208, a selector 212 and a partial product reduction unit 216.
[0038] The partial product generation unit 204 generates partial products equal to a 1X multiple, 2X multiple, 3X multiple, 4X multiple and 5X multiple of the first significand that are free of carry propagation. The 1X-5X multiples of the first significand can be represented in a signed digit-set having a range of [m, n], m > -8, n < 8, ABS(m - n) > 9. For example, Table 1 represents exemplary calculations of the 1X-5X multiples based on the first significand having digits in the range of [0,9]. It will be appreciated that the 1X-5X multiples are represented in a signed digit-set having a range equal to or smaller than [-8,7]. Therefore, the digits can still be represented in 4 bits.
[0039] The signed digit recoder 208 recodes the second significand into a recoded significand having a digit-set in the range of [-5,5]. By recoding the second significand in this digit-set, each digit of the recoded significand can be used to determine which of the multiples 1X-5X generated by the partial product generation unit 204 is to be selected for the addition of the partial products. For example, Table 1 represents exemplary Vj recoded digit outputs of the recoded significand based on the input BCD operands of the second significand, wherein Wi represents the residual digit that has the same weight as a current BCD digit, and Ti+1 and Ki+2 are the transfer digits to the next two more significant digits, which have 10 times and 100 times the weight of the current BCD digit, respectively.
[0040] The selector 212 receives the value of a digit of the recoded significand and selects the appropriate partial product 1X-5X based on the received value. Where the received digit of the recoded significand is negative, each of the bits of the selected partial product is inverted by an inverter 220 to obtain the negative of the selected partial product.
[0041] The partial product reduction unit 216 adds the selected partial products in order to calculate the intermediate product. According to the exemplary multiplier, the intermediate product is in a signed digit-set having a range equal to or smaller than [-8,7] or [-6,6] according to Table 1.
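The recode-select-reduce flow described above can be illustrated numerically. The following is a minimal sketch only: it uses a simple sequential recoding with a single transfer digit rather than the parallel recoding of Table 1, ordinary Python integers rather than carry-free signed-digit multiples, and the function names `recode_signed` and `multiply` are hypothetical.

```python
def recode_signed(digits_msd_first):
    """Recode decimal digits (0..9) into signed digits within [-5, 5].

    Assumption: a simple right-to-left recoding with one transfer digit,
    not the parallel scheme of Table 1. One extra leading digit absorbs
    the final transfer.
    """
    out = []
    transfer = 0
    for d in reversed(digits_msd_first):
        v = d + transfer
        if v > 5:
            v -= 10          # borrow from the next more significant digit
            transfer = 1
        else:
            transfer = 0
        out.append(v)
    out.append(transfer)
    return out[::-1]


def multiply(x, y, n=16):
    """Multiply two non-negative integers of at most n digits using only
    the 1X..5X multiples of x, selected by the recoded digits of y."""
    multiples = [k * x for k in range(6)]        # 0X..5X of the first significand
    y_digits = [int(c) for c in str(y).zfill(n)]
    product = 0
    for r in recode_signed(y_digits):            # most significant digit first
        partial = multiples[abs(r)]              # role of the selector
        if r < 0:
            partial = -partial                   # role of the inverter
        product = product * 10 + partial         # role of the reduction unit
    return product
```

For instance, `recode_signed([9, 9])` yields `[1, 0, -1]`, that is 100 - 1, so only the 1X multiple is needed for the operand 99.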
TABLE 1
[0042] Referring now to Figure 4, therein illustrated is a schematic diagram of a hardware implemented equivalent of the exemplary multiplier of Figure 3.
[0043] Referring back to Figure 2 and to Figure 6A illustrating a schematic diagram of an exemplary pre-alignment module, a pre-alignment module 240 of the exemplary decimal floating-point multiplier-adder 100 includes a first adder 242, a second adder 246, a left-shifting module 248, a right-shifting module 250, a selection generation unit 252 and a selector 254.
[0044] The first adder 242 and second adder 246 calculate intermediate signals based on the values of the first exponent, the second exponent, and the third exponent. The intermediate signals correspond to the amount of shifting of the third significand based on different cases of the relationship between first operand, the second operand, and the third operand.
[0045] The left-shifting module 248 and right-shifting module 250 receive one or more of the intermediate signals from the first adder 242 and the second adder 246 and respectively shift the third significand of the third operand (addend) by the amount defined in the intermediate signals.
[0046] The selection generation unit 252 determines which of the different cases of the relationship between the first operand, the second operand, and the third operand is present based on values of the first exponent, the second exponent, the third exponent, and the position of the leading non-zero digit of the third significand (addend). The output of the selection generation unit 252 represents which of the various cases is present.
[0047] The selector 254 receives the output of the selection generation unit 252 and selects one of the outputs of the left-shifting module 248 and right-shifting module 250 as the pre-aligned addend CZsh. According to various exemplary embodiments, the selector 254 further includes an inverter for inverting the bits of the third significand prior to outputting the pre-aligned addend CZsh if the third operand is negative (sign of the third operand is negative). For example, the inverter can be implemented as an array of XOR gates. Inverting the bits of the third significand achieves one's complement of every digit of the third significand. An operation mode EOP signal can be determined according to the sign of the third operand. For example, if the third operand is positive EOP[n - 1: 0] = 0 and if the third operand is negative EOP[n - 1: 0] = 1 so that two's complement of the inverted digits of the third significand can be achieved.
[0048] According to various exemplary embodiments of the decimal floating-point multiplier-adder 100, the multiplication and addition of the three operands is carried out free of shifting of the first and second operands prior to the multiplication. Accordingly, only the third significand of the third operand is shifted. The shifting of the significand of the third operand is carried out so that a corresponding exponent for the shifted third significand equals the exponent of the intermediate product (the sum of the first exponent and the second exponent), thereby ensuring that the addition in the adder 124 is carried out on two intermediate operands having the same exponent. Since the intermediate product can have a maximum length of 2n + 1 digits, the third significand may have to be shifted in the more significant direction by at most 2n + 1 digits. Similarly, the third significand may have to be shifted in the less significant direction by at most n digits. Accordingly, the range of shifting of the third significand has a width of 4n + 2 digits. Accordingly, the pre-aligned addend CZsh also has a width of 4n + 2 digits. Since the pre-aligned addend CZsh is wider, digits of the pre-aligned addend CZsh that correspond to digits not occupied by the shifted third significand are padded with 0.
[0049] In a first case, the third exponent of the third operand (addend) is significantly greater than the exponent of the intermediate product (the sum of the first exponent and the second exponent). In particular, the difference between the third exponent and the exponent of the intermediate product is greater than an amount the third significand can be shifted in the more significant direction without overflowing. In this first case, the third significand is shifted in the more significant direction by an amount corresponding to the length 2n + 1 plus the amount of leading zeros in the third significand. Since there will be overflow of the most significant digits of the third significand, this fact is recorded and the amount of the overflow is also recorded.
[0050] In the second case, the third exponent of the third operand (addend) is greater than the exponent of the intermediate product by a difference that does not exceed the maximum possible amount of shifting of the third significand in the more significant direction without overflowing. Accordingly, the third significand can be shifted in the more significant direction by an amount equal to the difference between the third exponent and the exponent of the intermediate product.
[0051] In the third case, the third exponent of the third operand (addend) is less than the exponent of the intermediate product by a difference that does not exceed the maximum possible amount of shifting of the third significand in the less significant direction without overflowing. Accordingly, the third significand can be shifted in the less significant direction by an amount equal to the difference between the third exponent and the exponent of the intermediate product.
[0052] In the fourth case, the third exponent of the third operand is significantly less than the exponent of the intermediate product. In particular, the difference between the exponent of the intermediate product and the third exponent is greater than the amount the third significand can be shifted in the less significant direction without overflowing. In this fourth case, the third significand is shifted in the less significant direction by an amount corresponding to the length 2n. Since there will be overflow of the least significant digits of the third significand, this fact is recorded and the amount of the overflow is also recorded.
[0053] According to one example, the amount of the shifting based on the presence of one of the four cases described above can be determined according to:
if (2n + 1 + LZD(CZ) < EZ - EP) then
    Lsal = 2n + 1 + LZD(CZ);
    OV = "10";
else if (0 < EZ - EP ≤ 2n + 1 + LZD(CZ)) then
    Lsal = EZ - EP;
    OV = "00";
else if (0 < EP - EZ ≤ 2n) then
    Rsal = EP - EZ;
    OV = "00";
else if (2n < EP - EZ) then
    Rsal = 2n;
    OV = "01";
end
ALGORITHM 1, wherein LZD(CZ) is the number of leading zeros of the third significand, EZ is the exponent of the third significand, EP is the exponent of the intermediate product, Lsal is the determined amount of shifting in the more significant direction of the third significand, OV tracks the presence of overflow, wherein OV = 10 denotes left shift overflow and OV = 01 denotes right shift overflow, and Rsal is the determined amount of shifting in the less significant direction of the third significand.
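As an illustration of Algorithm 1, the four cases can be sketched in software as follows. This is a hedged sketch rather than the hardware implementation: the function name `pre_align_shift` and the returned tuple layout are hypothetical, and the handling of EZ = EP (no shift), which Algorithm 1 does not list explicitly, is an assumption.

```python
def pre_align_shift(ez, ep, lzd_cz, n=16):
    """Return (direction, amount, ov) for shifting the third significand.

    ez: exponent of the third operand, ep: exponent of the intermediate
    product, lzd_cz: number of leading zeros of the third significand.
    ov is "10" for left shift overflow, "01" for right shift overflow,
    and "00" otherwise.
    """
    max_left = 2 * n + 1 + lzd_cz
    diff = ez - ep
    if diff > max_left:                  # first case: left shift saturates
        return "left", max_left, "10"
    if diff > 0:                         # second case: exact left shift
        return "left", diff, "00"
    if diff == 0:                        # assumption: equal exponents, no shift
        return "none", 0, "00"
    if -diff <= 2 * n:                   # third case: exact right shift
        return "right", -diff, "00"
    return "right", 2 * n, "01"          # fourth case: right shift saturates
```

For n = 16 and a third significand with 3 leading zeros, an exponent difference EZ - EP of 40 exceeds 2n + 1 + 3 = 36, so the sketch reports a saturated left shift of 36 digits with OV = 10.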
[0054] It will be appreciated that the pre-alignment module 240 only shifts the third significand. The multiplication at the multiplier 112 is carried out on the first significand and the second significand free of any shifting of the first and second significands. Advantageously, in not having to shift the first and second significands prior to multiplication, the multiplication at the multiplier 112 and the pre-alignment of the third significand at the pre-alignment module 240 can be carried out in parallel, thereby achieving a savings in time and an improvement in speed.
[0055] According to various exemplary embodiments, the pre-alignment module 240 further includes a miscellaneous signals generation unit 256 for generating at least an operation mode EOP, the exponent value Exp1 corresponding to the exponent value of the pre-aligned addend, and a first sticky digit Sticky1 for tracking the value of bits shifted out of range in the less significant direction.
[0056] Referring now to Figure 5, therein illustrated is an exemplary alignment of the intermediate product with the pre-aligned addend CZsh. Since the third significand can be shifted in both the more significant direction and the less significant direction, the width of the pre-aligned addend CZsh is wider than the width of the third significand and the intermediate product. Accordingly, the exponent value Exp1 corresponding to the exponent value of the pre-aligned addend CZsh is different from the exponent of the intermediate product EP. In particular, according to the second, third and fourth cases of shifting described above, the exponent value Exp1 of the pre-aligned addend CZsh is less than the exponent of the intermediate product EP due to the n digits of the pre-aligned addend CZsh that are provided for shifting of the third significand in the less significant direction. According to these three cases, the exponent value Exp1 is n less than the exponent of the intermediate product EP.
[0057] According to the first case of shifting described above, the shifting of the third significand in the more significant direction results in a high number of trailing zero digits in the pre-aligned addend CZsh and the exponent value Exp1 is less than the exponent of the intermediate product EP. When taking into account a preferred exponent value EC to be achieved, Exp1 is equal to EC - LZD(CZ) - (3n + 1). In all other cases, Exp1 is equal to EP - n.
[0058] For example, where n = 16 the miscellaneous signals can be determined according to:
EOP = SX ⊕ SY ⊕ SZ;
if (OV = "10")
    Exp1 = EC - LZD(CZ) - 49;
else
    Exp1 = EP - 16;
endif
if (RSHOR(CZ) = 0)
    Sticky1 = "00";
else
    if (EOP = 1)
        Sticky1 = "11";
    else
        Sticky1 = "01";
    endif
endif
ALGORITHM 2, wherein RSHOR(CZ) means the bit-by-bit OR of all digits shifted out of the third significand in the less significant direction.
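The miscellaneous signals can be sketched in software as follows. This is an illustrative sketch under stated assumptions: the operation mode is taken to be the exclusive-OR of the three sign bits, the sticky digit is modelled as a signed integer (-1, 0 or +1) rather than a 2-bit code, and the function name `misc_signals` and its parameter names are hypothetical.

```python
def misc_signals(sx, sy, sz, ov, ec, ep, lzd_cz, shifted_out_digits, n=16):
    """Return (eop, exp1, sticky1) for the pre-aligned addend.

    sx, sy, sz: sign bits (0 or 1) of the three operands.
    ov: overflow code from the pre-alignment ("10" = left shift overflow).
    shifted_out_digits: digits of the third significand shifted out of
    range in the less significant direction (may be empty).
    """
    # Assumption: effective operation is the XOR of the three sign bits.
    eop = sx ^ sy ^ sz

    if ov == "10":
        exp1 = ec - lzd_cz - (3 * n + 1)   # 49 for n = 16
    else:
        exp1 = ep - n                      # 16 for n = 16

    # Assumption: sticky digit as a signed value instead of a 2-bit code.
    if not any(shifted_out_digits):
        sticky1 = 0                        # nothing non-zero was lost
    elif eop == 1:
        sticky1 = -1                       # lost digits are being subtracted
    else:
        sticky1 = 1                        # lost digits are being added
    return eop, exp1, sticky1
```

With n = 16, no overflow and EP = 10, the sketch gives an exponent of 10 - 16 = -6 for the least significant digit of the pre-aligned addend.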
[0059] Referring now to Figure 6A, therein illustrated is a schematic diagram of a hardware implementation of an exemplary pre-alignment module 240. The first adder 242 is implemented as a binary prefix tree adder to determine the amount of shifting Lsal in the more significant direction. The second adder 246 is implemented as a second binary prefix tree adder to determine the amount of shifting Rsal in the less significant direction. The left and right shifting amounts Lsal and Rsal are calculated simultaneously by the two binary prefix tree adders 242, 246. For example, since the maximum amounts of shifting in the left direction and the right direction are constant, only the lower bits of the outputs of the two adders 242, 246 are fed into the shifters.
[0060] According to one exemplary embodiment, to reduce the timing delay, the number of leading zeros in the addend LZD(CZ) is not determined before determining the amount of shifting in the more significant direction Lsal. Instead, the addend without the leading zeros (CZwolz) is outputted by a first left shifter 248a, and the selector 254 selects the correct digits of CZwolz if the first case occurs (OV = 10). The selection signal outputted by the selection generator 252 can be determined from whether the third exponent is greater than or less than the exponent of the intermediate product and the value of the overflow signal OV.
[0061] Referring now to Figure 6B, therein illustrated is a schematic diagram of a hardware implementation of an exemplary left-shifter 248 of the pre-alignment module. Since the widths of the inputted third significand (n digits) and the output of the shifter 248 (4n + 2 digits) are different, it is possible to reduce the hardware cost of the shifter compared to a typical digit-shifter. According to the example illustrated in Figure 6B, a simplified model of the proposed left shifter is shown shifting a one-bit input x to the left. Since the less significant bits of the result are obtained earlier than the more significant bits in the binary adder, the multiplexors for shifting by fewer digits are placed at the top of the shifter. It will be understood that a symmetrical structure can be used for a right shifter. In comparison to a typical shifter having the same width on both input and output, the exemplary shifter uses approximately 37% fewer multiplexors.
[0062] Referring back to Figure 2, the adder 124 includes a correction digit generation unit 280, a first adder 282 and a second adder 284. For example, Han et al., Non-speculative Decimal Signed Digit Adder, Circuits and Systems (ISCAS), 2011 IEEE International Symposium, which is hereby incorporated by reference, describes a suitable adder that can be appropriately modified for inclusion in the exemplary decimal floating-point multiplier-adder 100.
[0063] The adder 124 receives as its input the intermediate product outputted from the multiplier 112 and the pre-aligned addend CZsh. For each digit of the intermediate product and the pre-aligned addend, the first adder 282 determines a first temporary sum Wi. As described in Han et al., the correction digit generation unit 280 calculates for each digit of the intermediate product and the pre-aligned addend a transfer digit for the next most significant digit Ti+1 and a complement digit based on the transfer digit from the next less significant digit.
[0064] According to one exemplary embodiment, the first adder 282 and the correction digit generation unit 280 are adapted for the fact that the intermediate product is in a digit-set having a range of [m, n], m > -8, n < 8, ABS(m - n) > 9. For example, where the intermediate product is in a digit-set having a specific range of [-8,7], the temporary sum Wi and the transfer digit for the next most significant digit Ti+1 can be determined according to Table 2:
TABLE 2
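The general idea of splitting each digit position into a temporary sum and a transfer digit, so that no carry propagates more than one position, can be sketched as follows. This is a generic illustration and not a reproduction of Table 2: it assumes product digits in [-8,7] and addend digits in [0,9], the thresholds are one possible choice that keeps every result digit in [-8,7], and the names `carry_free_add` and `to_int` are hypothetical.

```python
def to_int(digits_msd_first):
    """Value of a signed-digit string, most significant digit first."""
    value = 0
    for d in digits_msd_first:
        value = value * 10 + d
    return value


def carry_free_add(x_digits, y_digits):
    """Add two equal-length digit strings without carry propagation.

    Assumptions: x digits lie in [-8, 7], y digits lie in [0, 9].
    The result has one extra leading digit and all digits in [-8, 7].
    """
    n = len(x_digits)
    interim = [0] * n
    transfer = [0] * n           # transfer[i] is consumed by position i - 1
    for i in range(n):           # every position is handled independently
        p = x_digits[i] + y_digits[i]      # p lies in [-8, 16]
        if p >= 7:
            t = 1
        elif p <= -8:
            t = -1
        else:
            t = 0
        transfer[i] = t
        interim[i] = p - 10 * t            # interim digit lies in [-7, 6]
    result = [transfer[0]]
    for i in range(n):
        t_in = transfer[i + 1] if i + 1 < n else 0
        result.append(interim[i] + t_in)   # stays within [-8, 7]
    return result
```

Because each interim digit is confined to [-7,6] and each incoming transfer to [-1,1], the final per-digit addition cannot leave [-8,7], which is what allows the result to stay in 4 bits per digit.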
[0065] The adder module 124 outputs the intermediate sum sum1 128, which has a digit-set in the range of [-8,7]. A value of an exponent corresponding to the intermediate sum sum1 128 is equal to the exponent of the intermediate product EP. Where the intermediate product is in a digit-set having the specific range of [-8,7] or smaller, the intermediate sum sum1 128 is also in a digit-set in the range of [-8,7]. Advantageously, the digit-set [-8,7] can be represented in the same number of bits (4 bits) as the digit-set [0,9].
[0066] Referring back to Figures 1 and 2, the exemplary decimal floating-point multiplier-adder 100 includes a post-alignment module 300 that receives the intermediate sum sum1 128 outputted by the adder 124. The post-alignment module 300 includes an intermediate signal generator 132, a position of the leading non-zero digit detection module 304, a trailing zero digit detection module 306, the post-alignment calculation unit 136 and the right shifter unit 140. According to various industry standards, such as IEEE 754-2008, a final result should have a preferred exponent EC where possible. The post-alignment module 300 determines an amount of shifting of the intermediate sum sum1 128 that would either allow the unprocessed final result 152 to achieve the preferred exponent EC or, where the preferred exponent EC cannot be achieved, approach the preferred exponent EC.
[0067] The intermediate signal generator 132 calculates a plurality of intermediate values which are used to determine the required amount of shifting of the intermediate sum sum1 128. A first intermediate value DIFFpre corresponds to the difference between the preferred exponent EC and the exponent Exp1 of the intermediate product.
[0068] DIFFpre > 0 corresponds to a situation where shifting in the less significant direction of the intermediate sum sum1 128 is required in order to achieve the preferred exponent EC. DIFFabs is defined as the absolute value of the difference between the exponents of the intermediate product and the preferred exponent.
[0069] 0 ≤ DIFFpre ≤ n corresponds to a situation where the amount of shifting in the less significant direction depends on the number of significant digits between the leading non-zero digits of the intermediate sum sum1 128 and the first trailing zero of the intermediate sum sum1. It will be understood that n corresponds to the length in digits of the significand of the unprocessed final result 152. Within this situation, where DIFFpre = n, there is an overlap with the situation where DIFFpre > 0.
[0070] DIFFpre < 0 corresponds to a situation where shifting in the more significant direction of the intermediate sum sum1 128 is required in order to achieve the preferred exponent EC. However, since the significand of the third operand 108 was shifted in the pre-alignment module 240 in a manner that ensures that the required precision (number of significant figures) is always achieved, DIFFpre < 0 corresponds to a situation where the preferred exponent EC cannot be achieved. Accordingly, the amount of shifting in the less significant direction of the intermediate sum sum1 128 depends on the position of the leading non-zero digit of the intermediate sum sum1 128. In this situation, LZD(CZ) + 3n + 1 corresponds to the maximum possible amount of shifting in the more significant direction of the digits of the third significand in the pre-alignment module 240. Since left overflow happens in this case, DIFFabs is always larger than LZD(CZ) + 2n + 1. Thus DIFFpre is less than n, and the analysis of shifting is similar to the case where DIFFpre > 0.
[0071] For example, in the decimal floating-point multiplier-adder operating on operands having 16-digit long significands (n = 16), the intermediate values are determined according to:
if (EP > EC) then
    /* right shift addend */
    Exp1 = EP - 16;
    Expp = EC;
    DIFFpre = EC - EP + 16;
    DIFFpre = 16 - DIFFabs;
    DIFFpre < 16;
else
    /* left shift addend */
    if (OV = 0) then
        Exp1 = EP - 16;
        Expp = EP;
        DIFFpre = EP - EP + 16;
        DIFFpre = 16;
    else
        Exp1 = EC - LZD(CZ) - 49;
        Expp = EP;
        DIFFpre = EP - EC + LZD(CZ) + 49;
        DIFFpre = -DIFFabs + LZD(CZ) + 49;
        DIFFpre < 16;
    end
end
DIFFpre = Expp - Exp1;
DIFFabs = ABS(EP - EC);
Expp = MAX(-398, MIN(EP, EC));
where ABS() denotes the absolute value function.
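The three closing relations of the listing can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the argument names and the default lower bound of -398 (taken here to be the smallest decimal64 exponent) are not from the source, and the exponent of the intermediate product is simply passed in by the caller.

```python
def intermediate_signals(ep, ec, exp1, q_min=-398):
    """Return (Expp, DIFFpre, DIFFabs) for product exponent ep and addend exponent ec.

    exp1 is the exponent of the intermediate product (supplied by the caller).
    q_min is an assumed lower bound on the preferred exponent.
    """
    expp = max(q_min, min(ep, ec))   # preferred exponent, clamped from below
    diff_pre = expp - exp1           # distance from the intermediate exponent
    diff_abs = abs(ep - ec)          # absolute exponent difference
    return expp, diff_pre, diff_abs


# Example with n = 16 and an intermediate exponent of EP - 16 (hypothetical values).
print(intermediate_signals(ep=5, ec=2, exp1=5 - 16))
```

With these hypothetical inputs the preferred exponent is the smaller of the two exponents, and DIFFpre stays below n = 16.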
[0072] The leading non-zero digit detection module 304 is operable to determine the effective position of the leading non-zero digit of the intermediate sum suml 128. The leading non-zero digit detection module 304 is also operable to determine the number of leading zero digits in the intermediate sum suml 128 before the effective position of the leading non-zero digit. These values are later used to determine the amount of shifting of the intermediate sum suml 128. According to various exemplary embodiments, any leading non-zero digit detection module 304 known in the art may be used.
[0073] "Effective position of the leading non-zero digit" herein refers to the position of the digit that corresponds to a most-significant non-zero digit when taking into account the signed digit-set of the intermediate sum suml 128. For example, due to the intermediate sum suml 128 being represented in the signed digit-set, the interspersion of positive and negative digits can result in a particular number value to be represented using a greater number of non-zero digits than necessary. In such cases, the digits following the most- significant non-zero digit must be analyzed to determine whether correction is needed in order to determine the effective position of the leading non-zero digit.
[0074] The trailing zero detection module 306 determines the number of trailing zeros in the intermediate sum suml 128. This value is later used to determine the amount of shifting of the intermediate sum suml 128. According to various exemplary embodiments, any trailing zero detection module 306 known in the art may be used.
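The two counts produced by the detection modules can be illustrated with a minimal Python sketch. The function names and the representation of the intermediate sum as a list of signed digits, most significant first, are assumptions made for illustration; the sketch returns the raw leading-zero count and does not perform any signed-digit correction.

```python
def count_leading_zeros(digits):
    """Count zero digits before the first non-zero digit (most significant first)."""
    count = 0
    for d in digits:
        if d != 0:
            break
        count += 1
    return count


def count_trailing_zeros(digits):
    """Count zero digits after the last non-zero digit."""
    return count_leading_zeros(list(reversed(digits)))


example = [0, 0, 3, -1, 0, 5, 0, 0, 0]
print(count_leading_zeros(example), count_trailing_zeros(example))
```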
[0075] The post-alignment calculation unit 136 is operable to determine an amount rsal of the shifting of the intermediate sum suml 128 in the less significant direction by the shifter unit 140. The post-alignment calculation unit 136 receives the outputs from the leading non-zero-digit detection module 304, the output from the trailing zero detection module 306, and the DIFFpre intermediate value outputted by intermediate signals generation unit 132. Shifting of the intermediate sum suml 128 is carried out in order to achieve the preferred exponent EC or to achieve a result that is close to the preferred exponent EC.
[0076] Referring now to FIG. 7A, therein illustrated is a schematic diagram of a first case according to which an amount of the shifting in the less significant direction of the intermediate sum suml 128 is to be determined. According to the first case, the number of digits between the effective position of the leading non-zero digit (LOP') and the number of trailing zeros (TZD) is less than n digits. In this first case, the difference between the exponent of the intermediate product EP and the preferred exponent EC (DIFFpre) is less than the number of trailing zeros of the intermediate sum suml 128. Accordingly, to obtain the post-aligned sum 144, the intermediate sum suml 128 is shifted in the less significant direction by an amount equal to DIFFpre . According to this first case, the intermediate sum suml 128 can be exactly represented in the n digits of the post-aligned sum 144. According to the first case, the post-aligned sum 144 is equal to the intermediate sum suml 128 right shifted by an amount equal to DIFFpre . In this case, the preferred exponent EC is achieved while all the digits between the effective position of the leading non-zero digit and the trailing zeros are retained in the post-aligned sum 144.
[0077] Referring now to FIG. 7B, therein illustrated is a schematic diagram of a second case according to which an amount of shifting in the less significant direction of the intermediate sum suml 128 is to be determined. According to the second case, the number of digits between the effective position of the leading non-zero digit and the trailing zeros of the intermediate sum suml 128 is less than DIFFpre. Accordingly, not all of the digits between the position of the leading non-zero digit and the trailing zeros are initially retained in the post-aligned sum 144. In such cases, to obtain the post-aligned sum 144, the intermediate sum suml 128 is shifted further towards the less significant direction so that the n most significant digits after the effective position of the leading non-zero digit (including the leading non-zero digit) are retained.
[0078] Referring now to FIG. 7C, therein illustrated is a schematic diagram of a third case according to which an amount of shifting in the less significant direction of the intermediate sum suml 128 is to be determined. According to the third case, the difference between the exponent of the intermediate product and the preferred exponent (DIFFpre) is greater than the number of trailing zeros of the intermediate sum suml 128. Accordingly, the intermediate sum suml 128 can be shifted in the less significant direction by an amount that is less than or equal to the number of trailing zeros. According to one exemplary embodiment, the intermediate sum suml 128 is shifted by an amount equal to the number of trailing zeros to obtain the post-aligned sum 144. In the third case, the preferred exponent cannot be reached and the adjusted exponent (Exp2) is less than and closest to the preferred exponent. Furthermore, in the first case illustrated in FIG. 7A, the second case illustrated in FIG. 7B and the third case illustrated in FIG. 7C, all of the significant figures of the intermediate sum suml 128 between the effective leading non-zero digit and the trailing zeros are retained after the shifting in the less significant direction.
[0079] Referring now to FIG. 7D, therein illustrated is a schematic diagram of a fourth case according to which an amount of shifting in the less significant direction of the intermediate sum suml 128 is to be determined. According to the fourth case, the number of significant digits between the effective position of the leading non-zero digit and the trailing zeros is greater than n. Accordingly, not all of the significant digits of the intermediate sum suml 128 can be retained within the post-aligned sum 144. To obtain the post-aligned sum 144, the intermediate sum suml 128 is therefore shifted in the less significant direction by an amount so that the n most significant digits after the effective position of the leading non-zero digit (including the leading non-zero digit) are retained in the post-aligned sum 144. This amount of shifting in the less significant direction corresponds to a difference between the effective position of the leading non-zero digit and the size n of the post-aligned sum 144.
[0080] Referring now to FIG. 7E, therein illustrated is a schematic diagram of a fifth case according to which an amount of shifting in the less significant direction of the intermediate sum suml 128 is to be determined. According to the fifth case, DIFFpre is negative, and the preferred exponent is smaller than the exponent of the intermediate product. Accordingly, to achieve the preferred exponent, the intermediate sum suml 128 should be shifted in the more significant direction. However, since the size of the post-aligned sum 144 is smaller than the intermediate sum suml 128, the intermediate sum suml 128 cannot be shifted in the more significant direction. Accordingly, to obtain the post-aligned sum 144, the intermediate sum suml 128 is shifted in the less significant direction by an amount so that the n most significant digits after the effective position of the leading non-zero digit (including the leading non-zero digit) are retained. This amount of shifting in the less significant direction corresponds to a difference between the effective position of the leading non-zero digit and the size n of the post-aligned sum 144.
[0081] In each of the five cases described above, the exponent of the final result (exp2) is updated according to the amount of the shifting of the intermediate sum suml 128.
[0082] According to one example, the shifting in the less significant direction is a right shift and the amount of the shifting (rsa2) according to which of the described five cases is applicable can be determined according to:
if (LOP' - TZD ≤ n and TZD - DIFFpre ≥ 0) then
    [Figure imgf000023_0001: listing for cases 1 and 2, including a condition ending in "> n) then"]
else if (... and TZD - DIFFpre < 0) then
    /* case 3 */
    Rsa2 = TZD;
else if (LOP' - TZD > n or DIFFpre < 0) then
    /* case 4 or 5 */
    Rsa2 = LOP' - n;
end
where LOP' is an effective position of the leading non-zero digit.
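The five cases can be sketched in Python as below. This is a sketch under stated assumptions rather than the listing itself: the function name is hypothetical, LOP' is assumed to be counted from the least significant end so that a right shift by LOP' - n leaves exactly n digits, and the test that separates case 1 from case 2 (whether the leading digit still fits in n digits after shifting by DIFFpre) is inferred from the description of FIG. 7A and FIG. 7B, since that part of the listing is only available as an image.

```python
def right_shift_amount(lop, tzd, diff_pre, n=16):
    """Return the right-shift amount Rsa2 for the post-alignment step.

    lop      -- effective position LOP' of the leading non-zero digit
                (assumed counted from the least significant end)
    tzd      -- number of trailing zeros TZD
    diff_pre -- DIFFpre, preferred exponent minus intermediate exponent
    n        -- number of digits in the post-aligned sum
    """
    if lop - tzd > n or diff_pre < 0:
        return lop - n          # case 4 or 5: keep the n leading digits
    if tzd - diff_pre < 0:
        return tzd              # case 3: only the trailing zeros can be dropped
    if lop - diff_pre > n:
        return lop - n          # case 2 (assumed condition): shift further
    return diff_pre             # case 1: preferred exponent reached exactly


print(right_shift_amount(lop=10, tzd=4, diff_pre=3))
```

In case 2 the returned amount never exceeds TZD when LOP' - TZD ≤ n, so only zeros are shifted out and every significant digit is kept.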
[0083] The right shifter unit 140 receives as its input the intermediate sum suml 128 outputted by the addition module 124 and right shifts the intermediate sum suml 128 according to the amount rsa2 determined by the post-alignment calculation unit 136.
[0084] The post-alignment module 300 further includes a sticky digit generator 320 and a second right shifting module 324. The sticky digit sticky2 is used to track the values of one or more digits of the intermediate sum suml 128 that are lost due to the right shifting, but which may also be required for modules downstream of the post-alignment module 300. The post-alignment module 300 further includes a second miscellaneous signal generation module 328. In particular, a sign of the final result and an exponent of the final result 152 are updated in the second miscellaneous signal generation module 328. For example, the adjusted exponent exp2 is updated based on the amount of right shifting according to exp2 = exp1 + rsa2.
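The exponent update can be checked with a short Python sketch. It assumes, for simplicity, an ordinary non-negative integer significand instead of the signed-digit representation, and the function names are hypothetical. When only trailing zeros are shifted out, the represented value is unchanged.

```python
from fractions import Fraction


def shift_right(significand, exp1, rsa2):
    """Drop rsa2 low-order decimal digits and raise the exponent by rsa2."""
    return significand // 10 ** rsa2, exp1 + rsa2


def value(significand, exponent):
    """Exact value significand * 10**exponent."""
    return significand * Fraction(10) ** exponent


sig2, exp2 = shift_right(123000, -5, 3)
print(sig2, exp2, value(sig2, exp2) == value(123000, -5))
```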
[0085] Referring now to Figure 8, therein illustrated is an exemplary leading non-zero digit detection unit 400. The leading non-zero digit detection unit 400 consists essentially of a simple leading non-zero digit detector 404 and a leading non-zero digit corrector 408. The leading non-zero digit detection unit 400 receives an operand 410 having digits in a signed digit-set having a range of [m, n], m ≥ -8, n ≤ 8, ABS(m - n) > 9. For example, the operand 410 consists of a significand; a sign bit or exponent is not required. Where a sign bit or exponent is included, it does not need to be considered in the leading non-zero digit detection. The signed digit-set in the range of [m, n] can be a decimal redundant encoding of digits in the range [0,9].
[0086] According to the exemplary leading non-zero digit detection unit 400, the simple leading non-zero digit detector 404 is operable to detect an initial position of the leading non-zero digit in the string of digits of the operand 410. The initial position of the leading non-zero digit corresponds to the most significant non-zero digit and is typically the left-most non-zero digit in the string of digits of the significand. The simple leading non-zero digit detector 404 can be implemented according to any method known in the art.
[0087] Due to the use of signed digit sets, the initial position of the leading non-zero digit detected by the simple leading non-zero digit detector 404 may not correspond to the effective most significant non-zero digit. Concluding that the initial position of the leading non-zero digit is the effective position of the leading non-zero digit can lead to the misinterpretation of which digits of the operand 410 are significant figures. In any signed digit set used as redundant encoding of the [0,9] digit-set, the presence of an initial leading non-zero digit equal to 1 or 1̄ (an overbar denoting a negative digit) followed by a next non-zero less significant digit that is of the opposite sign will result in the initial position of the leading non-zero digit not corresponding to the effective most significant non-zero digit. This is because the next non-zero less significant digit of the opposite sign either subtracts from (where the initial leading non-zero digit equals 1 and the next non-zero less significant digit is negative) or adds to (where the initial leading non-zero digit equals 1̄ and the next non-zero less significant digit is positive) the leading non-zero digit, thereby causing the leading 1 or 1̄ digit to be converted to 0. In these cases, the effective position of the leading non-zero digit is found at a position of a digit less significant than the initial leading non-zero digit detected by the simple leading non-zero digit detector 404.
[0088] By way of example, in a signed digit set having a range [x, y], x ≤ -9, y ≥ 9 for redundant encoding of digits in the range [0,9], the presence of the digit 9 or 9̄ in a plurality of digits immediately less significant than the leading 1 or 1̄ digit causes these digits to also be converted to 0, and thereby not represent the effective position of the leading non-zero digit. The necessity to examine many digits immediately less significant than the leading 1 or 1̄ digit introduces additional complexity. For example, in a significand "19̄9̄8̄2345", the leading 1 will be converted to 0 as a result of the next non-zero less significant digit being negative. Furthermore, due to the presence of multiple 9̄ digits immediately less significant than the leading 1, each of the 9̄ digits will also be converted to 0. The significand "19̄9̄8̄2345" when converted becomes "00022345". It will be appreciated that whereas the initial position of the leading non-zero digit is detected as the most significant digit having the value 1, the converted significand "00022345" has a position of the leading non-zero digit at its fourth most significant digit (the leftmost 2). Similarly, the significand "1̄9982345" becomes "0001̄7̄6̄5̄5̄".
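The conversion in this example can be reproduced with a small Python sketch. It reads the first significand as the digit string 1, -9, -9, -8, 2, 3, 4, 5 and the second as -1, 9, 9, 8, 2, 3, 4, 5; which digits carry a negative sign is an assumption, chosen because it reproduces the stated results. The function names are hypothetical.

```python
def signed_digits_to_int(digits):
    """Evaluate a signed-digit decimal string, most significant digit first."""
    total = 0
    for d in digits:
        total = total * 10 + d
    return total


def to_fixed_width(number, width):
    """Format the magnitude with leading zeros, prefixing '-' for negative values."""
    sign = "-" if number < 0 else ""
    return sign + str(abs(number)).zfill(width)


first = signed_digits_to_int([1, -9, -9, -8, 2, 3, 4, 5])
second = signed_digits_to_int([-1, 9, 9, 8, 2, 3, 4, 5])
print(to_fixed_width(first, 8), to_fixed_width(second, 8))
```

The first value has its leading non-zero digit three positions below the initial leading 1, which is the multi-digit correction that the restricted digit range is meant to avoid.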
[0089] It has been discovered that in an operand having digits in the range [m, n], m ≥ -8, n ≤ 8, ABS(m - n) > 9, the absence of a 9 or 9̄ digit (an overbar denoting a negative digit) results in the effective position of the leading non-zero digit being at most only one digit position different from the position of the initial non-zero digit detected by the simple leading non-zero digit detector 404. This is because a negative carry propagates by at most one digit position. For example, a significand "18̄8̄7̄2345" will be converted to "01132345" and the position of the initial leading non-zero digit needs to be corrected by only one digit position.
[0090] According to various exemplary embodiments, the leading non-zero digit corrector 408 is operable to selectively correct the initial position of the leading non-zero digit by at most one digit in the less significant direction based on pattern analysis of the digits of the significand operand. For example, TABLE 3 shows all the possible string patterns for the input operand 410 that need to be considered for making a decision as to whether or not the initial position of the leading non-zero digit should be corrected. Where correction is not required, the initial position of the leading non-zero digit corresponds to the effective position of the leading non-zero digit. Where correction is required, the position of the next less significant digit of the initial leading non-zero digit is the effective position of the leading non-zero digit.
Figure imgf000026_0001
z: (s = 0); p: (s > 0); n: (s < 0); p+: (s > 1); n-: (s < -1);
k ≥ 0, l ≥ 0; x: don't care
TABLE 3, where the sign of the operand 410 (suml) corresponds to the sign of the initial leading non-zero digit detected by the simple leading non-zero digit detector 404.
[0091] As shown, the number of leading zeros k is increased (i.e. the position of the initial leading non-zero digit is to be corrected by one position in the less significant direction) in only two situations. These situations arise either when the value of the initial leading non-zero digit is 1 and the value of the next non-zero less significant digit after the initial leading non-zero digit is negative; or when the value of the initial leading non-zero digit is -1 and the value of the next non-zero less significant digit after the initial leading non-zero digit is positive. In these two situations, the effective position of the leading non-zero digit is one position further in the less significant direction than the position of the initial leading non-zero digit. Accordingly, the leading non-zero digit corrector 408 corrects the position of the initial leading non-zero digit detected by the simple leading non-zero digit detector 404 to the position one digit over in the less significant direction. In all other situations, the position of the initial leading non-zero digit detected by the simple leading non-zero digit detector 404 is the effective position and does not need to be corrected by the leading non-zero digit corrector 408.
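The two correction situations can be written as a short Python sketch. The function name and the convention of counting positions as 0-based indices from the most significant end are assumptions for illustration; digits are taken to lie in a range such as [-8, 8], so that a correction of a single position suffices.

```python
def leading_positions(digits):
    """Return (initial, effective) indices of the leading non-zero digit.

    digits are signed decimal digits, most significant first. Returns
    (None, None) when every digit is zero.
    """
    initial = next((i for i, d in enumerate(digits) if d != 0), None)
    if initial is None:
        return None, None
    lead = digits[initial]
    # Next non-zero digit after the initial leading digit, or 0 if there is none.
    following = next((d for d in digits[initial + 1:] if d != 0), 0)
    correct = (lead == 1 and following < 0) or (lead == -1 and following > 0)
    return initial, (initial + 1 if correct else initial)


print(leading_positions([1, -8, -8, -7, 2, 3, 4, 5]))
```

For the digit string 1, -8, -8, -7, 2, 3, 4, 5 (value 1132345) the initial position is the first digit and the effective position is the second digit.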
[0092] Referring now to Figure 9, therein illustrated is an exemplary hardware structure of a portion of the leading non-zero digit corrector 408 implemented in hardware as a tree structure. Each node of the tree structure has a left branch input and a right branch input, and provides an output based on the left branch input value and the right branch input value. The root node 412 has as its left branch input 414 the most significant digit of the operand 410, has as its right branch input 416 the second most significant digit of the operand 410, and has an output 417. A child node 418 has as its left branch input 420 the output of its parent node and has as its right branch input 422 the next less significant digit after the digit that is the right branch input of its parent node. The child node further has an output 424. As shown in Figure 9, the child node 418 has as its parent node the root node 412. The leaf node 426 has as its left branch input 428 the output of its parent node and has as its right branch input 430 the least significant digit of the operand 410. The output value 432 of the leaf node 426 indicates whether or not the initial position of the leading non-zero digit detected by the simple leading non-zero digit detector 404 should be corrected.
[0093] According to one exemplary embodiment, for each node, the output is determined according to the following equations:
if the initial non-zero digit is positive:
node(d) = p+ if p+^l · z^r + z^l · p+^r
node(d) = po if z^l · po^r + po^l · z^r
node(d) = z if z^l · z^r
node(d) = n if n^l + z^l · n^r
node(d) = y if y^l + z^l · y^r + po^l · n^r; or
if the initial non-zero digit is negative:
node(d) = n- if n-^l · z^r + z^l · n-^r
node(d) = no if z^l · no^r + no^l · z^r
node(d) = z if z^l · z^r
node(d) = p if p^l + z^l · p^r
node(d) = y if y^l + z^l · y^r + no^l · p^r
wherein for a digit signal s, p+: (s > 1); po: (s = 1); p: (s > 0); z: (s = 0); no: (s = -1); n-: (s < -1); n: (s < 0), the superscripts l and r denote the left branch input and the right branch input of the node, node(d) denotes the output of any particular node, and wherein if the output of the leaf node is equal to y, the leading non-zero digit corrector 408 corrects the initial position of the leading non-zero digit to the next less significant digit. For example, the equations can be implemented using combinational logic within each node.
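The node equations for a positive initial non-zero digit can be sketched in Python as a chain of nodes, each combining the output of its parent with the next digit. The function names are hypothetical. The extra state "x", used for input combinations that none of the listed equations covers (for example a leading 1 followed by a positive digit), is an assumption of this sketch and is not taken from the source; it simply records that no correction can arise any more.

```python
def classify(digit):
    """Map a signed digit to the signal classes used by the positive tree."""
    if digit > 1:
        return "p+"
    if digit == 1:
        return "po"
    if digit == 0:
        return "z"
    return "n"


def node(left, right):
    """Combine a left branch input and a right branch input into a node output."""
    if left == "y" or (left == "z" and right == "y") or (left == "po" and right == "n"):
        return "y"
    if left == "n" or (left == "z" and right == "n"):
        return "n"
    if (left == "p+" and right == "z") or (left == "z" and right == "p+"):
        return "p+"
    if (left == "z" and right == "po") or (left == "po" and right == "z"):
        return "po"
    if left == "z" and right == "z":
        return "z"
    return "x"  # assumed "no correction possible" state, not in the listed equations


def needs_correction(digits):
    """Return True when the leaf output is y (digits most significant first)."""
    state = classify(digits[0])
    for d in digits[1:]:
        state = node(state, classify(d))
    return state == "y"


print(needs_correction([1, -8, -8, -7, 2, 3, 4, 5]))
```

The tree for a negative initial non-zero digit would mirror this logic with the signs exchanged.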
[0094] The equations for a node can be presented in the following Table 4:
Figure imgf000028_0001
TABLE 4
[0095] For example, as shown in Figure 9, the tree structure is shown for when the initial leading non-zero digit is positive. According to various exemplary embodiments, two tree structures can be implemented in parallel, with a first tree structure for the first case of the initial leading non-zero digit being positive and a second tree structure for the second case of the initial leading non-zero digit being negative. That is, each node of the first tree structure, corresponding to a positive initial leading non-zero digit, determines a node output based on the equations:
node(d) = p+ if p+^l · z^r + z^l · p+^r
node(d) = po if z^l · po^r + po^l · z^r
node(d) = z if z^l · z^r
node(d) = n if n^l + z^l · n^r
node(d) = y if y^l + z^l · y^r + po^l · n^r;
and each node of the second tree structure, corresponding to a negative initial leading non-zero digit, determines a node output based on the equations:
node(d) = n- if n-^l · z^r + z^l · n-^r
node(d) = no if z^l · no^r + no^l · z^r
node(d) = z if z^l · z^r
node(d) = p if p^l + z^l · p^r
node(d) = y if y^l + z^l · y^r + no^l · p^r
If the output of either one of the root nodes of the first tree structure and the second tree structure is equal to y, then the position of the initial leading non-zero digit should be corrected. For example, the outputs of the root nodes of the two tree structures can be passed through an OR gate.
[0096] Referring back to Figure 8, according to one exemplary embodiment, the initial position of the leading non-zero digit LOP is outputted from the simple leading non-zero digit detector 404. The output LOP is fed via a first path through a decrementer 436 (LOP - 1) to a selector 440 and via a second path directly to the selector 440. It will be appreciated that the first path corresponds to when the initial position is to be corrected, and the second path corresponds to when the initial position is the effective position and does not need to be corrected. The correct value between the two paths is selected by the selector 440 based on the output of the leading non-zero digit corrector 408. The output of the selector 440 is the effective position LOP' of the leading non-zero digit of the operand 410.
[0097] Referring now to Figure 10, therein illustrated is an exemplary portion of the post-alignment module 300 being used in conjunction with a simple leading non-zero digit detector 404 and a leading non-zero digit corrector 408 as described with reference to the exemplary leading non-zero digit detection unit 400. Both the simple leading non-zero digit detector 404 and the leading non-zero digit corrector 408 have as their input operand 410 the intermediate sum suml 128 outputted from the adder module 124. According to the exemplary post-alignment module 300, the output LOP of the simple leading non-zero digit detector 404 is the initial position of the leading non-zero digit.
[0098] Referring back to the five possible cases of right shifting of the intermediate sum suml 128, in the second case of Figure 7B, the fourth case of Figure 7D and the fifth case of Figure 7E, the amount of the shifting in the less significant direction rsa2 of the intermediate sum suml 128 is the difference between the effective position of the leading non-zero digit and the size n of the post-aligned sum 144. Where the intermediate sum suml 128 has digits in a signed digit-set having a range of [m, n], m ≥ -8, n ≤ 8, ABS(m - n) > 9, the initial position of the leading non-zero digit of the significand is corrected by at most one position in the less significant direction. Accordingly, calculating the difference between the effective position of the leading non-zero digit (LOP') and the size n of the post-aligned sum 144 can be divided into two situations.
[0100] In the first situation, the position of the initial non-zero digit LOP does not need to be corrected by the leading non-zero digit corrector 408. For example, where the size n of the post-aligned sum 144 equals 16, LOP' - 16 = LOP - 16.
[0101] In the second situation, the position of the initial non-zero digit LOP must be corrected by the leading non-zero digit corrector 408 by correcting the position of the initial leading non-zero digit by one position in the less significant direction. For example, where the size n of the post-aligned sum 144 equals 16, LOP' - 16 = LOP - 15.
[0102] Referring to Figure 10, the determination of the amount of shifting in the less significant direction rsal for both situations of the effective position of the leading non-zero digit LOP' is calculated in parallel. For example, a first decision module 450 applies the equations for determining rsal for the first situation where the initial position of the leading non-zero digit LOP does not need to be corrected. For example, the first decision module 450 includes a first subtraction module 454 that calculates the difference between the position of the leading non-zero digit and the size n of the post-aligned sum 144 (ex: 16 digits) as LOP - 16.
[0103] For example, a second decision module 460 applies the equations for determining rsal for the second situation where the initial position of the leading non-zero digit LOP is corrected by one position in the less significant direction. For example, the second decision module 460 includes a second subtraction module 464 that calculates the difference between the initial position of the leading non-zero digit and the size n of the post-aligned sum 144 (ex: 16 digits) as LOP - 15.
[0104] The exemplary post-alignment calculation unit 448 further includes a selector 468, which receives the output of the first decision module 450 and the output of the second decision module 460 and selects the correct output based on the output of the leading non-zero digit corrector 408.
[0105] Advantageously, according to the exemplary post-alignment calculation unit 448 of Figure 10, the use of a separate simple leading non-zero digit detector 404 and leading non-zero digit corrector 408 allows the determination of the amount of right shifting rsal and the determination of the correction of the initial position of the leading non-zero digit to be carried out in parallel. This achieves a time saving in the post-alignment calculation unit 448. In particular, the first decision module 450 and the second decision module 460 both apply the equations for determining rsal for the two possible cases of the initial digit correction at the same time as the leading non-zero digit corrector 408 determines whether correction is required. By contrast, and by way of example, if another post-alignment calculation unit receives an effective position of the leading non-zero digit before determining rsal, the determination of whether correction of the initial position of the leading non-zero digit is required and the determination of the shifting amount rsal will have to be carried out sequentially.
[0106] Referring now to Figure 11, therein illustrated is an exemplary combination rounder-conversion module 900 for rounding an input operand formed of a sign digit sign2 902 and a significand in 903 having l digits (in[l - 1: 0]) in a signed digit-set having a range of [m, n], m > -9, n ≤ 8, ABS(m - n) > 9.
[0107] In the significand having l digits, the l - 1 most significant digits are significant figures, which are to be rounded according to the least significant digit of the significand (in{0}), herein referred to as the rounding digit.
[0108] The sign digit sign2 902 corresponds to a sign of the input operand when taking into account prior redundant encoding of an initial operand initially represented in an unsigned digit set into a signed digit-set.
[0109] The combination rounder-conversion module 900 also receives a sticky digit sticky2, which represents the values of the digits less significant than the rounding digit. According to various exemplary embodiments, the sticky digit can be representative of the value of the next non-zero digit less significant than the rounding digit. For example, the sticky digit can be represented in a digit-set having a range that is smaller than the range of the digit-set of the significand. For example, the sticky digit can be represented in two bits to denote whether the next non-zero less significant digit is positive, negative or equal to 0.
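A three-valued sticky digit of this kind can be sketched in Python as follows. The function name and the list representation of the digits below the rounding digit (most significant first) are assumptions for illustration.

```python
def sticky_digit(discarded):
    """Return 1, -1 or 0 according to the sign of the first non-zero discarded digit.

    discarded holds the digits less significant than the rounding digit,
    most significant first; 0 is returned when all of them are zero.
    """
    for d in discarded:
        if d > 0:
            return 1
        if d < 0:
            return -1
    return 0


print(sticky_digit([0, 0, -3, 5]))
```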
[0110] Due to the digits of the significand being in a signed digit-set and the operand being signed, two specific factors affect the rounding of the significant figures of the input number. For example, in a given positive first operand having a significand "ssss..ssss51234" and a given positive second operand having a significand "ssss..ssss51̄234", wherein s denotes a significant figure and an overbar denotes a negative digit, the digit 5 immediately to the right of the least significant significant figure is the rounding digit in both operands. According to a ties-to-away rounding mode, the first operand will be rounded to "ssss..ssss + 1", while the second operand will be rounded to "ssss..ssss" due to the rounding digit 5 being decreased by the digit 1̄ in the next less significant digit.
[0111] For example, whereas the positive first operand having the significand "ssss..ssss51234" is rounded to "ssss..ssss + 1", if the first operand is negative according to the sign bit of the operand, the significand "ssss..ssss51234" is negated to "s̄s̄s̄s̄..s̄s̄s̄s̄5̄1̄2̄3̄4̄" and the negated significand will be rounded to "ssss..ssss - 1". For example, for a third operand having the significand "s̄s̄s̄s̄..s̄s̄s̄s̄5̄1̄2̄3̄4̄", where the sign of the third operand is negative (sign2 = 1), the significand will also be negated to "ssss..ssss51234", and the negated significand will be rounded to "ssss..ssss - 1".
[0112] It will be appreciated that the rounding of significant figures by the rounding digit depends on both the value of the next non-zero digit less significant than the rounding digit and on the sign of the operand. The rounding of the significant figures can further depend on the sign of the significand (as denoted by the sign of the leading non-zero digit of the significand). Rounding is furthermore always based on the value of the rounding digit.
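The dependence on the rounding digit and the sticky digit can be illustrated for one narrow situation with a Python sketch. It covers only a positive operand with a positive significand under ties-to-away rounding and a rounding digit in the range [-8, 7]; the decision rule is derived here from ordinary arithmetic and is not a transcription of the rounding table, and the function name is hypothetical.

```python
def ties_away_increment(rounding_digit, sticky):
    """Increment (+1, 0 or -1) applied to the significant figures.

    rounding_digit -- signed digit in [-8, 7] directly below the kept digits
    sticky         -- 1, -1 or 0: sign of the next non-zero lower digit
    Assumes a positive operand with a positive significand.
    """
    if rounding_digit > 5 or (rounding_digit == 5 and sticky >= 0):
        return 1     # at or above the half-way point: round away from zero
    if rounding_digit < -5 or (rounding_digit == -5 and sticky < 0):
        return -1    # strictly below minus one half: borrow from the kept digits
    return 0


# Rounding digit 5 followed by a positive digit, then by a negative digit.
print(ties_away_increment(5, 1), ties_away_increment(5, -1))
```

The two calls correspond to a rounding digit of 5 followed by a positive and by a negative next digit, respectively.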
[0113] Continuing with Figure 11, the exemplary rounder-conversion module includes an invertor 904 for selectively inverting the l - 1 most significant digits of the significand of the operand. These digits can be the significant figures of the significand of the operand. The inversion is based on the sign of the operand. For example, sign2 = 0 denotes that the operand is positive and sign2 = 1 denotes that the operand is negative. Accordingly, using an XOR array as the invertor 904 having the inputs sign2 and the l - 1 most significant digits of the significand, the invertor 904 will invert the l - 1 most significant digits of the significand if the operand is negative. The selectively negated l - 1 most significant digits of the significand are outputted as an inverted intermediate.
[0114] The exemplary rounder-conversion module further includes a calculation unit for determining the propagation bits of the l - 2 most significant digits of the significand and a calculation unit for determining the generation bits of the l - 2 most significant digits of the significand. According to the example shown in Figure 11, the calculation units for generating the propagation bits of the l - 2 most significant digits of the significand and the generation bits of the l - 2 most significant digits of the significand are implemented within a single unit 908. As shown in Figure 11, the single unit is an (l - 2)-bit prefix tree structure; however, various other known methods of propagation bit and generation bit calculation may be used.
[0115] The exemplary rounder-conversion module further includes a rounding increment generation unit for determining an increment value RDinc. RDinc is the value by which the l - 1 most significant digits of the significand, representing l - 1 significant figures, should be incremented. Taking into account various combinations of the sign of the operand, the sign of the significand, the value of the rounding digit, and the value of the sticky digit, it has been discovered that the following TABLE 4 provides a complete set of possible rounding increments based on the various combinations for a digit-set in the range of [-8,7] for various modes of rounding.
Figure imgf000034_0001
TABLE 4, wherein RD denotes the value of the Rounding Digit, SD denotes the value of the Sticky Digit, x denotes don't care, and LE denotes that the least significant figure is even. The sticky digit is equal to -1 if the next non-zero digit less significant than the rounding digit is negative, the sticky digit is equal to 1 if the next non-zero digit less significant than the rounding digit is positive, and the sticky digit is equal to 0 if all digits less significant than the rounding digit are equal to 0. SignF denotes the sign of the final result, and is determined according to: SignF = Sign2 ⊕ SignS, wherein SignS is the sign of a first addend that was added to (or subtracted from) a second addend to obtain the input operand in.
[0116] The exemplary rounder-conversion module 900 further includes a negative carry generation unit for determining a negative carry signal. Whether a negative carry will arise depends on the least significant figure of the significand, which corresponds to the digit in{1}, on that figure as incremented by the increment value RDinc, and on the sign of the operand, Sign2. Taking into account various combinations of the sign of the operand Sign2, the value of the least significant figure of the significand and the increment value RDinc, according to one exemplary embodiment, it has been discovered that the following equations provide a complete set of possible values of the least significant negative carry digit NC{0}.
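$$NC\{0\}_{+1} = \begin{cases} 1 & \text{if } Sum2\{1\} < -1 \text{ or } (Sum2\{1\} = -1 \;\&\; Sign2 = 1) \\ 0 & \text{otherwise} \end{cases}$$
Figure imgf000035_0001
$$NC\{0\}_{-1} = \begin{cases} 1 & \text{if } Sum2\{1\} < 1 \text{ or } (Sum2\{1\} = 1 \;\&\; Sign2 \ldots) \\ 0 & \text{otherwise} \end{cases}$$
Figure imgf000035_0002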
[0117] The negative carry generation unit can further generate the remainder of the negative carry signal and further generate a complete carry signal C. According to the value of the least significant negative carry digit, the rest of the negative carry can be determined based on the determined l - 2 propagation bits and the determined l - 2 generation bits. For example, the remainder of the negative carry signal can be determined according to the equation:
$$NC_{l-1:0} = G_{l-2:0} \;\&\; (P_{l-2:0} \mid NC\{0\})$$
and the complete carry signal C can be determined according to:
C = {NC[l - 2: 0]; Sign2}
[0118] As shown in Figure 11, the negative carry generation unit 912 is implemented with a negative carry signal least significant digit generator 912 that is discrete from a complete carry signal generator 916. According to various exemplary embodiments, the negative carry signal least significant digit generator 912 can be implemented in combinational logic according to known methods. According to various exemplary embodiments, the complete carry signal generator 916 can be implemented in combinational logic according to known methods.
[0119] Advantageously, implementing the negative carry signal least significant digit generator 912 separately allows the determination of the least significant negative carry digit NC{0} to be carried out in parallel with the generating of the l - 2 propagation bits and the generating of the l - 2 generation bits in the prefix tree structure 908. The outputs of the negative carry signal least significant digit generator 912 and the prefix tree structure 908 can be readily combined in the complete carry signal generator 916.
[0120] The exemplary rounder-conversion module 900 further includes a correction signal generator 920 for generating a correction signal Cor2. The correction signal Cor2 represents the amount of correction of the significant figure digits of the significand in based on the rounding increment, the selective negation of the significand (i.e. the sign of the operand Sign2), and the complete carry signal C. Taking into account various combinations of the complete carry signal C and the rounding increment, according to one exemplary embodiment, it has been discovered that the following equations provide a complete set of possible values for the correction signal Cor2.
Figure imgf000037_0001
[0121] The exemplary rounder-conversion module 900 further includes an adder 924 for adding digits of the correction signal Cor2 with corresponding digits of the selectively negated l - 1 most significant digits of the significand in, outputted as the inverted intermediate. The resulting sum is a rounded and digit-set converted result representing a final result. For example, the resulting sum is in the conventional BCD digit-set [0,9] and is the final result 152. According to one exemplary embodiment, the adder 924 is a carry look-ahead array; however, it will be understood that any other suitable adder known in the art may be used to add the correction signal Cor2 with the inverted intermediate sum.
[0122] Advantageously, the exemplary rounder-conversion module is free of positive carry propagation. That is, the module will not experience positive carry propagation. Only negative carry (borrow) propagation is experienced.
[0123] According to various exemplary embodiments, the rounder-conversion module can be used in any design for carrying out arithmetic operations wherein an intermediate operand having the same properties as the operand is generated.
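The absence of positive carry propagation noted in paragraph [0122] can be illustrated with a small software model. The Python sketch below converts a signed-digit number with digits in [-8,7] into conventional BCD digits in [0,9] using borrows only. It is a sequential illustration under stated assumptions, not the parallel correction-and-add circuit of module 900: the function name `signed_digits_to_bcd` and the most-significant-digit-first list layout are hypothetical choices made here, and rounding is not modelled.

```python
def signed_digits_to_bcd(digits):
    """Convert signed digits (MSD first, each in [-8, 7]) to (sign, BCD digits).

    Assumption for this sketch: the sign of the value equals the sign of the
    leading non-zero digit, so a negative value is handled by negating every
    digit first (a software stand-in for the selective inversion step).
    """
    lead = next((d for d in digits if d != 0), 0)
    sign = 1 if lead < 0 else 0
    if sign:
        digits = [-d for d in digits]      # digits now lie in [-7, 8]

    out, borrow = [], 0
    for d in reversed(digits):             # least significant digit first
        t = d - borrow                     # never exceeds 8: no positive carry
        borrow = 1 if t < 0 else 0         # only a borrow can propagate
        out.append(t + 10 * borrow)
    return sign, out[::-1]


if __name__ == "__main__":
    # 1*10^4 - 1*10^2 + 1*10 + 2 = 9912
    print(signed_digits_to_bcd([1, 0, -1, 1, 2]))
```

Because every input digit is at most 7 (or at most 8 after negation), a digit minus an incoming borrow can never exceed 9, which is why only borrows need to be propagated in this model.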
[0124] According to various exemplary embodiments, the combined rounder-conversion module 900 can be included in the decimal floating-point multiplier adder 100. The significand in of the input operand for the combined rounder-conversion module 900 is the post-aligned sum sum2 144. The sign of the input operand sign2 is the sign of the initial leading non-zero digit of the intermediate sum sum1 128. The sticky digit sticky2 of the input operand for the combined rounder-conversion module 900 represents values of digits that overflow due to shifting of the intermediate sum sum1 in the less significant direction. The output CR 152 of the adder array 924 is the significand portion of the final result.
[0125] The rounder conversion module 900 can further include a sign generation module 930 for determining a sign SR of the final result. For example, sign SR is equal to signF and is determined based on the sign of the input operand sign2 and the sign of the first operand SX and the sign of the second operand SY of the decimal floating-point fused multiplier adder 100. For example, signF = sign2 ⊕ (SX ⊕ SY).
[0126] The rounder conversion module 900 can further include an exponent generation module 934 for determining an exponent ER of the final result. For example, ER is equal to the sum of the exponent exp1 of the intermediate sum sum1 and the amount of shifting rsa2 in the post-alignment module 300.
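The sign and exponent relations of paragraphs [0125] and [0126] are simple enough to state as code. The following Python sketch is only a restatement of those two formulas; the function names `result_sign` and `result_exponent` are hypothetical, and signs are assumed to be encoded as 0 (positive) or 1 (negative).

```python
def result_sign(sign2, sx, sy):
    """signF = sign2 XOR (SX XOR SY), with each sign encoded as 0 or 1."""
    return sign2 ^ (sx ^ sy)


def result_exponent(exp1, rsa2):
    """ER = exp1 + rsa2: each digit shifted out to the right raises the exponent by one."""
    return exp1 + rsa2


if __name__ == "__main__":
    print(result_sign(0, 0, 0), result_exponent(1, 31))
```

The XOR form reflects that sign2 is expressed relative to the sign of the product, so a negative product (SX and SY differing) flips the sign of the final result.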
[0127] The significand CR of the final result, exponent ER of the final result, and sign SR of the final result are provided to the post processor 156 and DPD Encoder 160 to compute the processed output 164.
[0128] Referring back to Figure 2, the sticky digit generator 320 for determining sticky digit sticky2 is included as part of the exemplary post-alignment module 300. According to one exemplary embodiment, the sticky digit generator 320 is implemented as two prefix tree structures for determining a value p and a value z. The detection algorithm for the sticky digit is similar to the carry propagation process. For example, the value p can be determined according to:
$$p = p^{l} + z^{l} \cdot p^{r}$$
and the value z can be determined according to:
$$z = z^{l} \cdot z^{r}$$
in which the superscripts l and r denote the left and right branch inputs of a tree node, and wherein if p is equal to 1, the next non-zero digit less significant than the rounding digit of the post-aligned sum 144 (the least significant digit of the post-aligned sum 144) is positive; if z is equal to 1, all digits less significant than the rounding digit are equal to zero; and if both p and z are 0, the next non-zero digit less significant than the rounding digit of the post-aligned sum 144 (the least significant digit of the post-aligned sum 144) is negative. It will be appreciated that the sticky digit can be represented in 2 bits.
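The p and z recurrences above can be modelled as an associative combine over the digits that are shifted out. The Python sketch below is a minimal illustration, assuming the shifted-out digits are supplied most significant first; the names `combine` and `sticky_digit` are hypothetical, and a sequential fold stands in for the prefix tree structures of the sticky digit generator 320.

```python
from functools import reduce


def combine(left, right):
    """Merge two (p, z) pairs: p = p_l + z_l * p_r, z = z_l * z_r."""
    p_l, z_l = left
    p_r, z_r = right
    return (p_l or (z_l and p_r), z_l and z_r)


def sticky_digit(shifted_out):
    """Return +1, 0 or -1 for the digits shifted out (most significant first)."""
    leaves = [(d > 0, d == 0) for d in shifted_out]
    # (False, True) is the identity element: "nothing seen yet, all zero so far".
    p, z = reduce(combine, leaves, (False, True))
    if p:
        return 1      # first non-zero shifted-out digit is positive
    if z:
        return 0      # every shifted-out digit is zero
    return -1         # first non-zero shifted-out digit is negative


if __name__ == "__main__":
    print(sticky_digit([0, 0, 3, -2]), sticky_digit([0, -1, 5]), sticky_digit([0, 0]))
```

Since `combine` is associative, the same pairs could be merged in any tree shape, which is what makes a prefix tree implementation possible.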
EXAMPLE 1
According to a first exemplary calculation:
Input:
SX = 0 CX = 0963625455443960 EX = 18
SY = 0 CY = 7S2S17S241591672 EY = -1
SZ = 1 CZ = 9999358S77665432 EZ = 31
Calculation:
1. Multiplication:
Product = 012463432142204420125102041301120
2. Pre-Alignment:
EP = 17 EZ = 31
Lsa1 = 14 (active) Rsa1 = -14
CZsfc = 11../11αΜ099995377654311...11
EOP = 1 Sticky1 = 00 (zero) Exp1 = 1
3. Addition:
012463432142204420125102041301120 -00...00999953387766543200. -00
00...001345656323043Π 231251020473011200...00
4. Post-Alignment:
Rsa2 = 31
Sum2 = 34565632304311231
Sign2 = 0 Sticky2 = 01 (positive) Exp2 = 32
5. Rounding:
RD = 1
RDinc = 0 Chd = 1
C = 11111100100010110
Output:
SR = 0 CR = 6043443170429077 ER = 32
EXAMPLE 2
According to this example, n = 4.
Input:
SX = 0 CX = 3960 EX = 18
SY = 0 CY = 1672 EY = -1
SZ = 1 CZ = 6522 EZ = 20
Calculation:
1. Multiplication:
CY = 02332
2CX = 12120
3CX = 12120
Partial products: 12120, 12120, 12120, 12120
Product = 013421120
2. Pre-Alignment:
EP = 17 EZ = 20
Lsa1 = 3 (active)
CZsk = 17633111
EOP = 1 Sticky1 = 00 (zero) Exp1
3. Addition:
013421120 (Product)
17633111 (CZsk)
11111111 (EOP)
00101120
4. Post-Alignment:
Rsa2 = 5
Sum2 = 0112
Sign2 = 0 Sticky2 = 00 (zero) Exp2 = 18
5. Rounding:
RD = 0
RDinc = 0 Chd = 0
C = 11000
Output:
SR = 0 CR = 9912 ER = 18
[0129] While the above description provides examples of the embodiments, it will be appreciated that some features and/or functions of the described embodiments are susceptible to modification without departing from the spirit and principles of operation of the described embodiments. Accordingly, what has been described above has been intended to be illustrative and non-limiting and it will be understood by persons skilled in the art that other variants and modifications may be made without departing from the scope of the invention as defined in the claims appended hereto.
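The input and output values of Example 2 can be cross-checked in software. The Python sketch below uses the standard `decimal` module to evaluate X × Y + Z with a single rounding to n = 4 digits. It checks only the arithmetic result, not the internal signed-digit datapath; the helper names `operand` and `fused_multiply_add` are hypothetical, and round-half-even is an assumed rounding mode (the result here is exact, so the mode does not affect it).

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN


def operand(sign, coefficient, exponent):
    """Build a Decimal from a (sign, coefficient, exponent) triple."""
    digits = tuple(int(ch) for ch in str(coefficient))
    return Decimal((sign, digits, exponent))


def fused_multiply_add(x, y, z, n):
    """Return x*y + z rounded once to n significant digits."""
    ctx = Context(prec=n, rounding=ROUND_HALF_EVEN)
    return ctx.fma(x, y, z)


if __name__ == "__main__":
    x = operand(0, 3960, 18)    # SX, CX, EX
    y = operand(0, 1672, -1)    # SY, CY, EY
    z = operand(1, 6522, 20)    # SZ, CZ, EZ
    result = fused_multiply_add(x, y, z, n=4)
    print(result.as_tuple())
```

The product 3960 × 1672 = 6621120 at exponent 17 minus 6522000 at the same exponent leaves 99120 at exponent 17, which fits in four digits only as 9912 at exponent 18. The returned tuple therefore has sign 0, coefficient 9912 and exponent 18, in agreement with the output listed for Example 2.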

Claims

CLAIMS:
1. A decimal floating-point multiplier-adder for carrying out addition and multiplication operations on a first operand, a second operand and a third operand, the multiplier-adder comprising:
a multiplication module for multiplying a first significand of the first operand and a second significand of the second operand to output an intermediate product;
a pre-alignment module for shifting a third significand of the third operand to output a pre-aligned addend;
an addition module for adding the intermediate product and the pre-aligned addend to output an intermediate sum; and
a post-alignment module for shifting the intermediate sum based on the shifting of the third significand of the third operand and a preferred exponent to output a post-aligned sum.
2. The decimal floating-point multiplier-adder of claim 1,
wherein the pre-alignment module shifts the third significand based on a difference between an exponent of the third operand and a sum of the exponents of the first and second operands;
wherein the significand of the first operand being unshifted and the significand of the second operand being unshifted are multiplied by the multiplication module; and
wherein the third significand is shifted by the pre-alignment module in parallel with the multiplying of the first significand and the second significand.
3. The decimal floating-point multiplier adder of any one of claims 1 or 2,
wherein the first significand has n digits, the second significand has n digits and the third significand has n digits; and
wherein the pre-aligned addend represents the significand of the third operand shifted in a range of 4n + 2 digits.
4. The decimal floating-point multiplier adder of claim 3, wherein the amount of the shifting of the significand of the third operand by the pre-alignment module is determined according to:
if (2n + 1 + LZD(CZ) < EZ - EP) then
    Lsa1 = 2n + 1 + LZD(CZ);
    OV = 10;
else if (0 ≤ EZ - EP ≤ 2n + 1 + LZD(CZ)) then
    Lsa1 = EZ - EP;
    OV = 00;
else if (0 < EP - EZ ≤ 2n) then
    Rsa1 = EP - EZ;
    OV = 00;
else if (2n < EP - EZ) then
    Rsa1 = 2n;
    OV = 01;
end
wherein LZD(CZ) is the position of the leading non-zero digit of the third significand, EZ is the exponent of the third significand, EP is the exponent of the intermediate product, Lsa1 is a determined amount of shifting in the more significant direction of the third significand, OV tracks the presence of overflow wherein OV = 10 denotes left shift overflow and OV = 01 denotes right shift overflow, and Rsa1 is a determined amount of shifting in the less significant direction of the third significand.
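The shift selection recited in claim 4 can be sketched as a small function. In the Python sketch below, the function name `pre_align_shift`, the returned `(direction, amount, ov)` tuple and the parameter `lzd_cz` are hypothetical choices made for illustration; `lzd_cz` is passed in as the value the claim calls LZD(CZ) rather than computed, since the sketch makes no assumption about how that quantity is derived.

```python
def pre_align_shift(ez, ep, n, lzd_cz):
    """Choose the pre-alignment shift of the third significand.

    Returns (direction, amount, ov) where ov is "10" for left shift overflow,
    "01" for right shift overflow and "00" otherwise.
    """
    diff = ez - ep
    left_limit = 2 * n + 1 + lzd_cz
    if diff > left_limit:
        return ("left", left_limit, "10")    # Lsa1 saturates: left overflow
    if 0 <= diff <= left_limit:
        return ("left", diff, "00")          # Lsa1 = EZ - EP
    if 0 < -diff <= 2 * n:
        return ("right", -diff, "00")        # Rsa1 = EP - EZ
    return ("right", 2 * n, "01")            # Rsa1 saturates: right overflow


if __name__ == "__main__":
    print(pre_align_shift(ez=20, ep=17, n=4, lzd_cz=0))
```

For the inputs of Example 2 (EZ = 20, EP = 17, n = 4, and assuming LZD(CZ) = 0), the function selects a shift of 3 in the more significant direction with no overflow, which agrees with the shift amount of 3 listed in the pre-alignment step of that example.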
5. The decimal floating-point multiplier adder of any one of claims 1 to 4, wherein the intermediate product has a digit-set in a range of [m, n], m ≥ -8, n < 8, ABS(m - n) ≥ 9.
6. The decimal floating-point multiplier adder of claim 5, wherein the first significand of the first operand has a digit-set in a range of [0,9], the second significand of the second operand has a digit-set of in a range of [0,9] and the intermediate product has a digit-set in a range of [-8,7].
7. The decimal floating-point multiplier adder of claim 6, wherein the multiplication module includes a partial product generator for generating 1X, 2X, 3X, 4X and 5X partial products of the significand of the first operand, a signed digit recoder for recoding the significand of the second operand to a digit-set having a range of [-5,5], and a partial products reduction module for summing the partial products based on digits of the recoded second significand.
8. The decimal floating-point multiplier adder of claim 7, wherein the partial products are generated according to:
Figure imgf000045_0001
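The recoding of the second significand to the digit-set [-5,5] and the use of the 1X to 5X multiples recited in claim 7 can be illustrated in software. The Python sketch below assumes a common parallel recoding rule (subtract 10 from a digit of 5 or more and add a transfer of 1 when the next less significant digit is 5 or more); the claims do not state which recoding rule is used, so this rule and the names `recode_minus5_to_5` and `multiply` are assumptions for illustration only.

```python
def recode_minus5_to_5(bcd_digits):
    """Recode BCD digits (MSD first, each in [0, 9]) into the digit-set [-5, 5].

    The result is one digit longer than the input to hold the transfer out of
    the most significant digit. Each output digit depends only on its own
    input digit and the next less significant one, so no carry chain forms.
    """
    ds = [0] + list(bcd_digits)
    out = []
    for i, d in enumerate(ds):
        lower = ds[i + 1] if i + 1 < len(ds) else 0
        out.append(d - 10 * (d >= 5) + (lower >= 5))
    return out


def multiply(cx, cy_digits):
    """Multiply integer cx by the BCD digits of CY using only 0X..5X multiples."""
    multiples = {k: k * cx for k in range(6)}
    total = 0
    for d in recode_minus5_to_5(cy_digits):
        partial = multiples[d] if d >= 0 else -multiples[-d]
        total = total * 10 + partial       # accumulate most significant first
    return total


if __name__ == "__main__":
    print(recode_minus5_to_5([1, 6, 7, 2]), multiply(3960, [1, 6, 7, 2]))
```

With the operands of Example 2, 1672 recodes to the digits 0, 2, -3, -3, 2, so only the 2X and 3X multiples of 3960 are needed, and the accumulated product is 6621120.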
9. The decimal floating-point multiplier adder of any one of claims 1 to 8, wherein the intermediate product has a digit-set in a range of [m, n], m ≥ -8, n < 8, ABS(m - n) ≥ 9 and the intermediate sum has a digit-set in the range of [m, n], m ≥ -8, n < 8, ABS(m - n) ≥ 9.
10. The decimal floating-point multiplier adder of claim 9, wherein the addition module is a carry-free adder having a transfer and complement generator for generating a transfer digit and a complement digit, and wherein a temporary sum Wt and the transfer digit for the next more significant digit is determined according to:
Figure imgf000046_0001
11. The decimal floating-point multiplier-adder of any one of claims 1 to 10, wherein the aligning of the intermediate sum is further based on a position of the leading non-zero digit of the intermediate sum.
12. The decimal floating-point multiplier-adder of any one of claims 1 to 11, wherein the shifting of the intermediate sum is determined according to:
if (LOP' - TZD ≤ n and TZD - DIFFpre ≥ 0) then
    if (LOP' - DIFFpre ≤ n) then
        Rsa2 = DIFFpre;
    else if (LOP' - DIFFpre > n) then
        Rsa2 = LOP' - n;
    end
else if (LOP' - TZD ≤ n and TZD - DIFFpre < 0) then
    Rsa2 = TZD;
else if (LOP' - TZD > n or DIFFpre < 0) then
    Rsa2 = LOP' - n;
end
where LOP' is an effective position of the leading non-zero digit; and where DIFFpre is a difference between a preferred exponent and an exponent corresponding to the intermediate sum.
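The decision structure of claim 12 can be transcribed directly into code. The Python sketch below follows the branches in the order recited; the function name `post_align_shift` is hypothetical, and the sketch assumes LOP' and TZD are both counted in digit positions from the least significant end of the intermediate sum, which the claim itself does not spell out.

```python
def post_align_shift(lop, tzd, diff_pre, n):
    """Return Rsa2, the right-shift amount applied to the intermediate sum.

    lop      -- effective position of the leading non-zero digit (LOP')
    tzd      -- the quantity the claim calls TZD
    diff_pre -- preferred exponent minus exponent of the intermediate sum
    n        -- number of digits in the result significand
    """
    if lop - tzd <= n and tzd - diff_pre >= 0:
        if lop - diff_pre <= n:
            return diff_pre        # preferred exponent is reachable
        return lop - n             # keep the n most significant digits
    if lop - tzd <= n and tzd - diff_pre < 0:
        return tzd                 # shift out the available low-order digits only
    return lop - n                 # lop - tzd > n (or diff_pre < 0): result is rounded


if __name__ == "__main__":
    print(post_align_shift(lop=9, tzd=5, diff_pre=4, n=4))
```

Taking, as an assumed reading of Example 2, LOP' = 9, TZD = 5 and DIFFpre = 4 with n = 4, the first outer branch is taken, the preferred exponent cannot be reached within four digits, and the function returns 5, which agrees with the post-alignment shift of 5 listed in that example.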
13. A leading non-zero digit detection module comprising:
a leading non-zero detector for receiving an operand having digits in a signed digit-set having a range of [m, n], m≥ -8, n < 8, ABS(m - n) > 9, the detector being adapted to detect an initial position of the leading non-zero digit of the operand;
a leading non-zero digit corrector for selectively correcting the position of the initial position of the leading non-zero digit by at most one position in the less significant direction based on pattern analysis of the digits of the operand.
14. The leading non-zero digit detection module of claim 13, wherein the leading nonzero digit corrector corrects the initial position of the leading non-zero digit to the next less significant digit if:
the value of the initial leading non-zero digit is 1 and the value of the next non-zero less significant digit after the initial leading non-zero digit is negative; or
the value of the initial leading non-zero digit is -1 and the value of the next non-zero less significant digit after the initial leading non-zero is positive.
15. The leading non-zero digit detection module of claims 13 or 14, wherein the leading one corrector is a tree structure having a plurality of nodes, each node of the tree structure having a left branch input, a right branch input, and an output, wherein the output for any node is determined according to the equations:
if the initial non-zero digit is positive:
$node(d) = p^{+}$ if $(p^{+})^{l} \cdot z^{r} + z^{l} \cdot (p^{+})^{r}$
$node(d) = po$ if $z^{l} \cdot po^{r} + po^{l} \cdot z^{r}$
$node(d) = z$ if $z^{l} \cdot z^{r}$
$node(d) = n$ if $n^{l} + z^{l} \cdot n^{r}$
$node(d) = y$ if $y^{l} + z^{l} \cdot y^{r} + po^{l} \cdot n^{r}$; or
if the initial non-zero digit is negative:
$node(d) = n^{-}$ if $(n^{-})^{l} \cdot z^{r} + z^{l} \cdot (n^{-})^{r}$
$node(d) = no$ if $z^{l} \cdot no^{r} + no^{l} \cdot z^{r}$
$node(d) = z$ if $z^{l} \cdot z^{r}$
$node(d) = p$ if $p^{l} + z^{l} \cdot p^{r}$
$node(d) = y$ if $y^{l} + z^{l} \cdot y^{r} + no^{l} \cdot p^{r}$
wherein for a digit signal s, $p^{+}$: (s > 1); po: (s = 1); p: (s > 0); z: (s = 0); no: (s = -1); $n^{-}$: (s < -1); n: (s < 0), $s^{l}$ denotes a left branch input and $s^{r}$ denotes a right branch input, and node(d) denotes an output of the node;
wherein the root node has as its left branch input the most significant digit and as its right branch input the second most significant digit;
wherein a child node has as its left branch input the output of its parent node and has as its right branch input the next less significant digit of the digit that is the right branch input of its parent node; and
wherein if the output of the leaf node is equal to y, the leading one corrector corrects the initial position of the leading non-zero digit to the next less significant digit.
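The node equations of claim 15 can be exercised with a short software model. The Python sketch below folds the digits from most to least significant, as the chain of parent and child nodes in the claim describes, and reports whether the final output is y. The names `classify`, `node` and `needs_correction` are hypothetical, and the extra state "x" is an assumption of this sketch: the claim lists no output for the remaining input combinations, and the sketch treats them as a settled "no correction" state.

```python
from functools import reduce


def classify(s, positive_tree):
    """Map a signed digit to a leaf state of the positive or negative tree."""
    if s == 0:
        return "z"
    if positive_tree:
        return "p+" if s > 1 else "po" if s == 1 else "n"
    return "n-" if s < -1 else "no" if s == -1 else "p"


def node(left, right, positive_tree):
    """Evaluate one tree node from its left and right branch inputs."""
    one, big, opp = ("po", "p+", "n") if positive_tree else ("no", "n-", "p")
    if left == "y" or (left == "z" and right == "y") or (left == one and right == opp):
        return "y"
    if left == opp or (left == "z" and right == opp):
        return opp
    if left == "z" and right == "z":
        return "z"
    if (left == "z" and right == one) or (left == one and right == "z"):
        return one
    if (left == big and right == "z") or (left == "z" and right == big):
        return big
    return "x"    # assumption: combinations not listed in the claim need no correction


def needs_correction(digits):
    """True if the leading non-zero position must move one digit to the right."""
    lead = next((d for d in digits if d != 0), 0)
    if lead == 0:
        return False
    positive = lead > 0
    states = [classify(d, positive) for d in digits]
    return reduce(lambda l, r: node(l, r, positive), states) == "y"


if __name__ == "__main__":
    # 1 0 -3 represents 97, so the leading digit really sits one place lower.
    print(needs_correction([1, 0, -3]), needs_correction([2, -3]))
```

In this model the output y appears exactly when the leading non-zero digit is 1 followed by a negative next non-zero digit, or -1 followed by a positive one, which is the condition recited in claim 14.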
16. The leading non-zero digit detection module of claims 13 or 14, wherein the leading one corrector comprises a first tree structure having a plurality of nodes and a second tree structure having a plurality of nodes, each node of the first tree structure having a left branch input, a right branch input, and an output, wherein the output for any node of the first tree structure is determined according to the equations:
$node(d) = p^{+}$ if $(p^{+})^{l} \cdot z^{r} + z^{l} \cdot (p^{+})^{r}$
$node(d) = po$ if $z^{l} \cdot po^{r} + po^{l} \cdot z^{r}$
$node(d) = z$ if $z^{l} \cdot z^{r}$
$node(d) = n$ if $n^{l} + z^{l} \cdot n^{r}$
$node(d) = y$ if $y^{l} + z^{l} \cdot y^{r} + po^{l} \cdot n^{r}$;
wherein for a digit signal s, $p^{+}$: (s > 1); po: (s = 1); z: (s = 0); n: (s < 0), $s^{l}$ denotes a left branch input and $s^{r}$ denotes a right branch input, and node(d) denotes an output of the node; and
wherein each node of the second tree structure has a left branch input, a right branch input, and an output, wherein the output for any node of the second tree structure is determined according to the equations:
$node(d) = n^{-}$ if $(n^{-})^{l} \cdot z^{r} + z^{l} \cdot (n^{-})^{r}$
$node(d) = no$ if $z^{l} \cdot no^{r} + no^{l} \cdot z^{r}$
$node(d) = z$ if $z^{l} \cdot z^{r}$
$node(d) = p$ if $p^{l} + z^{l} \cdot p^{r}$
$node(d) = y$ if $y^{l} + z^{l} \cdot y^{r} + no^{l} \cdot p^{r}$
wherein for a digit signal s, p: (s > 0); z: (s = 0); no: (s = -1); $n^{-}$: (s < -1), $s^{l}$ denotes a left branch input and $s^{r}$ denotes a right branch input, and node(d) denotes an output of the node;
wherein the root nodes of the first and second tree structures have as their left branch input the most significant digit and as their right branch input the second most significant digit;
wherein a child node of the first and second tree structure has as its left branch input the output of its parent node and has as its right branch input the next less significant digit of the digit that is the right branch input of its parent node; and
wherein if the output of the leaf node of at least one of the first tree structure and the second tree structure is equal to y, the leading one corrector corrects the initial position of the leading non-zero digit to the next less significant digit.
17. A combined rounder-conversion module for processing a signed operand having a sign and a significand having l digits in a signed digit-set having a range of [m,n], m≥ -9, n < 8, ABS(m - n)≥ 9, the module comprising:
an inverter for selectively inverting the l - 1 most significant digits of the significand based on the sign of the operand to output a bit-inverted intermediate;
a calculation unit for determining the propagation bits of the l - 2 most significant digits of the significand;
a calculation unit for determining the generation bits of the l - 2 most significant digits of the significand;
a rounding increment generation unit for determining an increment value based on at least the sign of the operand, the least significant digit of the significand, and a sticky digit representing values of one or more less significant digits of the least significant digit of the significand;
a negative carry generation unit for determining a negative carry signal based on the sign of the operand, the increment value, the value of the second least significant digit of the significand, the propagation bits of the l - 2 most significant digits, and the generation bits of the l - 2 most significant digits;
a correction signal generator for generating a correction signal based on the negative carry signal;
an adder for adding the bit-inverted intermediate with the correction signal to output a rounded-converted result.
18. The rounder-conversion module of claim 17, wherein the rounding increment is further determined based on a sign of the significand.
19. The rounder-conversion module of claims 17 or 18, wherein the rounding increment is determined based on:
Figure imgf000051_0001
20. The rounder-conversion module of any one of claims 17 to 19, wherein the rounded-converted result has l - 2 significant digits and represents the l - 2 most significant digits of the significand being rounded by the least significant digit of the significand while accounting for the sticky digit.
21. The rounder-conversion module of any one of claims 17 to 20, wherein the rounder-conversion module is free of positive carry propagation.
22. The rounder-conversion module of any one of claims 17 to 21, wherein the negative carry signal is determined according to:
Figure imgf000052_0001
where
$$\begin{cases} 1 & \text{if } Sum2\{1\} < -1 \text{ or } (Sum2\{1\} = -1 \;\&\; Sign2 \ldots) \\ 0 & \text{otherwise} \end{cases}$$
Figure imgf000052_0002
wherein Sign2 is the sign of the operand.
23. The rounder-conversion module of claim 22, wherein the correction signal is determined according to:
Figure imgf000052_0003
1 if C1:0 = 00 & RDinc = 1,
10 if C1:0 = 01 & RDinc = 1,
11 if C1:0 = 10 & RDinc = 1,
0 if C1:0 = 11 & RDinc = 1,
0 if C1:0 = 00 & RDinc = 0,
11 if C1:0 = 01 & RDinc = 0,
10 if C1:0 = 10 & RDinc = 0,
1 if C1:0 = 11 & RDinc = 0,
-1 if C1:0 = 00 & RDinc = -1,
12 if C1:0 = 01 & RDinc = -1,
9 if C1:0 = 10 & RDinc = -1,
2 if C1:0 = 11 & RDinc = -1,
Figure imgf000053_0001
wherein C = {NC[l - 2: 0]; Sign2}.
24. The decimal floating-point multiplier-adder of any one of claims 1 to 12, wherein the post-alignment module includes the leading non-zero digit detection module of any one of claims 13 to 16, the leading non-zero detector receiving the intermediate sum as its operand.
25. The decimal floating-point multiplier-adder of claim 24, wherein the post-alignment module includes:
a first decision module for determining the shifting based on an uncorrected initial position of the leading non-zero digit of the intermediate sum;
a second decision module for determining the shifting based on a corrected position of the leading non-zero digit of the intermediate sum in parallel with the determining of the first decision module; and
a selector for selecting between an output of the first decision module and an output of the second decision module based on the correcting by the leading one corrector of the position of the leading non-zero digit detection module.
26. The decimal floating-point multiplier-adder of any one of claims 1 to 12, 24, or 25, wherein the post-alignment module includes a sticky digit generator for determining a sticky digit corresponding to a digit shifted out by the post-alignment module, the multiplier-adder further comprising the combined rounder-conversion module of any one of claims 17 to 23, the rounder-conversion module receiving the intermediate sum as the significand of the input operand and the determined sticky digit.
PCT/CA2014/000420 2013-05-15 2014-05-09 Decimal floating-point fused multiplier-adder WO2014183195A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201361823895P 2013-05-15 2013-05-15
US61/823,895 2013-05-15
US201361859542P 2013-07-29 2013-07-29
US61/859,542 2013-07-29

Publications (1)

Publication Number Publication Date
WO2014183195A1 true WO2014183195A1 (en) 2014-11-20

Family

ID=51897541

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CA2014/000420 WO2014183195A1 (en) 2013-05-15 2014-05-09 Decimal floating-point fused multiplier-adder

Country Status (1)

Country Link
WO (1) WO2014183195A1 (en)

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
AKKAS ET AL.: "A decimal floating-point fused multiply-add unit with a novel decimal leading-zero anticipator", IEEE INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES AND PROCESSORS (ASAP, 2011, pages 43 - 50, XP031975714, Retrieved from the Internet <URL:http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6043235> [retrieved on 20140818], doi:10.1109/ASAP.2011.6043235 *
AMIN ET AL.: "Efficient decimal leading zero anticipator designs", 2011 CONFERENCE RECORD OF THE FORTY FIFTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS (ASILOMAR), 2011, pages 139 - 143., Retrieved from the Internet <URL:http://ieeexplore.ieee.org/stamp/stamp.jsp7tp=&arnumber=6189972> [retrieved on 20140818] *
ELTANTAWY: "Decimal floating-point arithmetic unit based on a fused multiply-add module", 2011, pages 26 - 27, 44-56, Retrieved from the Internet <URL:http://www.eece.cu.edu.eg/~hfahmy/thesis/2011_08_dfma.pdf> [retrieved on 20140818] *
HAN ET AL.: "High-speed parallel decimal Multiplication with redundant internal encodings", IEEE TRANSACTIONS ON COMPUTERS, vol. 62, no. ISSUE, 28 March 2013 (2013-03-28), pages 956 - 968, Retrieved from the Internet <URL:http://ieeexplore.ieee.org/stamp/stam.pjsp?tp=&arnumber=6138855> [retrieved on 20140818] *
QUINNELL ET AL.: "Floating-point fused multiply-add architectures", CONFERENCE RECORD OF THE FORTY-FIRST ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS (ACSSC 2007), 2007, pages 331 - 337, Retrieved from the Internet <URL:http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4487224> [retrieved on 20140818] *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10310815B1 (en) 2017-11-30 2019-06-04 International Business Machines Corporation Parallel decimal multiplication hardware with a 3X generator
US10572223B2 (en) 2017-11-30 2020-02-25 International Business Machines Corporation Parallel decimal multiplication hardware with a 3x generator
US11360769B1 (en) 2021-02-26 2022-06-14 International Business Machines Corporation Decimal scale and convert and split to hexadecimal floating point instruction
US11663004B2 (en) 2021-02-26 2023-05-30 International Business Machines Corporation Vector convert hexadecimal floating point to scaled decimal instruction

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14798111

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14798111

Country of ref document: EP

Kind code of ref document: A1