WO2014183195A1

WO2014183195A1 - Decimal floating-point fused multiplier-adder

Info

Publication number: WO2014183195A1
Application number: PCT/CA2014/000420
Authority: WO
Inventors: Seokbum KO; Liu HAN
Original assignee: University Of Saskatchewan
Priority date: 2013-05-15
Filing date: 2014-05-09
Publication date: 2014-11-20

Abstract

A decimal floating-point fused multiplier-adder (DFMA) for carrying out addition and multiplication operations on a first operand, a second operand and a third operand includes a multiplication module for multiplying a first significand of the first operand and a second significand of the second operand to output an intermediate product. The DFMA further includes a pre-alignment module for shifting a third significand of the third operand to output a pre-aligned addend, an addition module for adding the intermediate product and the pre-aligned addend to output an intermediate sum, and a post-alignment module for shifting the intermediate sum based on the shifting of the third significand of the third operand and a preferred exponent to output a post-aligned sum. The post-alignment module of the DFMA may also include a leading non-zero digit detection that receives the intermediate sum as its operand. The post-alignment module of the DFMA may also include a combined rounder-conversion module.

Description

TITLE: DECIMAL FLOATING-POINT FUSED MULTIPLIER-ADDER

FIELD

[0001] The present subject-matter relates to a decimal floating-point fused multiplier-adder, and more particularly to a decimal floating-point fused multiplier-adder using redundant internal encoding for improved performance.

INTRODUCTION

[0002] The representation of decimal fractions has been shown to be more accurate and more precise than binary floating-point arithmetic in some specific applications, such as financial computing, banking, and billing systems. Decimal floating-point has now been included in IEEE standard 754-2008. Fused multiplication-addition merges the rounding operations of at least one multiplication function with at least one addition function.
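The effect of merging the two rounding operations can be illustrated in software. The following is a minimal sketch using Python's standard `decimal` module, assuming a 16-digit precision to mirror a 16-digit significand; the operand values are chosen purely for illustration and are not taken from the present description.

```python
from decimal import Context, Decimal

ctx = Context(prec=16)  # 16 significant digits (illustrative assumption)

a = Decimal("1.000000000000001")
b = Decimal("1.000000000000001")
c = Decimal("-1.000000000000002")

# Separate multiply and add: the product is rounded to 16 digits
# before the addition, so its low-order digits are lost.
unfused = ctx.add(ctx.multiply(a, b), c)

# Fused multiply-add: the exact product is added to c and the
# result is rounded only once.
fused = ctx.fma(a, b, c)
```

In this sketch the separately rounded computation returns zero, while the fused computation retains the low-order term of the exact product.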

[0003] A. Akkas and M. J. Schulte, "A decimal floating-point fused multiply-add unit with a novel decimal leading-zero anticipator," in 22nd IEEE International Conference on Application-specific Systems, Architectures and Processors, Sep. 2011, describes a DFP-FMA design that uses a previously published parallel fixed-point decimal multiplier for multiplication and a Kogge-Stone parallel prefix adder for decimal addition.

SUMMARY

[0004] The embodiments described herein provide in one aspect a leading non-zero digit detection module, which includes a leading non-zero detector for receiving an operand having digits in a signed digit-set having a range of [m, n], m ≥ -8, n ≤ 8, ABS(m - n) > 9. The detector is adapted to detect an initial position of the leading non-zero digit of the operand. The leading non-zero digit detection module further includes a leading non-zero digit corrector for selectively correcting the initial position of the leading non-zero digit by at most one position in the less significant direction based on pattern analysis of the digits of the operand. For example, the leading non-zero digit corrector corrects the initial position of the leading non-zero digit to the next less significant digit if:

the value of the initial leading non-zero digit is 1 and the value of the next non-zero less significant digit after the initial leading non-zero digit is negative; or the value of the initial leading non-zero digit is -1 and the value of the next non-zero less significant digit after the initial leading non-zero digit is positive.
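The detection and correction rule above can be modelled in software as follows. This is a minimal sketch only: the function name `leading_nonzero_position` and the list-based operand (signed digits, most significant first) are assumptions made for illustration and do not describe the hardware of the embodiments.

```python
def leading_nonzero_position(digits):
    """digits: signed digits, most significant first.

    Returns the index of the leading non-zero digit after correction,
    or None if every digit is zero.
    """
    # Detector: index of the first non-zero digit.
    initial = next((i for i, d in enumerate(digits) if d != 0), None)
    if initial is None:
        return None

    # Value of the next non-zero digit to the right (0 if there is none).
    nxt = next((d for d in digits[initial + 1:] if d != 0), 0)

    # Corrector: move one position to the right for the patterns
    # (1, negative) and (-1, positive).
    lead = digits[initial]
    if (lead == 1 and nxt < 0) or (lead == -1 and nxt > 0):
        return initial + 1
    return initial
```

For example, the digit string (0, 1, -3, 2) has the value 100 - 30 + 2 = 72, so the leading non-zero digit of the value sits at the tens position (index 2) rather than at the hundreds position found by the detector alone.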

[0005] The embodiments described herein provide in another aspect a combined rounder-conversion module for processing a signed operand having a sign and a significand having l digits in a signed digit-set having a range of [m, n], m > -9, n ≤ 8, ABS(m - n) ≥ 9. The combined rounder-conversion module includes an inverter for selectively inverting the l - 1 most significant digits of the significand based on the sign of the operand to output a bit-inverted intermediate; a calculation unit for determining the propagation bits of the l - 2 most significant digits of the significand; a calculation unit for determining the generation bits of the l - 2 most significant digits of the significand; a rounding increment generation unit for determining an increment value based on at least the sign of the operand, the least significant digit of the significand, and a sticky digit representing values of one or more less significant digits of the least significant digit of the significand; a negative carry generation unit for determining a negative carry signal based on the sign of the operand, the increment value, the value of the second least significant digit of the significand, the propagation bits of the l - 2 most significant digits, and the generation bits of the l - 2 most significant digits; a correction signal generator for generating a correction signal based on the negative carry signal; and an adder for adding the bit-inverted intermediate with the correction signal to output a rounded-converted result.

[0006] The embodiments described herein provide in yet another aspect a decimal floating-point fused multiplier-adder for carrying out addition and multiplication operations on a first operand, a second operand and a third operand. The multiplier-adder includes a multiplication module for multiplying a first significand of the first operand and a second significand of the second operand to output an intermediate product; a pre-alignment module for shifting a third significand of the third operand to output a pre-aligned addend; an addition module for adding the intermediate product and the pre-aligned addend to output an intermediate sum; and a post-alignment module for shifting the intermediate sum based on the shifting of the third significand of the third operand and a preferred exponent to output a post-aligned sum. According to one exemplary embodiment, the decimal floating-point fused multiplier-adder includes the leading non-zero digit detection module described herein. According to one exemplary embodiment, the decimal floating-point fused multiplier-adder includes the combined rounder-conversion module described herein.

DRAWINGS

[0007] For a better understanding of the embodiments described herein and to show more clearly how they may be carried into effect, reference will now be made, by way of example only, to the accompanying drawings which show at least one exemplary embodiment, and in which:

[0008] Figure 1 illustrates a schematic diagram of an exemplary decimal floating-point multiplier-adder;

[0009] Figure 2 illustrates a detailed schematic diagram of an exemplary decimal floating-point multiplier-adder;

[0010] Figure 3 illustrates a schematic diagram of an exemplary multiplier with redundant internal encodings;

[0011] Figure 4 illustrates a schematic circuit diagram of an exemplary multiplier;

[0012] Figure 5 illustrates a schematic diagram of an exemplary alignment of the intermediate product and the pre-aligned addend;

[0013] Figure 6A illustrates a schematic diagram of an exemplary pre-alignment module of the exemplary decimal floating-point multiplier-adder;

[0014] Figure 6B illustrates a schematic diagram of a hardware implementation of an exemplary digit shifter;

[0015] Figure 7A illustrates a schematic diagram of a first exemplary case of digit shifting;

[0016] Figure 7B illustrates a schematic diagram of a second exemplary case of digit shifting;

[0017] Figure 7C illustrates a schematic diagram of a third exemplary case of digit shifting;

[0018] Figure 7D illustrates a schematic diagram of a fourth exemplary case of digit shifting;

[0019] Figure 7E illustrates a schematic diagram of a fifth exemplary case of digit shifting;

[0020] Figure 8 illustrates a schematic diagram of an exemplary leading non-zero digit detection unit;

[0021] Figure 9 illustrates a schematic diagram of an exemplary leading non-zero digit corrector implemented in hardware;

[0022] Figure 10 illustrates a schematic diagram of a portion of an exemplary post-alignment module; and

[0023] Figure 11 illustrates a schematic diagram of an exemplary combined rounder-conversion module.

DESCRIPTION OF VARIOUS EMBODIMENTS

[0024] It will be appreciated that, for simplicity and clarity of illustration, where considered appropriate, reference numerals may be repeated among the figures to indicate corresponding or analogous elements or steps. In addition, numerous specific details are set forth in order to provide a thorough understanding of the exemplary embodiments described herein. However, it will be understood by those of ordinary skill in the art that the embodiments described herein may be practiced without these specific details. In other instances, well-known methods, procedures and components have not been described in detail so as not to obscure the embodiments described herein. Furthermore, this description is not to be considered as limiting the scope of the embodiments described herein in any way but rather as merely describing the implementation of the various embodiments described herein.

[0025] The exemplary embodiments are described herein with reference to various algorithms, modules, methods, calculation units, circuits and architectures. It will be understood that such algorithms, modules, methods, calculation units, circuits and architectures can be implemented in hardware or machine, such as in electrical and/or electronic circuits, according to various methods known in the art. For example, and without limitation, embodiments described herein may be implemented on or embedded within a microchip, microprocessor, co-processor, programmable logic, field programmable gate array (FPGA), central processing unit (CPU), graphics processing unit (GPU), accelerated processing unit (APU), system-on-chip (SOC) and/or application specific integrated circuits (ASICs). For example, where the embodiments are implemented as a co-processor, the co-processor can be coupled to or integrated with a processing unit in which certain operations required by the processing unit can be offloaded to the co-processor.

[0026] The term "significant digit" as used herein refers to a digit in a string of digits representing a number, wherein the digits are positioned within the string according to significance. Typically, digits positioned to the left of a particular digit are more significant and digits positioned to the right of a particular digit are less significant. For example in a number "892", the leftmost hundreds digit 8 is more significant than the middle tens digit 9 and the rightmost ones digit 2 is less significant than the middle tens digit 9. The "left" direction as used herein with reference to significant digits of a number means the direction of more significant digits. The "right" direction as used herein with reference to significant digits of a number means the direction of less significant digits.

[0027] The term "significant figures" as used herein refers to the digits in a string of digits that contribute to precision. Typically, the number of significant figures in a number will be defined, such as according to a standard such as IEEE 754-2008.

[0028] Referring now to Figure 1, therein illustrated is a schematic diagram of an exemplary decimal floating-point fused multiplier-adder 100. The floating-point multiplier-adder 100 includes a DPD decoder 102, which receives a first operand 104, second operand 106, and third operand 108. The DPD decoder 102 further decodes the first operand 104 into a first significand, a first sign bit and a first exponent, the second operand 106 into a second significand, a second sign bit and a second exponent, and the third operand 108 into a third significand, a third sign bit and a third exponent. According to various exemplary embodiments, the first operand 104 and the second operand 106 are multiplicands and the third operand 108 is an addend. According to various exemplary embodiments, each significand has a defined length of significant digits. For example, each of the first significand, the second significand and the third significand has a length of n digits. For example, the first significand, the second significand, and the third significand are each represented by a 16-digit string.
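The separation of an operand into a sign bit, a significand and an exponent can be illustrated in software with Python's standard `decimal` module. This sketch shows only the three resulting fields; it does not perform DPD decoding of an encoded operand, and the helper name `decode` is an assumption made for illustration.

```python
from decimal import Decimal


def decode(operand):
    """Split a finite Decimal into (sign bit, significand digits, exponent)."""
    sign, digits, exponent = operand.as_tuple()
    return sign, list(digits), exponent


# -12.50 is held as sign 1, significand digits 1250, exponent -2.
fields = decode(Decimal("-12.50"))
```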

[0029] The decimal floating-point fused multiplier-adder 100 further includes a multiplier 112 for carrying out unsigned multiplication of the first significand and the second significand. The multiplier 112 outputs an intermediate product.

[0030] The decimal floating-point fused multiplier-adder 100 further includes a pre-alignment module that includes a pre-alignment calculation unit 116 for determining a direction and an amount of shifting of the third significand. The pre-alignment module also includes a digit shifting unit 120 for shifting the third significand according to the amount of shifting determined by the pre-alignment calculation unit 116. The pre-alignment module outputs a pre-aligned addend CZ_sh.

[0031] The decimal floating-point fused multiplier-adder 100 further includes a decimal carry free adder 124 for adding the intermediate product outputted from the multiplier 112 with the pre-aligned addend CZ_sh outputted from the pre-alignment module. The decimal carry free adder 124 outputs an intermediate sum sum1 128.

[0032] The decimal floating-point fused multiplier-adder 100 further includes a post-alignment module for shifting the intermediate sum sum1 128 according to a preferred exponent to be achieved and the number of digits in the intermediate sum sum1 128. The decimal floating-point multiplier-adder 100 includes a digit detection unit 132 for detecting leading zeros and trailing zeros of the intermediate sum sum1 128. The decimal floating-point multiplier-adder 100 further includes a calculation unit 136 for determining the amount of shifting of the intermediate sum sum1 128. The decimal floating-point multiplier-adder 100 further includes a right shifter 140 for shifting the intermediate sum sum1 128 by the amount determined by the calculation unit 136. The post-alignment module outputs a post-aligned sum sum2 144 having a defined length. Typically, the length of the post-aligned sum sum2 144 is approximately equal to the length n of the significands of the input operands.

[0033] The decimal floating-point fused multiplier-adder 100 further includes a combined rounder-conversion module 148 for rounding the post-aligned sum sum2 144 to the desired number of significant figures and for converting the post-aligned sum to a digit set of [0,9]. The combined rounder-conversion module 148 outputs an unprocessed final result 152.
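The conversion from a signed digit-set to the digit set [0,9] can be illustrated with a small software model. This is a sketch under stated assumptions: it evaluates the signed-digit string as an integer and re-expands it, performs no rounding, and does not reflect the structure of the combined rounder-conversion module 148; the function name `to_conventional` is hypothetical.

```python
def to_conventional(digits):
    """digits: signed digits, most significant first.

    Returns (sign bit, magnitude digits in [0, 9]) with the same length.
    """
    # Evaluate the signed-digit string as an ordinary integer.
    value = 0
    for d in digits:
        value = value * 10 + d

    sign = 1 if value < 0 else 0
    # Re-expand the magnitude into conventional digits, padded with zeros.
    magnitude = [int(ch) for ch in str(abs(value)).zfill(len(digits))]
    return sign, magnitude
```

For example, the signed digits (1, -3, 2) represent 100 - 30 + 2 = 72, which becomes sign 0 with digits (0, 7, 2).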

[0034] The decimal floating-point multiplier-adder 100 further includes a post-processing module 156 and a DPD encoder 160, which together process the unprocessed final result 152 and encode it along with a calculated sign and exponent value to output a processed output 164.

[0035] Referring now to Figure 2, therein illustrated is a schematic diagram of a detailed decimal floating-point multiplier-adder 100 according to various exemplary embodiments. The multiplier 112 receives the first significand and second significand as input. An intermediate product is outputted from the multiplier 112. Where both the first significand and the second significand have a length of n digits, the length of the intermediate product can have a maximum of 2n + 1 digits. For example, according to various industry standards such as IEEE 754-2008, where the first significand and the second significand both have a length of 16 4-bit digits, the intermediate product has a length of 33 4-bit digits. According to various exemplary embodiments, the intermediate sum is in a digit-set having a range of [0,9].

[0036] According to various exemplary embodiments, the multiplier 112 is a multiplier with redundant internal encodings. That is, during multiplication, the first significand and second significand are represented in an alternative digit-set other than the digit-set having a range of [0,9]. For example, the intermediate product outputted by the multiplier 112 has a digit-set in the range of [m, n], m ≥ -8, n < 8, ABS(m - n) ≥ 9. For example, Han and Ko, High-Speed Parallel Decimal Multiplication with Redundant Internal Encodings, IEEE Transactions on Computers, Vol. 62, No. 5, describes a suitable multiplier with redundant internal encodings, which is hereby incorporated by reference.

[0037] Referring now to Figure 3, therein illustrated is a schematic diagram of an exemplary multiplier 112 with redundant internal encodings. The exemplary multiplier 112 includes a partial product generation unit 204, a signed digit recoder 208, a selector 212 and a partial product reduction unit 216.

[0038] The partial product generation unit 204 generates partial products equal to a 1X multiple, 2X multiple, 3X multiple, 4X multiple and 5X multiple of the first significand that are free of carry propagation. The 1X-5X multiples of the first significand can be represented in a signed digit-set having a range of [m, n], m > -8, n < 8, ABS(m - n) > 9. For example, Table 1 represents exemplary 1X-5X multiple calculations based on the first significand having digits in the range of [0,9]. It will be appreciated that the 1X-5X multiples are represented in a signed digit-set having a range equal to or smaller than [-8,7]. Therefore, the digits can still be represented in 4 bits.

[0039] The signed digit recoder 208 recodes the second significand into a recoded significand having a digit-set in the range of [-5,5]. By recoding the second significand in this digit-set, each digit of the recoded significand can be used to determine which of the multiples 1X-5X generated by the partial product generation unit 204 is to be selected for the addition of the partial products. For example, Table 1 represents exemplary Vj recoded digit outputs of the recoded significand based on the input BCD operands of the second significand, wherein W_i represents the residual digit that has the same weight as a current BCD digit, and T_{i+1} and K_{i+2} are the transfer digits to the next two more significant digits, which have 10 times and 100 times the weight of the current BCD digit.
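The idea of recoding digits in [0,9] into a signed digit-set within [-5,5] can be illustrated with a simpler scheme than the one referenced above. The following sketch uses a sequential carry, which is an assumption made for brevity; it is not the transfer-digit recoding of Table 1, and the function name `recode_signed` is hypothetical.

```python
def recode_signed(bcd_digits):
    """bcd_digits: digits in [0, 9], least significant first.

    Returns digits within [-5, 5], least significant first, one digit longer,
    representing the same value.
    """
    out, carry = [], 0
    for d in bcd_digits:
        v = d + carry          # v is in [0, 10]
        if v > 5:
            out.append(v - 10)  # replace v by v - 10 and carry 1 upward
            carry = 1
        else:
            out.append(v)
            carry = 0
    out.append(carry)
    return out
```

For example, the single digit 9 is recoded as (-1, 1) in least-significant-first order, that is, 10 - 1.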

[0040] The selector 212 receives the value of a digit of the recoded significand and selects the appropriate partial product 1X-5X based on the received value. Where the recoded significand is negative, each of the bits of the selected partial product is inverted by an inverter 220 to obtain the negative of the selected partial product.

[0041] The partial product reduction unit 216 adds the selected partial products in order to calculate the intermediate product. According to the exemplary multiplier, the intermediate product is in a signed digit-set having a range equal to or smaller than [-8,7] or [-6,6] according to Table 1.

TABLE 1
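The selection and reduction steps described above can be modelled at the integer level as follows. This is a minimal sketch: it uses ordinary integers instead of carry-free signed-digit multiples and a plain accumulation instead of a reduction tree, and the function name `multiply_with_recoded` is an assumption made for illustration.

```python
def multiply_with_recoded(x, recoded_y):
    """x: non-negative integer standing in for the first significand.
    recoded_y: digits in [-5, 5], least significant first.
    """
    # Partial product generation: the 1X..5X multiples of x.
    multiples = {k: k * x for k in range(1, 6)}

    product = 0
    for i, d in enumerate(recoded_y):
        if d == 0:
            continue
        partial = multiples[abs(d)]   # selector picks the |d|X multiple
        if d < 0:
            partial = -partial        # negative digit: negate the multiple
        product += partial * 10**i    # reduction: accumulate at weight 10^i
    return product
```

For example, with the second significand 9 recoded as (-1, 1), the product of 123 and 9 is obtained as 1230 - 123 = 1107.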

[0042] Referring now to Figure 4, therein illustrated is a schematic diagram of a hardware implemented equivalent of the exemplary multiplier of Figure 3.

[0043] Referring back to Figure 2 and to Figure 6A illustrating a schematic diagram of an exemplary pre-alignment module, a pre-alignment module 240 of the exemplary decimal floating-point multiplier-adder 100 includes a first adder 242, a second adder 246, a left-shifting module 248, a right-shifting module 250, a selection generation unit 252 and a selector 254.

[0044] The first adder 242 and second adder 246 calculate intermediate signals based on the values of the first exponent, the second exponent, and the third exponent. The intermediate signals correspond to the amount of shifting of the third significand based on different cases of the relationship between first operand, the second operand, and the third operand.

[0045] The left-shifting module 248 and right-shifting module 250 receive one or more of the intermediate signals from the first adder 242 and the second adder 246 and respectively shift the third significand of the third operand (addend) by the amount defined in the intermediate signals.

[0046] The selection generation unit 252 determines which of the different cases of the relationship between the first operand, the second operand, and the third operand is present based on values of the first exponent, the second exponent, the third exponent, and the position of the leading non-zero digit of the third significand (addend). The output of the selection generation unit 252 represents which of the various cases is present.

[0047] The selector 254 receives the output of the selection generation unit 252 and selects one of the outputs of the left-shifting module 248 and right-shifting module 250 as the pre-aligned addend CZ_sh. According to various exemplary embodiments, the selector 254 further includes an inverter for inverting the bits of the third significand prior to outputting the pre-aligned addend CZ_sh if the third operand is negative (sign of the third operand is negative). For example, the inverter can be implemented as an array of XOR gates. Inverting the bits of the third significand achieves one's complement of every digit of the third significand. An operation mode signal EOP can be determined according to the sign of the third operand. For example, if the third operand is positive EOP[n - 1: 0] = 0 and if the third operand is negative EOP[n - 1: 0] = 1 so that two's complement of the inverted digits of the third significand can be achieved.

[0048] According to various exemplary embodiments of the decimal floating-point multiplier-adder 100, the multiplication and addition of the three operands is carried out free of shifting of the first and second operands prior to the multiplication. Accordingly, only the third significand of the third operand is shifted. The shifting of the significand of the third operand is carried out so that a corresponding exponent for the shifted third significand equals the exponent of the intermediate product (the sum of the first exponent and the second exponent), thereby ensuring that the addition in the adder 124 is carried out on two intermediate operands having the same exponent. Since the intermediate product can have a maximum length of 2n + 1 digits, the third significand may have to be shifted in the more significant direction by at most 2n + 1 digits. Similarly, the third significand may have to be shifted in the less significant direction by at most n digits. Accordingly, the range of shifting of the third significand has a width of 4n + 2 digits. Accordingly, the pre-aligned addend CZ_sh also has a width of 4n + 2 digits. Since the pre-aligned addend CZ_sh is wider, digits of the pre-aligned addend CZ_sh that correspond to digits not occupied by the shifted third significand are padded with 0.

[0049] In a first case, the third exponent of the third operand (addend) is significantly greater than the exponent of the intermediate product (the sum of the first exponent and the second exponent). In particular, the difference between the third exponent and the exponent of the intermediate product is greater than an amount the third significand can be shifted in the more significant direction without overflowing. In this first case, the third significand is shifted in the more significant direction by an amount corresponding to the length 2n + 1 plus the amount of leading zeros in the third significand. Since there will be overflow of the most significant digits of the third significand, this fact is recorded and the amount of the overflow is also recorded.

[0050] In the second case, the third exponent of the third operand (addend) is greater than the exponent of the intermediate product by a difference that does not exceed the maximum possible amount of shifting of the third significand in the more significant direction without overflowing. Accordingly, the third significand can be shifted in the more significant direction by an amount equal to the difference between the third exponent and the exponent of the intermediate product.

[0051] In the third case, the third exponent of the third operand (addend) is less than the exponent of the intermediate product by a difference that does not exceed the maximum possible amount of shifting of the third significand in the less significant direction without overflowing. Accordingly, the third significand can be shifted in the less significant direction by an amount equal to the difference between the third exponent and the exponent of the intermediate product.

[0052] In the fourth case, the third exponent of the third operand is significantly less than the exponent of the intermediate product. In particular, the difference between the exponent of the intermediate product and the third exponent is greater than the amount by which the third significand can be shifted in the less significant direction without overflowing. In this fourth case, the third significand is shifted in the less significant direction by an amount corresponding to the length 2n. Since there will be overflow of the least significant digits of the third significand, this fact is recorded and the amount of the overflow is also recorded.

[0053] According to one example, the amount of shifting for each of the four cases described above can be determined according to:

if (2n + 1 + LZD(CZ) < EZ - EP) then
    Lsal = 2n + 1 + LZD(CZ);
    OV = "10";
else if (0 < EZ - EP ≤ 2n + 1 + LZD(CZ)) then
    Lsal = EZ - EP;
    OV = "00";
else if (0 < EP - EZ ≤ 2n) then
    Rsal = EP - EZ;
    OV = "00";
else if (2n < EP - EZ) then
    Rsal = 2n;
    OV = "01";
end

ALGORITHM 1, wherein LZD(CZ) is the number of leading zeros of the third significand, EZ is the exponent of the third significand, EP is the exponent of the intermediate product, Lsal is the determined amount of shifting in the more significant direction of the third significand, Rsal is the determined amount of shifting in the less significant direction of the third significand, and OV tracks the presence of overflow, wherein OV = "10" denotes left shift overflow and OV = "01" denotes right shift overflow.
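The case selection of Algorithm 1 can be sketched in Python as follows, reading the first condition as 2n + 1 + LZD(CZ) < EZ - EP. The function name, the tuple return value and the treatment of EZ = EP (which is not covered by the four listed cases and is assumed here to require no shift) are assumptions made for illustration.

```python
def pre_alignment_shift(ez, ep, lzd_cz, n=16):
    """Returns (direction, amount, OV) for the third significand.

    ez: exponent of the third operand, ep: exponent of the product,
    lzd_cz: number of leading zeros of the third significand.
    """
    left_limit = 2 * n + 1 + lzd_cz
    if ez - ep > left_limit:            # first case: left shift overflow
        return "left", left_limit, "10"
    if 0 < ez - ep <= left_limit:       # second case: left shift
        return "left", ez - ep, "00"
    if 0 < ep - ez <= 2 * n:            # third case: right shift
        return "right", ep - ez, "00"
    if ep - ez > 2 * n:                 # fourth case: right shift overflow
        return "right", 2 * n, "01"
    # EZ == EP is not listed in the algorithm; no shift is assumed here.
    return "none", 0, "00"
```

With n = 16 and three leading zeros, an exponent difference EZ - EP of 50 exceeds the left limit of 36, so the sketch returns a left shift of 36 with OV = "10".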

[0054] It will be appreciated that the pre-alignment module 240 only shifts the third significand. The multiplication at the multiplier 112 is carried out on the first significand and the second significand free of any shifting of the first and second significands. Advantageously, in not having to shift the first and second significands prior to multiplication, the multiplication at the multiplier 112 and the pre-alignment of the third significand at the pre-alignment module 240 can be carried out in parallel, thereby achieving a savings in time and an improvement in speed.

[0055] According to various exemplary embodiments, the pre-alignment module 240 further includes a miscellaneous signals generation unit 256 for generating at least an operation mode signal EOP, the exponent value Exp1 corresponding to the exponent value of the pre-aligned addend, and a first sticky digit Sticky1 for tracking the value of bits shifted out of range in the less significant direction.

[0056] Referring now to Figure 5, therein illustrated is an exemplary alignment of the intermediate product with the pre-aligned addend CZ_sh. Since the third significand can be shifted in both the more significant direction and the less significant direction, the width of the pre-aligned addend CZ_sh is wider than the width of the third significand and the intermediate product. Accordingly, the exponent value Exp1 corresponding to the exponent value of the pre-aligned addend CZ_sh is different from the exponent of the intermediate product EP. In particular, according to the second, third and fourth cases of shifting described above, the exponent value Exp1 of the pre-aligned addend CZ_sh is less than the exponent of the intermediate product EP due to the n digits of the pre-aligned addend CZ_sh that are provided for shifting of the third significand in the less significant direction. According to these three cases, the exponent value Exp1 is n less than the exponent of the intermediate product EP.

[0057] According to the first case of shifting described above, the shifting of the third significand in the more significant direction results in a high number of trailing zero digits in the pre-aligned addend CZ_sh and the exponent value Exp1 is less than the exponent of the intermediate product EP. When taking into account a preferred exponent value EC to be achieved, Exp1 is equal to EC - LZD(CZ) - (3n + 1). In all other cases, Exp1 is equal to EP - n.

[0058] For example, where n = 16, the miscellaneous signals can be determined according to:

EOP = SX ⊕ SY ⊕ SZ;
if (OV = "10")
    Exp1 = EC - LZD(CZ) - 49;
else
    Exp1 = EP - 16;
endif

if (RSHOR(CZ) = 0)
    Sticky1 = "00";
else
    if (EOP = 1)
        Sticky1 = "11";
    else
        Sticky1 = "01";
    endif
endif

ALGORITHM 2, wherein RSHOR(CZ) means the bit-by-bit OR of all right shifted digits out of the third significand.
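A software sketch of the miscellaneous signals follows. Several points are assumptions rather than statements of the embodiments: the operation mode is taken as the exclusive-OR of the three sign bits, the sticky information is reduced to a Boolean flag rather than an encoded sticky digit, and the function and parameter names are hypothetical.

```python
def misc_signals(sx, sy, sz, ov, ec, ep, lzd_cz, shifted_out, n=16):
    """sx, sy, sz: sign bits of the three operands (0 or 1).
    ov: overflow code from the pre-alignment ("10", "00" or "01").
    shifted_out: digits shifted out of range in the less significant direction.
    """
    # Assumption: effective operation is the XOR of the three sign bits.
    eop = sx ^ sy ^ sz

    # Exponent of the pre-aligned addend; 3n + 1 equals 49 for n = 16.
    exp1 = ec - lzd_cz - (3 * n + 1) if ov == "10" else ep - n

    # RSHOR: bit-by-bit OR of all digits shifted out to the right.
    rshor = 0
    for d in shifted_out:
        rshor |= d
    sticky_nonzero = rshor != 0

    return eop, exp1, sticky_nonzero
```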

[0059] Referring now to Figure 6A, therein illustrated is a schematic diagram of a hardware implementation of an exemplary pre-alignment module 240. The first adder 242 is implemented as a binary prefix tree adder to determine the amount of shifting Lsal in the more significant direction. The second adder 246 is implemented as a second binary prefix tree adder to determine the amount of shifting Rsal in the less significant direction. The left and right shifting amounts Lsal and Rsal are calculated simultaneously by the two binary prefix tree adders 242, 246. For example, since the maximum amount of shifting in the left direction or the right direction is constant, only the lower bits of the outputs of the two adders 242, 246 are fed into the shifters.

[0060] According to one exemplary embodiment, to reduce the timing delay, the number of leading zeros in the addend LZD(CZ) is not determined before determining the amount of shifting in the more significant direction Lsal. Instead, the addend without the leading zeros (CZwolz) is outputted by a first left shifter 248a, and the selector 254 selects the correct digits of CZwolz if the first case occurs (OV = 10). The selection signal outputted by the selection generation unit 252 can be determined from whether the third exponent is greater than or less than the exponent of the intermediate product and the value of the overflow signal OV.

[0061] Referring now to Figure 6B, therein illustrated is a schematic diagram of a hardware implementation of an exemplary left-shifter 248 of the pre-alignment module. Since the widths of the inputted third significand (n digits) and the output of the shifter 248 (4n + 2 digits) are different, it is possible to reduce the hardware cost of the shifter compared to a typical digit-shifter. According to the example illustrated in Figure 6B, a simplified model of the proposed left shifter is shown shifting a one-bit input x to the left. Since the less significant bits of the result are obtained earlier than the more significant bits in the binary adder, the multiplexers for shifting fewer digits are placed at the top of the shifter. It will be understood that a symmetrical structure can be used for a right shifter. In comparison to a typical shifter having the same width on both input and output, the exemplary shifter uses approximately 37% fewer multiplexers.
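The staged structure of a multiplexer-based digit shifter can be illustrated with a generic logarithmic shifter model. This sketch is an assumption-laden illustration: it models an ordinary staged shifter in which each bit of the shift amount controls one stage, it does not reproduce the reduced multiplexer count of the exemplary shifter, and the function name `left_shift_digits` is hypothetical.

```python
def left_shift_digits(significand, amount, width):
    """significand: digits, most significant first.
    Returns `width` digits after shifting left by `amount` digit positions.
    """
    # Place the narrow input at the least significant end of the wide register.
    reg = [0] * (width - len(significand)) + list(significand)

    stage = 0
    while (1 << stage) < width:
        if (amount >> stage) & 1:          # select bit for this stage
            step = 1 << stage
            reg = reg[step:] + [0] * step  # shift left by 2**stage digits
        stage += 1
    return reg
```

Processing the least significant bit of the shift amount first mirrors the observation above that the low-order bits of the adder result are available earliest.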

[0062] Referring back to Figure 2, the adder 124 includes a correction digit generation unit 280 and first adder 282 and second adder 284. For example, Han et al. Non-speculative Decimal Signed Digit Adder, Circuits and Systems (ISCAS), 201 1 IEEE International Symposium, which is hereby incorporated by reference, describes a suitable adder that can be appropriately modified for inclusion in the exemplary decimal floatingpoint multiplier-adder 100.

[0063] The adder 124 receives as its input the intermediate product outputted from the multiplier 112 and the pre-aligned addend CZ_sh. For each digit of the intermediate product and the pre-aligned addend, the first adder 282 determines a first temporary sum W_i. As described in Han et al., the correction digit generation unit 280 calculates, for each digit of the intermediate product and the pre-aligned addend, a transfer digit for the next most significant digit T_{i+1} and a complement digit based on the transfer digit from the next less significant digit.

[0064] According to one exemplary embodiment, the first adder 282 and the correction digit generation unit 280 are adapted for the fact that the intermediate product is in a digit-set having a range of [m, n], m ≥ -8, n ≤ 8, ABS(m - n) > 9. For example, where the intermediate product is in a digit-set having a specific range of [-8,7], the temporary sum W_i and the transfer digit for the next most significant digit T_{i+1} can be determined according to Table 2:

TABLE 2

[0065] The adder module 124 outputs the intermediate sum sum1 128, which has a digit-set in the range of [-8,7]. The exponent corresponding to the intermediate sum sum1 128 is equal to the exponent of the intermediate product EP. Where the intermediate product is in a digit-set having the specific range of [-8,7] or smaller, the intermediate sum sum1 128 is also in a digit-set in the range of [-8,7]. Advantageously, the digit-set [-8,7] can be represented in the same number of bits (4 bits) as the digit-set [0,9].

[0066] Referring back to Figures 1 and 2, the exemplary decimal floating-point multiplier-adder 100 includes a post-alignment module 300 that receives the intermediate sum sum1 128 outputted by the adder 124. The post-alignment module 300 includes an intermediate signal generator 132, a leading non-zero digit detection module 304, a trailing zero digit detection module 306, the post-alignment calculation unit 136 and the right shifter unit 140. According to various industry standards, such as IEEE 754-2008, a final result should have a preferred exponent EC where possible. The post-alignment module 300 determines an amount of shifting of the intermediate sum sum1 128 that would either allow the unprocessed final result 152 to achieve the preferred exponent EC or, where the preferred exponent EC cannot be achieved, approach the preferred exponent EC.

[0067] The intermediate signal generator 132 calculates a plurality of intermediate values which are used to determine the required amount of shifting of the intermediate sum sum1 128. A first intermediate value DIFF_pre corresponds to the difference between the preferred exponent EC and the exponent Exp1 of the intermediate product.

[0068] DIFF_pre > 0 corresponds to a situation where shifting in the less significant direction of the intermediate sum sum1 128 is required in order to achieve the preferred exponent EC. DIFF_abs is defined as the absolute value of the difference between the exponents of the intermediate product and the preferred exponent.

[0069] 0 ≤ DIFF_pre ≤ n corresponds to a situation where the amount of shifting in the less significant direction depends on the number of significant digits between the leading non-zero digit of the intermediate sum sum1 128 and the first trailing zero of the intermediate sum sum1 128. It will be understood that n corresponds to the length in digits of the significand of the unprocessed final result 152. Within this situation, where DIFF_pre = n, there is an overlap with the situation where DIFF_pre > 0.

[0070] DIFF_pre < 0 corresponds to a situation where shifting in the more significant direction of the intermediate sum sum1 128 is required in order to achieve the preferred exponent EC. However, since the significand of the third operand 108 was shifted in the pre-alignment module 240 in a manner that ensures that the required precision (number of significant figures) is always achieved, DIFF_pre < 0 corresponds to a situation where the preferred exponent EC cannot be achieved. Accordingly, the amount of shifting in the less significant direction of the intermediate sum sum1 128 depends on the position of the leading non-zero digit of the intermediate sum sum1 128. In this situation, LZD(CZ) + 3n + 1 corresponds to the maximum possible amount of shifting in the more significant direction of the digits of the third significand in the pre-alignment module 240. Since left overflow happens in this case, DIFF_abs is always larger than LZD(CZ) + 2n + 1. Thus DIFF_pre is less than n, and the analysis of shifting is similar to the case where DIFF_pre > 0.

[0071] For example, in the decimal floating-point multiplier-adder operating on operands having 16-digit significands (n = 16), the intermediate values are determined according to:

if (EP > EC) then
    /* right shift addend */
    Exp1 = […];
    Expp = EC;
    DIFF_abs = EC - EP + 10;
    DIFF_pre = […];
    DIFF_pre < 10;
else
    /* left shift addend */
    if (OV = 0) then
        Exp1 = […];
        Expp = EP;
        DIFF_abs = […];
        DIFF_pre = 10;
    else
        Exp1 = EC - LZD(CZ) - 49;
        Expp = EP;
        […] = […] - EC + LZD(CZ) - 19;
        DIFF_pre = -DIFF_abs - LZD(CZ) - 11;
        DIFF_pre < 10;
    end
end
DIFF_pre = Expp - Exp1;
DIFF_abs = ABS(EP - EC);
Expp = MAX([…], MIN(EP, EC));

ABS() means the absolute value function.

[0072] The leading non-zero digit detection module 304 is operable to determine the effective position of the leading non-zero digit of the intermediate sum sum1 128. The leading non-zero digit detection module 304 is also operable to determine the number of leading zero digits in the intermediate sum sum1 128 before the effective position of the leading non-zero digit. These values are later used to determine the amount of shifting of the intermediate sum sum1 128. According to various exemplary embodiments, any leading non-zero digit detection module 304 known in the art may be used.

[0073] "Effective position of the leading non-zero digit" herein refers to the position of the digit that corresponds to a most-significant non-zero digit when taking into account the signed digit-set of the intermediate sum sum1 128. For example, due to the intermediate sum sum1 128 being represented in the signed digit-set, the interspersion of positive and negative digits can result in a particular number value being represented using a greater number of non-zero digits than necessary. In such cases, the digits following the most-significant non-zero digit must be analyzed to determine whether correction is needed in order to determine the effective position of the leading non-zero digit.

[0074] The trailing zero detection module 306 determines the number of trailing zeros in the intermediate sum sum1 128. This value is later used to determine the amount of shifting of the intermediate sum sum1 128. According to various exemplary embodiments, any trailing zero detection module 306 known in the art may be used.

[0075] The post-alignment calculation unit 136 is operable to determine an amount rsa2 of the shifting of the intermediate sum sum1 128 in the less significant direction by the shifter unit 140. The post-alignment calculation unit 136 receives the output from the leading non-zero digit detection module 304, the output from the trailing zero detection module 306, and the DIFF_pre intermediate value outputted by the intermediate signal generator 132. Shifting of the intermediate sum sum1 128 is carried out in order to achieve the preferred exponent EC or to achieve a result that is close to the preferred exponent EC.

[0076] Referring now to FIG. 7A, therein illustrated is a schematic diagram of a first case according to which an amount of the shifting in the less significant direction of the intermediate sum sum1 128 is to be determined. According to the first case, the number of digits between the effective position of the leading non-zero digit (LOP') and the trailing zeros (TZD) is less than n digits. In this first case, the difference between the exponent of the intermediate product EP and the preferred exponent EC (DIFF_pre) is less than the number of trailing zeros of the intermediate sum sum1 128. Accordingly, the post-aligned sum 144 is obtained by shifting the intermediate sum sum1 128 in the less significant direction (a right shift) by an amount equal to DIFF_pre. The intermediate sum sum1 128 can then be exactly represented in the n digits of the post-aligned sum 144. In this case, the preferred exponent EC is achieved while all the digits between the effective position of the leading non-zero digit and the trailing zeros are retained in the post-aligned sum 144.

[0077] Referring now to FIG. 7B, therein illustrated is a schematic diagram of a second case according to which an amount of shifting in the less significant direction of the intermediate sum sum1 128 is to be determined. According to the second case, the number of digits between the effective position of the leading non-zero digit and the trailing zeros of the intermediate sum sum1 128 is less than DIFF_pre. Accordingly, not all of the digits between the position of the leading non-zero digit and the trailing zeros are initially retained in the post-aligned sum 144. In such cases, to obtain the post-aligned sum 144, the intermediate sum sum1 128 is shifted further towards the less significant direction so that the n most significant digits after the effective position of the leading non-zero digit (including the leading non-zero digit) are retained.

[0078] Referring now to FIG. 7C, therein illustrated is a schematic diagram of a third case according to which an amount of shifting in the less significant direction of the intermediate sum sum1 128 is to be determined. According to the third case, the difference between the exponent of the intermediate product and the preferred exponent (DIFF_pre) is greater than the number of trailing zeros of the intermediate sum sum1 128. Accordingly, the intermediate sum sum1 128 can be shifted in the less significant direction by an amount that is less than or equal to the number of trailing zeros. According to one exemplary embodiment, the intermediate sum sum1 128 is shifted by an amount equal to the number of trailing zeros to obtain the post-aligned sum 144. In the third case, the preferred exponent cannot be reached and the adjusted exponent (Exp2) is less than and closest to the preferred exponent. Furthermore, in the first case illustrated in FIG. 7A, the second case illustrated in FIG. 7B and the third case illustrated in FIG. 7C, all of the significant figures of the intermediate sum sum1 128 between the effective leading non-zero digit and the trailing zeros are retained after the shifting in the less significant direction.

[0079] Referring now to FIG. 7D, therein illustrated is a schematic diagram of a fourth case according to which an amount of shifting in the less significant direction of the intermediate sum sum1 128 is to be determined. According to the fourth case, the number of significant digits between the effective position of the leading non-zero digit (leading one position) and the trailing zeros is greater than n. Accordingly, not all of the significant digits of the intermediate sum sum1 128 can be retained within the post-aligned sum 144. To obtain the post-aligned sum 144, the intermediate sum sum1 128 is therefore shifted in the less significant direction by an amount such that the n most significant digits after the effective position of the leading non-zero digit (including the leading non-zero digit) are retained in the post-aligned sum 144. This amount of shifting in the less significant direction corresponds to a difference between the effective position of the leading non-zero digit and the size n of the post-aligned sum 144.

[0080] Referring now to FIG. 7E, therein illustrated is a schematic diagram of a fifth case according to which an amount of shifting in the less significant direction of the intermediate sum sum1 128 is to be determined. According to the fifth case, DIFF_pre is negative, and the preferred exponent is smaller than the exponent of the intermediate product. Accordingly, to achieve the preferred exponent, the intermediate sum sum1 128 should be shifted in the more significant direction. However, since the size of the post-aligned sum 144 is smaller than the intermediate sum sum1 128, the intermediate sum sum1 128 cannot be shifted in the more significant direction. To obtain the post-aligned sum 144, the intermediate sum sum1 128 is therefore shifted in the less significant direction by an amount such that the n most significant digits after the effective position of the leading non-zero digit (including the leading non-zero digit) are retained. This amount of shifting in the less significant direction corresponds to a difference between the effective position of the leading non-zero digit and the size n of the post-aligned sum 144.

[0081] In each of the five cases described above, the exponent of the final result (exp2) is updated according to the amount of the shifting of the intermediate sum sum1 128.

[0082] According to one example, the shifting in the less significant direction is a right shift and the amount of the shifting (rsa2), according to which of the described five cases is applicable, can be determined according to:

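if (LOP' - TZD ≤ n and TZD - DIFF_pre ≥ 0) then
    /* case 1 */
    Rsa2 = DIFF_pre;
else if ([…] > n) then
    /* case 2 */
    Rsa2 = LOP' - n;
else if (LOP' - TZD ≤ n and TZD - DIFF_pre < 0) then
    /* case 3 */
    Rsa2 = TZD;
else if (LOP' - TZD > n or DIFF_pre < 0) then
    /* case 4 or 5 */
    Rsa2 = LOP' - n;
end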

where LOP' is an effective position of the leading non-zero digit.
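
By way of illustration only, the five cases of paragraphs [0076] to [0080] can be sketched in Python as follows. This is a behavioural sketch rather than the disclosed hardware: the function names are hypothetical, LOP' is assumed to be counted from the least significant digit (so that LOP' - TZD is the number of significant digits), the test used to separate the first case from the second case (whether the leading digit still falls outside n digits after a shift of DIFF_pre) is an assumption, and testing DIFF_pre < 0 first is an ordering chosen for the sketch.

```python
def post_align_shift(lop_eff, tzd, diff_pre, n=16):
    """Return (case, rsa2) for the post-alignment right shift.

    lop_eff  : effective leading non-zero digit position LOP' (assumed 1-based
               from the least significant digit)
    tzd      : number of trailing zeros of the intermediate sum
    diff_pre : preferred exponent minus exponent of the intermediate sum
    n        : number of digits of the post-aligned sum
    """
    significant_digits = lop_eff - tzd
    if diff_pre < 0:
        # Case 5: preferred exponent unreachable; keep the n leading digits.
        return 5, lop_eff - n
    if significant_digits > n:
        # Case 4: more than n significant digits; keep the n leading digits.
        return 4, lop_eff - n
    if tzd - diff_pre < 0:
        # Case 3: only the trailing zeros can be shifted out.
        return 3, tzd
    if lop_eff - diff_pre > n:
        # Case 2 (assumed condition): shifting by diff_pre is not enough.
        return 2, lop_eff - n
    # Case 1: the preferred exponent is reached exactly.
    return 1, diff_pre


def adjusted_exponent(exp1, rsa2):
    """Exponent update exp2 = exp1 + rsa2 after a right shift by rsa2 digits."""
    return exp1 + rsa2
```

In the second case the returned amount never exceeds the number of trailing zeros, which is consistent with the statement that all significant figures are retained in the first three cases.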

[0083] The right shifter unit 140 receives as its input the intermediate sum sum1 128 outputted by the addition module 124 and right shifts the intermediate sum sum1 128 according to the amount rsa2 determined by the post-alignment calculation unit 136.

[0084] The post-alignment module further includes a sticky digit generator 320 and a second right shifting module 324. The sticky digit sticky2 is used to track the values of one or more digits of the intermediate sum sum1 128 that are lost due to the right shifting, but which may also be required for modules downstream of the post-alignment module 300. The post-alignment module 300 further includes a second miscellaneous signal generation module 328. In particular, a sign of the final result and an exponent of the final result 152 are updated in the second miscellaneous signal generation module 328. For example, the adjusted exponent exp2 is updated based on the amount of right shifting. For example, exp2 = exp1 + rsa2.

[0085] Referring now to Figure 8, therein illustrated is an exemplary leading non-zero digit detection unit 400. The leading non-zero digit detection unit 400 consists essentially of a simple leading non-zero digit detector 404 and a leading non-zero digit corrector 408. The leading non-zero digit detection unit 400 receives an operand 410 having digits in a signed digit-set having a range of [m, n], m ≥ -8, n ≤ 8, ABS(m - n) > 9. For example, the operand 410 consists of a significand; a sign bit or exponent is not required. Where a sign bit or exponent is included, it does not need to be considered in the leading non-zero digit detection. The signed digit-set in the range of [m, n] can be a decimal redundant encoding of digits in the range [0,9].

[0086] According to the exemplary leading non-zero digit detection unit 400, the simple leading non-zero digit detector 404 is operable to detect an initial position of the leading non-zero digit in the string of digits of the operand 410. The initial position of the leading non-zero digit corresponds to the most significant non-zero digit and is typically the left-most non-zero digit in the string of digits of the significand. The simple leading non-zero digit detector 404 can be implemented according to any method known in the art.

[0087] Due to the use of signed digit sets, the initial position of the leading non-zero digit detected by the simple leading non-zero digit detector 404 may not correspond to the effective most significant non-zero digit. Concluding that the initial position of the leading non-zero digit is the effective position of the leading non-zero digit can lead to the misinterpretation of which digits of the operand 410 are significant figures. In any signed digit set used as a redundant encoding of the [0,9] digit-set, the presence of an initial leading non-zero digit equal to 1 or 1̄ followed by a next non-zero less significant digit of the opposite sign results in the initial position of the leading non-zero digit not corresponding to the effective most significant non-zero digit. This is because the next non-zero less significant digit of the opposite sign either subtracts from (where the initial leading non-zero digit equals 1 and the next non-zero less significant digit is negative) or adds to (where the initial leading non-zero digit equals 1̄ and the next non-zero less significant digit is positive) the leading non-zero digit, thereby causing the leading 1 or 1̄ digit to be converted to 0. In these cases, the effective position of the leading non-zero digit is found at a position of a digit less significant than the initial leading non-zero digit detected by the simple leading non-zero digit detector 404.

[0088] By way of example, in a signed digit set having a range [x, y], x ≤ -9 or y ≥ 9, for redundant encoding of digits in the range [0,9], the presence of the digit 9 or 9̄ in a plurality of digits immediately less significant than the leading 1 or 1̄ digit causes these digits to also be converted to 0, so that they do not represent the effective position of the leading non-zero digit. The necessity to examine many digits immediately less significant than the leading 1 or 1̄ digit introduces additional complexity. For example, in a significand "19̄9̄8̄2345", the leading 1 will be converted to 0 as a result of the next non-zero less significant digit being negative. Furthermore, due to the presence of multiple 9̄ digits immediately less significant than the leading 1, each of the 9̄ digits will also be converted to 0. The significand "19̄9̄8̄2345" when converted becomes "00022345". It will be appreciated that whereas the initial position of the leading non-zero digit is detected as the most significant digit having the value 1, the converted significand "00022345" has its leading non-zero digit at its fourth most significant digit (the leftmost 2). Similarly, the significand "1̄9982345" becomes "00017655".

[0089] It has been discovered that in an operand having digits in the range [m, n], m ≥ -8, n ≤ 8, ABS(m - n) > 9, the absence of a 9 or 9̄ digit results in the effective position of the leading non-zero digit being at most only one digit position different from the position of the initial non-zero digit detected by the simple leading non-zero digit detector 404. This is because a negative carry propagates by at most one digit position. For example, a significand "18̄8̄7̄2345" will be converted to "01132345" and the position of the initial leading one digit needs to be corrected by only one digit position.
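
By way of illustration only, the two worked examples above can be checked numerically with the following Python sketch. It assumes that digits are stored most significant first as signed integers and that, in the first example, the digits 9, 9 and 8 following the leading 1 are negative, as the surrounding explanation implies. The function names are hypothetical and do not appear in the specification.

```python
def sd_value(digits):
    """Integer value of a most-significant-first list of signed decimal digits."""
    value = 0
    for d in digits:
        value = value * 10 + d
    return value


def to_plain_digits(digits):
    """Non-redundant digits of the magnitude, zero-padded to the input width."""
    text = str(abs(sd_value(digits))).zfill(len(digits))
    return [int(c) for c in text]


def leading_nonzero_index(digits):
    """Index, from the most significant end, of the first non-zero digit."""
    for i, d in enumerate(digits):
        if d != 0:
            return i
    return None


wide = [1, -9, -9, -8, 2, 3, 4, 5]    # uses the digit -9
narrow = [1, -8, -8, -7, 2, 3, 4, 5]  # stays within [-8, 7]

# With -9 allowed, the leading digit moves by three positions.
wide_shift = leading_nonzero_index(to_plain_digits(wide)) - leading_nonzero_index(wide)
# Without 9 or -9, it moves by one position.
narrow_shift = leading_nonzero_index(to_plain_digits(narrow)) - leading_nonzero_index(narrow)
```

Here `wide_shift` evaluates to 3 and `narrow_shift` to 1, matching the converted significands "00022345" and "01132345".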

[0090] According to various exemplary embodiments, the leading non-zero digit corrector 408 is operable to selectively correct the initial position of the leading non-zero digit by at most one digit in the less significant direction based on pattern analysis of the digits of the significand operand. For example, TABLE 3 shows all the possible string patterns for the input operand 410 that need to be considered for making a decision as to whether or not the initial position of the leading non-zero digit should be corrected. Where correction is not required, the initial position of the leading non-zero digit corresponds to the effective position of the leading non-zero digit. Where correction is required, the position of the next less significant digit of the initial leading non-zero digit is the effective position of the leading non-zero digit.

z: (s = 0); p: (s > 0); n: (s < 0); p⁺: (s > 1); n⁻: (s < -1);

k ≥ 0, l ≥ 0; x: don't care

TABLE 3

where the sign of the operand 410 (sum1) corresponds to the sign of the initial leading non-zero digit detected by the simple leading non-zero digit detector 404.

[0091] As shown, the number of leading zeros k is increased (i.e. the position of the initial leading non-zero digit is to be corrected by one position in the less significant direction) in only two situations. These situations arise either when the value of the initial leading non-zero digit is 1 and the value of the next non-zero less significant digit after the initial leading non-zero digit is negative; or when the value of the initial leading non-zero digit is -1 and the value of the next non-zero less significant digit after the initial leading non-zero digit is positive. In these two situations, the effective position of the leading non-zero digit is one position in the less significant direction from the position of the initial leading non-zero digit. Accordingly, the leading non-zero digit corrector 408 corrects the position of the initial leading non-zero digit detected by the simple leading non-zero digit detector 404 to the position one digit over in the less significant direction. In all other situations, the position of the initial leading non-zero digit detected by the simple leading non-zero digit detector 404 is the effective position and does not need to be corrected by the leading non-zero digit corrector 408.
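
By way of illustration only, the two-situation rule above can be sketched in Python as follows. The sketch assumes digits in the range [-8, 7] stored most significant first, and positions counted as indices from the most significant end (so a correction increases the index by one). The function name is hypothetical.

```python
def effective_leading_position(digits):
    """Return (initial_index, effective_index) of the leading non-zero digit.

    Both indices are counted from the most significant end; (None, None) is
    returned for an all-zero operand.
    """
    initial = next((i for i, d in enumerate(digits) if d != 0), None)
    if initial is None:
        return None, None
    lead = digits[initial]
    # Next non-zero digit after the initial leading non-zero digit (0 if none).
    following = next((d for d in digits[initial + 1:] if d != 0), 0)
    needs_correction = (lead == 1 and following < 0) or (lead == -1 and following > 0)
    effective = initial + 1 if needs_correction else initial
    return initial, effective


def magnitude_leading_index(digits):
    """Reference: leading index derived from the actual integer value."""
    value = 0
    for d in digits:
        value = value * 10 + d
    return len(digits) - len(str(abs(value)))
```

For every three-digit operand with digits in [-8, 7] and a non-zero value, the effective index returned by the rule agrees with the index derived from the integer value, which is consistent with the at-most-one-position property stated in paragraph [0089].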

[0092] Referring now to Figure 9, therein illustrated is an exemplary hardware structure of a portion of the leading non-zero digit corrector 408 implemented in hardware as a tree structure. Each node of the tree structure has a left branch input and a right branch input, and provides an output based on the left branch input value and the right branch input value. The root node 412 has as its left branch input 414 the most significant digit of the operand 410, has as its right branch input 416 the second most significant digit of the operand 410, and has an output 417. A child node 418 has as its left branch input 420 the output of its parent node and has as its right branch input 422 the digit that is next less significant than the digit at the right branch of its parent node. The child node further has an output 424. As shown in Figure 9, the child node 418 has as its parent node the root node 412. The leaf node 426 has as its left branch input 428 the output of its parent node and has as its right branch input 430 the least significant digit of the operand 410. The output value 432 of the leaf node 426 indicates whether or not the initial position of the leading non-zero digit detected by the simple leading non-zero digit detector 404 should be corrected.

[0093] According to one exemplary embodiment, for each node, the output is determined according to the following equations:

if the initial non-zero digit is positive:

node(d) = p⁺ if p⁺^l · z^r + z^l · p⁺^r

node(d) = po if z^l · po^r + po^l · z^r

node(d) = z if z^l · z^r

node(d) = n if n^l + z^l · n^r

node(d) = y if y^l + z^l · y^r + po^l · n^r; or

if the initial non-zero digit is negative:

node(d) = n⁻ if n⁻^l · z^r + z^l · n⁻^r

node(d) = no if z^l · no^r + no^l · z^r

node(d) = z if z^l · z^r

node(d) = p if p^l + z^l · p^r

node(d) = y if y^l + z^l · y^r + no^l · p^r

wherein for a digit signal s, p⁺: (s > 1); po: (s = 1); p: (s > 0); z: (s = 0); no: (s = -1); n⁻: (s < -1); n: (s < 0), node(d) denotes the output of any particular node, the superscripts l and r denote the left branch input and the right branch input of the node, and wherein if the output of the leaf node is equal to y, the leading non-zero digit corrector 408 corrects the initial position of the leading non-zero digit to the next less significant digit. For example, the equations can be implemented using combinational logic within each node.
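
By way of illustration only, the node equations can be modelled in Python as a chain of nodes in which each node combines the previous node output with the next less significant digit. Several points are assumptions of this sketch and are not taken from the specification: the state "x" returned for any combination not listed in the equations (treated as "no correction"), the modelling of the negative-digit case by negating every digit and reusing the positive-digit equations, and all function names.

```python
def classify(s):
    """Map a signed digit to the symbols used when the leading digit is positive."""
    if s > 1:
        return "p+"
    if s == 1:
        return "po"
    if s == 0:
        return "z"
    return "n"


def node(left, right):
    """Node output from the left input (parent output) and right input (digit)."""
    if left == "y" or (left == "z" and right == "y") or (left == "po" and right == "n"):
        return "y"
    if left == "n" or (left == "z" and right == "n"):
        return "n"
    if (left == "p+" and right == "z") or (left == "z" and right == "p+"):
        return "p+"
    if (left == "z" and right == "po") or (left == "po" and right == "z"):
        return "po"
    if left == "z" and right == "z":
        return "z"
    return "x"  # assumption: unlisted combinations mean no correction is needed


def chain_output(digits):
    """Output of the last node for a most-significant-first digit list."""
    state = classify(digits[0])
    for d in digits[1:]:
        state = node(state, classify(d))
    return state


def needs_correction(digits):
    """True if either the positive or the (negated) negative chain outputs y."""
    negated = [-d for d in digits]
    return chain_output(digits) == "y" or chain_output(negated) == "y"


def needs_correction_by_rule(digits):
    """Reference: leading 1 followed by a negative digit, or -1 by a positive one."""
    nonzero = [d for d in digits if d != 0]
    if len(nonzero) < 2:
        return False
    lead, following = nonzero[0], nonzero[1]
    return (lead == 1 and following < 0) or (lead == -1 and following > 0)
```

Under these assumptions the chain agrees with the two-situation rule for every three-digit operand with digits in [-8, 7].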

[0094] The equations for a node can be presented in the following Table 4:

TABLE 4

[0095] For example, as shown in Figure 9, the tree structure is shown for when the initial leading non-zero digit is positive. According to various exemplary embodiments, two tree structures can be implemented in parallel, with a first tree structure for the first case of the initial leading non-zero digit being positive and a second tree structure for the second case of the initial leading non-zero digit being negative. That is, each node of the first tree structure, corresponding to a positive initial leading non-zero digit, determines a node output based on the equations:

node(d) = p⁺ if p⁺^l · z^r + z^l · p⁺^r

node(d) = po if z^l · po^r + po^l · z^r

node(d) = z if z^l · z^r

node(d) = n if n^l + z^l · n^r

node(d) = y if y^l + z^l · y^r + po^l · n^r;

and the second tree structure, corresponding to a negative initial leading non-zero digit, determines a node output based on the equations:

node(d) = n⁻ if n⁻^l · z^r + z^l · n⁻^r

node(d) = no if z^l · no^r + no^l · z^r

node(d) = z if z^l · z^r

node(d) = p if p^l + z^l · p^r

node(d) = y if y^l + z^l · y^r + no^l · p^r

If the output of either one of the leaf nodes of the first tree structure and the second tree structure is equal to y, then the position of the initial leading non-zero digit should be corrected. For example, the outputs of the leaf nodes of the two tree structures can be passed through an OR gate.

[0096] Referring back to Figure 8, according to one exemplary embodiment, the initial position of the leading non-zero digit LOP is outputted from the simple leading non-zero digit detector 404. The output LOP is fed via a first path through a decrementer 436 (LOP - 1) to a selector 440 and via a second path directly to the selector 440. It will be appreciated that the first path corresponds to when the initial position is to be corrected, and the second path corresponds to when the initial position is the effective position and does not need to be corrected. The correct value between the two paths is selected by the selector 440 based on the output of the leading non-zero digit corrector 408. The output of the selector is the effective position LOP' of the leading non-zero digit of the operand 410.

[0097] Referring now to Figure 10, therein illustrated is an exemplary portion of the post-alignment module 300 being used in conjunction with a simple leading non-zero detector 404 and a leading non-zero digit corrector 408 as described with reference to the exemplary leading non-zero detection unit 400. Both the simple leading non-zero detector 404 and the leading non-zero digit corrector 408 have as their input operand 410 the intermediate sum sum1 128 outputted from the adder module 124. According to the exemplary post-alignment module 300, the output LOP of the leading non-zero detector 404 is the initial position of the leading non-zero digit.

[0098] Referring back to the five possible cases of right shifting of the intermediate sum sum1 128, in the second case of Figure 7B, the fourth case of Figure 7D and the fifth case of Figure 7E, the amount of the shifting in the less significant direction rsa2 of the intermediate sum sum1 128 is the difference between the effective position of the leading non-zero digit and the size n of the post-aligned sum 144. Where the intermediate sum sum1 128 has digits in a signed digit-set having a range of [m, n], m ≥ -8, n ≤ 8, ABS(m - n) > 9, the initial position of the leading non-zero digit of the significand is corrected by at most one position in the less significant direction. Accordingly, calculating the difference between the effective position of the leading non-zero digit (LOP') and the size n of the post-aligned sum can be divided into two situations.

[0100] In the first situation, the position of the initial non-zero digit LOP does not need to be corrected by the leading non-zero digit corrector 408. For example, where the size n of the post-aligned sum 144 equals 16, LOP' - 16 = LOP - 16.

[0101 ] In the second situation, the position of the initial non-zero digit LOP must be corrected by the leading non-zero digit corrector 408 by correcting the position of the initial leading non-zero digit by one position in the less significant direction. For example, where the size n of the post-aligned sum equals 16, LOP' - 16 = LOP - 15.

[0102] Referring to Figure 10, the amount of shifting in the less significant direction rsa2 is calculated in parallel for both situations of the effective position of the leading non-zero digit LOP'. For example, a first decision module 450 applies the equations for determining rsa2 for the first situation where the initial position of the leading non-zero digit LOP does not need to be corrected. For example, the first decision module 450 includes a first subtraction module 454 that calculates the difference between the position of the leading non-zero digit and the size n of the post-aligned sum 144 (e.g. 16 digits) as LOP - 16.

[0103] For example, a second decision module 460 applies the equations for determining rsa2 for the second situation where the initial position of the leading non-zero digit LOP is corrected by one position in the less significant direction. For example, the second decision module 460 includes a second subtraction module 464 that calculates the difference between the initial position of the leading non-zero digit and the size n of the post-aligned sum 144 (e.g. 16 digits) as LOP - 15.

[0104] The exemplary post-alignment calculation unit 448 further includes a selector 468, which receives the output of the first decision module 450 and the output of the second decision module 460 and selects the correct output based on the output of the leading non-zero digit corrector 408.

[0105] Advantageously, according to the exemplary post-alignment calculation unit 448 of Figure 10, the use of a separate simple leading non-zero digit detector 404 and leading non-zero digit corrector 408 allows the determination of the amount of right shifting rsa2 and the determination of the correction of the initial position of the leading non-zero digit to be carried out in parallel. This achieves a time saving in the post-alignment calculation unit 448. In particular, the first decision module 450 and the second decision module 460 both apply the equations for determining rsa2 for the two possible cases of the initial digit correction at the same time as the leading non-zero digit corrector 408 determines whether correction is required. By contrast, and by way of example, if another post-alignment calculation unit receives an effective position of the leading non-zero digit before determining rsa2, the determination of whether correction of the initial position of the leading non-zero digit is required and the determination of the shifting amount rsa2 will have to be carried out sequentially.

[0106] Referring now to Figure 11, therein illustrated is an exemplary combination rounder-conversion module 900 for rounding an input operand formed of a sign digit sign1 902 and a significand in 903 having l digits (in[l - 1:0]) in a signed digit-set having a range of [m, n], m > -9, n ≤ 8, ABS(m - n) > 9.

[0107] In the significand having l digits, the l - 1 most significant digits are significant figures, which are to be rounded by the least significant digit of the significand (in[0]), herein referred to as the rounding digit.

[0108] The sign digit sign1 902 corresponds to a sign of the input operand when taking into account prior redundant encoding of an initial operand initially represented in an unsigned digit set into a signed digit-set.

[0109] The combination rounder-conversion module 900 also receives a sticky digit sticky1, which represents the values of the digits less significant than the rounding digit. According to various exemplary embodiments, the sticky digit can be representative of the value of the next non-zero digit less significant than the rounding digit. For example, the sticky digit can be represented in a digit-set having a range that is smaller than the range of the digit-set of the significand. For example, the sticky digit can be represented in two bits to denote whether the next non-zero less significant digit is positive, negative or equal to 0.

[0110] Due to the digits of the significand being in a signed digit-set and the operand being signed, two specific factors affect the rounding of the significant figures of the input number. For example, in a given positive first operand having a significand "ssss...ssss51234" and a given positive second operand having a significand "ssss...ssss51̄234", wherein s denotes a significant figure and an overbar denotes a negative digit, the digit 5 immediately to the right of the least significant figure is the rounding digit in both operands. According to a ties-to-away rounding mode, the first operand will be rounded to "ssss...ssss + 1", while the second operand will be rounded to "ssss...ssss" due to the rounding digit 5 being decreased by the digit 1̄ in the next less significant digit.

[0111] For example, whereas the positive first operand having the significand "ssss...ssss51234" is rounded to "ssss...ssss + 1", if the first operand is negative according to the sign bit of the operand, the significand "ssss...ssss51234" is negated to "s̄s̄s̄s̄...s̄s̄s̄s̄5̄1̄2̄3̄4̄" and the negated significand will be rounded to "s̄s̄s̄s̄...s̄s̄s̄s̄ - 1". For example, for a positive third operand having the significand "s5ss. . s55551234", the sign of the third operand is negative (sign2 = 1), the significand will also be negated to "ssss. . ssss51234", and the negated significand will be rounded to "ssss. . ssss - 1".

[0112] It will be appreciated that the rounding of significant figures by the rounding digit depends on both the value of the next non-zero digit that is less significant than the rounding digit and on the sign of the operand. The rounding of the significant figures can further depend on the sign of the significand (as denoted by the sign of the leading non-zero digit of the significand). Rounding is furthermore always based on the value of the rounding digit.
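As a hedged illustration of this dependence, the sketch below derives the increment for one narrow case only: a positive result, round-half-away-from-zero, a rounding digit in [-8, 7], and a sticky value in {-1, 0, 1}. It is derived from ordinary arithmetic rather than taken from the rounder's own tables, and the function names are hypothetical. A brute-force reference using exact fractions is included so that the rule can be checked.

```python
import math
from fractions import Fraction


def rounding_increment_ties_away(rd: int, sd: int) -> int:
    """Increment for the significant figures of a positive result.

    rd: rounding digit in [-8, 7].
    sd: sign of the digits below the rounding digit (-1, 0 or 1).
    """
    if rd > 5 or (rd == 5 and sd >= 0):
        return 1       # at or above the half-way point: round up
    if rd < -5 or (rd == -5 and sd < 0):
        return -1      # strictly below minus one half: round down
    return 0


def reference_increment(rd: int, sd: int) -> int:
    """Exact reference: floor(fraction + 1/2) for a positive value."""
    # A tiny stand-in (assumption) for the digits below the rounding digit.
    tail = sd * Fraction(1, 1000)
    return math.floor(Fraction(rd, 10) + tail + Fraction(1, 2))


if __name__ == "__main__":
    print(rounding_increment_ties_away(5, 1))    # rounding digit 5, positive tail
    print(rounding_increment_ties_away(5, -1))   # rounding digit 5, negative tail
```

The two calls mirror the first and second operands discussed above: the same rounding digit 5 leads to different increments depending on the sign of the next non-zero digit.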

[0113] Continuing with Figure 11, the exemplary rounder-conversion module includes an inverter 904 for selectively inverting the l - 1 most significant digits of the significand of the operand. These digits can be the significant figures of the significand of the operand. The inversion is based on the sign of the operand. For example, sign2 = 0 denotes that the operand is positive and sign2 = 1 denotes that the operand is negative. Accordingly, using an XOR array as the inverter 904 having as inputs sign2 and the l - 1 most significant digits of the significand, the inverter 904 will invert the l - 1 most significant digits of the significand if the operand is negative. The selectively negated l - 1 most significant digits of the significand are output as an inverted intermediate.
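The effect of an XOR array on a signed digit can be sketched as follows, assuming (this passage does not state the encoding) that each digit in [-8, 7] is held as a 4-bit two's-complement pattern. Under that assumption, bitwise inversion maps a digit d to -d - 1 rather than to -d, so the remaining +1 would have to be supplied elsewhere in the datapath. The function name is hypothetical.

```python
from typing import List


def invert_digits(digits: List[int], sign2: int) -> List[int]:
    """XOR every 4-bit digit pattern with the replicated sign bit."""
    mask = 0b1111 if sign2 else 0b0000
    inverted = []
    for d in digits:
        if not -8 <= d <= 7:
            raise ValueError("digit outside the assumed range [-8, 7]")
        bits = d & 0b1111                  # 4-bit two's-complement pattern
        flipped = bits ^ mask              # one XOR gate per bit
        # Reinterpret the 4-bit pattern as a signed digit.
        inverted.append(flipped - 16 if flipped & 0b1000 else flipped)
    return inverted


if __name__ == "__main__":
    print(invert_digits([3, -2, 0, 7, -8], sign2=0))
    print(invert_digits([3, -2, 0, 7, -8], sign2=1))
```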

[0114] The exemplary rounder-conversion module further includes a calculation unit for determining the propagation bits of the l - 2 most significant digits of the significand and a calculation unit for determining the generation bits of the l - 2 most significant digits of the significand. According to the example shown in Figure 11, the calculation units for generating the propagation bits and the generation bits of the l - 2 most significant digits of the significand are implemented within a single unit 908. As shown in Figure 11, the single unit is an (l - 2)-bit prefix tree structure; however, various other known methods of propagation bit and generation bit calculation may be used.

[0115] The exemplary rounder-conversion module further includes a rounding increment generation unit for determining an increment value RD_inc. RD_inc is the value by which the l - 1 most significant digits of the significand, representing l - 1 significant figures, should be incremented. Taking into account various combinations of the sign of the operand, the sign of the significand, the value of the rounding digit, and the value of the sticky digit, it has been discovered that the following TABLE 4 provides a complete set of possible rounding increments based on the various combinations for a digit-set in the range of [-8,7] for various modes of rounding.

TABLE 4, wherein RD denotes the value of the Rounding Digit, SD denotes the value of the Sticky Digit, x denotes don't care, and LE denotes that the least significant figure is even. The sticky digit is equal to -1 if the next non-zero digit that is less significant than the rounding digit is negative, the sticky digit is equal to 1 if that digit is positive, and the sticky digit is equal to 0 if all digits less significant than the rounding digit are equal to 0. SignF denotes the sign of the final result, and is determined according to: SignF = Sign2 ⊕ SignS, wherein SignS is the sign of a first addend that was added to (or subtracted from) a second addend to obtain the input operand in.

[0116] The exemplary rounder-conversion module 900 further includes a negative carry generation unit for determining a negative carry signal. Whether a negative carry will arise depends on the least significant figure of the significand, which corresponds to the digit in{1}. Whether a negative carry will arise further depends on the least significant figure of the significand as incremented by the increment value RD_inc. Whether a negative carry will arise further depends on the sign of the operand, Sign2. Taking into account various combinations of the sign of the operand Sign2, the value of the least significant figure of the significand and the increment value RD_inc, according to one exemplary embodiment, it has been discovered that the following equations provide a complete set of possible values of the least significant negative carry digit NC{0}.

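$$
NC\{0\}^{+1} =
\begin{cases}
1 & \text{if } Sum2\{1\} < -1 \text{ or } (Sum2\{1\} = -1 \;\&\; Sign2 = 1) \\
0 & \text{otherwise}
\end{cases}
$$

$$
NC\{0\}^{-1} =
\begin{cases}
1 & \text{if } Sum2\{1\} < 1 \text{ or } (Sum2\{1\} = 1 \;\&\; Sign2 \ldots) \\
0 & \text{otherwise}
\end{cases}
$$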

[0117] The negative carry generation unit can further generate the remainder of the negative carry signal and further generate a complete carry signal C. According to the value of the least significant negative carry digit, the rest of the negative carry can be determined based on the determined l - 2 propagation bits and the determined l - 2 generation bits. For example, the remainder of the negative carry signal can be determined according to the equation:

$$
NC_{l-1:0} = G_{l-2:0} \;\&\; (P_{l-2:0} \mid NC\{0\})
$$

and the complete carry signal C can be determined according to:

C = {NC[l - 2: 0]; Sign2}

[0118] As shown in Figure 11, the negative carry generation unit 912 is implemented with a negative carry signal least significant digit generator 912 that is discrete from a complete carry signal generator 916. According to various exemplary embodiments, the negative carry signal least significant digit generator 912 can be implemented in combinational logic according to known methods. According to various exemplary embodiments, the complete carry signal generator 916 can be implemented in combinational logic according to known methods.

[0119] Advantageously, implementing the negative carry signal least significant digit generator 912 separately allows the determination of the least significant negative carry digit NC{0} to be carried out in parallel with the generating of the l - 2 propagation bits and the generating of the l - 2 generation bits in the prefix tree structure 908. The outputs of the negative carry signal least significant digit generator 912 and the prefix tree structure 908 can be readily combined in the complete carry signal generator 916.

[0120] The exemplary rounder-conversion module 900 further includes a correction signal generator 920 for generating a correction signal Cor2. The correction signal Cor2 represents the amount of correction of the significant figure digits of the significand in based on the rounding increment, the selective negation of the significand (i.e. the sign of the operand Sign2), and the complete carry signal C. Taking into account various combinations of the complete carry signal C and the rounding increment, according to one exemplary embodiment, it has been discovered that the following equations provide a complete set of possible values for the correction signal Cor2.

[0121] The exemplary rounder-conversion module 900 further includes an adder 924 for adding digits of the correction signal Cor2 with corresponding digits of the selectively negated l - 1 most significant digits of the significand in, output as the inverted intermediate. The resulting sum is a rounded and digit-set converted result representing a final result. For example, the resulting sum is in the conventional BCD digit-set [0,9] and is the final result 152. According to one exemplary embodiment, the adder 924 is a carry look-ahead array; however, it will be understood that any other suitable adder known in the art may be used to add the correction signal Cor2 with the inverted intermediate.

[0122] Advantageously, the exemplary rounder-conversion module is free of positive carry propagation. That is, the module will not experience positive carry propagation. Only negative carry (borrow) propagation is experienced.

[0123] According to various exemplary embodiments, the rounder-conversion module can be used in any design for carrying out arithmetic operations wherein an intermediate operand having the same properties as the operand is generated.
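The borrow-only behaviour can be illustrated with a plain sequential conversion from a signed digit-set to BCD. This is a sketch under stated assumptions (digits in [-8, 7], a non-negative overall value, a hypothetical function name) and not the prefix-tree and correction-signal structure of module 900. Because no digit exceeds 7, subtracting an incoming borrow can never push a digit above 9, so only borrows ripple toward the more significant digits.

```python
from typing import List


def signed_digits_to_bcd(digits: List[int]) -> List[int]:
    """Convert signed digits (most significant first) to BCD digits [0, 9]."""
    bcd = []
    borrow = 0
    for d in reversed(digits):            # least significant digit first
        if not -8 <= d <= 7:
            raise ValueError("digit outside the assumed range [-8, 7]")
        t = d - borrow
        if t < 0:
            t += 10                       # take ten from the next digit up
            borrow = 1
        else:
            borrow = 0
        bcd.append(t)                     # 0 <= t <= 9, no positive carry
    if borrow:
        raise ValueError("overall value is negative; negate it first")
    return bcd[::-1]


if __name__ == "__main__":
    # 1*10^5 - 1*10^3 + 1*10^2 + 2*10^1 = 99120
    print(signed_digits_to_bcd([0, 0, 1, 0, -1, 1, 2, 0]))
```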

[0124] According to various exemplary embodiments, the combined rounder-conversion module 900 can be included in the decimal floating-point fused multiplier-adder 100. The significand in of the input operand for the combined rounder-conversion module 900 is the post-aligned sum sum2 144. The sign of the input operand sign2 is the sign of the initial leading non-zero digit of the intermediate sum sum1 128. The sticky digit sticky2 of the input operand for the combined rounder-conversion module 900 represents values of digits that overflow due to shifting of the intermediate sum sum1 in the less significant direction. The output CR 152 of the adder array 924 is the significand portion of the final result.

[0125] The rounder-conversion module 900 can further include a sign generation module 930 for determining a sign SR of the final result. For example, sign SR is equal to signF and is determined based on the sign of the input operand sign2, the sign of the first operand SX and the sign of the second operand SY of the decimal floating-point fused multiplier-adder 100. For example, signF = sign2 ⊕ (SX ⊕ SY).

[0126] The rounder-conversion module 900 can further include an exponent generation module 934 for determining an exponent ER of the final result. For example, ER is equal to the sum of the exponent exp1 of the intermediate sum sum1 and the amount of shifting rsa2 in the post-alignment module 300.
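The sign and exponent relations of the two preceding paragraphs amount to an XOR and an addition. A minimal sketch follows, with hypothetical function names and with signs encoded as 0 for positive and 1 for negative.

```python
def final_sign(sign2: int, sx: int, sy: int) -> int:
    """signF = sign2 XOR (SX XOR SY); SX XOR SY is the sign of the product."""
    return sign2 ^ (sx ^ sy)


def final_exponent(exp1: int, rsa2: int) -> int:
    """ER = exp1 + rsa2: each digit shifted out raises the exponent by one."""
    return exp1 + rsa2


if __name__ == "__main__":
    print(final_sign(sign2=0, sx=0, sy=0))
    print(final_sign(sign2=1, sx=0, sy=1))
    print(final_exponent(exp1=13, rsa2=5))
```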

[0127] The significand CR of the final result, exponent ER of the final result, and sign SR of the final result are provided to the post processor 156 and DPD Encoder 160 to compute the processed output 164.

[0128] Referring back to Figure 2, the sticky digit generator 320 for determining the sticky digit sticky2 is included as part of the exemplary post-alignment module 300. According to one exemplary embodiment, the sticky digit generator 320 is implemented as two prefix tree structures for determining a value p and a value z. The detection algorithm for the sticky digit is similar to the carry propagation process. For example, the value p can be determined according to:

$$
p = p^{l} + z^{l} \cdot p^{r}
$$

and the value z can be determined according to:

$$
z = z^{l} \cdot z^{r}
$$

wherein if p is equal to 1, the next non-zero digit that is less significant than the rounding digit of the post-aligned sum 144 (the least significant digit of the post-aligned sum 144) is positive; if z is equal to 1, all digits less significant than the rounding digit are equal to zero; and if both p and z are 0, the next non-zero digit that is less significant than the rounding digit of the post-aligned sum 144 is negative. It will be appreciated that the sticky digit can be represented in 2 bits.
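The p and z recurrences can be exercised with a small reduction over the shifted-out digits. The sketch assumes the digits are supplied most significant first and that each leaf starts as (p, z) = (digit > 0, digit == 0); the function names are hypothetical. The combining operator is associative, which is what lets hardware evaluate it as a prefix tree instead of the left-to-right fold used here.

```python
from functools import reduce
from typing import List, Tuple

PZ = Tuple[int, int]


def leaf(digit: int) -> PZ:
    """(p, z) of a single digit: p = positive, z = zero."""
    return (1 if digit > 0 else 0, 1 if digit == 0 else 0)


def combine(left: PZ, right: PZ) -> PZ:
    """p = p_l + z_l * p_r and z = z_l * z_r, with + as OR and * as AND."""
    p_l, z_l = left
    p_r, z_r = right
    return (p_l | (z_l & p_r), z_l & z_r)


def sticky(shifted_out: List[int]) -> int:
    """Return 1, 0 or -1 for a positive, zero or negative sticky digit."""
    # (0, 1) is the identity element: it behaves like an extra zero digit.
    p, z = reduce(combine, map(leaf, shifted_out), (0, 1))
    if p:
        return 1
    return 0 if z else -1


if __name__ == "__main__":
    print(sticky([0, 2, -1]))     # first non-zero digit is positive
    print(sticky([0, 0, 0]))      # nothing but zeros
    print(sticky([0, 0, -3, 5]))  # first non-zero digit is negative
```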

EXAMPLE 1

According to a first exemplary calculation:

Input:

SX = 0 CX = 0963625455443960 EX = 18

SY = 0 CY = 7S2S17S241591672 EY = -1

SZ = 1 CZ = 9999358S77665432 EZ = 31

Calculation:

1. Multiplication:

Product = 012463432142204420125102041301120

2. Pre-Alignment:

EP = 17 EZ = 31

Lsa1 = 14 (active) Rsa1 = -14

CZsh = 11../11αΜ099995377654311...11

EOP = 1 Sticky1 = 00 (zero) Exp1 = 1

3. Addition:

012463432142204420125102041301120

-00...00999953387766543200...00

00...001345656323043Π 231251020473011200...00

4. Post-Alignment:

Rsa2 = 31

Sum2 = 34565632304311231

Sign2 = 0 Sticky2 = 01 (positive) Exp2 = 32

5. Rounding:

RD = 1

RD_inc = 0 C_hd = 1

C = 11111100100010110

Output:

SR = 0 CR = 6043443170429077 ER = 32

EXAMPLE 2

According to this example, n = 4.

Input:

SX = 0 CX = 3960 EX = 18

SY = 0 CY = 1672 EY = -1

SZ = 1 CZ = 6522 EZ = 20

Calculation:

1. Multiplication:

CY = 02332

2CX = 12120 3CX = 12120

Partial products: 12120, 12120, 12120, 12120

Product = 013421120

2. Pre-Alignment:

EP = 17 EZ = 20

Lsa1 = 3 (active)

CZsh = 17633111

EOP = 1 Sticky1 = 00 (zero) Exp1

3. Addition:

013421120 (Product)

17633111 (CZsh)

11111111 (EOP)

00101120

4. Post-Alignment:

Rsa2 = 5

Sum2 = 0112

Sign2 = 0 Sticky2 = 00 (zero) Exp2 = 18

5. Rounding:

RD = 0

RD_inc = 0 C_hd = 0

C = 11000

Output:

SR = 0 CR = 9912 ER = 18

[0129] While the above description provides examples of the embodiments, it will be appreciated that some features and/or functions of the described embodiments are susceptible to modification without departing from the spirit and principles of operation of the described embodiments. Accordingly, what has been described above has been intended to be illustrative and non-limiting and it will be understood by persons skilled in the art that other variants and modifications may be made without departing from the scope of the invention as defined in the claims appended hereto.
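The inputs and output of EXAMPLE 2 can be cross-checked in software. The sketch below uses Python's standard `decimal` module with a 4-digit context as an independent reference for the fused operation X × Y + Z; it says nothing about the internal signed-digit datapath, and the function name is illustrative.

```python
from decimal import Context, Decimal, ROUND_HALF_UP


def example2_reference():
    """Fused multiply-add of the EXAMPLE 2 operands, rounded once to n = 4 digits."""
    ctx = Context(prec=4, rounding=ROUND_HALF_UP)
    # Operands built from (sign, coefficient digits, exponent).
    x = Decimal((0, (3, 9, 6, 0), 18))    # SX = 0, CX = 3960, EX = 18
    y = Decimal((0, (1, 6, 7, 2), -1))    # SY = 0, CY = 1672, EY = -1
    z = Decimal((1, (6, 5, 2, 2), 20))    # SZ = 1, CZ = 6522, EZ = 20
    # Context.fma multiplies exactly and rounds only after the addition.
    return ctx.fma(x, y, z).as_tuple()


if __name__ == "__main__":
    result = example2_reference()
    print(result.sign, result.digits, result.exponent)
```

The exact product 3960 × 1672 = 6621120 at exponent 17 less 6522 at exponent 20 leaves 99120 at exponent 17, which fits in four digits as 9912 at exponent 18 without any rounding error.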

Claims

CLAIMS:

1. A decimal floating-point multiplier-adder for carrying out addition and multiplication operations on a first operand, a second operand and a third operand, the multiplier-adder comprising:

a multiplication module for multiplying a first significand of the first operand and a second significand of the second operand to output an intermediate product;

a pre-alignment module for shifting a third significand of the third operand to output a pre-aligned addend;

an addition module for adding the intermediate product and the pre-aligned addend to output an intermediate sum; and

a post-alignment module for shifting the intermediate sum based on the shifting of the third significand of the third operand and a preferred exponent to output a post-aligned sum.

2. The decimal floating-point multiplier-adder of claim 1,

wherein the pre-alignment module shifts the third significand based on a difference between an exponent of the third operand and a sum of the exponents of the first and second operands;

wherein the significand of the first operand being unshifted and the significand of the second operand being unshifted are multiplied by the multiplication module; and

wherein the third significand is shifted by the pre-alignment module in parallel with the multiplying of the first significand and the second significand.

3. The decimal floating-point multiplier adder of any one of claims 1 or 2,

wherein the first significand has n digits, the second significand has n digits and the third significand has n digits; and

wherein the pre-aligned addend represents the significand of the third operand shifted in a range of 4n + 2 digits.

4. The decimal floating-point multiplier adder of claim 3, wherein the amount of the shifting of the significand of the third operand by the pre-alignment module is determined according to:

    if (2n + 1 + LZD(CZ) < EZ - EP) then
        Lsa1 = 2n + 1 + LZD(CZ);
        OV = 10;
    else if (0 ≤ EZ - EP ≤ 2n + 1 + LZD(CZ)) then
        Lsa1 = EZ - EP;
        OV = 00;
    else if (0 < EP - EZ ≤ 2n) then
        Rsa1 = EP - EZ;
        OV = 00;
    else if (2n < EP - EZ) then
        Rsa1 = 2n;
        OV = 01;
    end

wherein LZD(CZ) is the position of the leading non-zero digit of the third significand, EZ is the exponent of the third significand, EP is the exponent of the intermediate product, Lsa1 is a determined amount of shifting in the more significant direction of the third significand, OV tracks the presence of overflow wherein OV = 10 denotes left shift overflow and OV = 01 denotes right shift overflow, and Rsa1 is a determined amount of shifting in the less significant direction of the third significand.
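The four-way decision of claim 4 can be transcribed directly. In this sketch LZD(CZ) is passed in as a plain integer, the direction is returned as a string instead of as separate Lsa1 and Rsa1 signals, and the function name is hypothetical; these are presentation choices, not part of the claim.

```python
from typing import Tuple


def pre_alignment_shift(ez: int, ep: int, lzd_cz: int, n: int) -> Tuple[str, int, str]:
    """Return (direction, amount, OV) for shifting the third significand."""
    left_limit = 2 * n + 1 + lzd_cz
    diff = ez - ep
    if diff > left_limit:
        return ("left", left_limit, "10")    # left shift overflow
    if 0 <= diff <= left_limit:
        return ("left", diff, "00")
    if -diff <= 2 * n:                       # here 0 < EP - EZ <= 2n
        return ("right", -diff, "00")
    return ("right", 2 * n, "01")            # right shift overflow


if __name__ == "__main__":
    # EP = EX + EY = 18 + (-1) = 17 and EZ = 31, as in EXAMPLE 1.
    # n = 16 and LZD(CZ) = 0 are assumptions made for this demo.
    print(pre_alignment_shift(ez=31, ep=17, lzd_cz=0, n=16))
    # EP = 17, EZ = 20 and n = 4, as in EXAMPLE 2; LZD(CZ) = 0 is assumed.
    print(pre_alignment_shift(ez=20, ep=17, lzd_cz=0, n=4))
```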

5. The decimal floating-point multiplier adder of any one of claims 1 to 4, wherein the intermediate product has a digit-set in a range of [m, n], m ≥ -8, n < 8, ABS(m - n) ≥ 9.

6. The decimal floating-point multiplier adder of claim 5, wherein the first significand of the first operand has a digit-set in a range of [0,9], the second significand of the second operand has a digit-set in a range of [0,9] and the intermediate product has a digit-set in a range of [-8,7].

7. The decimal floating-point multiplier adder of claim 6, wherein the multiplication module includes a partial product generator for generating 1X, 2X, 3X, 4X and 5X partial products of the significand of the first operand, a signed digit recoder for recoding the significand of the second operand to a digit-set having a range of [-5,5], and a partial products reduction module for summing the partial products based on digits of the recoded second significand.

8. The decimal floating-point multiplier adder of claim 7, wherein the partial products are generated according to:

9. The decimal floating-point multiplier adder of any one of claims 1 to 8, wherein the intermediate product has a digit-set in a range of [m, n], m ≥ -8, n < 8, ABS(m - n) ≥ 9 and the intermediate sum has a digit-set in the range of [m, n], m ≥ -8, n < 8, ABS(m - n) ≥ 9.

10. The decimal floating-point multiplier adder of claim 9, wherein the addition module is a carry-free adder having a transfer and complement generator for generating a transfer digit and a complement digit, and wherein a temporary sum W_t and the transfer digit for the next more significant digit is determined according to:

11. The decimal floating-point multiplier-adder of any one of claims 1 to 10, wherein the aligning of the intermediate sum is further based on a position of the leading non-zero digit of the intermediate sum.

12. The decimal floating-point multiplier-adder of any one of claims 1 to 11, wherein the shifting of the intermediate sum is determined according to:

    if (LOP' - TZD ≤ n and TZD - DIFF_pre ≥ 0) then
        if (LOP' - DIFF_pre ≤ n) then
            Rsa2 = DIFF_pre;
        else if (LOP' - DIFF_pre > n) then
            Rsa2 = LOP' - n;
        end
    else if (LOP' - TZD ≤ n and TZD - DIFF_pre < 0) then
        Rsa2 = TZD;
    else if (LOP' - TZD > n or DIFF_pre < 0) then
        Rsa2 = LOP' - n;
    end

where LOP' is an effective position of the leading non-zero digit; and where DIFF_pre is a difference between a preferred exponent and an exponent corresponding to the intermediate sum.
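A direct transcription of the claim 12 decision is sketched below. TZD is not defined in the claim itself; it is taken here to be the count of trailing zero digits of the intermediate sum, which is an assumption, and the function name is hypothetical.

```python
def post_alignment_shift(lop: int, tzd: int, diff_pre: int, n: int) -> int:
    """Return Rsa2, the right-shift amount for the intermediate sum.

    lop: effective position LOP' of the leading non-zero digit.
    tzd: trailing zero count (assumed meaning of TZD).
    diff_pre: preferred exponent minus the exponent of the intermediate sum.
    n: number of digits in the result significand.
    """
    if lop - tzd <= n and tzd - diff_pre >= 0:
        # The preferred exponent is reachable by dropping trailing zeros only.
        if lop - diff_pre <= n:
            return diff_pre
        return lop - n        # still too wide: keep the n leading digits
    if lop - tzd <= n and tzd - diff_pre < 0:
        return tzd            # drop every trailing zero, but no more
    # Remaining case: more than n significant digits, so rounding is needed.
    return lop - n


if __name__ == "__main__":
    print(post_alignment_shift(lop=6, tzd=3, diff_pre=2, n=4))
    print(post_alignment_shift(lop=10, tzd=0, diff_pre=3, n=4))
```

Once the first two branches have been tested, the only remaining situation is LOP' - TZD > n, which is why the final branch is written unconditionally.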

13. A leading non-zero digit detection module comprising:

a leading non-zero detector for receiving an operand having digits in a signed digit-set having a range of [m, n], m ≥ -8, n < 8, ABS(m - n) ≥ 9, the detector being adapted to detect an initial position of the leading non-zero digit of the operand;

a leading non-zero digit corrector for selectively correcting the initial position of the leading non-zero digit by at most one position in the less significant direction based on pattern analysis of the digits of the operand.

14. The leading non-zero digit detection module of claim 13, wherein the leading non-zero digit corrector corrects the initial position of the leading non-zero digit to the next less significant digit if:

the value of the initial leading non-zero digit is 1 and the value of the next non-zero less significant digit after the initial leading non-zero digit is negative; or

the value of the initial leading non-zero digit is -1 and the value of the next non-zero less significant digit after the initial leading non-zero is positive.
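The two correction conditions of claim 14 can be checked with a simple linear scan. This is a sketch with a hypothetical function name; it takes the digits most significant first and returns zero-based indices, and it does not model the tree structures of claims 15 and 16.

```python
from typing import List, Optional, Tuple


def leading_nonzero_positions(digits: List[int]) -> Optional[Tuple[int, int]]:
    """Return (initial_index, corrected_index), or None if all digits are zero."""
    initial = next((i for i, d in enumerate(digits) if d != 0), None)
    if initial is None:
        return None
    lead = digits[initial]
    # Next non-zero digit after the leading one (0 if there is none).
    following = next((d for d in digits[initial + 1:] if d != 0), 0)
    needs_correction = (lead == 1 and following < 0) or (lead == -1 and following > 0)
    corrected = initial + 1 if needs_correction else initial
    return (initial, corrected)


if __name__ == "__main__":
    # 1 0 -1 1 2 0 has the value 99120: five digits, not six.
    print(leading_nonzero_positions([0, 0, 1, 0, -1, 1, 2, 0]))
    print(leading_nonzero_positions([0, 1, 0, 0, 2]))
```

A leading 1 followed by a negative digit cancels down by exactly one position (for example, 1 0 -1 is 99), which is why the correction never exceeds one digit.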

15. The leading non-zero digit detection module of claims 13 or 14, wherein the leading one corrector is a tree structure having a plurality of nodes, each node of the tree structure having a left branch input, a right branch input, and an output, wherein the output for any node is determined according to the equations:

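if the initial non-zero digit is positive:

$$
\begin{aligned}
\mathrm{node}(d) &= p^{+} && \text{if } {p^{+}}^{l} \cdot z^{r} + z^{l} \cdot {p^{+}}^{r} \\
\mathrm{node}(d) &= \mathit{po} && \text{if } z^{l} \cdot \mathit{po}^{r} + \mathit{po}^{l} \cdot z^{r} \\
\mathrm{node}(d) &= z && \text{if } z^{l} \cdot z^{r} \\
\mathrm{node}(d) &= n && \text{if } n^{l} + z^{l} \cdot n^{r} \\
\mathrm{node}(d) &= y && \text{if } y^{l} + z^{l} \cdot y^{r} + \mathit{po}^{l} \cdot n^{r}
\end{aligned}
$$

or if the initial non-zero digit is negative:

$$
\begin{aligned}
\mathrm{node}(d) &= n^{-} && \text{if } {n^{-}}^{l} \cdot z^{r} + z^{l} \cdot {n^{-}}^{r} \\
\mathrm{node}(d) &= \mathit{no} && \text{if } z^{l} \cdot \mathit{no}^{r} + \mathit{no}^{l} \cdot z^{r} \\
\mathrm{node}(d) &= z && \text{if } z^{l} \cdot z^{r} \\
\mathrm{node}(d) &= p && \text{if } p^{l} + z^{l} \cdot p^{r} \\
\mathrm{node}(d) &= y && \text{if } y^{l} + z^{l} \cdot y^{r} + \mathit{no}^{l} \cdot p^{r}
\end{aligned}
$$

wherein for a digit signal s, p⁺: (s > 1); po: (s = 1); p: (s > 0); z: (s = 0); no: (s = -1); n⁻: (s < -1); n: (s < 0), s^l denotes a left branch input and s^r denotes a right branch input, and node(d) denotes an output of the node;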

wherein the root node has as its left branch input the most significant digit and as its right branch input the second most significant digit;

wherein a child node has as its left branch input the output of its parent node and has as its right branch input the next less significant digit of the digit that is the right branch input of its parent node; and

wherein if the output of the leaf node is equal to y, the leading one corrector corrects the initial position of the leading non-zero digit to the next less significant digit.

16. The leading non-zero digit detection module of claims 13 or 14, wherein the leading one corrector comprises a first tree structure having a plurality of nodes and a second tree structure having a plurality of nodes, each node of the first tree structure having a left branch input, a right branch input, and an output, wherein the output for any node of the first tree structure is determined according to the equations:

$$
\begin{aligned}
\mathrm{node}(d) &= p^{+} && \text{if } {p^{+}}^{l} \cdot z^{r} + z^{l} \cdot {p^{+}}^{r} \\
\mathrm{node}(d) &= \mathit{po} && \text{if } z^{l} \cdot \mathit{po}^{r} + \mathit{po}^{l} \cdot z^{r} \\
\mathrm{node}(d) &= z && \text{if } z^{l} \cdot z^{r} \\
\mathrm{node}(d) &= n && \text{if } n^{l} + z^{l} \cdot n^{r} \\
\mathrm{node}(d) &= y && \text{if } y^{l} + z^{l} \cdot y^{r} + \mathit{po}^{l} \cdot n^{r}
\end{aligned}
$$

wherein for a digit signal s, p⁺: (s > 1); po: (s = 1); z: (s = 0); n: (s < 0), s^l denotes a left branch input and s^r denotes a right branch input, and node(d) denotes an output of the node; and

wherein each node of the second tree structure has a left branch input, a right branch input, and an output, wherein the output for any node of the second tree structure is determined according to the equations:

$$
\begin{aligned}
\mathrm{node}(d) &= n^{-} && \text{if } {n^{-}}^{l} \cdot z^{r} + z^{l} \cdot {n^{-}}^{r} \\
\mathrm{node}(d) &= \mathit{no} && \text{if } z^{l} \cdot \mathit{no}^{r} + \mathit{no}^{l} \cdot z^{r} \\
\mathrm{node}(d) &= z && \text{if } z^{l} \cdot z^{r} \\
\mathrm{node}(d) &= p && \text{if } p^{l} + z^{l} \cdot p^{r} \\
\mathrm{node}(d) &= y && \text{if } y^{l} + z^{l} \cdot y^{r} + \mathit{no}^{l} \cdot p^{r}
\end{aligned}
$$

wherein for a digit signal s, p: (s > 0); z: (s = 0); no: (s = -1); n⁻: (s < -1), s^l denotes a left branch input and s^r denotes a right branch input, and node(d) denotes an output of the node;

wherein the root nodes of the first and second tree structures have as their left branch input the most significant digit and as their right branch input the second most significant digit;

wherein a child node of the first and second tree structures has as its left branch input the output of its parent node and has as its right branch input the next less significant digit of the digit that is the right branch input of its parent node; and

wherein if the output of the leaf node of at least one of the first tree structure and the second tree structure is equal to y, the leading one corrector corrects the initial position of the leading non-zero digit to the next less significant digit.

17. A combined rounder-conversion module for processing a signed operand having a sign and a significand having l digits in a signed digit-set having a range of [m, n], m ≥ -9, n < 8, ABS(m - n) ≥ 9, the module comprising:

an inverter for selectively inverting the l - 1 most significant digits of the significand based on the sign of the operand to output a bit-inverted intermediate;

a calculation unit for determining the propagation bits of the l - 2 most significant digits of the significand;

a calculation unit for determining the generation bits of the l - 2 most significant digits of the significand;

a rounding increment generation unit for determining an increment value based on at least the sign of the operand, the least significant digit of the significand, and a sticky digit representing values of one or more digits less significant than the least significant digit of the significand;

a negative carry generation unit for determining a negative carry signal based on the sign of the operand, the increment value, the value of the second least significant digit of the significand, the propagation bits of the l - 2 most significant digits, and the generation bits of the l - 2 most significant digits;

a correction signal generator for generating a correction signal based on the negative carry signal;

an adder for adding the bit-inverted intermediate with the correction signal to output a rounded-converted result.

18. The rounder-conversion module of claim 17, wherein the rounding increment is further determined based on a sign of the significand.

19. The rounder-conversion module of claims 17 or 18, wherein the rounding increment is determined based on:

20. The rounder-conversion module of any one of claims 17 to 19, wherein the rounded-converted result has l - 2 significant digits and represents the l - 2 most significant digits of the significand being rounded by the least significant digit of the significand while accounting for the sticky digit.

21. The rounder-conversion module of any one of claims 17 to 20, wherein the rounder-conversion module is free of positive carry propagation.

22. The rounder-conversion module of any one of claims 17 to 21, wherein the negative carry signal is determined according to:

where

$$
NC\{0\} =
\begin{cases}
1 & \text{if } Sum2\{1\} < -1 \text{ or } (Sum2\{1\} = -1 \;\&\; Sign2 = 1) \\
0 & \text{otherwise}
\end{cases}
$$

wherein Sign2 is the sign of the operand.

23. The rounder-conversion module of claim 22, wherein the correction signal is determined according to:

    1 if C_{1:0} = 00 & RD_inc = 1,
    10 if C_{1:0} = 01 & RD_inc = 1,
    11 if C_{1:0} = 10 & RD_inc = 1,
    0 if C_{1:0} = 11 & RD_inc = 1,
    0 if C_{1:0} = 00 & RD_inc = 0,
    11 if C_{1:0} = 01 & RD_inc = 0,
    10 if C_{1:0} = 10 & RD_inc = 0,
    1 if C_{1:0} = 11 & RD_inc = 0,
    -1 if C_{1:0} = 00 & RD_inc = -1,
    12 if C_{1:0} = 01 & RD_inc = -1,
    9 if C_{1:0} = 10 & RD_inc = -1,
    2 if C_{1:0} = 11 & RD_inc = -1,

wherein C = {NC[l - 2: 0]; Sign2}.

24. The decimal floating-point multiplier-adder of any one of claims 1 to 12, wherein the post-alignment module includes the leading non-zero digit detection module of any one of claims 13 to 16, the leading non-zero detector receiving the intermediate sum as its operand.

25. The decimal floating-point multiplier-adder of claim 24, wherein the post-alignment module includes:

a first decision module for determining the shifting based on an uncorrected initial position of the leading non-zero digit of the intermediate sum;

a second decision module for determining the shifting based on a corrected position of the leading non-zero digit of the intermediate sum in parallel with the determining of the first decision module; and

a selector for selecting between an output of the first decision module and an output of the second decision module based on the correcting by the leading one corrector of the position of the leading non-zero digit detection module.

26. The decimal floating-point multiplier-adder of any one of claims 1 to 12, 24, or 25, wherein the post-alignment module includes a sticky digit generator for determining a sticky digit corresponding to a digit shifted out by the post-alignment module, the multiplier-adder further comprising the combined rounder-conversion module of any one of claims 17 to 23, the rounder-conversion module receiving the intermediate sum as the significand of the input operand and the determined sticky digit.