CA1193022A - Arithmetic unit for use in data processing systems
DATA PROCESSING SYSTEMS
Abstract of the Invention
A data processing system using unique procedures for handling various arithmetic operations. In floating point arithmetic mantissa calculations the system uses a novel technique for inserting a round bit into the appropriate bit of the floating point result, wherein a look-ahead carry bit generator stage is used for such purpose to reduce the overall mantissa calculation time. Further, the system utilizes unique logic which operates in parallel with the floating point exponent calculation logic for effectively predicting whether or not an overflow or underflow condition will be present in the final exponent result and for informing the system when such conditions have occurred. Moreover, the system utilizes a simplified technique for computing the extension bits which are required in multiply and divide computations, wherein a programmable array logic unit and a four-bit adder unit are combined for such purposes.
This is a divisional of copending Canadian patent application Serial No. 401,467, filed April 22, 1982 by Data General Corporation.
Introduction
This invention relates generally to data processing systems which utilize fixed and floating-point arithmetic units and, more particularly, to unique techniques for "rounding" floating-point calculations, for handling overflow and underflow conditions therein, and for providing unique arithmetic word extension logic for use in performing multiplication and division operations.
Background of the Invention
The representation of numbers in data processing systems, particularly non-integer numbers, requires the introduction of a radix point into the notation. For example, data processing systems may employ "fixed point notation" wherein the radix point is placed immediately to the right of the least significant bit or placed immediately to the right of the sign bit before the first information bit.
A further option is often referred to as "floating-point notation" in which the numbers are represented by a sign, an exponent, and a mantissa. Such technique is described in many texts, one example being "Computer Architecture", Caxton C. Foster, Van Nostrand Reinhold Co., New York 1976, pages 16 et seq.
Calculations upon the mantissa may be performed by operating on groups of bits (i.e., "bit slices") of the mantissa words involved, the computation for each bit slice producing a "carry" bit to be added to the adjacent bit slice until the calculation is completed for the entire word. For example, overall mantissa words having 32 bits may use eight 4-bit slice logic units in such calculations.
If each bit slice is permitted to produce its "carry" bit only after the operation for such bit slice has occurred and the carry bit is then added to the next bit slice, the overall calculation time is considerably longer than desired. In order to reduce the overall calculation time, techniques for effectively computing the carry bits ahead of time, i.e., so-called "look ahead" carry bit techniques, have been devised wherein the various carry bits are computed in parallel and simultaneously with the bit slice computation operations. Such techniques have been used for many years and are well known to those in the art.
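The look-ahead idea can be illustrated with a short Python sketch. It is not taken from the patent's logic diagrams; the function names and the generate/propagate formulation are the usual textbook ones and are assumed here purely for illustration. Each 4-bit slice reports whether it generates a carry on its own (G) or would pass an incoming carry along (P), and the carries into all slices are then derived from those signals alone, without waiting for any slice sum.

```python
def slice_generate_propagate(a, b):
    """Return (G, P) for one 4-bit slice (textbook definition, assumed here).

    G = 1 if the slice produces a carry out with no carry in.
    P = 1 if the slice would pass an incoming carry through to its carry out.
    """
    g = a & b          # per-bit generate
    p = a | b          # per-bit propagate
    G = 0
    for i in range(4):                 # least significant bit first
        G = ((g >> i) & 1) | (((p >> i) & 1) & G)
    P = 1 if p == 0xF else 0
    return G, P


def lookahead_add(x, y, carry_in=0, slices=8):
    """Add two (4 * slices)-bit words using slice-level look-ahead carries."""
    xs = [(x >> (4 * i)) & 0xF for i in range(slices)]   # index 0 = least significant slice
    ys = [(y >> (4 * i)) & 0xF for i in range(slices)]
    gp = [slice_generate_propagate(a, b) for a, b in zip(xs, ys)]

    # Carry into each slice, computed only from G/P signals and carry_in.
    # Hardware expands this recurrence into flat logic; the loop is a model.
    carries = [carry_in]
    for G, P in gp:
        carries.append(G | (P & carries[-1]))

    # Every slice sum can now be formed independently of the other slices.
    result = 0
    for i in range(slices):
        result |= ((xs[i] + ys[i] + carries[i]) & 0xF) << (4 * i)
    return result, carries[-1]
```

With `slices=8` the sketch corresponds to a 32-bit word split into eight 4-bit slices; the second return value is the carry out of the most significant slice.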
After the overall computation has been completed, a "round" bit is then computed and added to the last bit slice as discussed below, the round bit being determined by a plurality of bits, often referred to as "guard" bits, which form a particular coded word which must be suitably decoded to produce the round bit. The round bit is usually calculated following the overall computation and then added to the least significant bit of the unrounded floating point result at the appropriate carry bit location, an operation which can be thought of as an effective multiplexing operation, i.e., the round bit being inserted during the rounding cycle instead of the associated carry bit. For example, when using 32-bit words, the unrounded floating point result comprises 32 bits and is then rounded to a final result having 24 bits. In such case, the unbiased rounding algorithm uses the eight bits of least significance to determine how to round the final 24-bit result. However, the insertion of the round bit to the completed floating point computation result by effective multiplexing techniques adds additional time to all of the calculations required for the mantissa calculation. It is desirable to devise techniques to save this time.
Further, in calculating the exponent portion of a floating point result, if the calculation does not produce a value which falls within a particular exponent value range (i.e., a value having a particular number of bits), an "overflow" or an "underflow" condition occurs. If either such condition occurs, the system must provide an indication thereof so that appropriate sub-routines for handling such conditions can be invoked and the status of the floating point condition must be appropriately communicated to the overall system. In order to save time in the overall operation it is desirable to accelerate the detection and communication of exponent overflow and underflow conditions.
Further, in a multiply or divide operation, conventional algorithms which are used for such operations require that certain operands be extended, i.e., that additional bits be added to the operand words. For example, in a particular multiply algorithm such operands must be extended by two bits, while in a particular divide algorithm such operands must be extended by one bit. Extension techniques which require the use of extra bit slice logic units add to the hardware complexity of the arithmetic units or floating point units. It is desirable to devise less cumbersome extension techniques for such purpose to avoid such hardware complexity.
Brief Summary of the Invention
In order to achieve rounding of mantissa computations, a novel modification to the conventional "look ahead" carry technique is utilized in accordance with the invention, wherein a portion of an existing "look ahead" stage (i.e., that stage normally used to generate the carry bit which is to be added to the least significant bit slice of the unrounded floating point result) is used for the round bit calculation, the round bit then being added as the carry bit in the least significant bit of the unrounded floating point result in parallel with all the other carry bits calculated by the remaining parallel look ahead stages.
Further, in order to provide relatively fast detection of overflow and underflow conditions during exponent calculations, rather than making a complete addition of the exponent value and then detecting the overflow and underflow conditions thereafter, the system in accordance with the invention utilizes overflow/underflow logic which operates in parallel with the final exponent computation and in effect predicts whether or not an overflow or an underflow condition will exist in the final exponent calculation. In a preferred embodiment such logic uses an extra adder stage, together with associated logic, to produce a signal which provides an indication to the system that an overflow or underflow condition exists.
Further, the invention makes use of simplified arithmetic unit extension logic, using less complex programmable array logic and addition stages to provide a simpler technique for operand extensions during multiply and divide operations.
According to a first broad aspect of the present invention, there is provided in a data processing system having an arithmetic unit responsive to one or more instructions for performing multiplication or division operations requiring a plurality of computation cycles, each of which cycles requires the result thereof to be extended by a selected number of bits, said arithmetic unit comprising programmable array logic means responsive to selected bits generated by said instructions and by said arithmetic unit in performing said multiplication or division operations identifying selected characteristics of the computation being performed for providing operand extension bits; and further logic means responsive to said operand extension bits for providing said selected number of extension bits for extending the computation result during the current computation cycle.
According to a second broad aspect of the present invention, there is provided in a data processing system having an arithmetic unit responsive to one or more instructions for performing divide operations requiring a plurality of computation cycles, each of which cycles requires an extension of the dividend operand, said arithmetic unit comprising programmable array logic means responsive in a current divide cycle to a first selected bit of the result of the previous divide cycle and to a second selected bit representing a quotient bit provided in the previous divide cycle identifying whether an add or subtract operation is required for the current divide cycle, said logic means producing a plurality of extension bits; and further logic means responsive to said extension bits for providing a quotient bit for each current divide cycle.
According to a further aspect of the present invention, there is provided in a data processing system having an arithmetic unit responsive to one or more instructions for performing multiplication operations requiring a plurality of computation cycles, each of which cycles requires a sign or a zero extension of the partial product operand and the multiplicand operand, said arithmetic unit comprising programmable array logic means responsive to two selected bits each representing the most significant bit of each operand, to a selected bit generated by said instructions identifying whether the current cycle is to be a sign extended or a zero extended operation, to a selected bit generated by said instructions identifying whether the current multiply cycle is the first multiply cycle or other than the first multiply cycle, to a selected bit generated by said arithmetic unit in performing said multiplication operation identifying whether the current cycle requires an addition or a subtraction operation, and to a selected pair of bits generated by said arithmetic unit in performing said multiplication operation identifying the form of said multiplicand operand during said current cycle, said logic means providing a plurality of extension bits; and further logic means responsive to said extension bits for providing a plurality of extension bits for extending the partial product result during the current multiply cycle.
Description of the Invention
The invention can be described in more detail with the help of the accompanying drawings wherein:
FIG. 1 shows a block diagram of an arithmetic unit for performing arithmetic calculations;
FIGS. 2 - 2D show specific 4-bit microprocessor slice logic units used in mantissa calculations;
FIG. 3 shows specific look-ahead logic units for computing the carry bits for mantissa calculations;
FIG. 4 shows specific logic units for use in computing the round bit for mantissa calculations in accordance with the invention;
FIG. 5 shows specific 4-bit microprocessor slice logic units used in exponent calculations;
FIG. 6 shows a chart useful in explaining the overflow/underflow operation in accordance with the invention;
FIG. 7 shows logic units used to detect overflow/underflow conditions and to provide an indication thereof to the data processor system; and
FIG. 8 shows logic units used to provide extension bits for use in multiply or divide operations in accordance with the invention.
The invention disclosed herein can be best described in the context of a particular data processing system, such as that disclosed in United States Patent No. 4,386,399, issued May 31, 1983 to Rasala et al. A broad block diagram of the arithmetic logic unit (ALU) used therein is shown in FIGURE 154 of that patent, which is reproduced herein as FIGURE 1. Specific logic diagrams required to understand the invention described herein are shown here in FIGURES 2-5 and FIGURES 7 and 8. In such system, as is conventional for 32-bit mantissa calculations, the computation is made in four-bit slices as shown by the 4-bit microprocessor slice logic units 10A-10H. Two 4-bit values are operated on, and a 1-bit carry value is added to the result of this operation to produce the 4-bit output (DS-outputs). Thus the slice units produce an unrounded floating point result formed as the 4-bit slices shown below:
[Diagram: the 32-bit word drawn as eight 4-bit slices, with bits 0-23 labeled UNROUNDED FLOATING POINT RESULT and bits 24-31 labeled GUARD BITS]
As can be seen in the above diagram, in the 32-bit calculated word the 24 bits forming the six more significant 4-bit slices (bits 0-23) produce the unrounded floating point result, while the eight bits forming the two less significant 4-bit groups (bits 24-31) represent the bits which are used to determine the "round" bit, referred to as the "guard" bits.
In each instance, each of the 4-bit slices effectively produces a carry (CRY) bit which is supplied to the next adjacent 4-bit slice, e.g., bit slice 0-3 is effectively supplied with the carry bit CRY4 from the 4-bit slice 4-7, the latter slice is supplied with CRY8 from 4-bit slice 8-11, and so on. The CRY31 bit is supplied from microcode, while the CRY0 bit is the carry bit for the final overall floating point result. In order to save time in the computation the carry bits are actually generated in parallel with the 4-bit computations in each of the 4-bit microprocessor slice logic units, the parallel carry bit calculations being performed in the look ahead carry generator units 11A, 11B and 11C as shown in FIG. 3. Thus, during such arithmetic operation cycle, the overall computation, including both the 4-bit additions and the carry bits generation, is performed substantially simultaneously to form the unrounded 32-bit result.
In conventional floating point computation techniques the round bit is then added during the next cycle (the round cycle) of the floating point computation, after the arithmetic operation has been completed, by appropriately decoding the guard bits to generate the round bit and then adding the round bit to bit 23 (effectively as the CRY24 bit), which latter process can be achieved by using effective multiplexing techniques, for example. A certain amount of time is required for performing the multiplexing operation, and such time increases the overall data processing time for all operations. It is desirable to avoid such added processing time by reducing the time needed to generate and insert the round bit, as described below, in accordance with the invention.
The logic circuitry of the invention utilizes an additional section of look ahead generator stage 11A and associated circuitry 12, 13 and 14 for generating the round bit and for adding the round bit (as an effective CRY24 bit) in parallel with the generation and insertion of the other carry bits.
Thus, the guard bits (D24-31) are supplied to appropriate gating logic 13, shown in FIG. 4, to produce the RNDB bit. Such bit is then utilized in the round enable logic 12 and 14, shown in FIG. 4, which produce the ROUND bit, the latter being supplied to the CRY24 look ahead generator 11A and, thence, to the specific 4-bit slice unit 10C which produces the least significant four bits (bits 20-23) of the (24-bit) floating point result as the final step in the arithmetic operation. Thus, the rounding of the floating point result is accomplished without adding a time interval required for the effective multiplexing of the round bit into CRY24 as required when using conventional techniques.
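The arithmetic effect of treating the round bit as a carry into bit 23 can be sketched in a few lines of Python. The text does not spell out how the guard bits are decoded, so the decode below (round to nearest, ties to even) is an assumption chosen as one common "unbiased" rule; the function names are hypothetical and the sketch models only the arithmetic, not the timing benefit of using the look-ahead stage.

```python
def round_bit_from_guard(guard, lsb):
    """Decode 8 guard bits into a round bit.

    Assumed rule (not stated in the text): round to nearest, ties to even.
    guard is bits 24-31 of the unrounded word, lsb is bit 23.
    """
    if guard > 0x80:       # more than half of one unit in the last place
        return 1
    if guard == 0x80:      # exactly half: round so the result is even
        return lsb
    return 0


def round_mantissa(unrounded32):
    """Round a 32-bit unrounded result to 24 bits.

    The round bit enters as a carry into the least significant result bit
    (bit 23 in the patent's numbering, where bit 0 is the most significant).
    Returns (24-bit result, carry out of the most significant bit).
    """
    upper24 = (unrounded32 >> 8) & 0xFFFFFF
    guard = unrounded32 & 0xFF
    rnd = round_bit_from_guard(guard, upper24 & 1)
    total = upper24 + rnd
    return total & 0xFFFFFF, total >> 24
```

For instance, `round_mantissa(0x000001C0)` has guard bits `0xC0`, so the round bit is 1 and the 24-bit result becomes 2.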
The technique used in accordance with the invention for detecting overflow and underflow conditions during computation of the exponent value in a floating point operation is specifically described, for convenience, with reference to the generation of a 7-bit exponent. The generation of a computed exponent value involves the addition of a first exponent value (AEXP), representing an exponent value stored in a particular register, and a second exponent value (DEXP), representing an exponent value which is obtained, for example, from an external source. The AEXP and DEXP values are each defined by seven bits and their addition yields the desired exponent result (BEXP), which can be stored in another specified register. The above operation can be represented in accordance with the following relation:
AEXP + DEXP → BEXP
If the original exponents are each represented by 7-bit values and the arithmetic operation is performed as shown above, the useful result should also be expressed by 7 bits. If such result is expressed by more than 7 bits an overflow or underflow condition exists. More particularly, 7-bit exponents, during overflow or underflow, will never yield values that require more than eight bits. Such characteristics can be understood with the help of FIG. 6 wherein a 7-bit exponent defines 128 values within a range from -64 to +63, while an 8-bit exponent defines 256 values within a range from -128 to +127. Values from +64 to +127 and values from -65 to -128 are defined as representing an overflow or underflow condition (these decimal values are obtained by interpreting the 8-bit exponents as two's complement notation binary numbers).
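The ranges above can be checked with a small Python sketch that interprets an 8-bit pattern as a two's complement number and classifies it against the 7-bit range of -64 to +63. The function names are hypothetical and serve only to illustrate the arithmetic.

```python
def to_signed8(bits):
    """Interpret the low 8 bits of an integer as a two's complement value."""
    bits &= 0xFF
    return bits - 256 if bits & 0x80 else bits


def classify_exponent(bits):
    """Classify an 8-bit exponent result against the 7-bit range -64..+63."""
    value = to_signed8(bits)
    if value > 63:
        return "overflow"      # +64 .. +127
    if value < -64:
        return "underflow"     # -65 .. -128
    return "in range"
```

For example, the pattern `0x40` is +64 and falls just outside the 7-bit range, while `0xC0` is -64 and is still representable.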
FIG. 6 also depicts overflow/underflow conditions for addition, subtraction, multiplication and division operations. Thus, an addition or subtraction overflow occurs within a range from +64 to +126 while an addition or subtraction underflow occurs in a range from -64 to -127. A multiply overflow occurs in a range from +64 to +126 while a multiply underflow condition occurs in a range from -64 to -129. A division overflow occurs within a range from +64 to +128, while a division underflow occurs within a range from -64 to -127. Two special conditions should be pointed out. In the multiply underflow range a special condition occurs wherein -129 is represented as +127 and in the divide overflow range a special condition occurs wherein +128 is represented as -128.
The exponent calculation is performed in two 4-bit slices as shown in FIG. 5 by 4-bit microprocessor slice logic units 20A and 20B. AEXP is addressed from the A register (AREG0A-3A) while the externally sourced exponent DEXP is supplied as bits XD0-XD7. During this calculation bits EXP0-7 hold the AEXP value. BEXP is then calculated and supplied to the register addressed by the B register (BREG0A-3A).
The system must then provide an indication of an overflow or an underflow condition, i.e., when the result falls outside the 7-bit range, and provide a suitable signal which will enable an appropriate sub-routine for handling the particular identified overflow or underflow condition. Because of the particular conventional algorithm used, during the last cycle of the exponent calculation DEXP is limited to a value which lies within the range from -8 to +7. Thus, the value ranges of interest in the register AEXP (addressed by AREG) are as shown in FIG. 6. They include a middle range from -56 to +55 in which it is clear there will be no overflow or underflow error condition (i.e., even where DEXP is at its -8 or +7 limits, the final result would not lie in an overflow or underflow range), an upper range from +72 to +127 and a lower range from -73 to -128. In the latter ranges it is clear that no matter what the DEXP value is during the last cycle, the final result would clearly be in an overflow or an underflow range. In the two cross-over ranges (+56 to +71) and (-57 to -72) the overflow or underflow conditions must be determined, depending on the value of the DEXP within its limits during the last cycle of the exponent calculation.
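The claim that only the cross-over ranges need closer inspection can be checked by brute force. The following Python sketch (function names hypothetical) tries every last-cycle DEXP value from -8 to +7 for a given AEXP and collects the possible outcomes, taking the 7-bit range of -64 to +63 as the in-range condition.

```python
DEXP_VALUES = range(-8, 8)   # last-cycle DEXP limits stated in the text


def outcomes(aexp):
    """Set of possible outcomes for one AEXP over all last-cycle DEXP values."""
    seen = set()
    for dexp in DEXP_VALUES:
        total = aexp + dexp
        if total > 63:
            seen.add("overflow")
        elif total < -64:
            seen.add("underflow")
        else:
            seen.add("ok")
    return seen


def patent_region(aexp):
    """Region of AEXP using the range boundaries given in the text."""
    if -56 <= aexp <= 55:
        return "middle"
    if 72 <= aexp <= 127:
        return "upper"
    if -128 <= aexp <= -73:
        return "lower"
    return "cross-over"      # +56..+71 or -72..-57
```

Every AEXP whose outcome depends on DEXP lands in a cross-over range. The ranges are aligned to blocks of eight values, so their edge values (+56 and -72) are included even though their outcome happens to be fixed.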
An NEXP1 bit is obtained by adding selected bits EXP1 and EXP5-7 of the AEXP with bits XD1 and XD5-7 of the DEXP, as shown by 4-bit adder unit 21 of FIG. 7. An ERR CASE signal is derived from AEXP bits EXP0-7 via programmable array logic unit 22 in FIG. 7. The overflow/underflow status of the exponent calculation is defined by the NS1 and NS2 bits from programmable array logic (PAL) 23 in FIG. 7. In an overflow condition, programmable array logic 23 asserts an NS1 signal while in an underflow condition an NS2 bit is asserted. Bits SEXP0 and SEXP1 are derived from AEXP bits EXP0-7 and signal FLAG 5, indicating multiply or divide, in PAL 24.
This logic is provided to detect generally in which range the final floating point exponent result resides and, more particularly, provides a capability of determining where within the cross-over regions the final result lies so as to determine whether an overflow or underflow condition exists within such latter regions. In order to do so, examination of the selected AEXP bits and the selected DEXP bits is made by the above described logic in accordance with the following chart, an explanation of which can be understood in connection with FIG. 6. Note that signals SEXP0 and SEXP1 in all cases, except for the special case described in the chart and below, are equal to EXP0 and EXP1, respectively.
OVERFLOW (NS1)
  IF ERR CASE IS "TRUE":    SEXP0,1 = 00        01        01
                            NEXP1   = 1    OR   1    OR   (DON'T CARE)
                            XD1     = 0         1         0
  IF ERR CASE IS "FALSE":   SEXP0,1 = 01

UNDERFLOW (NS2)
  IF ERR CASE IS "TRUE":    SEXP0,1 = 11        10        10
                            NEXP1   = 0    OR   0    OR   (DON'T CARE)
                            XD1     = 1         0         1
  IF ERR CASE IS "FALSE":   SEXP0,1 = 10

SPECIAL CASE:
  SPC CASE IS "TRUE" when
    MULTIPLY UNDERFLOW: EXP = +127 (real value is -129, FLAG 5 is true)
    DIVISION OVERFLOW:  EXP = -128 (real value is +128, FLAG 5 is false)
  IF SPC CASE IS "TRUE", SEXP0,1 VALUES ARE INVERTED:      SEXP0,1 = NOT EXP0,1
  IF SPC CASE IS "FALSE", SEXP0,1 VALUES REMAIN THE SAME:  SEXP0,1 = EXP0,1

The cases set forth in the above chart depict the situations in which an overflow or an underflow condition exists in the final computed exponent result. The values of the condition indicator bits SEXP0,1, NEXP1, XD1 and ERR CASE are utilized during the last computation cycle in effect to predict the presence or absence of an overflow or underflow condition in the final exponent result BEXP.
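One way to convince oneself that a handful of indicator bits suffices is to model the prediction in software and compare it with the full addition. The Python sketch below is an illustration under stated assumptions: ERR CASE is taken to be "true" exactly in the cross-over ranges (+56 to +71 and -72 to -57), the special case is ignored (SEXP0,1 = EXP0,1), the 4-bit adder has no carry in, and each ERR CASE "true" condition is read as three (SEXP0,1, NEXP1, XD1) combinations, listed in the code comments. None of the function names come from the source.

```python
def bit(value, n):
    """Bit n of an 8-bit two's complement value; bit 0 is the MSB as in the text."""
    return ((value & 0xFF) >> (7 - n)) & 1


def predict(aexp, dexp):
    """Return (NS1, NS2), i.e. (overflow, underflow), from the indicator bits only."""
    sexp = (bit(aexp, 0), bit(aexp, 1))          # assumes SPC CASE is "false"
    xd1 = bit(dexp, 1)
    err_case = 56 <= aexp <= 71 or -72 <= aexp <= -57   # assumed ERR CASE definition

    # Model of 4-bit adder 21: bits 1, 5, 6, 7 of each operand, no carry in.
    a4 = (bit(aexp, 1) << 3) | (aexp & 0x7)
    d4 = (xd1 << 3) | (dexp & 0x7)
    nexp1 = ((a4 + d4) >> 3) & 1                 # top sum bit

    if err_case:
        overflow = ((sexp == (0, 0) and nexp1 == 1 and xd1 == 0)    # 00, 1, 0
                    or (sexp == (0, 1) and nexp1 == 1 and xd1 == 1) # 01, 1, 1
                    or (sexp == (0, 1) and xd1 == 0))               # 01, don't care, 0
        underflow = ((sexp == (1, 1) and nexp1 == 0 and xd1 == 1)   # 11, 0, 1
                     or (sexp == (1, 0) and nexp1 == 0 and xd1 == 0)  # 10, 0, 0
                     or (sexp == (1, 0) and xd1 == 1))              # 10, don't care, 1
    else:
        overflow = sexp == (0, 1)
        underflow = sexp == (1, 0)
    return overflow, underflow
```

Under these assumptions the prediction agrees with the result of the complete addition for every AEXP from -128 to +127 and every last-cycle DEXP from -8 to +7, which is the point of computing it in parallel with the exponent sum.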
In order to access the desired sub-routine for handling overflow or underflow conditions, a signal for indicating that one of such conditions has occurred is provided to the system as the SET FLT ERR signal from PROM unit 25 (see FIG. 7). The latter signal is determined by the states of the first two AEXP indicator bits (directly accessible as EXP0 and EXP1 from 4-bit slice logic unit 20A), by the NEXP1 indicator bit (determined by selected EXP 1, 5, 6 and 7 and XD 1, 5, 6 and 7 bits as discussed above), by the ERR CASE indicator bit, and by indicator bit XD1, the second most significant bit of the DEXP, as shown in FIG. 7. The status of such bits in determining the overflow and underflow conditions is defined in the chart set forth above. In effect SET FLT ERR is "true" when either NS1 or NS2 is "true".
While the add ("ADD") or subtract ("SUB") overflow and underflow conditions are relatively straightforward, as shown in FIG. 6, special cases exist for a multiply ("MULT") calculation wherein an underflow condition exists over a range from -64 to -129 and for a divide ("DIV") calculation wherein an overflow condition exists over a range from +65 to +128. The special cases are as follows: in multiply at -129, which is represented as +127, and in divide at +128, which is represented as -128. Such special cases are determined by programmable array logic 24 in FIG. 7, wherein the AEXP bits EXP0-7 are examined as well as the signal FLAG 5 which is set "true" during multiply and "false" during divide by microcode. If a special case condition (+127 for a MULT and -128 for a DIV) exists, a SPCCASE signal is generated (SPCCASE is "true"). In such conditions the values of SEXP0 and SEXP1 must be inverted. That is, when SPCCASE is "true", SEXP0 = NOT EXP0 and SEXP1 = NOT EXP1. So long as SPCCASE is "false", no change is made in the values of SEXP0 and SEXP1. That is, SEXP0 = EXP0 and SEXP1 = EXP1.
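The effect of the inversion can be seen in a short Python sketch (function name hypothetical). The stored pattern +127 has EXP0,1 = 01, which would normally read as an overflow pattern; inverting the two bits gives 10, the underflow pattern, which matches the real value of -129. The divide case works the same way in the other direction.

```python
def sexp_bits(exp8, flag5):
    """Return ((SEXP0, SEXP1), SPCCASE) for a stored 8-bit exponent value.

    exp8 is the stored value as a signed integer (-128..+127).
    flag5 is True during multiply and False during divide, as in the text.
    """
    exp0 = ((exp8 & 0xFF) >> 7) & 1      # most significant bit
    exp1 = ((exp8 & 0xFF) >> 6) & 1      # second most significant bit
    spc_case = (flag5 and exp8 == 127) or (not flag5 and exp8 == -128)
    if spc_case:
        return (exp0 ^ 1, exp1 ^ 1), True    # both bits inverted
    return (exp0, exp1), False
```

With `flag5=True`, `sexp_bits(127, True)` yields the pair `(1, 0)`; the same stored value during a divide is left unchanged as `(0, 1)`.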
Accordingly, the above discussed logic not only computes the status of the floating point result but also simultaneously computes a SET FLT ERR signal which is supplied to the system (in this case the address translation unit (ATU) of the system shown in the aforementioned Rasala et al. application) for accessing the desired sub-routine for handling the overflow or underflow conditions, such operations occurring substantially at the same time that the overall exponent addition operation occurs in the computation of the floating point result (in the computation of the BEXP value).
In order to achieve full precision in the results obtained for multiplication and division operations of an arithmetic logic unit or a floating point computation unit, the use of a conventional multiplication algorithm (known as a "two-bit Booth's algorithm") requires that the multiplicand and partial product operands be extended by two additional bits (e.g., when using 32-bit or 64-bit words such operands must be extended to 34 bits and 66 bits, respectively) and the use of a conventional division algorithm (known as a "non-restoring divide" algorithm) requires extension of the dividend and divisor operands by one additional bit (e.g., when using 32-bit or 64-bit words such operands must be extended to 33 bits or 65 bits, respectively). It is desirable to provide a technique for extending the words involved which does not require adding an extra 4-bit microprocessor slice logic unit.
A technique for providing the desired arithmetic extensions is shown in FIG. 8 which depicts a programmable array logic unit 26 and a simple 4-bit adder unit 27. The following analyses assist in explaining the operation of units 26 and 27 for a multiply or division operation.
In the system design described herein, the multiply operation provides a result by adding a value A to the product of two values B and C, i.e., the multiply result is A + (B × C). A multiply operation may be signed or unsigned. In accordance with the conventional multiply algorithm, a signed multiply operation requires both a first operand (the partial product) and a second operand (either a 0, the multiplicand, or twice the multiplicand) to be sign extended. An unsigned multiply operation requires the first operand to be sign extended (except in the first multiplication cycle) and requires the second operand to be zero extended. The chart set forth below depicts the two extended bits for both the first and second operands for the various multiply operations.
SIGNED MULTIPLY

OPERAND     ADD          SUBTRACT
1st           Do Do        Do     Do
2nd         + 0  0       + 1      1           ± 0
1st           Do Do        Do     Do
2nd         + QB QB      + NOT QB NOT QB      ± MULTIPLICAND
1st           Do Do        Do     Do
2nd         + QB QB      + NOT QB NOT QB      ± 2X MULTIPLICAND

UNSIGNED MULTIPLY

OPERAND     ADD          SUBTRACT
1st           Do Do        Do     Do
2nd         + 0  0       + 1      1           ± 0
1st           Do Do        Do     Do
2nd         + 0  0       + 1      1           ± MULTIPLICAND
1st           Do Do        Do     Do
2nd         + 0  QB      + 1      NOT QB      ± 2X MULTIPLICAND
In the above chart Do represents the most significant bit of the first operand and QB represents the most significant bit of the second operand.
In the special case of an unsigned multiply during the first multiply cycle (when the partial product is the value A) the first operand (partial product) is zero extended as follows:
UNSIGNED MULTIPLY (DURING FIRST CYCLE)

OPERAND     ADD          SUBTRACT
1st           0  0         0      0
2nd         + 0  0       + 1      1           ± 0
1st           0  0         0      0
2nd         + 0  0       + 1      1           ± MULTIPLICAND
1st           0  0         0      0
2nd         + 0  QB      + 1      NOT QB      ± 2X MULTIPLICAND
The signal D0 contains the value of Do and QBit contains the value of QB in the above chart.
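The reason the second operand is always 0, the multiplicand, or twice the multiplicand, added or subtracted, follows from two-bit (radix-4) Booth recoding. The Python sketch below is a generic textbook formulation, not the patent's cycle-by-cycle hardware sequence; the function names and the choice of `b` as multiplicand and `c` as multiplier are assumptions made for illustration.

```python
def booth_digits(c, n):
    """Radix-4 Booth digits of a signed n-bit multiplier (n even).

    Each digit is in {-2, -1, 0, +1, +2}, least significant digit first.
    """
    cu = c & ((1 << n) - 1)          # two's complement bit pattern
    digits = []
    prev = 0                         # implicit bit to the right of bit 0
    for i in range(0, n, 2):
        b0 = (cu >> i) & 1
        b1 = (cu >> (i + 1)) & 1
        digits.append(prev + b0 - 2 * b1)
        prev = b1
    return digits


def booth_multiply_accumulate(a, b, c, n=32):
    """Compute a + b * c, where c is a signed n-bit multiplier."""
    acc = a                          # the first partial product is the value A
    for i, d in enumerate(booth_digits(c, n)):
        # d * b is 0, +/- the multiplicand, or +/- twice the multiplicand.
        acc += (d * b) << (2 * i)
    return acc
```

Because a digit can be ±2, one cycle may add or subtract twice the multiplicand, which is why the operands need two bits beyond the word width.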
The programmable array logic 26 handles the generation of the above values for the first and second operands as shown in FIG. 8. The IRI0 and IRI1 bits identify the cases set forth below, the IROP4 bit signifies an "add" or "subtract" operation during the multiply, the UNSIGN bit signifies a signed or an unsigned multiply operation and the MACC EN bit signifies operation either in the first multiply cycle or in a cycle other than the first multiply cycle, as set forth below. The signal MPY is "true" during multiply cycles.
IRI BITS    SIGNIFY
0 0         ± 0
0 1         ± MULTIPLICAND
1 0         ± 2X MULTIPLICAND
1 1         ± 0 (REDUNDANT)

UNSIGN

0           FIRST MULTIPLY CYCLE

where in each case 0 = Low and 1 = High.
Bits IRI1, IRI0, and IROP4 are generated in accordance with the particular multiply algorithm which is used, utilizing known logic techniques, and bits MPY, UNSIGN and MACC EN are obtained from microcode control, in accordance with well known techniques.
Thus, input bits to programmable array logic 26 as specified above provide the two operand extension bits at the A-inputs (first operand) and the B-inputs (second operand) of adder unit 27. The resulting addition produces the extension bits DSTX and DSTY for the next partial product as shown in FIG. 8. The required CRY0 bit is also input to adder unit 27 as mentioned above. In the PAL unit 26 the LDQB, TCRY0 and QBIN signals are used for purposes other than in the multiply or divide operations and need not be discussed further.
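Why a small adder can supply the extra bits is easy to check numerically: the top two bits of a widened sum depend only on the two extension bits of each operand and on the carry out of the main word. The Python sketch below is a minimal model with hypothetical names. It does not reproduce the PAL equations; it only shows that a main adder plus a 2-bit extension adder gives the same result as one wider adder.

```python
def extended_add(first, second, first_ext, second_ext, carry_in=0, n=32):
    """Add two n-bit words that each carry 2 extension bits.

    The n-bit main addition and the 2-bit extension addition are modelled
    separately; the only link between them is the carry out of the main word.
    Returns the (n + 2)-bit result, with the extension bits on top.
    """
    mask = (1 << n) - 1
    main = (first & mask) + (second & mask) + carry_in
    carry_out = main >> n                       # carry out of the most significant slice
    ext = (first_ext + second_ext + carry_out) & 0b11
    return (ext << n) | (main & mask)


def sign_extension_bits(value):
    """Two extension bits for a sign-extended operand: 11 if negative, else 00."""
    return 0b11 if value < 0 else 0b00
```

For sign-extended operands the result matches a direct (n + 2)-bit two's complement addition, so no additional 4-bit slice is needed just to hold the extra bits.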
With reference to division operations, in accordance with the aforementioned conventional divide algorithms, the first operand is the dividend and the second operand is the divisor. In an unsigned divide operation, during each divide cycle the first operand is always extended by the most significant bit of the result from the last divide cycle (identified as the LINK bit, as shown in FIG. 8) while the second operand is always zero extended, as follows:
UNSIGNED DIVISION

1st OPERAND     LINK     LINK
2nd OPERAND     +0       +1

The TCRYY bit signifies whether an "add" or "subtract" is required, as follows:
The carry bit (CRYY) from adder unit 27 resulting from the above additions of the LINK bit and the +0 bit is the quotient bit and such bit is registered for the next divide cycle, the registered value being designated the TCRYY bit. The PAL unit 26 and adder 27, thus, determine the carry bit required for the divide operation from the extended bits of the dividend (first operand) and the divisor (second operand). The signal DIVD is "true" during divide cycles as set by microcode.
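The way the quotient bit of one cycle selects the operation of the next is the defining feature of non-restoring division. The Python sketch below is a textbook version for unsigned operands, offered as an illustration rather than as the patent's exact cycle sequence; the function and variable names are hypothetical, with `prev_q` playing the role that the text assigns to the registered quotient bit.

```python
def nonrestoring_divide(dividend, divisor, n):
    """Unsigned non-restoring division of an n-bit dividend.

    Returns (quotient, remainder). The quotient bit of the previous cycle
    decides whether the current cycle subtracts or adds the divisor.
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    remainder = 0
    quotient = 0
    prev_q = 1                        # the first cycle always subtracts
    for i in range(n - 1, -1, -1):
        # Shift in the next dividend bit, most significant first.
        remainder = 2 * remainder + ((dividend >> i) & 1)
        if prev_q:
            remainder -= divisor      # previous quotient bit 1: subtract
        else:
            remainder += divisor      # previous quotient bit 0: add
        prev_q = 1 if remainder >= 0 else 0
        quotient = (quotient << 1) | prev_q
    if remainder < 0:
        remainder += divisor          # single correction step at the end
    return quotient, remainder
```

The partial remainder can go negative between cycles, which is why the dividend and divisor need one bit beyond the word width.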
Accordingly, the use of PAL unit 26 and adder unit 27 provides for the extension requirements in both a multiply and a divide operation without requiring the use of an additional 4-bit microprocessor slice logic unit of the type depicted for the 32 bits shown in FIGS. 2 or for 8 bits shown in FIG. 5.
THE EMBODIMENTS OF THE INVENTION IN WHICH AN EXCLUSIVE PROPERTY OR PRIVILEGE IS CLAIMED ARE DEFINED AS FOLLOWS:
Priority Applications (3)
|Application Number||Priority Date||Filing Date||Title|
|US06256772 US4405992A (en)||1981-04-23||1981-04-23||Arithmetic unit for use in data processing systems|
|CA 401467 CA1175572A (en)||1981-04-23||1982-04-22||Arithmetic unit for use in data processing systems|
|Publication Number||Publication Date|
|CA1193022A true CA1193022A (en)||1985-09-03|
Family Applications (1)
|Application Number||Title||Priority Date||Filing Date|
|CA 459822 Expired CA1193022A (en)||1981-04-23||1984-07-26||Arithmetic unit for use in data processing systems|
Country Status (1)
|CA (1)||CA1193022A (en)|
Also Published As
|Publication number||Publication date||Type|
|US6687722B1 (en)||High-speed/low power finite impulse response filter|
|US5732007A (en)||Computer methods and apparatus for eliminating leading non-significant digits in floating point computations|
|US6256655B1 (en)||Method and system for performing floating point operations in unnormalized format using a floating point accumulator|
|US5278783A (en)||Fast area-efficient multi-bit binary adder with low fan-out signals|
|US4228520A (en)||High speed multiplier using carry-save/propagate pipeline with sparse carries|
|US5787030A (en)||Correct and efficient sticky bit calculation for exact floating point divide/square root results|
|US4737926A (en)||Optimally partitioned regenerative carry lookahead adder|
|US5943250A (en)||Parallel multiplier that supports multiple numbers with different bit lengths|
|Lang et al.||Floating-point multiply-add-fused with reduced latency|
|US5161117A (en)||Floating point conversion device and method|
|US6401194B1 (en)||Execution unit for processing a data stream independently and in parallel|
|Santoro et al.||Rounding algorithms for IEEE multipliers|
|US6377970B1 (en)||Method and apparatus for computing a sum of packed data elements using SIMD multiply circuitry|
|US4901270A (en)||Four-to-two adder cell for parallel multiplication|
|US5991785A (en)||Determining an extremum value and its index in an array using a dual-accumulation processor|
|US4622650A (en)||Circuitry for generating scalar products and sums of floating point numbers with maximum accuracy|
|US4926370A (en)||Method and apparatus for processing postnormalization and rounding in parallel|
|US5465226A (en)||High speed digital parallel multiplier|
|US5157624A (en)||Machine method to perform newton iterations for reciprocal square roots|
|US6996596B1 (en)||Floating-point processor with operating mode having improved accuracy and high performance|
|US5257215A (en)||Floating point and integer number conversions in a floating point adder|
|US5220524A (en)||Machine method to perform newton iterations for reciprocals|
|US4868777A (en)||High speed multiplier utilizing signed-digit and carry-save operands|
|US5508952A (en)||Carry-lookahead/carry-select binary adder|
|US6633895B1 (en)||Apparatus and method for sharing overflow/underflow compare hardware in a floating-point multiply-accumulate (FMAC) or floating-point adder (FADD) unit|