WO2002046910A1 - Carry lookahead adder for different data types - Google Patents

Info

Publication number
WO2002046910A1
Authority
WO
WIPO (PCT)
Prior art keywords
adder
sub
cells
cell
bit
Prior art date
Application number
PCT/GB2001/005358
Other languages
French (fr)
Inventor
Neil Burgess
Original Assignee
University College Cardiff Consultants Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from GB0029538A external-priority patent/GB0029538D0/en
Priority claimed from GB0106600A external-priority patent/GB0106600D0/en
Application filed by University College Cardiff Consultants Limited filed Critical University College Cardiff Consultants Limited
Priority to AU2002220883A priority Critical patent/AU2002220883A1/en
Publication of WO2002046910A1 publication Critical patent/WO2002046910A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50Adding; Subtracting
    • G06F7/505Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination
    • G06F7/506Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination with simultaneous carry generation for, or propagation over, two or more stages
    • G06F7/508Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination with simultaneous carry generation for, or propagation over, two or more stages using carry look-ahead circuits
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2207/00Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F2207/506Indexing scheme relating to groups G06F7/506 - G06F7/508
    • G06F2207/50632-input gates, i.e. only using 2-input logical gates, e.g. binary carry look-ahead, e.g. Kogge-Stone or Ladner-Fischer adder

Definitions

  • the operation of the prefix carry tree in Figure 8 proceeds as follows: the first (lefthand-most) column of cell 5's combines the carry conditions of selected pairs of adjacent bits to form group carry conditions, $CX_{i+1}^{i}$.
  • the top cell 5 combines g(0), $\neg k(0)$, g(1), and $\neg k(1)$ to form the bit-pair ($G_1^0$, $\neg K_1^0$), also referred to as $CX_1^0$;
  • the next cell down combines g(2), $\neg k(2)$, g(3), and $\neg k(3)$ to determine $CX_3^2$
  • the third cell down combines g(4), $\neg k(4)$, g(5), and $\neg k(5)$ to determine $CX_5^4$
  • the bottom cell combines g(6), $\neg k(6)$, g(7), and $\neg k(7)$ to determine $CX_7^6$.
  • the second column of cell 5's combines the outputs of the first column of cell 5's and some of the outputs of cell 1's to yield further carry conditions (reading from top to bottom): $CX_2^0$, $CX_3^0$, $CX_6^4$, and $CX_7^4$ as follows: the top cell combines $G_1^0$, $\neg K_1^0$, g(2) and $\neg k(2)$ to determine $CX_2^0$; the next cell down combines $G_1^0$, $\neg K_1^0$, $G_3^2$, and $\neg K_3^2$ to determine $CX_3^0$; the third cell down combines $G_5^4$, $\neg K_5^4$, g(6) and $\neg k(6)$ to determine $CX_6^4$; the bottom cell combines $G_5^4$, $\neg K_5^4$, $G_7^6$, and $\neg K_7^6$ to determine $CX_7^4$.
  • $CX_6^0$, and $CX_7^0$ if mode = 0. That is, if any of the four cell 7's receive a CB input condition, they will change it to a CP condition. Similarly, if the top cell 6 receives a CB input condition on the bit-pair $\neg k(4)$, g(4), it will output a CP condition on $G_4^0$, $\neg K_4^0$; the other cell 6's operate in this manner too, converting CB conditions received on $G_i^4$, $\neg K_i^4$ to CP conditions on their outputs, $G_i^0$, $\neg K_i^0$ for 5 ≤ i ≤ 7.
  • the third cell down combines $G_6^4$, $\neg K_6^4$, $G_3^0$, and $\neg K_3^0$ to determine $CX_6^0$; the bottom cell combines $G_7^4$, $\neg K_7^4$, $G_3^0$, and $\neg K_3^0$ to determine $CX_7^0$.
  • no CB conditions are output from the column of cell 6's and cell 7's.
  • the column of cell 2a's combines the carry conditions, $CX_i^0$, with the control signal inc if 1 is to be added to the result to adjust the carry conditions from $CP_i^0$ to $CG_i^0$ as needed.
  • the last column of cell 3's XOR's the carry bits, c(i), with the bit propagate signals, p(i), to return the resultant sum bits.
  • the operation of the prefix carry tree in Figure 9 proceeds in a largely similar fashion to that of Figure 8, except that there are no cell 7's in the 4th column and the control signal is abs, not inc.
  • the first (lefthand-most) column of cell 4's and cell 5's combines the carry conditions of selected pairs of adjacent bits to form group carry conditions, $CX_{i+1}^{i}$.
  • the second column of cell 4's and cell 5's combines the outputs of the first column of cell 4's and cell 5's and some of the outputs of cell 1's to yield the following carry conditions (reading from top to bottom): $CX_2^0$, $CX_3^0$,
  • packed prefix adder has been described above using the group generate and group not kill signals, $G_z^x$ and $\neg K_z^x$.
  • packed arithmetic prefix adders may also be constructed using the codings laid out in either Table 14 or Table 15 in conjunction with Tables 4 and 6, that is, any "2-out-of-3" combination of g(i), p(i), and $\neg k(i)$ may be employed in the prefix tree.
  • different logic expressions from those presented in equations (3-6) would result.
  • the same expressions for g(i) and $\neg k(i)$ are required as listed in equation (7), with p(i) unaffected by the value of mode.
  • the topology of the trees described above may place restrictions on the placement of the special input cells.
  • Fig. 12 is a modified representation of the layout of cells in a 16 bit Ladner-Fischer parallel prefix tree.
  • the illustrated device may be better understood by considering Fig. 12 in conjunction with Fig. 8.
  • the modification of Fig. 12 proposes that: firstly, the cell 7's are removed; secondly, the cells 5 in rows 2 and 3 and the right hand most cell 5 in row 4 are changed to cell 6's; and finally, any of the cell 1's may be converted to cell 8's except the one in row 1, i.e., at bit position 0.

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Mathematical Optimization (AREA)
  • General Engineering & Computer Science (AREA)
  • Logic Circuits (AREA)

Abstract

An adder is proposed which has circuitry for calculating the sum of or difference between pairs of unpacked binary numbers having $2^n$ bits or packed binary numbers having $2^{n-m}$ bits where m < n. The adder has $2^m$ sub-adders, with each sub-adder partition including a plurality of columns and a plurality of rows of cells. The columns of cells each have an input cell in the lowermost row for receiving bits of each of the pairs of numbers. Each sub-adder above the lowest significance sub-adder has a lowest significance column input cell arranged to receive a third input bit. The cells in the remaining rows of the or each sub-adder above the lowest significance sub-adder are arranged to prevent the carry-over of a carry bit from the most significant column of the preceding sub-adder being introduced into the sub-adder, depending on whether the third input bit is zero or one.

Description

CARRY LOOKAHEAD ADDER FOR DIFFERENT DATA TYPES
The present invention relates to addition circuitry, particularly to packed arithmetic prefix adder circuitry, often referred to as "carry lookahead" adders or adder trees, and, more particularly, to prefix adder circuitry capable of calculating the sum of or difference between pairs of packed or unpacked binary numbers.
Multimedia processor chips (and others) make much use of "packed" arithmetic operations, in which long wordlength numbers are optionally treated as several independent shorter wordlength numbers. For example, a 32-bit word may be treated as two separate 16-bit words or as four 8-bit words. Moreover, a common arithmetic operation used in video processing is "absolute difference", denoted |A - B|, in which the magnitude of the difference between A and B is calculated. This operation is used in video motion estimation and prediction algorithms. Hence, a most valuable operation is a "packed absolute difference" operation which returns the absolute differences of several independent short numbers simultaneously.
Ordinarily, absolute differences are computed by performing a subtraction operation followed by a separate "absolute value" operation which returns the magnitude of a signed number. Absolute differences can be obtained by computing both A - B and B - A, and using the signs of the two results to select the positive result, which corresponds to the absolute difference. However, this is wasteful and a better technique is sought. This document describes how parallel prefix adder trees - widely used in VLSI processor chips - may be modified to support packed arithmetic operations, including packed absolute difference calculations.
Parallel prefix carry-lookahead adders are a popular VLSI design technique that accelerates a w-bit addition by means of a parallel prefix tree.
The addition process at each bit position can be defined in terms of signals as follows in relation to Figure 1.
A block diagram of a prefix adder is illustrated in Figure 1, where the adder is seen to consist of three blocks: input bit propagate, generate, and not kill cells; the prefix tree; output sum cells. The input cells derive the bit propagate, generate, and not kill signals according to:
$p(i) = a(i) \oplus b(i)$
$g(i) = a(i) \wedge b(i)$
$\neg k(i) = a(i) \vee b(i)$ (1)
respectively, where:
g(i) is called the bit generate condition, a value 1 indicating that the bits a(i) and b(i) produce an output carry bit (c(i)) irrespective of the incoming carry,
p(i) is called the bit propagate condition, a value 1 indicating that the bits a(i) and b(i) produce an output carry bit (c(i)) only if there is an incoming carry, and
$\neg k(i)$ is called the not kill bit condition, a value 0 indicating that the bits a(i) and b(i) produce no output carry bit (the symbol "$\neg$" being used to indicate the NOT condition);
c(i) is a binary carry bit which is received by each next most significant bit position, c(i) having the same significance as a(i) and b(i).
The prefix tree combines the bit generate and bit not kill signals to derive "group generate" and "group not kill" signals, denoted $G_{i-1}^0$ and $\neg K_{i-1}^0$ respectively. Next, the "group generate" and "group not kill" signals are combined with control signals, denoted "inc" and "abs", to derive carry signals c(i):
Incremented Sum, A + B + 1: $c(i) = G_{i-1}^0 \vee (\neg K_{i-1}^0 \wedge inc)$ (2a)
Absolute Difference, |A - B|: $c(i) = (G_{i-1}^0 \equiv abs) \vee (\neg K_{i-1}^0 \wedge abs)$ (2b)
That is, if "inc" = 1, A + B + 1 is computed instead of A + B, and if "abs" = 0, B - A is computed instead of A - B. Finally, the carry signals are XOR'd with the bit propagate signals to return the result:
$s(i) = p(i) \oplus c(i)$ (3)
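Equations (1) to (3) can be exercised with a short Python sketch. It is not the patented circuit: a serial scan stands in for the parallel prefix tree, the empty group below bit 0 is assumed to behave as a propagate condition (G = 0, not-kill = 1), and the subtraction path assumes that B is supplied inverted and that `abs_` is chosen from the comparison A ≥ B. The function names are hypothetical.

```python
def carry_bits(a, b, width, inc=0, abs_=None):
    """Carry into each bit position, following equations (2a) and (2b).

    A serial scan replaces the prefix tree (assumption of this sketch).
    """
    G, NK = 0, 1  # assumption: the empty group below bit 0 acts as "propagate"
    carries = []
    for i in range(width):
        if abs_ is None:
            c = G | (NK & inc)                  # equation (2a)
        else:
            c = int(G == abs_) | (NK & abs_)    # equation (2b)
        carries.append(c)
        ai, bi = (a >> i) & 1, (b >> i) & 1
        g, nk = ai & bi, ai | bi                # equation (1)
        G, NK = g | (nk & G), nk & (NK | g)     # extend the group by bit i
    return carries


def result(a, b, width, inc=0, abs_=None):
    """Sum bits s(i) = p(i) XOR c(i), equation (3), packed into an int."""
    c = carry_bits(a, b, width, inc, abs_)
    s = 0
    for i in range(width):
        p = ((a >> i) ^ (b >> i)) & 1
        s |= (p ^ c[i]) << i
    return s


def abs_diff(a, b, width):
    """|a - b| from a + NOT b; picking abs_ from a >= b is an assumption."""
    not_b = ~b & ((1 << width) - 1)
    return result(a, not_b, width, abs_=int(a >= b))
```

With `abs_=1` the carries reduce to those of A + NOT B + 1, which is A - B; with `abs_=0` every carry is the complement of the group generate signal, which yields the bitwise complement of A + NOT B, which is B - A.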
The present invention is aimed at modifying the prefix tree such that it returns group generate and group not kill signals for use in either full-length or packed arithmetic calculations to provide added functionality.
Some background information to this process is desirable. The prefix tree converts the input bit generate and bit not kill signals, g(i) and $\neg k(i)$, into group generate and group not kill signals, $G_i^0$ and $\neg K_i^0$, through a number of levels of logic operations. In general, $G_z^x$ represents a "group generate" signal across the bits from significance x up to and including significance z, and $\neg K_z^x$ represents a "group not kill" signal across the same significances. It should be noted that $G_i^i = g(i)$ and $K_i^i = k(i)$. Each level of logic in the tree widens the range of the groups until the lower value of the range covered by the group is 0, and the upper value is i.
The bit combinations of ($G_z^x$, $\neg K_z^x$) may be interpreted in terms of carry conditions, c(i), as shown in Table 1.
Table 1 Interpretation of ($G_z^x$, $\neg K_z^x$) bit combinations
Figure imgf000005_0001
where 'X' denotes a "don't care" condition (i.e. either a logic '1' or a logic '0'). Pairs of group signals are combined to yield compound group signals, $C_z^x$, from pairs of group signals, $C_z^y$ and $C_w^x$, as shown in Table 2, where z ≥ y, z ≥ w, w ≥ x, and y ≤ w+1.
Table 2 Prefix adder cell function
Figure imgf000005_0002
Adopting the coding of Table 1 for the CK, CP, and CG conditions yields Table 3, which gives the required logic operations for the prefix tree.
Table 3 Prefix adder cell logic
Figure imgf000005_0003
Figure 2 shows the prefix tree proposed by Ladner and Fischer. The black squares are prefix cells, which implement the equation pair:
Figure imgf000006_0001
Both the group generate and the group not kill expressions are implementable as individual CMOS logic gates, and exploit the don't cares in Table 3 so as to minimise the complexity of the equation pair. At the output of the prefix tree, a pattern of carry conditions always emerges which comprises a string of CG & CK conditions, followed by a string (possibly null) of CP conditions. The trailing string of CP conditions identifies the trailing string of sum bits that must change from 1 to 0 when the sum is incremented, whence equation (2a).
According to the present invention there is provided an adder having circuitry for calculating the sum of or difference between pairs of unpacked binary numbers having $2^n$ bits or packed binary numbers having $2^{n-m}$ bits where m < n, including: $2^m$ sub-adders, each sub-adder partition including a plurality of columns and a plurality of rows of cells, each column of cells having an input cell in the lowermost row for receiving bits of each of the pairs of numbers, each sub-adder above the lowest significance sub-adder having a lowest significance column input cell arranged to receive a third input bit, and the cells in the remaining rows of the or each sub-adder above the lowest significance sub-adder being arranged to prevent the carry-over of a carry bit from the most significant column of the preceding sub-adder being introduced into the sub-adder, depending on whether the third input bit is zero or one.
Preferably, the lowest significance column input cells of the lowest significance sub-adder is the same as the lowest significance column input cells of the or each sub-adder above the lowest significance sub-adder.
Preferably, cells in the remaining rows below the uppermost row of the or each sub-adder having a lowest significance column input cell arranged to receive a third input bit, may include operational logic as set out in Table 5. In one form of adder, the cells in the uppermost row of the or each sub-adder above the lowest significance sub-adder may include operational logic as set out in Table 6.
In an alternative form, the cells in the uppermost row of the lowest significance sub-adder may include operational logic as set out in Table 7 and the cells in the uppermost row of the or each sub-adder above the lowest significance sub-adder may include operational logic as set out in Table 6. The lowest significance column input cells of the lowest significance sub-adder may be the same as the lowest significance column input cells of the or each sub-adder above the lowest significance sub-adder.
Examples of parallel prefix adders in accordance with the present invention will now be described with reference to the accompanying drawings, in which:-
Figure 1 is a block diagram of a generic parallel prefix adder;
Figure 2 is a representation of the layout of cells in a 16-bit conventional Ladner-Fischer parallel prefix tree;
Figure 3 is a representation of the layout of cells in a 16-bit Ladner-Fischer parallel prefix tree as modified in accordance with the invention;
Figures 4 & 5 show the topology of the Ladner-Fischer parallel prefix adder including the parallel prefix tree shown in Figure 2;
Figure 6 illustrates the logic gates of the cells of the parallel prefix tree shown in Figure 5;
Figures 7a-e show the logic diagrams of the numbered cells used to make up the cells of Figures 4, 5, 8 & 9;
Figures 8 & 9 show the topology of the Ladner-Fischer parallel prefix adder including the parallel prefix tree shown in Figure 3;
Figures 10a-d show the logic diagrams of additional numbered cells used to make up the cells of Figures 8 & 9;
Figure 11 illustrates the logic gates of the cells of the parallel prefix tree shown in Figure 9; and
Figure 12 is a modified representation of the layout of cells in a 16-bit Ladner-Fischer parallel prefix tree.
The key issue for designing a packed arithmetic prefix tree is to arrange for successive independent strings of the general pattern {CG|CK} : {CP} to be returned so that the same structure for both full wordlength (w-bit) and sub-wordlength arithmetic may be employed.
The invention recognises that the introduction of a fourth symbol, denoted CB (B for "block"), that exploits the don't care states available in the prefix tree, and which replaces the CP condition at the lowest significant bit (lsb) of a sub-adder can lead to improved operation by redesign of appropriate cells.
The required characteristics of the CB condition are:
(a) unlike the CP condition, it must prevent bits with lower significances from interacting with bits with higher significances;
(b) in common with the CP condition, it must return a string of output CP conditions at the prefix tree's output from each sub-adder.
If these constraints can be met, a w-bit adder is partitionable into independent sub-wordlength adders.
The CB condition is representable by the combination ($G_z^x$, $\neg K_z^x$) = (1,0), implying that the CG condition must be represented by ($G_z^x$, $\neg K_z^x$) = (1,1), and not ($G_z^x$, $\neg K_z^x$) = (1,X). The codings for the CK and CP conditions remain unchanged. Table 4 below gives the input-output relationship for the packed prefix adder. It should be noted that Table 2 is subsumed by Table 4, indicating that ordinary addition operations are also supported if no CB conditions are introduced to the prefix tree.
Table 4 Packed prefix adder cell function
Figure imgf000008_0001
Table 5 presents the same information using the coding scheme for the different carry conditions just described.
Table 5 Packed prefix adder cell logic
Figure imgf000008_0002
The major effect of this cell's function relative to a normal prefix cell's function is to return CB conditions, ($G_z^x$, $\neg K_z^x$) = (1,0), instead of CP conditions, ($G_z^x$, $\neg K_z^x$) = (0,1), at the prefix tree's output. The logic equations described by Table 5 are:
Figure imgf000009_0001
$\neg K_z^x = \neg K_z^y \wedge (\neg K_w^x \vee G_z^y)$ (5)
Both expressions are implementable as single CMOS logic gates.
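The behaviour required of the packed prefix cell can be checked with a small Python sketch. The not-kill output follows equation (5). The generate output is an assumption, since that equation survives only as an image here: the sketch uses the usual prefix form, generate = upper generate OR (upper not-kill AND lower generate). The coding CK = (0,0) is likewise inferred from the bit-level definitions, and the function name is hypothetical.

```python
# (G, notK) codings: CP, CG and CB as stated in the text; CK = (0, 0) is inferred.
CK, CP, CG, CB = (0, 0), (0, 1), (1, 1), (1, 0)


def packed_prefix_cell(upper, lower):
    """Combine the more significant group (upper) with the less significant one (lower)."""
    g_zy, nk_zy = upper
    g_wx, nk_wx = lower
    g_zx = g_zy | (nk_zy & g_wx)      # assumed form of the generate equation
    nk_zx = nk_zy & (nk_wx | g_zy)    # equation (5)
    return (g_zx, nk_zx)


# An upper CB hides whatever arrives from the lower group,
# while an upper CP passes the lower condition through unchanged.
for lower in (CK, CP, CG, CB):
    print(lower, "->", packed_prefix_cell(CB, lower), packed_prefix_cell(CP, lower))
```

Under these assumptions an upper CK, CG or CB is returned regardless of the lower group, which is the blocking property required of CB in characteristic (a).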
Now, because the sub-adders have a shorter wordlength than the entire adder, the bottom row of cells is logically redundant. However, in order to support the packed absolute difference operation, the output CB's must be converted to output CP's. This is readily accomplished by defining a second cell for the packed prefix adder that operates as a normal prefix cell if no CB's are input, but which also converts CB's to CP's. This cell is placed only at the foot of each column in the prefix tree, as illustrated in Figure 3, where the grey squares denote the second cell type.
The function and logic of the second cell are described in Tables 6 and 7 below. Cells with only one apparent input are reduced complexity cells whose sole function is to convert input CB conditions to output CP conditions.
Table 6 Second prefix adder cell function
Figure imgf000009_0002
Table 7 Second prefix adder cell logic
Figure imgf000009_0003
The logic equations described by Table 7 are:
$G_z^x = \neg K_z^y \wedge (G_z^y \vee G_w^x)$
Figure imgf000010_0001
Again, both expressions are implementable as single CMOS logic gates. The cells with only one input (g, $\neg k$) pair have the following simplified expressions:
Figure imgf000010_0002
i.e. compared with equation (6), $G_w^x$ is set to a logic '0', and $\neg K_w^x$ to a logic '1'. Again, both these expressions are implementable as single CMOS logic gates.
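The single-input conversion cell can be sketched in Python as follows. The generate output follows from the Table 7 generate equation with the lower generate input tied to '0'. The not-kill output is an assumption, because that expression survives only as an image here: the sketch uses generate OR not-kill, chosen so that CB becomes CP and every other condition is unchanged. The coding CK = (0,0) and the function name are also assumptions.

```python
# (G, notK) codings: CP, CG and CB as stated in the text; CK = (0, 0) is inferred.
CK, CP, CG, CB = (0, 0), (0, 1), (1, 1), (1, 0)


def single_input_conversion_cell(cond):
    """Convert an input CB to CP and pass CK, CP and CG through unchanged."""
    g, nk = cond
    g_out = nk & g     # Table 7 generate equation with the lower generate set to 0
    nk_out = g | nk    # assumed not-kill expression (not taken from the source)
    return (g_out, nk_out)


print(single_input_conversion_cell(CB))  # CB is mapped to the CP coding
```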
The inputs to the adder where a CB condition may need to be injected to mark the lsb's of sub-adders require extra logic to return the correct values of (g, $\neg k$). CG and CK conditions prevent carries from lower significances interacting with carries at higher significances in any case: hence, we only need to replace input CP conditions by input CB conditions where necessary. The full Table for deriving (g, $\neg k$) duples under normal or packed operation as a function of a control signal labelled mode is shown in Table 8:
Table 8 Input logic for packed prefix tree
Figure imgf000010_0003
The equations implied by Table 8 are again implementable using single CMOS logic gates whose inputs are a, b, and mode:
$\neg k = (a \wedge b) \vee (\neg mode \wedge (a \vee b))$
$g = (a \wedge b) \vee (mode \wedge (a \vee b))$ (8)
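Equation (8) can be tabulated with a few lines of Python to see which carry condition each input combination produces. This is only an illustrative sketch; the function name and the use of 0/1 integers for logic levels are assumptions.

```python
def mode_input_cell(a, b, mode):
    """Return the (g, notk) pair of equation (8) for single-bit inputs a, b, mode."""
    not_mode = 1 - mode
    nk = (a & b) | (not_mode & (a | b))
    g = (a & b) | (mode & (a | b))
    return (g, nk)


for mode in (0, 1):
    for a, b in ((0, 0), (0, 1), (1, 0), (1, 1)):
        print(mode, a, b, mode_input_cell(a, b, mode))
```

With mode = 0 the cell reduces to g = a AND b and not-kill = a OR b, the ordinary input cell. With mode = 1 the only change is that differing inputs give (1,0), the CB coding, instead of (0,1), the CP coding.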
Once the carry conditions (following either a full wordlength or packed arithmetic operation) emerge from the prefix tree, they must be combined with the bit propagate signals and other control signals so as to return the required results, according to equation (2). This involves supplying the correct control signals to the appropriate sub-adders so that the desired result is computed using some simple output logic.
For packed arithmetic to be supported, the topology of the prefix tree is restricted such that for any sub-adder traversing bits j to j+k (i.e. a (k+1)-bit sub-adder, whose lsb is at bit position j), the prefix tree's topology must be able to return the group signals, $C_i^j$, for values of i that satisfy j ≤ i ≤ j+k at the penultimate logic level. This restriction is to accommodate the necessary CB to CP conversions in packed arithmetic mode within an existing prefix tree. For example, the Ladner-Fischer prefix tree of Figures 2 and 3 does satisfy this restriction, permitting both 4-bit and 8-bit arithmetic to be supported.
Two examples of adders according to the invention will now be described with reference to Figures 8 to 11.
The individual cells used in the packed arithmetic prefix adder circuits illustrated in Figures 7 & 10 have been described earlier in this document. In summary, there are 5 distinct cell types in the invention, two of which have two versions:
Cell(s)        Figure(s)            Function
cells 2a & 2b  Figures 7a & 7b      carry update cell
cell 3         Figure 7d            output cell
cell 5         Figure 10a           PAPA prefix cell
cells 6 & 7    Figures 10b & 10c    CB:CP conversion cells
cell 8         Figure 10d           input cell
The input to output relationships of cells 2a, 2b and 3 are described by equations (2a), (2b), and (3) respectively. In tabular form, these cells' functions may be written as shown in Tables 9 to 11:
Table 9 Cell 2a input to output relationship
Figure imgf000011_0001
Table 10 Cell 2b input to output relationship
Figure imgf000012_0001
Table 11 Cell 3 input to output relationship
Figure imgf000012_0002
The function of cell 5 was presented in Table 5, the function of cell 6 was presented in Table 7 (cell 7 is derived from cell 6 by assigning $(G_{w}^{x}, \neg K_{w}^{x}) = (0, 1)$) and the input cell's function (cell 8) was shown in Table 8.
The operation of the packed arithmetic prefix adder circuit drawn in Figure 8 proceeds as follows: two 8-bit numbers, a(0:7) and b(0:7), are supplied on the eight pairs of a(i) and b(i) inputs. If the adder is to operate as a single 8-bit adder, mode is set to logical '0'; otherwise, if the adder is to operate as two independent 4-bit adders, mode is set to logical '1'. The input cells (types 1 and 8) convert the input bits to the appropriate carry conditions $CX_{i}$, for $0 \le i \le 7$, where X signifies one of G, P, K, or B, as listed in Table 12.
Table 12 Conversion of input bits to carry conditions
Figure imgf000013_0001
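The arithmetic intended in the two modes can be stated as a behavioural reference model. The Python sketch below is not the circuit itself. `packed_add_reference` is a hypothetical helper name, bit 0 is taken as the least significant bit, and the carry out of each sub-adder is simply discarded, all of which are assumptions made for illustration.

```python
def packed_add_reference(a, b, mode):
    """Reference result for the 8-bit adder: one 8-bit add or two 4-bit adds."""
    if mode == 0:
        # Single 8-bit adder; the carry out of bit 7 is dropped.
        return (a + b) & 0xFF
    # Two independent 4-bit adders; no carry crosses from bit 3 to bit 4.
    low = ((a & 0x0F) + (b & 0x0F)) & 0x0F
    high = ((a >> 4) + (b >> 4)) & 0x0F
    return (high << 4) | low


if __name__ == "__main__":
    print(hex(packed_add_reference(0x0F, 0x01, 0)))  # 0x10
    print(hex(packed_add_reference(0x0F, 0x01, 1)))  # 0x0
```

With mode = 0 the carry out of bit 3 reaches bit 4; with mode = 1 it is blocked, so the upper 4-bit result is unaffected by the lower operands.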
In the prefix carry tree, the separate carry conditions are represented as pairs of bits, $(G_{z}^{x}, \neg K_{z}^{x})$, as shown in Table 13. Cell types 5, 6, 7 and 2 can all be described using the concept of carry conditions.
Table 13 Interpretation of $(G_{z}^{x}, \neg K_{z}^{x})$ bit combinations
Figure imgf000013_0002
The operation of the prefix carry tree in Figure 8 proceeds as follows: the first (lefthand-most) column of cell 5's combines the carry conditions of selected pairs of adjacent bits to form group carry conditions, $CX_{i+1}^{i}$. The top cell 5 combines $g(0)$, $\neg k(0)$, $g(1)$, and $\neg k(1)$ to form the bit-pair $(G_{1}^{0}, \neg K_{1}^{0})$, also referred to as $CX_{1}^{0}$; the next cell down combines $g(2)$, $\neg k(2)$, $g(3)$, and $\neg k(3)$ to determine $CX_{3}^{2}$; the third cell down combines $g(4)$, $\neg k(4)$, $g(5)$, and $\neg k(5)$ to determine $CX_{5}^{4}$; and the bottom cell combines $g(6)$, $\neg k(6)$, $g(7)$, and $\neg k(7)$ to determine $CX_{7}^{6}$.
The second column of cell 5's combines the outputs of the first column of cell 5's and some of the outputs of cell 1's to yield further carry conditions (reading from top to bottom): $CX_{2}^{0}$, $CX_{3}^{0}$, $CX_{6}^{4}$, and $CX_{7}^{4}$ as follows: the top cell combines $G_{1}^{0}$, $\neg K_{1}^{0}$, $g(2)$ and $\neg k(2)$ to determine $CX_{2}^{0}$; the next cell down combines $G_{1}^{0}$, $\neg K_{1}^{0}$, $G_{3}^{2}$, and $\neg K_{3}^{2}$ to determine $CX_{3}^{0}$; the third cell down combines $G_{5}^{4}$, $\neg K_{5}^{4}$, $g(6)$ and $\neg k(6)$ to determine $CX_{6}^{4}$; the bottom cell combines $G_{5}^{4}$, $\neg K_{5}^{4}$, $G_{7}^{6}$, and $\neg K_{7}^{6}$ to determine $CX_{7}^{4}$.
Next, the column of cell 6's and cell 7's either converts CB conditions (introduced by cell 8) to CP conditions if mode = 1, or groups the outputs from selected cell 5's (and cell 8) to return $CX_{4}^{0}$, $CX_{5}^{0}$, $CX_{6}^{0}$, and $CX_{7}^{0}$ if mode = 0. That is, if any of the four cell 7's receive a CB input condition, they will change it to a CP condition. Similarly, if the top cell 6 receives a CB input condition on the bit-pair $g(4)$, $\neg k(4)$, it will output a CP condition on $G_{4}^{0}$, $\neg K_{4}^{0}$; the other cell 6's operate in this manner too, converting CB conditions received on $G_{i}^{4}$, $\neg K_{i}^{4}$ to CP conditions on their outputs, $G_{i}^{0}$, $\neg K_{i}^{0}$, for $5 \le i \le 7$. If mode = 0, no CB conditions can occur and the cell 6's combine the outputs of cell 5's and some cell 1's to yield further carry conditions (reading from top to bottom): $CX_{4}^{0}$, $CX_{5}^{0}$, $CX_{6}^{0}$, and $CX_{7}^{0}$ as follows: the top cell combines $G_{3}^{0}$, $\neg K_{3}^{0}$, $g(4)$ and $\neg k(4)$ to determine $CX_{4}^{0}$; the next cell down combines $G_{5}^{4}$, $\neg K_{5}^{4}$, $G_{3}^{0}$, and $\neg K_{3}^{0}$ to determine $CX_{5}^{0}$; the third cell down combines $G_{6}^{4}$, $\neg K_{6}^{4}$, $G_{3}^{0}$, and $\neg K_{3}^{0}$ to determine $CX_{6}^{0}$; the bottom cell combines $G_{7}^{4}$, $\neg K_{7}^{4}$, $G_{3}^{0}$, and $\neg K_{3}^{0}$ to determine $CX_{7}^{0}$. In all cases, no CB conditions are output from the column of cell 6's and cell 7's.
Next, the column of cell 2a's combines the carry conditions, $CX_{i}^{0}$, with the control signal inc, which is asserted if 1 is to be added to the result, so as to adjust the carry conditions from $CP_{i}^{0}$ to $CG_{i}^{0}$ as needed. Finally, the last column of cell 3's XORs the carry bits, $c(i)$, with the bit propagate signals, $p(i)$, to return the resultant sum bits.
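The column-by-column walk through Figure 8 can be mimicked at the level of carry conditions rather than gates. The Python sketch below rests on several assumptions that are labelled in the code. The combination rules in `cell5`, `cell6` and `cell7` are inferred from the prose, not copied from Tables 5 and 7. Only the bit 4 input is treated as a cell 8. The control signal inc is fixed at 0. In packed mode the carry into bit 4 is forced to 0 at the output stage. All function names are hypothetical.

```python
def input_condition(a_bit, b_bit, special):
    """Single-bit carry condition; `special` marks a cell 8 in packed mode."""
    if a_bit & b_bit:
        return "G"
    if not (a_bit | b_bit):
        return "K"
    return "B" if special else "P"


def cell5(hi, lo):
    """Assumed prefix rule: P passes on the lower condition; G, K and B win."""
    return lo if hi == "P" else hi


def cell6(hi, lo):
    """Assumed conversion rule: as cell 5, but B on the upper input becomes P."""
    return "P" if hi == "B" else cell5(hi, lo)


def cell7(cond):
    """Assumed rule: convert B to P, pass everything else through."""
    return "P" if cond == "B" else cond


def figure8_add(a, b, mode):
    """8-bit add (mode 0) or two 4-bit adds (mode 1), with inc fixed at 0."""
    abit = [(a >> i) & 1 for i in range(8)]
    bbit = [(b >> i) & 1 for i in range(8)]
    cx = [input_condition(abit[i], bbit[i], mode == 1 and i == 4)
          for i in range(8)]
    # First column of cell 5's: adjacent bit pairs.
    cx10, cx32 = cell5(cx[1], cx[0]), cell5(cx[3], cx[2])
    cx54, cx76 = cell5(cx[5], cx[4]), cell5(cx[7], cx[6])
    # Second column of cell 5's.
    cx20, cx30 = cell5(cx[2], cx10), cell5(cx32, cx10)
    cx64, cx74 = cell5(cx[6], cx54), cell5(cx76, cx54)
    # Column of cell 7's (bits 0 to 3) and cell 6's (bits 4 to 7).
    final = [cell7(cx[0]), cell7(cx10), cell7(cx20), cell7(cx30),
             cell6(cx[4], cx30), cell6(cx54, cx30),
             cell6(cx64, cx30), cell6(cx74, cx30)]
    # Carry update with inc = 0: only a CG condition yields a carry.
    carry_in = [0] + [1 if c == "G" else 0 for c in final[:7]]
    if mode == 1:
        carry_in[4] = 0  # assumption: the upper sub-adder starts with carry-in 0
    # Output cells: sum bit = bit propagate XOR incoming carry.
    total = 0
    for i in range(8):
        total |= ((abit[i] ^ bbit[i]) ^ carry_in[i]) << i
    return total
```

Under these assumptions, `figure8_add(0x0F, 0x01, 0)` returns `0x10`, whereas `figure8_add(0x0F, 0x01, 1)` returns `0x00`, because the carry out of bit 3 does not enter the upper 4-bit sub-adder.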
The operation of the prefix carry tree in Figure 9 proceeds in a largely similar fashion to that of Figure 8, except that there are no cell 7's in the 4th column and the control signal is abs, not inc. As before, the first (lefthand-most) column of cell 4's and cell 5's combines the carry conditions of selected pairs of adjacent bits to form group carry conditions, $CX_{i+1}^{i}$. The second column of cell 4's and cell 5's combines the outputs of the first column of cell 4's and cell 5's and some of the outputs of cell 1's to yield the following carry conditions (reading from top to bottom): $CX_{2}^{0}$, $CX_{3}^{0}$, $CX_{6}^{4}$, and $CX_{7}^{4}$. Again, the column of cell 6's either converts CB conditions (introduced by cell 8) to CP conditions if mode = 1, or groups the outputs from selected cell 5's (and cell 8) to return $CX_{4}^{0}$, $CX_{5}^{0}$, $CX_{6}^{0}$, and $CX_{7}^{0}$ if mode = 0. In either case, no CB conditions are output from the column of cell 6's. Next, the column of cell 2b's combines the carry conditions, $CX_{i}^{0}$, with the control signal abs to return the final carry signals, $c(i+1)$. Finally, the last column of cell 3's XORs the carry bits, $c(i)$, with the bit propagate signals, $p(i)$, to return the resultant sum bits.
The packed prefix adder has been described above using the group generate and group not-kill signals, $G_{z}^{x}$ and $\neg K_{z}^{x}$. However, packed arithmetic prefix adders may also be constructed using the codings laid out in either Table 14 or Table 15 in conjunction with Tables 4 and 6; that is, any "2-out-of-3" combination of $g(i)$, $p(i)$, and $\neg k(i)$ may be employed in the prefix tree. Different logic expressions from those presented in equations (3-6) would then result. At the input logic stage, the same expressions for $g(i)$ and $\neg k(i)$ are required as listed in equation (7), with $p(i)$ unaffected by the value of mode.
Table 14 Interpretation of $(G_{z}^{x}, P_{z}^{x})$ bit combinations
Figure imgf000015_0001
Table 15 Interpretation of $(\neg K_{z}^{x}, P_{z}^{x})$ bit combinations
Figure imgf000015_0002
The topology of the trees described above may place restrictions on the placement of the special input cells.
This disadvantage can be overcome by changing the topology of the tree as shown in Fig. 12, where the black and grey squares represent the same cells as in Fig. 3. This new topology causes all of the restrictions on the topology of the tree and on the placement of the special input cells to disappear. This follows because in the adder structure, every output from the prefix cell is a function of all the input positions up to and including its own significance. Hence, provided the outputs can be restricted to the set {CP,CG,CK} as in a non-packed prefix adder, no restrictions are placed on the location of the special input cells that introduce the CB conditions. The grey cells have the property that they only output the CB condition if they receive a CB condition on their "horizontal" input from a cell of lower significance and simultaneously receive a CP condition on their "vertical" input from the same significance.
Hence, by inspection of Fig. 12, it will be appreciated that the only way any of the grey cells can output a CB condition is if a CB condition is injected at bit 0. To prevent this from happening, the constraint is imposed that bit 0 may not have a special input cell that partitions the adder. This is not a serious constraint because the adder begins at bit 0 in any case.
Fig. 12 is a modified representation of the layout of cells in a 16-bit Ladner-Fischer parallel prefix tree. The illustrated device may be better understood by considering Fig. 12 in conjunction with Fig. 8. Relative to Fig. 8, the modification of Fig. 12 proposes that: firstly, the cell 7's are removed; secondly, the cell 5's in rows 2 and 3 and the right-hand-most cell 5 in row 4 are changed to cell 6's; and finally, any of the cell 1's may be converted to cell 8's except the one in row 1, i.e. at bit position 0.
This change permits any prefix tree topology to support packed arithmetic with partitions happening anywhere in the adder except at bit position 0.
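The freedom to place partitions anywhere except at bit position 0 can be illustrated with the simplest topology available, a serial chain in which every cell behaves like a grey cell. The Python sketch below is not the layout of Fig. 12. `grey_cell` and `packed_add_chain` are hypothetical names, the grey-cell rule is inferred from the description above, and the carry into each sub-adder's lsb is assumed to be 0.

```python
def grey_cell(vertical, horizontal):
    """Assumed grey-cell rule on the carry conditions G, K, P and B."""
    if vertical == "B":
        return "P"         # blocked propagate at a sub-adder lsb becomes CP
    if vertical == "P":
        return horizontal  # propagate passes on the lower-significance condition
    return vertical        # G and K override whatever lies below


def packed_add_chain(a, b, width, boundaries):
    """Add a and b with sub-adder lsb's at the bit positions in `boundaries`."""
    if 0 in boundaries:
        raise ValueError("bit 0 may not carry a partitioning input cell")
    result, prefix = 0, "K"  # "K" stands in for a carry-in of 0 below bit 0
    for i in range(width):
        a_bit, b_bit = (a >> i) & 1, (b >> i) & 1
        start = i in boundaries
        if a_bit & b_bit:
            cond = "G"
        elif not (a_bit | b_bit):
            cond = "K"
        else:
            cond = "B" if start else "P"
        # Assumption: each sub-adder's lsb receives a carry-in of 0.
        carry_in = 0 if start else int(prefix == "G")
        result |= ((a_bit ^ b_bit) ^ carry_in) << i
        prefix = grey_cell(cond, prefix)
        assert prefix != "B"  # no CB condition ever reaches an output
    return result
```

For example, `packed_add_chain(0x00FF, 0x0001, 16, {8})` returns `0x0000`, while the same operands with an empty boundary set return `0x0100`.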

Claims

1. An adder having circuitry for calculating the sum of or difference between pairs of unpacked binary numbers having $2^{n}$ bits or packed binary numbers having $2^{n-m}$ bits where $m < n$, including:
$2^{m}$ sub-adders, each sub-adder partition including a plurality of columns and a plurality of rows of cells, each column of cells having an input cell in the lowermost row for receiving bits of each of the pairs of numbers, each sub-adder above the lowest significance sub-adder having a lowest significance column input cell arranged to receive a third input bit, and the cells in the remaining rows of the or each sub-adder above the lowest significance sub-adder being arranged to prevent the carry-over of a carry bit from the most significant column of the preceding sub-adder being introduced into the sub-adder, depending on whether the third input bit is zero or one.
2. An adder according to claim 1, wherein the lowest significance column input cells of the lowest significance sub-adder is the same as the lowest significance column input cells of the or each sub-adder above the lowest significance sub-adder.
3. An adder according to claim 1 or claim 2, wherein cells, in the remaining rows below the uppermost row of the or each sub-adder having a lowest significance column input cell arranged to receive a third input bit, include operational logic as set out in Table 5.
4. An adder according to any of claims 1 to 3, wherein the cells in the uppermost row of the or each sub-adder above the lowest significance sub-adder include operational logic as set out in Table 6.
5. An adder according to claim 2, wherein the cells in the uppermost row of the lowest significance sub-adder include operational logic as set out in Table 7 and the cells in the uppermost row of the or each sub-adder above the lowest significance sub-adder include operational logic as set out in Table 6.
PCT/GB2001/005358 2000-12-04 2001-12-04 Carry lookahead adder for different data types WO2002046910A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2002220883A AU2002220883A1 (en) 2000-12-04 2001-12-04 Carry lookahead adder for different data types

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
GB0029538.6 2000-12-04
GB0029538A GB0029538D0 (en) 2000-12-04 2000-12-04 Addition circuitry
GB0106600A GB0106600D0 (en) 2001-03-16 2001-03-16 Addition circuitry
GB0106600.0 2001-03-16

Publications (1)

Publication Number Publication Date
WO2002046910A1 true WO2002046910A1 (en) 2002-06-13

Family

ID=26245371

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2001/005358 WO2002046910A1 (en) 2000-12-04 2001-12-04 Carry lookahead adder for different data types

Country Status (2)

Country Link
AU (1) AU2002220883A1 (en)
WO (1) WO2002046910A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3987291A (en) * 1975-05-01 1976-10-19 International Business Machines Corporation Parallel digital arithmetic device having a variable number of independent arithmetic zones of variable width and location
US5719802A (en) * 1995-12-22 1998-02-17 Chromatic Research, Inc. Adder circuit incorporating byte boundaries
US5943251A (en) * 1996-11-18 1999-08-24 Samsung Electronics Co., Ltd. Adder which handles multiple data with different data types

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KARTHIKEYAN P S ET AL: "MORE ON ARBITRARY BOUNDARY PACKED ARITHMETIC", PROCEEDINGS. INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, XX, XX, 17 December 1998 (1998-12-17), pages 19 - 24, XP001001267 *

Also Published As

Publication number Publication date
AU2002220883A1 (en) 2002-06-18

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP