US20230333815A1 - Concurrent multi-bit adder - Google Patents

Publication number
US20230333815A1
Authority
US
United States
Prior art keywords
bit
section
adder
group
stored
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
US18/337,086
Inventor
Moshe LAZER
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
GSI Technology Inc
Original Assignee
GSI Technology Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by GSI Technology Inc
Priority to US18/337,086
Assigned to GSI TECHNOLOGY INC. Assignment of assignors interest (see document for details). Assignors: LAZER, MOSHE
Publication of US20230333815A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50Adding; Subtracting
    • G06F7/505Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination
    • G06F7/506Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination with simultaneous carry generation for, or propagation over, two or more stages
    • G06F7/507Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination with simultaneous carry generation for, or propagation over, two or more stages using selection between two conditionally calculated carry or sum values
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50Adding; Subtracting
    • G06F7/505Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50Adding; Subtracting
    • G06F7/505Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination
    • G06F7/506Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination with simultaneous carry generation for, or propagation over, two or more stages
    • G06F7/508Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination with simultaneous carry generation for, or propagation over, two or more stages using carry look-ahead circuits
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50Adding; Subtracting
    • G06F7/501Half or full adders, i.e. basic adder cells for one denomination
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11CSTATIC STORES
    • G11C15/00Digital stores in which information comprising one or more characteristic parts is written into the store and in which information is read-out by searching for one or more of these characteristic parts, i.e. associative or content-addressed stores
    • G11C15/04Digital stores in which information comprising one or more characteristic parts is written into the store and in which information is read-out by searching for one or more of these characteristic parts, i.e. associative or content-addressed stores using semiconductor elements
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11CSTATIC STORES
    • G11C7/00Arrangements for writing information into, or reading information out from, a digital store
    • G11C7/10Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers
    • G11C7/1006Data managing, e.g. manipulating data before writing or reading out, data bus switches or control circuits therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/0207Addressing or allocation; Relocation with multidimensional access, e.g. row/column, matrix
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2207/00Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F2207/38Indexing scheme relating to groups G06F7/38 - G06F7/575
    • G06F2207/48Indexing scheme relating to groups G06F7/48 - G06F7/575
    • G06F2207/4802Special implementations
    • G06F2207/4804Associative memory or processor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2207/00Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F2207/506Indexing scheme relating to groups G06F7/506 - G06F7/508
    • G06F2207/5063 2-input gates, i.e. only using 2-input logical gates, e.g. binary carry look-ahead, e.g. Kogge-Stone or Ladner-Fischer adder

Definitions

  • the present invention relates to associative memory generally and to a method for concurrent bit addition, in particular.
  • adders are used not only in arithmetic logic units, but also in other parts, where they are used to calculate addresses, table indices, increment and decrement operators, and similar operations.
  • FIG. 1 to which reference is now made illustrates a one-bit full adder 100 and a multi-bit ripple carry adder 120 , all known in the art.
  • One-bit full adder 100 receives three one-bit values as input, A, B, and Cin, and adds them.
  • the output of one-bit full adder 100 is the calculated sum of the three input bits, S, and a bit carried out from the add operation, Cout.
  • Multi-bit ripple carry adder 120 may be used for adding N-bit variables A and B.
  • Multi-bit ripple carry adder 120 may be constructed from N one-bit full adders 100 . Each full adder 100 inputs a bit A i from variable A and a bit B i from variable B. Each full adder also inputs a carry in, C in-i , which is the carry out of the previous adder, C out-i-1 .
  • the input bits of full adder 100 a are the least significant bits (LSB) of A, (e.g. 0), the LSB of B, (e.g. 1), and a carry in which is by definition 0 for the first full adder.
  • Full adder 100 a may perform the calculation (in this example 0+1+0).
  • the output bits of full adder 100 a are the result bit S with value of 1, and the carry out bit C out , with value of 0.
  • the C out of full adder 100 a becomes the C in of full adder 100 b . It may be appreciated that full adder 100 b may start its computation only after the computation of full adder 100 a has been completed and the same constraint applies to all full adders including 100 c and 100 d , except for the first.
  • the last C out , of the last full adder 100 d is referred to as the overflow of the computation.
  • In step 1, bit 0 (LSB) of both variables is added, resulting in a bit S0 and a carry out bit Cout-0.
  • In step 2, bit 1 of both variables and the carry out of the previous step, Cout-0, are added, resulting in a bit S1 and a carry out bit Cout-1.
  • In step 3, bit 2 of both variables and the carry of the previous step, Cout-1, are added, resulting in a bit S2 and a carry out bit Cout-2.
  • In step 4, bit 3 of both variables and the carry of the previous step, Cout-2, are added, resulting in a bit S3 and a carry out bit Cout-3.
  • the result of the add operation is all bits S from all steps and the last carry out, which is the overflow if its value is 1.
  • a computation step may start only when all its input values are known, i.e. Ai, Bi and Cin. Ai and Bi are known in advance (bits from the input numbers A and B).
  • the first C in is 0 (this is the first step, there is no previous step, thus there is no value to carry into this step).
  • the value of C in in each step (except for the first one) is known only after the computation of the previous step is completed, as it is the C out of that former step.
  • ripple carry adder can get very slow when adding large multi bit values.
  • the entire ripple carry add computation is serial and its complexity is O(N), which is a disadvantage.
  • a method for an associative memory device includes in parallel, performing multi-bit operations of P pairs of multi-bit operands stored in columns of a memory array, each pair of the P pairs is stored in a different column of the array and each operation of the multi-bit operations occurs in its associated different column, and each bit i of each of the multi-bit operands of each of the P pairs is stored in a row of a section i in the column.
  • each multi-bit operation of the multi-bit operations includes a plurality of per-section operations, and each per-section operation includes one or more Boolean operations between a plurality of bits stored in the section.
  • the performing includes concurrently performing the per-section operations on a plurality of sections.
  • the multi-bit operation is a multi-bit add operation.
  • a system that includes a non-destructive associative memory array that includes a plurality of sections, each section including cells arranged in rows and columns, to store a bit j from a first multi-bit number in a first row and a bit j from a second multi-bit number in a second row of a same column, and a concurrent adder to, in parallel, perform per-section operations in each section, each per-section operation includes one or more Boolean operations between a plurality of bits stored in rows of the section.
  • FIG. 1 is a schematic illustration of a one-bit full adder and a multi-bit ripple carry adder known in the art
  • FIG. 2 is a schematic illustration of an exemplary, known in the art, four-bit ripple carry adder used to add two 4-bit variables;
  • FIG. 3 is a schematic illustration of a multi-bit concurrent adder, constructed and operative in accordance with a preferred embodiment of the present invention
  • FIG. 4 is a schematic illustration of an associative memory array, constructed and operative in accordance with a preferred embodiment of the present invention
  • FIG. 5 is a schematic illustration of data stored in a section of an associative memory array, constructed and operative in accordance with a preferred embodiment of the present invention
  • FIG. 6 is a schematic illustration of what is stored in the various rows of the memory array during the add operation performed by the concurrent adder of FIG. 3 to concurrently add two 8-bit operands according to a preferred embodiment of the present invention.
  • FIG. 7 is a flow chart illustration showing the operations performed by the concurrent adder of FIG. 3 , according to a preferred embodiment of the present invention.
  • the carry out signal may be calculated in advance by a procedure, known in the art, called Carry Look Ahead (CLA).
  • the CLA calculation is based on the value of all previous input bits A i and B i (0 ≤ i ≤ N) of variables A and B, and on the value of the first C in . The computation of the CLA is expressed in equation 3.
  • $C_{\text{out-}N} = A_N * B_N + A_{N-1} * B_{N-1} * (A_N + B_N) + A_{N-2} * B_{N-2} * (A_{N-1} + B_{N-1}) * (A_N + B_N) + \dots + C_{in} * (A_0 + B_0) * (A_1 + B_1) \dots (A_N + B_N)$   (Equation 3)
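As a rough illustration of Equation 3, the sketch below evaluates the carry out as a sum of products and compares it with a bit-by-bit ripple. It assumes `*` denotes Boolean AND and `+` denotes Boolean OR; the function names and the LSB-first bit lists are hypothetical choices made for this example only.

```python
from itertools import product


def cla_carry_out(a_bits, b_bits, c_in=0):
    """Carry out of the top bit as a sum of products, in the shape of Equation 3.

    a_bits and b_bits are lists of 0/1 values, least significant bit first.
    """
    n = len(a_bits)
    carry = 0
    for i in range(n):
        term = a_bits[i] & b_bits[i]           # a carry is generated at bit i ...
        for j in range(i + 1, n):
            term &= a_bits[j] | b_bits[j]      # ... and passed on by every higher bit
        carry |= term
    term = c_in                                # the initial carry in must pass every bit
    for j in range(n):
        term &= a_bits[j] | b_bits[j]
    return carry | term


def ripple_carry_out(a_bits, b_bits, c_in=0):
    """Reference value: ripple the carry through the bits one at a time."""
    carry = c_in
    for a, b in zip(a_bits, b_bits):
        carry = (a & b) | (carry & (a | b))
    return carry


def to_bits(value, n):
    return [(value >> i) & 1 for i in range(n)]


# Exhaustive comparison over all 4-bit operands and both carry-in values.
mismatches = [
    (a, b, c)
    for a, b, c in product(range(16), range(16), (0, 1))
    if cla_carry_out(to_bits(a, 4), to_bits(b, 4), c)
    != ripple_carry_out(to_bits(a, 4), to_bits(b, 4), c)
]
```

Every product term depends only on the input bits and the first carry in, which is why dedicated hardware can evaluate all of them without waiting for intermediate carries.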
  • the bits of the variables may be split into groups (nibbles for example) and the carry of the group, referred herein as C out-group , i.e. the carry from the last bit in the group, may be calculated without waiting for each bit computation to be completed.
  • the performance of a multi-bit adder may be improved (compared to the ripple carry); however, the CLA may only be implemented using specialized hardware, explicitly designed to calculate the expected carry out of a group using all the input bits of the group i.e. all the bits of variable A, all the bits of variable B and the C in of the group referred herein as C in-group .
  • Applicant has realized that a similar carry propagation functionality, that improves the computation efficiency of a multi-bit adder compared to a ripple carry adder, may be provided by a multi-purpose associative memory replacing the specialized hardware, by performing a calculation using a prediction regarding the value of a carry in as described hereinbelow.
  • Multi-purpose associative memory devices are described in U.S. Pat. No. 8,238,173, (entitled “USING STORAGE CELLS TO PERFORM COMPUTATION”) issued on Aug. 7, 2012; U.S. Patent Publication No. US 2015/0131383, (entitled “NON-VOLATILE IN-MEMORY COMPUTING DEVICE”) published on May 14, 2015, now issued as U.S. Pat. No. 10,832,746 on Nov. 10, 2020; U.S. Pat. No. 9,418,719 (entitled “IN-MEMORY COMPUTATIONAL DEVICE”), issued on Aug. 16, 2016 and U.S. Pat. No. 9,558,812 (entitled “SRAM MULTI-CELL OPERATIONS”) issued on Jan. 31, 2017, all assigned to the common assignee of the present invention and incorporated herein by reference.
  • FIG. 3 schematically illustrates a multi-bit concurrent adder 300 , constructed and operative in accordance with a preferred embodiment of the present invention.
  • Multi-bit concurrent adder 300 comprises a concurrent adder 310 and an associative memory array 320 .
  • Associative memory array 320 may store each pair of operands, A and B, in a column, and may also store intermediate and final results of the computation in the same column.
  • Concurrent adder 310 comprises a predictor 314 , a selector 316 and a summer 318 , described in more detail hereinbelow.
  • FIG. 4 schematically illustrates associative memory array 320 .
  • Associative memory array 320 comprises a plurality of sections 330 , each section 330 comprises rows and columns. Each section 330 may store a different bit of the operands A and B. Bits 0 of the operands may be stored in section 0, bits 1 may be stored in section 1 and so on until bit 15 may be stored in section 15 . As can be seen, each bit j of both operands A and B may be stored in a different row of the same column k, of the same section j.
  • bit A0 of operand A is stored in row A, column C-k of section 0, and bit B0 of operand B is stored in a different row, row R-B, in the same column col-k of the same section, section 0.
  • the other bits of the operands A and B are similarly stored in additional sections 330 of associative memory array 320 .
  • Concurrent adder 310 may utilize additional rows of each section 330 to store intermediate values, predictions and final results as illustrated in FIG. 5 , to which reference is now made.
  • concurrent adder 310 may store, in a section x, a bit x from operand A in row A, and a bit x from operand B in row B.
  • concurrent adder 310 may store the result of a Boolean OR performed on bits stored on rows A and B, in row AorB.
  • concurrent adder 310 may store the result of a Boolean AND performed on bits stored on rows A and B, in row AandB. The values stored in both rows AorB and AandB may be used later for the computation of a carry out.
  • concurrent adder 310 may store values related to the carry out in rows C0, C1 and C out .
  • Predictor 314 may use row C0 to store a value of C out , calculated using a prediction that the value of the carry in (to the group) will be 0.
  • Predictor 314 may use row C1 to store a value of C out , calculated using a prediction that the value of the carry in (to the group) will be 1.
  • Selector 316 may select the actual value used by summer 318 for calculating the sum and may store it in row C out after the actual value of the carry in is known, when the calculation of the carry out of the previous group is completed.
  • summer 318 may store the sum of bit x from operand B, bit x from operand A and the carry out from the previous computation, used as carry in.
  • each column may store different variables to concurrently perform multiple add operations, such that a computation regarding a specific pair of variables may be performed in col 0 , while a completely unrelated computation on two other variables may be performed in a different column, such as col 1 .
  • concurrent adder 310 may relate to each variable having N bits as a variable having several groups of M bits each.
  • a 16-bit variable X 15 X 14 X 13 X 12 X 11 X 10 X 9 X 8 X 7 X 6 X 5 X 4 X 3 X 2 X 1 X 0 may be divided into 4 groups of 4 bits X 15 X 14 X 13 X 12 , X 11 X 10 X 9 X 8 , X 7 X 6 X 5 X 4 and X 3 X 2 X 1 X 0 .
  • concurrent adder 310 may split each variable, A and B, into groups of size M and may perform the computation in the level of groups. It may be appreciated that the number of bits in the operands and the group size are not limited to specific sizes and the same steps and logic described in the current application may apply to operands having more or less bits, divided into a larger or smaller group size.
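The grouping described above can be pictured with a short sketch. The helper name `split_into_groups` and the LSB-first list representation are assumptions made for illustration; the integers 0 to 15 stand in for bit positions X0 to X15.

```python
def split_into_groups(bits_lsb_first, group_size):
    """Split a list of bits (least significant first) into groups of group_size."""
    return [
        bits_lsb_first[i:i + group_size]
        for i in range(0, len(bits_lsb_first), group_size)
    ]


# Bit positions X0..X15 of a 16-bit variable, least significant first.
positions = list(range(16))
nibbles = split_into_groups(positions, 4)
# nibbles[0] holds X0..X3 and nibbles[3] holds X12..X15
```

Other operand widths and group sizes work the same way, for example `split_into_groups(list(range(8)), 4)` yields the two nibbles used in the 8-bit example.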
  • FIG. 6 schematically illustrates an example of the steps performed by concurrent adder 310 to concurrently add two 8-bit operands A and B in a table 600 .
  • table 600 only intends to facilitate the understanding of the procedure performed by multi-bit concurrent adder 300 (of FIG. 3 ) and does not apply to the hardware structure of associative memory array 320 .
  • row 610 may contain all bits of number A written in step #1, each bit stored in a different row, in a different section labeled with the same label A, as can also be understood from FIG. 4 and FIG. 5 .
  • Table 600 illustrates the data stored in different rows of different sections of associative memory array 320 .
  • Table 600 provides the step number in column 620 , the row in column 630 , the action performed by concurrent adder 310 on different sections 330 in column 640 .
  • the value contained in each section, 7-0, is provided by columns 657 - 650 respectively.
  • Concurrent adder 310 may store each bit of variable A in a dedicated section 330 .
  • the LSB of variable A is stored in row A of section 0, the next bit is stored in row A in section 1 and so on until the MSB of variable A is stored in row A of section 7.
  • Variable B is stored similarly in row B of sections 0 to 7.
  • Variables A and B may be divided into two groups of 4 bits: nibble0 comprising sections 0, 1, 2 and 3 and nibble1, comprising sections 4, 5, 6, and 7.
  • concurrent adder 310 may write variable A to rows A.
  • the first four bits, 0110 may be stored in nibble0 and the other bits, 0111 may be stored in nibble1.
  • concurrent adder 310 may write the first group of bits of variable B, which are 1011, to nibble0 and the second group of variable B, which are 1110, to nibble1.
  • Concurrent adder 310 may then calculate the result of a Boolean OR, in step #3, and a Boolean AND, in step #4, between the bits of operands A and B in each section as defined in equations 4 and 5.
  • Concurrent adder 310 may store the results of equations 4 and 5 in rows AorB and AandB, respectively. It may be appreciated that concurrent adder 310 may concurrently perform the calculation of each of the steps on all sections, i.e. equation 4 is calculated in a single step on all sections storing bits of operands A and B. In addition, equation 4 may be concurrently performed on all columns of associative memory array 320 .
  • concurrent adder 310 may calculate the carry out inside all groups in parallel, using the standard ripple carry formula of equation 6.
  • the ripple carry inside a group may take M steps for a group of size M.
  • the ripple carry may be calculated inside each group twice.
  • Predictor 314 may store the calculated carry outs in dedicated rows in each section.
  • concurrent adder 310 may perform a ripple carry between groups.
  • the C group-in of the first group may be zero, if there is no carry in from a previous computation, and may be the carry out of a previous computation if the current computation is a step in a multi-step process, such as adding a 64 bit number using 4 rounds of concurrent adding of 16 bit numbers.
  • Selector 316 may write, in step 9, the values of the correct row of the first nibble to row C out according to the actual value of the C in of the first group. Since the actual value of the C group-in is known only once the carry out of the last bit of the previous group is calculated, selector 316 may select the relevant values of the carry bits for the group, i.e. from row C0 or row C1, after the C group-out of the previous group is known.
  • the C group-out of the first group (the value stored in row C0 of section 3) is 1 and selector 316 may select row C1 of the second group as the actual values of the carry bits of nibble1. Selector 316 may then write the values of row C1 of the sections of nibble1 (sections 4, 5, 6 and 7) to row C out of the relevant sections in step 10 .
  • selector 316 may choose the value for C out of each group using equation 7.
  • the C group-out of the first group is provided after M steps of a standard ripple carry adder (4 steps for a nibble as in the example of FIG. 6 ).
  • summer 318 may concurrently compute, in step 11 , the sum of all bits in all sections using equation 8.
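The predictor, selector and summer steps can be mimicked in software for a single column. This is only a sketch: equations 4 to 8 are not reproduced in this text, so the code assumes the usual forms (AorB = A OR B, AandB = A AND B, carry out = AandB OR (carry in AND AorB), sum = A XOR B XOR carry in). The function name, the dictionary-per-section layout and the row name `Sum` are hypothetical. The loops run one after another here, whereas the text describes the per-section operations as concurrent.

```python
def concurrent_add(a, b, n_bits=8, group_size=4, c_in=0):
    """Emulate the described add for one column; section i holds bit i."""
    sections = [{"A": (a >> i) & 1, "B": (b >> i) & 1} for i in range(n_bits)]

    # Boolean OR and AND per section; no section depends on another here.
    for s in sections:
        s["AorB"] = s["A"] | s["B"]
        s["AandB"] = s["A"] & s["B"]

    groups = [sections[i:i + group_size] for i in range(0, n_bits, group_size)]

    # Predictor: ripple inside every group twice, once per guessed carry in.
    for guess, row in ((0, "C0"), (1, "C1")):
        for group in groups:
            carry = guess
            for s in group:
                carry = s["AandB"] | (carry & s["AorB"])
                s[row] = carry

    # Selector: ripple between groups, copying the row that matches the real carry in.
    group_c_in = c_in
    for group in groups:
        row = "C1" if group_c_in else "C0"
        for s in group:
            s["Cout"] = s[row]
        group_c_in = group[-1]["Cout"]

    # Summer: each section uses the carry out of the section below it.
    prev = c_in
    for s in sections:
        s["Sum"] = s["A"] ^ s["B"] ^ prev
        prev = s["Cout"]

    total = sum(s["Sum"] << i for i, s in enumerate(sections))
    return total, sections[-1]["Cout"], sections


# Operands from the example: A = 0111 0110, B = 1110 1011.
total, overflow, rows = concurrent_add(0b01110110, 0b11101011)
```

With these operands the value in row C0 of section 3 is 1, so the sketch copies row C1 into row Cout for sections 4 to 7, matching the selection described for nibble1.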
  • FIG. 7 is a flow chart 700 describing the steps that concurrent adder 310 may perform for adding operands A and B.
  • concurrent adder 310 may store operands A and B in sections 330 of memory array 320 .
  • concurrent adder 310 may concurrently compute the Boolean OR between bits of operands A and B in all sections 330 .
  • concurrent adder 310 may concurrently compute the Boolean AND between bits of operands A and B in all sections 330 .
  • selector 316 may compute the carry out of the next groups until the carry out of the last group is computed and the correct carry row may be selected for the actual C out of the group.
  • summer 318 may compute the sum in step 790 .
  • concurrent adder 310 may perform the following procedures:
  • concurrent adder 310 may perform the same steps for computing the sum of a 16 bit variable as in the example of the 8 bit numbers with 2 additional steps of “ripple carry between groups” for the third and fourth groups. It may also be appreciated that concurrent adder 310 may use a concurrent adder in two phases. In the first phase, the carry out of the least significant bits of the variables are calculated and the carry out of the last bit, or the overflow of the calculation, is an input carry in value used in the calculation of the most significant bits of the variables.
  • the total carry ripple computation time may include a) the steps needed to perform a standard ripple carry inside a single group, equal to the number of bits M in the group (4 steps in a nibble in the example), b) the computation of a second standard ripple carry inside the group assuming another value of the C in , that may take one additional step, as the computation for the first bit in a group may start immediately after the previous computation of that bit is completed, and c) number of groups minus 1 ripples between groups, as the C out of each group needs to ripple to the next group.
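Reading items a), b) and c) as a step count gives M + 1 + (G - 1) carry steps, where G is the number of groups. The sketch below simply evaluates that expression; the function name is hypothetical, it assumes the group size divides the operand width, and it counts only the carry ripple, not the write, OR, AND or sum steps.

```python
def carry_ripple_steps(n_bits, group_size):
    """Carry-ripple step count as itemised above: M + 1 + (groups - 1)."""
    groups = n_bits // group_size      # assumes group_size divides n_bits
    inside_group = group_size          # a) ripple inside a group
    second_prediction = 1              # b) second ripple, offset by one step
    between_groups = groups - 1        # c) ripple between groups
    return inside_group + second_prediction + between_groups


steps_8_bit = carry_ripple_steps(8, 4)     # the 8-bit, nibble-based example
steps_16_bit = carry_ripple_steps(16, 4)   # 16-bit operands, four nibbles
```

For comparison, a plain ripple carry over the same operands needs one carry step per bit, i.e. 8 and 16 steps respectively.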
  • multi-bit concurrent adder 300 may concurrently perform multiple add operations on multiple pairs of operands stored in multiple columns of memory array 320 , each pair stored in a different column. A complete add operation may be performed on a single column.
  • Memory array 320 may comprise P columns and multi-bit concurrent adder 300 may concurrently operate on all columns, thereby performing P multi-bit add operations at the same time.
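Column parallelism can be emulated in software by packing each row into one integer whose bit k belongs to column k, so that a single Boolean operation on two rows acts on all P columns at once. The sketch below is an emulation under that assumption, not the device itself; the function names are hypothetical, and for brevity it ripples the carry across sections rather than using the grouped prediction.

```python
def pack_rows(values, n_bits):
    """Row i is an int whose bit k is bit i of values[k], i.e. of column k."""
    return [
        sum(((v >> i) & 1) << k for k, v in enumerate(values))
        for i in range(n_bits)
    ]


def add_all_columns(a_values, b_values, n_bits):
    """Add P operand pairs at once; each Boolean operation covers every column."""
    a_rows = pack_rows(a_values, n_bits)
    b_rows = pack_rows(b_values, n_bits)
    carry = 0                      # carry row: one carry bit per column
    sum_rows = []
    for a, b in zip(a_rows, b_rows):
        sum_rows.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a | b))
    p = len(a_values)
    sums = [
        sum(((row >> k) & 1) << i for i, row in enumerate(sum_rows))
        for k in range(p)
    ]
    overflows = [(carry >> k) & 1 for k in range(p)]
    return sums, overflows


# Three unrelated pairs, one per column, added in the same pass.
sums, overflows = add_all_columns([118, 3, 255], [235, 4, 1], n_bits=8)
```

The number of Boolean operations in `add_all_columns` depends on the operand width only, not on the number of columns, which is the property the text attributes to operating on all P columns together.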

Abstract

A method for an associative memory device includes performing in parallel multi-bit operations of P pairs of multi-bit operands stored in columns of a memory array, each pair is stored in a different column, each bit i of each multi-bit operands of each pair is stored in a row of a section i in the column and each operation occurs in its associated column. A system includes a non-destructive associative memory array with multiple sections, each section j includes cells arranged in rows and columns, to store a bit j from a first multi-bit number in a first row and a bit j from a second multi-bit number in a second row of a same column, and a concurrent adder to, in parallel, perform per-section operations in each section, that includes one or more Boolean operations between a plurality of bits stored in rows of the section.

Description

    CROSS REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation application of U.S. patent application Ser. No. 17/086,506, filed Nov. 2, 2020, which is a continuation application of U.S. patent application Ser. No. 16/554,730, filed Aug. 29, 2019, which is a continuation application of U.S. patent application Ser. No. 15/690,301, filed Aug. 30, 2017, all of which are incorporated herein by reference.
  • FIELD OF THE INVENTION
  • The present invention relates to associative memory generally and to a method for concurrent bit addition, in particular.
  • BACKGROUND OF THE INVENTION
  • In many computers and other kinds of processors, adders are used not only in arithmetic logic units, but also in other parts, where they are used to calculate addresses, table indices, increment and decrement operators, and similar operations.
  • FIG. 1 to which reference is now made illustrates a one-bit full adder 100 and a multi-bit ripple carry adder 120, all known in the art. One-bit full adder 100, receives three one-bit values as input, A, B, and Cin, and adds them. The output of one-bit full adder 100 is the calculated sum of the three input bits, S, and a bit carried out from the add operation, Cout.
  • Multi-bit ripple carry adder 120 may be used for adding N-bit variables A and B. Multi-bit ripple carry adder 120 may be constructed from N one-bit full adders 100. Each full adder 100 inputs a bit Ai from variable A and a bit Bi from variable B. Each full adder also inputs a carry in, Cin-i, which is the carry out of the previous adder, Cout-i-1.
  • FIG. 2 , to which reference is now made, illustrates an exemplary, known in the art, four-bit ripple carry adder 120 used to add two 4-bit variables, A=1110 and B=0101, and comprises four one-bit full adders 100: 100 a, 100 b, 100 c and 100 d.
  • The input bits of full adder 100 a are the least significant bits (LSB) of A, (e.g. 0), the LSB of B, (e.g. 1), and a carry in which is by definition 0 for the first full adder. Full adder 100 a may perform the calculation (in this example 0+1+0). The output bits of full adder 100 a are the result bit S with value of 1, and the carry out bit Cout, with value of 0. The Cout of full adder 100 a becomes the Cin of full adder 100 b. It may be appreciated that full adder 100 b may start its computation only after the computation of full adder 100 a has been completed and the same constraint applies to all full adders including 100 c and 100 d, except for the first. The last Cout, of the last full adder 100 d, is referred to as the overflow of the computation.
  • The computation steps of this example are: In step 1, bit 0 (LSB) of both variables is added, resulting in a bit S0 and a carry out bit Cout-0. In step 2, bit 1 of both variables and the carry out of the previous step, Cout-0, are added, resulting in a bit S1 and a carry out bit Cout-1. In step 3, bit 2 of both variables and the carry of the previous step, Cout-1, are added, resulting in a bit S2 and a carry out bit Cout-2. Finally, in step 4, bit 3 of both variables and the carry of the previous step, Cout-2, are added, resulting in a bit S3 and a carry out bit Cout-3. The result of the add operation is all bits S from all steps and the last carry out, which is the overflow if its value is 1.
  • It may be appreciated that a computation step may start only when all its input values are known, i.e. Ai, Bi and Cin. Ai and Bi are known in advance (bits from the input numbers A and B). The first Cin is 0 (this is the first step, there is no previous step, thus there is no value to carry into this step). The value of Cin in each step (except for the first one) is known only after the computation of the previous step is completed, as it is the Cout of that former step.
  • It may be appreciated that the ripple carry adder can get very slow when adding large multi bit values. The entire ripple carry add computation is serial and its complexity is O(N), which is a disadvantage.
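The serial dependency of the ripple carry adder can be seen in a small software model. This is a sketch only; the function names and the LSB-first bit lists are choices made for this illustration. It uses the operands of the FIG. 2 example, A=1110 and B=0101.

```python
from typing import List, Tuple


def full_adder(a: int, b: int, c_in: int) -> Tuple[int, int]:
    """One-bit full adder: returns (sum bit S, carry out bit Cout)."""
    s = a ^ b ^ c_in
    c_out = (a & b) | (c_in & (a | b))
    return s, c_out


def ripple_carry_add(a_bits: List[int], b_bits: List[int]) -> Tuple[List[int], int]:
    """Add two equal-length bit lists (LSB first); returns (sum bits, overflow)."""
    carry = 0                       # the first carry in is 0 by definition
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        # Each step needs the carry produced by the previous step,
        # so the N steps cannot overlap.
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


# A = 1110 and B = 0101, written least significant bit first.
a_bits = [0, 1, 1, 1]
b_bits = [1, 0, 1, 0]
sum_bits, overflow = ripple_carry_add(a_bits, b_bits)
```

The loop body runs once per bit, which mirrors the O(N) behaviour noted above.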
  • SUMMARY OF THE PRESENT INVENTION
  • There is provided, in accordance with a preferred embodiment of the present invention, a method for an associative memory device. The method includes in parallel, performing multi-bit operations of P pairs of multi-bit operands stored in columns of a memory array, each pair of the P pairs is stored in a different column of the array and each operation of the multi-bit operations occurs in its associated different column, and each bit i of each of the multi-bit operands of each of the P pairs is stored in a row of a section i in the column.
  • Moreover, in accordance with a preferred embodiment of the present invention, each multi-bit operation of the multi-bit operations includes a plurality of per-section operations, and each per-section operation includes one or more Boolean operations between a plurality of bits stored in the section.
  • Further, in accordance with a preferred embodiment of the present invention, the performing includes concurrently performing the per-section operations on a plurality of sections.
  • Still further, in accordance with a preferred embodiment of the present invention, the multi-bit operation is a multi-bit add operation.
  • There is provided, in accordance with a preferred embodiment of the present invention, a system that includes a non-destructive associative memory array that includes a plurality of sections, each section including cells arranged in rows and columns, to store a bit j from a first multi-bit number in a first row and a bit j from a second multi-bit number in a second row of a same column, and a concurrent adder to, in parallel, perform per-section operations in each section, each per-section operation includes one or more Boolean operations between a plurality of bits stored in rows of the section.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The subject matter regarded as the invention is particularly pointed out and distinctly claimed in the concluding portion of the specification. The invention, however, both as to organization and method of operation, together with objects, features, and advantages thereof, may best be understood by reference to the following detailed description when read with the accompanying drawings in which:
  • FIG. 1 is a schematic illustration of a one-bit full adder and a multi-bit ripple carry adder known in the art;
  • FIG. 2 is a schematic illustration of an exemplary, known in the art, four-bit ripple carry adder used to add two 4-bit variables;
  • FIG. 3 is a schematic illustration of a multi-bit concurrent adder, constructed and operative in accordance with a preferred embodiment of the present invention;
  • FIG. 4 is a schematic illustration of an associative memory array, constructed and operative in accordance with a preferred embodiment of the present invention;
  • FIG. 5 is a schematic illustration of data stored in a section of an associative memory array, constructed and operative in accordance with a preferred embodiment of the present invention;
  • FIG. 6 is a schematic illustration of what is stored in the various rows of the memory array during the add operation performed by the concurrent adder of FIG. 3 to concurrently add two 8-bit operands according to a preferred embodiment of the present invention; and
  • FIG. 7 is a flow chart illustration showing the operations performed by the concurrent adder of FIG. 3 , according to a preferred embodiment of the present invention.
  • It will be appreciated that for simplicity and clarity of illustration, elements shown in the figures have not necessarily been drawn to scale. For example, the dimensions of some of the elements may be exaggerated relative to other elements for clarity. Further, where considered appropriate, reference numerals may be repeated among the figures to indicate corresponding or analogous elements.
  • DETAILED DESCRIPTION OF THE PRESENT INVENTION
  • In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be understood by those skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, and components have not been described in detail so as not to obscure the present invention.
  • It is known in the art that the sum, S, and the carry out, Cout, of a one-bit computation can be expressed by equations 1 and 2:

  • $S = A \oplus B \oplus C_{\text{in}}$  Equation 1

  • $C_{\text{out}} = A * B + C_{\text{in}} * (A + B)$  Equation 2
  • Where the symbol ⊕ indicates a Boolean XOR, the symbol * indicates a Boolean AND and the symbol + indicates a Boolean OR. The carry out signal may be calculated in advance by a procedure, known in the art, called Carry Look Ahead (CLA). The CLA calculation is based on the value of all previous input bits Ai and Bi (0<i<N) of variables A and B, and on the value of the first Cin. The computation of the CLA is expressed in equation 3.

  • $C_{\text{out-}N} = A_N * B_N + A_{N-1} * B_{N-1} * (A_N + B_N) + A_{N-2} * B_{N-2} * (A_{N-1} + B_{N-1}) * (A_N + B_N) + \dots + C_{\text{in}} * (A_0 + B_0) * (A_1 + B_1) \dots (A_N + B_N)$  Equation 3
  • Using this technique, the bits of the variables may be split into groups (nibbles for example) and the carry of the group, referred to herein as Cout-group, i.e. the carry from the last bit in the group, may be calculated without waiting for each bit computation to be completed. Using the CLA, the performance of a multi-bit adder may be improved (compared to the ripple carry); however, the CLA may only be implemented using specialized hardware, explicitly designed to calculate the expected carry out of a group using all the input bits of the group, i.e. all the bits of variable A, all the bits of variable B and the Cin of the group, referred to herein as Cin-group.
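  • The group carry of the CLA can be sketched in software as a sum of product terms over the A*B ("generate") and A+B ("propagate") values of each bit. The function name `lookahead_carry_out` and the variable names are assumptions for illustration. The Python loops are serial; in dedicated hardware each product term would be evaluated in parallel.

```python
def lookahead_carry_out(a_bits, b_bits, c_in):
    """Carry out of a group, computed from all input bits (LSB first) and the group carry in."""
    generate = [a & b for a, b in zip(a_bits, b_bits)]   # A_i * B_i
    propagate = [a | b for a, b in zip(a_bits, b_bits)]  # A_i + B_i
    n = len(a_bits)
    c_out = 0
    for i in range(n):
        # A carry generated at bit i reaches the output if every higher bit propagates it.
        term = generate[i]
        for j in range(i + 1, n):
            term &= propagate[j]
        c_out |= term
    # The incoming carry reaches the output only if every bit of the group propagates it.
    term = c_in
    for p in propagate:
        term &= p
    return c_out | term
```

  • No intermediate sum bit is needed to obtain the group carry, which is what allows the carry to be known before the per-bit additions complete.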
  • Applicant has realized that a similar carry propagation functionality, that improves the computation efficiency of a multi-bit adder compared to a ripple carry adder, may be provided by a multi-purpose associative memory replacing the specialized hardware, by performing a calculation using a prediction regarding the value of a carry in as described hereinbelow.
  • Multi-purpose associative memory devices are described in U.S. Pat. No. 8,238,173, (entitled “USING STORAGE CELLS TO PERFORM COMPUTATION”) issued on Aug. 7, 2012; U.S. Patent Publication No. US 2015/0131383, (entitled “NON-VOLATILE IN-MEMORY COMPUTING DEVICE”) published on May 14, 2015, now issued as U.S. Pat. No. 10,832,746 on Nov. 10, 2020; U.S. Pat. No. 9,418,719 (entitled “IN-MEMORY COMPUTATIONAL DEVICE”), issued on Aug. 16, 2016 and U.S. Pat. No. 9,558,812 (entitled “SRAM MULTI-CELL OPERATIONS”) issued on Jan. 31, 2017, all assigned to the common assignee of the present invention and incorporated herein by reference.
  • Applicant has further realized that the computation may be parallelized, using bit line processors, one per bit, as described in U.S. patent application Ser. No. 15/650,935 filed on Jul. 16, 2017 (entitled “IN-MEMORY COMPUTATIONAL DEVICE WITH BIT LINE PROCESSORS”) and published on Nov. 2, 2017 as US 2017/0316829, now issued as U.S. Pat. No. 10,153,042 on Dec. 11, 2018, assigned to the common assignee of the present invention and incorporated herein by reference.
  • FIG. 3 , to which reference is now made, schematically illustrates a multi-bit concurrent adder 300, constructed and operative in accordance with a preferred embodiment of the present invention. Multi-bit concurrent adder 300 comprises a concurrent adder 310 and an associative memory array 320. Associative memory array 320 may store each pair of operands, A and B, in a column, and may also store intermediate and final results of the computation in the same column. Concurrent adder 310 comprises a predictor 314, a selector 316 and a summer 318, described in more detail hereinbelow.
  • FIG. 4 , to which reference is now made, schematically illustrates associative memory array 320. Associative memory array 320 comprises a plurality of sections 330, each section 330 comprises rows and columns. Each section 330 may store a different bit of the operands A and B. Bits 0 of the operands may be stored in section 0, bits 1 may be stored in section 1 and so on until bit 15 may be stored in section 15. As can be seen, each bit j of both operands A and B may be stored in a different row of the same column k, of the same section j. In particular, bit A0 of operand A is stored in row A, column C-k of section 0, and bit B0 of operand B is stored in a different row, row R-B, in the same column col-k of the same section, section 0. The other bits of the operands A and B are similarly stored in additional sections 330 of associative memory array 320.
  • Concurrent adder 310 (of FIG. 3 ) may utilize additional rows of each section 330 to store intermediate values, predictions and final results as illustrated in FIG. 5 , to which reference is now made. As already mentioned hereinabove, concurrent adder 310 may store, in a section x, a bit x from operand A in row A, and a bit x from operand B in row B. In addition, concurrent adder 310 may store the result of a Boolean OR performed on bits stored on rows A and B, in Row AorB. In row AandB concurrent adder 310 may store the result of a Boolean AND performed on bits stored on rows A and B. The values stored in both rows AorB and AandB may be used later for the computation of a carry out.
  • In C0, C1 and Cout, concurrent adder 310 may store a value related to the carry out. Predictor 314 may use row C0 to store a value of Cout, calculated using a prediction that the value of the carry in (to the group) will be 0. Predictor 314 may use row C1 to store a value of Cout, calculated using a prediction that the value of the carry in (to the group) will be 1. Selector 316 may select the actual value used by summer 318 for calculating the sum and may store it in row Cout after the actual value of the carry in is known, when the calculation of the carry out of the previous group is completed. In row Sum, summer 318 may store the sum of bit x from operand B, bit x from operand A and the carry out from the previous computation, used as carry in.
  • As already mentioned before, all data relevant to a specific sum computation may be stored in a single column of each section, and each column may store different variables to concurrently perform multiple add operations, such that a computation regarding a specific pair of variables may be performed in col 0, while a completely unrelated computation on two other variables may be performed in a different column, such as col 1.
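  • The layout described above may be modeled, as a rough sketch only, by a list of sections per column, each section holding the named rows. The dictionary representation, the function name `new_column` and the use of `None` for rows not yet written are assumptions for illustration and say nothing about the hardware structure. The operand values are those of the 8-bit example of FIG. 6.

```python
ROWS = ("A", "B", "AorB", "AandB", "C0", "C1", "Cout", "Sum")


def new_column(a, b, n_bits):
    """Model one column: a list of sections, section j holding bit j of A and of B."""
    column = []
    for j in range(n_bits):
        section = dict.fromkeys(ROWS)  # rows not yet written hold None
        section["A"] = (a >> j) & 1
        section["B"] = (b >> j) & 1
        column.append(section)
    return column


column = new_column(0b01110110, 0b11101011, 8)
print(column[0]["A"], column[0]["B"])  # section 0 holds the LSBs: 0 1
```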
  • According to a preferred embodiment of the present invention, concurrent adder 310 (of FIG. 3 ) may relate to each variable having N bits as a variable having several groups of M bits each. For example, a 16-bit variable X15X14X13X12X11X10X9X8X7X6X5X4X3X2X1X0 may be divided into 4 groups of 4 bits X15X14X13X12, X11X10X9X8, X7X6X5X4 and X3X2X1X0. Using this approach, concurrent adder 310 may split each variable, A and B, into groups of size M and may perform the computation at the level of groups. It may be appreciated that the number of bits in the operands and the group size are not limited to specific sizes and the same steps and logic described in the current application may apply to operands having more or fewer bits, divided into a larger or smaller group size.
  • FIG. 6 , to which reference is now made, schematically illustrates an example of the steps performed by concurrent adder 310 to concurrently add two 8-bit operands A and B in a table 600. It may be appreciated that the structure of table 600 only intends to facilitate the understanding of the procedure performed by multi-bit concurrent adder 300 (of FIG. 3 ) and does not apply to the hardware structure of associative memory array 320. For example, row 610 may contain all bits of number A written in step #1, each bit stored in a different row, in a different section labeled with the same label A, as can also be understood from FIG. 4 and FIG. 5 . In the example of table 600, A=01110110 and B=11101011. Table 600 illustrates the data stored in different rows of different sections of associative memory array 320. Table 600 provides the step number in column 620, the row in column 630, the action performed by concurrent adder 310 on different sections 330 in column 640. The value contained in each section, 7-0, is provided by columns 657-650 respectively.
  • Concurrent adder 310 may store each bit of variable A in a dedicated section 330. The LSB of variable A is stored in row A of section 0, the next bit is stored in row A in section 1 and so on until the MSB of variable A is stored in row A of section 7. Variable B is stored similarly in row B of sections 0 to 7. Variables A and B may be divided into two groups of 4 bits: nibble0, comprising sections 0, 1, 2 and 3, and nibble1, comprising sections 4, 5, 6 and 7. In step #1 concurrent adder 310 may write variable A to rows A. The first four bits, 0110, may be stored in nibble0 and the other bits, 0111, may be stored in nibble1. Similarly, in step #2, concurrent adder 310 may write the first group of bits of variable B, which are 1011, to nibble0 and the second group of variable B, which are 1110, to nibble1.
  • Concurrent adder 310 may then calculate the result of a Boolean OR, in step #3, and a Boolean AND, in step #4, between the bits of operands A and B in each section as defined in equations 4 and 5.

  • $\text{AorB} = A_i + B_i$  Equation 4

  • $\text{AandB} = A_i * B_i$  Equation 5
  • Concurrent adder 310 may store the results of equations 4 and 5 in rows AorB and AandB, respectively. It may be appreciated that concurrent adder 310 may concurrently perform the calculation of each of the steps on all sections, i.e. equation 4 is calculated in a single step on all sections storing bits of operands A and B. In addition, equation 4 may be concurrently performed on all columns of associative memory array 320.
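  • Equations 4 and 5 applied to the example values of table 600 can be sketched as follows. Each list holds one row across sections 0 to 7, LSB first; the list names are assumptions for illustration. Each list comprehension stands in for what the text describes as a single concurrent step over all sections.

```python
a_row = [0, 1, 1, 0, 1, 1, 1, 0]  # A = 01110110, section 0 (LSB) first
b_row = [1, 1, 0, 1, 0, 1, 1, 1]  # B = 11101011, section 0 (LSB) first

a_or_b = [a | b for a, b in zip(a_row, b_row)]   # Equation 4, row AorB
a_and_b = [a & b for a, b in zip(a_row, b_row)]  # Equation 5, row AandB

print(a_or_b)   # -> [1, 1, 1, 1, 1, 1, 1, 1]
print(a_and_b)  # -> [0, 1, 0, 0, 0, 1, 1, 0]
```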
  • After calculating and storing values in rows AorB and AandB, concurrent adder 310 may calculate the carry out inside all groups in parallel, using the standard ripple carry formula of equation 6.

  • $C_{\text{out}} = A * B + C_{\text{in}} * (A + B) = \text{AandB} + (C_{\text{in}} * \text{AorB})$  Equation 6
  • The ripple carry inside a group may take M steps for a group of size M.
  • Since the carry in of all groups, except for the first one, is not known in advance, the ripple carry may be calculated inside each group twice. Predictor 314 may perform the first calculation under the prediction that the input carry of the group is 0 (Cgroup-in=0) and the second calculation under the prediction that the input carry of the group is 1 (Cgroup-in=1). Predictor 314 may store the calculated carry outs in dedicated rows in each section. Predictor 314 may store the carry value calculated assuming Cgroup-in=0 in row C0 and the carry value calculated assuming Cgroup-in=1 in row C1.
  • The standard ripple carry of equation 6 may be performed assuming Cgroup-in=0 in step 5 on the first section of each group, in step 6 on the second section of each group, in step 7 on the third section of each group and in step 8 on the fourth section of each group.
  • The standard ripple carry of equation 6 may be performed assuming Cgroup-in=1 in step 6 on the first section of each group, in step 7 on the second section of each group, in step 8 on the third section of each group and in step 9 on the fourth section of each group.
  • Thus, the two ripple carry operations may be performed in merely M+1 steps as concurrent adder 310 may start the calculation under the assumption of Cgroup-in=1, immediately after calculating the carry out of the first bit of the group using Cgroup-in=0 as the bits may be stored in different sections and a calculation may be done concurrently on any number of sections.
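  • The two predicted ripple carries can be sketched as below for the example of table 600. The function name `predict_group_carries` is an assumption for illustration. The sketch computes the contents of rows C0 and C1 only; it does not model the one-step stagger between the two calculations or the concurrency across groups.

```python
def predict_group_carries(a_or_b, a_and_b, group_size):
    """Return rows C0 and C1: per-section carry outs assuming group carry in 0 and 1."""
    c0, c1 = [], []
    for start in range(0, len(a_or_b), group_size):
        for assumed, row in ((0, c0), (1, c1)):
            carry = assumed
            for i in range(start, start + group_size):
                carry = a_and_b[i] | (carry & a_or_b[i])  # Equation 6
                row.append(carry)
    return c0, c1


# Rows AorB and AandB for A = 01110110 and B = 11101011, section 0 first.
a_or_b = [1, 1, 1, 1, 1, 1, 1, 1]
a_and_b = [0, 1, 0, 0, 0, 1, 1, 0]
c0, c1 = predict_group_carries(a_or_b, a_and_b, group_size=4)
print(c0)  # -> [0, 1, 1, 1, 0, 1, 1, 1]
print(c1)  # -> [1, 1, 1, 1, 1, 1, 1, 1]
```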
  • After the standard ripple carry is completed inside the groups, and rows C0 and C1 store values for all the bits of the group, concurrent adder 310 may perform a ripple carry between groups.
  • The Cgroup-in of the first group may be zero, if there is no carry in from a previous computation, and may be the carry out of a previous computation if the current computation is a step in a multi-step process, such as adding a 64 bit number using 4 rounds of concurrent adding of 16 bit numbers. Selector 316 may write, in step 9, the values of the correct row of the first nibble to row Cout according to the actual value of the Cin of the first group. Since the actual value of the Cgroup-in is known only once the carry out of the last bit of the previous group is calculated, selector 316 may select the relevant values of the carry bits for the group, i.e. from row C0 or row C1, after the Cgroup-out of the previous group is known. In the example, the Cgroup-out of the first group (the value stored in row C0 of section 3) is 1 and selector 316 may select row C1 of the second group as the actual values of the carry bits of nibble1. Selector 316 may then write the values of row C1 of the sections of nibble1 (sections 4, 5, 6 and 7) to row Cout of the relevant sections in step 10.
  • In a preferred embodiment of the present invention, selector 316 may choose the value for Cout of each group using equation 7.

  • $C_{\text{out}} = (C1 * C_{\text{prev-group-out}}) + (C0 * \text{NOT}(C_{\text{prev-group-out}}))$  Equation 7
  • The Cgroup-out of the first group is provided after M steps of a standard ripple carry adder (4 steps for a nibble as in the example of FIG. 6 ). The Cgroup-out of the next groups is provided after M+1 steps of a standard ripple carry adder, as it is calculated for both Cgroup-in=0 and Cgroup-in=1 that can start after the first bit of the group is calculated using Cgroup-in=0.
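  • The selection between rows C0 and C1 can be sketched as follows, using the row values of the 8-bit example. The function name `select_carries` and the parameter `first_carry_in` are assumptions for illustration; the per-section expression is the Boolean selection between C1 and C0 described above.

```python
def select_carries(c0, c1, group_size, first_carry_in=0):
    """Fill row Cout group by group, choosing C1 or C0 by the previous group's carry out."""
    c_out = []
    prev_group_out = first_carry_in
    for start in range(0, len(c0), group_size):
        for i in range(start, start + group_size):
            # (C1 AND prev) OR (C0 AND NOT prev)
            c_out.append((c1[i] & prev_group_out) | (c0[i] & (1 - prev_group_out)))
        prev_group_out = c_out[-1]  # carry out of the last section of this group
    return c_out


c0 = [0, 1, 1, 1, 0, 1, 1, 1]  # rows for A = 01110110, B = 11101011, section 0 first
c1 = [1, 1, 1, 1, 1, 1, 1, 1]
print(select_carries(c0, c1, group_size=4))  # -> [0, 1, 1, 1, 1, 1, 1, 1]
```

  • In this example the first group takes its values from row C0 and, since its carry out is 1, the second group takes its values from row C1.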
  • Once all values of the carry are known in all sections of all groups, summer 318 may concurrently compute, in step 11, the sum of all bits in all sections using equation 8.

  • $S = A \oplus B \oplus C_{\text{in}}$  Equation 8
  • where Cin is the Cout of the previous section.
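  • The final sum step can be sketched as below for the 8-bit example, with the carry in of each section taken from row Cout of the previous section. The function name `sum_row` and the parameter `first_carry_in` are assumptions for illustration.

```python
def sum_row(a_row, b_row, c_out, first_carry_in=0):
    """Compute row Sum for all sections from rows A, B and Cout (section 0 first)."""
    c_in = [first_carry_in] + c_out[:-1]  # Cin of section i is Cout of section i-1
    return [a ^ b ^ c for a, b, c in zip(a_row, b_row, c_in)]


a_row = [0, 1, 1, 0, 1, 1, 1, 0]  # A = 01110110
b_row = [1, 1, 0, 1, 0, 1, 1, 1]  # B = 11101011
c_out = [0, 1, 1, 1, 1, 1, 1, 1]  # selected carries for this pair of operands
s = sum_row(a_row, b_row, c_out)
print(s)  # -> [1, 0, 0, 0, 0, 1, 1, 0], i.e. 01100001 with overflow bit c_out[-1] = 1
```

  • Reading the result MSB first gives 1 0110 0001 (353), which equals 118 + 235.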
  • FIG. 7 , to which reference is now made, is a flow chart 700 describing the steps that concurrent adder 310 may perform for adding operands A and B. In step 710, concurrent adder 310 may store operands A and B in sections 330 of memory array 320. In step 720, concurrent adder 310 may concurrently compute the Boolean OR between bits of operands A and B in all sections 330. In step 730, concurrent adder 310 may concurrently compute the Boolean AND between bits of operands A and B in all sections 330. In steps 740, 742 and 744, predictor 314 may perform a ripple carry inside all groups in parallel, assuming Cgroup-in=0, and in steps 750, 752 and 754, predictor 314 may perform a ripple carry inside all groups in parallel, assuming Cgroup-in=1. In step 760, selector 316 may compute the carry out of the first group using Cgroup-in=0. In steps 770 and 780, selector 316 may compute the carry out of the next groups until the carry out of the last group is computed and the correct carry row may be selected for the actual Cout of the group. When the Cout of the last group is computed, summer 318 may compute the sum in step 790.
  • It may be appreciated by the person skilled in the art that the steps shown are not intended to be limiting and that the flow may be practiced with more or fewer steps, or with a different sequence of steps, or any combination thereof.
  • It may be appreciated that, for adding two 16-bit operands divided into four nibbles, concurrent adder 310 may perform the following procedures:
      • A. Calculate A+B (in parallel for all bits)
      • B. Calculate A*B (in parallel for all bits)
      • C. Calculate Cin (in parallel for all groups) (total 8 steps)
        • a. ripple carry inside nibble (total 5 steps)
          • i. nibble 0: nibble ripple carry Cin=0 (4 steps)
          • ii. nibbles 1-3: nibble ripple carry Cin=0 and Cin=1 (5 steps)
        • b. ripple carry between nibbles (3 steps)
      • D. calculate sum: S=A⊕B⊕Cin (in parallel for all bits)
  • It may be appreciated that concurrent adder 310 may perform the same steps for computing the sum of a 16 bit variable as in the example of the 8 bit numbers with 2 additional steps of “ripple carry between groups” for the third and fourth groups. It may also be appreciated that concurrent adder 310 may use a concurrent adder in two phases. In the first phase, the carry out of the least significant bits of the variables is calculated and the carry out of the last bit, or the overflow of the calculation, is an input carry in value used in the calculation of the most significant bits of the variables.
  • It may further be appreciated that the total carry ripple computation time may include a) the steps needed to perform a standard ripple carry inside a single group, equal to the number of bits M in the group (4 steps in a nibble in the example), b) the computation of a second standard ripple carry inside the group assuming another value of the Cin, that may take one additional step, as the computation for the first bit in a group may start immediately after the previous computation of that bit is completed, and c) number of groups minus 1 ripples between groups, as the Cout of each group needs to ripple to the next group. For example, the computation complexity of ripple carry when adding two 16 bit numbers divided into four nibbles (the size of each nibble is 4), may be 4+1+3=8, while the computation complexity using a standard ripple carry for the same computation may be 16.
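  • The step count above can be written as a small helper. The function name `carry_steps` is an assumption for illustration, and the sketch assumes the operand is divided into at least two groups of equal size.

```python
def carry_steps(n_bits, group_size):
    """Carry ripple steps for the grouped scheme: M + 1 + (number of groups - 1)."""
    groups = n_bits // group_size
    within_group = group_size    # (a) ripple inside a group for the first assumed carry in
    second_prediction = 1        # (b) second ripple, staggered by one step
    between_groups = groups - 1  # (c) carry out of each group ripples to the next
    return within_group + second_prediction + between_groups


print(carry_steps(16, 4))  # -> 8, compared with 16 for a standard ripple carry
print(carry_steps(8, 4))   # -> 6
```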
  • It may be appreciated that multi-bit concurrent adder 300 may concurrently perform multiple add operations on multiple pairs of operands stored in multiple columns of memory array 320, each pair stored in a different column. A complete add operation may be performed on a single column. Memory array 320 may comprise P columns and multi-bit concurrent adder 300 may concurrently operate on all columns, thereby performing P multi-bit add operations at a time.
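  • Column parallelism can be illustrated with a bit-sliced software model in which each row of a section is a Python integer and bit k of that integer belongs to column k, so that one Boolean operation on a row acts on all P columns. This packing, the function name `add_columns` and the operand pairs are assumptions for illustration. For brevity the sketch uses a plain ripple carry across sections rather than the group prediction described above.

```python
def add_columns(pairs, n_bits):
    """Add every (a, b) pair at once; bit k of each row integer belongs to column k."""
    # Rows A and B of section j: bit j of every operand, one column per bit position.
    row_a = [sum(((a >> j) & 1) << k for k, (a, _) in enumerate(pairs)) for j in range(n_bits)]
    row_b = [sum(((b >> j) & 1) << k for k, (_, b) in enumerate(pairs)) for j in range(n_bits)]
    carry = 0
    row_sum = []
    for j in range(n_bits):
        # Each Boolean operation below acts on all columns of section j simultaneously.
        row_sum.append(row_a[j] ^ row_b[j] ^ carry)
        carry = (row_a[j] & row_b[j]) | (carry & (row_a[j] | row_b[j]))
    results = []
    for k in range(len(pairs)):
        # Unpack column k; the final carry becomes the bit above the MSB.
        value = sum(((row_sum[j] >> k) & 1) << j for j in range(n_bits))
        value |= ((carry >> k) & 1) << n_bits
        results.append(value)
    return results


pairs = [(118, 235), (1, 2), (255, 255), (0, 0)]
print(add_columns(pairs, 8))  # -> [353, 3, 510, 0]
```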
  • While certain features of the invention have been illustrated and described herein, many modifications, substitutions, changes, and equivalents will now occur to those of ordinary skill in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the true spirit of the invention.

Claims (5)

What is claimed is:
1. A method for an associative memory device, the method comprising:
in parallel, performing multi-bit operations of P pairs of multi-bit operands stored in columns of a memory array,
wherein each pair of said P pairs is stored in a different column of said array and each operation of said multi-bit operations occurs in its associated different column,
wherein each bit i of each of said multi-bit operands of each of said P pairs is stored in a row of a section i in said column.
2. The method of claim 1 wherein said each multi-bit operation of said multi-bit operations comprises a plurality of per-section operations, and wherein each per-section operation comprises one or more Boolean operations between a plurality of bits stored in said section.
3. The method of claim 2 wherein said performing comprises concurrently performing said per-section operations on a plurality of sections.
4. The method of claim 1 wherein said multi-bit operation is a multi-bit add operation.
5. A system comprising:
a non-destructive associative memory array comprising a plurality of sections, each section j comprising cells arranged in rows and columns, to store a bit j from a first multi-bit number in a first row and a bit j from a second multi-bit number in a second row of a same column; and
a concurrent adder to, in parallel perform per-section operations in each section, wherein each per-section operation comprises one or more Boolean operations between a plurality of bits stored in rows of said section.
US18/337,086 2017-08-30 2023-06-19 Concurrent multi-bit adder Pending US20230333815A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/337,086 US20230333815A1 (en) 2017-08-30 2023-06-19 Concurrent multi-bit adder

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US15/690,301 US10402165B2 (en) 2017-08-30 2017-08-30 Concurrent multi-bit adder
US16/554,730 US10824394B2 (en) 2017-08-30 2019-08-29 Concurrent multi-bit adder
US17/086,506 US11681497B2 (en) 2017-08-30 2020-11-02 Concurrent multi-bit adder
US18/337,086 US20230333815A1 (en) 2017-08-30 2023-06-19 Concurrent multi-bit adder

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US17/086,506 Continuation US11681497B2 (en) 2017-08-30 2020-11-02 Concurrent multi-bit adder

Publications (1)

Publication Number Publication Date
US20230333815A1 true US20230333815A1 (en) 2023-10-19

Family

ID=65436849

Family Applications (4)

Application Number Title Priority Date Filing Date
US15/690,301 Active US10402165B2 (en) 2017-08-30 2017-08-30 Concurrent multi-bit adder
US16/554,730 Active US10824394B2 (en) 2017-08-30 2019-08-29 Concurrent multi-bit adder
US17/086,506 Active 2038-05-26 US11681497B2 (en) 2017-08-30 2020-11-02 Concurrent multi-bit adder
US18/337,086 Pending US20230333815A1 (en) 2017-08-30 2023-06-19 Concurrent multi-bit adder

Country Status (3)

Country Link
US (4) US10402165B2 (en)
KR (1) KR102341523B1 (en)
CN (1) CN109426483B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102305568B1 (en) * 2016-07-17 2021-09-27 쥐에스아이 테크놀로지 인코포레이티드 Finding k extreme values in constant processing time
US11669302B2 (en) * 2019-10-16 2023-06-06 Purdue Research Foundation In-memory bit-serial addition system
CN113342309B (en) * 2020-02-18 2023-09-15 芯立嘉集成电路(杭州)有限公司 Programmable nonvolatile arithmetic memory operator
US11200029B2 (en) * 2020-04-16 2021-12-14 Flashsilicon Incorporation Extendable multiple-digit base-2n in-memory adder device
KR20220007260A (en) 2020-07-10 2022-01-18 주식회사 엘지에너지솔루션 Negative electrode and lithium secondary battery with improved fast charging performance
US11755240B1 (en) * 2022-02-23 2023-09-12 Gsi Technology Inc. Concurrent multi-bit subtraction in associative memory

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ATE180586T1 (en) * 1990-11-13 1999-06-15 Ibm PARALLEL ASSOCIATIVE PROCESSOR SYSTEM
CN1164988C (en) * 2002-01-17 2004-09-01 北京大学 Structure and circuit of logarithmic skip adder
US7178080B2 (en) * 2002-08-15 2007-02-13 Texas Instruments Incorporated Hardware-efficient low density parity check code for digital communications
DE10305849B3 (en) * 2003-02-12 2004-07-15 Infineon Technologies Ag Carry-ripple adder for addition of bits of similar value has 3 inputs for input bits to be summated, carry inputs for carry bits, output for calculated sum bit and carry outputs for carry bits
US7966547B2 (en) * 2007-07-02 2011-06-21 International Business Machines Corporation Multi-bit error correction scheme in multi-level memory storage system
US20090254694A1 (en) * 2008-04-02 2009-10-08 Zikbit Ltd. Memory device with integrated parallel processing
US10832746B2 (en) 2009-07-16 2020-11-10 Gsi Technology Inc. Non-volatile in-memory computing device
US8238173B2 (en) 2009-07-16 2012-08-07 Zikbit Ltd Using storage cells to perform computation
US8236173B2 (en) 2011-03-10 2012-08-07 Kior, Inc. Biomass pretreatment for fast pyrolysis to liquids
KR101814558B1 (en) 2011-07-05 2018-01-30 수저우 시스케이프 바이오메디신 사이언스 앤드 테크놀로지 컴퍼니 리미티드 Use of salmonella flagellin derivative in preparation of drug for preventing and treating inflammatory bowel diseases
CN103324461B (en) * 2013-07-03 2015-12-23 刘杰 Four addend binary parallel synchronous addition devices
US9418719B2 (en) 2013-11-28 2016-08-16 Gsi Technology Israel Ltd. In-memory computational device
US10153042B2 (en) 2013-11-28 2018-12-11 Gsi Technology Inc. In-memory computational device with bit line processors
US9786335B2 (en) * 2014-06-05 2017-10-10 Micron Technology, Inc. Apparatuses and methods for performing logical operations using sensing circuitry
US9830999B2 (en) * 2014-06-05 2017-11-28 Micron Technology, Inc. Comparison operations in memory
US9455020B2 (en) * 2014-06-05 2016-09-27 Micron Technology, Inc. Apparatuses and methods for performing an exclusive or operation using sensing circuitry
US9556812B2 (en) 2014-08-22 2017-01-31 At&T Intellectual Property I, L.P. Methods, systems, and products for detection of environmental conditions
US9747961B2 (en) * 2014-09-03 2017-08-29 Micron Technology, Inc. Division operations in memory
CN110335633B (en) 2015-05-05 2024-01-12 Gsi科技公司 SRAM multi-cell operation
CN206162532U (en) * 2016-09-13 2017-05-10 广东电网有限责任公司电力科学研究院 Parallel arithmetic unit and concurrent operation system

Also Published As

Publication number Publication date
US20210081173A1 (en) 2021-03-18
US10402165B2 (en) 2019-09-03
US10824394B2 (en) 2020-11-03
CN109426483A (en) 2019-03-05
KR102341523B1 (en) 2021-12-21
US20190065148A1 (en) 2019-02-28
US11681497B2 (en) 2023-06-20
US20190384573A1 (en) 2019-12-19
CN109426483B (en) 2021-09-21
KR20190024701A (en) 2019-03-08

Legal Events

Date Code Title Description
AS Assignment

Owner name: GSI TECHNOLOGY INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:LAZER, MOSHE;REEL/FRAME:064220/0814

Effective date: 20170925

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION