CN115390789A - Magnetic tunnel junction calculation unit-based analog domain full-precision memory calculation circuit and method - Google Patents

Magnetic tunnel junction calculation unit-based analog domain full-precision memory calculation circuit and method

Info

Publication number
CN115390789A
CN115390789A CN202211033693.5A CN202211033693A CN115390789A CN 115390789 A CN115390789 A CN 115390789A CN 202211033693 A CN202211033693 A CN 202211033693A CN 115390789 A CN115390789 A CN 115390789A
Authority
CN
China
Prior art keywords
circuit
magnetic tunnel
tunnel junction
node
electrode
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211033693.5A
Other languages
Chinese (zh)
Inventor
崔佳乐
孙澜洋
蔡浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN202211033693.5A
Publication of CN115390789A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52 Multiplying; Dividing
    • G06F7/523 Multiplying only
    • G06F7/525 Multiplying only in serial-serial fashion, i.e. both operands being entered serially
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234 Power saving characterised by the action undertaken
    • G06F1/325 Power saving in peripheral device
    • G06F1/3275 Power saving in memory, e.g. RAM, cache
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50 Adding; Subtracting
    • G06F7/504 Adding; Subtracting in bit-serial fashion, i.e. having a single digit-handling circuit treating all denominations after each other
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C11/00 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor
    • G11C11/02 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using magnetic elements
    • G11C11/16 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using magnetic elements using elements in which the storage effect is based on magnetic spin effect
    • G11C11/165 Auxiliary circuits
    • G11C11/1653 Address circuits or decoders
    • G11C11/1655 Bit-line or column circuits
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C11/00 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor
    • G11C11/21 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements
    • G11C11/34 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices
    • G11C11/40 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices using transistors
    • G11C11/401 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices using transistors forming cells needing refreshing or charge regeneration, i.e. dynamic cells
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Analogue/Digital Conversion (AREA)

Abstract

The invention discloses an analog-domain full-precision in-memory computing circuit based on a 3-transistor 2-magnetic-tunnel-junction (3T2M) computing unit, and a corresponding method. The circuit comprises a magnetic random access memory (MRAM) computing array built from 3T2M computing units, a pulse generation circuit, a timing control circuit, an accumulation circuit, a multiplexer, an input-sensitive parallel analog-to-digital converter (Flash ADC), an enable signal generation circuit and a digital shift accumulator. In the in-memory computing mode, the 3T2M computing unit performs the multiplication inside the cell, two complementary magnetic tunnel junctions (MTJs) improve the yield of the computing unit, and the accumulation is performed by parallel transistors and a capacitor based on Kirchhoff's current law. Compared with a traditional von Neumann architecture accelerator and existing MRAM analog-domain in-memory computing architectures, the disclosed vector multiply-accumulate circuit adapts effectively to sparse vector-matrix multiply-accumulate operations, reduces power consumption overhead and improves circuit energy efficiency.

Description

Magnetic tunnel junction calculation unit-based analog domain full-precision memory calculation circuit and method
Technical Field
The invention belongs to the field of integrated circuit design, and particularly relates to an analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit, and to a circuit design method adapted to sparse vector-matrix multiply-accumulate operations.
Background
In recent years, the rapid development of artificial intelligence and deep neural networks has greatly promoted Internet of Things (IoT) applications. As computational complexity increases, massive amounts of data must be transferred between the central processing unit and the storage unit. The von Neumann architecture, with its separate memory and processor, on the one hand suffers from a mismatch between computation speed and data movement speed, and on the other hand runs into the "memory wall" problem: the power consumed by data movement far exceeds the power consumed by computation. This has become the main bottleneck in developing low-power IoT edge devices.
In-memory computing is one of the most promising ways to break the von Neumann bottleneck. The architecture retains the storage and read/write functions of the memory circuit while also performing operations such as multiply-accumulate, which reduces memory access power and the number of data transfers. Spin-transfer torque magnetic random access memory (STT-MRAM) offers high write endurance, non-volatility and compatibility with the CMOS (complementary metal oxide semiconductor) process, which makes it well suited as a medium for in-memory computing. In addition, software algorithms for IoT devices need to be carefully designed to exploit data sparsity, reducing computation energy and hardware resource overhead.
Currently, MRAM-based in-memory computing circuits can implement Boolean logic and multiply-accumulate operations. In a multiply-accumulate operation, the weight data must be read out and then multiplied and accumulated with the input activations. Designs that perform the multiplication outside the cell increase the energy overhead of the STT-MRAM storage array in sparse vector-matrix multiply-accumulate scenarios. In addition, low-bit analog-domain in-memory computing circuits mostly use a Flash analog-to-digital converter (ADC). This type of ADC is fast and flexible to design, but its power and area overheads are high, and it accounts for a major share of the power consumption of the in-memory computing architecture.
Disclosure of Invention
The invention aims to provide an analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit, in order to solve the technical problem of the high power consumption overhead of MRAM-based in-memory computing architectures in sparse vector-matrix multiply-accumulate scenarios, and to improve the energy efficiency of MRAM-based analog-domain in-memory computing circuits.
To solve the above technical problem, the specific technical scheme of the invention is as follows:
An analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit, characterized by comprising a 3-transistor 2-magnetic-tunnel-junction (3T2M) magnetic random access memory (MRAM) computing array, a pulse generation circuit, a timing control circuit, an accumulation circuit, a multiplexer, an input-sensitive parallel analog-to-digital converter (Flash ADC), an enable signal generation circuit and a digital shift accumulator;
the 3T2M computing array consists of computing units arranged in a matrix of K rows and L columns, each unit comprising 2 cross-coupled transistors, 1 series access transistor and 2 complementary magnetic tunnel junctions; in the in-memory computing mode, the complementary magnetic tunnel junctions perform the multiplication through the cross-coupled transistors and the series access transistor;
the pulse generation circuit converts the digital-domain input vector into analog pulse signals in the in-memory computing mode and feeds the pulses into the computing array;
the timing control circuit generates the control signals of the in-memory computing circuit;
the accumulation circuit comprises N groups of transistor-capacitor integration modules, each group comprising M PMOS transistors connected in parallel and 1 capacitor; in the in-memory computing mode, the total current depends on the number of PMOS transistors that are turned on, and it is converted into a voltage signal by charging the capacitor;
the multiplexer performs column selection in the in-memory computing mode so that the ADC can be shared among columns;
the input-sensitive parallel analog-to-digital converter quantizes the voltage signal output by the accumulation circuit in the in-memory computing mode to obtain the digital result of the multiply-accumulate operation;
the enable signal generation circuit counts the number of bits equal to 1 in the input vector in the in-memory computing mode and generates enable signals that control how many comparators in the Flash ADC are enabled;
the digital shift accumulator shifts the partial sums obtained for the individual bits of the multi-bit weights and input activations according to their bit significance and adds them to obtain the final multi-bit multiply-accumulate result.
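The shift-and-add step performed by the digital shift accumulator can be illustrated with a short behavioral sketch. It assumes unsigned operands and a bit-serial schedule in which one 1-bit partial sum is produced for every pair of input bit position i and weight bit position j; the function names and the 4-bit widths are illustrative assumptions, not taken from the patent.

```python
def bit(value, position):
    """Return the bit of `value` at `position` (LSB is position 0)."""
    return (value >> position) & 1


def shift_accumulate(partial_sums):
    """Combine 1-bit partial sums into a multi-bit result.

    partial_sums[i][j] is the multiply-accumulate result obtained with
    input bit i and weight bit j; it is weighted by 2**(i + j).
    """
    total = 0
    for i, row in enumerate(partial_sums):
        for j, partial in enumerate(row):
            total += partial << (i + j)
    return total


def bit_serial_mac(inputs, weights, in_bits=4, w_bits=4):
    """Assumed bit-serial schedule: one binary MAC per (i, j) bit pair."""
    partial_sums = [
        [sum(bit(x, i) & bit(w, j) for x, w in zip(inputs, weights))
         for j in range(w_bits)]
        for i in range(in_bits)
    ]
    return shift_accumulate(partial_sums)


print(bit_serial_mac([3, 0, 5, 7], [2, 9, 4, 1]))  # 33
```

The result equals the ordinary dot product because each product x*w expands into the sum of x_i*w_j*2^(i+j) over all bit pairs.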
Further, the 3T2M computing array comprises K rows and L columns of 3T2M computing units. Computing units in the same row share one word line WL, which carries the input vector activation; computing units in the same column share one bit line BL and one source line SL. Once the result of a 3T2M computing unit has settled, it is stored in a latch and sampled by the subsequent accumulation circuit.
Further, the 3T2M computing unit comprises:
a first NMOS transistor N1, whose gate is connected to node VD, whose drain is connected to the reference magnetic tunnel junction, and whose source is connected to the drain of transistor N3;
a second NMOS transistor N2, whose gate is connected to node VDB, whose drain is connected to the data magnetic tunnel junction, and whose source is connected to the drain of transistor N3;
a third NMOS transistor N3, whose gate is connected to the word line WL, whose drain is connected to the sources of transistors N1 and N2, and whose source is connected to the source line SL;
a first magnetic tunnel junction (the reference cell), one end of which is connected to the bit line BL and the other end of which is connected to the drain of transistor N1 and the gate of transistor N2;
and a second magnetic tunnel junction (the data cell), one end of which is connected to the bit line BL and the other end of which is connected to the gate of transistor N1 and the drain of transistor N2.
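At the logic level, a 1-bit multiplication is an AND of the input bit and the stored weight bit. The following is a minimal behavioral sketch of one computing unit under two assumptions that the text above does not state explicitly: the input bit drives the word line WL that gates N3, and the weight bit is the value encoded by the data junction relative to the reference junction. The function name and return convention are hypothetical.

```python
def cell_3t2m(input_bit, weight_bit):
    """Behavioral model of one 3T2M computing unit (illustrative only).

    Returns (product, conducts):
      product  - the 1-bit multiplication result, input AND weight
      conducts - whether the series access transistor N3 is on, i.e.
                 whether the cell draws any current in this cycle
    """
    access_on = input_bit == 1  # N3 is gated by the word line WL
    product = 1 if (access_on and weight_bit == 1) else 0
    return product, access_on


for x in (0, 1):
    for w in (0, 1):
        print(x, w, cell_3t2m(x, w))
```

In this model a zero input leaves N3 off, so the cell contributes neither a product nor a current path, which is one way to read the claim that the structure suits sparse inputs.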
Further, the accumulation circuit comprises 1 power gating switch S1, 1 capacitor reset switch S2 and N groups of long-channel PMOS transistor-capacitor modules, each group comprising M long-channel PMOS transistors connected in parallel and 1 summation capacitor CSUM. The circuit is connected as follows:
one end of the power gating switch S1 is connected to the power supply, and the other end is connected to the sources of the PMOS transistors;
the capacitor reset switch S2 is connected in parallel with CSUM;
one end of the summation capacitor CSUM is grounded, and the other end is connected to the drains of the M PMOS transistors;
the drains of the M PMOS transistors are tied together, their sources are tied together, and the common drain node serves as the data line DL.
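The current-to-voltage conversion in one transistor-capacitor group can be sketched with an idealized charge-integration model. It assumes that every conducting PMOS transistor behaves as an identical constant current source, that CSUM starts from 0 V after reset through S2, and that the result stays below the supply; the parameter names and numeric values are hypothetical and are not taken from the patent.

```python
def summed_voltage(n_on, i_unit, t_pulse, c_sum):
    """Ideal voltage on CSUM after one integration window.

    n_on    - number of PMOS transistors that are turned on
    i_unit  - current of one conducting transistor, in amperes (assumed equal)
    t_pulse - integration time, in seconds
    c_sum   - summation capacitance, in farads
    """
    charge = n_on * i_unit * t_pulse  # Kirchhoff's current law: currents add
    return charge / c_sum             # V = Q / C


# Hypothetical values: 1 uA per transistor, 1 ns pulse, 100 fF capacitor
v = summed_voltage(n_on=3, i_unit=1e-6, t_pulse=1e-9, c_sum=100e-15)
print(f"{v * 1e3:.1f} mV")  # 30.0 mV
```

In this ideal model the output voltage is proportional to the number of conducting transistors, which is what allows the ADC to recover the accumulated count.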
Further, the input-sensitive parallel analog-to-digital converter consists of 1 reference resistor ladder, M comparators and 1 encoder. Each comparator comprises 1 preamplifier and 1 latch-type comparator.
Further, the preamplifier comprises:
a first PMOS transistor P1, whose gate is connected to a bias voltage Vb, whose source is connected to the power supply, and whose drain is connected to the sources of transistors P2 and P3;
a second PMOS transistor P2, whose gate is connected to the positive input voltage, whose source is connected to the drain of transistor P1, and whose drain is connected to node AOUT-;
a third PMOS transistor P3, whose gate is connected to the negative input voltage, whose source is connected to the drain of transistor P1, and whose drain is connected to node AOUT+;
a fourth PMOS transistor P4, whose gate is connected to the enable signal AEN, whose source is connected to node AOUT-, and whose drain is connected to node AOUT+;
a first NMOS transistor N1, whose gate is connected to node AOUT-, whose source is connected to one end of switch S1, and whose drain is connected to node AOUT-;
a second NMOS transistor N2, whose gate is connected to node AOUT+, whose source is connected to one end of switch S1, and whose drain is connected to node AOUT-;
a third NMOS transistor N3, whose gate is connected to node AOUT-, whose source is connected to one end of switch S1, and whose drain is connected to node AOUT+;
a fourth NMOS transistor N4, whose gate is connected to node AOUT+, whose source is connected to one end of switch S1, and whose drain is connected to node AOUT+;
a first switch S1, one end of which is connected to the sources of transistors N1, N2, N3 and N4, and the other end of which is grounded.
Further, the latch-type comparator comprises:
a first PMOS transistor P5, whose gate is connected to the control signal SEN, whose source is connected to the power supply, and whose drain is connected to the sources of transistors P6 and P7;
a second PMOS transistor P6, whose gate is connected to node AOUT+, whose source is connected to the drain of transistor P5, and whose drain is connected to the source of transistor P8;
a third PMOS transistor P7, whose gate is connected to node AOUT-, whose source is connected to the drain of transistor P5, and whose drain is connected to the source of transistor P9;
a fourth PMOS transistor P8, whose gate is connected to node DOUT, whose source is connected to the drain of transistor P6, and whose drain is connected to node DOUTB;
a fifth PMOS transistor P9, whose gate is connected to node DOUTB, whose source is connected to the drain of transistor P7, and whose drain is connected to node DOUT;
a first NMOS transistor N5, whose gate is connected to the control signal SEN, whose source is grounded, and whose drain is connected to node DOUTB;
a second NMOS transistor N6, whose gate is connected to node DOUT, whose source is grounded, and whose drain is connected to node DOUTB;
a third NMOS transistor N7, whose gate is connected to node DOUTB, whose source is grounded, and whose drain is connected to node DOUT;
a fourth NMOS transistor N8, whose gate is connected to the control signal SEN, whose source is grounded, and whose drain is connected to node DOUT.
Further, the enable signal generation circuit comprises a full adder circuit and a logic gate circuit.
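A behavioral sketch of the enable signal generation may help here. It counts the ones in the input vector with a ripple chain of full adders and then enables only the lowest comparators. The mapping from the count to the enable word is an assumption (comparator k is taken to decide whether the result is at least k + 1, and a binary multiply-accumulate result cannot exceed the number of ones in the input); all function names are hypothetical.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return s, carry_out


def add_bit(count_bits, new_bit):
    """Add a single bit to a little-endian binary count using full adders."""
    carry = new_bit
    result = []
    for c in count_bits:
        s, carry = full_adder(c, 0, carry)
        result.append(s)
    return result


def count_ones(input_bits):
    """Count the ones in the input vector with a ripple of full adders."""
    width = max(1, len(input_bits).bit_length())
    count = [0] * width
    for b in input_bits:
        count = add_bit(count, b)
    return sum(value << k for k, value in enumerate(count))


def comparator_enables(input_bits, num_comparators):
    """Enable comparator k only if the result could reach level k + 1."""
    ones = count_ones(input_bits)
    return [1 if k < ones else 0 for k in range(num_comparators)]


print(comparator_enables([1, 0, 1, 1, 0, 0, 0, 0], 8))
# [1, 1, 1, 0, 0, 0, 0, 0]
```

With a sparse input, most comparators stay disabled in this model, which is where the power saving would come from.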
The analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit has the following advantages:
(1) The 3T2M computing unit performs the multiplication inside the cell using the complementary magnetic tunnel junctions and the series transistor, which effectively improves yield and reduces power consumption. Compared with the traditional 1T1M and 2T2M structures, the 3T2M structure adapts to sparse vector-matrix multiplication and improves computing energy efficiency.
(2) The PMOS transistors in the accumulation circuit use a long channel, which reduces the influence of channel-length modulation on the current, thereby mitigating the nonlinearity of analog-domain in-memory computing and improving computing accuracy.
(3) The input-sensitive parallel analog-to-digital converter and the enable signal generation circuit dynamically adjust the number of enabled comparators in the Flash ADC according to the input vector, which cuts unnecessary power consumption, reduces the ADC's share of the power of the whole in-memory computing architecture and improves computing energy efficiency.
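The reasoning behind advantage (2) can be made explicit with the standard first-order square-law model of a PMOS transistor in saturation. The expression is a textbook approximation added for illustration; it does not appear in the patent, and the symbols (mobility \mu_p, oxide capacitance C_{ox}, channel-length modulation coefficient \lambda) are the usual ones.

```latex
I_D = \frac{1}{2}\,\mu_p C_{ox}\,\frac{W}{L}\,\bigl(V_{SG} - |V_{TH}|\bigr)^2 \bigl(1 + \lambda V_{SD}\bigr),
\qquad \lambda \propto \frac{1}{L}
```

As the summation capacitor charges, V_{SD} of each conducting transistor falls, so the factor (1 + \lambda V_{SD}) makes the charging current depend on the output voltage. A longer channel L gives a smaller \lambda, which keeps the current closer to constant and the voltage closer to a linear function of the number of conducting transistors.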
Drawings
Fig. 1 is a block diagram of the analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit according to an embodiment of the invention;
Fig. 2 is a circuit diagram of the 3T2M computing unit in the circuit according to the embodiment;
Fig. 3 is a schematic diagram of the 3T2M multiplication logic in the circuit according to the embodiment;
Fig. 4 is a diagram of the computing array structure in the circuit according to the embodiment;
Fig. 5 is a circuit diagram of the input-sensitive parallel analog-to-digital converter in the circuit according to the embodiment;
Fig. 6 is a circuit diagram of a comparator of the input-sensitive parallel analog-to-digital converter in the circuit according to the embodiment;
Fig. 7 is a simulation plot of the output voltage of the accumulation circuit in the circuit according to the embodiment;
Fig. 8 is a timing diagram of the multi-bit computation in the circuit according to the embodiment;
Fig. 9 is a plot of the energy efficiency results of the circuit according to the embodiment.
Detailed Description
For a better understanding of the purpose, structure and function of the invention, the analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit is described in detail below with reference to the accompanying drawings.
An analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit comprises a 3-transistor 2-magnetic-tunnel-junction (3T2M) magnetic random access memory (MRAM) computing array, a pulse generation circuit, a timing control circuit, an accumulation circuit, a multiplexer, an input-sensitive parallel analog-to-digital converter (Flash ADC), an enable signal generation circuit and a digital shift accumulator.
The 3T2M computing array consists of computing units arranged in a matrix of K rows and L columns, each unit comprising 2 cross-coupled transistors, 1 series access transistor and 2 complementary magnetic tunnel junctions. In the in-memory computing mode, the complementary magnetic tunnel junctions perform the multiplication through the cross-coupled transistors and the series access transistor.
The pulse generation circuit converts the digital-domain input vector into analog pulse signals in the in-memory computing mode and feeds the pulses into the computing array;
the timing control circuit generates the control signals of the in-memory computing circuit;
the accumulation circuit comprises N groups of transistor-capacitor integration modules, each group comprising M PMOS transistors connected in parallel and 1 capacitor; in the in-memory computing mode, the total current depends on the number of PMOS transistors that are turned on, and it is converted into a voltage signal by charging the capacitor;
the multiplexer performs column selection in the in-memory computing mode so that the ADC can be shared among columns;
the input-sensitive parallel analog-to-digital converter quantizes the voltage signal output by the accumulation circuit in the in-memory computing mode to obtain the digital result of the multiply-accumulate operation;
the enable signal generation circuit counts the number of bits equal to 1 in the input vector in the in-memory computing mode and generates enable signals that control how many comparators in the Flash ADC are enabled;
the digital shift accumulator shifts the partial sums obtained for the individual bits of the multi-bit weights and input activations according to their bit significance and adds them to obtain the final multi-bit multiply-accumulate result.
The 3T2M computing array comprises K rows and L columns of 3T2M computing units. Computing units in the same row share one word line WL, which carries the input vector activation; computing units in the same column share one bit line BL and one source line SL. Once the result of a 3T2M computing unit has settled, it is stored in a latch and sampled by the subsequent accumulation circuit.
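The array organization can be pictured with a small behavioral sketch of a binary vector-matrix multiply-accumulate. It assumes that each of the K input bits drives one word line and that the products of the cells in one column are accumulated together; the text does not spell out this mapping, so it is an assumption, as are the function and variable names.

```python
def column_macs(input_bits, weight_matrix):
    """Binary vector-matrix multiply-accumulate over a K x L cell array.

    input_bits    - K input bits, one per word line (row)
    weight_matrix - K rows by L columns of stored weight bits
    Returns one accumulated count per column.
    """
    num_cols = len(weight_matrix[0])
    results = [0] * num_cols
    for x, row in zip(input_bits, weight_matrix):
        if x == 0:
            continue  # word line stays low: no cell in this row is activated
        for col, w in enumerate(row):
            results[col] += w  # product is 1 only where the weight is 1
    return results


weights = [
    [1, 0],
    [1, 1],
    [1, 1],
]
print(column_macs([1, 0, 1], weights))  # [2, 1]
```

Rows whose input bit is 0 are skipped entirely in this model, mirroring the way a sparse input vector leaves most word lines inactive.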
The 3T2M computing unit comprises:
a first NMOS transistor N1, whose gate is connected to node VD, whose drain is connected to the reference magnetic tunnel junction, and whose source is connected to the drain of transistor N3;
a second NMOS transistor N2, whose gate is connected to node VDB, whose drain is connected to the data magnetic tunnel junction, and whose source is connected to the drain of transistor N3;
a third NMOS transistor N3, whose gate is connected to the word line WL, whose drain is connected to the sources of transistors N1 and N2, and whose source is connected to the source line SL;
a first magnetic tunnel junction (the reference cell), one end of which is connected to the bit line BL and the other end of which is connected to the drain of transistor N1 and the gate of transistor N2;
and a second magnetic tunnel junction (the data cell), one end of which is connected to the bit line BL and the other end of which is connected to the gate of transistor N1 and the drain of transistor N2.
The accumulation circuit comprises 1 power gating switch S1, 1 capacitor reset switch S2 and N groups of long-channel PMOS transistor-capacitor modules, each group comprising M long-channel PMOS transistors connected in parallel and 1 summation capacitor CSUM. The circuit is connected as follows:
one end of the power gating switch S1 is connected to the power supply, and the other end is connected to the sources of the PMOS transistors;
the capacitor reset switch S2 is connected in parallel with the summation capacitor CSUM;
one end of the summation capacitor CSUM is grounded, and the other end is connected to the drains of the M PMOS transistors;
the drains of the M PMOS transistors are tied together, their sources are tied together, and the common drain node serves as the data line DL.
The input-sensitive parallel analog-to-digital converter consists of 1 reference resistor ladder, M comparators and 1 encoder. Each comparator comprises 1 preamplifier and 1 latch-type comparator.
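The conversion performed by a reference ladder, a bank of comparators and an encoder can be sketched behaviorally as follows. The sketch assumes uniformly spaced thresholds placed halfway between adjacent accumulation levels, a thermometer-code output and an encoder that simply counts the ones; these choices, the per-comparator enable list and all names are illustrative assumptions rather than details from the patent.

```python
def flash_adc(v_in, v_step, enables):
    """Behavioral flash ADC with per-comparator enable bits.

    v_in    - input voltage from the accumulation circuit, in volts
    v_step  - assumed voltage difference between adjacent result levels
    enables - one enable bit per comparator; a disabled comparator outputs 0
    Returns (code, thermometer), where code is the number of comparators
    that tripped.
    """
    thermometer = [
        1 if (enabled and v_in > (k + 0.5) * v_step) else 0
        for k, enabled in enumerate(enables)
    ]
    code = sum(thermometer)  # encoder: thermometer code to binary count
    return code, thermometer


# A 30 mV input with 10 mV per level resolves to code 3
print(flash_adc(0.03, 0.01, [1] * 8))
# (3, [1, 1, 1, 0, 0, 0, 0, 0])
```

Disabling the comparators whose thresholds lie above the largest possible result does not change the code in this model, since those comparators would have output 0 anyway.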
The preamplifier comprises:
a first PMOS transistor P1, whose gate is connected to a bias voltage Vb, whose source is connected to the power supply, and whose drain is connected to the sources of transistors P2 and P3;
a second PMOS transistor P2, whose gate is connected to the positive input voltage, whose source is connected to the drain of transistor P1, and whose drain is connected to node AOUT-;
a third PMOS transistor P3, whose gate is connected to the negative input voltage, whose source is connected to the drain of transistor P1, and whose drain is connected to node AOUT+;
a fourth PMOS transistor P4, whose gate is connected to the enable signal AEN, whose source is connected to node AOUT-, and whose drain is connected to node AOUT+;
a first NMOS transistor N1, whose gate is connected to node AOUT-, whose source is connected to one end of switch S1, and whose drain is connected to node AOUT-;
a second NMOS transistor N2, whose gate is connected to node AOUT+, whose source is connected to one end of switch S1, and whose drain is connected to node AOUT-;
a third NMOS transistor N3, whose gate is connected to node AOUT-, whose source is connected to one end of switch S1, and whose drain is connected to node AOUT+;
a fourth NMOS transistor N4, whose gate is connected to node AOUT+, whose source is connected to one end of switch S1, and whose drain is connected to node AOUT+;
a first switch S1, one end of which is connected to the sources of transistors N1, N2, N3 and N4, and the other end of which is grounded.
The latch type comparator includes:
a first PMOS transistor P5, whose gate is connected to the control signal SEN, whose source is connected to the power supply, and whose drain is connected to the sources of the transistors P6 and P7;
a second PMOS transistor P6, whose gate is connected to the node AOUT+, whose source is connected to the drain of P5, and whose drain is connected to the source of P8;
a third PMOS transistor P7, whose gate is connected to the node AOUT-, whose source is connected to the drain of P5, and whose drain is connected to the source of P9;
a fourth PMOS transistor P8, whose gate is connected to the node DOUT, whose source is connected to the drain of P6, and whose drain is connected to the node DOUTB;
a fifth PMOS transistor P9, whose gate is connected to the node DOUTB, whose source is connected to the drain of P7, and whose drain is connected to the node DOUT;
a first NMOS transistor N5, whose gate is connected to the control signal SEN, whose source is grounded, and whose drain is connected to the node DOUTB;
a second NMOS transistor N6, whose gate is connected to the node DOUT, whose source is grounded, and whose drain is connected to the node DOUTB;
a third NMOS transistor N7, whose gate is connected to the node DOUTB, whose source is grounded, and whose drain is connected to the node DOUT;
a fourth NMOS transistor N8, whose gate is connected to the control signal SEN, whose source is grounded, and whose drain is connected to the node DOUT.
The enable signal generating circuit comprises a full adder circuit and a logic gate circuit.
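The text says only that the enable signal generator is built from full adders and logic gates and that it counts the '1' bits of the input vector. The behavioural sketch below is one way such a block could work, under stated assumptions: the ripple-counter arrangement of the full adders, the rule that the lowest n comparators are enabled when n bits are '1', and all function names are hypothetical and are not taken from the patent.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def count_ones(wl_bits, width=4):
    """Count the '1' bits by rippling each input bit into a small
    counter made of full adders (assumed arrangement)."""
    count = [0] * width  # little-endian counter bits
    for bit in wl_bits:
        carry = bit
        for k in range(width):
            count[k], carry = full_adder(count[k], 0, carry)
    return sum(c << k for k, c in enumerate(count))


def comparator_enables(wl_bits, n_comparators=8):
    """Assumed enable rule: with n active word lines the result cannot
    exceed n, so only the lowest n comparators are switched on."""
    n = count_ones(wl_bits)
    return [1 if k < n else 0 for k in range(n_comparators)]
```

With a sparse input vector such as `[1, 0, 0, 0, 0, 0, 0, 1]`, only two of the eight comparators would be enabled in this model.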
Examples
The invention relates to an analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit, which comprises a magnetic random access memory (MRAM) computing array of 3-transistor, 2-magnetic-tunnel-junction (3T2M) cells, a pulse generating circuit, a timing control circuit, an accumulation circuit, a multiplexer, an input-sensitive parallel analog-to-digital converter (Flash ADC), an enable signal generating circuit and a digital shift accumulator.
The in-memory computing architecture shown in FIG. 1 is organized as follows. The 3T2M computing array consists of computing units arranged in a matrix of K rows and L columns, each unit comprising 2 cross-coupled transistors, 1 series access transistor and 2 complementary magnetic tunnel junctions. In the in-memory computing mode, the complementary magnetic tunnel junctions implement the multiplication operation through the cross-coupled transistors and the series access transistor; the pulse generating circuit converts the input vector into pulse signals of fixed pulse width and feeds them into the computing array; the timing control circuit generates control signals and controls the enable time of each module; the accumulation circuit connects a plurality of transistors in parallel so that, by Kirchhoff's current law, their currents sum and charge a summing capacitor, converting the current signal into a voltage signal; the multiplexer implements column selection in the in-memory computing mode so that the ADC is shared among columns; the input-sensitive parallel analog-to-digital converter quantizes the voltage signal output by the accumulation circuit to obtain the digital result of the multiply-accumulate operation; the enable signal generating circuit counts the number of bits equal to '1' in the input vector and generates enable signals that control how many comparators in the Flash ADC are enabled; the digital shift accumulator weights the partial sums obtained for the individual bits of the multi-bit weights and input excitations and then adds them to obtain the final multi-bit multiply-accumulate result.
In this embodiment, an 8 × 1 weight matrix is taken as the example (K = 8, L = 1), implementing a multiply-accumulate operation of eight 2-bit input values IN and eight 1-bit weight values W according to the formula:
Figure BDA0003818052070000081
The input value IN in equation (1) is mapped in the 3T2M computing array disclosed in the present invention as:
Figure BDA0003818052070000091
In equation (2), IN_{i,0} and IN_{i,1} represent the high and low bits of the input value, respectively. The embodiment adopts a bit-serial input strategy: during computation, the high bits are first fed into the computing array to obtain a partial sum, then the low bits are fed in to obtain a second partial sum, and finally the two partial sums are added to obtain the final multiply-accumulate result.
The weight value W in formula (1) is mapped in the 3T2M computational array disclosed in the present invention as:
Figure BDA0003818052070000092
The weight values in equation (3) are stored in matrix form in the 3T2M computing units of the computing array disclosed herein.
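The equation images are not reproduced in this text. For orientation, the arithmetic implied by the surrounding description can be written out as below. This is a hedged reconstruction from the prose and not a copy of the original equations: the index range i = 1..8, the unsigned binary encoding of the inputs and the symbol MAC are assumptions, while the convention that IN_{i,0} is the high bit follows the text.

```latex
\[
\begin{aligned}
\mathrm{MAC} &= \sum_{i=1}^{8} IN_i \, W_i, \qquad IN_i \in \{0,1,2,3\},\ W_i \in \{0,1\},\\
IN_i &= 2\,IN_{i,0} + IN_{i,1},\\
\mathrm{MAC} &= 2\sum_{i=1}^{8} IN_{i,0}\,W_i \;+\; \sum_{i=1}^{8} IN_{i,1}\,W_i .
\end{aligned}
\]
```

The last line is the form the serial-input strategy evaluates: one partial sum per input bit plane, with the high-bit partial sum weighted by 2.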
FIG. 2 is a circuit diagram of the 3T2M computing unit in the analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit according to an embodiment of the present invention. The right magnetic tunnel junction is the data cell, which stores the weight information, and the left magnetic tunnel junction is the reference cell, whose state is opposite to that of the right magnetic tunnel junction. The two cross-coupled transistors and the series access transistor implement the multiplication operation and output the multiplication result at the V_D node. When the input excitation is '1' (WL = '1') and the data-cell magnetic tunnel junction is in the antiparallel state (high-resistance state), the voltage drop across the magnetic tunnel junction is large and the V_D node voltage is low, so the N1 transistor is cut off; the feedback action then increases the conduction of the N2 transistor, and the V_D node voltage approaches 0. The remaining cases can be analyzed in the same way.
FIG. 3 shows the multiplication logic of the 3T2M computing unit in the analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit according to an embodiment of the present invention. The different input and weight configurations implement multiplication based on AND logic. When the input is '0' (the gate of the N3 transistor is '0'), N3 is turned off and the V_D node is at the high level V_H regardless of the stored weight. When the input is '1' (the gate of the N3 transistor is '1'), N3 is turned on and the V_D node is at the high level (V_H) when the weight stored in the data cell is '0' and at the low level (V_L) when it is '1'.
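The multiplication logic just described can be captured in a small truth-table model. The sketch below is purely behavioural; the function names and the string labels for the two levels are illustrative assumptions. It encodes only what the text states: V_D is low exactly when both the input and the stored weight are '1'.

```python
V_H = "V_H"  # high level at the V_D node
V_L = "V_L"  # low level at the V_D node


def cell_output(inp, weight):
    """Logic level at V_D for a 1-bit input and a 1-bit stored weight."""
    if inp == 0:
        return V_H  # N3 is off: V_D stays high whatever the weight is
    return V_L if weight == 1 else V_H


def product(inp, weight):
    """The 1-bit product is encoded as an active-low V_D."""
    return 1 if cell_output(inp, weight) == V_L else 0


if __name__ == "__main__":
    for inp in (0, 1):
        for weight in (0, 1):
            print(inp, weight, cell_output(inp, weight), product(inp, weight))
```

The active-low encoding matters for the next stage, where a low V_D turns on a PMOS transistor in the accumulation circuit.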
FIG. 4 shows the 3T2M computing array and the accumulation circuit in the analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit according to an embodiment of the present invention. The part with the white background is the computing unit array, and the part with the dark background is the accumulation circuit. The computing units in each column share one source line SL, one bit line BL and one data line DL, and the computing units in each row share one word line WL. The embodiment considers only operations on a single column of computing units, i.e. 1-bit weight data. The accumulation circuit comprises 8 PMOS transistors connected in parallel, 1 summing capacitor C_SUM, and two switches S1 and S2. Before summation begins, S2 is closed to reset the summing capacitor, so that the V_SUM node is reset to zero potential. After the multiplication is finished, S1 is closed and S2 is opened, and the multiplication results of the 8 units are applied to the gates of the 8 PMOS transistors, controlling their conduction states individually. The charging current is determined by the multiply-accumulate result: the larger the multiply-accumulate value, the more PMOS transistors conduct, the larger the charging current, and the higher the voltage at the V_SUM node.
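A first-order model of this charge accumulation is sketched below. It assumes each conducting PMOS transistor supplies the same constant current for a fixed charging window, so that V_SUM = n_on · I · t / C_SUM. The unit current, charging time and capacitance are hypothetical placeholder values chosen for illustration and do not come from the patent; a real transistor would not behave as an ideal current source.

```python
def v_sum(v_d_is_low, i_unit=10e-6, t_charge=0.5e-9, c_sum=100e-15):
    """Idealized summing-capacitor voltage after one charging window.

    v_d_is_low : iterable of booleans, True where a cell's V_D node is low
                 (that cell's PMOS transistor conducts).
    i_unit     : assumed current per conducting PMOS transistor, in amperes.
    t_charge   : assumed time S1 stays closed, in seconds.
    c_sum      : assumed summing capacitance, in farads.
    """
    n_on = sum(1 for low in v_d_is_low if low)
    return n_on * i_unit * t_charge / c_sum
```

With these placeholder values each conducting transistor adds 50 mV, so the nine possible multiply-accumulate results 0 to 8 map to evenly spaced voltages from 0 V to 0.4 V.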
FIG. 5 shows the input-sensitive parallel analog-to-digital converter circuit in the analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit according to an embodiment of the present invention. The circuit comprises a series resistor chain, 8 comparators, 1 encoder and 1 enable signal generator. The series resistor chain generates linearly spaced reference voltages, which serve as the negative-terminal inputs of the comparators. The output voltage V_SUM of the accumulation circuit is compared against these references, and the comparator outputs are encoded to give the digital value of the multiply-accumulate result. Each comparator consists of a preamplifier and a latch-type comparator, whose enable states are controlled by two separate signals: the preamplifier enable signal AEN and the latch-type comparator enable signal SEN. The enable signal generating circuit determines how many of the 8 comparators are enabled according to the number of '1' bits on the word lines (WL) of the input vector, so that fewer comparators operate when input sparsity is high and energy consumption is reduced.
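The quantization step can be illustrated with a behavioural Flash ADC model. Several details below are assumptions rather than statements from the patent: the reference voltages are placed halfway between adjacent ideal V_SUM levels, the step size `v_step` is a placeholder, disabled comparators are treated as outputting 0, and the enabled comparators are taken to be the lowest ones.

```python
def flash_adc(v_sum, n_enabled, v_step=0.05, n_comparators=8):
    """Return the digital code for v_sum using only the enabled comparators.

    Comparator k (k = 0..n_comparators-1) has an assumed reference of
    (k + 0.5) * v_step. Summing the thermometer-code outputs gives the code.
    """
    code = 0
    for k in range(n_comparators):
        if k >= n_enabled:
            break  # comparator is disabled and contributes 0
        if v_sum > (k + 0.5) * v_step:
            code += 1
    return code
```

If at most n word lines are '1', the multiply-accumulate result cannot exceed n, so in this model enabling only the lowest n comparators yields the same code as enabling all 8: `flash_adc(0.15, 3)` and `flash_adc(0.15, 8)` both return 3.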
FIG. 6 is a circuit diagram of the comparator of the input-sensitive parallel analog-to-digital converter in the analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit according to an embodiment of the present invention. The preamplifier circuit is on the left and the latch-type comparator circuit is on the right. The preamplifier amplifies the voltage difference between the positive and negative input signals through the feedback action of the N2 and N3 transistors, and its outputs are connected to the gates of P6 and P7 of the latch-type comparator. When SEN = 1, the DOUT and DOUTB nodes of the latch-type comparator are reset to the low level. When SEN = 0, the two nodes are charged through the P7 and P6 transistors respectively, and the voltages of AOUT+ and AOUT- determine the charging speeds. If AOUT+ > AOUT-, DOUT charges faster, the P8 transistor is turned off and the N6 transistor is turned on, so DOUTB is pulled down to the low level and DOUT = 1 is latched; this value is the output of the comparator.
FIG. 7 shows a simulation of the output voltage of the accumulation circuit in the analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit according to an embodiment of the present invention. The S1 switch of the accumulation circuit is closed at 0.3 ns and the summing capacitor is charged through the parallel PMOS transistors; after 0.8 ns the S1 switch is opened and the voltage on the summing capacitor is held, which facilitates sampling by the subsequent Flash ADC circuit. The parallel PMOS transistors are long-channel devices, which reduces the current nonlinearity caused by the channel-length modulation effect.
FIG. 8 is a timing diagram of a 2-bit-input, 1-bit-weight computation in the analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit according to an embodiment of the present invention. In the first clock cycle, the high bits of the input excitation are fed into the computing array to obtain the high-bit partial sum. In the second clock cycle, the low bits of the input excitation are fed into the computing array to obtain the low-bit partial sum, while the high-bit partial sum is shifted left by 1 bit to apply its weight. The final multiply-accumulate result is output on the rising edge of the third clock cycle.
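The three-cycle sequence can be mirrored in a few lines of Python. This is a functional sketch only: the analog array, accumulation circuit and ADC are replaced by an exact integer partial sum, and the function names are illustrative assumptions.

```python
def partial_sum(bits, weights):
    """Exact 1-bit multiply-accumulate that the array plus ADC would digitize."""
    return sum(b & w for b, w in zip(bits, weights))


def mac_2bit_serial(inputs, weights):
    """Bit-serial MAC of 2-bit unsigned inputs with 1-bit weights."""
    high = [(x >> 1) & 1 for x in inputs]  # high bit of each input
    low = [x & 1 for x in inputs]          # low bit of each input

    acc = partial_sum(high, weights)               # cycle 1: high-bit partial sum
    acc = (acc << 1) + partial_sum(low, weights)   # cycle 2: shift left by 1, add low-bit partial sum
    return acc                                     # cycle 3: result is output


if __name__ == "__main__":
    inputs = [3, 0, 2, 1, 1, 3, 0, 2]
    weights = [1, 1, 0, 1, 0, 1, 1, 0]
    direct = sum(x * w for x, w in zip(inputs, weights))
    print(mac_2bit_serial(inputs, weights), direct)
```

For this example the high-bit partial sum is 2 and the low-bit partial sum is 3, so the shift-and-add gives 2·2 + 3 = 7, matching the direct dot product.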
FIG. 9 shows the energy efficiency results of the analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit according to an embodiment of the present invention. Compared with a conventional digital-domain in-memory computing circuit, the proposed circuit achieves markedly higher energy efficiency, and the advantage is most pronounced when input sparsity is high.
It is to be understood that the present invention has been described with reference to certain embodiments, and that various changes in the features and embodiments, or equivalent substitutions may be made therein by those skilled in the art without departing from the spirit and scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from the essential scope thereof. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed, but that the invention will include all embodiments falling within the scope of the appended claims.

Claims (10)

1. An analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit, characterized by comprising a 3-transistor, 2-magnetic-tunnel-junction (3T2M) computing array, a pulse generating circuit, a timing control circuit, an accumulation circuit, a multiplexer, an input-sensitive parallel analog/digital converter, an enable signal generating circuit and a digital shift accumulator;
the 3T2M computing array is a matrix of K rows and L columns of computing units, each comprising 2 cross-coupled transistors, 1 series access transistor and 2 complementary magnetic tunnel junctions; in the in-memory computing mode, the complementary magnetic tunnel junctions implement the multiplication operation through the cross-coupled transistors and the series access transistor;
the pulse generating circuit converts the digital-domain input vector into analog pulse signals in the in-memory computing mode and feeds the pulses into the computing array;
the timing control circuit generates the control signals of the in-memory computing circuit;
the accumulation circuit comprises N groups of transistor-capacitor integration modules, each group comprising M PMOS transistors connected in parallel and 1 capacitor; in the in-memory computing mode, the total current depends on the number of conducting PMOS transistors and is converted into a voltage signal by charging the capacitor;
the multiplexer implements column selection in the in-memory computing mode so that the ADC is multiplexed;
the input-sensitive parallel analog/digital converter quantizes the voltage signal output by the accumulation circuit in the in-memory computing mode to obtain the digital result of the multiply-accumulate operation;
the enable signal generating circuit counts the number of bits equal to '1' in the input vector in the in-memory computing mode and generates enable signals that control how many comparators in the input-sensitive parallel analog/digital converter are enabled;
the digital shift accumulator weights the partial sums of the multi-bit weights and input excitations and then adds them to obtain the final multi-bit multiply-accumulate result.
2. The analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit as claimed in claim 1, wherein the computing units in the same row share one word line WL, which is connected to the input vector excitation; the computing units in the same column share one bit line BL and one source line SL; and the results of the computing units, once stable, are stored by latches and sampled by the subsequent accumulation circuit.
3. The analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit of claim 1, wherein the 3T2M computing unit comprises:
a first NMOS transistor N1, whose gate is connected to the node V_D, whose drain is connected to the first magnetic tunnel junction reference cell, and whose source is connected to the drain of the third NMOS transistor N3;
a second NMOS transistor N2, whose gate is connected to the node V_DB, whose drain is connected to the second magnetic tunnel junction data cell, and whose source is connected to the drain of the third NMOS transistor N3;
a third NMOS transistor N3, whose gate is connected to the word line WL, whose drain is connected to the sources of the first and second NMOS transistors N1 and N2, and whose source is connected to the source line SL;
the first magnetic tunnel junction reference cell, one end of which is connected to the bit line BL and the other end of which is connected to the drain of the first NMOS transistor N1 and the gate of the second NMOS transistor N2;
the second magnetic tunnel junction data cell, one end of which is connected to the bit line BL and the other end of which is connected to the gate of the first NMOS transistor N1 and the drain of the second NMOS transistor N2.
4. The analog-domain full-precision in-memory computing circuit based on a magnetic tunnel junction computing unit of claim 1, wherein the accumulation circuit comprises 1 power gating switch S1, 1 capacitor reset switch S2 and N groups of long-channel PMOS transistor-capacitor modules, each group comprising M parallel-connected long-channel PMOS transistors and 1 summing capacitor C_SUM.
5. The magnetic tunnel junction computing unit-based analog domain full-precision memory computing circuit according to claim 4, wherein the circuit structure of the accumulation circuit comprises:
the power gating switch S1, one end of which is connected to the power supply and the other end of which is connected to the sources of the PMOS transistors;
the capacitor reset switch S2, which is connected in parallel with the summing capacitor C_SUM;
the summing capacitor C_SUM, one end of which is grounded and the other end of which is connected to the drains of the M PMOS transistors;
the drain electrodes of the M PMOS tubes are connected, the source electrodes are connected, and the drain electrodes are used as data lines DL.
6. The magnetic tunnel junction computing unit-based analog domain full-precision memory computing circuit according to claim 1, wherein the input-sensitive parallel analog/digital converter is composed of 1 reference resistor chain, M comparators and 1 encoder; the comparator includes 1 preamplifier and 1 latch-type comparator.
7. The magnetic tunnel junction calculation unit based analog domain full precision memory calculation circuit of claim 6, wherein the preamplifier comprises:
a first PMOS transistor P1, whose gate is connected to a bias voltage V_b, whose source is connected to the power supply, and whose drain is connected to the sources of a second PMOS transistor P2 and a third PMOS transistor P3;
the second PMOS transistor P2, whose gate is connected to the positive input voltage, whose source is connected to the drain of the first PMOS transistor P1, and whose drain is connected to the node AOUT-;
the third PMOS transistor P3, whose gate is connected to the negative input voltage, whose source is connected to the drain of the first PMOS transistor P1, and whose drain is connected to the node AOUT+;
a fourth PMOS transistor P4, whose gate is connected to the enable signal AEN, whose source is connected to the node AOUT-, and whose drain is connected to the node AOUT+;
a first NMOS transistor N1, whose gate is connected to the node AOUT-, whose source is connected to one end of the switch S1, and whose drain is connected to the node AOUT-;
a second NMOS transistor N2, whose gate is connected to the node AOUT+, whose source is connected to one end of the switch S1, and whose drain is connected to the node AOUT-;
a third NMOS transistor N3, whose gate is connected to the node AOUT-, whose source is connected to one end of the switch S1, and whose drain is connected to the node AOUT+;
a fourth NMOS transistor N4, whose gate is connected to the node AOUT+, whose source is connected to one end of the switch S1, and whose drain is connected to the node AOUT+;
a first switch S1, one end of which is connected to the sources of the first NMOS transistor N1, the second NMOS transistor N2, the third NMOS transistor N3 and the fourth NMOS transistor N4, and the other end of which is grounded, the switch being turned on or off under the control of the enable signal AEN.
8. The magnetic tunnel junction calculation unit based analog domain full precision memory calculation circuit of claim 6, wherein the latch type comparator comprises:
a first PMOS transistor P5, whose gate is connected to the control signal SEN, whose source is connected to the power supply, and whose drain is connected to the sources of the transistors P6 and P7;
a second PMOS transistor P6, whose gate is connected to the node AOUT+, whose source is connected to the drain of the first PMOS transistor P5, and whose drain is connected to the source of the transistor P8;
a third PMOS transistor P7, whose gate is connected to the node AOUT-, whose source is connected to the drain of the first PMOS transistor P5, and whose drain is connected to the source of the transistor P9;
a fourth PMOS transistor P8, whose gate is connected to the node DOUT, whose source is connected to the drain of the second PMOS transistor P6, and whose drain is connected to the node DOUTB;
a fifth PMOS transistor P9, whose gate is connected to the node DOUTB, whose source is connected to the drain of the third PMOS transistor P7, and whose drain is connected to the node DOUT;
a first NMOS transistor N5, whose gate is connected to the control signal SEN, whose source is grounded, and whose drain is connected to the node DOUTB;
a second NMOS transistor N6, whose gate is connected to the node DOUT, whose source is grounded, and whose drain is connected to the node DOUTB;
a third NMOS transistor N7, whose gate is connected to the node DOUTB, whose source is grounded, and whose drain is connected to the node DOUT;
a fourth NMOS transistor N8, whose gate is connected to the control signal SEN, whose source is grounded, and whose drain is connected to the node DOUT.
9. The magnetic tunnel junction calculation unit based analog domain full-precision memory calculation circuit of claim 1, wherein the enable signal generation circuit comprises a full adder circuit and a logic gate circuit.
10. An analog-domain full-precision in-memory computing method based on the magnetic tunnel junction computing unit-based circuit of any one of claims 1-9, wherein, in the in-memory computing mode, the complementary magnetic tunnel junctions implement the multiplication operation through the cross-coupled transistors and the series access transistor; the pulse generating circuit converts the input vector into pulse signals of fixed pulse width and feeds them into the computing array; the timing control circuit generates control signals and controls the enable time of each module; the accumulation circuit connects a plurality of transistors in parallel so that, by Kirchhoff's current law, their currents sum and charge a summing capacitor, converting the current signal into a voltage signal; the multiplexer implements column selection in the in-memory computing mode so that the ADC is multiplexed; the input-sensitive parallel analog/digital converter quantizes the voltage signal output by the accumulation circuit to obtain the digital result of the multiply-accumulate operation; the enable signal generating circuit counts the number of bits equal to '1' in the input vector and generates enable signals that control how many comparators in the Flash ADC are enabled; the digital shift accumulator weights the partial sums of the multi-bit weights and input excitations and then adds them to obtain the final multi-bit multiply-accumulate result.
CN202211033693.5A 2022-08-26 2022-08-26 Magnetic tunnel junction calculation unit-based analog domain full-precision memory calculation circuit and method Pending CN115390789A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211033693.5A CN115390789A (en) 2022-08-26 2022-08-26 Magnetic tunnel junction calculation unit-based analog domain full-precision memory calculation circuit and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211033693.5A CN115390789A (en) 2022-08-26 2022-08-26 Magnetic tunnel junction calculation unit-based analog domain full-precision memory calculation circuit and method

Publications (1)

Publication Number Publication Date
CN115390789A true CN115390789A (en) 2022-11-25

Family

ID=84122306

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211033693.5A Pending CN115390789A (en) 2022-08-26 2022-08-26 Magnetic tunnel junction calculation unit-based analog domain full-precision memory calculation circuit and method

Country Status (1)

Country Link
CN (1) CN115390789A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115756388A (en) * 2023-01-06 2023-03-07 上海后摩智能科技有限公司 Multi-mode storage and calculation integrated circuit, chip and calculation device
CN115794728A (en) * 2022-11-28 2023-03-14 北京大学 Memory computing bit line clamping and summing peripheral circuit and application thereof
CN116070685A (en) * 2023-03-27 2023-05-05 南京大学 Memory computing unit, memory computing array and memory computing chip

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115794728A (en) * 2022-11-28 2023-03-14 北京大学 Memory computing bit line clamping and summing peripheral circuit and application thereof
CN115794728B (en) * 2022-11-28 2024-04-12 北京大学 In-memory computing bit line clamping and summing peripheral circuit and application thereof
CN115756388A (en) * 2023-01-06 2023-03-07 上海后摩智能科技有限公司 Multi-mode storage and calculation integrated circuit, chip and calculation device
CN116070685A (en) * 2023-03-27 2023-05-05 南京大学 Memory computing unit, memory computing array and memory computing chip

Similar Documents

Publication Publication Date Title
CN110427171B (en) In-memory computing device and method for expandable fixed-point matrix multiply-add operation
CN115390789A (en) Magnetic tunnel junction calculation unit-based analog domain full-precision memory calculation circuit and method
CN112581996B (en) Time domain memory internal computing array structure based on magnetic random access memory
CN111816234B (en) Voltage accumulation in-memory computing circuit based on SRAM bit line exclusive nor
CN113936717B (en) Storage and calculation integrated circuit for multiplexing weight
Mu et al. SRAM-based in-memory computing macro featuring voltage-mode accumulator and row-by-row ADC for processing neural networks
CN113782072A (en) Multi-bit memory computing circuit
CN114038492B (en) Multiphase sampling memory internal computing circuit
CN115906976A (en) Full-analog vector matrix multiplication memory computing circuit and application thereof
CN114400031A (en) Complement mapping RRAM (resistive random access memory) storage and calculation integrated chip and electronic equipment
CN114496010A (en) Analog domain near memory computing array structure based on magnetic random access memory
CN112989273A (en) Method for carrying out memory operation by using complementary code
CN115691613B (en) Charge type memory internal calculation implementation method based on memristor and unit structure thereof
CN114882921B (en) Multi-bit computing device
CN114895869B (en) Multi-bit memory computing device with symbols
CN114512161B (en) Memory computing device with symbols
CN114093394B (en) Rotatable internal computing circuit and implementation method thereof
CN116543808A (en) All-digital domain in-memory approximate calculation circuit based on SRAM unit
CN114974337A (en) Time domain memory computing circuit based on spin magnetic random access memory
WO2022197534A1 (en) Compute-in-memory with ternary activation
Gao et al. Current research status and future prospect of the in-memory computing
CN112951290A (en) Memory computing circuit and device based on nonvolatile random access memory
Jeong et al. A Ternary Neural Network computing-in-Memory Processor with 16T1C Bitcell Architecture
WO2023160735A2 (en) Operation method and operation unit
Wang et al. Sparsity-aware clamping readout scheme for high parallelism and low power nonvolatile computing-in-memory based on resistive memory

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination