CN116189732A - Integrated memory chip and method for optimizing read-out circuit - Google Patents

Integrated memory chip and method for optimizing read-out circuit

Publication number
CN116189732A
Authority
CN
China
Prior art keywords
calculation
integrated
memory
unit
circuit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310433555.4A
Other languages
Chinese (zh)
Other versions
CN116189732B (en)
Inventor
王宇宣
傅高鸣
李龙飞
赵文翔
潘红兵
彭成磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN202310433555.4A
Publication of CN116189732A
Application granted
Publication of CN116189732B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G11: INFORMATION STORAGE
    • G11C: STATIC STORES
    • G11C7/00: Arrangements for writing information into, or reading information out from, a digital store
    • G11C7/10: Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers
    • G11C7/1051: Data output circuits, e.g. read-out amplifiers, data output buffers, data output registers, data output level conversion circuits
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a compute-in-memory (integrated storage-and-computation) chip with an optimized readout circuit and a method for optimizing the readout circuit, belonging to the fields of very-large-scale integrated circuits and compute-in-memory circuits. Using a calibration-calculation separation method, the readout circuit of the chip is divided into a calculation readout circuit and a calibration readout circuit. The independent calibration readout circuit improves the precision with which weights are deployed on the chip, and therefore the calculation precision of the chip; the independent calculation readout circuit, having fewer functional requirements, can be further optimized for area, power consumption and speed. Furthermore, the calculation readout circuit implements the matrix-vector multiplication of the chip through a segmented digital-analog hybrid accumulation method, in which the first stage uses analog operation and the second stage uses digital operation. This reduces the functional and performance requirements on the readout circuit, optimizes its area, power consumption, speed and dynamic range, and improves its generality.

Description

Integrated memory chip and method for optimizing read-out circuit
Technical Field
The invention relates to a compute-in-memory chip with an optimized readout circuit and a method for optimizing the readout circuit, belonging to the fields of very-large-scale integrated circuits and compute-in-memory circuits.
Background
The rapid development of artificial intelligence and deep learning keeps raising the demand for hardware computing power and energy efficiency. Most conventional computers adopt the von Neumann architecture, in which the storage unit and the arithmetic unit are separate, so most of the power and time during computation is spent moving data. This limits the scaling of computing power, the improvement of energy efficiency, and the reduction of cost for hardware based on the von Neumann architecture.
Compute-in-memory technology fuses the storage unit and the arithmetic unit, eliminating the power and latency cost of data movement within the chip, and is regarded as an important means of breaking through the von Neumann bottleneck. The readout circuit of a compute-in-memory chip is a major contributor to the chip's power consumption and area and also determines its calculation speed, so it is one of the core optimization targets of such a chip.
Disclosure of Invention
To solve the above problems, the invention provides two methods for optimizing the readout circuit of a compute-in-memory chip, and two compute-in-memory chips based on them. The calibration-calculation separation method divides the readout circuit into a calibration readout circuit and a calculation readout circuit, which improves the weight storage precision of the compute-in-memory array; furthermore, applying a segmented digital-analog hybrid accumulation method to the calculation readout circuit optimizes the area, power consumption, speed and dynamic range of the readout circuit.
A first object of the present invention is to provide a method for optimizing the readout circuit of a compute-in-memory chip, the method comprising: dividing the readout circuit of the compute-in-memory chip into a calibration readout circuit and a calculation readout circuit by a calibration-calculation separation method; the calibration readout circuit is used for calibrating the weights, and the calculation readout circuit is used for reading out the matrix-vector multiplication results of the compute-in-memory chip.
The calibration-calculation separation method is as follows: the readout circuit of the compute-in-memory chip is functionally decomposed into a calibration readout circuit and a calculation readout circuit. The calibration readout circuit is deeply optimized for write calibration of the weights and has higher precision than the calculation readout circuit, so the weights stored in the chip are more accurate. The calculation readout circuit is deeply optimized for the chip's calculations, balancing precision, speed, area, power consumption and dynamic range. Because it is not constrained by the weight calibration function, it has more freedom to trade off precision, and designing it independently gives the chip better calculation performance.
In one embodiment, the calibration readout circuit is used for reading out the weights stored in the compute-in-memory array, verifying that the weights are stored correctly, and writing the weights accurately into the compute-in-memory array. The calculation readout circuit is used for reading out the matrix-vector multiplication results of the compute-in-memory array and converting the results from the analog domain to the digital domain.
In one embodiment, the method further comprises a segmented digital-analog hybrid accumulation method.
The segmented digital-analog hybrid accumulation method is as follows: the matrix-vector multiplication in the compute-in-memory chip is implemented in two stages. The first stage uses analog operation: the input vector of the matrix-vector multiplication is split into sub-input vectors, and the matrix-vector multiplication of each sub-input vector with the weights is performed in the analog domain. Because each sub-input vector has a smaller bit width than the full input vector, the functional requirements on the readout circuit are reduced, and its area, power consumption, speed and dynamic range are optimized. The second stage uses digital operation: the results for the sub-input vectors from the first stage are combined using a shift circuit and an adder, completing the full matrix-vector multiplication.
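The two-stage scheme rests on the linearity of matrix-vector multiplication, which can be written out as a short worked identity. The symbols here are assumptions introduced for illustration and do not appear in the source: $W$ is the stored weight matrix, $x$ the input vector, $b$ the bit width of each sub-input vector, and $K$ the number of segments.

```latex
% Split every element of x into K segments of b bits each (segment k = 0 is the lowest):
x = \sum_{k=0}^{K-1} 2^{kb}\, x^{(k)}
% Stage 1 (analog): each segment is multiplied by the weights in the array and digitized:
y^{(k)} = W x^{(k)}, \qquad k = 0, \dots, K-1
% Stage 2 (digital): shift each partial result by kb bits and add:
W x = \sum_{k=0}^{K-1} 2^{kb}\, y^{(k)}
% For K = 2 and b = 4 this reduces to  W x = 16\, y^{(1)} + y^{(0)}.
```

With $K = 2$ and $b = 4$, $y^{(0)}$ and $y^{(1)}$ correspond to the quantities called y1 and y2 in Example 2, which combines them as 16y2+y1.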
A second object of the present invention is to provide a compute-in-memory chip with an optimized readout circuit, in which the readout circuit is divided into a calibration readout circuit and a calculation readout circuit; the calibration readout circuit is used for calibrating the weights, and the calculation readout circuit is used for reading out the matrix-vector multiplication results of the compute-in-memory chip. With the calibration-calculation separation method, the calibration readout circuit is optimized for calibration precision, which improves the weight storage precision of the chip, while the calculation readout circuit selects a trade-off point among precision, area and power consumption, which improves the calculation performance of the chip.
In one embodiment, the compute-in-memory chip comprises a control unit, a compute-in-memory array, a weight input unit, a calibration readout circuit, a calculation readout circuit, and a digital processing and caching unit; the control unit is connected with the weight input unit, the calibration readout circuit, the calculation readout circuit, and the digital processing and caching unit; the compute-in-memory array is connected with the weight input unit, the calibration readout circuit and the calculation readout circuit; the weight input unit is connected with the control unit and the compute-in-memory array; the calibration readout circuit is connected with the control unit and the compute-in-memory array; the calculation readout circuit is connected with the control unit, the compute-in-memory array, and the digital processing and caching unit; the digital processing and caching unit is connected with the control unit and the calculation readout circuit.
The control unit is used for generating the control signals required by the other units so that all units work cooperatively;
the compute-in-memory array is used for storing weights and performing matrix-vector multiplication;
the weight input unit is used for rewriting the weights in the compute-in-memory array;
the calibration readout circuit is used for reading out the weights stored in the compute-in-memory array, verifying that the weights are stored correctly, and working together with the weight input unit to write the weights accurately into the compute-in-memory array;
the calculation readout circuit is used for reading out the matrix-vector multiplication results of the compute-in-memory array and converting the results from the analog domain to the digital domain;
the digital processing and caching unit is used for performing the operations in the chip other than matrix-vector multiplication and for caching intermediate results between successive matrix-vector multiplications.
In one embodiment, the calibration readout circuit is optimized primarily for precision, so that the deviation between the ideal and actual values of the weights stored in the compute-in-memory array is as small as possible.
In one embodiment, the compute-in-memory chip with the optimized readout circuit further comprises an input vector splitting unit and an output vector splicing unit; the input vector splitting unit is connected with the compute-in-memory array; the calculation readout circuit is connected with the output vector splicing unit;
the input vector splitting unit is used for splitting the vector to be operated on into a plurality of sub-input vectors by bit position;
the compute-in-memory array is used for storing the weight matrix, receiving the input from the input vector splitting unit, and performing the calculation;
the calculation readout circuit is used for reading out the analog result computed by the compute-in-memory array for each sub-input vector, converting it into a digital value, and passing it to the output vector splicing unit;
the output vector splicing unit is used for splicing the outputs of the calculation readout circuit to complete the full matrix-vector multiplication.
The invention also provides an electronic device comprising the compute-in-memory chip.
The invention has the beneficial effects that:
The invention provides two methods for optimizing the readout circuit of a compute-in-memory chip and applies them to compute-in-memory chips. With the calibration-calculation separation method, the readout circuit is divided into a calibration readout circuit and a calculation readout circuit, so that each can be optimized for its own functional requirements; this improves the weight storage precision of the chip and optimizes the speed, area and power consumption of its readout circuit. Furthermore, with the segmented digital-analog hybrid accumulation method, the matrix-vector multiplication of the chip is implemented in two stages, which optimizes the area, power consumption, speed and dynamic range of the readout circuit.
Drawings
FIG. 1 is a flow chart of the calibration-calculation separation method.
FIG. 2 is a schematic diagram of a compute-in-memory chip based on the calibration-calculation separation method.
FIG. 3 is a flow chart of the segmented digital-analog hybrid accumulation method.
FIG. 4 is a first schematic diagram of a compute-in-memory chip based on the segmented digital-analog hybrid accumulation method.
FIG. 5 is a second schematic diagram of a compute-in-memory chip based on the segmented digital-analog hybrid accumulation method.
Detailed Description
The following is a further detailed description of the present invention with reference to the drawings and examples.
Example 1
This embodiment provides a specific implementation of the method for optimizing the readout circuit of a compute-in-memory chip based on the calibration-calculation separation method. FIG. 1 is a flow chart of the calibration-calculation separation method: the readout circuit of the compute-in-memory chip is divided into a calibration readout circuit and a calculation readout circuit, where the calibration readout circuit is deeply optimized for precision and the calculation readout circuit jointly balances area, power consumption, speed, precision, dynamic range and similar aspects, so as to improve the overall calculation performance of the chip.
FIG. 2 is a schematic diagram of a compute-in-memory chip based on the calibration-calculation separation method. The chip comprises a control unit, a compute-in-memory array, a weight input unit, a calibration readout circuit, a calculation readout circuit, and a digital processing and caching unit. The control unit is connected with the weight input unit, the calibration readout circuit, the calculation readout circuit, and the digital processing and caching unit. The compute-in-memory array is connected with the weight input unit, the calibration readout circuit and the calculation readout circuit. The weight input unit is connected with the control unit and the compute-in-memory array. The calibration readout circuit is connected with the control unit and the compute-in-memory array. The calculation readout circuit is connected with the control unit, the compute-in-memory array, and the digital processing and caching unit. The digital processing and caching unit is connected with the control unit and the calculation readout circuit.
The control unit generates the control signals required by the other units so that all units work cooperatively. The compute-in-memory array stores the weights and performs matrix-vector multiplication. The weight input unit rewrites the weights in the compute-in-memory array. The calibration readout circuit reads out the weights stored in the compute-in-memory array, verifies that they are stored correctly, and works together with the weight input unit to write the weights accurately into the array. The calculation readout circuit reads out the matrix-vector multiplication results of the compute-in-memory array and converts them from the analog domain to the digital domain. The digital processing and caching unit performs the operations in the chip other than matrix-vector multiplication and caches intermediate results between successive matrix-vector multiplications.
In operation, the compute-in-memory chip is controlled and scheduled by the control unit. The weight input unit writes the weights into the photoelectric compute-in-memory array, and the calibration readout circuit calibrates the weights. Once the weights are deployed, the matrix-vector multiplication results of the chip can be read out through the calculation readout circuit. The digital processing and caching unit can then perform further calculations on those results.
Assume that the chip requires 8-bit weight storage precision and 6-bit calculation precision. Without the calibration-calculation separation method, a single readout circuit would have to support 8-bit precision, which poses a great challenge for the area, power consumption and speed of the chip. With the calibration-calculation separation method, the calibration readout circuit has 8-bit precision and the calculation readout circuit has 6-bit precision, which meets the design requirements of the chip. In general, the speed requirement of the calibration function is far lower than that of the calculation function, so with the calibration-calculation separation method the number of calibration readout circuits is negligible compared with the number of calculation readout circuits, and the performance of the calculation readout circuit is markedly better than that of the original combined readout circuit.
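The precision gap between the two readout paths can be made concrete with a small numerical sketch. It assumes an ideal uniform quantizer over a normalized range of 0 to 1, which is an illustrative model and not the circuit described in the patent; the function names `lsb_size` and `quantize` and the sample weight value are hypothetical.

```python
def lsb_size(bits, full_scale=1.0):
    """Step between adjacent codes of an ideal uniform quantizer (assumed model)."""
    return full_scale / ((1 << bits) - 1)


def quantize(value, bits, full_scale=1.0):
    """Return (code, reconstructed value) for an ideal quantizer with the given bits."""
    step = lsb_size(bits, full_scale)
    clipped = min(max(value, 0.0), full_scale)
    code = round(clipped / step)
    return code, code * step


# Hypothetical normalized weight read back through each path.
target = 0.42
_, read_8bit = quantize(target, bits=8)  # calibration readout precision in the example
_, read_6bit = quantize(target, bits=6)  # calculation readout precision in the example

err_8bit = abs(read_8bit - target)
err_6bit = abs(read_6bit - target)

# Worst-case error is half a step; the 6-bit step is 255/63 (about 4x) the 8-bit step.
bound_8bit = lsb_size(8) / 2
bound_6bit = lsb_size(6) / 2
```

Under this model the 8-bit path bounds the read-back error at roughly a quarter of the 6-bit bound, which is why only the calibration path needs the higher-resolution converter.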
Example 2
This embodiment provides a first specific implementation of readout circuit optimization for a compute-in-memory chip based on the segmented digital-analog hybrid accumulation method. FIG. 3 is a flow chart of the segmented digital-analog hybrid accumulation method: the matrix-vector multiplication of the compute-in-memory chip is implemented in two stages, where the first stage performs the matrix-vector multiplication of each sub-input vector in the analog domain and the second stage splices the sub-input vector results in the digital domain to complete the full matrix-vector multiplication. Splitting the matrix-vector multiplication reduces the functional requirements on the readout circuit, optimizes its area, power consumption, speed and dynamic range, and improves its generality.
FIG. 4 is a schematic diagram of a compute-in-memory chip based on the segmented digital-analog hybrid accumulation method. The chip comprises an input vector splitting unit, a compute-in-memory array, a readout circuit unit and an output vector splicing unit. The input vector splitting unit is connected with the compute-in-memory array. The compute-in-memory array is connected with the input vector splitting unit and the readout circuit unit. The readout circuit unit is connected with the compute-in-memory array and the output vector splicing unit; the output vector splicing unit is connected with the readout circuit unit. The input vector splitting unit splits the vector to be operated on into a plurality of sub-input vectors by bit position. The compute-in-memory array stores the weight matrix, receives the input from the input vector splitting unit, and performs the calculation. The readout circuit unit reads out the analog result computed by the array for each sub-input vector, converts it into a digital value, and passes it to the output vector splicing unit. The output vector splicing unit splices the outputs of the readout circuit unit to complete the full matrix-vector multiplication.
Assume that this embodiment must implement a matrix-vector multiplication whose input vector has 8-bit precision. The 8-bit input vector is split into two 4-bit sub-input vectors: the high 4 bits, x2, and the low 4 bits, x1. The sub-input vector x1 is fed into the compute-in-memory array through a 4-bit DAC for analog-domain operation, and the analog result is read out through an 8-bit ADC and converted into a digital value y1. The sub-input vector x2 is then fed into the compute-in-memory array through the 4-bit DAC for analog-domain operation, and the analog result is read out through the 8-bit ADC and converted into a digital value y2. A digital circuit in the output vector splicing unit computes 16y2+y1 to obtain the full matrix-vector multiplication result.
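The 8-bit example above can be traced with a minimal behavioral sketch. It assumes an ideal, noise-free array and an ideal 8-bit ADC modeled as clipping to the range 0 to 255; the weight matrix, the input values and all function names are hypothetical and chosen only so that no partial result clips.

```python
def split_nibbles(x):
    """Split each 8-bit element into its low 4 bits (x1) and high 4 bits (x2)."""
    x1 = [v & 0xF for v in x]
    x2 = [(v >> 4) & 0xF for v in x]
    return x1, x2


def analog_mvm(weights, sub_input):
    """Stand-in for the analog matrix-vector product of the array (ideal, no noise)."""
    return [sum(w * v for w, v in zip(row, sub_input)) for row in weights]


def adc(values, bits=8):
    """Ideal ADC model: clip each value to the unsigned range of the given bit width."""
    top = (1 << bits) - 1
    return [min(max(int(v), 0), top) for v in values]


def segmented_mvm(weights, x):
    """Stage 1: two 4-bit analog passes. Stage 2: digital shift-and-add, 16*y2 + y1."""
    x1, x2 = split_nibbles(x)
    y1 = adc(analog_mvm(weights, x1))
    y2 = adc(analog_mvm(weights, x2))
    return [(hi << 4) + lo for lo, hi in zip(y1, y2)]


# Hypothetical small weight matrix and 8-bit input vector.
W = [[1, 2, 0, 3],
     [2, 0, 1, 1]]
x = [200, 17, 255, 96]

result = segmented_mvm(W, x)     # spliced from two 4-bit passes
reference = analog_mvm(W, x)     # direct full-precision product for comparison
```

For these values the spliced result matches the direct product, `[522, 751]`. If a partial sum exceeded the ADC range it would clip, and the two would differ.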
The readout circuit unit of this embodiment comprises a 4-bit DAC and an 8-bit ADC. With the segmented digital-analog hybrid accumulation method, the readout circuit needs a 4-bit DAC instead of an 8-bit DAC, so its area, power consumption and speed can be optimized. At the same time, the input range of the ADC shrinks, which relaxes the requirement on the ADC's input dynamic range and allows area, power consumption and speed to be optimized further. The result 16y2+y1 exceeds the 8-bit precision of the ADC, which benefits subsequent high-precision digital processing and improves the generality of the readout circuit of the compute-in-memory chip.
Optionally, the readout circuit unit in this embodiment is the calculation readout circuit of Example 1; in that case the compute-in-memory chip of this embodiment builds on the chip of Example 1 and further optimizes its calculation readout circuit.
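The converter requirements mentioned here can be checked with simple arithmetic. The sketch below assumes unsigned codes and that each ADC output can reach its full-scale code; the function names are hypothetical.

```python
def dac_levels(bits):
    """Number of distinct output levels a DAC of the given bit width must generate."""
    return 1 << bits


def spliced_output_range(adc_bits=8, segment_bits=4, segments=2):
    """Largest value of the shift-and-add result and the bit width needed to hold it."""
    y_max = (1 << adc_bits) - 1
    total_max = sum(y_max << (segment_bits * k) for k in range(segments))
    return total_max, total_max.bit_length()


levels_4bit = dac_levels(4)   # 16 levels per input with segmentation
levels_8bit = dac_levels(8)   # 256 levels per input without segmentation
max_value, width = spliced_output_range()  # 16*255 + 255 = 4335, which needs 13 bits
```

Under these assumptions the DAC generates 16 levels instead of 256, while the spliced result 16y2+y1 can occupy up to 13 bits, more than the 8 bits of a single ADC conversion.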
Example 3
This embodiment provides a second specific implementation of readout circuit optimization for a compute-in-memory chip based on the segmented digital-analog hybrid accumulation method.
FIG. 5 is a second schematic diagram of a compute-in-memory chip based on the segmented digital-analog hybrid accumulation method. The chip comprises an input vector splitting unit, a compute-in-memory array, a readout circuit unit and an output vector splicing unit. The input vector splitting unit is connected with the compute-in-memory array. The compute-in-memory array is connected with the input vector splitting unit and the readout circuit unit. The readout circuit unit is connected with the compute-in-memory array and the output vector splicing unit; the output vector splicing unit is connected with the readout circuit unit. The input vector splitting unit splits the vector to be operated on into a plurality of sub-input vectors by bit position. The compute-in-memory array stores the weight matrix, receives the input from the input vector splitting unit, and performs the calculation. The readout circuit unit reads out the analog result computed by the array for each sub-input vector, converts it into a digital value, and passes it to the output vector splicing unit. The output vector splicing unit splices the outputs of the readout circuit unit to complete the full matrix-vector multiplication.
Assume that this embodiment must implement a matrix-vector multiplication whose input vector has 8-bit precision. The 8-bit input vector is split into two 4-bit sub-input vectors: the high 4 bits, x2, and the low 4 bits, x1. The sub-input vector x1 is first fed into the compute-in-memory array bit by bit, from the lowest bit to the highest, in 4 passes through a single-bit DAC for analog-domain operation; the result for each bit is weighted by 2, 4, 8 and 16 in turn through a current mirror and capacitor array and accumulated. The accumulated analog result is read out through an 8-bit ADC and converted into a digital value y1. The sub-input vector x2 is then fed into the compute-in-memory array in the same way, bit by bit from the lowest bit to the highest in 4 passes through the single-bit DAC, with the result for each bit weighted by 2, 4, 8 and 16 in turn through the current mirror and capacitor array and accumulated. The accumulated analog result is read out through the 8-bit ADC and converted into a digital value y2. A digital circuit in the output vector splicing unit computes 16y2+y1 to obtain the full matrix-vector multiplication result.
The readout circuit unit of this embodiment comprises a single-bit DAC, a current mirror and capacitor array weighted-summation circuit, and an 8-bit ADC. The segmented digital-analog hybrid accumulation method markedly reduces the functional requirements on the current mirror and capacitor array weighted-summation circuit, so the area, power consumption and speed of the readout circuit can be optimized. At the same time, the input range of the ADC shrinks, which relaxes the requirement on the ADC's input dynamic range and allows area, power consumption and speed to be optimized further. The result 16y2+y1 exceeds the 8-bit precision of the ADC, which benefits subsequent high-precision digital processing and improves the generality of the readout circuit of the compute-in-memory chip.
Optionally, the readout circuit unit in this embodiment is the calculation readout circuit of Example 1; in that case the compute-in-memory chip of this embodiment builds on the chip of Example 1 and further optimizes its calculation readout circuit.
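The bit-serial variant can be sketched in the same behavioral style. The model assumes an ideal array and ideal weighted accumulation, omits the ADC, and takes the per-bit weights as a parameter; the function names, weight matrix and input values are hypothetical. With the weights 2, 4, 8, 16 given in the text, every partial result carries a uniform factor of 2 relative to binary weights 1, 2, 4, 8, so the sketch compares against twice the direct product.

```python
def bit_serial_segment(weights, nibble_vec, bit_weights=(2, 4, 8, 16)):
    """Feed one 4-bit sub-input vector bit by bit (lowest bit first) and accumulate
    the per-bit products with the given weights, as an ideal analog summer would."""
    acc = [0] * len(weights)
    for bit, scale in enumerate(bit_weights):
        bits = [(v >> bit) & 1 for v in nibble_vec]           # single-bit DAC input
        partial = [sum(w * b for w, b in zip(row, bits)) for row in weights]
        acc = [a + scale * p for a, p in zip(acc, partial)]   # weighted accumulation
    return acc


def bit_serial_mvm(weights, x, bit_weights=(2, 4, 8, 16)):
    """Run both 4-bit segments bit-serially, then splice digitally as 16*y2 + y1."""
    x1 = [v & 0xF for v in x]          # low 4 bits
    x2 = [(v >> 4) & 0xF for v in x]   # high 4 bits
    y1 = bit_serial_segment(weights, x1, bit_weights)
    y2 = bit_serial_segment(weights, x2, bit_weights)
    return [16 * hi + lo for lo, hi in zip(y1, y2)]


# Hypothetical small weight matrix and 8-bit input vector.
W = [[1, 2, 0, 3],
     [2, 0, 1, 1]]
x = [200, 17, 255, 96]

direct = [sum(w * v for w, v in zip(row, x)) for row in W]
scaled = bit_serial_mvm(W, x)                            # weights from the text
unscaled = bit_serial_mvm(W, x, bit_weights=(1, 2, 4, 8))  # plain binary weights
```

Here `unscaled` equals the direct product and `scaled` equals twice that. Because the factor is the same for both segments, it does not affect the 16y2+y1 splicing.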
Example 4
An electronic device comprises the compute-in-memory chip with the optimized readout circuit according to the invention; for the chip, refer to the compute-in-memory chip of any of the above embodiments, which is not described again here.
Optionally, the electronic device may be a server, a mobile phone, a tablet computer, a notebook computer, a digital photo frame, a wearable electronic device, an intelligent home device, or the like.
Optionally, the electronic device may also have other components. For example, the electronic device may also include other processing devices, such as a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a Direct Memory Access (DMA) controller, or other forms of processing units having data processing and/or program execution capabilities; in addition, the electronic device may further include interconnection components such as a bus, so that its various devices are interconnected. The electronic device may also include memory, such as volatile memory and/or non-volatile memory.

Claims (10)

1. A method for optimizing the readout circuit of a compute-in-memory chip, characterized by comprising: dividing the readout circuit of the compute-in-memory chip into a calibration readout circuit and a calculation readout circuit; the calibration readout circuit is used for calibrating the weights, and the calculation readout circuit is used for reading out the matrix-vector multiplication results of the compute-in-memory chip.
2. The method of claim 1, wherein the calibration readout circuit is used for reading out the weights stored in the compute-in-memory array, verifying that the weights are stored correctly, and writing the weights accurately into the compute-in-memory array; the calculation readout circuit is used for reading out the matrix-vector multiplication results of the compute-in-memory array and converting the results from the analog domain to the digital domain.
3. The method of claim 1, further comprising implementing the matrix-vector multiplication in the compute-in-memory chip in two stages using a segmented digital-analog hybrid accumulation method; the first stage uses analog operation, in which the input vector of the matrix-vector multiplication is split into sub-input vectors and the matrix-vector multiplication of each sub-input vector with the weights is performed in the analog domain; the second stage uses digital operation, in which the results for the sub-input vectors from the first stage are combined using a shift circuit and an adder, completing the full matrix-vector multiplication.
4. A compute-in-memory chip with an optimized readout circuit, wherein the compute-in-memory chip comprises a readout circuit optimized by the method of claim 1; the compute-in-memory chip divides the readout circuit into a calibration readout circuit and a calculation readout circuit.
5. The compute-in-memory chip of claim 4, wherein the compute-in-memory chip comprises a control unit, a compute-in-memory array, a weight input unit, a calibration readout circuit, a calculation readout circuit, and a digital processing and caching unit; the control unit is connected with the weight input unit, the calibration readout circuit, the calculation readout circuit, and the digital processing and caching unit; the compute-in-memory array is connected with the weight input unit, the calibration readout circuit and the calculation readout circuit; the weight input unit is connected with the control unit and the compute-in-memory array; the calibration readout circuit is connected with the control unit and the compute-in-memory array; the calculation readout circuit is connected with the control unit, the compute-in-memory array, and the digital processing and caching unit; the digital processing and caching unit is connected with the control unit and the calculation readout circuit.
6. The integrated memory chip of claim 5, wherein,
the control unit is used for generating the control signals required by the other units so that all units work cooperatively;
the integrated memory array is used for storing weights and performing matrix-vector multiplication operations;
the weight input unit is used for rewriting the weights into the integrated memory array;
the calibration readout circuit is used for reading out the weights stored in the integrated memory array and checking that they are stored correctly, and works cooperatively with the weight input unit so that the weights are written accurately into the integrated memory array;
the calculation readout circuit is used for reading out the matrix-vector multiplication results of the integrated memory array and converting the calculation results from the analog domain into the digital domain;
the digital processing and buffering unit is used for completing the operations on the chip other than matrix-vector multiplication and for buffering intermediate results between successive matrix-vector multiplications.
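The cooperation between the calibration readout circuit and the weight input unit can be pictured as a write-and-verify loop. The sketch below is a toy software model under stated assumptions: the class `ToyArray`, the function `write_verify`, the ±20 % write error, the ideal read-back, and the tolerance are all hypothetical and are not taken from the patent.

```python
import random


class ToyArray:
    """Hypothetical stand-in for the integrated memory array."""

    def __init__(self, size, seed=0):
        self._cells = [0.0] * size
        self._rng = random.Random(seed)

    def write(self, index, delta):
        # Weight input unit: adjust the stored value by roughly `delta`.
        # Each write is assumed to be off by up to +/-20 %.
        self._cells[index] += delta * self._rng.uniform(0.8, 1.2)

    def read(self, index):
        # Calibration readout circuit: read back the stored weight
        # (assumed ideal in this toy model).
        return self._cells[index]


def write_verify(array, index, target, tolerance=0.01, max_pulses=50):
    """Write, read back, and correct until the weight is within tolerance.

    Returns the number of write pulses that were needed.
    """
    for pulses in range(max_pulses):
        error = target - array.read(index)
        if abs(error) <= tolerance:
            return pulses
        array.write(index, error)
    if abs(target - array.read(index)) <= tolerance:
        return max_pulses
    raise RuntimeError("weight did not converge within max_pulses")


array = ToyArray(size=3, seed=1)
for i, weight in enumerate([0.5, -0.25, 1.0]):
    pulses = write_verify(array, i, weight)
    print(i, pulses, round(array.read(i), 3))
```

In this model each pulse shrinks the remaining error to at most one fifth of its previous value, so the loop ends after a few pulses; a real device would need its own error model and stopping rule.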
7. The integrated memory chip according to claim 4 or 5, further comprising an input vector splitting unit and an output vector splicing unit; the input vector splitting unit is connected with the integrated memory array; the calculation readout circuit is connected with the output vector splicing unit;
the input vector splitting unit is used for splitting the vector to be operated on into a plurality of sub-input vectors according to bit positions;
the integrated memory array is used for storing the weight matrix, receiving the input from the input vector splitting unit, and performing the calculation;
the calculation readout circuit is used for reading out the analog quantity computed by the integrated memory array for each sub-input vector, converting it into a digital quantity, and inputting the digital quantity into the output vector splicing unit;
the output vector splicing unit is used for splicing the outputs of the calculation readout circuit to realize the complete matrix-vector multiplication operation.
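The split, compute, and splice flow of claim 7 can be illustrated numerically. This is a minimal sketch under several assumptions that the claims do not state: inputs are unsigned integers, each sub-input vector holds exactly one bit position, weights are laid out as `weights[row][column]`, and the analog computation and readout are replaced by an exact integer sum. All function names are hypothetical.

```python
def split_by_bits(vector, n_bits):
    """Input vector splitting unit: one 0/1 sub-input vector per bit (LSB first)."""
    return [[(x >> b) & 1 for x in vector] for b in range(n_bits)]


def array_and_readout(weights, sub_vector):
    """Stand-in for the array plus calculation readout circuit.

    Returns one digital partial sum per column; the real chip would compute
    this in the analog domain and convert it to a digital quantity.
    """
    n_rows, n_cols = len(weights), len(weights[0])
    return [
        sum(weights[r][c] * sub_vector[r] for r in range(n_rows))
        for c in range(n_cols)
    ]


def splice(partials):
    """Output vector splicing unit: shift each partial by its bit position, then add."""
    out = [0] * len(partials[0])
    for b, partial in enumerate(partials):
        for c, value in enumerate(partial):
            out[c] += value << b
    return out


def matrix_vector_multiply(weights, vector, n_bits):
    subs = split_by_bits(vector, n_bits)
    return splice([array_and_readout(weights, sub) for sub in subs])


weights = [[1, 2], [3, 4], [5, 6]]  # 3 inputs (rows) x 2 outputs (columns)
vector = [3, 5, 2]                  # unsigned 3-bit inputs
print(matrix_vector_multiply(weights, vector, n_bits=3))  # [28, 38]
```

The result matches the direct products 1·3 + 3·5 + 5·2 = 28 and 2·3 + 4·5 + 6·2 = 38, which is the property the shift-and-add splicing relies on.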
8. A read-out circuit for a memory chip, characterized in that it is obtained by the optimization method according to any one of claims 1-3.
9. An electronic device comprising the integrated memory chip of any one of claims 4-6.
10. The electronic device of claim 9, wherein the electronic device is any one of: a server, a mobile phone, a tablet computer, a notebook computer, a digital photo frame, a wearable electronic device, or a smart home device.
CN202310433555.4A 2023-04-21 2023-04-21 Integrated memory chip and method for optimizing read-out circuit Active CN116189732B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310433555.4A CN116189732B (en) 2023-04-21 2023-04-21 Integrated memory chip and method for optimizing read-out circuit

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310433555.4A CN116189732B (en) 2023-04-21 2023-04-21 Integrated memory chip and method for optimizing read-out circuit

Publications (2)

Publication Number Publication Date
CN116189732A (en) 2023-05-30
CN116189732B (en) 2023-07-21

Family

ID=86449200

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310433555.4A Active CN116189732B (en) 2023-04-21 2023-04-21 Integrated memory chip and method for optimizing read-out circuit

Country Status (1)

Country Link
CN (1) CN116189732B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111949935A (en) * 2019-05-16 2020-11-17 北京知存科技有限公司 Analog vector-matrix multiplication circuit and chip
CN113553293A (en) * 2021-07-21 2021-10-26 清华大学 Storage and calculation integrated device and calibration method thereof
US20220188628A1 (en) * 2020-12-15 2022-06-16 International Business Machines Corporation Dynamic configuration of readout circuitry for different operations in analog resistive crossbar array
US20220391681A1 (en) * 2021-06-07 2022-12-08 International Business Machines Corporation Extraction of weight values in resistive processing unit array

Also Published As

Publication number Publication date
CN116189732B (en) 2023-07-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant