WO2020024608A1 - Analog vector-matrix multiplication operation circuit - Google Patents
- Publication number
- WO2020024608A1 (PCT/CN2019/081342)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- analog
- programmable semiconductor
- matrix multiplication
- voltage
- semiconductor device
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06G—ANALOGUE COMPUTERS
- G06G7/00—Devices in which the computing operation is performed by varying electric or magnetic quantities
- G06G7/12—Arrangements for performing computing operations, e.g. operational amplifiers
- G06G7/16—Arrangements for performing computing operations, e.g. operational amplifiers for multiplication or division
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- the invention relates to the field of signal processing, in particular to an analog vector-matrix multiplication operation circuit.
- Matrix multiplication is widely used in data mining fields such as image processing, recommendation systems, and data dimensionality reduction.
- Traditional architectures and serial methods that rely on a single computer are increasingly unsuited to current large-scale data processing. Expanding the scale of matrix multiplication and reducing its running time will therefore help matrix factorization algorithms handle large-scale data.
- Matrix multiplication has a high time complexity.
- Traditional matrix multiplication computes each entry of the product as the inner product of a row of the left matrix and a column of the right matrix. This algorithm can be implemented as a distributed algorithm, but its performance is poor.
- Another form of matrix multiplication takes the outer product of each column of the left matrix with the corresponding row of the right matrix, which yields partial results of the result matrix, and finally sums the partial results.
- This algorithm is much more efficient than the traditional one, but it still has bottlenecks.
- When the matrices are very large, the memory of a single machine cannot hold a row of the left matrix together with the right matrix, so the calculation cannot be carried out.
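- The two formulations described above, inner-product and outer-product matrix multiplication, can be compared with a minimal pure-Python sketch. The function names `matmul_inner` and `matmul_outer` are illustrative and do not come from the patent.

```python
def matmul_inner(a, b):
    """Entry C[i][j] is the inner product of row i of a and column j of b."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


def matmul_outer(a, b):
    """Accumulate the outer product of column k of a and row k of b for each k."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for k in range(m):              # one partial result matrix per k
        for i in range(n):
            for j in range(p):
                c[i][j] += a[i][k] * b[k][j]
    return c


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul_inner(A, B))   # [[19, 22], [43, 50]]
print(matmul_outer(A, B))   # same result, built from summed partial results
```

- Both functions perform the same multiplications; they differ only in the order in which partial results are formed and summed.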
- Vector-matrix multiplication is a commonly used logical calculation function. In the traditional von Neumann computing architecture, the memory and the processor are physically separated and connected through a data bus. To perform a vector-matrix multiplication, the vector and matrix data must first be read out of memory and transferred to the processor, the calculation is performed, and the results are then stored back into memory. This calculation method consumes a lot of data bus bandwidth and transmission power.
- Vector-matrix multiplication for analog signals is even more complicated. First, the analog signal must be converted into a digital signal by analog-to-digital conversion and stored in memory. The vector-matrix multiplication is then performed according to the above process, and the digital result is finally converted back into an analog signal by digital-to-analog conversion. This way of performing analog vector-matrix multiplication results in greater power consumption and cost overhead and in poor processing performance. With the rise of big data applications, the transmission and processing of massive data has further exacerbated these problems.
- an embodiment of the present invention provides an analog vector-matrix multiplication operation circuit, which solves the problem of poor processing performance of existing matrix multiplication operations.
- An analog vector-matrix multiplication operation circuit includes: multiple analog voltage input terminals, a programmable semiconductor device array, multiple first terminals, and multiple second terminals;
- The gates of all programmable semiconductor devices in each row are connected to the same analog voltage input terminal, so that multiple rows of programmable semiconductor devices are correspondingly connected to multiple analog voltage input terminals.
- The drains of all programmable semiconductor devices in each column are connected to the same first terminal, so that multiple columns of programmable semiconductor devices are correspondingly connected to multiple first terminals.
- The sources of all programmable semiconductor devices in each column are connected to the same second terminal, so that multiple columns of programmable semiconductor devices are correspondingly connected to multiple second terminals. The threshold voltage of each programmable semiconductor device can be adjusted;
- In one option, the first terminal is a bias voltage input terminal and the second terminal is an analog current output terminal.
- In another option, the first terminal is an analog current output terminal and the second terminal is a bias voltage input terminal.
- analog vector-matrix multiplication operation circuit further includes:
- a programming circuit is connected to the source, gate, and / or substrate of each programmable semiconductor device in the programmable semiconductor device array, and is used to regulate the threshold voltage of the programmable semiconductor device.
- the programming circuit includes a voltage generating circuit and a voltage control circuit.
- the voltage generating circuit is used to generate a programming voltage or an erasing voltage.
- The voltage control circuit is used to load the programming voltage to the source of a selected programmable semiconductor device, or to load the erase voltage to the gate or substrate of the selected programmable semiconductor device, so as to regulate the threshold voltage of the programmable semiconductor device.
- analog vector-matrix multiplication operation circuit further includes:
- the controller is connected to the programming circuit, and controls the programming circuit to control the number of programmable semiconductor devices that are put into operation and the threshold voltage of each programmable semiconductor device.
- the controller includes a row and column decoder for gating the programmable semiconductor device to be programmed.
- The analog vector-matrix multiplication operation circuit further includes a conversion device connected to the plurality of analog voltage input terminals, for converting a plurality of analog current input signals into analog voltage input signals and outputting them to the corresponding analog voltage input terminals.
- The conversion device includes a plurality of programmable semiconductor devices.
- The gate of each programmable semiconductor device is connected to its drain and to the corresponding analog voltage input terminal;
- The source of each programmable semiconductor device is connected to a first bias voltage.
- the analog vector-matrix multiplication circuit further includes: a current detection output circuit, which is connected to the analog current output terminal and is used to process and output the analog current output signal output from the analog current output terminal.
- The current detection output circuit includes a plurality of operational amplifiers. The non-inverting input terminal of each operational amplifier is connected to a second bias voltage, the inverting input terminal is connected to a corresponding analog current output terminal, and a resistor or transistor is connected between the inverting input terminal and the output terminal.
- the programmable semiconductor device uses a floating gate transistor.
- the invention also provides a control method of an analog vector-matrix multiplication operation circuit, which is used for the analog vector-matrix multiplication operation circuit.
- the control method includes:
- the controller is used to control the number of programmable semiconductor devices in operation
- Threshold voltage of programmable semiconductor device is regulated by programming circuit
- Multiple analog current output signals are obtained through multiple analog current output terminals corresponding to multiple columns of programmable semiconductor devices.
- Before the plurality of analog voltage signals are applied to the gates of all programmable semiconductor devices in the corresponding rows through the plurality of analog voltage input terminals, the control method further includes:
- a plurality of analog current input signals are converted into a plurality of analog voltage input signals by a conversion device.
- the present invention also provides a storage device including the above analog vector-matrix multiplication operation circuit.
- the present invention also provides a chip including the above analog vector-matrix multiplication operation circuit.
- With the analog vector-matrix multiplication operation circuit and the control method provided by the present invention, the threshold voltage $V_{TH}$ of each programmable semiconductor device is dynamically adjusted in advance according to a certain rule, so that each programmable semiconductor device can be regarded as a variable equivalent analog weight.
- Each weight is equivalent to one stored piece of analog data, and the programmable semiconductor device array stores an analog data array. When the circuit operates, a column of analog voltage vectors, or a column of analog voltage vectors obtained from an analog current vector through the conversion device, is applied to the gates of the corresponding programmable semiconductor devices, so that each gate receives a voltage signal and each source (or drain) outputs an analog current output signal.
- The analog current output signal at the source (or drain) of each programmable semiconductor device equals the gate voltage times the weight. Because the sources (or drains) of all programmable semiconductor devices in each column are connected to the same analog current output terminal, by Kirchhoff's law the analog current output signal at that terminal is the sum of the source (or drain) currents of all programmable semiconductor devices in the column, that is, the sum of the products of the gate voltage and the weight of all programmable semiconductor devices in the column.
- The multiple analog current output terminals thus output multiple sums of products of gate voltage and weight.
- The present invention uses a programmable semiconductor device array to realize the analog vector-matrix multiplication operation. Because programmable semiconductor devices have high integration, fast response speed, and low power consumption, the analog vector-matrix multiplication circuit implemented with the programmable semiconductor device array effectively reduces the overhead caused by analog-to-digital conversion, digital-to-analog conversion, and data transmission, and improves processing performance.
- The programmable semiconductor device array can also be used as a flash memory or an electrically erasable programmable read-only memory, which reuses electrical components, improves component utilization, and saves integrated circuit hardware cost.
- In the analog vector-matrix multiplication operation circuit provided by the present invention, a current detection output circuit placed after the analog current output terminal accurately processes and outputs the calculated current, or passes it to the input of the next programmable semiconductor device array, which effectively improves the accuracy of the output current.
- The storage device integrates an analog vector-matrix multiplication operation circuit and performs vector-matrix multiplication on analog signals directly inside the storage device, without transferring data back and forth between the memory and the processor, which improves processing performance and reduces power consumption and cost.
- FIG. 1A is a schematic diagram of a first embodiment of an analog vector-matrix multiplication operation circuit according to the present invention.
- FIG. 1B is a schematic diagram of a second embodiment of an analog vector-matrix multiplication operation circuit according to the present invention.
- FIG. 2 is a structural diagram of a floating gate transistor in an analog vector-matrix multiplication operation circuit according to an embodiment of the present invention.
- FIG. 3A is a schematic diagram of a third embodiment of an analog vector-matrix multiplication operation circuit according to the present invention.
- FIG. 3B is a schematic diagram of a fourth embodiment of an analog vector-matrix multiplication operation circuit according to the present invention.
- FIG. 4 is a schematic diagram of a fifth embodiment of an analog vector-matrix multiplication operation circuit according to the present invention.
- FIG. 5 is a flowchart of a control method of an analog vector-matrix multiplication operation circuit according to an embodiment of the present invention.
- Vector-matrix multiplication is a commonly used logic calculation function.
- the performance, power consumption, and cost of existing analog vector-matrix multiplication operations need to be improved.
- The analog vector-matrix multiplication operation circuit adjusts the threshold voltage of each programmable semiconductor device so that the device acts as a variable equivalent analog weight, which is equivalent to storing analog matrix data; analog voltages are then applied to the programmable semiconductor device array to realize the matrix multiplication function.
- The circuit structure is simple, the number of components is small, the response speed is fast, and the power consumption is low, which greatly reduces the overhead caused by analog-to-digital conversion, digital-to-analog conversion, and data transmission, and improves processing performance.
- FIG. 1A is a schematic diagram of a first embodiment of an analog vector-matrix multiplication operation circuit according to the present invention.
- The analog vector-matrix multiplication operation circuit includes: M analog voltage input terminals, a programmable semiconductor device array of M rows × N columns, N first terminals, and N second terminals, where the first terminal is a bias voltage input terminal and the second terminal is an analog current output terminal.
- The gates of all programmable semiconductor devices in each row are connected to the same analog voltage input terminal, and the M rows of programmable semiconductor devices are correspondingly connected to the M analog voltage input terminals.
- The drains of all programmable semiconductor devices in each column are connected to the same bias voltage input terminal, and the N columns of programmable semiconductor devices are correspondingly connected to the N bias voltage input terminals.
- The sources of all programmable semiconductor devices in each column are connected to the same analog current output terminal, and the N columns of programmable semiconductor devices are correspondingly connected to the N analog current output terminals. The threshold voltage of each programmable semiconductor device can be adjusted.
- M and N are positive integers, and M and N may be equal or different.
- Each programmable semiconductor device can be regarded as a variable equivalent analog weight (denoted $W_{k,j}$, where $0 < k \le M$ and $0 < j \le N$ are the row and column numbers respectively), which is equivalent to storing one piece of analog data, while the programmable semiconductor device array stores an analog data array.
- A column of analog voltage signals $V_1$ to $V_M$ is applied to the M rows of programmable semiconductor devices: the gates of all programmable semiconductor devices in the k-th row receive the analog voltage signal $V_k$, the drains are biased at the voltage $V_b$, and the sources output the current signals $I_{k,1}$ to $I_{k,N}$ respectively.
- Since $I = V \times W$, the source output current of each programmable semiconductor device equals its gate voltage multiplied by its weight, that is, $I_{k,j} = V_k \times W_{k,j}$.
- Because the sources of all programmable semiconductor devices in each column are connected to the same analog current output terminal, by Kirchhoff's law the current $I_j$ at that terminal is the sum of the source currents of all programmable semiconductor devices in the column, which is $I_j = \sum_{k=1}^{M} V_k W_{k,j}$. The multiple analog current output terminals output the current sums $I_1$ to $I_N$, which realizes the function of matrix multiplication.
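- The column summation described above, in which each column current is the sum of gate voltage times weight over the rows, can be sketched behaviorally in Python. This is an idealized model under the assumption that each device contributes exactly voltage times weight; it does not model device physics, and the function name `column_currents` is illustrative.

```python
def column_currents(voltages, weights):
    """Return [I_1, ..., I_N] with I_j = sum over k of V_k * W[k][j].

    voltages: M gate voltages, one per row.
    weights:  M x N equivalent analog weights (idealized, an assumption).
    """
    m = len(weights)
    n = len(weights[0])
    if len(voltages) != m:
        raise ValueError("need one input voltage per row")
    return [sum(voltages[k] * weights[k][j] for k in range(m))
            for j in range(n)]


V = [1.0, 2.0, 3.0]        # M = 3 row voltages (arbitrary units)
W = [[1.0, 0.0],
     [0.5, 2.0],
     [0.0, 1.0]]           # M x N = 3 x 2 weights (arbitrary units)
I = column_currents(V, W)
print(I)                   # [2.0, 7.0]
```

- In the circuit all N column sums are formed simultaneously by current summation at the shared terminals; the loop here only reproduces the arithmetic.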
- The invention uses an array of programmable semiconductor devices to implement analog vector-matrix multiplication operations. Because programmable semiconductor devices have high integration, fast response speed, and low power consumption, the analog vector-matrix multiplication operation circuit implemented with them effectively reduces the overhead caused by analog-to-digital conversion, digital-to-analog conversion, and data transmission, and greatly improves processing performance.
- FIG. 1B is a schematic diagram of a second embodiment of an analog vector-matrix multiplication operation circuit according to the present invention.
- The analog vector-matrix multiplication circuit includes: M analog voltage input terminals, a programmable semiconductor device array of M rows × N columns, N first terminals, and N second terminals, where the first terminal is an analog current output terminal and the second terminal is a bias voltage input terminal.
- The gates of all programmable semiconductor devices in each row are connected to the same analog voltage input terminal, and the M rows of programmable semiconductor devices are correspondingly connected to the M analog voltage input terminals.
- The sources of all programmable semiconductor devices in each column are connected to the same bias voltage input terminal, and the N columns of programmable semiconductor devices are correspondingly connected to the N bias voltage input terminals.
- The drains of all programmable semiconductor devices in each column are connected to the same analog current output terminal, and the N columns of programmable semiconductor devices are correspondingly connected to the N analog current output terminals. The threshold voltage of each programmable semiconductor device can be adjusted.
- M and N are positive integers, and M and N may be equal or different.
- Each programmable semiconductor device can be regarded as a variable equivalent analog weight (denoted $W_{k,j}$, where $0 < k \le M$ and $0 < j \le N$ are the row and column numbers respectively), which is equivalent to storing one piece of analog data, while the programmable semiconductor device array stores an analog data array.
- A column of analog voltage signals $V_1$ to $V_M$ is applied to the M rows of programmable semiconductor devices: the gates of all programmable semiconductor devices in the k-th row receive the analog voltage signal $V_k$, the sources are biased at the voltage $V_b$, and the drains output the current signals $I_{k,1}$ to $I_{k,N}$ respectively.
- Since $I = V \times W$, the drain output current of each programmable semiconductor device equals its gate voltage multiplied by its weight, that is, $I_{k,j} = V_k \times W_{k,j}$.
- The invention uses an array of programmable semiconductor devices to implement analog vector-matrix multiplication operations. Because programmable semiconductor devices have high integration, fast response speed, and low power consumption, the analog vector-matrix multiplication operation circuit implemented with them effectively reduces the overhead caused by analog-to-digital conversion, digital-to-analog conversion, and data transmission, and greatly improves processing performance.
- The output current of a programmable semiconductor device is very sensitive to its source voltage, so fluctuations of the source voltage may cause calculation errors.
- The gate-coupling, drain-summation topology of this embodiment does not cause calculation errors even if the source voltage fluctuates, which improves the accuracy of the calculation.
- the programmable semiconductor device may be implemented using a floating gate transistor.
- the structure of the floating gate transistor is shown in FIG. 2.
- The floating gate transistor includes a substrate, an insulating layer, a gate G, a source S, a drain D, and a floating gate F.
- The floating gate is placed between the gate and the insulating layer, and the insulating layer is placed between the floating gate and the substrate, where it protects the electrons in the floating gate from leakage so that electrons can be stored in the floating gate. By adjusting the number of electrons in the floating gate, the threshold voltage of the floating gate transistor is dynamically adjusted. Because of this structural characteristic, the floating gate transistor can be regarded as a variable equivalent analog weight that stores one piece of analog data.
- analog vector-matrix multiplication operation circuit may further include:
- a programming circuit is connected to the source, gate, and / or substrate of each programmable semiconductor device in the programmable semiconductor device array, and is used to regulate the threshold voltage of the programmable semiconductor device.
- the programming circuit includes a voltage generating circuit and a voltage control circuit
- the voltage generating circuit is used to generate a programming voltage or an erasing voltage
- The voltage control circuit is used to load the programming voltage to the source of a selected programmable semiconductor device, or to load the erase voltage to the gate or substrate of the selected programmable semiconductor device, so as to regulate the threshold voltage of the programmable semiconductor device.
- To raise the threshold voltage, the programming circuit uses the hot electron injection effect: according to the threshold voltage requirement data of the programmable semiconductor device, it applies a high voltage to the source of the device, which accelerates the channel electrons to high speed and increases the threshold voltage of the programmable semiconductor device.
- To lower the threshold voltage, the programming circuit uses the tunneling effect: according to the threshold voltage requirement data of the programmable semiconductor device, it applies a high voltage to the gate or substrate of the device, which reduces the threshold voltage of the programmable semiconductor device.
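- The choice between the two operations described above, raising the threshold voltage through the source or lowering it through the gate or substrate, can be sketched as a small decision function. The function name `select_operation`, the `tolerance` parameter, and the returned labels are hypothetical and only illustrate the decision; the patent does not specify how the required threshold voltage is compared with the present one.

```python
def select_operation(current_vth, target_vth, tolerance=0.01):
    """Pick the operation that moves the threshold voltage toward the target.

    Returns (operation, terminal), or None if already within tolerance.
    The tolerance value is an assumption made for this sketch.
    """
    error = target_vth - current_vth
    if abs(error) <= tolerance:
        return None
    if error > 0:
        # Hot electron injection: programming voltage on the source raises V_TH.
        return ("program", "source")
    # Tunneling: erase voltage on the gate or substrate lowers V_TH.
    return ("erase", "gate_or_substrate")


print(select_operation(1.0, 1.5))   # ('program', 'source')
print(select_operation(2.0, 1.5))   # ('erase', 'gate_or_substrate')
```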
- analog vector-matrix multiplication circuit may further include:
- the controller is connected to the programming circuit, and controls the programming circuit to adjust the number of programmable semiconductor devices that are put into operation and the threshold voltage of each programmable semiconductor device to meet the needs of the matrix multiplication operation.
- the controller includes a row and column decoder for gating the programmable semiconductor device to be programmed.
- The analog vector-matrix multiplication operation circuit may further include a bias voltage generating circuit, configured to generate a preset bias voltage and input it to the bias voltage input terminal.
- Alternatively, the analog vector-matrix multiplication circuit can be provided without a dedicated bias voltage generating circuit: the voltage generating circuit in the programming circuit is multiplexed and controlled to generate the preset bias voltage, which is input to the bias voltage input terminal.
- FIG. 3A is a schematic diagram of a third embodiment of an analog vector-matrix multiplication circuit according to the present invention.
- On the basis of all the contents of the first embodiment shown in FIG. 1A or the second embodiment shown in FIG. 1B, the analog vector-matrix multiplication circuit may further include a conversion device 5, which is connected in front of the plurality of analog voltage input terminals and converts a plurality of analog current input signals into analog voltage input signals that are input to the corresponding analog voltage input terminals.
- the conversion device 5 comprises a plurality of programmable semiconductor devices.
- The gate of each programmable semiconductor device is connected to its drain and to the corresponding analog voltage input terminal.
- The source of each programmable semiconductor device is connected to a first bias voltage.
- the first bias voltage connected to the source can be a ground voltage, that is, the source is grounded.
- each programmable semiconductor device is connected to receive an analog current input signal.
- the programmable semiconductor device in the conversion device 5 may adopt a floating gate transistor.
- the analog vector-matrix multiplication operation circuit in the embodiment of the present invention is suitable for not only analog voltage input signals but also analog current input signals, and the applicability of the analog vector-matrix multiplication operation circuit can be increased.
- FIG. 3B is a schematic diagram of a fourth embodiment of an analog vector-matrix multiplication operation circuit according to the present invention.
- On the basis of all the contents of the first embodiment shown in FIG. 1A or the second embodiment shown in FIG. 1B, the analog vector-matrix multiplication operation circuit may further include a conversion device 5, which is connected in front of the plurality of analog voltage input terminals and converts a plurality of analog current input signals into analog voltage input signals that are input to the corresponding analog voltage input terminals.
- the conversion device 5 includes a plurality of resistors, and the plurality of resistors are connected to the plurality of analog voltage input terminals in a one-to-one correspondence.
- One end of each resistor is connected to the corresponding analog voltage input terminal, and the other end is connected to a first bias voltage.
- the first bias voltage may be a ground voltage, that is, the other end of the resistor is grounded.
- a series of analog current input signals I in1 to I inM are converted into a series of analog voltage input signals V 1 to V M by the conversion device 5 and then applied to M rows of programmable semiconductor devices.
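- The resistor-based conversion described above can be sketched with Ohm's law. This sketch assumes ideal resistors of equal value, that the full input current flows through the resistor to the first bias voltage, and that the gates draw no current; the function name `currents_to_voltages` and the numeric values are illustrative.

```python
def currents_to_voltages(input_currents, resistance, first_bias=0.0):
    """Convert input currents to input voltages: V_k = V_bias1 + I_in,k * R.

    With currents in mA and resistance in kilo-ohms, the result is in volts.
    first_bias defaults to 0.0, i.e. the other end of the resistor is grounded.
    """
    return [first_bias + i * resistance for i in input_currents]


I_in = [0.5, 0.25]                       # I_in1, I_in2 in mA (example values)
V = currents_to_voltages(I_in, 4.0)      # R = 4 kilo-ohms (example value)
print(V)                                 # [2.0, 1.0]
```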
- the analog vector-matrix multiplication operation circuit in the embodiment of the present invention is suitable for not only analog voltage input signals but also analog current input signals, and the applicability of the analog vector-matrix multiplication operation circuit can be increased.
- the implementation of the above conversion device is only an example, and any circuit structure or circuit element that can convert a current input signal into a voltage input signal can be used to implement the conversion device, such as a metal semiconductor field effect transistor.
- FIG. 4 is a schematic diagram of a fifth embodiment of an analog vector-matrix multiplication operation circuit according to the present invention.
- On the basis of all the contents described in any one of the first to fourth embodiments, the analog vector-matrix multiplication operation circuit may further include a current detection output circuit 6, which is connected after the analog current output terminal and is used to process and output the analog current output signal from the analog current output terminal.
- The current detection output circuit accurately processes and outputs the calculated current, or passes it to the input of the next programmable semiconductor device array, which effectively realizes accurate current output.
- The current detection output circuit may include a plurality of operational amplifiers. The non-inverting input terminal of each operational amplifier is connected to a second bias voltage Vs, the inverting input terminal is connected to the corresponding analog current output terminal, and a resistor or a transistor is connected between the inverting input terminal and the output terminal.
- the non-inverting input terminal is generally grounded.
- The operational amplifier holds the voltage at the analog current output terminal equal to the voltage at its non-inverting input terminal, which ensures that the gate-source voltage $V_{GS}$ of each programmable semiconductor device is controlled only by the corresponding input voltage, so that the output voltage of the operational amplifier represents the amplitude of the output current of the corresponding column of programmable semiconductor devices.
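- The relation between a column current and the operational amplifier output can be sketched for the resistor feedback case. The sketch assumes an ideal operational amplifier, a feedback resistor rather than a transistor, and a column current flowing into the inverting node, which gives V_out = Vs - I × R_f; the function name `transimpedance_outputs` and the values are illustrative.

```python
def transimpedance_outputs(column_currents, feedback_resistance, second_bias=0.0):
    """Ideal inverting transimpedance stage: V_out = Vs - I * R_f.

    The inverting node is held at the second bias voltage Vs, so all of the
    column current flows through the feedback resistor to the output.
    """
    return [second_bias - i * feedback_resistance for i in column_currents]


I_cols = [0.5, 0.25]                           # column currents in mA (example)
V_out = transimpedance_outputs(I_cols, 2.0)    # R_f = 2 kilo-ohms (example)
print(V_out)                                   # [-1.0, -0.5]
```

- The sign of the output depends on the assumed current direction; for current flowing out of the inverting node the second term changes sign.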
- The above illustrates each module in the analog vector-matrix multiplication circuit provided by the embodiment of the present invention. The specific structure of each module is not limited to the structure provided above and may also be another structure known to those skilled in the art, which is not limited herein.
- the embodiment of the present application further provides a control method of an analog vector-matrix multiplication operation circuit, which can be used to control the analog vector-matrix multiplication operation circuit described in the foregoing embodiments, as described in the following embodiments. Since the principle of the control method to solve the problem is similar to the above-mentioned circuit, the implementation of the control method can refer to the implementation of the above-mentioned circuit, and the duplicated parts will not be repeated.
- the control method of the analog vector-matrix multiplication operation circuit is shown in FIG. 5 and is used to control the above analog vector-matrix multiplication operation circuits.
- the control method includes:
- Step S430: apply a plurality of analog voltage input signals, through the plurality of analog voltage input terminals, to the gates of all the programmable semiconductor devices in the corresponding rows.
- Step S440: apply a preset bias voltage, through the plurality of bias voltage input terminals, to all the programmable semiconductor devices in the corresponding columns.
- in this step, when the analog vector-matrix multiplication circuit uses the gate-coupled, source-summing topology, the preset bias voltage is applied to the drains of the programmable semiconductor devices; when the circuit uses the gate-coupled, drain-summing topology, the preset bias voltage is applied to the sources of the programmable semiconductor devices.
- Step S450: obtain a plurality of analog current output signals through the plurality of analog current output terminals corresponding to the columns of programmable semiconductor devices.
- optionally, if the input signals are analog current input signals, they are first converted into a plurality of analog voltage input signals by the conversion device 5, and the analog voltage input signals are then fed to the analog voltage input terminals to perform the matrix multiplication.
- optionally, the analog current output signal obtained at each column is the sum, over every row connected to that column, of the product of the row's analog voltage input signal and the weight of the corresponding programmable semiconductor device in the column.
- control method of the analog vector-matrix multiplication circuit further includes:
- Step S420 The threshold voltage of the programmable semiconductor device is adjusted by the programming circuit.
- control method further includes:
- Step S410 Based on the number of bits required for the matrix multiplication operation, a controller is used to control the number of programmable semiconductor devices that are put into operation.
- An embodiment of the present invention further provides a storage device including the foregoing analog vector-matrix multiplication operation circuit.
- the storage device integrates an analog vector-matrix multiplication circuit and performs vector-matrix multiplication on analog signals directly in the storage device, which removes the need to transfer data between the memory and the processor, improves processing performance, and reduces power consumption and cost.
- the storage device is a flash memory or an electrically erasable programmable read-only memory.
- the flash memory is a NOR type flash memory.
- An embodiment of the present invention further provides a chip, including the foregoing analog vector-matrix multiplication operation circuit.
- the floating-gate transistor may be, without limitation, a SONOS floating-gate transistor, a split-gate floating-gate transistor, or a charge-trapping floating-gate transistor; every transistor whose threshold voltage can be adjusted by changing the number of electrons in its floating gate falls within the protection scope of the embodiments of the present invention.
- the analog vector-matrix multiplication circuit, the control method, the storage device, and the chip according to the embodiments of the present invention can be used to perform related operations in terminals such as computers, mobile phones, and tablet computers.
- the other indispensable components of the analog vector-matrix multiplication circuit are understood by those of ordinary skill in the art; they are not described here and should not be taken as a limitation on the present invention.
- each programmable semiconductor device is regarded as a variable equivalent analog weight, which is equivalent to the analog matrix data.
- the analog voltage is applied to the programmable semiconductor device array to implement the matrix multiplication function.
- the circuit structure is simple, the number of components is small, the response speed is fast, and the power consumption is low, which greatly reduces the overhead caused by analog-to-digital conversion, digital-to-analog conversion, and data transmission, and effectively improves the processing performance of the arithmetic circuit.
- when the circuit is idle, the programmable semiconductor device array can be used as a flash memory or an electrically erasable programmable read-only memory, which reuses the electrical components, improves component utilization, and saves integrated-circuit hardware cost.
- the storage device integrates an analog vector-matrix multiplication circuit and performs vector-matrix multiplication on analog signals directly in the storage device, without transferring data back and forth between the memory and the processor, which improves processing performance and reduces power consumption and cost.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Power Engineering (AREA)
- Computer Hardware Design (AREA)
- Mathematical Optimization (AREA)
- Mathematical Analysis (AREA)
- Pure & Applied Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Computational Mathematics (AREA)
- Algebra (AREA)
- Databases & Information Systems (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Logic Circuits (AREA)
Abstract
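An analog vector-matrix multiplication circuit implemented with an array of programmable semiconductor devices. In the array, the gates of all programmable semiconductor devices in each row are connected to the same analog voltage input terminal, so that M rows of devices are connected to M analog voltage input terminals; the drains (or sources) of all devices in each column are connected to the same bias voltage input terminal, so that N columns of devices are connected to N bias voltage input terminals; and the sources (or drains) of all devices in each column are connected to the same analog current output terminal, so that N columns of devices are connected to N analog current output terminals. By controlling the threshold voltage of each programmable semiconductor device, each device is treated as a variable equivalent analog weight, which implements the matrix multiplication function.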
Description
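This application claims priority to Chinese patent application No. 201810872120.9, entitled "Analog Vector-Matrix Multiplication Circuit" and filed with the Chinese Patent Office on August 2, 2018, the entire contents of which are incorporated herein by reference.
The present invention relates to the field of signal processing, and in particular to an analog vector-matrix multiplication circuit.
Matrix multiplication is widely used in data mining fields such as image processing, recommender systems, and dimensionality reduction. However, traditional architectures that rely on a single computer working serially are increasingly unable to meet the demands of processing massive data. Enlarging the scale of matrix multiplication and reducing its running time therefore helps matrix factorization algorithms handle large-scale data. Matrix multiplication, however, has high time complexity. The traditional method computes the product by taking the inner product of each row of the left matrix with each column of the right matrix. This algorithm can be implemented as a distributed algorithm, but its performance is poor. Another formulation takes the outer product of each column of the left matrix with the corresponding row of the right matrix to obtain a partial result of the product matrix, and finally sums the partial results. Although this algorithm is much more efficient than the traditional one when parallelized, it still has a bottleneck: when the matrices are so large that the memory of a single machine cannot hold one row of the left matrix and one column of the right matrix, the computation cannot be performed.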
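The two software formulations mentioned above can be compared in a minimal Python sketch. It is only an illustration of the background discussion, not part of the claimed circuit; the function names `matmul_inner` and `matmul_outer` are hypothetical and chosen here for clarity.

```python
def matmul_inner(a, b):
    """C[i][j] is the inner product of row i of a and column j of b."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][t] * b[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]


def matmul_outer(a, b):
    """Sum the outer products of column t of a with row t of b."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for t in range(k):              # each t yields one partial result matrix
        for i in range(n):
            for j in range(m):
                c[i][j] += a[i][t] * b[t][j]
    return c


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul_inner(a, b))  # [[19, 22], [43, 50]]
print(matmul_outer(a, b))  # [[19, 22], [43, 50]]
```

Both functions perform the same multiply-accumulate work; they differ only in loop order, which is what determines how the work can be split across machines.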
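Vector-matrix multiplication is a commonly used computing function. In the traditional von Neumann architecture, the memory and the processor are physically separate and connected by a data bus. To perform a vector-matrix multiplication, the vector and matrix data must first be read from the memory and transferred to the processor, the computation is carried out, and the result is written back to the memory. This consumes a large amount of data-bus bandwidth and transfer power. Vector-matrix multiplication on analog signals is more complicated still. The analog signals must first be converted into digital signals, for example by analog-to-digital conversion, and stored in the memory; the vector-matrix multiplication is then performed as described above; and the digital result is converted back into an analog signal, for example by digital-to-analog conversion. Such analog vector-matrix multiplication incurs even higher power consumption and cost, and its processing performance is poor. With the rise of big-data applications, transferring and processing massive data has made these problems worse.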
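Summary of the Invention
In view of this, embodiments of the present invention provide an analog vector-matrix multiplication circuit that addresses the poor processing performance of existing matrix multiplication.
To achieve the above objective, the present invention adopts the following technical solutions:
An analog vector-matrix multiplication circuit includes: a plurality of analog voltage input terminals, a programmable semiconductor device array, a plurality of first terminals, and a plurality of second terminals;
in the programmable semiconductor device array, the gates of all programmable semiconductor devices in each row are connected to the same analog voltage input terminal, so that the rows of devices are connected to the corresponding analog voltage input terminals; the drains of all programmable semiconductor devices in each column are connected to the same first terminal, so that the columns of devices are connected to the corresponding first terminals; the sources of all programmable semiconductor devices in each column are connected to the same second terminal, so that the columns of devices are connected to the corresponding second terminals; and the threshold voltage of each programmable semiconductor device is adjustable;
wherein the first terminals are bias voltage input terminals and the second terminals are analog current output terminals,
or the first terminals are analog current output terminals and the second terminals are bias voltage input terminals.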
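In one embodiment, the analog vector-matrix multiplication circuit further includes:
a programming circuit, connected to the source, gate, and/or substrate of each programmable semiconductor device in the array, for adjusting the threshold voltage of the programmable semiconductor devices.
In one embodiment, the programming circuit includes a voltage generation circuit and a voltage control circuit. The voltage generation circuit generates a programming voltage or an erase voltage. The voltage control circuit applies the programming voltage to the source of a selected programmable semiconductor device, or applies the erase voltage to the gate or substrate of a selected programmable semiconductor device, to adjust the threshold voltage of the device.
In one embodiment, the analog vector-matrix multiplication circuit further includes:
a controller, connected to the programming circuit, which controls the operation of the programming circuit and thereby controls the number of programmable semiconductor devices put into operation and the threshold voltage of each device.
In one embodiment, the controller includes a row-column decoder for selecting the programmable semiconductor device to be programmed.
In one embodiment, the analog vector-matrix multiplication circuit further includes a conversion device, connected before the plurality of analog voltage input terminals, for converting a plurality of analog current input signals into analog voltage input signals and feeding them to the corresponding analog voltage input terminals.
In one embodiment, the conversion device includes a plurality of programmable semiconductor devices;
the gate of each programmable semiconductor device is connected to its drain and to the corresponding analog voltage input terminal;
the source of each programmable semiconductor device is connected to a first bias voltage.
In one embodiment, the analog vector-matrix multiplication circuit further includes a current detection output circuit, connected after the analog current output terminals, for processing and outputting the analog current output signals from the analog current output terminals.
In one embodiment, the current detection output circuit includes a plurality of operational amplifiers. The non-inverting input terminal of each operational amplifier is connected to a second bias voltage, the inverting input terminal is connected to the corresponding analog current output terminal, and a resistor or a transistor is connected between the inverting input terminal and the output terminal.
In one embodiment, the programmable semiconductor devices are floating-gate transistors.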
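The present invention further provides a control method for the analog vector-matrix multiplication circuit described above. The control method includes:
controlling, with a controller, the number of programmable semiconductor devices put into operation, based on the number of bits required by the matrix multiplication;
adjusting the threshold voltages of the programmable semiconductor devices through the programming circuit;
applying a plurality of analog voltage input signals, through the plurality of analog voltage input terminals, to the gates of all programmable semiconductor devices in the corresponding rows;
applying a preset bias voltage, through the plurality of bias voltage input terminals, to all programmable semiconductor devices in the corresponding columns;
obtaining a plurality of analog current output signals through the plurality of analog current output terminals corresponding to the columns of programmable semiconductor devices.
In one embodiment, before the plurality of analog voltage input signals are applied through the plurality of analog voltage input terminals to the gates of all programmable semiconductor devices in the corresponding rows, the control method further includes:
converting a plurality of analog current input signals into a plurality of analog voltage input signals through the conversion device.
The present invention further provides a storage device including the analog vector-matrix multiplication circuit described above.
The present invention further provides a chip including the analog vector-matrix multiplication circuit described above.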
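In the analog vector-matrix multiplication circuit and its control method provided by the present invention, the threshold voltage $V_{TH}$ of each programmable semiconductor device is dynamically adjusted in advance according to a given rule, so that each device can be treated as a variable equivalent analog weight, which is equivalent to storing one analog value, and the programmable semiconductor device array stores an array of analog values. In operation, a column of analog voltages, or a column of analog voltages obtained from analog currents by the conversion device, is applied to the gates of the corresponding programmable semiconductor devices, so that each gate receives a voltage signal and each source (or drain) outputs an analog current output signal. According to the device characteristics, the analog current output by the source (or drain) of each device equals the voltage multiplied by the weight. Because the sources (or drains) of all devices in each column are connected to the same analog current output terminal, Kirchhoff's law implies that the analog current output signal at that terminal is the sum of the source (or drain) currents of all devices in the column, that is, the sum of the products of gate voltage and weight over the column. The multiple analog current output terminals output multiple such sums, which implements the matrix multiplication function. The present invention uses a programmable semiconductor device array to implement analog vector-matrix multiplication. Because programmable semiconductor devices offer high integration density, fast response, and low power consumption, the circuit effectively reduces the overhead of analog-to-digital conversion, digital-to-analog conversion, and data transfer, and its processing performance is improved.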
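Moreover, when the analog vector-matrix multiplication circuit provided by the present invention is idle, the programmable semiconductor device array can be used as a flash memory or an electrically erasable programmable read-only memory, which reuses the electrical components, improves component utilization, and saves integrated-circuit hardware cost.
In addition, by placing a current detection output circuit after the analog current output terminals, the circuit accurately processes and outputs the computed current, or feeds it to the input of the next programmable semiconductor device array, which effectively improves the accuracy of the output current.
The storage device provided by the present invention integrates the analog vector-matrix multiplication circuit and performs vector-matrix multiplication on analog signals directly in the storage device, without transferring data back and forth between the memory and the processor, which improves processing performance and reduces power consumption and cost.
To make the above and other objectives, features, and advantages of the present invention more apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.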
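To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for the description are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1A is a schematic diagram of a first embodiment of the analog vector-matrix multiplication circuit of the present invention;
FIG. 1B is a schematic diagram of a second embodiment of the analog vector-matrix multiplication circuit of the present invention;
FIG. 2 is a structural diagram of a floating-gate transistor in an analog vector-matrix multiplication circuit according to an embodiment of the present invention;
FIG. 3A is a schematic diagram of a third embodiment of the analog vector-matrix multiplication circuit of the present invention;
FIG. 3B is a schematic diagram of a fourth embodiment of the analog vector-matrix multiplication circuit of the present invention;
FIG. 4 is a schematic diagram of a fifth embodiment of the analog vector-matrix multiplication circuit of the present invention;
FIG. 5 is a flowchart of a control method for an analog vector-matrix multiplication circuit according to an embodiment of the present invention.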
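The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
Vector-matrix multiplication is a commonly used computing function, and the performance, power consumption, and cost of existing analog vector-matrix multiplication leave room for improvement. In the circuit provided by the present invention, the threshold voltage of each programmable semiconductor device is adjusted so that each device is treated as a variable equivalent analog weight, corresponding to the analog matrix data, and analog voltages are applied to the array to implement matrix multiplication. The circuit structure is simple, uses few components, responds quickly, and consumes little power. It greatly reduces the overhead of analog-to-digital conversion, digital-to-analog conversion, and data transfer, and greatly improves processing performance.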
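FIG. 1A is a schematic diagram of the first embodiment of the analog vector-matrix multiplication circuit of the present invention. As shown in FIG. 1A, the circuit includes M analog voltage input terminals, a programmable semiconductor device array of M rows × N columns, N first terminals, and N second terminals, where the first terminals are bias voltage input terminals and the second terminals are analog current output terminals.
In the array, the gates of all programmable semiconductor devices in each row are connected to the same analog voltage input terminal, so that the M rows of devices are connected to the M analog voltage input terminals; the drains of all devices in each column are connected to the same bias voltage input terminal, so that the N columns of devices are connected to the N bias voltage input terminals; and the sources of all devices in each column are connected to the same analog current output terminal, so that the N columns of devices are connected to the N analog current output terminals. The threshold voltage of each programmable semiconductor device is adjustable. M and N are positive integers and may be equal or different.
This connection forms a gate-coupled, source-summing topology.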
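By dynamically adjusting the threshold voltage $V_{TH}$ of each programmable semiconductor device in advance according to a given rule, each device can be treated as a variable equivalent analog weight (denoted $W_{k,j}$, where $1 \le k \le M$ and $1 \le j \le N$ are the row and column indices), which is equivalent to storing one analog value, and the programmable semiconductor device array stores an array of analog values

$$W = \begin{pmatrix} W_{1,1} & \cdots & W_{1,N} \\ \vdots & \ddots & \vdots \\ W_{M,1} & \cdots & W_{M,N} \end{pmatrix}.$$

In operation, a column of analog voltage signals $V_1, \ldots, V_M$ is applied to the M rows of programmable semiconductor devices, where the gates of all devices in the k-th row receive the analog voltage signal $V_k$, the drains receive a bias voltage $V_b$, and the sources output the current signals $I_{k,1}, \ldots, I_{k,N}$. According to the device characteristic $I = V \times W$, the source output current of each device equals the gate voltage multiplied by the weight of that device, that is, $I_{k,1} = V_k W_{k,1}$, ..., $I_{k,N} = V_k W_{k,N}$. Because the sources of all devices in each column are connected to the same analog current output terminal, Kirchhoff's law implies that the current $I_j$ at that terminal is the sum of the source currents of all devices in the column:

$$I_j = \sum_{k=1}^{M} I_{k,j} = \sum_{k=1}^{M} V_k W_{k,j}.$$

The multiple analog current output terminals output the multiple current sums

$$\sum_{k=1}^{M} V_k W_{k,1},\ \ldots,\ \sum_{k=1}^{M} V_k W_{k,N},$$

which implements the matrix multiplication function.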
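The column summation described above can be illustrated with a short behavioural sketch in Python. It assumes the idealized device characteristic I = V × W stated in the text and ignores all circuit non-idealities; the function name `column_currents` is hypothetical.

```python
def column_currents(voltages, weights):
    """Return I_j = sum over k of V_k * W[k][j] for every column j.

    voltages: M gate voltages, one per row.
    weights:  M x N equivalent analog weights, one per device.
    """
    m, n = len(weights), len(weights[0])
    if len(voltages) != m:
        raise ValueError("need exactly one input voltage per row")
    currents = [0.0] * n
    for k in range(m):
        for j in range(n):
            # Device (k, j) adds V_k * W[k][j] to the shared output node of
            # column j, which is the Kirchhoff current sum.
            currents[j] += voltages[k] * weights[k][j]
    return currents


v = [1.0, 2.0]                      # M = 2 input voltages
w = [[0.5, 1.0, 0.0],               # M x N = 2 x 3 weight array
     [0.25, 0.0, 2.0]]
print(column_currents(v, w))        # [1.0, 1.0, 4.0]
```

The result is the row vector V multiplied by the weight matrix W, with one entry per analog current output terminal.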
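The present invention uses a programmable semiconductor device array to implement analog vector-matrix multiplication. Because programmable semiconductor devices offer high integration density, fast response, and low power consumption, the analog vector-matrix multiplication circuit built from them effectively reduces the overhead of analog-to-digital conversion, digital-to-analog conversion, and data transfer, and its processing performance is greatly improved.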
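FIG. 1B is a schematic diagram of the second embodiment of the analog vector-matrix multiplication circuit of the present invention. As shown in FIG. 1B, the circuit includes M analog voltage input terminals, a programmable semiconductor device array of M rows × N columns, N first terminals, and N second terminals, where the first terminals are analog current output terminals and the second terminals are bias voltage input terminals.
In the array, the gates of all programmable semiconductor devices in each row are connected to the same analog voltage input terminal, so that the M rows of devices are connected to the M analog voltage input terminals; the sources of all devices in each column are connected to the same bias voltage input terminal, so that the N columns of devices are connected to the N bias voltage input terminals; and the drains of all devices in each column are connected to the same analog current output terminal, so that the N columns of devices are connected to the N analog current output terminals. The threshold voltage of each programmable semiconductor device is adjustable. M and N are positive integers and may be equal or different.
This connection forms a gate-coupled, drain-summing topology.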
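By dynamically adjusting the threshold voltage $V_{TH}$ of each programmable semiconductor device in advance according to a given rule, each device can be treated as a variable equivalent analog weight (denoted $W_{k,j}$, where $1 \le k \le M$ and $1 \le j \le N$ are the row and column indices), which is equivalent to storing one analog value, and the programmable semiconductor device array stores an array of analog values

$$W = \begin{pmatrix} W_{1,1} & \cdots & W_{1,N} \\ \vdots & \ddots & \vdots \\ W_{M,1} & \cdots & W_{M,N} \end{pmatrix}.$$

In operation, a column of analog voltage signals $V_1, \ldots, V_M$ is applied to the M rows of programmable semiconductor devices, where the gates of all devices in the k-th row receive the analog voltage signal $V_k$, the sources receive a bias voltage $V_b$, and the drains output the current signals $I_{k,1}, \ldots, I_{k,N}$. According to the device characteristic $I = V \times W$, the drain output current of each device equals the gate voltage multiplied by the weight of that device, that is, $I_{k,1} = V_k W_{k,1}$, ..., $I_{k,N} = V_k W_{k,N}$. Because the drains of all devices in each column are connected to the same analog current output terminal, Kirchhoff's law implies that the current $I_j$ at that terminal is the sum of the drain currents of all devices in the column:

$$I_j = \sum_{k=1}^{M} I_{k,j} = \sum_{k=1}^{M} V_k W_{k,j}.$$

The multiple analog current output terminals output the multiple current sums

$$\sum_{k=1}^{M} V_k W_{k,1},\ \ldots,\ \sum_{k=1}^{M} V_k W_{k,N},$$

which implements the matrix multiplication function.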
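The present invention uses a programmable semiconductor device array to implement analog vector-matrix multiplication. Because programmable semiconductor devices offer high integration density, fast response, and low power consumption, the analog vector-matrix multiplication circuit built from them effectively reduces the overhead of analog-to-digital conversion, digital-to-analog conversion, and data transfer, and its processing performance is greatly improved.
In addition, because the gate-source voltage $V_{GS}$ of a programmable semiconductor device determines its output current, the output current is very sensitive to the source voltage, which may cause computation errors. This embodiment uses a gate-coupled, drain-summing topology, so that even if the source voltage fluctuates, no computation error results, which improves computation accuracy.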
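In the first or second embodiment, optionally, the programmable semiconductor device may be implemented as a floating-gate transistor, whose structure is shown in FIG. 2. The floating-gate transistor includes a substrate, an insulating layer, a gate G, a source S, a drain D, and a floating gate F. The floating gate is located between the gate and the insulating layer, and the insulating layer is located between the floating gate and the substrate to keep the electrons in the floating gate from leaking. The floating gate can store electrons, and the threshold voltage of the transistor is adjusted dynamically by changing the number of electrons in the floating gate. Because of this structural property, the transistor can be treated as a variable equivalent analog weight that stores one analog value.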
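As a rough aside that the source does not state, a common first-order textbook model relates the threshold-voltage shift of a floating-gate transistor to the charge stored on the floating gate. The symbols below are assumptions introduced only for this illustration: $Q_{FG}$ is the stored floating-gate charge, $C_{CG}$ is the capacitance between the control gate and the floating gate, $n$ is the number of stored electrons, and $q$ is the elementary charge.

```latex
\Delta V_{TH} = -\frac{Q_{FG}}{C_{CG}},
\qquad
Q_{FG} = -n\,q
\;\Longrightarrow\;
\Delta V_{TH} = \frac{n\,q}{C_{CG}}
```

Under this simplified model, adding electrons to the floating gate raises the threshold voltage and removing them lowers it, which matches the qualitative behaviour described above.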
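In the first or second embodiment, optionally, the analog vector-matrix multiplication circuit may further include:
a programming circuit, connected to the source, gate, and/or substrate of each programmable semiconductor device in the array, for adjusting the threshold voltage of the programmable semiconductor devices.
Preferably, the programming circuit includes a voltage generation circuit and a voltage control circuit. The voltage generation circuit generates a programming voltage or an erase voltage. The voltage control circuit applies the programming voltage to the source of a selected programmable semiconductor device, or applies the erase voltage to the gate or substrate of a selected programmable semiconductor device, to adjust the threshold voltage of the device.
Specifically, using hot-electron injection, the programming circuit applies a high voltage to the source of the programmable semiconductor device according to the required threshold-voltage data, which accelerates channel electrons to high speed and increases the threshold voltage of the device.
Using the tunneling effect, the programming circuit applies a high voltage to the gate or substrate of the programmable semiconductor device according to the required threshold-voltage data, which decreases the threshold voltage of the device.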
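In the first or second embodiment, optionally, the analog vector-matrix multiplication circuit may further include:
a controller, connected to the programming circuit, which controls the operation of the programming circuit and thereby adjusts the number of programmable semiconductor devices put into operation and the threshold voltage of each device, to suit the requirements of the matrix multiplication.
Preferably, the controller includes a row-column decoder for selecting the programmable semiconductor device to be programmed.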
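The selection performed by a row-column decoder can be sketched as follows. The source does not specify an addressing scheme, so the row-major flat address and the one-hot select lines used here are assumptions, and the names `decode_address` and `select_lines` are hypothetical.

```python
def decode_address(address, rows, cols):
    """Map a flat device address to (row, column), assuming row-major order."""
    if not 0 <= address < rows * cols:
        raise ValueError("address lies outside the array")
    return divmod(address, cols)


def select_lines(address, rows, cols):
    """Return one-hot row and column select lines for the addressed device."""
    row, col = decode_address(address, rows, cols)
    row_select = [int(r == row) for r in range(rows)]
    col_select = [int(c == col) for c in range(cols)]
    return row_select, col_select


print(select_lines(5, rows=3, cols=4))  # ([0, 1, 0], [0, 1, 0, 0])
```

Exactly one row line and one column line are active, so exactly one device in the array is selected for programming.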
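In the first or second embodiment, optionally, the analog vector-matrix multiplication circuit may further include a bias voltage generation circuit for generating the preset bias voltage and feeding it to the bias voltage input terminals. It can be understood that the circuit may instead omit the bias voltage generation circuit and reuse the voltage generation circuit in the programming circuit, controlling it to generate the preset bias voltage and feed it to the bias voltage input terminals.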
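FIG. 3A is a schematic diagram of the third embodiment of the analog vector-matrix multiplication circuit of the present invention. In addition to everything in the first embodiment shown in FIG. 1A or the second embodiment shown in FIG. 1B, the circuit may further include a conversion device 5, connected before the plurality of analog voltage input terminals, for converting a plurality of analog current input signals into analog voltage input signals and feeding them to the corresponding analog voltage input terminals.
In an optional embodiment, the conversion device 5 includes a plurality of programmable semiconductor devices.
The gate of each programmable semiconductor device is connected to its drain and to the corresponding analog voltage input terminal.
The source of each programmable semiconductor device is connected to a first bias voltage.
It can be understood that the first bias voltage connected to the source may be the ground voltage, that is, the source is grounded.
In this embodiment, the gate and drain of each programmable semiconductor device are connected together to receive the analog current input signal.
Optionally, the programmable semiconductor devices in the conversion device 5 may be floating-gate transistors.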
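In operation, a column of analog current input signals $I_{in1}, \ldots, I_{inM}$ is converted by the conversion device 5 into a column of analog voltage input signals $V_1, \ldots, V_M$, which are applied to the M rows of programmable semiconductor devices. With the conversion device, the analog vector-matrix multiplication circuit in the embodiments of the present invention accepts analog current input signals as well as analog voltage input signals, which broadens the applicability of the circuit.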
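FIG. 3B is a schematic diagram of the fourth embodiment of the analog vector-matrix multiplication circuit of the present invention. In addition to everything in the first embodiment shown in FIG. 1A or the second embodiment shown in FIG. 1B, the circuit may further include a conversion device 5, connected before the plurality of analog voltage input terminals, for converting a plurality of analog current input signals into analog voltage input signals and feeding them to the corresponding analog voltage input terminals.
In an optional embodiment, the conversion device 5 includes a plurality of resistors, connected to the plurality of analog voltage input terminals in one-to-one correspondence.
One end of each resistor is connected to the corresponding analog voltage input terminal and the other end is connected to a first bias voltage. It can be understood that the first bias voltage may be the ground voltage, that is, the other end of the resistor is grounded.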
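In operation, a column of analog current input signals $I_{in1}, \ldots, I_{inM}$ is converted by the conversion device 5 into a column of analog voltage input signals $V_1, \ldots, V_M$, which are applied to the M rows of programmable semiconductor devices.
With the conversion device, the analog vector-matrix multiplication circuit in the embodiments of the present invention accepts analog current input signals as well as analog voltage input signals, which broadens the applicability of the circuit.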
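For the resistor-based conversion device, the conversion follows Ohm's law and can be sketched as below. The sketch assumes ideal resistors of equal value, an input current that flows from the input terminal through the resistor to the first bias voltage, and gate terminals that draw no current; the name `currents_to_voltages` and the parameter names are hypothetical.

```python
def currents_to_voltages(currents, resistance, v_bias=0.0):
    """Convert each input current I_k to V_k = v_bias + I_k * R.

    currents:   analog input currents in amperes, one per row.
    resistance: value of each conversion resistor in ohms (assumed equal).
    v_bias:     first bias voltage in volts; 0.0 models a grounded resistor.
    """
    if resistance <= 0:
        raise ValueError("resistance must be positive")
    return [v_bias + i * resistance for i in currents]


# Two input currents of 1 uA and 2 uA through 100 kOhm resistors to ground.
print(currents_to_voltages([1e-6, 2e-6], resistance=100e3))
```

The resulting voltages are then the row inputs of the array, so the overall circuit output scales with the chosen resistance.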
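It should be noted that the above implementations of the conversion device are only examples. Any circuit structure or circuit element that can convert a current input signal into a voltage input signal can be used to implement the conversion device, for example a metal-semiconductor field-effect transistor.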
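FIG. 4 is a schematic diagram of the fifth embodiment of the analog vector-matrix multiplication circuit of the present invention. As shown in FIG. 4, in addition to everything in any one of the first to fourth embodiments, the circuit may further include a current detection output circuit 6, connected after the analog current output terminals, for processing and outputting the analog current output signals from the analog current output terminals.
Through the current detection output circuit, the computed current is accurately processed and output, or is fed to the input of the next programmable semiconductor device array, which effectively achieves accurate current output.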
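In an optional embodiment, the current detection output circuit may include a plurality of operational amplifiers. The non-inverting input terminal of each operational amplifier is connected to a second bias voltage Vs, the inverting input terminal is connected to the corresponding analog current output terminal, and a resistor, a transistor, or the like is connected between the inverting input terminal and the output terminal.
The non-inverting input terminal is generally grounded. The operational amplifier holds the voltage at the analog current output terminal equal to the voltage at the non-inverting input terminal, which ensures that the gate-source voltage $V_{GS}$ of each programmable semiconductor device is controlled only by that device's corresponding input voltage, so that the output voltage of the operational amplifier represents the amplitude of the output current of the corresponding column of programmable semiconductor devices.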
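The behaviour of one operational amplifier stage with resistive feedback can be approximated by the ideal transimpedance relation sketched below. This assumes an ideal operational amplifier, a feedback resistor rather than a transistor, and a column current that flows into the inverting node; the names `tia_output`, `recover_current`, and `r_feedback` are hypothetical.

```python
def tia_output(i_column, r_feedback, v_s=0.0):
    """Ideal stage: the inverting node sits at v_s and the whole column
    current flows through the feedback resistor, so V_out = v_s - I * R_f."""
    return v_s - i_column * r_feedback


def recover_current(v_out, r_feedback, v_s=0.0):
    """Invert the relation above to read the column current from V_out."""
    return (v_s - v_out) / r_feedback


# A 2 uA column current read through a 50 kOhm feedback resistor.
v_out = tia_output(2e-6, r_feedback=50e3)
print(v_out, recover_current(v_out, r_feedback=50e3))
```

Because the inverting node is held at the second bias voltage, the output voltage depends only on the summed column current and the feedback element, which is why it can represent the amplitude of the column current.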
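The above merely illustrates the specific structure of each module in the analog vector-matrix multiplication circuit provided by the embodiments of the present invention. In a specific implementation, the structure of each module is not limited to the structure described above and may be any other structure known to those skilled in the art, which is not limited herein.
The embodiments of the present application further provide a control method for an analog vector-matrix multiplication circuit, which can be used to control the circuit described in the foregoing embodiments, as described below. Because the control method solves the problem on the same principle as the circuit, its implementation can refer to the implementation of the circuit, and repeated details are omitted.
The control method of the analog vector-matrix multiplication circuit is shown in FIG. 5 and is used to control each of the analog vector-matrix multiplication circuits described above. The control method includes: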
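Step S430: apply a plurality of analog voltage input signals, through the plurality of analog voltage input terminals, to the gates of all programmable semiconductor devices in the corresponding rows.
Step S440: apply a preset bias voltage, through the plurality of bias voltage input terminals, to all programmable semiconductor devices in the corresponding columns.
In this step, when the analog vector-matrix multiplication circuit uses the gate-coupled, source-summing topology, the preset bias voltage is applied to the drains of the programmable semiconductor devices; when the circuit uses the gate-coupled, drain-summing topology, the preset bias voltage is applied to the sources of the programmable semiconductor devices.
Step S450: obtain a plurality of analog current output signals through the plurality of analog current output terminals corresponding to the columns of programmable semiconductor devices.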
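Optionally, if the input signals are analog current input signals, they are first converted into a plurality of analog voltage input signals by the conversion device 5, and the analog voltage input signals are then fed to the analog voltage input terminals to perform the matrix multiplication.
Optionally, the analog current output signal obtained at each column is the sum, over every row connected to that column, of the product of the row's analog voltage input signal and the weight of the corresponding programmable semiconductor device in the column.
Preferably, the control method of the analog vector-matrix multiplication circuit further includes:
Step S420: adjust the threshold voltages of the programmable semiconductor devices through the programming circuit.
Preferably, the control method further includes:
Step S410: based on the number of bits required by the matrix multiplication, use the controller to control the number of programmable semiconductor devices put into operation.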
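The order of steps S410 to S450 can be traced with a small behavioural model. This is a toy sketch under stated assumptions, not the control logic of the actual circuit: the class `AnalogVMMModel` and its methods are hypothetical, step S410 is modelled simply as choosing an active sub-array, step S420 as writing ideal weights, and the bias voltage of step S440 is assumed to be present without being modelled numerically.

```python
class AnalogVMMModel:
    """Toy behavioural model of the control sequence (assumed, idealized)."""

    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        self.weights = [[0.0] * cols for _ in range(rows)]
        self.active_rows, self.active_cols = rows, cols

    def set_active_size(self, rows, cols):
        """Step S410: choose how many devices are put into operation."""
        if not (0 < rows <= self.rows and 0 < cols <= self.cols):
            raise ValueError("active region must fit inside the array")
        self.active_rows, self.active_cols = rows, cols

    def program(self, weights):
        """Step S420: set the equivalent weight of each active device."""
        for k in range(self.active_rows):
            for j in range(self.active_cols):
                self.weights[k][j] = weights[k][j]

    def run(self, voltages):
        """Steps S430 to S450: apply row voltages and read column currents."""
        if len(voltages) != self.active_rows:
            raise ValueError("need one input voltage per active row")
        return [sum(voltages[k] * self.weights[k][j]
                    for k in range(self.active_rows))
                for j in range(self.active_cols)]


vmm = AnalogVMMModel(rows=4, cols=4)
vmm.set_active_size(2, 3)                              # S410
vmm.program([[1.0, 0.0, 2.0], [0.0, 1.0, 3.0]])        # S420
print(vmm.run([2.0, 5.0]))                             # [2.0, 5.0, 19.0]
```

Programming happens before the inputs are applied, which mirrors the step numbering: S410 and S420 configure the array, and S430 to S450 perform one multiplication.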
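An embodiment of the present invention further provides a storage device including the analog vector-matrix multiplication circuit described above. The storage device integrates the analog vector-matrix multiplication circuit and performs vector-matrix multiplication on analog signals directly in the storage device, without transferring data back and forth between the memory and the processor, which improves processing performance and reduces power consumption and cost.
Preferably, the storage device is a flash memory or an electrically erasable programmable read-only memory.
Preferably, the flash memory is a NOR flash memory.
An embodiment of the present invention further provides a chip including the analog vector-matrix multiplication circuit described above.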
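In the above embodiments, the floating-gate transistor may be, without limitation, a SONOS floating-gate transistor, a split-gate floating-gate transistor, or a charge-trapping floating-gate transistor. Every transistor whose threshold voltage can be adjusted by changing the number of electrons in its floating gate falls within the protection scope of the embodiments of the present invention.
The analog vector-matrix multiplication circuit, the control method, the storage device, and the chip according to the embodiments of the present invention can be used to perform related operations in terminals such as computers, mobile phones, and tablet computers. The other indispensable components of the analog vector-matrix multiplication circuit are understood by those of ordinary skill in the art; they are not described here and should not be taken as a limitation on the present invention.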
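By adjusting the threshold voltage of each programmable semiconductor device, each device is treated as a variable equivalent analog weight, corresponding to the analog matrix data, and analog voltages are applied to the programmable semiconductor device array to implement matrix multiplication. The circuit structure is simple, uses few components, responds quickly, and consumes little power. It greatly reduces the overhead of analog-to-digital conversion, digital-to-analog conversion, and data transfer, and effectively improves the processing performance of the computing circuit.
Moreover, when the analog vector-matrix multiplication circuit provided by the present invention is idle, the programmable semiconductor device array can be used as a flash memory or an electrically erasable programmable read-only memory, which reuses the electrical components, improves component utilization, and saves integrated-circuit hardware cost.
The storage device provided by the present invention integrates the analog vector-matrix multiplication circuit and performs vector-matrix multiplication on analog signals directly in the storage device, without transferring data back and forth between the memory and the processor, which improves processing performance and reduces power consumption and cost.
Specific embodiments have been used herein to explain the principles and implementations of the present invention. The description of the above embodiments is only intended to help in understanding the method of the present invention and its core idea. Those of ordinary skill in the art may, following the idea of the present invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.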
Claims (14)
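- An analog vector-matrix multiplication circuit, characterized by comprising: a plurality of analog voltage input terminals, a programmable semiconductor device array, a plurality of first terminals, and a plurality of second terminals; in the programmable semiconductor device array, the gates of all programmable semiconductor devices in each row are connected to the same analog voltage input terminal, so that the rows of programmable semiconductor devices are connected to the corresponding analog voltage input terminals; the drains of all programmable semiconductor devices in each column are connected to the same first terminal, so that the columns of programmable semiconductor devices are connected to the corresponding first terminals; the sources of all programmable semiconductor devices in each column are connected to the same second terminal, so that the columns of programmable semiconductor devices are connected to the corresponding second terminals; and the threshold voltage of each programmable semiconductor device is adjustable; wherein the first terminals are bias voltage input terminals and the second terminals are analog current output terminals, or the first terminals are analog current output terminals and the second terminals are bias voltage input terminals.
- The analog vector-matrix multiplication circuit according to claim 1, characterized by further comprising: a programming circuit, connected to the source, gate, and/or substrate of each programmable semiconductor device in the programmable semiconductor device array, for adjusting the threshold voltage of the programmable semiconductor devices.
- The analog vector-matrix multiplication circuit according to claim 2, characterized in that the programming circuit comprises a voltage generation circuit and a voltage control circuit; the voltage generation circuit is configured to generate a programming voltage or an erase voltage; and the voltage control circuit is configured to apply the programming voltage to the source of a selected programmable semiconductor device, or to apply the erase voltage to the gate or substrate of a selected programmable semiconductor device, to adjust the threshold voltage of the programmable semiconductor device.
- The analog vector-matrix multiplication circuit according to claim 3, characterized by further comprising: a controller, connected to the programming circuit, which controls the operation of the programming circuit and thereby controls the number of programmable semiconductor devices put into operation and the threshold voltage of each programmable semiconductor device.
- The analog vector-matrix multiplication circuit according to claim 4, characterized in that the controller comprises a row-column decoder for selecting the programmable semiconductor device to be programmed.
- The analog vector-matrix multiplication circuit according to claim 1, characterized by further comprising: a conversion device, connected before the plurality of analog voltage input terminals, for converting a plurality of analog current input signals into analog voltage input signals and feeding them to the corresponding analog voltage input terminals.
- The analog vector-matrix multiplication circuit according to claim 6, characterized in that the conversion device comprises a plurality of programmable semiconductor devices; the gate of each of these programmable semiconductor devices is connected to its drain and to the corresponding analog voltage input terminal; and the source of each of these programmable semiconductor devices is connected to a first bias voltage.
- The analog vector-matrix multiplication circuit according to claim 1, characterized by further comprising: a current detection output circuit, connected after the analog current output terminals, for processing and outputting the analog current output signals from the analog current output terminals.
- The analog vector-matrix multiplication circuit according to claim 8, characterized in that the current detection output circuit comprises a plurality of operational amplifiers; the non-inverting input terminal of each operational amplifier is connected to a second bias voltage, the inverting input terminal is connected to the corresponding analog current output terminal, and a resistor or a transistor is connected between the inverting input terminal and the output terminal.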
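- A control method for an analog vector-matrix multiplication circuit, characterized in that it is applied to the analog vector-matrix multiplication circuit according to any one of claims 1 to 9, the control method comprising: applying a plurality of analog voltage input signals, through the plurality of analog voltage input terminals, to the gates of all programmable semiconductor devices in the corresponding rows; applying a preset bias voltage, through the plurality of bias voltage input terminals, to all programmable semiconductor devices in the corresponding columns; and obtaining a plurality of analog current output signals through the plurality of analog current output terminals corresponding to the columns of programmable semiconductor devices; wherein, if the input signals of the analog vector-matrix multiplication circuit are analog current input signals, the control method further comprises: converting the plurality of analog current input signals into a plurality of analog voltage input signals through a conversion device.
- The control method for an analog vector-matrix multiplication circuit according to claim 10, characterized by further comprising: adjusting the threshold voltages of the programmable semiconductor devices through a programming circuit.
- The control method for an analog vector-matrix multiplication circuit according to claim 10, characterized by further comprising: based on the number of bits required by the matrix multiplication, using a controller to control the number of programmable semiconductor devices put into operation.
- A storage device, characterized by comprising the analog vector-matrix multiplication circuit according to any one of claims 1 to 9.
- A chip, characterized by comprising the analog vector-matrix multiplication circuit according to any one of claims 1 to 9.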
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/163,617 US11379673B2 (en) | 2018-08-02 | 2021-02-01 | Analog vector-matrix multiplication circuit |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810872120.9A CN108763163B (zh) | 2018-08-02 | 2018-08-02 | 模拟向量-矩阵乘法运算电路 |
CN201810872120.9 | 2018-08-02 |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/163,617 Continuation US11379673B2 (en) | 2018-08-02 | 2021-02-01 | Analog vector-matrix multiplication circuit |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020024608A1 true WO2020024608A1 (zh) | 2020-02-06 |
Family
ID=63968761
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/081342 WO2020024608A1 (zh) | 2018-08-02 | 2019-04-03 | 模拟向量-矩阵乘法运算电路 |
Country Status (3)
Country | Link |
---|---|
US (1) | US11379673B2 (zh) |
CN (1) | CN108763163B (zh) |
WO (1) | WO2020024608A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111639758A (zh) * | 2020-04-11 | 2020-09-08 | 复旦大学 | 一种基于柔性材料的模拟卷积计算器件 |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108763163B (zh) * | 2018-08-02 | 2023-10-20 | 北京知存科技有限公司 | 模拟向量-矩阵乘法运算电路 |
CN111198670B (zh) * | 2018-11-20 | 2021-01-29 | 华为技术有限公司 | 执行矩阵乘法运算的方法、电路及soc |
CN111611534B (zh) * | 2019-02-26 | 2023-12-01 | 北京知存科技有限公司 | 一种动态偏置模拟向量-矩阵乘法运算电路及其运算控制方法 |
CN111611535A (zh) * | 2019-02-26 | 2020-09-01 | 北京知存科技有限公司 | 抗工艺偏差的模拟向量-矩阵乘法运算电路 |
CN111614353A (zh) * | 2019-02-26 | 2020-09-01 | 北京知存科技有限公司 | 一种存算一体芯片中数模转换电路与模数转换电路复用装置 |
CN110597487B (zh) * | 2019-08-26 | 2021-10-08 | 华中科技大学 | 一种矩阵向量乘法电路及计算方法 |
CN114424198A (zh) * | 2019-09-17 | 2022-04-29 | 安纳富来希股份有限公司 | 乘法累加器 |
CN111128279A (zh) * | 2020-02-25 | 2020-05-08 | 杭州知存智能科技有限公司 | 基于NAND Flash的存内计算芯片及其控制方法 |
CN112632460B (zh) * | 2020-12-20 | 2024-03-08 | 北京知存科技有限公司 | 源极耦合、漏极求和的模拟向量-矩阵乘法运算电路 |
CN115310030A (zh) * | 2021-05-07 | 2022-11-08 | 脸萌有限公司 | 一种矩阵乘法电路模块及方法 |
CN115310031A (zh) * | 2021-05-07 | 2022-11-08 | 脸萌有限公司 | 一种矩阵乘法电路模块及方法 |
CN113505342B (zh) * | 2021-07-08 | 2022-05-24 | 北京华大九天科技股份有限公司 | 一种rc矩阵向量乘法的改进方法 |
CN118484622A (zh) * | 2024-07-11 | 2024-08-13 | 苏州长江睿芯电子科技有限公司 | 一种基于sram imc的矩阵乘法器电路和方法 |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106843809A (zh) * | 2017-01-25 | 2017-06-13 | 北京大学 | 一种基于nor flash阵列的卷积运算方法 |
US20170277659A1 (en) * | 2016-03-23 | 2017-09-28 | Gsi Technology Inc. | In memory matrix multiplication and its usage in neural networks |
CN108053029A (zh) * | 2017-12-27 | 2018-05-18 | 宁波山丘电子科技有限公司 | 一种基于存储阵列的神经网络的训练方法 |
CN108763163A (zh) * | 2018-08-02 | 2018-11-06 | 北京知存科技有限公司 | 模拟向量-矩阵乘法运算电路 |
CN108777155A (zh) * | 2018-08-02 | 2018-11-09 | 北京知存科技有限公司 | 闪存芯片 |
CN109086249A (zh) * | 2018-08-02 | 2018-12-25 | 北京知存科技有限公司 | 模拟向量-矩阵乘法运算电路 |
CN109273035A (zh) * | 2018-08-02 | 2019-01-25 | 北京知存科技有限公司 | 闪存芯片的控制方法、终端 |
CN208507187U (zh) * | 2018-08-02 | 2019-02-15 | 北京知存科技有限公司 | 闪存芯片 |
CN208547942U (zh) * | 2018-08-02 | 2019-02-26 | 北京知存科技有限公司 | 模拟向量-矩阵乘法运算电路 |
CN208569628U (zh) * | 2018-08-02 | 2019-03-01 | 北京知存科技有限公司 | 模拟向量-矩阵乘法运算电路 |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4161785A (en) * | 1977-11-17 | 1979-07-17 | General Electric Company | Matrix multiplier |
US20050125477A1 (en) * | 2003-12-04 | 2005-06-09 | Genov Roman A. | High-precision matrix-vector multiplication on a charge-mode array with embedded dynamic memory and stochastic method thereof |
WO2006122271A2 (en) * | 2005-05-10 | 2006-11-16 | Georgia Tech Research Corporation | Systems and methods for programming floating-gate transistors |
WO2006124953A2 (en) * | 2005-05-16 | 2006-11-23 | Georgia Tech Research Corporation | Systems and methods for programming large-scale field-programmable analog arrays |
JP4988190B2 (ja) * | 2005-12-02 | 2012-08-01 | 富士通セミコンダクター株式会社 | 不揮発性半導体メモリ |
JP6702596B2 (ja) * | 2016-01-18 | 2020-06-03 | 華為技術有限公司Huawei Technologies Co.,Ltd. | 多層rramクロスバー・アレイに基づくメモリデバイス、およびデータ処理方法 |
US10496855B2 (en) * | 2016-01-21 | 2019-12-03 | Hewlett Packard Enterprise Development Lp | Analog sub-matrix computing from input matrixes |
US10529418B2 (en) * | 2016-02-19 | 2020-01-07 | Hewlett Packard Enterprise Development Lp | Linear transformation accelerators |
US9910827B2 (en) * | 2016-07-01 | 2018-03-06 | Hewlett Packard Enterprise Development Lp | Vector-matrix multiplications involving negative values |
WO2018201060A1 (en) * | 2017-04-27 | 2018-11-01 | The Regents Of The University Of California | Mixed signal neuromorphic computing with nonvolatile memory devices |
CN108009640B (zh) * | 2017-12-25 | 2020-04-28 | 清华大学 | 基于忆阻器的神经网络的训练装置及其训练方法 |
CN111542826A (zh) * | 2017-12-29 | 2020-08-14 | 斯佩罗设备公司 | 支持模拟协处理器的数字架构 |
US11507808B2 (en) * | 2018-06-01 | 2022-11-22 | Arizona Board Of Regents On Behalf Of Arizona State University | Multi-layer vector-matrix multiplication apparatus for a deep neural network |
US10452472B1 (en) * | 2018-06-04 | 2019-10-22 | Hewlett Packard Enterprise Development Lp | Tunable and dynamically adjustable error correction for memristor crossbars |
- 2018-08-02: CN, application CN201810872120.9A, patent CN108763163B (zh), status: Active
- 2019-04-03: WO, application PCT/CN2019/081342, publication WO2020024608A1 (zh), status: Application Filing
- 2021-02-01: US, application US17/163,617, patent US11379673B2 (en), status: Active
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111639758A (zh) * | 2020-04-11 | 2020-09-08 | 复旦大学 | 一种基于柔性材料的模拟卷积计算器件 |
CN111639758B (zh) * | 2020-04-11 | 2023-05-02 | 复旦大学 | 一种基于柔性材料的模拟卷积计算器件 |
Also Published As
Publication number | Publication date |
---|---|
US11379673B2 (en) | 2022-07-05 |
CN108763163B (zh) | 2023-10-20 |
CN108763163A (zh) | 2018-11-06 |
US20210365646A1 (en) | 2021-11-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020024608A1 (zh) | 模拟向量-矩阵乘法运算电路 | |
CN109086249B (zh) | 模拟向量-矩阵乘法运算电路 | |
CN108777155B (zh) | 闪存芯片 | |
US11886378B2 (en) | Computer architecture with resistive processing units | |
US10169297B2 (en) | Resistive memory arrays for performing multiply-accumulate operations | |
CN109146070B (zh) | 一种支撑基于rram的神经网络训练的外围电路及系统 | |
US11461621B2 (en) | Methods and systems of implementing positive and negative neurons in a neural array-based flash memory | |
CN209657299U (zh) | 模拟向量-矩阵乘法运算电路以及芯片 | |
CN111611534B (zh) | 一种动态偏置模拟向量-矩阵乘法运算电路及其运算控制方法 | |
US20200201751A1 (en) | Memory storage device and operation method thereof for implementing inner product operation | |
WO2020172951A1 (zh) | 可软件定义存算一体芯片及其软件定义方法 | |
US20220129519A1 (en) | Apparatus and method for matrix multiplication using processing-in-memory | |
CN111949935A (zh) | 模拟向量-矩阵乘法运算电路以及芯片 | |
CN111128279A (zh) | 基于NAND Flash的存内计算芯片及其控制方法 | |
CN211016545U (zh) | 基于NAND Flash的存内计算芯片、存储装置以及终端 | |
US10381074B1 (en) | Differential weight reading of an analog memory element in crosspoint array utilizing current subtraction transistors | |
CN112632460B (zh) | 源极耦合、漏极求和的模拟向量-矩阵乘法运算电路 | |
CN208547942U (zh) | 模拟向量-矩阵乘法运算电路 | |
CN113593622B (zh) | 存内计算装置及运算装置 | |
CN113571109B (zh) | 存储器电路及其操作方法 | |
CN111614353A (zh) | 一种存算一体芯片中数模转换电路与模数转换电路复用装置 | |
Zhang et al. | Fast Fourier transform (FFT) using flash arrays for noise signal processing | |
CN208507187U (zh) | 闪存芯片 | |
CN111859261B (zh) | 计算电路及其操作方法 | |
CN208569628U (zh) | 模拟向量-矩阵乘法运算电路 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19844960 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19844960 Country of ref document: EP Kind code of ref document: A1 |