CN114662682A

CN114662682A - Memristor-based memory-computation integrated computing unit, array circuit and control method

Info

Publication number: CN114662682A
Application number: CN202210306111.XA
Authority: CN
Inventors: 高润雄; 贾嵩; 段杰斌
Original assignee: Peking University
Current assignee: Peking University
Priority date: 2022-03-25
Filing date: 2022-03-25
Publication date: 2022-06-24

Abstract

The invention provides a memristor-based memory-computation integrated computing unit, an array circuit and a control method. In the computing unit, the drain of a first MOS transistor is connected to a first voltage input signal, and the source of a second MOS transistor is connected to a second voltage input signal; the gates of the first and second MOS transistors are both connected to a control voltage signal; the source of the first MOS transistor is connected to one end of a first memristor, and the drain of the second MOS transistor is connected to one end of a second memristor; the other ends of the first and second memristors are both connected to the gate of a fourth MOS transistor; the drain of the fourth MOS transistor is connected to the source of a third MOS transistor, the source of the fourth MOS transistor is connected to the drain of a fifth MOS transistor, the source of the fifth MOS transistor is grounded, and the drain of the third MOS transistor serves as the current output terminal. This improves the accuracy of the circuit's output current and implements the pruning function of a neural network in hardware.

Description

Memristor-based memory-computation integrated computing unit, array circuit and control method

Technical Field

The invention relates to the technical field of integrated circuits, in particular to a memory-computation-integrated computing unit based on a memristor, an array circuit and a control method.

Background

The von Neumann architecture is the classic computer structure. To perform a calculation, data are first stored in a storage unit, then moved to a logic unit by instructions, and after the logic unit completes the calculation the result is written back to the storage unit. However, as the data volume of deep learning tasks keeps growing, a traditional von Neumann architecture must read and write memory frequently, and the overhead of the processor's frequent memory accesses forms the so-called memory wall. Memory-computation integration is a scheme proposed to solve the memory wall problem; its basic idea is to merge computation and storage so that the processor accesses memory less often.

A memory-computation integrated circuit can be used for multiply-add operations in a neural network. A neural network model contains a large number of redundant neurons and weights, and the weights that take part in the main calculation and affect the final result account for only 5-10% of the total, so research on neural network pruning is particularly important. Pruning screens out unimportant neurons and weights from a large network and deletes them, which compresses the network while preserving its performance as far as possible.

Therefore, how to embody the pruning function on hardware is still an urgent problem to be solved in the field of integrated circuit design.

Disclosure of Invention

The invention provides a memory-computation integrated computing unit based on a memristor, an array circuit and a control method, which are used for solving the problem that neural network pruning cannot be realized on hardware in the prior art.

The invention provides a computing unit integrating storage and calculation based on a memristor, which is used for executing convolution operation in a neural network and comprises a first MOS (metal oxide semiconductor) transistor, a second MOS transistor, a third MOS transistor, a fourth MOS transistor, a fifth MOS transistor, a first memristor and a second memristor;

the drain/source of the first MOS transistor is connected to a first voltage input signal, and the source/drain of the second MOS transistor is connected to a second voltage input signal; the gate of the first MOS transistor and the gate of the second MOS transistor are both connected to a control voltage signal; the source/drain of the first MOS transistor is connected to one end of the first memristor, and the drain/source of the second MOS transistor is connected to one end of the second memristor; the other end of the first memristor and the other end of the second memristor are both connected to the gate of the fourth MOS transistor; the drain of the fourth MOS transistor is connected to the source of the third MOS transistor, and the source of the fourth MOS transistor is connected to the drain/source of the fifth MOS transistor; the gate of the fifth MOS transistor is connected to a first bias voltage, and the source/drain of the fifth MOS transistor is grounded; and the gate of the third MOS transistor is connected to a second bias voltage, and the drain of the third MOS transistor serves as a current output terminal.

The present invention also provides an array circuit comprising: a plurality of memristor-based computational cells arranged in an array as described above;

the first bias voltages of the computing units are the same, the second bias voltages of the computing units are the same, the first voltage input signals of the computing units in the same row are the same, the second voltage input signals of the computing units in the same row are the same, and the current output ends of the computing units in the same column are connected together.

According to the array circuit provided by the invention, the control voltage signals of the computing units in the same column are the same.

The invention also provides a control method based on the array circuit, which comprises the following steps:

determining control voltage signals of each computing unit in the array circuit;

and controlling the on/off state of the first MOS transistor and the second MOS transistor of each computing unit based on the control voltage signal of each computing unit, so as to control the working state of each computing unit.

According to a control method provided by the present invention, the current at the current output terminal of any one of the computing units is:

I_out = V_in × W × IO

where V_in is the input value of the computing unit, W is the weight of the computing unit, and IO is a preset current;

the input value is determined based on a first voltage input signal and a second voltage input signal of the any one of the computational cells, and the weight is determined based on resistance values of a first memristor and a second memristor of the any one of the computational cells.

According to a control method provided by the present invention, the determining a control voltage signal of each computing unit in the array circuit includes:

and determining the control voltage signal of each computing unit based on the bit corresponding to the weight of each computing unit.

According to a control method provided by the present invention, an input value of any one of the calculation units is determined based on the steps of:

if the working state of any computing unit is working, the first voltage input signal of any computing unit is at a high level, and the second voltage input signal of any computing unit is at a low level, the input value of any computing unit is a first input value;

otherwise, the input value of any one computing unit is a second input value.

According to a control method provided by the present invention, the weight of any one of the calculation units is determined based on the following steps:

if the resistance value of the first memristor of any computing unit is a low resistance value, and the resistance value of the second memristor of any computing unit is a high resistance value, the weight of any computing unit is a first weight;

otherwise, the weight of any one computing unit is the second weight.

The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the control method.

The invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements a control method as described in any of the above.

The memristor-based memory-computation integrated computing unit provided by the invention comprises the first MOS transistor, the second MOS transistor, the third MOS transistor, the fourth MOS transistor, the fifth MOS transistor, the first memristor and the second memristor, and can be used to perform convolution operations in a neural network with storage and computation integrated. The resistive voltage divider formed by the memristors drives the gate of the fourth MOS transistor, and the current is output through a cascode structure, which improves the accuracy of the circuit's output current. In addition, because the control voltage signal is connected to the gates of the first MOS transistor and the second MOS transistor, the pruning function of the neural network can be realized, which reduces the hardware overhead required for computation, reduces the amount of computation, and increases the computation speed of the neural network.

Drawings

In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.

FIG. 1 is a schematic diagram of the circuit structure of the memristor-based memory-computation integrated computing unit provided by the present invention;

FIG. 2 is a schematic diagram of an array circuit according to the present invention;

FIG. 3 is a second schematic diagram of the array circuit according to the present invention;

FIG. 4 is one exemplary diagram of the connection of control voltage signals provided by the present invention;

FIG. 5 is a second exemplary diagram of the connection of the control voltage signals provided by the present invention;

FIG. 6 is a third exemplary diagram of the connection of the control voltage signal provided by the present invention;

FIG. 7 is a schematic flow chart of a control method provided by the present invention;

FIG. 8 is a schematic structural diagram of an electronic device provided by the present invention;

Reference numerals:

M1: first MOS transistor; M2: second MOS transistor; M3: third MOS transistor;

M4: fourth MOS transistor; M5: fifth MOS transistor; RRAM1: first memristor;

RRAM2: second memristor; VL1: first voltage input signal;

VLB1: second voltage input signal; Vopen1: control voltage signal;

VBIAS: first bias voltage; VCAS: second bias voltage;

IOUT11: current output terminal.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Current mainstream memory-computation integrated circuits are mainly implemented with memristors. A memristor is a nonlinear resistive-switching device whose resistance value can be changed by a control signal, so that, for example, 1 can represent a high resistance value and 0 a low resistance value; because the device is nonvolatile, its resistance value can serve as a form of data storage.

In this regard, the present invention provides a memristor-based memory-computation integrated computing unit for performing convolution operations in a neural network. Fig. 1 is a schematic circuit structure diagram of this computing unit. As shown in fig. 1, the computing unit includes a first MOS transistor (Metal-Oxide-Semiconductor Field-Effect Transistor) M1, a second MOS transistor M2, a third MOS transistor M3, a fourth MOS transistor M4, a fifth MOS transistor M5, a first memristor RRAM1, and a second memristor RRAM2;

the drain/source of the first MOS transistor M1 is connected to the first voltage input signal VL1, and the source/drain of the second MOS transistor M2 is connected to the second voltage input signal VLB1; the gate of the first MOS transistor M1 and the gate of the second MOS transistor M2 are both connected to a control voltage signal Vopen1; the source/drain of the first MOS transistor M1 is connected to one end of the first memristor RRAM1, and the drain/source of the second MOS transistor M2 is connected to one end of the second memristor RRAM2; the other end of the first memristor RRAM1 and the other end of the second memristor RRAM2 are both connected to the gate of the fourth MOS transistor M4; the drain of the fourth MOS transistor M4 is connected to the source of the third MOS transistor M3, and the source of the fourth MOS transistor M4 is connected to the drain/source of the fifth MOS transistor M5; the gate of the fifth MOS transistor M5 is connected to a first bias voltage VBIAS, and the source/drain of the fifth MOS transistor M5 is grounded; the gate of the third MOS transistor M3 is connected to the second bias voltage VCAS, and the drain of the third MOS transistor M3 serves as the current output terminal IOUT11.

Specifically, the memristor-based memory-computation integrated computing unit has three input voltage terminals, namely the control voltage signal Vopen1, the first voltage input signal VL1 and the second voltage input signal VLB1, and one current output terminal IOUT11. The computing unit comprises a first MOS transistor M1, a second MOS transistor M2, a third MOS transistor M3, a fourth MOS transistor M4, a fifth MOS transistor M5, a first memristor RRAM1 and a second memristor RRAM2. The resistance values of the first memristor RRAM1 and the second memristor RRAM2 are R1 and R2 respectively, the corresponding conductances are G1 and G2 respectively, and the resistance states may include low-low resistance (LLRS), low resistance (LRS), high resistance (HRS) and high-high resistance (HHRS).

The drain of the first MOS transistor M1 is connected to the first voltage input signal VL1, and the source of the second MOS transistor M2 is connected to the second voltage input signal VLB1; the gate of the first MOS transistor M1 and the gate of the second MOS transistor M2 are both connected to the control voltage signal Vopen1; the source of the first MOS transistor M1 is connected to one end of the first memristor RRAM1, and the drain of the second MOS transistor M2 is connected to one end of the second memristor RRAM2; the other end of the first memristor RRAM1 and the other end of the second memristor RRAM2 are both connected to the gate of the fourth MOS transistor M4; the drain of the fourth MOS transistor M4 is connected to the source of the third MOS transistor M3, and the source of the fourth MOS transistor M4 is connected to the drain of the fifth MOS transistor M5; the gate of the fifth MOS transistor M5 is connected to the first bias voltage VBIAS, the source of the fifth MOS transistor M5 is connected to the negative electrode of the power supply, and the negative electrode of the power supply is grounded; the gate of the third MOS transistor M3 is connected to the second bias voltage VCAS, and the drain of the third MOS transistor M3 serves as the current output terminal IOUT11;

in addition, according to the characteristics of the MOS transistors, the sources and the drains of the first MOS transistor M1, the second MOS transistor M2 and the fifth MOS transistor M5 are interchangeable, and the device function is not affected, that is, the source of the first MOS transistor M1 is connected to the first voltage input signal VL1, the drain of the first MOS transistor M1 is connected to one end of the first memristor RRAM1, the drain of the second MOS transistor M2 is connected to the second voltage input signal VLB1, the source of the second MOS transistor M2 is connected to one end of the second memristor RRAM2, the source of the fourth MOS transistor M4 is connected to the source of the fifth MOS transistor M5, and the drain of the fifth MOS transistor M5 is connected to the negative electrode of the power supply. Optionally, the first MOS transistor M1, the second MOS transistor M2, the third MOS transistor M3, the fourth MOS transistor M4, and the fifth MOS transistor M5 are NMOS (N-Metal-Oxide-Semiconductor) transistors.

It can be seen that when VL1 equals VLB1, no current flows between VL1 and VLB1, and when VL1 is greater than VLB1, current flows through RRAM1 and RRAM2, so that RRAM1 and RRAM2 divide the voltage. The other end of RRAM1 and the other end of RRAM2 are connected at node VP, and the voltage at node VP drives the fourth MOS transistor M4 and controls whether it is on or off. Different values of VL1, VLB1, R1, and R2 all change the voltage at node VP, so turning M4 on requires the following condition to be met:

(VL1-VLB1)×(R2/(R1+R2))>Vth4

where Vth4 is the threshold voltage of M4.

To ensure the normal operation of the circuit of this embodiment, appropriate VL1 and VLB1 should be selected; the minimum difference between VL1 and VLB1 is determined by the following equation:

(VL1-VLB1)>Vth4/(R2/(R1+R2))

For example, if the threshold voltage of M4 is 0.5 V, R1 is 10 kΩ (LRS), and R2 is 100 kΩ (HRS), the minimum difference between VL1 and VLB1 should be 0.55 V to ensure the accuracy of the circuit in this embodiment.
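The arithmetic in this example can be checked with a short Python sketch. The function name `min_input_difference` is hypothetical, and the model uses only the ideal divider condition stated above, ignoring device non-idealities.

```python
def min_input_difference(vth4: float, r1: float, r2: float) -> float:
    """Smallest (VL1 - VLB1) for which (VL1 - VLB1) * R2 / (R1 + R2) reaches Vth4."""
    divider_ratio = r2 / (r1 + r2)  # fraction of the input difference seen at node VP
    return vth4 / divider_ratio


# Values from the example: Vth4 = 0.5 V, R1 = 10 kOhm (LRS), R2 = 100 kOhm (HRS).
delta = min_input_difference(0.5, 10e3, 100e3)
print(f"{delta:.2f} V")  # prints "0.55 V"
```

Swapping the two resistance states (R1 high, R2 low) shrinks the divider ratio and raises the required difference, which is why the choice of VL1 and VLB1 depends on the stored resistance values.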

It should be noted that, in the prior art, current is output directly from a node of the memristor. The driving capability is then poor, which can cause abnormal operation, and because the margin between the high and low resistance states of a memristor is small, the resistance value is difficult to control accurately, which makes the current value unstable. In the embodiment of the invention, the voltage divided across the memristor resistances drives the gate of the fourth MOS transistor M4 to output the current. This first improves the driving capability of the output node and second effectively prevents the error caused by directly outputting the current that the voltage generates through the memristor. On this basis, combined with the cascode current mirror formed by M3 and M5, the current output terminal IOUT11 can output a stable current IO. It will be appreciated that IOUT11 outputs the current IO when M4 is on, and outputs a current of 0 A when M4 is off.

In addition, the control voltage signal Vopen1 of the computing unit is connected to the gates of M1 and M2. When the voltage of Vopen1 is greater than the threshold voltages Vth1 and Vth2 of M1 and M2, M1 and M2 turn on, and the computing unit is enabled and works normally. When the voltage of Vopen1 is less than Vth1 and Vth2, M1 and M2 are cut off and the computing unit does not work. Because the computing unit performs convolution operations in the neural network, the corresponding computing unit can be pruned through the voltage value of Vopen1, which realizes the pruning function of the neural network. Further, M1 and M2 may be of the same specification, and Vth1 and Vth2 may be the same.
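The behaviour described so far can be summarized in a small behavioural model. This is only a sketch under stated assumptions: the class `ComputeUnit` and its field names are hypothetical, M1 and M2 are treated as ideal switches with a common threshold, a pruned unit is assumed to contribute 0 A, and the turn-on test for M4 is the divider condition given above.

```python
from dataclasses import dataclass


@dataclass
class ComputeUnit:
    r1: float      # resistance of RRAM1 in ohms
    r2: float      # resistance of RRAM2 in ohms
    vth_sw: float  # assumed common threshold of M1 and M2 (Vth1 = Vth2), in volts
    vth4: float    # threshold voltage of M4, in volts
    io: float      # IO, the current set by the cascode branch, in amperes

    def output_current(self, vl: float, vlb: float, vopen: float) -> float:
        """Return the current at the output terminal for the given input voltages."""
        if vopen <= self.vth_sw:
            return 0.0  # M1 and M2 are cut off: the unit is pruned (assumed 0 A)
        drive = (vl - vlb) * self.r2 / (self.r1 + self.r2)
        return self.io if drive > self.vth4 else 0.0  # M4 on -> IO, M4 off -> 0 A


unit = ComputeUnit(r1=10e3, r2=100e3, vth_sw=0.5, vth4=0.5, io=1e-6)
print(unit.output_current(vl=1.0, vlb=0.0, vopen=1.2))  # unit enabled, M4 on
print(unit.output_current(vl=1.0, vlb=0.0, vopen=0.0))  # unit pruned
```

The model deliberately leaves out the voltage drop across M1 and M2 and any analog behaviour of M4 near threshold; it only captures the on/off logic that the text describes.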

The computing unit provided by the embodiment of the invention comprises the first MOS transistor, the second MOS transistor, the third MOS transistor, the fourth MOS transistor, the fifth MOS transistor, the first memristor and the second memristor, and can be used to perform convolution operations in a neural network with storage and computation integrated. The resistive voltage divider formed by the memristors drives the gate of the fourth MOS transistor, and the current is output through a cascode structure, which improves the accuracy of the circuit's output current. In addition, because the control voltage signal is connected to the gates of the first MOS transistor and the second MOS transistor, the pruning function of the neural network can be realized, which reduces the hardware overhead required for computation, reduces the amount of computation, and increases the computation speed of the neural network.

Based on the above embodiments, the present invention provides an array circuit. Fig. 2 is a schematic structural diagram of an array circuit provided by the present invention. As shown in fig. 2, the array circuit includes: a plurality of the memristor-based memory-computation integrated computing units of the above embodiment, arranged in an array;

Specifically, the array circuit is composed of a plurality of memristor-based memory-computation integrated computing units arranged in an array of K rows and N columns, where K and N are positive integers, so that convolution operations of larger-scale neural networks can be supported.

The first bias voltages of the K × N calculation units may be the same, the second bias voltages of the K × N calculation units may be the same, the first voltage input signals of the calculation units in the same row may be the same, and the second voltage input signals of the calculation units in the same row may be the same. The current output ends of the computing units in the same column can be connected together, and finally, the total current output by the computing units in each column can be obtained through accumulation.
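The column-wise accumulation can be illustrated with a few lines of Python. The function `column_currents` is a hypothetical helper, and the currents are expressed as integer multiples of IO purely to keep the example exact.

```python
def column_currents(unit_currents):
    """Sum the unit currents of each column.

    unit_currents[k][n] is the output current of the unit in row k, column n.
    Because the output terminals of one column share a wire, the column
    current is the sum of the currents of the units in that column.
    """
    return [sum(column) for column in zip(*unit_currents)]


# A 3 x 2 array (K = 3 rows, N = 2 columns), currents in units of IO.
currents = [
    [1, 0],
    [1, 1],
    [0, 1],
]
print(column_currents(currents))  # prints "[2, 2]"
```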

It should be noted that, because it is necessary to control whether each computing unit works or not through the control voltage signal of each computing unit to implement the pruning function of the neural network, the connection manner of the control voltage signals of each computing unit in the embodiment of the present invention may be set correspondingly according to actual requirements, such as control accuracy, control number, and the like, for example, each computing unit may be connected with an independent control voltage signal, each two computing units may be connected with one control voltage signal, each K × 2 computing units may be connected with one control voltage signal, each K × 3 computing unit may be connected with one control voltage signal, or each K × N computing unit may be connected with one control voltage signal, and the embodiment of the present invention is not limited to this.

The array circuit provided by the embodiment of the invention uses computing units that each comprise the first MOS transistor, the second MOS transistor, the third MOS transistor, the fourth MOS transistor, the fifth MOS transistor, the first memristor and the second memristor, and can be used to perform convolution operations in a neural network with storage and computation integrated. The resistive voltage divider formed by the memristors drives the gate of the fourth MOS transistor, and the current is output through a cascode structure, which improves the accuracy of the circuit's output current. In addition, because the control voltage signal is connected to the gates of the first MOS transistor and the second MOS transistor, the pruning function of the neural network can be realized, which reduces the hardware overhead required for computation, reduces the amount of computation, and increases the computation speed of the neural network. On this basis, because the array circuit comprises a plurality of computing units arranged in an array, convolution operations of larger-scale neural networks can be supported.

Based on any of the above embodiments, the control voltage signals of the computing units in the same column are the same.

Specifically, fig. 3 is a second schematic structural diagram of the array circuit provided by the present invention. As shown in fig. 3, the voltages at nodes WL1, WL2, …, WLK are VL1, VL2, …, VLK, which serve as the first voltage input signals of the computing units in rows 1 to K respectively; the voltages at nodes WLB1, WLB2, …, WLBK are VLB1, VLB2, …, VLBK, which serve as the second voltage input signals of the computing units in rows 1 to K respectively; the voltages at nodes Wopen1, Wopen2, …, WopenN are Vopen1, Vopen2, …, VopenN, which serve as the control voltage signals of the computing units in columns 1 to N respectively; VBIAS is input as the first bias voltage of the K × N computing units, i.e., the gate voltage of M5; and VCAS is input as the second bias voltage of the K × N computing units, i.e., the gate voltage of M3.

On this basis, a given computing unit can be pruned by controlling the control voltage signals Vopen1, Vopen2, …, VopenN of the computing units in columns 1 to N, which realizes the hardware pruning function. For example, if the control voltage signal of the computing units in column 3 is set to 0, the first MOS transistor and the second MOS transistor of every computing unit in that column are cut off, so that no computing unit in that column works.
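The column 3 example can be written as a short Python sketch. The helper `prune_columns` and the enable voltage of 1.2 V are assumptions made for illustration; columns are numbered from 1 to match the text.

```python
def prune_columns(vopen, columns_to_prune):
    """Return a copy of the per-column control voltages with pruned columns set to 0 V.

    vopen[0] is Vopen1, vopen[1] is Vopen2, and so on; columns_to_prune
    holds 1-based column numbers.
    """
    return [
        0.0 if column in columns_to_prune else voltage
        for column, voltage in enumerate(vopen, start=1)
    ]


# Four columns, all enabled at an assumed 1.2 V; column 3 is pruned.
print(prune_columns([1.2, 1.2, 1.2, 1.2], {3}))  # prints "[1.2, 1.2, 0.0, 1.2]"
```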

In addition, fig. 4 to fig. 6 schematically show other connection manners of the control voltage signals of the computing units provided by the present invention, as shown in fig. 4, the connection manner of the control voltage signals of the computing units may be that each computing unit is connected with an independent control voltage signal, as shown in fig. 5, the connection manner of the control voltage signals of the computing units may also be that each two computing units are connected with one control voltage signal, as shown in fig. 6, the connection manner of the control voltage signals of the computing units may also be that each K/2 computing units are connected with one control voltage signal.

Based on any of the above embodiments, the present invention provides a control method based on the array circuit as described in the above embodiments. Fig. 7 is a schematic flow chart of a control method provided by the present invention, and as shown in fig. 7, the method includes:

step 710, determining control voltage signals of each computing unit in the array circuit;

step 720, controlling the on/off state of the first MOS transistor and the second MOS transistor of each computing unit based on the control voltage signal of each computing unit, so as to control the working state of each computing unit.

Specifically, the array circuit is composed of K × N memristor-based memory-computation integrated computing units as described in the above embodiments. First, the control voltage signal of each computing unit in the array circuit is determined; then, the on/off state of the first MOS transistor and the second MOS transistor of each computing unit is controlled according to that signal so as to control the working state of each computing unit, where the working state is either working or non-working:

When the voltage of the control voltage signal of a computing unit is greater than the threshold voltages of the first MOS transistor and the second MOS transistor of that computing unit, the first MOS transistor and the second MOS transistor turn on, and the computing unit is enabled and works normally; when the voltage of the control voltage signal is less than those threshold voltages, the first MOS transistor and the second MOS transistor are cut off, and the computing unit does not work.

By applying the array circuit, the method provided by the embodiment of the invention can support convolution operations of larger-scale neural networks. It controls the on/off state of the first MOS transistor and the second MOS transistor of each computing unit according to the control voltage signal of that computing unit, and thereby controls the working state of each computing unit. A given computing unit can thus be pruned through its control voltage signal, which reduces the hardware overhead required for computation, reduces the amount of computation, and increases the computation speed of the neural network.

Based on any of the above embodiments, the current at the current output of any of the computing units is:

I_out = V_in × W × IO

where V_in is the input value of the computing unit, W is the weight of the computing unit, and IO is the preset current;

the input value is determined based on first and second voltage input signals of the computational cell, and the weight is determined based on resistance values of first and second memristors of the computational cell.

Specifically, in order to implement analog multiplication operation of weight and input in the neural network, the input voltage is converted into a current output, and the current at the current output end of any computing unit in the embodiment of the present invention is:

I_out = V_in × W × IO

where V_in is the input value of the computing unit, W is the weight of the computing unit, and IO is the preset current.

Here, the input value of the computing unit may be determined from the first voltage input signal and the second voltage input signal of the computing unit. For example, when the first voltage input signal VL1 equals the second voltage input signal VLB1, no current flows between VL1 and VLB1, and the input value V_in is 0. The weight of the computing unit may be determined from the resistance values of its first and second memristors, i.e., R1 and R2. It is understood that the current at the current output terminal is a multiple of IO, where IO is the current of the cascode current mirror in the output stage and the multiple is determined by VL1, VLB1, R1, and R2 in the input stage.
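The formula can be tabulated with a short Python sketch. The restriction of V_in and W to the values 0 and 1 and the value IO = 1 µA are assumptions made only for this illustration, and the function name `output_current` is hypothetical.

```python
def output_current(v_in: int, w: int, io: float) -> float:
    """I_out = V_in x W x IO for a single computing unit."""
    return v_in * w * io


IO = 1e-6  # assumed preset current of 1 uA, for illustration only

# Enumerate all combinations of an assumed binary input value and binary weight.
for v_in in (0, 1):
    for w in (0, 1):
        print(f"V_in={v_in} W={w} I_out={output_current(v_in, w, IO)} A")
```

Under these assumptions the output is IO only when both the input value and the weight are 1, and 0 A otherwise, which matches the on/off behaviour of M4 described earlier.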

The method provided by the embodiment of the invention can realize the analog multiplication operation of weight and input in the neural network through the computing unit, and on the basis, the array circuit comprising a plurality of computing units is applied, so that the larger-scale current multiplication and addition operation can be supported, the given computing unit can be pruned through controlling a voltage signal, and the computing speed of the neural network is improved by reducing the number of the multiplication and addition operations.

Based on any of the above embodiments, step 710 includes:

and determining the control voltage signal of each calculation unit based on the bit corresponding to the weight of each calculation unit.

Specifically, the weights of the N computing units in each row together form an N-bit weight for that row, with the weight of each computing unit corresponding to one bit. A weight at a high-order bit has a larger influence on the multiply-add result, and a weight at a low-order bit has a smaller influence. An array circuit composed of K × N computing units therefore holds K N-bit weights W1, W2, …, WK.

Neural network pruning screens out unimportant neurons and weights from a large network and deletes them while preserving the performance of the network as far as possible. Accordingly, in the embodiment of the invention, provided that accuracy stays within an acceptable range, the computing units corresponding to a user-defined number of low-order bits can be turned off by pruning. This reduces the number of multiply-add operations, and hence the computing resources occupied, without unduly affecting accuracy.

Specifically, the control voltage signal of each computing unit may be determined from the significance of the bit corresponding to its weight and from the number of computing units to be turned off. For example, if the weights of the computing units in the leading columns correspond to the high-order bits, the weights of the computing units in the trailing columns correspond to the low-order bits, and M × K computing units are to be turned off, the control voltage signals of all computing units in the last M columns may be set to 0 so that none of them works. This removes the M low-order bits and reduces the number of multiplications by M × K.

With the method provided by the embodiment of the invention, turning off the computing units corresponding to a user-defined number of low-order bits by pruning greatly reduces the hardware cost of the computation, allows the weight precision to be adjusted as required, and adjusts the number of multiplications accordingly.
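
The pruning rule described above can be sketched as follows. The sketch assumes one control signal per bit position, ordered from the high-order bit to the low-order bit, with 1 meaning "working" and 0 meaning "off"; the function names and the example sizes (K = 4, N = 8, M = 3) are hypothetical.

```python
from typing import List


def bit_control_signals(n_bits: int, m_pruned: int) -> List[int]:
    """Return one control signal per bit position, high-order bit first.

    1 keeps the computing units of that bit position working;
    0 turns them off (pruned).
    """
    if not 0 <= m_pruned <= n_bits:
        raise ValueError("m_pruned must lie between 0 and n_bits")
    return [1] * (n_bits - m_pruned) + [0] * m_pruned


def multiplications_saved(k_weights: int, m_pruned: int) -> int:
    """Each pruned bit position removes one multiplication per N-bit weight."""
    return m_pruned * k_weights


# Hypothetical example: K = 4 weights of N = 8 bits, pruning the M = 3 lowest bits.
print(bit_control_signals(n_bits=8, m_pruned=3))
print(multiplications_saved(k_weights=4, m_pruned=3))
```

Choosing M trades precision for cost: a larger M removes more multiplications but discards more low-order weight bits.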

Based on any of the above embodiments, the input value of any of the calculation units is determined based on the following steps:

if the computing unit is in the working state, the first voltage input signal of the computing unit is at a high level, and the second voltage input signal of the computing unit is at a low level, the input value of the computing unit is a first input value;

otherwise, the input value of the calculation unit is the second input value.

Specifically, the voltage combination of the first voltage input signal VL1 and the second voltage input signal VLB1 of any one of the calculation units corresponds to the input value Vin of the calculation unit; the voltage input ranges of the first voltage input signal VL1 and the second voltage input signal VLB1 are from low level to high level; the different combinations of the first voltage input signal VL1 and the second voltage input signal VLB1 within the voltage input range will not cause the circuit to function incorrectly.

If the computing unit is in the working state, the input value Vin is the first input value when the first voltage input signal VL1 is at a high level and the second voltage input signal VLB1 is at a low level, which ensures that current flows between VL1 and VLB1. The input value Vin is the second input value in the other three cases: when VL1 and VLB1 are both at a high level, when VL1 is at a low level and VLB1 is at a high level, and when VL1 and VLB1 are both at a low level.

If the computing unit is in the off state, the input value Vin is the second input value regardless of whether the first voltage input signal VL1 and the second voltage input signal VLB1 are at a low level or a high level. Optionally, the first input value is 1 and the second input value is 0.
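
The input-value rule above amounts to a single condition. The following sketch encodes it, using the optional values 1 and 0 as defaults for the first and second input values; the function name and the boolean encoding of the signal levels are assumptions made for illustration.

```python
def input_value(working: bool, vl1_high: bool, vlb1_high: bool,
                first: int = 1, second: int = 0) -> int:
    """Return the input value Vin of one computing unit.

    Vin is the first input value only when the unit is working,
    VL1 is at a high level and VLB1 is at a low level.
    """
    if working and vl1_high and not vlb1_high:
        return first
    return second


# Enumerate all combinations of working state and signal levels.
for working in (True, False):
    for vl1_high in (True, False):
        for vlb1_high in (True, False):
            print(working, vl1_high, vlb1_high,
                  input_value(working, vl1_high, vlb1_high))
```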

Based on any of the above embodiments, the weight of any of the calculation units is determined based on the following steps:

if the resistance value of the first memristor of the calculating unit is a low resistance value and the resistance value of the second memristor of the calculating unit is a high resistance value, the weight of the calculating unit is a first weight;

otherwise, the weight of the computing unit is the second weight.

Specifically, the voltage division between the first memristor and the second memristor of any computing unit corresponds to the weight W of that unit, so the weight can be stored by exploiting the non-volatility of the memristors. When the first memristor is in the low-resistance state and the second memristor is in the high-resistance state, the weight W is the first weight, which ensures a large gate voltage on the fourth MOS transistor. In the other three cases (both memristors in the low-resistance state, the first in the high-resistance state and the second in the low-resistance state, or both in the high-resistance state), the weight W is the second weight. Optionally, the first weight is 1 and the second weight is 0.
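
Why the low/high resistance combination yields a large gate voltage can be seen from an idealized resistive divider. The sketch below assumes that the first and second MOS transistors behave as ideal closed switches and that no current flows into the gate of the fourth MOS transistor; the resistance values (10 kΩ and 1 MΩ) and the 1.0 V supply are hypothetical.

```python
def gate_voltage(vl1: float, vlb1: float, r1: float, r2: float) -> float:
    """Voltage at the node joining R1 and R2 (gate of the fourth MOS transistor).

    Ideal divider: R1 connects the node to VL1, R2 connects it to VLB1.
    """
    if r1 <= 0 or r2 <= 0:
        raise ValueError("resistances must be positive")
    return vlb1 + (vl1 - vlb1) * r2 / (r1 + r2)


LRS = 1e4   # hypothetical low-resistance state, in ohms
HRS = 1e6   # hypothetical high-resistance state, in ohms
VDD = 1.0   # hypothetical supply voltage, in volts

# VL1 = VDD, VLB1 = 0: only R1 = LRS, R2 = HRS pulls the gate close to VDD.
for r1, r2 in ((LRS, HRS), (LRS, LRS), (HRS, LRS), (HRS, HRS)):
    print(r1, r2, gate_voltage(VDD, 0.0, r1, r2))
```

Under these assumptions the gate node sits near VDD only for the first combination; when VL1 equals VLB1 the node simply follows that common voltage and no current flows through the memristors.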

Further, Table 1 is the truth table of the current output terminal of the computing unit. As shown in Table 1, the current at the current output terminal IOUT11 is determined by the control voltage signal Vopen1 and by the product Vin × W of the input value Vin and the weight W. When Vopen1 is 0, that is, the computing unit is not working, IOUT11 is 0 A. When Vopen1 is 1, that is, the computing unit is working, the product Vin × W is 1 and IOUT11 is IO (in amperes) only if VL1 is VDD, VLB1 is 0, R1 is in the low-resistance state (LRS), and R2 is in the high-resistance state (HRS); otherwise IOUT11 is 0 A.

TABLE 1

It should be noted that a first memristor whose resistance is low or relatively low can be treated as being in the low-resistance state, and one whose resistance is high or relatively high can be treated as being in the high-resistance state; the same applies to the second memristor.
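
The truth table described in the text can be enumerated programmatically. The sketch below is built only from the rules stated above (first input value and first weight taken as 1, second as 0); the string labels and the function name `truth_table` are hypothetical, and the output column uses the symbol "IO" rather than a numeric current.

```python
from itertools import product


def truth_table():
    """Enumerate (Vopen1, VL1, VLB1, R1, R2, Vin*W, IOUT11) for one computing unit."""
    rows = []
    levels = ("0", "VDD")
    states = ("LRS", "HRS")
    for vopen, vl1, vlb1, r1, r2 in product((0, 1), levels, levels, states, states):
        # Input value is 1 only when the unit works, VL1 is high and VLB1 is low.
        v_in = 1 if (vopen == 1 and vl1 == "VDD" and vlb1 == "0") else 0
        # Weight is 1 only when R1 is low-resistance and R2 is high-resistance.
        w = 1 if (r1 == "LRS" and r2 == "HRS") else 0
        out = "IO" if v_in * w == 1 else "0"
        rows.append((vopen, vl1, vlb1, r1, r2, v_in * w, out))
    return rows


for row in truth_table():
    print(row)
```

Of the 32 enumerated combinations, only Vopen1 = 1, VL1 = VDD, VLB1 = 0, R1 = LRS, R2 = HRS produces an output of IO.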

Based on any of the above embodiments, as shown in fig. 3, taking the first column as an example, the current output terminals IOUT11, IOUT21, …, IOUTK1 of the K calculation units are commonly connected at the node BL1, so that the multiply-add current I1 is obtained at the node BL1, where I1 can be expressed as:

I1 = (Vin1 × W1 + Vin2 × W2 + … + VinK × WK) × IO

Similarly, the multiply-add currents I2, I3, …, IN of the remaining N-1 columns can be obtained.

After the multiply-add current of each column is obtained, the multiply-add current may be subjected to analog-to-digital conversion and other processing, and finally an output result of the convolution operation of the neural network is obtained.
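
The per-column summation can be sketched as below. It assumes binary input values and weights given as K × N nested lists (one entry per computing unit), an optional per-column control signal where 0 switches the whole column off, and an ideal summation of currents at each column node; the function name `column_currents` and the example data are hypothetical.

```python
from typing import List, Optional, Sequence


def column_currents(vin: Sequence[Sequence[int]],
                    w: Sequence[Sequence[int]],
                    io: float,
                    vopen: Optional[Sequence[int]] = None) -> List[float]:
    """Return the multiply-add current of each column, I_j = sum_k(Vin_kj * W_kj) * IO."""
    k = len(vin)
    if k == 0 or len(w) != k:
        raise ValueError("vin and w must have the same non-zero number of rows")
    n = len(vin[0])
    if any(len(row) != n for row in vin) or any(len(row) != n for row in w):
        raise ValueError("all rows must have the same number of columns")
    if vopen is None:
        vopen = [1] * n  # by default every column is working
    if len(vopen) != n:
        raise ValueError("vopen needs one entry per column")

    currents = []
    for j in range(n):
        if vopen[j] == 0:
            currents.append(0.0)  # pruned column contributes no current
            continue
        total = sum(vin[i][j] * w[i][j] for i in range(k))
        currents.append(total * io)
    return currents


# Hypothetical 3 x 2 array with IO expressed in arbitrary units.
vin = [[1, 1], [1, 0], [1, 1]]
w = [[1, 0], [1, 1], [0, 1]]
print(column_currents(vin, w, io=1.0))
print(column_currents(vin, w, io=1.0, vopen=[1, 0]))
```

The analog-to-digital conversion that follows the summation in the text is outside the scope of this sketch.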

The control device provided by the present invention is described below; the control device described below and the control method described above may be cross-referenced.

Based on any one of the above embodiments, the present invention provides a control device based on the array circuit according to the above embodiments, including:

the determining unit is used for determining control voltage signals of all computing units in the array circuit;

and the control unit is used for controlling the on-off of the first MOS tube and the second MOS tube of each calculation unit based on the control voltage signal of each calculation unit so as to control the working state of each calculation unit.

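Based on any of the above embodiments, the current at the current output terminal of any computing unit is:

I_out = V_in × W × IO

wherein V_in is the input value of the computing unit, W is the weight of the computing unit, and IO is the preset current.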

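Based on any of the above embodiments, the determining unit is configured to:

determine the control voltage signal of each computing unit based on the bit corresponding to the weight of each computing unit.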

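Based on any of the above embodiments, the input value of any computing unit is determined based on the following steps:

if the computing unit is in the working state, the first voltage input signal of the computing unit is at a high level, and the second voltage input signal of the computing unit is at a low level, the input value of the computing unit is a first input value;

otherwise, the input value of the computing unit is the second input value.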

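Based on any of the above embodiments, the weight of any computing unit is determined based on the following steps:

if the resistance value of the first memristor of the computing unit is a low resistance value and the resistance value of the second memristor of the computing unit is a high resistance value, the weight of the computing unit is a first weight;

otherwise, the weight of the computing unit is the second weight.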

The device provided by the embodiment of the invention can support convolution operation of a larger-scale neural network by applying the array circuit, and controls the on-off of the first MOS tube and the second MOS tube of each computing unit according to the control voltage signal of each computing unit in the array circuit so as to control the working state of each computing unit, thereby realizing pruning a given computing unit by controlling the voltage signal, reducing the hardware cost required by computing, reducing the operation quantity and improving the computing speed of the neural network.

Fig. 8 illustrates the physical structure of an electronic device. As shown in Fig. 8, the electronic device may include a processor 810, a communications interface 820, a memory 830, and a communication bus 840, wherein the processor 810, the communications interface 820, and the memory 830 communicate with one another via the communication bus 840. The processor 810 may call logic instructions in the memory 830 to perform a control method comprising: determining the control voltage signal of each computing unit in the array circuit; and controlling the on-off state of the first MOS transistor and the second MOS transistor of each computing unit based on the control voltage signal of that computing unit, so as to control the working state of each computing unit.

In addition, when the logic instructions in the memory 830 are implemented as software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

In another aspect, the present invention also provides a computer program product comprising a computer program that can be stored on a non-transitory computer-readable storage medium. When executed by a processor, the computer program performs the control method provided above, the method comprising: determining the control voltage signal of each computing unit in the array circuit; and controlling the on-off state of the first MOS transistor and the second MOS transistor of each computing unit based on the control voltage signal of that computing unit, so as to control the working state of each computing unit.

In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the control method provided above, the method comprising: determining the control voltage signal of each computing unit in the array circuit; and controlling the on-off state of the first MOS transistor and the second MOS transistor of each computing unit based on the control voltage signal of that computing unit, so as to control the working state of each computing unit.

The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. A memristor-based computing unit integrating storage and computation, characterized in that the computing unit is used for performing convolution operations in a neural network and comprises a first MOS transistor, a second MOS transistor, a third MOS transistor, a fourth MOS transistor, a fifth MOS transistor, a first memristor and a second memristor;

the drain/source of the first MOS transistor is connected to a first voltage input signal, and the source/drain of the second MOS transistor is connected to a second voltage input signal; the gate of the first MOS transistor and the gate of the second MOS transistor are jointly connected to a control voltage signal; the source/drain of the first MOS transistor is connected to one end of the first memristor, and the drain/source of the second MOS transistor is connected to one end of the second memristor; the other end of the first memristor and the other end of the second memristor are connected to the gate of the fourth MOS transistor; the drain of the fourth MOS transistor is connected to the source of the third MOS transistor, and the source of the fourth MOS transistor is connected to the drain/source of the fifth MOS transistor; the gate of the fifth MOS transistor is connected to a first bias voltage, and the source/drain of the fifth MOS transistor is grounded; and the gate of the third MOS transistor is connected to a second bias voltage, and the drain of the third MOS transistor serves as a current output terminal.

2. An array circuit, comprising: a plurality of memristor-based computational cells arranged in an array, as in claim 1;

3. The array circuit of claim 2, wherein the control voltage signals of the computing units in the same column are the same.

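4. A control method based on the array circuit of claim 2 or 3, comprising:

determining the control voltage signal of each computing unit in the array circuit;

and controlling the on-off state of the first MOS transistor and the second MOS transistor of each computing unit based on the control voltage signal of that computing unit, so as to control the working state of each computing unit.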

5. The control method according to claim 4, wherein the current at the current output of any one of the computing units is:

I_out = V_in × W × IO

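6. The method of claim 5, wherein determining the control voltage signal for each computational cell in the array circuit comprises:

determining the control voltage signal of each computing unit based on the bit corresponding to the weight of each computing unit.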

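7. The control method according to claim 5, characterized in that the input value of any one of the calculation units is determined based on:

if the working state of any one computing unit is working, the first voltage input signal of any one computing unit is at a high level, and the second voltage input signal of any one computing unit is at a low level, the input value of any one computing unit is a first input value;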

otherwise, the input value of any one computing unit is a second input value.

8. The control method according to claim 5, wherein the weight of any one of the calculation units is determined based on:

if the resistance value of the first memristor of any one computing unit is a low resistance value, and the resistance value of the second memristor of any one computing unit is a high resistance value, the weight of any one computing unit is a first weight;

otherwise, the weight of any one computing unit is the second weight.

9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the control method according to any of claims 4 to 8 when executing the program.

10. A non-transitory computer-readable storage medium on which a computer program is stored, the computer program, when being executed by a processor, implementing the control method according to any one of claims 4 to 8.