CN114677548A - Neural network image classification system and method based on resistive random access memory - Google Patents


Info

Publication number
CN114677548A
Authority
CN
China
Prior art keywords
value
layer
convolution
quantized
quantization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210579664.2A
Other languages
Chinese (zh)
Other versions
CN114677548B (en)
Inventor
高丽丽
时拓
刘琦
张程高
顾子熙
王志斌
李一琪
张徽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab filed Critical Zhejiang Lab
Priority to CN202210579664.2A priority Critical patent/CN114677548B/en
Publication of CN114677548A publication Critical patent/CN114677548A/en
Application granted granted Critical
Publication of CN114677548B publication Critical patent/CN114677548B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a neural network image classification system and method based on a resistive random access memory. The system comprises an input layer, a group of convolutional layers and a fully connected layer connected in sequence, with each convolutional layer provided with a matching convolution quantization layer, convolution dequantization layer, activation layer and pooling layer. The method comprises the following steps. Step S1: normalize the image to be classified to obtain a normalized image. Step S2: construct a training set and a test set from the normalized images. Step S3: construct a neural network model based on the resistive random access memory. Step S4: input the training set into the model and perform quantization-aware training to obtain the trained model parameters. Step S5: input the test-set images into the trained network for a forward-inference test.

Description

Neural network image classification system and method based on resistive random access memory
Technical Field
The invention relates to the technical field of neural network image classification, in particular to a neural network image classification system and method based on a resistive random access memory.
Background
With the rapid development of deep learning, neural network technology has been widely applied in fields such as image recognition, speech recognition and natural language processing, and neural network applications are typically deployed on edge devices. In the traditional chip architecture, memory and computation are separated: the computing unit first reads data from the memory and stores the results back after the computation completes. Faced with the high concurrency demands of neural networks, this architecture must move data frequently, which causes huge power consumption and a computational bottleneck.
The resistive random access memory (ReRAM), also known as a memristor, has the advantages of low power consumption, simple structure, high operating speed, and a controllable, variable resistance; at the same time, a memristor can realize various forms of computation such as logic operations and matrix multiplication. Its compute-in-memory characteristic reduces data movement and lowers storage requirements. ReRAM therefore has great potential to solve the problems of the traditional chip architecture, and in recent years memristor-based neural network accelerators have provided an effective solution for neural network inference.
Although ReRAM has great advantages for implementing neural network inference, the neural network model must be compressed during deployment, which causes a loss of image recognition accuracy. A reasonable and effective quantization method can reduce the data storage footprint and increase the computation speed with little accuracy loss. In the neural network quantization algorithms of mainstream deep learning platforms such as TensorFlow and PyTorch, the data produced by a convolution or fully connected operator can exceed the quantization bit width, so the result must be rescaled to the quantization bit width; the algorithm multiplies by a floating-point fraction for scaling, and the hardware must approximate this scaling with a pair of left-shift and right-shift operations. However, because the conductance range of the ReRAM device and the quantization bit width of each layer's input are limited, this leads to high resource occupancy and complex operations.
Disclosure of Invention
To remedy the defects of the prior art, and based on the idea of device-algorithm co-design, the invention combines the limited conductance range of the ReRAM device with the limited quantization bit width of each layer's input: by designing a constraint condition it optimizes the bit width of the quantization factor and ensures that the quantization factor is an optimal power of 2, so that the rescaling after convolution requires only a right shift by a limited number of bits. This keeps the operations simple, reduces the loss of image recognition accuracy, increases the image recognition speed, and lowers the resource occupancy. The invention adopts the following technical scheme:
a neural network image classification system based on a resistive random access memory comprises an input layer, a group of convolution layers and a full-connection layer which are sequentially connected, wherein a convolution quantization layer, a convolution inverse quantization layer, an activation layer and a pooling layer are arranged for the convolution layers in a matched mode, the input layer is used for obtaining a training set image, the convolution quantization layer quantizes an input value of the input layer and convolution of a first convolution layer to obtain a quantized input value and a convolution kernel, the convolution inverse quantization layer dequantizes the quantized input value and the convolution kernel to obtain a dequantized value of the first convolution layer, a bit on a storage device is subjected to shift operation based on a digital domain to obtain a dequantized shifted value of the first convolution layer, the dequantized shifted value is subjected to activation operation through the activation layer, the activated value is subjected to pooling operation through the pooling layer, and the pooled value is used as an input value of a next convolution quantization layer, until the pooled output corresponding to the last convolutional layer, the final pooled output is subjected to classification prediction results of training set images through a full connection layer, back propagation is carried out according to errors of the prediction results and training set true values, a neural network model based on a resistive random access memory is trained, and gradient solution cannot be carried out due to the fact that a rounding method is adopted in a quantization method, so that errors are directly transmitted back to a value before quantization by skipping a quantization layer in the back propagation process, and network parameters are optimized by updating the weight of the value before quantization, so that precision loss caused by 
quantization is reduced; inputting an image to be classified into a trained system, quantizing the convolution of an input value of an input layer and a first convolution layer through a convolution quantization layer, performing convolution operation on the quantized value of the input layer and the quantized value of the first convolution layer to obtain a quantized value output by the first convolution layer, mapping the quantized value of the input layer to a voltage value of a resistive random access memory, mapping the quantized value of the first convolution layer to a conductance value of the resistive random access memory, mapping the result of the convolution operation to a current value output by the resistive random access memory, converting the current value to the voltage value, performing shift operation on a bit on a storage device based on the voltage value to obtain a value output by the first convolution layer after quantization shift, performing activation operation on the quantized value through an activation layer, performing pooling operation on the activated value through a pooling layer, and using the pooled value as an input value of a next convolution layer, and obtaining the classification result of the image to be classified through the final pooled output through the full-connection layer until the pooled output corresponding to the last convolution layer.
The resistive random access memories form a resistive random access memory array: the quantized values of the input layer are mapped to voltage values and applied to the rows of the array, the quantized values of the convolutional layer are mapped to the conductance value of each resistive random access memory, and the current value output by each column of resistive random access memories is the result of the convolution operation between the quantized input values and the quantized convolutional-layer values stored in that column.
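The crossbar arithmetic described above can be sketched as a small numerical simulation. This is only an illustration of the underlying physics (column current I_j = Σ_i V_i · G_ij); the array size, value ranges and NumPy usage are assumptions for illustration, not details from the patent:

```python
import numpy as np

def crossbar_mvm(voltages, conductances):
    """Simulate one ReRAM crossbar read: quantized inputs are applied
    as row voltages V_i, quantized kernel weights are stored as cell
    conductances G_ij, and by Ohm's and Kirchhoff's laws the current
    collected on column j is I_j = sum_i V_i * G_ij -- one convolution
    output (dot product) per column."""
    return voltages @ conductances

# Hypothetical sizes: a 3x3 kernel flattened onto 9 rows, 2 output columns.
rng = np.random.default_rng(0)
G = rng.uniform(0.0, 1.0e-4, size=(9, 2))  # conductances (assumed range, siemens)
V = rng.uniform(0.0, 0.2, size=9)          # read voltages (assumed range, volts)
I = crossbar_mvm(V, G)                     # column currents (amperes)
```

Converting the column currents back to voltages and shifting them would then play the role of the dequantization and rescaling steps described in the text.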
Furthermore, the convolution layer comprises a first convolution layer and a second convolution layer, and a convolution quantization layer, a convolution inverse quantization layer, an activation layer and a pooling layer are respectively matched with the first convolution layer and the second convolution layer.
Further, the quantization process is as follows:
Formula (1) represents the floating-point convolution operation:
    y_f = x_f * w_f    (1)
where x_f denotes the floating-point value of the input layer, w_f denotes the floating-point value of the convolution kernel of the first convolutional layer, and * denotes the convolution operation. The floating-point values of the input layer and of the first convolutional layer's kernel are each mapped to fixed-point values, and the fractional bit width of the optimal fixed-point value is determined through formulas (2), (3) and (4).
Formula (2) calculates the minimum value when the floating-point values are mapped to fixed-point values:
    x_min^(i) = -2^(Q-1) · 2^(-f_i)    (2)
where i denotes the layer of the neural network model, here ranging over the input layer, the first convolutional kernel and the first convolutional output layer; f_i denotes the fractional bit width of the fixed-point values of layer i; Q denotes the quantization bit width; and x_min^(i) denotes the minimum value when the layer-i floating-point values are mapped to fixed-point values.
Formula (3) calculates the maximum value when the floating-point values are mapped to fixed-point values:
    x_max^(i) = (2^(Q-1) - 1) · 2^(-f_i)    (3)
where x_max^(i) denotes the maximum value when the layer-i floating-point values are mapped to fixed-point values.
The fractional bit width f_i of the optimal fixed-point value is calculated through the constraint condition of formula (4); the constraint makes the range of the fixed-point values as close as possible to the range of the floating-point values, so as to reduce the accuracy loss caused by quantization. The constraint condition is:
    f_i = argmin over f_i of ( |x_max^(i) - X_max^(i)| + |x_min^(i) - X_min^(i)| )    (4)
where |·| denotes taking the absolute value, and X_max^(i) and X_min^(i) denote the maximum and minimum of the layer-i floating-point values, obtained through statistics.
The quantized value of each layer is solved through formula (5):
    x_q^(i) = clamp( round( 2^(f_i) · x_f^(i) ), q_min, q_max )    (5)
where 2^(f_i) denotes the quantization factor of the layer-i floating-point values, x_f^(i) denotes the layer-i floating-point value, x_q^(i) denotes the layer-i quantized value, round(·) denotes the rounding operation, q_min = -2^(Q-1) denotes the minimum integer value after quantization, q_max = 2^(Q-1) - 1 denotes the maximum integer value after quantization, and clamp(·) denotes the truncation operation.
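The search for the optimal fractional bit width in formulas (2)-(4) and the quantization of formula (5) can be sketched in NumPy as follows; the bit width Q = 8, the candidate range for f, and the sample data are assumptions chosen for illustration:

```python
import numpy as np

def best_fractional_bits(x, Q=8, candidates=range(16)):
    """Formula (4): pick the fractional bit width f whose fixed-point
    range [-2^(Q-1)/2^f, (2^(Q-1)-1)/2^f] (formulas (2) and (3)) is
    closest to the observed floating-point range of x."""
    x_max, x_min = float(np.max(x)), float(np.min(x))
    def cost(f):
        fp_min = -2 ** (Q - 1) / 2 ** f        # formula (2)
        fp_max = (2 ** (Q - 1) - 1) / 2 ** f   # formula (3)
        return abs(fp_max - x_max) + abs(fp_min - x_min)
    return min(candidates, key=cost)

def quantize(x, f, Q=8):
    """Formula (5): scale by the power-of-2 factor 2^f, round, clamp."""
    q_min, q_max = -2 ** (Q - 1), 2 ** (Q - 1) - 1
    return np.clip(np.round(x * 2 ** f), q_min, q_max).astype(np.int32)

x = np.array([-0.9, -0.25, 0.0, 0.5, 0.99])  # assumed sample values
f = best_fractional_bits(x)                  # -> 7 for this data
x_q = quantize(x, f)
```

Because the resulting quantization factor 2^f is a power of 2, the later rescaling of formula (7) reduces to a right shift.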
Further, the quantized input value and the quantized convolution kernel are respectively dequantized, and the dequantized input value and dequantized convolution kernel are convolved through formula (6) to obtain the dequantized floating-point value output by the first convolutional layer:
    y_f' = x_f' * w_f' = ( x_q / 2^(f_x) ) * ( w_q / 2^(f_w) ) = ( x_q * w_q ) / 2^(f_x + f_w)    (6)
where x_q denotes the quantized input value, x_f' denotes the dequantized input value, w_q denotes the quantized convolution kernel, w_f' denotes the dequantized convolution kernel, y_q denotes the quantized value of the first convolutional output layer, y_f' denotes the dequantized value of the first convolutional output layer, and f_x and f_w denote the fractional bit widths of the input and of the convolution kernel respectively.
Formula (7) can be derived from formula (6) by requantizing y_f' with the output quantization factor 2^(f_y), where f_y denotes the fractional bit width of the convolution output:
    y_q = clamp( ( x_q * w_q ) >> ( f_x + f_w - f_y ), q_min, q_max )    (7)
The shift operation of formula (7) yields the quantized value y_q output by the first convolutional layer; q_min denotes the minimum integer value after quantization, q_max denotes the maximum integer value after quantization, and clamp(·) denotes the truncation operation.
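The shift-based rescaling of formula (7) can be sketched as follows; the fractional bit widths and accumulator values are assumptions for illustration, and the arithmetic right shift stands in for the division by 2^(f_x + f_w - f_y):

```python
import numpy as np

def requantize_by_shift(acc, f_x, f_w, f_y, Q=8):
    """Formula (7): the integer accumulator acc = x_q * w_q carries the
    scale 2^(f_x + f_w); because all quantization factors are powers of
    2, moving it to the output scale 2^f_y is a single arithmetic right
    shift by (f_x + f_w - f_y) bits, followed by clamping to Q bits."""
    shift = f_x + f_w - f_y
    q_min, q_max = -2 ** (Q - 1), 2 ** (Q - 1) - 1
    return np.clip(acc >> shift, q_min, q_max)

# Assumed accumulator values from an integer convolution.
acc = np.array([5000, -4096, 131072], dtype=np.int64)
y_q = requantize_by_shift(acc, f_x=7, f_w=7, f_y=5)  # right shift by 9 bits
```

No floating-point multiplier is needed anywhere in this step, which is the hardware simplification the invention claims.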
A neural network image classification method based on a resistive random access memory comprises the following steps:
step S1: normalizing the image to be classified to obtain a normalized image;
step S2: constructing a training set and a test set for the normalized image;
step S3: constructing a neural network model based on a resistive random access memory;
step S4: inputting the training set into a neural network model based on a resistive random access memory, and performing quantitative perception training to obtain model parameters after quantitative perception training, wherein the method comprises the following steps:
step S4-1: quantizing the convolution of the input value of the input layer and the first convolution layer to obtain a quantized input value and a convolution kernel;
step S4-2: respectively dequantizing the quantized input value and the quantized convolution kernel to obtain the dequantized value output by the first convolutional layer, and performing a shift operation on the bits of the storage device based on the digital domain to obtain the shifted quantized value output by the first convolutional layer; specifically, dividing in the digital domain by 2^(f_x + f_w - f_y), where f_x, f_w and f_y denote the fractional bit widths of the quantized input, the convolution kernel and the convolution output respectively (corresponding to a right shift by f_x + f_w - f_y bits on the hardware storage device), yields the quantized value output by the convolutional layer; this right-shift operation realizes the remaining operation of formula (7) in step S4, and the pooled value is finally obtained through the activation operation and the maximum pooling operation;
step S4-3: activating the shifted quantized value through the activation layer, pooling the activated value through the pooling layer, and using the pooled value as the input value of the next convolution quantization layer, until the pooled output corresponding to the last convolutional layer; the final pooled output passes through the fully connected layer to obtain the classification prediction of the training-set image, back propagation is performed according to the error between the prediction and the training-set ground truth, and the neural network model based on the resistive random access memory (ReRAM) is trained; because the quantization method uses rounding, the gradient cannot be solved, so during back propagation the quantization layers are skipped and the error is passed directly back to the values before quantization, and the network parameters are optimized by updating the weights of the pre-quantization values, thereby reducing the accuracy loss caused by quantization;
step S5: inputting the test set image into the trained neural network for forward reasoning test, comprising the following steps:
step S5-1: taking the test set as input, performing the convolution operation on the quantized values of the input layer and the quantized values of the first convolutional layer obtained in steps S3 and S4 to obtain the quantized value output by the first convolutional layer; the quantized values of the input layer are mapped to voltage values of the resistive random access memory, the quantized values of the first convolutional layer are mapped to conductance values of the resistive random access memory, and the result of the convolution operation is mapped to the current value output by the resistive random access memory;
step S5-2: converting the current value into a voltage value, performing a shift operation based on this voltage value to obtain the shifted quantized value output by the first convolutional layer, activating the quantized value through the activation layer, pooling the activated value through the pooling layer, and using the pooled value as the input value of the next convolutional layer, until the pooled output corresponding to the last convolutional layer; the final pooled output passes through the fully connected layer to obtain the classification result of the test-set image.
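The "skip the quantization layer during back propagation" rule of step S4-3 is what is commonly called a straight-through estimator. A minimal sketch of the idea, framework-free (the function names and the bit width Q = 8 are assumptions for illustration):

```python
import numpy as np

def fake_quantize(x, f, Q=8):
    """Forward pass of a quantization layer during training:
    quantize with the power-of-2 factor 2^f (formula (5)), then
    dequantize, so the rest of the network sees quantized precision."""
    q_min, q_max = -2 ** (Q - 1), 2 ** (Q - 1) - 1
    x_q = np.clip(np.round(x * 2 ** f), q_min, q_max)
    return x_q / 2 ** f

def fake_quantize_grad(upstream_grad):
    """Backward pass (straight-through estimator): round() has zero
    gradient almost everywhere, so the quantization layer is skipped
    and the error is passed unchanged to the pre-quantization value,
    whose weight is then updated."""
    return upstream_grad

w = np.array([0.333])
w_seen = fake_quantize(w, f=7)        # value the forward pass uses
grad_w = fake_quantize_grad(0.5)      # gradient reaching the raw weight
```

The raw floating-point weight keeps accumulating gradient updates even though the forward pass only ever sees its quantized version.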
Further, the specific quantization method of step S4-1 comprises the following steps:
Formula (1) represents the floating-point convolution operation:
    y_f = x_f * w_f    (1)
where x_f denotes the floating-point value of the input layer, w_f denotes the floating-point value of the convolution kernel of the first convolutional layer, and * denotes the convolution operation. The floating-point values of the input layer and of the first convolutional layer's kernel are each mapped to fixed-point values, and the fractional bit width of the optimal fixed-point value is determined through formulas (2), (3) and (4).
Formula (2) calculates the minimum value when the floating-point values are mapped to fixed-point values:
    x_min^(i) = -2^(Q-1) · 2^(-f_i)    (2)
where i denotes the layer of the neural network model, here ranging over the input layer, the first convolutional kernel and the first convolutional output layer; f_i denotes the fractional bit width of the fixed-point values of layer i; Q denotes the quantization bit width; and x_min^(i) denotes the minimum value when the layer-i floating-point values are mapped to fixed-point values.
Formula (3) calculates the maximum value when the floating-point values are mapped to fixed-point values:
    x_max^(i) = (2^(Q-1) - 1) · 2^(-f_i)    (3)
where x_max^(i) denotes the maximum value when the layer-i floating-point values are mapped to fixed-point values.
The fractional bit width f_i of the optimal fixed-point value is calculated through the constraint condition of formula (4); the constraint makes the range of the fixed-point values as close as possible to the range of the floating-point values, so as to reduce the accuracy loss caused by quantization. The constraint condition is:
    f_i = argmin over f_i of ( |x_max^(i) - X_max^(i)| + |x_min^(i) - X_min^(i)| )    (4)
where |·| denotes taking the absolute value, and X_max^(i) and X_min^(i) denote the maximum and minimum of the layer-i floating-point values, obtained through statistics.
The quantized value of each layer is solved through formula (5):
    x_q^(i) = clamp( round( 2^(f_i) · x_f^(i) ), q_min, q_max )    (5)
where 2^(f_i) denotes the quantization factor of the layer-i floating-point values, x_f^(i) denotes the layer-i floating-point value, x_q^(i) denotes the layer-i quantized value, round(·) denotes the rounding operation, q_min = -2^(Q-1) denotes the minimum integer value after quantization, q_max = 2^(Q-1) - 1 denotes the maximum integer value after quantization, and clamp(·) denotes the truncation operation.
Further, in step S4-2, the quantized input value and the quantized convolution kernel are respectively dequantized, and the dequantized input value and dequantized convolution kernel are convolved through formula (6) to obtain the dequantized floating-point value output by the first convolutional layer:
    y_f' = x_f' * w_f' = ( x_q / 2^(f_x) ) * ( w_q / 2^(f_w) ) = ( x_q * w_q ) / 2^(f_x + f_w)    (6)
where x_q denotes the quantized input value, x_f' denotes the dequantized input value, w_q denotes the quantized convolution kernel, w_f' denotes the dequantized convolution kernel, y_q denotes the quantized value of the first convolutional output layer, y_f' denotes the dequantized value of the first convolutional output layer, and f_x and f_w denote the fractional bit widths of the input and of the convolution kernel respectively.
Formula (7) can be derived from formula (6) by requantizing y_f' with the output quantization factor 2^(f_y), where f_y denotes the fractional bit width of the convolution output:
    y_q = clamp( ( x_q * w_q ) >> ( f_x + f_w - f_y ), q_min, q_max )    (7)
The shift operation of formula (7) yields the quantized value y_q output by the first convolutional layer; q_min denotes the minimum integer value after quantization, q_max denotes the maximum integer value after quantization, and clamp(·) denotes the truncation operation.
In step S5-2, the resistive random access memories form a resistive random access memory array: the quantized values of the input layer are mapped to voltage values and applied to the rows of the array, the quantized values of the convolutional layer are mapped to the conductance values of the resistive random access memories, and the current value output by each column of the array is the result of the convolution operation between the quantized input values and the quantized convolutional-layer values stored in that column.
A neural network image classification device based on a resistive random access memory comprises one or more processors configured to implement the above neural network image classification method based on a resistive random access memory.
The invention has the advantages and beneficial effects that:
according to the neural network image classification system and method based on the resistive random access memory, due to the fact that the conductance range of a ReRAM device is limited, a limited bit width is needed to store a convolution kernel. Since the quantization bit width of each layer input is limited, the limited bit width is required to store the convolved output value. According to the method, the bit width of the quantization factor is optimized by designing the constraint condition, so that the quantization factor adopts the optimal power of 2, only limited digits need to be shifted to the right in the operation process of scaling after convolution, and the operation is simple. The precision loss caused by the right shift of ADC (analog-to-digital converter) is reduced. And meanwhile, quantization perception training is carried out, so that the loss of precision caused by quantization is reduced, and the reasoning speed of the model is improved.
Drawings
FIG. 1 is a flow chart of a method of an embodiment of the present invention.
Fig. 2 is a flowchart of training a neural network model based on a resistive random access memory in an embodiment of the present invention.
Fig. 3 is a flowchart of image classification prediction by a trained model according to an embodiment of the present invention.
Fig. 4 is a partial example diagram of an input image in the embodiment of the present invention.
FIG. 5 is a diagram of a ReRAM based crossbar array in an embodiment of the present invention.
FIG. 6 is a graph comparing the per-class classification accuracy on the test set of the floating-point model and of the method of the present invention.
Fig. 7 is a diagram showing the structure of an apparatus according to an embodiment of the present invention.
Detailed Description
The following describes in detail embodiments of the present invention with reference to the drawings. It should be understood that the detailed description and specific examples, while indicating the present invention, are given by way of illustration and explanation only, not limitation.
As shown in figs. 1 to 3, the embodiment of the present invention classifies the Fashion-MNIST dataset. As shown in fig. 4, the dataset has 50000 training samples and 10000 test samples in total. Each sample is a 28 x 28 grayscale image. The dataset has 10 classes in total: T-shirts, trousers, pullovers, dresses, coats, sandals, shirts, sneakers, bags and ankle boots.
The invention provides a neural network image classification system based on a resistive random access memory, which comprises an input layer, a group of convolutional layers and a fully connected layer connected in sequence, each convolutional layer being provided with a matching convolution quantization layer, convolution dequantization layer, activation layer and pooling layer. The input layer acquires a training-set image. The convolution quantization layer quantizes the input values of the input layer and the convolution kernel of the first convolutional layer to obtain the quantized input values and quantized convolution kernel. The convolution dequantization layer dequantizes the quantized input values and convolution kernel to obtain the dequantized value of the first convolutional layer, and a shift operation is performed in the digital domain on the corresponding bits of the storage device to obtain the shifted quantized value output by the first convolutional layer. The shifted value is activated by the activation layer, the activated value is pooled by the pooling layer, and the pooled value serves as the input of the next convolution quantization layer, and so on until the pooled output corresponding to the last convolutional layer. The final pooled output passes through the fully connected layer to produce the classification prediction of the training-set image; back propagation is performed according to the error between the prediction and the training-set ground truth, and the neural network model based on the resistive random access memory is trained. Because the quantization method uses rounding, the gradient cannot be solved; therefore, during back propagation the quantization layers are skipped and the error is passed directly back to the values before quantization, and the network parameters are optimized by updating the weights of the pre-quantization values, thereby reducing the accuracy loss caused by quantization.
An image to be classified is input into the trained system. The convolution quantization layer quantizes the input values of the input layer and the convolution kernel of the first convolutional layer, and the quantized input values are convolved with the quantized convolution kernel to obtain the quantized value output by the first convolutional layer: the quantized input values are mapped to voltage values of the resistive random access memory, the quantized convolution kernel is mapped to conductance values of the resistive random access memory, and the result of the convolution operation is mapped to the current value output by the resistive random access memory. The current value is converted into a voltage value, and a shift operation is performed based on this voltage value to obtain the shifted quantized value output by the first convolutional layer. The shifted value is activated by the activation layer, the activated value is pooled by the pooling layer, and the pooled value serves as the input of the next convolutional layer, and so on until the pooled output corresponding to the last convolutional layer; the final pooled output passes through the fully connected layer to obtain the classification result of the image to be classified.
The resistive random access memories form an RRAM array. The quantized values of the input layer are mapped to voltage values applied to the rows of the array, the quantized values of the convolutional layer are mapped to the conductance value of each RRAM cell, and the current output on each column of the array is the result of the convolution operation between the quantized input values applied to that column's cells and the quantized convolution-kernel values.
The convolutional layers comprise a first convolutional layer and a second convolutional layer, and each of them is paired with its own convolution quantization layer, convolution inverse-quantization layer, activation layer and pooling layer.
The invention provides a neural network image classification method based on a resistive random access memory, which comprises the following steps:
Step S1: normalizing the image to be classified to obtain a normalized image. The pixel values of the image are normalized to lie between 0 and 1; in the embodiment of the invention, after the pixel values of all samples in the Fashion-MNIST dataset are divided by 255, the sample pixel values fall in the range [0, 1].
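The normalization of step S1 can be sketched in a few lines of Python (an illustrative sketch only; the function name is chosen for this example and is not from the patent):

```python
# Step S1 sketch: dividing 8-bit pixel values by 255 maps them into [0, 1].

def normalize(pixels):
    """Map raw 8-bit pixel values (0..255) to floats in [0, 1]."""
    return [p / 255.0 for p in pixels]

sample = [0, 51, 102, 255]      # hypothetical pixel values of one image row
print(normalize(sample))        # -> [0.0, 0.2, 0.4, 1.0]
```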
Step S2: constructing a training set and a test set from the normalized images; the training samples of Fashion-MNIST are selected as the training set, and the test samples of Fashion-MNIST are selected as the test set.
Step S3: constructing a neural network model based on a resistive random access memory;
Specifically, the neural network model structure is as follows: input layer → first convolution quantization layer → first convolution inverse-quantization layer → activation layer → pooling layer → second convolution quantization layer → second convolution inverse-quantization layer → activation layer → pooling layer → fully connected quantization layer → fully connected inverse-quantization layer → softmax layer. The sizes of the weight parameters of each layer are set as follows:
the input layer has size 28 x 28 x 1 (the Fashion-MNIST sample size);
the first convolution quantization layer has a convolution kernel of the size [given only as an image in the original], with stride 1;
the second convolution quantization layer has a convolution kernel of the size [given only as an image in the original], with stride 1;
the fully connected quantization layer has fully connected parameters of the size [given only as an image in the original].
Step S4: inputting the training set into a neural network model based on a resistive random access memory, and performing quantitative perception training to obtain model parameters after quantitative perception training, wherein the method comprises the following steps:
In the embodiment of the present invention, the quantization bit width is 8 bits: the input is quantized to [0, 255], and the weight parameters of each layer are quantized to [-128, 127]. The specific steps are as follows:
Step S4-1: quantizing the input value of the input layer and the convolution kernel of the first convolutional layer to obtain the quantized input value and convolution kernel. The specific quantization method is as follows:

Formula (1) represents the floating-point convolution operation:

    y_f = x_f \ast w_f        (1)

where x_f denotes the floating-point value of the input layer, w_f denotes the floating-point value of the convolution kernel of the first convolutional layer, and \ast denotes the convolution operation. The floating-point values of the input layer and of the convolution kernel of the first convolutional layer are each mapped to fixed-point values, and the fractional bit width of the optimal fixed-point value is determined through formulas (2), (3) and (4).

Formula (2) computes the minimum value to which the floating-point values can be mapped in fixed point:

    x_{min}^{i} = -2^{bw-1} \cdot 2^{-fl_i}        (2)

where the index i ranges over x (the input layer), w (the first convolution quantization layer) and y (the first convolution output layer); fl_i denotes the fractional bit width of the fixed-point representation of layer i; bw denotes the quantization bit width; and x_{min}^{i} denotes the minimum value to which the floating-point values of layer i are mapped.

Formula (3) computes the maximum value to which the floating-point values can be mapped in fixed point:

    x_{max}^{i} = (2^{bw-1} - 1) \cdot 2^{-fl_i}        (3)

where x_{max}^{i} denotes the maximum value to which the floating-point values of layer i are mapped. The fractional bit width fl_i of the optimal fixed-point value is computed through the constraint condition of formula (4); the constraint makes the range of the fixed-point values as close as possible to the range of the floating-point values, so as to reduce the precision loss caused by quantization.

Constraint condition:

    fl_i = \max\{ fl : \max(|f_{max}^{i}|, |f_{min}^{i}|) \le (2^{bw-1} - 1) \cdot 2^{-fl} \}        (4)

where |\cdot| denotes taking the absolute value, and f_{max}^{i} and f_{min}^{i} denote the maximum and minimum of the floating-point values of layer i, obtained by statistics.

The quantized value of each layer is then obtained through formula (5):

    q_i = \mathrm{clamp}(\mathrm{round}(2^{fl_i} \cdot f_i),\ q_{min},\ q_{max})        (5)

where 2^{fl_i} denotes the quantization factor of the floating-point values of layer i, f_i denotes the floating-point values of layer i, q_i denotes the quantized values of layer i, round(·) denotes the rounding operation, q_{min} = -2^{bw-1} denotes the minimum integer value after quantization, q_{max} = 2^{bw-1} - 1 denotes the maximum integer value after quantization, and clamp(·) denotes the truncation operation to [q_{min}, q_{max}].
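A minimal Python sketch of the bit-width selection and quantization of formulas (2) to (5), assuming the power-of-two quantization factor 2^{fl} described above; the function names and the example statistics are illustrative, not from the patent:

```python
# Sketch of formulas (2)-(5): pick the largest fractional bit width fl such
# that the fixed-point range still covers the observed floating-point range,
# then quantize with round-and-clamp.

def best_fl(f_min, f_max, bw=8):
    """Constraint of formula (4): largest fl with
    max(|f_min|, |f_max|) <= (2**(bw-1) - 1) * 2**-fl."""
    m = max(abs(f_min), abs(f_max))
    fl = bw - 1
    while fl > -bw and (2 ** (bw - 1) - 1) * 2.0 ** (-fl) < m:
        fl -= 1                      # widen the representable range
    return fl

def quantize(f, fl, bw=8):
    """Formula (5): q = clamp(round(f * 2**fl), q_min, q_max)."""
    q_min, q_max = -2 ** (bw - 1), 2 ** (bw - 1) - 1
    q = round(f * 2 ** fl)
    return max(q_min, min(q_max, q))

fl = best_fl(-0.9, 0.7)              # hypothetical weight statistics
print(fl)                            # chosen fractional bit width
print(quantize(0.5, fl))             # quantized fixed-point value
```

With bw = 8 and statistics in (-1, 1), the sketch picks fl = 7, i.e. quantization factor 2^7 = 128, matching the [-128, 127] weight range of the embodiment.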
Step S4-2: dequantizing the quantized input value and convolution kernel respectively, convolving the dequantized input value with the dequantized convolution kernel to obtain the dequantized output value of the first convolutional layer, and performing a shift operation on the corresponding bits of the storage device in the digital domain to obtain the dequantized-and-shifted output value of the first convolutional layer.

Specifically, dividing by 2^{fl_x + fl_w - fl_y} in the digital domain (corresponding to a right shift by fl_x + fl_w - fl_y bits on the hardware storage device) yields the quantized value q_y of the convolutional-layer output. The right shift realizes the remaining operations of formula (7) in step S4, and the pooled value is finally obtained through the activation operation and the max-pooling operation.

The quantized input value and the convolution kernel are respectively dequantized through formula (6), and the dequantized input value is convolved with the dequantized convolution kernel to obtain the dequantized floating-point value of the first convolution output layer:

    \hat{x} = q_x / 2^{fl_x},\quad \hat{w} = q_w / 2^{fl_w},\quad \hat{y} = \hat{x} \ast \hat{w} = (q_x \ast q_w) / 2^{fl_x + fl_w}        (6)

where q_x denotes the quantized input value, \hat{x} the dequantized input value, q_w the quantized convolution kernel, \hat{w} the dequantized convolution kernel, q_y the quantized value of the first convolution output layer, and \hat{y} the dequantized value of the first convolution output layer.

Formula (7) can be derived from formula (6), and performing the shift operation of formula (7) yields the quantized value q_y of the first convolution output layer:

    q_y = \mathrm{clamp}\big((q_x \ast q_w) \gg (fl_x + fl_w - fl_y),\ q_{min},\ q_{max}\big)        (7)
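The shift-based requantization of formula (7) can be illustrated with a small numeric sketch (function names and values are illustrative, not from the patent); because all quantization factors are powers of two, rescaling the integer accumulator reduces to a right shift:

```python
# Sketch of formula (7): rescale the integer convolution result q_x * q_w to
# the output scale with a right shift by (fl_x + fl_w - fl_y) bits, then clamp.

def requantize_shift(acc, fl_x, fl_w, fl_y, bw=8):
    """acc is the integer accumulator for one convolution output element."""
    shift = fl_x + fl_w - fl_y
    q = acc >> shift                          # divide by 2**shift
    q_min, q_max = -2 ** (bw - 1), 2 ** (bw - 1) - 1
    return max(q_min, min(q_max, q))

# One output element: three products of quantized inputs and weights.
q_x, q_w = [10, 20, 30], [3, -1, 2]
acc = sum(a * b for a, b in zip(q_x, q_w))    # 10*3 - 20 + 30*2 = 70
print(requantize_shift(acc, fl_x=7, fl_w=7, fl_y=12))   # 70 >> 2 = 17
```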
Step S4-3: activating the dequantized-and-shifted value through the activation layer, pooling the activated value through the pooling layer, and taking the pooled value as the input value of the next convolution quantization layer, until the pooled output of the last convolutional layer is reached; the pooled output passes through the fully connected layer to obtain the classification prediction of the training-set image, back propagation is performed according to the error between the prediction and the training-set ground truth, and the RRAM-based neural network model is trained.
Specifically, the dequantized floating-point value of the first convolution output layer is fed to the next layer as its input. By analogy, the floating-point values of the fully connected layer are obtained; the output of the network is then obtained through the softmax classifier, the error between the network output and the manually labeled correct class is computed, and this error is propagated backwards. The neural network model after quantization-aware training is finally obtained.
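As stated above, rounding has no useful gradient, so during back propagation the quantization layer is skipped and the error is passed straight to the pre-quantization value. A minimal sketch of this forward/backward behavior (a straight-through-style estimator; the function names are illustrative and this is not the patent's implementation):

```python
# Forward: "fake quantization" (quantize then dequantize) so the next layer
# sees quantized precision. Backward: the gradient passes through unchanged,
# as if the rounding step were the identity.

def quant_forward(x, fl):
    """Quantize-dequantize a value with power-of-two factor 2**fl."""
    return round(x * 2 ** fl) / 2 ** fl

def quant_backward(grad_output):
    """Identity backward pass: the quantization step is skipped."""
    return grad_output

x = 0.3712
y = quant_forward(x, fl=7)     # value actually seen by the next layer
g = quant_backward(0.25)       # gradient w.r.t. x equals gradient w.r.t. y
print(y, g)
```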
Step S5: and inputting the test set image into the trained neural network to perform forward reasoning test.
Specifically, the neural network model after quantization-aware training is mapped onto the ReRAM memristor array, and the test set is input to perform the forward inference test. The specific steps are shown in fig. 5, where V denotes a voltage value, G a conductance value and I a current value.
Step S5-1: taking the test set as input, convolving the quantized value q_x of the input layer obtained in steps S3 and S4 with the quantized value q_w of the first convolutional layer to obtain the quantized convolution result q_x \ast q_w output by the first convolutional layer. The quantized values q_x of the input layer are mapped to voltage values of the resistive random access memory, the quantized values q_w of the first convolutional layer are mapped to conductance values of the resistive random access memory, and the result of the convolution operation q_x \ast q_w is mapped to the current values output by the resistive random access memory.
Step S5-2: converting the current value into a voltage value, performing a shift operation on the corresponding bits of the storage device to obtain the shifted quantized output of the first convolutional layer, activating the quantized value through the activation layer, pooling the activated value through the pooling layer, and taking the pooled value as the input value of the next convolutional layer, until the pooled output of the last convolutional layer is reached; the last pooled output passes through the fully connected layer to obtain the classification result of the test-set image.
Specifically, the current value output in step S5-1 is converted into a voltage by the ADC and then into a digital value; dividing this value by 2^{fl_x + fl_w - fl_y} in the digital domain (i.e., shifting it right by fl_x + fl_w - fl_y bits on the hardware storage device) yields the quantized value q_y of the convolutional-layer output. The right shift realizes the remaining operations of formula (7) in step S4, and the pooled value is finally obtained through the activation operation and the max-pooling operation.
By analogy, the quantized values of the fully connected layer are obtained, and the index of the maximum of these quantized values is taken as the class predicted by the network. The pooling layers and the fully connected layer are implemented in software.
The effect of the present invention is further explained by combining the simulation experiment as follows:
1. simulation conditions are as follows:
The simulation experiment of the invention was carried out in a hardware environment of NVIDIA GV100 and a software environment of PyTorch 1.5.
2. Simulation content and result analysis:
For the classification of the Fashion-MNIST dataset: in the histogram of fig. 6, the dark-gray bars show the per-class result of forward inference on the test set with the floating-point-precision model, and the light-gray bars show that of the 8-bit quantized model of the invention. As can be seen from the figure, for each class of the test set the recognition accuracy of the floating-point model differs only slightly from that of the quantization method based on the resistive random access memory. Table 1 gives the average recognition accuracy of the two methods on the test set; it can be seen that the quantization method based on the resistive random access memory of the invention has almost no accuracy loss, while accelerating the inference of the model.
Table 1: floating point model and model precision comparison table after quantization of the invention
Test method Average recognition accuracy
Reasoning test of test set by floating point model 0.8864
The invention carries out reasoning test on the test set 0.8852
In summary, the invention provides a neural network model quantization method based on a resistive random access memory. The method combines the characteristics of the conductance range of the ReRAM device with the limited bit width of the quantized input of each layer: because the conductance range of the ReRAM device is limited, a limited bit width is required to store the convolution kernel, and because the quantization bit width of each layer's input is limited, a limited bit width is required to store the output value after convolution. By designing the constraint condition, the method optimizes the bit width of the quantization factor so that the quantization factor is an optimal power of 2; the scaling after convolution then only requires a right shift by a limited number of bits, which is simple to perform and reduces the precision loss introduced by the right shift after the ADC (analog-to-digital converter). Quantization-aware training is carried out at the same time, which reduces the precision loss caused by quantization and increases the inference speed of the model. For the Fashion-MNIST classification, the 8-bit quantization loses less than 0.5 percentage points of accuracy compared with floating-point precision.
Corresponding to the embodiment of the neural network image classification method based on the resistive random access memory, the invention also provides an embodiment of the neural network image classification device based on the resistive random access memory.
Referring to fig. 7, the neural network image classification device based on the resistive random access memory provided in the embodiment of the present invention includes one or more processors, and is configured to implement the neural network image classification method based on the resistive random access memory in the above embodiment.
The embodiment of the neural-network image classification device based on the resistive random access memory can be applied to any device with data processing capability, such as a computer. The device embodiments may be implemented by software, by hardware, or by a combination of hardware and software. Taking software implementation as an example, as a logical device it is formed by the processor of the device with data processing capability reading the corresponding computer program instructions from the non-volatile memory into memory and running them. In terms of hardware, fig. 7 shows a hardware structure diagram of a device with data processing capability on which the neural-network image classification apparatus based on a resistive random access memory is located; besides the processor, memory, network interface and non-volatile memory shown in fig. 7, the device may generally include other hardware according to its actual function, which is not described again here.
The implementation process of the functions and actions of each unit in the above device is specifically described in the implementation process of the corresponding step in the above method, and is not described herein again.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the invention. One of ordinary skill in the art can understand and implement it without inventive effort.
Embodiments of the present invention further provide a computer-readable storage medium, on which a program is stored, where the program, when executed by a processor, implements the neural network image classification method based on a resistance random access memory in the foregoing embodiments.
The computer readable storage medium may be an internal storage unit, such as a hard disk or a memory, of any data processing capability device described in any of the foregoing embodiments. The computer readable storage medium may also be any external storage device of a device with data processing capabilities, such as a plug-in hard disk, a Smart Media Card (SMC), an SD Card, a Flash memory Card (Flash Card), etc. provided on the device. Further, the computer readable storage medium may include both internal storage units and external storage devices of any data processing capable device. The computer-readable storage medium is used for storing the computer program and other programs and data required by the arbitrary data processing-capable device, and may also be used for temporarily storing data that has been output or is to be output.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A neural network image classification system based on a resistive random access memory, comprising an input layer, a group of convolutional layers and a fully connected layer connected in sequence, characterized in that: each convolutional layer is provided with a matching convolution quantization layer, convolution inverse-quantization layer, activation layer and pooling layer; the input layer is used for acquiring a training-set image; the convolution quantization layer quantizes the input value of the input layer and the convolution kernel of the first convolutional layer to obtain a quantized input value and convolution kernel; the convolution inverse-quantization layer dequantizes the quantized input value and convolution kernel, the dequantized input value and convolution kernel are convolved to obtain the dequantized output value of the first convolutional layer, and a shift operation is performed in the digital domain to obtain the dequantized-and-shifted output value of the first convolutional layer; the activation layer performs the activation operation on the dequantized-and-shifted value, the activated value is pooled, and the pooled value is used as the input value of the next convolutional layer, until the pooled output of the last convolutional layer is reached; the final pooled output passes through the fully connected layer to obtain the classification prediction of the training-set image, back propagation is performed according to the error between the prediction and the training-set ground truth to train the neural network model based on the resistive random access memory, the quantization layer is skipped during back propagation so that the error is passed directly back to the pre-quantization value, and the network parameters are optimized by updating the weights of the pre-quantization values; an image to be classified is input into the trained system, the convolution quantization layer quantizes the input value of the input layer and the convolution kernel of the first convolutional layer, the obtained quantized value of the input layer is convolved with the quantized value of the first convolutional layer to obtain the quantized value output by the first convolutional layer, the quantized values of the input layer are mapped to voltage values of the resistive random access memory, the quantized values of the first convolutional layer are mapped to conductance values of the resistive random access memory, and the result of the convolution operation is mapped to current values output by the resistive random access memory; the current value is converted into a voltage value, a shift operation is performed based on the voltage value to obtain the shifted quantized output of the first convolutional layer, the activation layer activates the quantized value, the pooling layer pools the activated value, and the pooled value serves as the input value of the next convolutional layer, until the pooled output of the last convolutional layer is reached; the final pooled output passes through the fully connected layer to obtain the classification result of the image to be classified.
2. The neural network image classification system based on the resistive random access memory according to claim 1, wherein: the resistive random access memories form an RRAM array, the quantized values of the input layer are mapped to voltage values applied to the rows of the array, the quantized values of the convolutional layer are mapped to the conductance value of each RRAM cell, and the current output on each column of the array is the result of the convolution operation between the quantized input values applied to that column's cells and the quantized convolution-kernel values.
3. The neural network image classification system based on the resistive random access memory according to claim 1, wherein: the convolutional layers comprise a first convolutional layer and a second convolutional layer, each of which is paired with its own convolution quantization layer, convolution inverse-quantization layer, activation layer and pooling layer.
4. The resistive-random-access-memory-based neural network image classification system according to one of claims 1 to 3, wherein the quantization process is as follows:

formula (1) represents the floating-point convolution operation:

    y_f = x_f \ast w_f        (1)

where x_f denotes the floating-point value of the input layer, w_f denotes the floating-point value of the convolution kernel of the first convolutional layer, and \ast denotes the convolution operation; the floating-point values of the input layer and of the convolution kernel of the first convolutional layer are respectively mapped to fixed-point values, and the fractional bit width of the optimal fixed-point value is determined through formulas (2), (3) and (4);

formula (2) computes the minimum value to which the floating-point values can be mapped in fixed point:

    x_{min}^{i} = -2^{bw-1} \cdot 2^{-fl_i}        (2)

where i denotes the layer index of the neural network model, fl_i denotes the fractional bit width of the fixed-point representation of the i-th layer, bw denotes the quantization bit width, and x_{min}^{i} denotes the minimum value to which the floating-point values of the i-th layer are mapped;

formula (3) computes the maximum value to which the floating-point values can be mapped in fixed point:

    x_{max}^{i} = (2^{bw-1} - 1) \cdot 2^{-fl_i}        (3)

where x_{max}^{i} denotes the maximum value to which the floating-point values of the i-th layer are mapped;

the fractional bit width fl_i of the optimal fixed-point value is computed through the constraint condition of formula (4):

    fl_i = \max\{ fl : \max(|f_{max}^{i}|, |f_{min}^{i}|) \le (2^{bw-1} - 1) \cdot 2^{-fl} \}        (4)

where |\cdot| denotes taking the absolute value, and f_{max}^{i} and f_{min}^{i} denote the maximum and minimum of the floating-point values of the i-th layer, obtained by statistics;

the quantized value of each layer is obtained through formula (5):

    q_i = \mathrm{clamp}(\mathrm{round}(2^{fl_i} \cdot f_i),\ q_{min},\ q_{max})        (5)

where 2^{fl_i} denotes the quantization factor of the floating-point values of the i-th layer, f_i denotes the floating-point values of the i-th layer, q_i denotes the quantized values of the i-th layer, round(·) denotes the rounding operation, q_{min} denotes the minimum integer value after quantization, q_{max} denotes the maximum integer value after quantization, and clamp(·) denotes the truncation operation.
5. The resistive-random-access-memory-based neural network image classification system according to one of claims 1 to 3, wherein: the quantized input value and convolution kernel are respectively dequantized, and the dequantized input value is convolved with the dequantized convolution kernel through formula (6) to obtain the dequantized floating-point value output by the first convolutional layer:

    \hat{x} = q_x / 2^{fl_x},\quad \hat{w} = q_w / 2^{fl_w},\quad \hat{y} = \hat{x} \ast \hat{w} = (q_x \ast q_w) / 2^{fl_x + fl_w}        (6)

where q_x denotes the quantized input value, \hat{x} the dequantized input value, q_w the quantized convolution kernel, \hat{w} the dequantized convolution kernel, q_y the quantized value of the first convolution output layer, and \hat{y} the dequantized value of the first convolution output layer;

formula (7) is derived from formula (6):

    q_y = \mathrm{clamp}\big((q_x \ast q_w) \gg (fl_x + fl_w - fl_y),\ q_{min},\ q_{max}\big)        (7)

the shift operation of formula (7) yields the quantized value q_y output by the first convolutional layer, where q_{min} denotes the minimum integer value after quantization, q_{max} denotes the maximum integer value after quantization, and clamp(·) denotes the truncation operation.
6. A neural network image classification method based on a resistive random access memory, characterized by comprising the following steps:
step S1: normalizing the image to be classified to obtain a normalized image;
step S2: constructing a training set and a test set from the normalized images;
step S3: constructing a neural network model based on the resistive random access memory;
step S4: inputting the training set into the resistive-random-access-memory-based neural network model and performing quantization-aware training to obtain the model parameters after quantization-aware training, comprising the following steps:
step S4-1: quantizing the input values of the input layer and the convolution kernel of the first convolution layer to obtain the quantized input values and quantized convolution kernel;
step S4-2: dequantizing the quantized input values and the quantized convolution kernel respectively, performing a convolution operation on the dequantized input values and convolution kernel to obtain the dequantized output value of the first convolution layer, and performing a shift operation in the digital domain to obtain the quantized, shifted output value of the first convolution layer;
step S4-3: activating the quantized, shifted value through an activation layer and pooling the activated value through a pooling layer, taking the pooled value as the input of the next convolution and quantization layer, up to the pooled output corresponding to the last convolution layer; obtaining the classification prediction result for the training-set images through a fully connected layer; performing back propagation according to the error between the prediction result and the training-set ground truth to train the resistive-random-access-memory-based neural network model, where back propagation skips the quantization layers and passes gradients directly to the pre-quantization values, and the network parameters are optimized by updating the weights at their pre-quantization values;
step S5: inputting the test-set images into the trained neural network for a forward-inference test, comprising the following steps:
step S5-1: taking the test set as input, performing a convolution operation between the quantized values of the input layer and the quantized values of the first convolution layer obtained in steps S3 and S4 to obtain the quantized output value of the first convolution layer; the quantized values of the input layer are mapped to voltage values of the resistive random access memory, the quantized values of the first convolution layer are mapped to conductance values of the resistive random access memory, and the result of the convolution operation is mapped to the current values output by the resistive random access memory;
step S5-2: converting the current values into voltage values and performing a shift operation on the voltage values to obtain the quantized, shifted output value of the first convolution layer; activating the quantized value through an activation layer, pooling the activated value through a pooling layer, and taking the pooled value as the input of the next convolution layer, up to the pooled output corresponding to the last convolution layer; the last pooled output passes through a fully connected layer to obtain the classification result for the test-set images.
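The forward path the steps above describe — quantize, integer convolution, shift-based requantization, activation, pooling — can be sketched in plain Python with a 1-D convolution standing in for the 2-D layer; all function names and the fractional bit widths `fl_x`, `fl_w`, `fl_y` are illustrative assumptions, not terms from the patent:

```python
def quantize(xs, fl, bw=8):
    """Map floats to fixed point with fractional bit width fl (cf. formula (5))."""
    q_min, q_max = -(1 << (bw - 1)), (1 << (bw - 1)) - 1
    return [max(q_min, min(q_max, round(x * (1 << fl)))) for x in xs]

def conv1d(xs, ws):
    """Valid-mode 1-D convolution, standing in for the 2-D layer convolution."""
    k = len(ws)
    return [sum(xs[i + j] * ws[j] for j in range(k)) for i in range(len(xs) - k + 1)]

def forward(xs, ws, fl_x, fl_w, fl_y, bw=8):
    """One quantized conv -> shift -> ReLU -> max-pool stage (cf. S5-1/S5-2)."""
    q_min, q_max = -(1 << (bw - 1)), (1 << (bw - 1)) - 1
    x_q, w_q = quantize(xs, fl_x), quantize(ws, fl_w)
    acc = conv1d(x_q, w_q)                        # integer convolution (step S5-1)
    shift = fl_x + fl_w - fl_y                    # assumed positive in this sketch
    y_q = [max(q_min, min(q_max, (a + (1 << (shift - 1))) >> shift)) for a in acc]
    y_q = [max(0, a) for a in y_q]                # activation layer (ReLU)
    return [max(y_q[i:i + 2]) for i in range(0, len(y_q) - 1, 2)]  # 2-wide max pool
```

The straight-through backward pass of step S4-3 (gradients skipping the round/clamp) is omitted here; only the inference arithmetic is shown.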
7. The neural network image classification method based on the resistive random access memory according to claim 6, characterized in that the specific quantization method of step S4-1 comprises the following steps:
formula (1) represents the floating-point convolution operation:

Y_f = X_f * W_f    (1)

where X_f represents the floating-point values of the input layer, W_f represents the floating-point values of the convolution kernel of the first convolution layer, and * represents the convolution operation; the floating-point values of the input layer and of the first-layer convolution kernel are each mapped to fixed-point values, and the optimal fractional bit width of the fixed-point value is determined through formulas (2), (3) and (4);
formula (2) calculates the minimum value representable when floating-point values are mapped to fixed point:

X_min_i = -2^(bw-1) · 2^(-fl_i)    (2)

where i is the layer index of the neural network model, fl_i is the fractional bit width used when mapping the floating-point values of layer i to fixed point, bw is the quantization bit width, and X_min_i is the minimum value representable when the floating-point values of layer i are mapped to fixed point;
formula (3) calculates the maximum value representable when floating-point values are mapped to fixed point:

X_max_i = (2^(bw-1) - 1) · 2^(-fl_i)    (3)

where X_max_i is the maximum value representable when the floating-point values of layer i are mapped to fixed point;
the optimal fractional bit width fl_i is determined through the constraint condition of formula (4):

max(|x_min_i|, |x_max_i|) ≤ X_max_i    (4)

where |·| denotes the absolute value, x_max_i denotes the maximum floating-point value of layer i, x_min_i denotes the minimum floating-point value of layer i, and x_max_i and x_min_i are obtained by statistics over the layer-i floating-point values;
the quantized value of each layer is solved through formula (5):

X_q_i = clamp(round(X_f_i · S_i), Q_min, Q_max),  with S_i = 2^(fl_i)    (5)

where S_i represents the quantization factor of the layer-i floating-point values, X_f_i represents the floating-point values of layer i, X_q_i represents the quantized values of layer i, round(·) represents the rounding operation, Q_min represents the minimum value after quantization to integer, Q_max represents the maximum value after quantization to integer, and clamp(·) represents the truncation operation.
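A hedged sketch of the fixed-point range formulas (2)-(3), one common reading of constraint (4) (pick the finest fractional bit width whose range still covers the observed layer statistics), and the quantization of formula (5); the function names and the search strategy are assumptions, since the original equations are only available as figures:

```python
def best_fractional_bits(x_min, x_max, bw=8):
    """One reading of constraint (4): return the largest fractional bit width
    fl whose fixed-point range (formulas (2) and (3)) still covers the
    statistics [x_min, x_max] observed for the layer."""
    for fl in range(bw, -bw - 1, -1):                # finest resolution first
        lo = -(1 << (bw - 1)) * 2.0 ** -fl           # formula (2)
        hi = ((1 << (bw - 1)) - 1) * 2.0 ** -fl      # formula (3)
        if lo <= x_min and x_max <= hi:
            return fl
    raise ValueError("floating-point range not representable at this bit width")

def quantize(x, fl, bw=8):
    """Formula (5): scale by the quantization factor 2**fl, round, then clamp."""
    q_min, q_max = -(1 << (bw - 1)), (1 << (bw - 1)) - 1
    return max(q_min, min(q_max, round(x * 2.0 ** fl)))
```

For an 8-bit layer whose statistics span [-1.0, 0.9], the search settles on 7 fractional bits, so 0.5 quantizes to 64 and out-of-range values clamp to the integer bounds.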
8. The neural network image classification method based on the resistive random access memory according to claim 6, characterized in that in step S4-2, the quantized input values and the quantized convolution kernel are dequantized, and the dequantized input values and dequantized convolution kernel are convolved through formula (6) to obtain the dequantized floating-point output of the first convolution layer:

Y_dq = X_dq * W_dq = (X_q · 2^(-fl_x)) * (W_q · 2^(-fl_w)) = (X_q * W_q) · 2^(-(fl_x + fl_w))    (6)

where X_q represents the quantized input values, X_dq represents the dequantized input values, W_q represents the quantized convolution kernel, W_dq represents the dequantized convolution kernel, Y_q represents the quantized output value of the first convolution layer, Y_dq represents the dequantized output value of the first convolution layer, and fl_x and fl_w are the fractional bit widths of the input and the convolution kernel;
equation (7) is derived from equation (6):

Y_q = clamp(round((X_q * W_q) · 2^(-(fl_x + fl_w - fl_y))), Q_min, Q_max)    (7)

performing the shift operation of formula (7) yields the quantized value Y_q output by the first convolution layer, where fl_y is the fractional bit width of the output, Q_min represents the minimum value after quantization to integer, Q_max represents the maximum value after quantization to integer, and clamp(·) represents the truncation operation.
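A quick numeric check of the identity in formula (6): convolving the dequantized operands equals scaling the integer convolution by 2^(-(fl_x + fl_w)), assuming power-of-two quantization factors. The concrete values below are arbitrary illustrations:

```python
def conv1d(xs, ws):
    """Valid-mode 1-D convolution standing in for the 2-D layer convolution."""
    k = len(ws)
    return [sum(xs[i + j] * ws[j] for j in range(k)) for i in range(len(xs) - k + 1)]

fl_x, fl_w = 4, 5                          # assumed fractional bit widths
x_q = [8, -3, 12, 5]                       # quantized input values (integers)
w_q = [7, 2]                               # quantized convolution kernel

x_dq = [v * 2.0 ** -fl_x for v in x_q]     # dequantized input
w_dq = [v * 2.0 ** -fl_w for v in w_q]     # dequantized kernel
y_dq = conv1d(x_dq, w_dq)                  # left-hand side of (6)

scale = 2.0 ** -(fl_x + fl_w)
y_int = [a * scale for a in conv1d(x_q, w_q)]  # right-hand side of (6)
```

Because the scales are exact powers of two, the two sides agree to floating-point precision, which is what makes the digital-domain shift of formula (7) exact.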
9. The neural network image classification method based on the resistive random access memory according to claim 6, characterized in that in step S5-2, resistive random access memory cells are constructed into a resistive random access memory array; the quantized values of the input layer are mapped to voltage values and applied to the rows of the array, the quantized values of the convolution layer are mapped to conductance values of the resistive random access memory cells, and the current value output by each column of the array is the result of the convolution operation between the quantized input values applied to the rows and the quantized convolution-layer values stored in that column.
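The analog mapping of claim 9 amounts to a crossbar matrix-vector multiply: quantized activations become row voltages, each column stores one flattened kernel as conductances, and the column currents accumulate the products by Kirchhoff's current law. A minimal simulation (names are assumptions, not from the patent):

```python
def crossbar_mvm(voltages, conductances):
    """Simulate one read of an RRAM crossbar: voltages[i] drives row i,
    conductances[i][j] is the cell at row i / column j, and the current
    collected on column j is sum_i voltages[i] * conductances[i][j] --
    one convolution dot product per column."""
    n_rows = len(voltages)
    n_cols = len(conductances[0])
    return [sum(voltages[i] * conductances[i][j] for i in range(n_rows))
            for j in range(n_cols)]
```

For instance, row voltages [1.0, 0.5] applied to a 2x2 conductance array [[2.0, 1.0], [4.0, 2.0]] yield column currents [4.0, 2.0].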
10. A neural network image classification device based on a resistive random access memory, characterized by comprising one or more processors configured to implement the neural network image classification method based on the resistive random access memory according to any one of claims 6 to 9.
CN202210579664.2A 2022-05-26 2022-05-26 Neural network image classification system and method based on resistive random access memory Active CN114677548B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210579664.2A CN114677548B (en) 2022-05-26 2022-05-26 Neural network image classification system and method based on resistive random access memory

Publications (2)

Publication Number Publication Date
CN114677548A true CN114677548A (en) 2022-06-28
CN114677548B CN114677548B (en) 2022-10-14

Family

ID=82080811


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115311506A (en) * 2022-10-11 2022-11-08 之江实验室 Image classification method and device based on quantization factor optimization of resistive random access memory
CN115905546A (en) * 2023-01-06 2023-04-04 之江实验室 Graph convolution network document identification device and method based on resistive random access memory
CN116561050A (en) * 2023-04-07 2023-08-08 清华大学 Fine granularity mapping method and device for RRAM (remote radio access memory) integrated chip

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB201704751D0 (en) * 2017-03-24 2017-05-10 Imagination Tech Ltd Floating point to fixed point conversion
US20190042948A1 (en) * 2017-08-04 2019-02-07 Samsung Electronics Co., Ltd. Method and apparatus for generating fixed-point quantized neural network
CN110363281A (en) * 2019-06-06 2019-10-22 上海交通大学 Convolutional neural network quantization method, device, computer and storage medium
CN111260048A (en) * 2020-01-14 2020-06-09 上海交通大学 Method for realizing activation function in neural network accelerator based on memristor
CN111382788A (en) * 2020-03-06 2020-07-07 西安电子科技大学 Hyperspectral image classification method based on binary quantization network
CN111612147A (en) * 2020-06-30 2020-09-01 上海富瀚微电子股份有限公司 Quantization method of deep convolutional network
CN111695671A (en) * 2019-03-12 2020-09-22 北京地平线机器人技术研发有限公司 Method and device for training neural network and electronic equipment
CN112561049A (en) * 2020-12-23 2021-03-26 首都师范大学 Resource allocation method and device of DNN accelerator based on memristor
CN112884133A (en) * 2021-03-24 2021-06-01 苏州科达科技股份有限公司 Convolutional neural network quantization method, system, device and storage medium
WO2021179587A1 (en) * 2020-03-10 2021-09-16 北京迈格威科技有限公司 Neural network model quantification method and apparatus, electronic device and computer-readable storage medium
CN114330694A (en) * 2021-12-31 2022-04-12 上海集成电路装备材料产业创新中心有限公司 Circuit and method for realizing convolution operation
CN114418057A (en) * 2020-10-28 2022-04-29 华为技术有限公司 Operation method of convolutional neural network and related equipment


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DARRYL D. LIN et al., "Fixed Point Quantization of Deep Convolutional Networks", Proceedings of the 33rd International Conference on Machine Learning, PMLR 48 *
SUN LEI et al., "Improved convolutional neural network recognition model based on embedded SoC", Computer Applications and Software *




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant