CN114723044B - Error compensation method, device, chip and equipment for in-memory computing chip - Google Patents

Error compensation method, device, chip and equipment for in-memory computing chip

Info

Publication number
CN114723044B
Authority
CN
China
Prior art keywords
convolution
layer
convolution layer
current
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210357661.4A
Other languages
Chinese (zh)
Other versions
CN114723044A (en)
Inventor
严洪泽
郭昕婕
孙旭光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Zhicun Intelligent Technology Co ltd
Original Assignee
Hangzhou Zhicun Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Zhicun Intelligent Technology Co ltd filed Critical Hangzhou Zhicun Intelligent Technology Co ltd
Priority to CN202210357661.4A priority Critical patent/CN114723044B/en
Publication of CN114723044A publication Critical patent/CN114723044A/en
Application granted granted Critical
Publication of CN114723044B publication Critical patent/CN114723044B/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/082 - Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The embodiment of the invention provides an error compensation method, device, chip and equipment for an in-memory computing chip. The method comprises: inputting a sample image into a target neural network model, and taking the feature map output by each convolution layer as the calibration reference of that convolution layer; obtaining a corresponding index matrix from the input feature map of the current convolution layer for the sample image; obtaining the calculation result of the current convolution layer for that input feature map as output by the in-memory computing chip; and, after comparing the calibration reference with the calculation result, generating an index value-compensation vector correspondence table for the current convolution layer from the index matrix and the comparison result. The correspondence table is used, when the in-memory computing chip runs the target neural network model, to compensate in the digital domain the calculation results output by the analog domain, so that the results of the in-memory computing chip come closer to the results obtained under noise-free conditions.

Description

Error compensation method, device, chip and equipment for in-memory computing chip
Technical Field
The present invention relates to the field of semiconductor technologies, and in particular, to an error compensation method and apparatus for an in-memory computing chip, an electronic device, and a computer readable storage medium.
Background
Most conventional digital chips are based on the von Neumann architecture, in which the physical separation of memory and processor creates the "memory wall" performance bottleneck. In recent years, to overcome this structural bottleneck of the von Neumann computing system, computing-in-memory (CIM) chips based on several types of memory cells, such as phase change memory (PCM), resistive random access memory (RRAM) and NOR Flash, have attracted attention. Their basic idea is to perform logic computation directly inside the memory, exploiting the analog storage capability of the memory cells and Kirchhoff's circuit laws, thereby reducing the amount of data transferred between memory and processor and the transfer distance, lowering power consumption and improving performance.
Unlike traditional digital computing chips, CIM chips use analog circuits and precise current control to implement computation. For example, to add two analog quantities it is enough to join the two currents that represent them, whereas adding two 8-bit numbers in the digital domain requires transistors that represent 0/1 through the switching of their gate voltages, and a single adder needs many such transistors (6 to 28 transistors per bit, depending on the circuit design).
However, owing to the limitations of current analog chip process capability and of memory-state programming precision, the number of electrons on the floating gate of a memory cell cannot be controlled exactly. Combined with the limited precision of the digital-to-analog and analog-to-digital conversion modules and other factors, this leaves a certain gap between the accuracy of results computed on the memory cell array of a CIM chip and that of a digital integrated circuit. Consequently, current CIM chips are applied only to tasks with low computing-precision requirements, such as classification, and not yet to scenarios with higher precision requirements, such as image processing.
Disclosure of Invention
In view of the problems in the prior art, the present invention provides an error compensation method and apparatus for an in-memory computing chip, an electronic device, and a computer-readable storage medium, which can at least partially solve the problems in the prior art.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
in a first aspect, an error compensation method for an in-memory computing chip is provided, including:
inputting a sample image into a target neural network model, and taking the feature map output by each convolution layer as the calibration reference of that convolution layer;
acquiring a corresponding index matrix according to the input feature map of the current convolution layer for the sample image;
obtaining the calculation result of the current convolution layer for that input feature map as output by an in-memory computing chip, wherein the in-memory computing chip is used to run the target neural network model, and the input feature map is fed into the flash memory cell array corresponding to the current convolution layer in the analog domain of the in-memory computing chip to obtain the calculation result;
and, after comparing the calibration reference with the calculation result, generating an index value-compensation vector correspondence table for the current convolution layer according to the index matrix and the comparison result, wherein the index value-compensation vector correspondence table is used to compensate, in the digital domain, the calculation results output by the analog domain when the in-memory computing chip runs the target neural network model.
Further, the neural network model is a directly-connected neural network model, a U-shaped neural network model, a residual-structure network model, a recurrent neural network model, an MLP model, a Transformer network model or an attention-mechanism neural network model.
Further, the neural network model includes: 1D convolution, 2D convolution, 3D convolution, fully connected, transposed convolution, channel-separable convolution and dilated convolution neural network operators.
Further, the sample image is in a gray scale image format, a color transparency image format, a gray scale chrominance image format, or a hue saturation brightness image format.
Further, the comparing of the calibration reference with the calculation result and the subsequent generating of the index value-compensation vector correspondence table for the current convolution layer according to the index matrix and the comparison result includes:
comparing the calibration reference with the calculation result;
and generating the index value-compensation vector correspondence table for the current convolution layer according to the index matrix and the comparison result.
Further, the acquiring of the corresponding index matrix according to the input feature map of the current convolution layer for the sample image includes:
if the current convolution layer is the first convolution layer, zero-padding the input feature map according to the stride and convolution kernel size of the current convolution layer, in the zero-padding manner of the tensorflow convolution operator;
if the current convolution layer is not the first convolution layer and the preceding layer is a convolution layer, zero-padding the mean-value matrix of the preceding convolution layer according to the stride and convolution kernel size of the current convolution layer, in the zero-padding manner of the tensorflow convolution operator;
if the current convolution layer is not the first convolution layer and the preceding layer is not a convolution layer, zero-padding the pre-mask matrix of the preceding non-convolution layer according to the stride and convolution kernel size of the current convolution layer, in the zero-padding manner of the tensorflow convolution operator;
traversing the zero-padded image pixel by pixel, and computing, channel by channel, the mean of a preset area around each pixel to obtain the mean-value matrix of the current convolution layer;
and computing the index matrix of the current convolution layer from the mean-value matrix.
Further, if the step size of the next convolution layer is equal to 1 and the convolution kernel sizes of the next convolution layer and the current convolution layer are the same, the index matrix of the next convolution layer is the same as the index matrix of the current convolution layer.
Further, the comparing of the calibration reference with the calculation result and the subsequent generating of the corresponding index value-compensation vector correspondence table according to the index matrix and the comparison result includes:
comparing the calibration reference with the calculation result, then grouping the differences according to the index matrix and averaging them, to obtain the corresponding index value-compensation vector correspondence table.
In a second aspect, an error compensation method for an in-memory computing chip is provided, where the in-memory computing chip is used to run a target neural network model and comprises a digital domain and an analog domain, the error compensation method being applied in the digital domain and comprising:
acquiring a corresponding index matrix according to the input feature map of the current convolution layer;
obtaining, according to the index matrix, the compensation vector corresponding to each pixel of the input feature map from a pre-obtained index value-compensation vector correspondence table of the current convolution layer;
obtaining the calculation result for the input feature map output by the flash memory cell array corresponding to the current convolution layer in the analog domain;
and compensating the calculation result according to the compensation vectors.
Further, the neural network model is a directly-connected neural network model, a U-shaped neural network model, a residual-structure network model, a recurrent neural network model, an MLP model, a Transformer network model or an attention-mechanism neural network model.
Further, the neural network model includes: 1D convolution, 2D convolution, 3D convolution, fully connected, transposed convolution, channel-separable convolution and dilated convolution neural network operators.
Further, the input feature map is in a gray scale map format, a color transparency map format, a gray scale chrominance image format, or a hue saturation brightness image format.
Further, the acquiring of the corresponding index matrix according to the input feature map of the current convolution layer includes:
if the current convolution layer is the first convolution layer, zero-padding the input feature map according to the stride and convolution kernel size of the current convolution layer, in the zero-padding manner of the tensorflow convolution operator;
if the current convolution layer is not the first convolution layer and the preceding layer is a convolution layer, zero-padding the mean-value matrix of the preceding convolution layer according to the stride and convolution kernel size of the current convolution layer, in the zero-padding manner of the tensorflow convolution operator;
if the current convolution layer is not the first convolution layer and the preceding layer is not a convolution layer, zero-padding the pre-mask matrix of the preceding non-convolution layer according to the stride and convolution kernel size of the current convolution layer, in the zero-padding manner of the tensorflow convolution operator;
traversing the zero-padded image pixel by pixel, and computing, channel by channel, the mean of the area around each pixel to obtain the mean-value matrix of the current convolution layer;
and computing the index matrix of the current convolution layer from the mean-value matrix.
Further, the method further comprises:
and transmitting the compensated calculation result, as the input feature map of the next convolution layer of the target neural network model, to the flash memory cell array corresponding to the next convolution layer in the analog domain for calculation.
In a third aspect, there is provided an in-memory computing chip comprising: an analog domain for performing the matrix multiply-add operations of the target neural network model, and a digital domain for performing the error compensation method described above.
In a fourth aspect, there is provided an error compensation apparatus for an in-memory computing chip, comprising:
a calibration reference acquisition module, configured to input the sample image into the target neural network model and take the feature map output by each convolution layer as the calibration reference of that convolution layer;
an index matrix acquisition module, configured to acquire a corresponding index matrix according to the input feature map of the current convolution layer for the sample image;
an in-memory computation result acquisition module, configured to obtain the calculation result of the current convolution layer for the input feature map as output by the in-memory computing chip, wherein the in-memory computing chip is used to run the target neural network model, and the input feature map is fed into the flash memory cell array corresponding to the current convolution layer in the analog domain of the in-memory computing chip to obtain the calculation result;
and a compensation vector table generation module, configured to compare the calibration reference with the calculation result and generate an index value-compensation vector correspondence table for the current convolution layer according to the index matrix and the comparison result, wherein the index value-compensation vector correspondence table is used to compensate, in the digital domain, the calculation results output by the analog domain when the in-memory computing chip runs the target neural network model.
In a fifth aspect, an electronic device is provided, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the error compensation method for an in-memory computing chip described above.
In a sixth aspect, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, implements the steps of the error compensation method for an in-memory computing chip described above.
The embodiment of the invention provides an error compensation method, device, chip and equipment for an in-memory computing chip. The method comprises: inputting a sample image into a target neural network model, and taking the feature map output by each convolution layer as the calibration reference of that convolution layer; acquiring a corresponding index matrix according to the input feature map of the current convolution layer for the sample image; obtaining the calculation result of the current convolution layer for that input feature map as output by the in-memory computing chip, wherein the in-memory computing chip is used to run the target neural network model and the input feature map is fed into the flash memory cell array corresponding to the current convolution layer in the analog domain of the chip to obtain the calculation result; and, after comparing the calibration reference with the calculation result, generating an index value-compensation vector correspondence table for the current convolution layer according to the index matrix and the comparison result, the table being used to compensate, in the digital domain, the calculation results output by the analog domain when the in-memory computing chip runs the target neural network model. In the inference stage of the application, in the digital domain of the chip, a corresponding index matrix is obtained from the input feature map of the current convolution layer; the compensation vector corresponding to each pixel of the input feature map is obtained, according to the index matrix, from the pre-obtained index value-compensation vector correspondence table of the current convolution layer; the calculation result for the input feature map output by the flash memory cell array corresponding to the current convolution layer in the analog domain is obtained; and the calculation result is compensated according to the compensation vectors, so that the result of the in-memory computing chip comes closer to the result under noise-free conditions, the calculation precision is improved, and the chip can be applied to scenarios with higher precision requirements such as image processing.
The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of preferred embodiments, as illustrated in the accompanying drawings.
Drawings
In order to explain the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings required in the embodiments or in the description of the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and a person skilled in the art may obtain other drawings from them without inventive effort. In the drawings:
FIG. 1 is a flow chart of an error compensation method for an in-memory computing chip in an embodiment of the invention;
FIG. 2 illustrates a process of obtaining an index matrix in an embodiment of the invention;
FIG. 3 is a flow chart of a method of error compensation performed in an in-memory computing chip in accordance with an embodiment of the present invention;
FIG. 4 illustrates a design framework of an in-memory computing chip using the compensation method of an embodiment of the present invention;
FIG. 5 illustrates the structure of a super-resolution neural network model in an embodiment of the present invention;
FIG. 6 is a block diagram showing an error compensation apparatus for an in-memory computing chip in an embodiment of the present invention;
Fig. 7 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the solution of the present application better understood by those skilled in the art, the technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein without inventive effort shall fall within the scope of protection of the present application.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
It should be noted that the terms "comprises" and "comprising," and any variations thereof, in the description and claims of the present application and in the foregoing figures, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
The excellent performance of convolutional neural networks in the field of image processing has been widely demonstrated, for example in image classification, object detection, semantic segmentation, image super-resolution, image noise reduction, image dynamic-range fusion, and the like. Because of the process limitations of current analog-chip memory cells, there are differences between cells in the number of electrons they can hold, in impedance, in tunneling voltage and so on. Memory-state programming uses an update-and-verify iteration in which programming of a cell is considered complete when the difference between the read current and the target current falls below a threshold, so programming precision also varies. The target image introduces digital-to-analog conversion errors through the digital-to-analog conversion module, and after the calculation is completed the feature data introduce analog-to-digital conversion errors through the analog-to-digital conversion module. Meanwhile, under the influence of temperature drift and other factors, the charge on a memory cell may change and alter the output current, introducing temperature-drift errors. As a result, the accuracy of the feature map obtained by running the input data through the memory array differs to some extent from that of a digital chip, so current computing-in-memory chips are applied only to tasks with low precision requirements such as classification and not yet to scenarios with higher precision requirements such as image processing.
With the look-up-table compensation method for in-memory computing chips running image-processing neural network models provided by the embodiment of the invention, a gray-level or color-level index matrix is calculated from the target image; a calibration image or calibration pixels are selected, and the calibration reference of each convolution layer of the target neural network model is calculated in the digital domain; the feature map output by each convolution layer of the in-memory computing chip is compared with the calibration reference, and an index value-compensation vector correspondence table over gray levels or color levels is generated according to the index matrix; in the inference stage, according to the index matrix of the target image, the compensation vector corresponding to each pixel of each feature layer is extracted from the index value-compensation vector correspondence table, and the convolution output of each layer of the in-memory computing chip is compensated, which greatly reduces the precision gap between the feature maps computed on the memory array and the results computed by a digital chip.
Fig. 1 shows a flowchart of an error compensation method for an in-memory computing chip in an embodiment of the present invention. The error compensation method is mainly executed on a high-performance computing chip such as a PC, a server or a mobile-phone SoC main control chip, and may of course also be implemented on the CPU core of the in-memory computing SoC chip, in order to obtain the index value-compensation vector correspondence table used for error compensation inside the in-memory computing SoC chip. As shown in FIG. 1, the error compensation method for the in-memory computing chip may include the following steps:
Step S100: inputting a sample image into the target neural network model, and taking the feature map output by each convolution layer as the calibration reference of that convolution layer;
specifically, the target neural network model is a neural network model that has already been trained. One or more sample images may be used; if there are multiple sample images, the results over the multiple sample images are averaged. The calibration image or calibration pixels may be a sample image input directly into the target neural network model, or an abstracted sample image extracted and then input into the target neural network model.
Step S200: acquiring a corresponding index matrix according to the input feature map of the current convolution layer for the sample image;
specifically, after a sample image is input into the target neural network model, each convolution layer takes the output of the preceding layer as its input feature map, and the corresponding index matrix is calculated from the input feature map of the current convolution layer.
Wherein, when gray levels are used the index matrix is calculated from the gray-scale map, and when color levels are used the index matrix is calculated from the RGB values;
step S300: obtaining a calculation result of a current convolution layer corresponding to the input feature map output by an in-memory calculation chip;
The in-memory computing chip is used for realizing target neural network model operation, and comprises a digital domain and an analog domain, wherein the analog domain is mainly an analog circuit, and convolution layer matrix multiplication and addition operation in the neural network model is realized by using the flash memory cell array; the digital domain is mainly used for realizing digital operation, such as operation of a full connection layer and a sampling layer, and an output result of a certain convolution layer output by the analog domain is input into an analog circuit module corresponding to operation of a next convolution layer in the analog domain after error compensation is carried out on the output result;
each layer of convolution layer needs to calculate the corresponding index value-compensation vector corresponding relation table in turn, so that the input feature map is input into the flash memory cell array of the calculation chip in the memory to calculate the corresponding current convolution layer to obtain the calculation result;
step S400: and after comparing the calibration standard with the calculation result, generating an index value-compensation vector corresponding relation table corresponding to the current convolution layer according to the index matrix and the comparison result, wherein the index value-compensation vector corresponding relation table is used for compensating the calculation result output by the simulation domain in the digital domain when the in-memory calculation chip realizes the calculation of the target neural network model.
Specifically, each convolution layer corresponds to one index value-compensation vector correspondence table; two convolution layers may share the same table or have different tables. The index value-compensation vector correspondence table characterizes the compensation vector values corresponding to the gray levels (or color levels) of the different pixels of the input feature map.
The applicant has found through extensive research that, when the weight-programming environment is similar to the test environment, the main error sources of a given in-memory computing chip after the weights have been programmed are the digital-to-analog and analog-to-digital conversion errors. When the input data are the same, the output at the same position changes very little over repeated runs of a chip whose weights have been programmed. Therefore the output of the current convolution layer can be error-compensated based on the index value-compensation vector correspondence table.
In the embodiment of the invention, the calculation result is compensated according to the compensation vector, so that the result of the in-memory computing chip comes closer to the result obtained under noise-free conditions, the calculation precision is improved, and the chip can be applied to scenarios with higher precision requirements such as image processing.
The invention can calibrate and compensate the errors of various neural network models containing convolution layers, and is not limited to 1D convolution, 2D convolution, 3D convolution, fully connected, transposed convolution, channel-separable convolution and dilated convolution neural network operators, nor to directly-connected neural network models, U-shaped neural network models, residual-structure network models, recurrent neural network models, MLP models, Transformer network models, attention-mechanism neural network models, and the like.
It is worth noting that the embodiment of the invention is mainly directed at error correction for in-memory computing chips running image-processing neural network models. A convolutional neural network is a neural network with a limited receptive field, i.e. processing is performed on the neighborhood of the current pixel. Thus, when the same pixel block appears in different images, each convolution layer of the convolutional neural network model produces the same output for it. When the colors of adjacent pixels of the target image are similar, the outputs of each convolution layer are similar, and the pixels that influence the result at the current pixel are mostly its neighboring pixels.
According to the method, the pixel values of the feature map after each convolution layer are compensated according to the input-feature-map pixel and its adjacent pixel values (a preset number of pixels close to the current pixel). Because the compensation vector differs for each convolution layer and for each pixel of each feature layer, a compensation index matrix is computed first so that different feature-layer pixels use different compensation vectors. To reduce the amount of computation and the size of the index matrix, the embodiment of the invention computes the index matrix from the input feature map, but is not limited to this; for example, in a video-stream task, the statistics for each convolution layer may also be based on the feature map output by the previous layer or on the previous frame of the video. That is, for different pixels, according to the index matrix, the corresponding compensation vectors are extracted from the compensation vector table by table look-up and used to compensate the results of each convolution layer or of other AI operators (e.g. fully connected layers, 3D convolution, transposed convolution), so that the results come closer to the computation results under noise-free conditions.
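As an illustration only (it is not part of the original disclosure), the following minimal Python sketch shows what this per-pixel table look-up and compensation could look like; the channel-last array layout and the names conv_out, index_matrix and comp_table are assumptions:

```python
import numpy as np

def compensate_layer_output(conv_out, index_matrix, comp_table):
    """Add the per-index compensation vector to every pixel of the analog-domain output.

    conv_out:     H x W x C output of the flash cell array for the current convolution layer.
    index_matrix: H x W (gray level) or H x W x 3 (color level) index values for this layer.
    comp_table:   dict mapping an index value (int, or 3-tuple for color levels) to a length-C vector.
    """
    out = conv_out.astype(np.float32)
    h, w, _ = out.shape
    for y in range(h):
        for x in range(w):
            idx = index_matrix[y, x]
            key = int(idx) if np.ndim(idx) == 0 else tuple(int(v) for v in idx)
            # look up the compensation vector for this gray/color level and add it element-wise;
            # levels never seen during calibration are left uncompensated
            out[y, x, :] += comp_table.get(key, 0.0)
    return out
```

The compensated feature map is then passed through the digital-domain activation, pooling or sampling operators and used as the input feature map of the next convolution layer, as described above.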
It should be noted that the neural network model may be a directly-connected neural network model, a U-shaped neural network model, a residual-structure network model, a recurrent neural network model, an MLP model, a Transformer network model, an attention-mechanism neural network model, or the like, including but not limited to these.
In addition, the neural network model may include: 1D convolution, 2D convolution, 3D convolution, fully connected, transposed convolution, channel-separable convolution, dilated convolution neural network operators, and the like, including but not limited to these.
The sample image is in a gray-scale image format, a color-transparency image format (RGB_Alpha), a gray-scale chrominance image format (YUV), a hue-saturation-brightness/lightness image format (HSV, HSL), or the like, including but not limited to these.
In an alternative embodiment, the step S200 may include the following:
step I: (1) if the current convolution layer is the first convolution layer, zero-padding the input feature map according to the stride and convolution kernel size of the current convolution layer, in the zero-padding manner of the tensorflow convolution operator;
(2) if the current convolution layer is not the first convolution layer and the preceding layer is a convolution layer, zero-padding the mean-value matrix of the preceding convolution layer according to the stride and convolution kernel size of the current convolution layer, in the zero-padding manner of the tensorflow convolution operator;
(3) if the current convolution layer is not the first convolution layer and the preceding layer is not a convolution layer, zero-padding the pre-mask matrix of the preceding non-convolution layer according to the stride and convolution kernel size of the current convolution layer, in the zero-padding manner of the tensorflow convolution operator;
step II: traversing the zero-padded image pixel by pixel, and computing, channel by channel, the mean of a preset area around each pixel to obtain the mean-value matrix of the current convolution layer;
step III: computing the index matrix of the current convolution layer from the mean-value matrix.
If the step size of the next convolution layer is equal to 1 and the convolution kernel sizes of the next convolution layer and the current convolution layer are the same, the index matrix of the next convolution layer is the same as the index matrix of the current convolution layer.
Specifically, the embodiment of the invention may analyse the input feature map statistically in terms of gray levels or color levels; gray levels and color levels are only one way of analysing and classifying pixels, and the embodiment of the invention is not limited to these two analysis methods. Taking an 8-bit RGB input feature map as an example, the gray-scale map of the input feature map is gray_img = R × 0.299 + G × 0.587 + B × 0.114, where R is the red channel component, G the green channel component and B the blue channel component of the input feature map. The gray-scale values range from 0 to 255, and the 8-bit gray scale can be divided into 26 levels of 10 gray values each, i.e. 0 to 9 is one gray level, 10 to 19 is the next, and so on.
When a color-level index matrix scheme is employed, the R, G and B channel components each range from 0 to 255, and different R/G/B channel interval values may be used to define the color levels; for example, the R/G/B channels may be partitioned with interval values 43, 52 or 64, giving 216, 125 or 64 color levels respectively. With interval 43, each R/G/B channel is divided into 6 levels, 0-42, 43-85, 86-128, 129-171, 172-214 and 215-255, so the pixels of the RGB three-channel input feature map can be divided into 216 color levels. With interval 52, each channel is divided into 5 levels, 0-51, 52-103, 104-155, 156-207 and 208-255, so the pixels can be divided into 125 color levels. With interval 64, each channel is divided into 4 levels, 0-63, 64-127, 128-191 and 192-255, so the pixels can be divided into 64 color levels.
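Purely as a hedged illustration of the gray-level and color-level quantization described in the two preceding paragraphs (the function names and the channel-last H x W x 3 layout are assumptions, not part of the patent text):

```python
import numpy as np

def gray_level_index(rgb, interval=10):
    """Gray = 0.299*R + 0.587*G + 0.114*B, then integer division by the gray interval
    (interval 10 gives the 26 gray levels mentioned above for 8-bit data)."""
    gray = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    return (gray // interval).astype(np.int32)      # H x W matrix with values 0..25

def color_level_index(rgb, interval=43):
    """Quantize each R/G/B channel separately; interval 43 gives 6 levels per channel,
    i.e. 6*6*6 = 216 color levels (interval 52 gives 125, interval 64 gives 64)."""
    return rgb.astype(np.int32) // interval         # H x W x 3 matrix of per-channel levels
```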
Considering the diversity of image-processing neural network model structures and feature-map sizes, feature-map compensation index matrices of different sizes need to be established according to the neural network model, for example depending on whether the model uses convolution strides other than 1, pooling-type operators, or up-/down-sampling operators (bilinear up-/down-sampling, nearest-neighbour up-/down-sampling, etc.). As basic knowledge, different operators with different parameters change the feature-map size correspondingly: a convolution with stride 2 changes the feature-map size to H/2 × W/2; pooling and down-sampling operators reduce the feature-map size; an up-sampling operator enlarges it.
Before porting a neural network model to the in-memory computing chip for inference, the embodiment of the invention computes the sizes of the required feature-map index matrices: the length and width of an index matrix are identical to those of the corresponding feature map; with gray levels the number of channels is 1, with color levels it is 3. The first feature-map index matrix is computed from the input image.
For example, referring to FIG. 2, the index matrix calculation flow is as follows (a code sketch summarizing the per-layer computation is given after this list):
1. If gray levels are used, the gray-scale map of the input RGB image is computed for the index-matrix calculation; if color levels are used, the input RGB image itself is used for the index-matrix calculation. (For convenience, the input gray-scale map or input RGB map required for the first index-matrix calculation is hereinafter referred to collectively as the input image.)
2. The first network layer of the neural network model, a convolution layer, is analysed to determine whether its stride s1 equals 1. If s1 is not equal to 1, say s1 = 2, then by the usual rules for setting convolution-operator parameters the convolution kernel size is at least 3; assume the kernel size (k1, k2) = (3, 3). With gray levels, zeros are padded around the gray-scale map in the zero-padding manner of the convolution operator; with color levels, zeros are padded around the input RGB image in the same way. If the stride s1 equals 1, it is checked whether the kernel size (k1, k2) is 1; if it is not 1, zeros are padded around the input image in the zero-padding manner of the convolution operator. For example, for kernel size (k1, k2) = (3, 3) one row of zeros is padded around the input image; for (k1, k2) = (1, 1) the input image is not padded;
3. The input image is traversed pixel by pixel over the gray-scale map or input RGB map with stride s1, and the per-channel mean of the (k1, k2) area around each pixel is computed as the value of the corresponding position of the first mean-value matrix. For example, with gray levels and kernel size (k1, k2) = (3, 3), the gray mean of the 3 × 3 region around each pixel of the input gray-scale map is computed; with color levels and kernel size (3, 3), the R/G/B three-channel means of the 3 × 3 region around each pixel of the input RGB map are computed channel by channel. Thus, when the input image size is H × W × C, the first mean-value matrix is also H × W × C; with gray levels C = 1, with color levels C = 3.
4. The index matrix is computed from the mean-value matrix obtained in step 3. With gray levels and a gray interval of 10, the gray index matrix is gray_level = pixel_mean // 10 ("//" denotes integer division), where pixel_mean is the mean gray value of each pixel, i.e. the value of that pixel in the mean-value matrix. With color levels, the R/G/B channels all use the interval value x, and the color index matrix is color_level = (pixel_mean_r // x, pixel_mean_g // x, pixel_mean_b // x), where pixel_mean_r, pixel_mean_g and pixel_mean_b are the R/G/B channel means of each pixel in the mean-value matrix. It should be noted that the gray interval may be chosen freely according to the application requirements, at equal or unequal spacing; this is not limited by the embodiment of the invention.
5. This yields the index matrix of the first convolution layer. Whether a second index matrix needs to be computed is then decided from the second network layer of the neural network model. The following cases discuss how the second index matrix required by the second network layer is computed.
6. If the second network layer is again a convolution layer, its stride s2 equals 1 and its kernel size is the same as that of the first convolution, the index matrix of the first layer is reused. Subsequent convolutions are analysed in the same way: if all convolution strides s equal 1 and the kernel sizes are the same, there is one and only one index matrix.
7. If the second network layer is still a convolution layer but its stride s2 is not equal to 1, or its kernel size differs from that of the first convolution, computation of the second index matrix begins. Assume the stride s2 of the second convolution equals 2 and the kernel size is (k3, k4) = (3, 3); the first mean-value matrix is zero-padded in the zero-padding manner of the tensorflow convolution operator, for example one row of zeros may be added on the right and at the bottom of the first mean-value matrix. If the stride s2 equals 2 and the kernel size is (k3, k4) = (5, 5), then, again taking the tensorflow zero-padding manner as an example, two rows of zeros are added on the right and at the bottom of the first mean-value matrix and one row on the left and at the top.
8. The zero-padded first mean-value matrix is traversed pixel by pixel with stride s2 = 2, and the per-channel mean of the (k3, k4) area around each pixel is computed as the value of the corresponding position of the second mean-value matrix. For example, with gray levels and kernel size (k3, k4) = (5, 5), the gray mean of the 5 × 5 area around each pixel of the zero-padded first mean-value matrix is computed with stride s2 = 2; with color levels, the R/G/B three-channel means of the 5 × 5 area are computed with stride s2 = 2. Thus, when the input image size is H × W × C, the second mean-value matrix is H/2 × W/2 × C; with gray levels C = 1, with color levels C = 3.
9. After the second mean-value matrix has been computed, the second index matrix is computed from it by the method of step 4 and used as the index matrix of the second convolution layer.
10. If the second network layer is a pooling operator, for example an average-pooling, max-pooling or min-pooling layer, the pooling layer is computed in the digital domain of the in-memory computing chip and needs no compensation. However, a pooling layer is generally followed by a convolution layer, i.e. the third network layer is a convolution layer, and the index matrix of that third layer must be computed. First the pooling layer is analysed: if the pooling stride is s2 and the pooling kernel size is (k3, k4), the first mean-value matrix is zero-padded according to the stride and kernel size by the method of step 7, again taking the tensorflow zero-padding manner as an example. Then, with stride s2, the mean, maximum or minimum of the gray values or of the R/G/B channels over the k3 × k4 area around each pixel of the first mean-value matrix is computed, according to the type of pooling operator, to obtain the second pre-mask matrix: the mean is taken for an average-pooling layer, the maximum for a max-pooling layer and the minimum for a min-pooling layer. Assuming stride s2 = 2, when the input image size is H × W × C the second pre-mask matrix is H/2 × W/2 × C; with gray levels C = 1, with color levels C = 3.
11. For the convolution layer that follows the pooling layer, i.e. the third network layer is a convolution layer with stride s3 = 1 and, say, kernel size (k5, k6) = (3, 3), the index matrix of the third convolution is computed from the second pre-mask matrix.
12. Specifically, when computing the index matrix of the third convolution in step 11, by the method of step 2 the second pre-mask matrix is zero-padded in the tensorflow zero-padding manner according to the convolution stride s3 = 1 and kernel size (k5, k6) = (3, 3).
13. By the method of step 3, the zero-padded second pre-mask matrix is traversed pixel by pixel and the per-channel mean of the (k5, k6) area around each pixel is computed as the value of the corresponding position of the mean-value matrix of the third convolution. When the second pre-mask matrix is H/2 × W/2 × C, the mean-value matrix of the third convolution is also H/2 × W/2 × C; with gray levels C = 1, with color levels C = 3.
14. By the method of step 4, the index matrix of the third convolution can then be computed from the mean-value matrix of the third convolution.
15. If the second network layer is a down-sampling operator (such as bilinear or nearest-neighbour down-sampling), it is likewise computed in the digital domain of the in-memory computing chip and needs no compensation. As in step 11, the down-sampling layer is immediately followed by a convolution layer, i.e. the third network layer is a convolution layer with stride s3 = 1 and, say, kernel size (k5, k6) = (3, 3). For a second layer that is a down-sampling operator, the corresponding pre-mask matrix is obtained by computing means as in steps 10 to 14.
16. Assuming stride s3 = 1 and kernel size (k5, k6) = (3, 3), the pre-mask matrix obtained in step 15 is first zero-padded, the corresponding mean-value matrix is computed, and the index matrix required by the third convolution is then computed.
17. The above completes the computation of the index matrix for the different cases of the second network layer.
18. Considering the diversity of neural network models, feature-map-enlarging operators such as up-sampling operators (bilinear up-sampling, nearest-neighbour up-sampling, etc.) also occur later in the network. In practical neural network models, a feature-map-enlarging operator is executed in the digital domain and may be followed by a convolution layer that keeps the feature-map size unchanged, with convolution stride s = 1. In a U-shaped network structure, the first half of the network reduces the feature map and the second half enlarges it; the enlarged feature map is usually added to or concatenated with an earlier feature map, i.e. its size equals that of a preceding network layer. In that case the convolution layer following the up-sampling operator uses the index matrix of the corresponding earlier convolution layer, which guarantees that the feature maps of the two convolutions have the same size.
19. If the second network layer is an up-sampling operator (bilinear or nearest-neighbour up-sampling, etc.), it is also computed in the digital domain of the in-memory computing chip and needs no compensation. As in step 11, the up-sampling layer is immediately followed by a convolution layer, i.e. the third network layer is a convolution layer with stride s3 = 1 and, say, kernel size (k5, k6) = (3, 3). In this case the feature-map size of the third convolution differs from that of the first convolution. By the method of step 15, for a second layer that is an up-sampling operator, nearest-neighbour up-sampling may be used to enlarge the first mean-value matrix, the corresponding pre-mask matrix is computed, and the index matrix of the third convolution is then computed.
20. If the second network layer is a convolution with stride s2 equal to 2, its kernel size is the same as that of the subsequent convolutions and the subsequent convolution strides are 1, the subsequent convolutions take over the index matrix of the second convolution, and the neural network model has only two index matrices.
21. If a subsequent convolution stride is not equal to 1, steps 2 to 20 are repeated with the current convolution treated as the first convolution of those steps, and the index matrices of the subsequent layers are analysed and computed accordingly.
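The code sketch referred to above summarizes, under the assumption of tensorflow-style 'SAME' zero padding, the per-layer mean-value matrix and index-matrix computation of steps 2 to 4 (and, applied to a previous mean-value or pre-mask matrix, of steps 7 to 9); the function names are illustrative only:

```python
import numpy as np

def mean_matrix(prev, kernel, stride):
    """Zero-pad 'prev' (H x W x C) in the tensorflow 'SAME' manner, then take the per-channel
    mean of the kernel-sized neighbourhood around each sampled pixel."""
    kh, kw = kernel
    h, w, c = prev.shape
    out_h, out_w = -(-h // stride), -(-w // stride)                  # ceil(H/s), ceil(W/s)
    pad_h = max((out_h - 1) * stride + kh - h, 0)
    pad_w = max((out_w - 1) * stride + kw - w, 0)
    top, left = pad_h // 2, pad_w // 2                               # extra row/column goes bottom/right
    padded = np.pad(prev.astype(np.float32),
                    ((top, pad_h - top), (left, pad_w - left), (0, 0)))
    mean = np.zeros((out_h, out_w, c), dtype=np.float32)
    for y in range(out_h):
        for x in range(out_w):
            patch = padded[y * stride:y * stride + kh, x * stride:x * stride + kw, :]
            mean[y, x, :] = patch.mean(axis=(0, 1))                  # per-channel neighbourhood mean
    return mean

def quantize_to_index(mean, interval=10):
    """Turn a mean-value matrix into an index matrix: gray levels for C = 1, color levels for C = 3."""
    return (mean // interval).astype(np.int32)
```

For the first convolution layer, prev would be the input gray-scale map (H x W x 1) or RGB map (H x W x 3); for later layers it would be the mean-value matrix or pre-mask matrix of the preceding layer, as in steps 7 to 16.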
In an alternative embodiment, the step S400 may include the following:
step 1: comparing the calibration reference with the calculation result;
step 2: generating the index value-compensation vector correspondence table for the current convolution layer according to the index matrix and the comparison result.
Specifically, after the calibration reference has been compared with the calculation result, the differences are grouped according to the index matrix and averaged, which yields the corresponding index value-compensation vector correspondence table.
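A minimal sketch of this grouping-and-averaging step (an assumed helper, not the patent's literal implementation; calib_out, feature_npu and index_matrix stand for the calibration reference, the chip output and the index matrix of the current layer):

```python
import numpy as np

def build_compensation_table(calib_out, feature_npu, index_matrix):
    """Group the per-pixel differences (calibration reference minus chip output) by index value
    and average them channel-wise, yielding one length-C compensation vector per index value."""
    diff = calib_out.astype(np.float32) - feature_npu.astype(np.float32)   # H x W x C differences
    h, w, _ = diff.shape
    keys = index_matrix.reshape(h * w, -1)      # one value (gray level) or three values (color level) per pixel
    flat = diff.reshape(h * w, -1)
    groups = {}
    for key_row, d in zip(keys, flat):
        key = int(key_row[0]) if key_row.size == 1 else tuple(int(v) for v in key_row)
        groups.setdefault(key, []).append(d)
    return {k: np.mean(v, axis=0) for k, v in groups.items()}              # index value -> compensation vector
```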
It should be noted that, when generating the calibration reference, the invention may compute it from a full-frame input image, from a number of pixels and their surrounding pixel blocks randomly selected for each gray level or color level, or from a custom-made input image. The current convolution feature map output by the in-memory computing chip can, after being calibrated with the computed compensation vector table, be fed into the subsequent convolution layer; computing and calibrating layer by layer in this way avoids the accumulation and propagation of errors.
The calculation flow of the calibration reference generation method is described as follows:
1. With gray levels, one test image is selected such that all gray levels of the index matrices are covered; with color levels, several test images are selected so that all color levels of the index matrices are covered;
2. all index matrices of the test image are computed according to the index matrix calculation flow described above;
3. after the test image has been zero-padded, it is run in software in the digital domain, on a PC, server or the like, through the neural network model, being fed sequentially into each network layer of the target neural network model, and the output feature map of each convolution layer or fully connected layer is taken as the calibration reference calib_out_i, where i denotes the i-th convolution layer.
4. The flash memory cell array of the in-memory computing chip is a two-dimensional matrix, and the calculation result of each convolution layer is the output vector obtained by multiplying the input vector of the flash cells by the matrix pre-stored in the flash cells. The input vector is obtained by rearranging the pixels of the input image or input feature map together with their neighbouring regions into vectors, a process called img2col (a code sketch of this rearrangement is given after this list). For example, if the input RGB image has 3 channels and the first convolution kernel is 3 × 3, the values taking part in the computation for one pixel are 3 × 3 × 3 = 27 numbers around that pixel, which are rearranged into a 27-dimensional input vector. Assuming the output dimension of the first convolution is C1 = 32, the output vector has dimension 32. That is, if the stride of the first convolution is s = 1 and the kernel size is 3 × 3, the output feature map of the first convolution has size H × W × 32.
5. After the img2col rearrangement of the test image, the data are fed pixel by pixel into the flash memory cell array of the in-memory computing chip for computation, which outputs the vector of the corresponding pixel of the current convolution feature map, of size 1 × 1 × C1, here with C1 = 32 as an example. The first-convolution output feature map feature_npu_1 computed by the in-memory computing chip is compared with the digital-domain software result calib_out_1, and the difference diff_out_1 is grouped according to the first index matrix and averaged to obtain the compensation vector table. For example, with gray levels, each pixel of the first index matrix falls into one of 26 levels; the pixels whose index value is 0 form one class, those with value 1 another, and so on up to 25, and the differences of each gray-level class are averaged to give one compensation vector of dimension C1 = 32, so the table consists of 26 compensation vectors of dimension C1 = 32. With color levels, each pixel of the first index matrix falls into one of 216 levels; the pixels whose index value is (0, 0, 0) form one class, those with (0, 0, 1) another, and so on through (5, 5, 4) and (5, 5, 5), and the differences of each color-level class are averaged to give one compensation vector of dimension C1 = 32, so the table consists of 216 compensation vectors of dimension C1 = 32.
6. Using the compensation vector table (index value-compensation vector correspondence table) of the first-layer convolution obtained in step 5, the first-layer convolution output feature_npu_1 produced by the flash memory cell array of the in-memory computing chip is calibrated to obtain feature_calib_1; after activation functions such as ReLU and digital-domain AI operators (such as pooling and up/down-sampling operators), feature_act_1 is obtained and used as the input feature map of the second-layer convolution, whose size is assumed to be H×W×32;
7. After img2col rearrangement, the input feature map of the second-layer convolution is input pixel by pixel into the corresponding flash memory cell array of the in-memory computing chip for calculation, which outputs, for each pixel of the current convolution feature map, a vector of scale 1×1×C2 (taking C2 = 96 as an example). The digital-domain soft-run output feature map of the second-layer convolution is calib_out_2, which serves as the calibration reference of the second-layer convolution. The second-layer convolution output feature map feature_npu_2 calculated by the in-memory computing chip is compared with calib_out_2, and the difference diff_out_2 is averaged per gray-level or color-level class according to the index matrix corresponding to the second-layer convolution, yielding the compensation vector table. For example, with gray levels the table consists of 26 compensation vectors of dimension C2 = 96; with color levels it consists of 216 compensation vectors of dimension C2 = 96. Following step 6, the calibrated second-layer output feature_calib_2 and the post-processed feature_act_2 are then obtained.
8. Following steps 6 and 7, for each subsequent convolution layer or fully connected layer of the neural network model, the output feature map feature_npu_i of the in-memory computing chip is compared with the feature map calib_out_i of the corresponding digital-domain soft-run convolution layer or fully connected layer, and the difference diff_out_i is averaged per gray-level or color-level class according to the index matrix of that convolution layer to obtain the compensation vector table of that layer (where i denotes the i-th convolution layer).
9. The above steps are repeated until the compensation vector tables of all convolution layers or fully connected layers of the neural network model have been calculated.
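As a reading aid for step 4, the img2col rearrangement can be sketched in Python roughly as follows. This is an illustrative sketch only, not the patented implementation: the helper name img2col, the random test data and the stand-in weight matrix are assumptions for illustration, and the analog-domain multiply of the flash memory cell array is simulated here with an ordinary matrix product.

import numpy as np

def img2col(feature_map, k=3, stride=1):
    # Rearrange each k x k neighborhood of an H x W x C map into one column
    # of length k*k*C; with stride 1 this yields one column per pixel.
    h, w, c = feature_map.shape
    pad = k // 2
    padded = np.pad(feature_map, ((pad, pad), (pad, pad), (0, 0)))
    cols = []
    for i in range(0, h, stride):
        for j in range(0, w, stride):
            patch = padded[i:i + k, j:j + k, :]       # k x k x C neighborhood
            cols.append(patch.reshape(-1))            # flatten to a k*k*C vector
    return np.stack(cols)                             # (number of pixels, k*k*C)

# For an RGB input (C = 3) and a 3 x 3 kernel each column has 27 elements,
# matching the 27-dimensional input vector of step 4; multiplying by the
# 27 x 32 matrix stored in the flash cell array gives the 32-dimensional
# output vector of the first convolution layer.
rgb = np.random.rand(8, 8, 3).astype(np.float32)
weights = np.random.rand(27, 32).astype(np.float32)  # stand-in for the stored matrix
feature_npu_1 = (img2col(rgb) @ weights).reshape(8, 8, 32)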
In an alternative embodiment, during calibration reference generation, the output feature_npu_i of the in-memory computing chip is stored and compared with calib_out_i of the corresponding digital-domain soft-run convolution layer to obtain the difference diff_out_i = calib_out_i - feature_npu_i; this difference is then averaged per gray level or color level according to the index matrix of the corresponding convolution layer to obtain the compensation vector table vector_calib_conv_i of that convolution layer (where i denotes the i-th convolution layer).
The feature map feature_calib_i calibrated for the current convolution layer and, after processing by the activation layer, the feature map feature_act_i are used as the input calib_in_i of the next convolution layer for calculating the compensation vector table of the next convolution layer.
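A minimal sketch of the compensation vector table construction just described follows, assuming a gray-level index matrix with 26 classes as in the example of step 5; calib_out, feature_npu and index_matrix are illustrative array names rather than identifiers defined by the invention.

import numpy as np

def build_compensation_table(calib_out, feature_npu, index_matrix, num_levels=26):
    # Average the per-pixel difference diff_out_i = calib_out_i - feature_npu_i
    # over all pixels that share the same index value, giving one compensation
    # vector of dimension C per gray level.
    h, w, c = calib_out.shape
    diff = calib_out - feature_npu
    table = np.zeros((num_levels, c), dtype=calib_out.dtype)
    for level in range(num_levels):
        mask = (index_matrix == level)                # pixels belonging to this level
        if mask.any():
            table[level] = diff[mask].mean(axis=0)    # averaged difference -> compensation vector
    return table                                      # e.g. shape (26, 32) for C1 = 32

For color levels the same loop would run over 216 flattened classes (six levels per channel) instead of 26.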
The embodiment of the invention also provides an error compensation method for an in-memory computing chip, where the in-memory computing chip is used to implement the target neural network model operation and comprises a digital domain and an analog domain. The digital domain is realized by digital circuitry, which may be a processor such as a GPU; the analog domain uses analog circuitry and mainly comprises a flash memory cell array for performing the matrix multiply-add operations of the neural network, together with its supporting circuits. The error compensation method is applied to the digital domain and, referring to fig. 3, may comprise the following steps:
step S1: acquiring a corresponding index matrix according to an input feature map of a current convolution layer;
Specifically, for a given convolution layer of the convolutional neural network model, the corresponding index matrix is obtained from the input feature map of that convolution layer.
Step S2: obtaining compensation vectors corresponding to all pixels of the input feature image based on a pre-obtained index value-compensation vector corresponding relation table of the current convolution layer according to the index matrix;
specifically, the index value-compensation vector corresponding relation table of the current convolution layer is obtained by a method shown in fig. 1;
step S3: obtaining a calculation result corresponding to the input feature map output by a flash memory cell array corresponding to the current convolution layer in an analog domain;
Specifically, when the analog domain performs the operation, the operation result of the current convolution layer is output to the digital domain; the digital domain receives the calculation result, compensates it and applies the corresponding processing, and then feeds it back to the analog domain as the input feature map of the next convolution layer. The next convolution layer is computed in the analog domain, its result is again passed to the digital domain for compensation and processing, and so on, until the operation of the corresponding convolutional neural network model is completed.
Step S4: and compensating the calculation result according to the compensation vector.
By adopting this technical scheme, the calculation result of the in-memory computing chip is closer to the calculation result under noise-free conditions, the calculation precision is improved, and the chip can be applied to scenarios with higher precision requirements such as image processing.
It should be noted that the neural network model may be a directly connected neural network model, a U-shaped neural network model, a residual structure network model, a recurrent neural network model, an MLP model, a Transformer network model, or an attention mechanism neural network model, including but not limited to these.
In addition, the neural network model may include operators such as 1D convolution, 2D convolution, 3D convolution, fully connected layers, transposed convolution, channel-separable convolution, and dilated convolution, including but not limited to these.
The input feature map may be in a grayscale format, a color-with-transparency (RGB_alpha) format, a luminance-chrominance image format (YUV), or a hue-saturation-brightness image format (HSV, HSL), etc., including but not limited to these.
In an alternative embodiment, this step S1 may include the following:
Step I: (1) If the current convolution layer is the first convolution layer, zero-pad the input feature map according to the stride and convolution kernel size of the current convolution layer, following the zero-padding scheme of the tensorflow convolution operator;
(2) If the current convolution layer is not the first convolution layer and the preceding layer is a convolution layer, zero-pad the average value matrix of the preceding convolution layer according to the stride and convolution kernel size of the current convolution layer, following the zero-padding scheme of the tensorflow convolution operator;
(3) If the current convolution layer is not the first convolution layer and the preceding layer is a non-convolution layer, zero-pad the mask front matrix of the preceding non-convolution layer according to the stride and convolution kernel size of the current convolution layer, following the zero-padding scheme of the tensorflow convolution operator;
Step II: traverse the zero-padded image pixel by pixel and compute, per channel, the average of a preset area around each pixel to obtain the average value matrix of the current convolution layer;
Step III: compute the index matrix of the current convolution layer from the average value matrix.
The technical details of the index matrix acquisition are referred to above, and are not described herein.
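For reference, a rough Python sketch of this index matrix acquisition follows. It assumes a 3×3 neighborhood average and a quantization of the channel-averaged value into 26 gray levels, both taken from the examples given earlier; the padding rule of the tensorflow convolution operator is approximated here with symmetric zero padding, and the 0-255 value range is an assumption.

import numpy as np

def index_matrix(image, k=3, num_levels=26):
    # Step I: zero-pad; Step II: per-channel k x k neighborhood average;
    # Step III: quantize the channel-averaged value into index values.
    h, w, c = image.shape
    pad = k // 2
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)))
    mean = np.zeros((h, w, c), dtype=np.float32)
    for i in range(h):
        for j in range(w):
            mean[i, j] = padded[i:i + k, j:j + k].reshape(-1, c).mean(axis=0)
    levels = (mean.mean(axis=2) / 256.0 * num_levels).astype(np.int32)
    return np.clip(levels, 0, num_levels - 1), mean   # index matrix and average value matrix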
In an alternative embodiment, the error compensation method for the in-memory computing chip may further include the following step: transmitting the compensated calculation result, as the input feature map of the next convolution layer of the target neural network model, to the flash memory cell array corresponding to the next convolution layer in the analog domain for calculation.
The embodiment of the invention also provides an in-memory computing chip, which comprises a digital domain and an analog domain, where the analog domain is used for performing the matrix multiply-add operations of the target neural network model and the digital domain is at least used for performing the error compensation method shown in fig. 3.
Of course, operations of the convolutional neural network other than the convolution layers, such as sampling layers and fully connected layers, may also be performed in the digital domain.
FIG. 4 illustrates an in-memory computing chip design framework using the compensation method of the embodiment of the present invention. As shown in fig. 4, the image data corresponding to a certain convolution layer is processed by img2col and then input into the memory array for analog-domain calculation; the data is also converted into an index matrix, from which a compensation vector is obtained through the compensation vector table (i.e. the above-mentioned correspondence table); the compensation vector is combined with the output of the memory array to compensate that output, improving the calculation precision of this convolution layer.
It should be noted that, the integrated memory chip provided by the embodiment of the invention can be applied to various electronic devices, such as: smart phones, tablet electronics, network set-top boxes, portable computers, desktop computers, personal Digital Assistants (PDAs), vehicle-mounted devices, smart wearable devices, toys, smart home control devices, pipeline device controllers, and the like. Wherein, intelligent wearing equipment can include intelligent glasses, intelligent wrist-watch, intelligent bracelet etc..
For a better understanding of the present application, the following description will illustrate specific implementation procedures in the embodiments of the present invention with reference to fig. 4 and 5:
In a practical application scenario, in inference mode, the flow of calibrating the output of the in-memory computing chip using the compensation vector table of each convolution layer is as follows:
1. Compute the index matrices of the input image according to the neural network model; the index matrices may be classified by gray level or color level;
2. According to the position of the current convolution layer in the neural network model, input the RGB image or the output feature map of the preceding network, pixel by pixel after img2col rearrangement, into the memory computing array of the in-memory computing chip corresponding to this convolution for calculation;
3. According to the index matrix corresponding to the current convolution layer, look up, pixel by pixel, the gray level or color level corresponding to each pixel of the convolution output feature map;
4. Use the gray level or color level as an index into the compensation vector table to determine the compensation vector, and calibrate the output of the memory array pixel by pixel; the feature map feature_act_i obtained after post-processing such as the activation layer is then used as the input calib_in_i of the next convolution layer;
5. The above steps can be expressed by the following formula, where j is the convolution layer index and P_i denotes the i-th pixel:
calib_in(j+1) = ReLU[ G_NPU_j( F_img2col(P_i) ) + vector_calib_conv(j) ]
where F_img2col rearranges the elements participating in the convolution calculation into columns and feeds them into the memory array for calculation, and G_NPU_j denotes the calculation performed for the j-th layer on the in-memory computing NPU.
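Under the assumptions of the earlier sketches (the illustrative img2col, index_matrix and compensation table helpers), this per-layer formula can be written in Python roughly as follows; npu_matmul stands in for the analog-domain multiply-add of the flash memory cell array and is simulated with an ordinary matrix product, and the index matrix is recomputed from the layer input for simplicity although, as described above, some layers may share index matrices.

import numpy as np

def npu_matmul(cols, weights):
    # G_NPU_j in the formula: the j-th layer computed on the in-memory array
    # (simulated here in software).
    return cols @ weights

def run_layer(feature_in, weights, table, h, w, c_out):
    cols = img2col(feature_in)                        # F_img2col(P_i)
    out = npu_matmul(cols, weights).reshape(h, w, c_out)
    idx, _ = index_matrix(feature_in)                 # index matrix for this layer's input
    calibrated = out + table[idx]                     # + vector_calib_conv(j)
    return np.maximum(calibrated, 0.0)                # ReLU -> calib_in(j+1)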
Embodiments of the present invention are illustrated below using an example of a super-resolution network running on an in-memory computing chip. Note that the super-resolution network model is only one application scenario of the present invention. The invention can be applied to AI tasks such as image classification, object detection, semantic segmentation, image super-resolution, image denoising, and image dynamic range fusion, including but not limited to these tasks.
Referring to fig. 5, the first-layer convolution kernel size is (k1, k1) = 3×3, with stride s1 = 1 and output feature map channel count C1 = 28; the second-layer convolution kernel size is (k2, k2) = 3×3, with stride s2 = 1 and output channel count C2 = 48; the third-layer convolution kernel size is (k3, k3) = 3×3, with stride s3 = 2 and output channel count C3 = 96; the fourth-layer convolution kernel size is (k4, k4) = 3×3, with stride s4 = 1 and output channel count C4 = 96; the fifth-layer convolution kernel size is (k5, k5) = 3×3, with stride s5 = 1 and output channel count C5 = 96; the sixth-layer convolution kernel size is (k6, k6) = 3×3, with stride s6 = 1 and output channel count C6 = 96; the seventh-layer convolution kernel size is (k7, k7) = 3×3, with stride s7 = 1 and output channel count C7 = 48. Concat splices the input images along the channel dimension. d2sx4 is short for depth_to_space_x4, i.e. the feature map is expanded from the channel dimension into the spatial dimensions, turning an H/4×W/4×48 feature map into an H×W×3 feature map; d2sx2 is short for depth_to_space_x2, turning an H/2×W/2×12 feature map into an H×W×3 feature map. The convolution layers all run in the analog domain of the in-memory computing chip, while the other operators run in the digital domain, so only the compensation flow of the convolution layers is described. The input image (LR) stands for low resolution (low_resolution), and the output image (SR) stands for super resolution (super_resolution).
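As a reading aid only (not part of the patented scheme), the seven convolution layers of fig. 5 can be written out as a configuration list, together with a sketch of the depth_to_space operator; the dictionary keys and the helper name are illustrative.

import numpy as np

SR_LAYERS = [
    {"name": "conv1", "kernel": (3, 3), "stride": 1, "out_channels": 28},
    {"name": "conv2", "kernel": (3, 3), "stride": 1, "out_channels": 48},
    {"name": "conv3", "kernel": (3, 3), "stride": 2, "out_channels": 96},
    {"name": "conv4", "kernel": (3, 3), "stride": 1, "out_channels": 96},
    {"name": "conv5", "kernel": (3, 3), "stride": 1, "out_channels": 96},
    {"name": "conv6", "kernel": (3, 3), "stride": 1, "out_channels": 96},
    {"name": "conv7", "kernel": (3, 3), "stride": 1, "out_channels": 48},
]

def depth_to_space(x, r):
    # Rearrange an H x W x (C*r*r) map into an (H*r) x (W*r) x C map,
    # i.e. the d2sx4 (r = 4) and d2sx2 (r = 2) operators of fig. 5.
    h, w, crr = x.shape
    c = crr // (r * r)
    x = x.reshape(h, w, r, r, c)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(h * r, w * r, c)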
The compensation scheme of this super-resolution neural network model on the in-memory computing chip is analyzed using the index matrix calculation technique, the calibration reference generation technique, the convolution layer compensation vector table generation technique and the inference-mode compensation flow described above. The first-layer and second-layer convolutions have the same kernel size and stride, so both use the first index matrix. The third-layer convolution has stride s3 = 2, so a second index matrix is calculated; since the third to seventh convolution kernel sizes are all 3×3 and the fourth to seventh convolution strides are all 1, the third to seventh convolutions all use the second index matrix.
The calibration reference image is input sequentially into a digital domain (this may be the digital domain of the in-memory computing chip, but to reduce chip area and power consumption it may also be realized by a PC, a server, or the like), and the feature map output by each convolution layer is used as its calibration reference. The calibration reference image is then input sequentially into the super-resolution neural network model on the in-memory computing chip; each layer in turn calculates its compensation vector table, calibrates the output feature map of the current convolution layer, and feeds the post-processed result into the next convolution layer. The super-resolution neural network model of this example has 7 convolution layers, so each convolution layer has its own compensation vector table.
In inference mode, the image to be super-resolved is input into the in-memory computing chip onto which the neural network model has been programmed; for the output feature map of each convolution layer, the index matrix is looked up pixel by pixel and the correct compensation vector is selected to calibrate the output feature map.
In summary, the compensation scheme provided by the embodiment of the invention greatly improves the operation precision of the in-memory computing chip in application fields requiring high-precision neural network models, such as image processing, so that the chip becomes practical for high-precision image processing while its advantages of low power consumption and high computing power are brought into play.
It should be noted that the storage medium of the in-memory computing chip is not limited to a phase change memory (PRAM), a static random access memory (SRAM), a dynamic random access memory (DRAM), another type of random access memory (RAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), flash memory or another memory technology, a compact disc read-only memory (CD-ROM), a digital versatile disc (DVD) or other optical storage, a magnetic cassette, magnetic tape or disk storage or another magnetic storage device, or any other non-transmission medium that can be used to store information accessible by a computing device.
The embodiment of the invention uses an input image to calculate the compensation index matrix of a neural network operator on the in-memory computing chip, where the input image format is not limited to grayscale, color-with-transparency (RGB_alpha), luminance-chrominance (YUV), or hue-saturation-brightness (HSV, HSL) formats, and the index matrix may be calculated for different image formats;
Embodiments of the present invention are not limited to computing the compensation index matrix, calibration references and compensation vectors from the current input image; for example, a video stream task may also base them on the previous frame, with the statistics of each convolution layer analyzed from the feature map output by the previous convolution layer. That is, any method in which different pixels, according to the index matrix, extract their compensation vectors from the compensation vector table by table lookup to compensate the results of each convolution layer or other AI operator (e.g. fully connected layer, 3D convolution, transposed convolution), so that the results are closer to the calculation results under noise-free conditions, falls within the scope of the present invention.
The invention calculates the compensation vector table needed by each convolution layer for the input image with reference to the structure of the neural network model, and may compute the average, maximum and minimum values of pixel neighborhoods at each scale of the input image; it is not limited to the pixel-by-pixel index matrix calculation technique;
In addition, for each convolution layer, fully connected layer and the like, zeros are padded around the input image and the average value matrix to calculate the index matrix over a pixel-by-pixel 3×3 neighborhood; the neighborhood is not limited to this size and may be of any size. Likewise, the number of padded rows around the input image and the average value matrix is not limited, and the data used to extend the image border is not limited to zeros and may be any data;
Of course, the neural network structures to which the embodiment of the invention can be applied are not limited to neural network operators such as 2D convolution, 3D convolution, fully connected layers, transposed convolution, channel-separable convolution and dilated convolution, nor to directly connected neural networks, U-shaped neural networks, residual structure networks, attention networks, recurrent neural networks, MLPs, Transformer networks and the like;
The embodiment of the invention obtains the compensation vector by pixel-by-pixel table lookup in the inference stage; it is not limited to this way of obtaining the compensation vector table, and any table-lookup compensation method and chip design scheme adopting it falls within the protection scope of the invention;
In addition, to avoid error propagation and accumulation, the embodiment of the invention calculates the compensation vector table layer by layer, calibrates the output feature map of the current layer, and then feeds the calibrated output feature map, after post-processing, into the next layer;
The calibration reference can be generated from the whole-frame input image, or from a number of randomly selected pixels and their surrounding pixel blocks for each gray level or color level.
Based on the same inventive concept, the embodiments of the present application also provide an error compensation device for an in-memory computing chip, which may be used to implement the method described in the above embodiments, as described in the following embodiments. Since the principle of solving the problem of the error compensation device for the in-memory computing chip is similar to that of the above method, the implementation of the error compensation device for the in-memory computing chip can be referred to the implementation of the above method, and the repetition is omitted. As used below, the term "unit" or "module" may be a combination of software and/or hardware that implements the intended function. While the means described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
FIG. 6 is a block diagram of an error compensation apparatus for an in-memory computing chip in an embodiment of the present invention. The error compensation device for the in-memory computing chip comprises: the system comprises a calibration reference acquisition module 10, an index matrix acquisition module 20, an in-memory calculation result acquisition module 30 and a compensation vector table generation module 40.
The calibration standard acquisition module 10 inputs the sample image into a target neural network model, and takes the characteristic diagram output by each convolution layer as a calibration standard of the convolution layer;
the index matrix acquisition module 20 acquires a corresponding index matrix according to the input feature map of the current convolution layer of the corresponding sample image;
the in-memory computing result obtaining module 30 obtains a computing result of a current convolution layer corresponding to the input feature map output by an in-memory computing chip, where the in-memory computing chip is used to implement a target neural network model operation, and the input feature map is input into a flash memory cell array corresponding to the current convolution layer in an analog domain of the in-memory computing chip to perform operation to obtain the computing result;
after comparing the calibration standard with the calculation result, the compensation vector table generating module 40 generates an index value-compensation vector corresponding relation table for the current convolution layer according to the index matrix and the comparison result, where the index value-compensation vector corresponding relation table is used for compensating, in the digital domain, the calculation result output by the analog domain when the in-memory computing chip implements the target neural network model operation.
In the embodiment of the invention, the calculation result is compensated according to the compensation vector, so that the calculation result of the in-memory computing chip is closer to the calculation result under noise-free conditions, the calculation precision is improved, and the in-memory computing chip can be applied to scenarios with higher precision requirements such as image processing.
The apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. A typical implementation device is an electronic device, which may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
In a typical example, the electronic device comprises a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the error compensation method for an in-memory computing chip described above when the program is executed.
Referring now to fig. 7, a schematic diagram of an electronic device 600 suitable for use in implementing embodiments of the present application is shown.
As shown in fig. 7, the electronic apparatus 600 includes a central processing unit (CPU) 601, which can perform various appropriate operations and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The CPU 601, ROM 602 and RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.
In particular, according to embodiments of the present invention, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, an embodiment of the present invention includes a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the error compensation method for an in-memory computing chip described above.
In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609, and/or installed from the removable medium 611.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium which can be used to store information that can be accessed by a computing device. Computer-readable media, as defined herein, does not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each element may be implemented in one or more software and/or hardware elements when implemented in the present application.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for system embodiments, since they are substantially similar to method embodiments, the description is relatively simple, as relevant to see a section of the description of method embodiments.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.

Claims (17)

1. An error compensation method for an in-memory computing chip, comprising:
inputting the sample image into a target neural network model, and taking the feature images output by all the convolution layers as calibration references of the convolution layers;
acquiring a corresponding index matrix according to an input feature map of a current convolution layer of a corresponding sample image;
obtaining a calculation result of a current convolution layer corresponding to the input feature map output by an in-memory calculation chip, wherein the in-memory calculation chip is used for realizing target neural network model operation, and the input feature map is input into a flash memory cell array corresponding to the current convolution layer in an analog domain of the in-memory calculation chip to be operated to obtain the calculation result;
after comparing the calibration standard with the calculation result, generating an index value-compensation vector corresponding relation table corresponding to the current convolution layer according to the index matrix and the comparison result, wherein the index value-compensation vector corresponding relation table is used for compensating, in the digital domain, the calculation result output by the analog domain when the in-memory computing chip implements the target neural network model operation;
the obtaining the corresponding index matrix according to the input feature map of the current convolution layer of the corresponding sample image includes:
If the current convolution layer is the first convolution layer, supplementing 0 to the input feature map according to the step length and the convolution kernel size of the current convolution layer and in a tensorflow convolution operator supplementing 0 mode;
if the current convolution layer is a non-first convolution layer and the upper layer is a convolution layer, supplementing 0 to the average value matrix of the upper convolution layer according to the step length and the convolution kernel size of the current convolution layer and in a tensorflow convolution operator supplementing 0 mode;
if the current convolution layer is a non-first convolution layer and the upper layer is a non-convolution layer, supplementing 0 to the mask front matrix of the upper non-convolution layer according to the step length and the convolution kernel size of the current convolution layer and in a tensorflow convolution operator supplementing 0 mode;
traversing the zero-filled image pixel by pixel, and calculating the average value of a preset area around each pixel point by sub-channels to obtain an average value matrix of the current convolution layer;
and calculating an index matrix of the current convolution layer according to the average value matrix.
2. The error compensation method for an in-memory computing chip of claim 1, wherein the neural network model is a directly connected neural network model, a U-shaped neural network model, a residual structure network model, a recurrent neural network model, an MLP model, a Transformer network model, or an attention mechanism neural network model.
3. The error compensation method for an in-memory computing chip of claim 1, wherein the neural network model comprises: 1D convolution, 2D convolution, 3D convolution, fully connected, transposed convolution, channel separable convolution, and dilated convolution neural network operators.
4. The method for error compensation of an in-memory computing chip of claim 1, wherein the sample image is in a grayscale image format, a color transparency image format, a grayscale chrominance image format, or a hue saturation luminance image format.
5. The method for error compensation of an in-memory computing chip according to claim 1, wherein the generating an index value-compensation vector correspondence table corresponding to a current convolutional layer according to the index matrix and the comparison result after comparing the calibration standard with the calculation result comprises:
comparing the calibration standard with the calculation result;
and generating an index value-compensation vector corresponding relation table corresponding to the current convolution layer according to the index matrix and the comparison result.
6. The method for error compensation of an in-memory computing chip of claim 5, wherein if the step size of the next convolutional layer is equal to 1 and the convolution kernel sizes of the next convolutional layer and the current convolutional layer are the same, the index matrix of the next convolutional layer is the same as the index matrix of the current convolutional layer.
7. The method for error compensation of an in-memory computing chip according to claim 1, wherein the generating a corresponding index value-compensation vector correspondence table according to the index matrix and the comparison result after comparing the calibration standard with the calculation result comprises:
and comparing the calibration standard with the calculation result, and sorting and averaging the difference values according to the index matrix to obtain a corresponding index value-compensation vector corresponding relation table.
8. An error compensation method for an in-memory computing chip, wherein the in-memory computing chip is used for realizing target neural network model operation and comprises: a digital domain and an analog domain, the error compensation method being applied to the digital domain, and the error compensation method comprising:
acquiring a corresponding index matrix according to an input feature map of a current convolution layer;
obtaining compensation vectors corresponding to all pixels of the input feature image based on a pre-obtained index value-compensation vector corresponding relation table of the current convolution layer according to the index matrix;
obtaining a calculation result corresponding to the input feature map output by a flash memory cell array corresponding to the current convolution layer in an analog domain;
compensating the calculation result according to the compensation vector; wherein,
the feature map output by each convolution layer is used as a calibration standard of the convolution layer, and after comparing the calibration standard with the calculation result, a corresponding index value-compensation vector corresponding relation table is generated according to the index matrix and the comparison result, and the method comprises the following steps:
and comparing the calibration standard with the calculation result, and sorting and averaging the difference values according to the index matrix to obtain a corresponding index value-compensation vector corresponding relation table.
9. The error compensation method for an in-memory computing chip of claim 8, wherein the neural network model is a directly connected neural network model, a U-shaped neural network model, a residual structure network model, a recurrent neural network model, an MLP model, a Transformer network model, or an attention mechanism neural network model.
10. The error compensation method for an in-memory computing chip of claim 8, wherein the neural network model comprises: 1D convolution, 2D convolution, 3D convolution, fully connected, transposed convolution, channel separable convolution, and dilated convolution neural network operators.
11. The method of claim 8, wherein the input feature map is in a gray scale map format, a color transparency map format, a gray scale chrominance image format, or a hue saturation luminance image format.
12. The method for error compensation of an in-memory computing chip of claim 8, wherein the obtaining a corresponding index matrix from the input feature map of the current convolutional layer comprises:
if the current convolution layer is the first convolution layer, supplementing 0 to the input feature map according to the step length and the convolution kernel size of the current convolution layer and in a tensorflow convolution operator supplementing 0 mode;
if the current convolution layer is a non-first convolution layer and the upper layer is a convolution layer, supplementing 0 to the average value matrix of the upper convolution layer according to the step length and the convolution kernel size of the current convolution layer and in a tensorflow convolution operator supplementing 0 mode;
if the current convolution layer is a non-first convolution layer and the upper layer is a non-convolution layer, supplementing 0 to the mask front matrix of the upper non-convolution layer according to the step length and the convolution kernel size of the current convolution layer and in a tensorflow convolution operator supplementing 0 mode;
traversing the zero-filled image pixel by pixel, and calculating the average value of the surrounding areas of each pixel point by sub-channels to obtain an average value matrix of the current convolution layer;
and calculating an index matrix of the current convolution layer according to the average value matrix.
13. The method for error compensation of an in-memory computing chip of claim 12, further comprising:
transmitting the compensated calculation result, as an input feature map of the next convolution layer of the target neural network model, to a flash memory cell array corresponding to the next convolution layer in the analog domain for calculation.
14. An in-memory computing chip, comprising: a digital domain and an analog domain, the analog domain being used for performing matrix multiply-add operations of the target neural network model, and the digital domain being used for performing the error compensation method according to any one of claims 8 to 13.
15. An error compensation device for an in-memory computing chip, comprising:
the calibration reference acquisition module inputs the sample image into the target neural network model, and takes the characteristic diagram output by each convolution layer as the calibration reference of the convolution layer;
the index matrix acquisition module acquires a corresponding index matrix according to the input feature map of the current convolution layer of the corresponding sample image;
the in-memory computing result acquisition module is used for acquiring a computing result of a current convolution layer corresponding to the input feature map output by the in-memory computing chip, wherein the in-memory computing chip is used for realizing target neural network model operation, and the input feature map is input into a flash memory cell array corresponding to the current convolution layer in an analog domain of the in-memory computing chip for operation to obtain the computing result;
The compensation vector table generation module is used for comparing the calibration standard with the calculation result, and generating an index value-compensation vector corresponding relation table corresponding to the current convolution layer according to the index matrix and the comparison result, wherein the index value-compensation vector corresponding relation table is used for compensating, in the digital domain, the calculation result output by the analog domain when the in-memory computing chip implements the target neural network model operation;
the obtaining the corresponding index matrix according to the input feature map of the current convolution layer of the corresponding sample image includes:
if the current convolution layer is the first convolution layer, supplementing 0 to the input feature map according to the step length and the convolution kernel size of the current convolution layer and in a tensorflow convolution operator supplementing 0 mode;
if the current convolution layer is a non-first convolution layer and the upper layer is a convolution layer, supplementing 0 to the average value matrix of the upper convolution layer according to the step length and the convolution kernel size of the current convolution layer and in a tensorflow convolution operator supplementing 0 mode;
if the current convolution layer is a non-first convolution layer and the upper layer is a non-convolution layer, supplementing 0 to the mask front matrix of the upper non-convolution layer according to the step length and the convolution kernel size of the current convolution layer and in a tensorflow convolution operator supplementing 0 mode;
Traversing the zero-filled image pixel by pixel, and calculating the average value of a preset area around each pixel point by sub-channels to obtain an average value matrix of the current convolution layer;
and calculating an index matrix of the current convolution layer according to the average value matrix.
16. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the error compensation method for an in-memory computing chip of any of claims 1 to 7 when the program is executed by the processor.
17. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the error compensation method for an in-memory computing chip as claimed in any one of claims 1 to 7.
GR01 Patent grant