CN114723044A - Error compensation method, device, chip and equipment for memory computing chip - Google Patents

Error compensation method, device, chip and equipment for memory computing chip

Info

Publication number
CN114723044A
CN114723044A (application CN202210357661.4A; granted as CN114723044B)
Authority
CN
China
Prior art keywords
convolution
layer
network model
neural network
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210357661.4A
Other languages
Chinese (zh)
Other versions
CN114723044B (en)
Inventor
严洪泽
郭昕婕
孙旭光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Zhicun Intelligent Technology Co ltd
Original Assignee
Hangzhou Zhicun Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Zhicun Intelligent Technology Co ltd
Priority: CN202210357661.4A
Publication of CN114723044A
Application granted
Publication of CN114723044B
Legal status: Active (granted)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the invention provide an error compensation method, apparatus, chip, and device for an in-memory computing chip. The method includes: inputting a sample image into a target neural network model and taking the feature map output by each convolutional layer as the calibration standard of that layer; obtaining a corresponding index matrix from the input feature map of the current convolutional layer for the sample image; obtaining the calculation result of the current convolutional layer for the input feature map as output by the in-memory computing chip; and, after comparing the calibration standard with the calculation result, generating an index value-compensation vector correspondence table for the current convolutional layer according to the index matrix and the comparison result. When the in-memory computing chip runs the target neural network model, the table is used in the chip's digital domain to compensate the calculation results output by the analog domain, so that the results of the storage-computation integrated chip come closer to the noise-free calculation results.

Description

Error compensation method, device, chip and equipment for memory computing chip
Technical Field
The present invention relates to the field of semiconductor technologies, and in particular to an error compensation method and apparatus for an in-memory computing chip, an electronic device, and a computer-readable storage medium.
Background
Most conventional digital chips are based on the von Neumann architecture, in which the physical separation of memory and processor creates the well-known "memory wall" performance bottleneck. In recent years, to overcome this bottleneck, computing-in-memory (CIM) chips based on various types of memory cells, such as phase-change memory (PCM), resistive random-access memory (RRAM), and flash memory (NOR Flash), have attracted attention. Their basic idea is to perform computation directly inside the memory, exploiting the analog storage capability of the cells and Kirchhoff's circuit laws, thereby reducing the amount and distance of data transfer between memory and processor, lowering power consumption, and improving performance.
Unlike a conventional digital computing chip, a CIM chip uses analog circuits and precise current control to perform computation. For example, to add two analog quantities, the two currents representing them simply need to be joined; in the digital domain, by contrast, each bit is represented by transistors switched on and off by gate voltages, and an adder requires many such transistors (6 to 28 transistors per bit, depending on the circuit design).
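As a toy illustration of this analog computing principle (not taken from the patent; the names and the noise figure are assumptions), the following sketch models a crossbar computing a vector-matrix product by summing currents per Kirchhoff's current law, with multiplicative cell noise foreshadowing the error sources discussed next:

```python
# Illustrative sketch only: a CIM crossbar column sums currents I_j = sum_i V_i * G_ij.
# Multiplicative noise on the conductances mimics imprecise cell programming.
import numpy as np

def crossbar_mac(inputs, conductances, noise_std=0.02, rng=None):
    """Ideal analog MAC per Ohm's and Kirchhoff's laws, perturbed by cell noise."""
    rng = rng or np.random.default_rng(0)
    noisy_g = conductances * (1.0 + rng.normal(0.0, noise_std, conductances.shape))
    return inputs @ noisy_g  # currents summed on each bit line

v = np.array([0.3, 0.7, 0.1])          # input voltages (analog quantities)
g = np.array([[1.0, 0.5],
              [0.2, 1.5],
              [0.9, 0.1]])             # programmed conductances (the weights)
print(crossbar_mac(v, g))              # differs slightly from the ideal v @ g
```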
However, owing to the limitations of current analog process capability and memory-state programming precision, the number of electrons on the floating gate of a memory cell cannot be controlled exactly. Combined with the limited precision of the digital-to-analog and analog-to-digital conversion modules and other factors, the results computed by the memory cell array of a CIM chip differ in accuracy from those of a digital integrated circuit. As a result, current CIM chips are applied only to tasks with low precision requirements, such as classification, and not to scenarios with higher precision requirements, such as image processing.
Disclosure of Invention
In view of the problems in the prior art, the present invention provides an error compensation method and apparatus for an in-memory computing chip, an electronic device, and a computer-readable storage medium, which can at least partially solve the problems in the prior art.
To achieve this purpose, the invention adopts the following technical solutions:
In a first aspect, an error compensation method for an in-memory computing chip is provided, including:
inputting a sample image into the target neural network model, and taking the feature map output by each convolutional layer as the calibration standard of that layer;
obtaining a corresponding index matrix from the input feature map of the current convolutional layer for the sample image;
obtaining the calculation result of the current convolutional layer for the input feature map as output by an in-memory computing chip, wherein the in-memory computing chip is used to implement the target neural network model operations, and the input feature map is input into the flash memory cell array corresponding to the current convolutional layer in the analog domain of the chip to obtain the calculation result;
and, after comparing the calibration standard with the calculation result, generating an index value-compensation vector correspondence table for the current convolutional layer according to the index matrix and the comparison result, the table being used to compensate, in the digital domain of the in-memory computing chip, the calculation results output by the analog domain when the chip runs the target neural network model.
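By way of illustration only, the following self-contained sketch walks a single "layer" through the four steps above, with a neighborhood blur standing in for the digital reference convolution and additive noise standing in for the analog chip; every name and constant is an assumption, and the actual index-matrix rules are detailed later in this description:

```python
# Minimal single-layer sketch of the first-aspect calibration flow (illustrative).
import numpy as np

rng = np.random.default_rng(0)

def neighborhood_mean(img, k=3):
    """Per-pixel k x k mean over a zero-padded image (SAME-style padding)."""
    p = k // 2
    padded = np.pad(img, p)
    out = np.empty_like(img, dtype=float)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = padded[y:y + k, x:x + k].mean()
    return out

image = rng.integers(0, 256, (8, 8)).astype(float)
digital_out = neighborhood_mean(image)                   # calibration standard
chip_out = digital_out + rng.normal(0, 2, image.shape)   # simulated analog result

index_matrix = (neighborhood_mean(image) // 10).astype(int)  # gray levels, interval 10
diff = digital_out - chip_out                            # comparison result
# index value -> compensation value correspondence table (scalars here, C = 1)
table = {int(lv): diff[index_matrix == lv].mean() for lv in np.unique(index_matrix)}
```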
Further, the neural network model is a direct-connection neural network model, a U-shaped neural network model, a residual-structure network model, a recurrent neural network model, an MLP model, a Transformer network model, or an attention-mechanism neural network model.
Further, the neural network model includes one or more of the following operators: 1D convolution, 2D convolution, 3D convolution, fully-connected, transposed convolution, channel-separable convolution, and dilated convolution.
Further, the sample image is in a gray-scale image format, a color-transparency image format, a gray-scale chrominance image format, or a hue-saturation-value image format.
Further, the comparing of the calibration standard with the calculation result and the generating of the index value-compensation vector correspondence table for the current convolutional layer according to the index matrix and the comparison result include:
comparing the calibration standard with the calculation result;
and generating the index value-compensation vector correspondence table for the current convolutional layer according to the index matrix and the comparison result.
Further, the obtaining of the corresponding index matrix from the input feature map of the current convolutional layer for the sample image includes:
if the current convolutional layer is the first convolutional layer, zero-padding the input feature map according to the stride and convolution kernel size of the current convolutional layer, following the zero-padding convention of TensorFlow convolution operators;
if the current convolutional layer is not the first convolutional layer and the preceding layer is a convolutional layer, zero-padding the average-value matrix of the preceding convolutional layer in the same way, according to the stride and convolution kernel size of the current convolutional layer;
if the current convolutional layer is not the first convolutional layer and the preceding layer is a non-convolutional layer, zero-padding the pre-mask matrix of the last non-convolutional layer in the same way, according to the stride and convolution kernel size of the current convolutional layer;
traversing the zero-padded image pixel by pixel and computing, per channel, the average value of a preset area around each pixel to obtain the average-value matrix of the current convolutional layer;
and calculating the index matrix of the current convolutional layer from the average-value matrix.
Further, if the stride of the next convolutional layer equals 1 and its convolution kernel size is the same as that of the current convolutional layer, the index matrix of the next convolutional layer is the same as that of the current convolutional layer.
Further, the comparing of the calibration standard with the calculation result and the generating of the corresponding index value-compensation vector correspondence table according to the index matrix and the comparison result include:
after comparing the calibration standard with the calculation result, classifying the difference values according to the index matrix and averaging each class to obtain the corresponding index value-compensation vector correspondence table.
In a second aspect, an error compensation method for an in-memory computing chip is provided. The in-memory computing chip is used to implement the target neural network model operations and includes a digital domain and an analog domain; the error compensation method is applied in the digital domain and includes:
obtaining a corresponding index matrix from the input feature map of the current convolutional layer;
obtaining, according to the index matrix, the compensation vector corresponding to each pixel of the input feature map from a pre-acquired index value-compensation vector correspondence table of the current convolutional layer;
obtaining the calculation result for the input feature map as output by the flash memory cell array corresponding to the current convolutional layer in the analog domain;
and compensating the calculation result according to the compensation vectors.
Further, the neural network model is a direct-connection neural network model, a U-shaped neural network model, a residual-structure network model, a recurrent neural network model, an MLP model, a Transformer network model, or an attention-mechanism neural network model.
Further, the neural network model includes one or more of the following operators: 1D convolution, 2D convolution, 3D convolution, fully-connected, transposed convolution, channel-separable convolution, and dilated convolution.
Further, the input feature map is in a gray-scale image format, a color-transparency image format, a gray-scale chrominance image format, or a hue-saturation-value image format.
Further, the obtaining of the corresponding index matrix from the input feature map of the current convolutional layer includes:
if the current convolutional layer is the first convolutional layer, zero-padding the input feature map according to the stride and convolution kernel size of the current convolutional layer, following the zero-padding convention of TensorFlow convolution operators;
if the current convolutional layer is not the first convolutional layer and the preceding layer is a convolutional layer, zero-padding the average-value matrix of the preceding convolutional layer in the same way, according to the stride and convolution kernel size of the current convolutional layer;
if the current convolutional layer is not the first convolutional layer and the preceding layer is a non-convolutional layer, zero-padding the pre-mask matrix of the last non-convolutional layer in the same way, according to the stride and convolution kernel size of the current convolutional layer;
traversing the zero-padded image pixel by pixel and computing, per channel, the average value of the area around each pixel to obtain the average-value matrix of the current convolutional layer;
and calculating the index matrix of the current convolutional layer from the average-value matrix.
Further, the method further includes:
transmitting the compensated calculation result, as the input feature map of the next convolutional layer of the target neural network model, to the flash memory cell array corresponding to that layer in the analog domain for calculation.
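For illustration, a minimal sketch of how the digital domain might apply this second-aspect method per pixel is shown below; the names compensate, index_matrix, and table are assumptions, the table here is keyed by gray-level index values (a color-level table would be keyed by the R/G/B level triple), and chip_out stands for the H x W x C_out result returned by the analog domain:

```python
# Minimal sketch of inference-time compensation in the digital domain (illustrative).
import numpy as np

def compensate(chip_out, index_matrix, table):
    out = chip_out.astype(float).copy()
    h, w, _ = out.shape
    for y in range(h):
        for x in range(w):
            vec = table.get(int(index_matrix[y, x]))  # look up the compensation vector
            if vec is not None:
                out[y, x, :] += vec                   # correct the analog-domain result
    return out  # becomes the input feature map of the next convolutional layer

# demo with dummy data: 4 x 4 output, 8 channels, two index levels
rng = np.random.default_rng(0)
chip = rng.normal(size=(4, 4, 8))
idx = rng.integers(0, 2, (4, 4))
tbl = {0: np.zeros(8), 1: np.full(8, 0.1)}
compensated = compensate(chip, idx, tbl)
```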
In a third aspect, an in-memory computing chip is provided, including a digital domain and an analog domain, wherein the analog domain is used to execute the matrix multiply-add operations of the target neural network model, and the digital domain is used to execute the above error compensation method.
In a fourth aspect, an error compensation apparatus for an in-memory computing chip is provided, including:
a calibration standard acquisition module, configured to input a sample image into the target neural network model and take the feature map output by each convolutional layer as the calibration standard of that layer;
an index matrix acquisition module, configured to obtain a corresponding index matrix from the input feature map of the current convolutional layer for the sample image;
an in-memory calculation result acquisition module, configured to obtain the calculation result of the current convolutional layer for the input feature map as output by an in-memory computing chip, wherein the in-memory computing chip is used to implement the target neural network model operations, and the input feature map is input into the flash memory cell array corresponding to the current convolutional layer in the analog domain of the chip to obtain the calculation result;
and a compensation vector table generation module, configured to generate, after comparing the calibration standard with the calculation result, an index value-compensation vector correspondence table for the current convolutional layer according to the index matrix and the comparison result, the table being used to compensate, in the digital domain of the in-memory computing chip, the calculation results output by the analog domain when the chip runs the target neural network model.
In a fifth aspect, an electronic device is provided, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the above error compensation method for an in-memory computing chip.
In a sixth aspect, a computer-readable storage medium is provided, on which a computer program is stored, the computer program, when executed by a processor, implementing the steps of the above error compensation method for an in-memory computing chip.
Embodiments of the invention provide an error compensation method, apparatus, chip, and device for an in-memory computing chip. The method includes: inputting a sample image into the target neural network model and taking the feature map output by each convolutional layer as the calibration standard of that layer; obtaining a corresponding index matrix from the input feature map of the current convolutional layer for the sample image; obtaining the calculation result of the current convolutional layer for the input feature map as output by an in-memory computing chip, wherein the chip is used to implement the target neural network model operations and the input feature map is input into the flash memory cell array corresponding to the current convolutional layer in the chip's analog domain to obtain the calculation result; and, after comparing the calibration standard with the calculation result, generating an index value-compensation vector correspondence table for the current convolutional layer according to the index matrix and the comparison result, the table being used to compensate, in the digital domain, the calculation results output by the analog domain when the chip runs the model. In the inference stage of an application, the corresponding index matrix is obtained in the chip's digital domain from the input feature map of the current convolutional layer; according to the index matrix, the compensation vector corresponding to each pixel of the input feature map is obtained from the pre-acquired index value-compensation vector correspondence table of the current convolutional layer; the calculation result for the input feature map output by the flash memory cell array corresponding to the current convolutional layer in the analog domain is obtained; and the calculation result is compensated according to the compensation vectors. The result of the storage-computation integrated chip thus comes closer to the noise-free calculation result, the calculation precision is improved, and the chip can be applied to scenarios with higher precision requirements, such as image processing.
In order to make the aforementioned and other objects, features and advantages of the invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort. In the drawings:
FIG. 1 is a flow chart illustrating an error compensation method for a memory computing chip according to an embodiment of the present invention;
FIG. 2 illustrates a process for obtaining an index matrix in an embodiment of the invention;
FIG. 3 is a flow chart illustrating an error compensation method performed in an in-memory computing chip according to an embodiment of the present invention;
FIG. 4 illustrates a design framework for a storage-computation integrated chip utilizing the compensation method in an embodiment of the present invention;
FIG. 5 illustrates the structure of a super-resolution neural network model in an embodiment of the present invention;
FIG. 6 is a block diagram showing an arrangement of an error compensation apparatus for a memory computing chip according to an embodiment of the present invention;
fig. 7 is a block diagram of an electronic device according to an embodiment of the invention.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application are described below clearly and completely with reference to the drawings of the embodiments. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort fall within the protection scope of the present application.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
It should be noted that the terms "comprises" and "comprising," and any variations thereof, in the description and claims of this application and the above-described drawings, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
The excellent performance of convolutional neural networks in the field of image processing has been widely demonstrated, for example in image classification, object detection, semantic segmentation, image super-resolution, image denoising, and image dynamic-range fusion. However, due to the limitations of current analog-chip memory-cell process capability, the number of electrons, the impedance, the tunneling voltage, and so on that each memory cell can carry differ from cell to cell. Memory-state programming uses an update-and-verify iteration that drives the difference between the read current and the target current below a threshold, so there is a spread in programming precision. The target image passes through a digital-to-analog conversion module, introducing digital-to-analog conversion errors; after calculation, the feature data pass through an analog-to-digital conversion module, introducing analog-to-digital conversion errors. In addition, under the influence of factors such as temperature drift, the electrons on a memory cell may change, altering the output current and introducing temperature-drift errors. The accuracy of the feature maps computed by the storage array on input data therefore differs from that of a digital chip, so current storage-computation integrated chips are, because of this precision limitation, applied only to tasks with low precision requirements such as classification, and not to scenarios with higher precision requirements such as image processing.
The table-lookup compensation method for storage-computation chips provided by the embodiments of the present invention for image-processing neural network models works as follows: a gray-level or color-level index matrix is calculated from the target image; a calibration image or calibration pixels are selected, and the calibration standard of each convolutional layer of the target neural network model is calculated in the digital domain; the output feature map of each convolutional layer of the storage-computation chip is compared with the calibration standard, and an index value-compensation vector correspondence table of gray levels or color levels is generated according to the index matrix; and in the inference stage, according to the index matrix of the target image and based on the index value-compensation vector correspondence table, the compensation vector corresponding to each pixel of each feature layer is extracted and the convolution output of each layer of the storage-computation chip is compensated. This can greatly reduce the precision error between the feature map computed by the storage array and the computation result of a digital chip.
Fig. 1 shows a flowchart of an error compensation method for an in-memory computing chip in an embodiment of the present invention. The method mainly runs on high-speed computing platforms such as a PC, a server, or a mobile SoC main-control chip, and can certainly also be implemented on the CPU core of an in-memory-computing SoC chip; its purpose is to obtain the index value-compensation vector correspondence table used for error compensation in the in-memory-computing SoC chip. As shown in fig. 1, the error compensation method for the in-memory computing chip may include the following steps:
step S100: inputting the sample image into a target neural network model, and taking a characteristic diagram output by each convolution layer as a calibration standard of the convolution layer;
specifically, the target neural network model is a trained neural network model, one or more sample images may be used, and if there are multiple sample images, the multiple sample images may be averaged. The sample image can be directly input into the target neural network model as a calibration image or a pixel, or the sample image can be abstracted and extracted and then input into the target neural network model.
Step S200: acquiring a corresponding index matrix according to an input characteristic diagram of the current convolutional layer corresponding to the sample image;
specifically, after a sample image is input into a target neural network model, each convolution layer takes the output of the upper convolution layer as an input feature map; and calculating a corresponding index matrix according to the input characteristic diagram of the current convolutional layer.
Wherein, when the input characteristic diagram is a gray level, the index matrix is calculated by using the gray level diagram, and when the input characteristic diagram is a color level, the index matrix is calculated by using the RGB value;
step S300: obtaining a calculation result of the current convolution layer corresponding to the input characteristic diagram and output by the in-memory calculation chip;
the in-memory computing chip is used for realizing the operation of a target neural network model, comprises a digital domain and an analog domain, wherein the analog domain is mainly an analog circuit, and the convolution layer matrix multiplication and addition operation in the neural network model is realized by utilizing a flash memory unit array; the digital domain is mainly used for realizing digital operation, such as the operation of a full connection layer and a sampling layer, and inputting an output result of a certain convolution layer output by the analog domain after error compensation into an analog circuit module corresponding to the next convolution layer operation in the analog domain;
each layer of convolution layer needs to calculate a corresponding index value-compensation vector corresponding relation table in sequence, so that the input characteristic diagram is input into a flash memory unit array corresponding to the current convolution layer in an analog domain of a memory calculation chip to perform operation to obtain a calculation result;
step S400: and after the calibration standard is compared with the calculation result, generating an index value-compensation vector corresponding relation table corresponding to the current convolutional layer according to the index matrix and the comparison result, wherein the index value-compensation vector corresponding relation table is used for compensating the calculation result output by the analog domain in the digital domain of the memory calculation chip when the memory calculation chip realizes the operation of the target neural network model.
Specifically, each convolution layer corresponds to 1 index value-compensation vector corresponding relation table, there may be 2 convolution layers with the same index value-compensation vector corresponding relation table, or there may be two convolution layers with different index value-compensation vector corresponding relation tables, and the index value-compensation vector corresponding relation table is used to represent compensation vector values corresponding to gray levels of different pixel points in the input feature map.
The applicant finds, through a large amount of research, that when a weight burning environment is similar to a test environment, for the same storage chip, the main error sources of the weights after the same burning are digital-to-analog conversion errors and analog-to-digital conversion errors. When the input data is the same, the output change of the same position is very small after the memory chip with the burning weight operates for many times. Therefore, error compensation can be performed on the output of the current convolutional layer based on the index value-compensation vector correspondence table.
In the embodiment of the invention, the calculation result is compensated according to the compensation vector, so that the calculation result of the storage and calculation integrated chip is closer to the calculation result under the noise-free condition, the calculation precision is improved, and the storage and calculation integrated chip can be applied to scenes with higher precision requirements, such as image processing and the like.
The method can calibrate and compensate errors for various neural network models containing convolutional layers, not limited to the 1D convolution, 2D convolution, 3D convolution, fully-connected, transposed convolution, channel-separable convolution, and dilated convolution operators; nor is the model limited to a direct-connection neural network model, a U-shaped neural network model, a residual-structure network model, a recurrent neural network model, an MLP model, a Transformer network model, or an attention-mechanism neural network model.
It is worth noting that the embodiment of the present invention is mainly directed at error correction of storage-computation integrated chips running image-processing neural network models. A convolutional neural network is a network with a limited receptive field: it processes the neighborhood of the current pixel. Therefore, when the same pixel block appears in different images, each layer of the convolutional neural network model produces the same convolution output for it; and when the colors of adjacent pixels in the target image are similar, the per-layer convolution outputs are similar, adjacent pixels usually having an evident influence on the calculation result of the current pixel.
Accordingly, the method compensates the pixel values of the feature map after each convolutional layer according to each input feature map pixel and its adjacent pixel values (a preset number of pixels close around the current pixel). Because the compensation vector differs per convolutional layer and per feature-map pixel, a compensation index matrix must first be computed so that different feature-map pixels use different compensation vectors. To reduce the amount of computation and the size of the index matrix, the embodiment of the present invention computes the index matrix from the input feature map, but it is not limited to the input feature map; for example, in a video-streaming task, the statistics of each convolutional layer may also be derived from the feature map output by the previous convolutional layer for the previous frame. Statistical analysis is performed on different pixels, and each pixel extracts its compensation vector from the compensation vector table according to the index matrix; the result of each convolutional layer or other AI operator (such as a fully-connected layer, 3D convolution, or transposed convolution) is thus compensated to be closer to the noise-free calculation result.
It should be noted that the neural network model can be a direct-connection neural network model, a U-shaped neural network model, a residual-structure network model, a recurrent neural network model, an MLP model, a Transformer network model, or an attention-mechanism neural network model, including but not limited to these.
In addition, the neural network model may include 1D convolution, 2D convolution, 3D convolution, fully-connected, transposed convolution, channel-separable convolution, dilated convolution, and other neural network operators, including but not limited to these.
The sample image may be in a gray-scale image format, a color-transparency image format (RGB_Alpha), a gray-scale chrominance image format (YUV), or a hue-saturation-value image format (HSV, HSL), but is not limited to these.
In an alternative embodiment, this step S200 may include the following:
Step I: (1) if the current convolutional layer is the first convolutional layer, zero-padding the input feature map according to the stride and convolution kernel size of the current convolutional layer, following the zero-padding convention of TensorFlow convolution operators;
(2) if the current convolutional layer is not the first convolutional layer and the preceding layer is a convolutional layer, zero-padding the average-value matrix of the preceding convolutional layer in the same way, according to the stride and convolution kernel size of the current convolutional layer;
(3) if the current convolutional layer is not the first convolutional layer and the preceding layer is a non-convolutional layer, zero-padding the pre-mask matrix of the last non-convolutional layer in the same way, according to the stride and convolution kernel size of the current convolutional layer;
Step II: traversing the zero-padded image pixel by pixel and computing, per channel, the average value of a preset area around each pixel to obtain the average-value matrix of the current convolutional layer;
Step III: calculating the index matrix of the current convolutional layer from the average-value matrix.
If the stride of the next convolutional layer equals 1 and its convolution kernel size is the same as that of the current convolutional layer, the index matrix of the next convolutional layer is the same as that of the current convolutional layer.
Specifically, the embodiment of the present invention can divide the input feature map into gray levels or color levels for statistical analysis; gray levels and color levels are only one way of analyzing and grading pixels, and the embodiment of the present invention is not limited to these two analysis methods. Taking an 8-bit RGB input feature map as an example, its gray-scale map is calculated as gray_img = 0.299 × R + 0.587 × G + 0.114 × B, where R, G, and B are the red, green, and blue channel components of the input feature map. The gray values range from 0 to 255, and the 8-bit gray scale can be divided into 26 levels by steps of 10 gray values, i.e., gray values 0-9, 10-19, and so on.
When the color-level index matrix scheme is adopted, the R, G, and B channel components each range from 0 to 255, and different interval values can be used per channel to define the color levels. For example, the R/G/B channels can be divided with intervals of 43, 52, or 64, and the channels may even use different interval values; this defines, e.g., 216-level, 125-level, or 64-level color schemes. With an interval of 43, each of the R/G/B channels is divided into 6 levels (0-42, 43-85, 86-128, 129-171, 172-214, 215-255), so the pixels of a three-channel RGB input feature map can be divided into 6 × 6 × 6 = 216 color levels. With an interval of 52, each channel is divided into 5 levels (0-51, 52-103, 104-155, 156-207, 208-255), giving 5 × 5 × 5 = 125 color levels. With an interval of 64, each channel is divided into 4 levels (0-63, 64-127, 128-191, 192-255), giving 4 × 4 × 4 = 64 color levels.
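As a minimal illustration of the two quantization schemes just described (assumed variable names; the luma weights and the interval values 10 and 64 are taken from the text above):

```python
# Gray-level and color-level quantization of an 8-bit RGB feature map (illustrative).
import numpy as np

rgb = np.random.default_rng(0).integers(0, 256, (4, 4, 3)).astype(float)

# gray levels: gray_img = 0.299 R + 0.587 G + 0.114 B, then bins of 10 -> 26 levels
gray_img = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
gray_level = (gray_img // 10).astype(int)   # index values 0..25

# color levels: quantize each R/G/B channel with interval 64 -> 4 x 4 x 4 = 64 levels
color_level = (rgb // 64).astype(int)       # per-channel index values 0..3
```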
Considering the diversity of image-processing neural network model structures and feature map sizes, feature-map compensation index matrices of different sizes need to be established per model, depending, for example, on whether the model contains convolutions with stride not equal to 1, pooling-type operators, or up/down-sampling operators (bilinear up/down-sampling, nearest-neighbor up/down-sampling, etc.). It is basic knowledge that different operators with different parameters change the feature map size correspondingly: for example, a convolution with stride 2 changes the feature map size to h/2 × w/2; pooling and down-sampling operators shrink the feature map; and up-sampling operators enlarge it.
The technical solution adopted by the embodiment of the invention is that, before the neural network model is ported to the storage-computation chip for inference, the sizes of the required feature-map index matrices are calculated; that is, the length and width of each index matrix are consistent with those of the corresponding feature map. If gray levels are used, the number of channels is 1; if color levels are used, the number of channels is 3. The feature-map index matrices are calculated from the input image.
For example, referring to fig. 2, the index matrix is calculated as follows (a consolidated code sketch follows step 21 below):
1. If gray levels are used, the gray-scale map of the input RGB image is calculated for the index matrix calculation; if color levels are used, the input RGB image itself is used for the index matrix calculation. (For convenience of description, the input gray-scale map or input RGB image required for the first index matrix calculation is hereinafter collectively referred to as the input image.)
2. The first network layer of the neural network model is analyzed; assume it is a convolutional layer with stride s1. It is first determined whether s1 equals 1. If s1 is not equal to 1, say s1 = 2, then by the conventions of convolution operator parameters the convolution kernel size is greater than or equal to 3; assume the kernel size is (k1, k2) = (3, 3). When gray levels are used, zeros are padded around the gray-scale map following the convolution operator's zero-padding convention; when color levels are used, zeros are padded around the input RGB image in the same way. When s1 equals 1, it is determined whether the kernel sizes (k1, k2) are both 1; if not, the input image is zero-padded around its borders following the zero-padding convention of TensorFlow convolution operators. For example, if the kernel size is (k1, k2) = (3, 3), one row/column of zeros is padded on each side of the input image; if (k1, k2) = (1, 1), the input image is not padded;
3. The input image (the gray-scale map or the input RGB map) is traversed pixel by pixel with stride s1, and the per-channel average of the (k1, k2) area around each pixel is calculated as the value at the corresponding position of the first average-value matrix. For example, with gray levels and kernel size (k1, k2) = (3, 3), the average gray value of the 3 × 3 area around each pixel of the input gray-scale map is calculated; with color levels and kernel size (k1, k2) = (3, 3), the R/G/B three-channel averages of the 3 × 3 area around each pixel are calculated channel by channel. Then, when the input image size is H × W × C, the first average-value matrix is also H × W × C; C = 1 for gray levels and C = 3 for color levels.
4. The index matrix is calculated from the average-value matrix obtained in step 3. With gray levels and a gray interval of 10, the gray index matrix is gray_level = pixel_mean // 10 ("//" denotes integer division), where pixel_mean is the average gray value of each pixel, i.e., the value of that pixel in the average-value matrix. With color levels and a uniform interval value x for the R/G/B three channels, the color index matrix is color_level = (pixel_mean_R // x, pixel_mean_G // x, pixel_mean_B // x), where pixel_mean_R, pixel_mean_G, and pixel_mean_B are the per-channel R/G/B averages of each pixel in the average-value matrix. It should be noted that the gray interval can be chosen freely according to application requirements, with equal or unequal intervals; the embodiment of the present invention does not limit this.
5. The index matrix of the first convolutional layer is thus obtained. Whether a second index matrix needs to be calculated is determined from the second network layer of the neural network model. The following cases discuss how the second index matrix required by the second layer is computed.
6. If the second layer is still a convolutional layer with stride s2 equal to 1 and the same kernel size as the first convolution, the index matrix of the first layer is reused. Subsequent convolutions are analyzed in the same way: if all convolutional layers have stride s = 1 and the same kernel size, there is one and only one index matrix.
7. If the second layer is still a convolutional layer but its stride s2 is not equal to 1, or its kernel size differs from that of the first convolution, the calculation of the second index matrix starts. Suppose the second convolution has stride s2 = 2 and kernel size (k3, k4) = (3, 3); then the first average-value matrix is zero-padded following the TensorFlow convolution operator convention, e.g., zeros are padded on its right and bottom sides. Suppose instead that s2 = 2 and (k3, k4) = (5, 5); then, again taking the TensorFlow padding convention as an example, two rows/columns of zeros are padded on the right and bottom sides of the first average-value matrix and one row/column on the left and top sides.
8. The zero-padded first average-value matrix is traversed pixel by pixel with stride s2 = 2, and the per-channel average of the (k3, k4) area around each pixel is calculated as the value at the corresponding position of the second average-value matrix. For example, with gray levels and kernel size (k3, k4) = (5, 5), the average gray value of the 5 × 5 area around each pixel of the zero-padded first average-value matrix is calculated with stride s2 = 2; with color levels and kernel size (k3, k4) = (5, 5), the R/G/B three-channel averages of the 5 × 5 area are calculated channel by channel in the same way. Then, when the input image size is H × W × C, the second average-value matrix is H/2 × W/2 × C; C = 1 for gray levels and C = 3 for color levels.
9. After the second average-value matrix is obtained, the second index matrix is calculated from it by the method of step 4 and used as the index matrix of the second convolutional layer.
10. If the second network layer is a pooling operator, such as an average-, max-, or min-pooling layer, the pooling layer is computed in the digital domain of the storage-computation integrated chip and needs no compensation. However, a pooling layer is usually followed by a convolutional layer, i.e., the third layer is a convolutional layer whose index matrix must be calculated. The pooling layer is analyzed first: with pooling stride s2 and pooling kernel size (k3, k4), the first average-value matrix is zero-padded according to the stride and kernel size by the method of step 7 (taking the TensorFlow padding convention as an example). Then, with stride s2, the average, maximum, or minimum of the gray values or of the R/G/B channels of the k3 × k4 area around each pixel of the first average-value matrix is calculated according to the kind of pooling operator, yielding the second pre-mask matrix: the average is taken for an average-pooling layer, the maximum for a max-pooling layer, and the minimum for a min-pooling layer. Assuming stride s2 = 2, when the input image size is H × W × C the second pre-mask matrix is H/2 × W/2 × C; C = 1 for gray levels and C = 3 for color levels.
11. For the convolutional layer immediately following the pooling layer, i.e., the third network layer is a convolutional layer with stride s3 = 1 and, say, kernel size (k5, k6) = (3, 3), the index matrix of the third convolution is calculated from the second pre-mask matrix.
12. Specifically, when calculating the index matrix of the third convolution in step 11, the second pre-mask matrix is zero-padded by the method of step 2, according to the convolution stride s3 = 1 and kernel size (k5, k6) = (3, 3), following the TensorFlow zero-padding convention.
13. By the method of step 3, the zero-padded second pre-mask matrix is traversed pixel by pixel, and the per-channel average of the (k5, k6) area around each pixel is calculated as the value at the corresponding position of the average-value matrix for the third convolution. When the second pre-mask matrix is H/2 × W/2 × C, the average-value matrix for the third convolution is also H/2 × W/2 × C; C = 1 for gray levels and C = 3 for color levels.
14. By the method of step 4, the index matrix corresponding to the third convolution can be calculated from the average-value matrix corresponding to the third convolution.
15. If the second network layer is a down-sampling operator (such as bilinear or nearest-neighbor down-sampling), it is likewise computed in the digital domain of the storage-computation integrated chip and needs no compensation. As in step 11, the down-sampling layer is followed by a convolutional layer, i.e., the third layer is a convolutional layer with stride s3 = 1 and, say, kernel size (k5, k6) = (3, 3). For a down-sampling second layer, the corresponding pre-mask matrix is obtained using the average-value calculation by the methods of steps 10 to 14.
16. Assuming the third convolution has stride s3 = 1 and kernel size (k5, k6) = (3, 3), the pre-mask matrix obtained in step 15 is first zero-padded, then the corresponding average-value matrix is calculated, and then the index matrix required by the third convolution is calculated.
17. This completes the calculation of the index matrix for the different cases of the second network layer.
18. Considering the diversity of neural network models, subsequent layers may also contain feature-map enlarging operators such as up-sampling (bilinear up-sampling, nearest-neighbor up-sampling, etc.). In practical neural network models, a feature-map enlarging operator is executed in the digital domain and is immediately followed by a convolutional layer with stride s = 1 so that the feature map size stays unchanged. In a U-shaped network structure, the first half of the network shrinks the feature map and the second half enlarges it. The enlarged feature map is often added to or merged with a feature map from an earlier network layer, i.e., its size equals that of the earlier feature map. In that case, the convolutional layer following the up-sampling operator can use the index matrix of the earlier convolutional layer whose feature map has the same size.
19. If the second network layer is an up-sampling operator (bilinear, nearest-neighbor, etc.), it is likewise computed in the digital domain of the storage-computation integrated chip and needs no compensation. As in step 11, the up-sampling layer is followed by a convolutional layer, i.e., the third layer is a convolutional layer with stride s3 = 1 and, say, kernel size (k5, k6) = (3, 3). In this case the feature map size of the third convolution differs from that of the first convolution. By the method of step 15, for an up-sampling second layer, the first average-value matrix can be enlarged by nearest-neighbor up-sampling and the corresponding pre-mask matrix calculated, from which the index matrix of the third convolution is obtained.
20. If the second network layer is a convolution with stride s2 equal to 2, and the subsequent convolutions have the same kernel size and stride 1, the subsequent convolutions reuse the index matrix of the second convolution; the neural network model then has two and only two index matrices.
21. If a subsequent convolution stride is not equal to 1, steps 2 to 20 are repeated, treating the current convolution as the first-layer convolution of those steps, to analyze and calculate the index matrices of the subsequent layers.
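As a concrete reference for the simplest branch of the above flow (a plain chain of convolutions using gray levels), the following sketch computes a per-layer index matrix. It uses symmetric zero-padding as a stand-in for TensorFlow's SAME-padding convention (which, as step 7 notes, is asymmetric for stride 2), skips the pooling and up/down-sampling branches, and all function and variable names are illustrative assumptions rather than the patent's:

```python
# Illustrative index-matrix pipeline for a chain of convolutional layers (gray levels).
import numpy as np

def strided_neighborhood_mean(mat, kernel, stride):
    """k x k per-pixel mean over a zero-padded matrix, sampled with `stride`."""
    k, p = kernel, kernel // 2
    padded = np.pad(mat, p)                     # symmetric stand-in for SAME padding
    h = (mat.shape[0] + stride - 1) // stride   # SAME-style output size
    w = (mat.shape[1] + stride - 1) // stride
    out = np.empty((h, w))
    for y in range(h):
        for x in range(w):
            out[y, x] = padded[y * stride:y * stride + k,
                               x * stride:x * stride + k].mean()
    return out

def index_matrices(gray_img, layers, interval=10):
    """`layers` lists (stride, kernel) per convolutional layer, in order."""
    matrices, mean, prev_kernel = [], None, None
    for stride, kernel in layers:
        if matrices and stride == 1 and kernel == prev_kernel:
            matrices.append(matrices[-1])       # reuse rule of step 6
            continue
        source = gray_img if mean is None else mean  # chain from the previous average matrix
        mean = strided_neighborhood_mean(source, kernel, stride)
        matrices.append((mean // interval).astype(int))
        prev_kernel = kernel
    return matrices

img = np.random.default_rng(0).integers(0, 256, (8, 8)).astype(float)
idx = index_matrices(img, [(1, 3), (1, 3), (2, 3)])
print([m.shape for m in idx])                   # [(8, 8), (8, 8), (4, 4)]
```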
In an alternative embodiment, this step S400 may include the following:
Step 1: comparing the calibration standard with the calculation result;
Step 2: generating the index value-compensation vector correspondence table for the current convolutional layer according to the index matrix and the comparison result.
Specifically, after the calibration standard is compared with the calculation result, the difference values are classified according to the index matrix and averaged to obtain the corresponding index value-compensation vector correspondence table.
It should be noted that, when generating the calibration standard, the present invention may use the full-frame input image, several pixels with their surrounding neighbor pixel blocks randomly selected per gray level or color level, or a custom input image for the calibration standard calculation. The current convolution feature map output by the storage-computation integrated chip can be calibrated with the computed compensation vector table before being input into the subsequent convolutional layer; calculating and calibrating layer by layer in this way avoids error accumulation and propagation.
The calculation flow of the calibration standard generation method is introduced as follows:
1. when gray levels are adopted, select one frame of test image that covers all gray levels of the index matrices; when color levels are adopted, select several frames of test images that cover all color levels of the index matrices;
2. calculate all index matrices of the test image according to the index matrix calculation flow;
3. after zero-padding the test image, run the neural network model in software in the digital domain (i.e., on a PC, a server, or the like) and feed the image through each network layer of the target neural network model in sequence; the output feature map of each convolutional or fully connected layer serves as the calibration standard calib_out_i, where i labels the i-th convolutional layer.
4. The flash memory cell array of the in-memory computing chip is a two-dimensional matrix, and each convolutional layer's result is obtained by multiplying an input vector by the matrix pre-stored in the flash cells to produce an output vector. The input vectors are obtained by rearranging each pixel of the input image or input feature map, together with its neighborhood, into a vector, a process called img2col (see the sketch after this list). For example, for an input RGB image with 3 channels and a first-layer convolution kernel of size 3 × 3, each pixel contributes k × k × C = 3 × 3 × 3 = 27 values from its neighborhood, which are rearranged into a 27-dimensional input vector. Assuming the first-layer convolution output channel count is C1 = 32, the output vector has dimension 32. That is, if the first-layer convolution has stride s = 1 and kernel size 3 × 3, its output feature map has size H × W × 32.
5. After the test image is rearranged by img2col, it is fed pixel by pixel into the flash memory cell array of the in-memory computing chip for computation, and the vector of the corresponding pixel of the current convolution feature map is output, with size 1 × 1 × C1, taking C1 = 32 as an example. The first-layer convolution output feature map feature_npu_1 computed by the chip is compared with the digital-domain software result calib_out_1, and the difference diff_out_1 is grouped by the first index matrix and averaged within each group to obtain the compensation vector table. For example, with gray levels, each pixel of the first index matrix falls into one of 26 levels: pixels whose index values are 0, 1, 2, ..., 24, 25 form one class each, and averaging the differences within each gray-level class yields one compensation vector of dimension C1 = 32, so the compensation vector table consists of 26 compensation vectors of dimension C1 = 32. With color levels, each pixel of the first index matrix falls into one of 216 levels: pixels whose index values are (0,0,0), (0,0,1), ..., (0,0,5), (0,1,0), (0,1,1), ..., (0,1,5), ..., (5,5,0), (5,5,1), ..., (5,5,5) form one class each, and averaging the differences within each color-level class yields one compensation vector of dimension C1 = 32, so the compensation vector table consists of 216 compensation vectors of dimension C1 = 32.
6. Using the compensation vector table (index value-compensation vector correspondence table) of the first-layer convolution obtained in step 5, calibrate the first-layer convolution feature map feature_npu_1 output by the corresponding flash memory cell array of the in-memory computing chip to obtain the calibrated feature map feature_calib_1; after an activation function such as ReLU and digital-domain AI operators (e.g., pooling and up/down-sampling operators), the feature map feature_act_1 is obtained and used as the input feature map of the second-layer convolution, assumed here to have size H × W × 32;
7. after the input feature map of the second-layer convolution is rearranged by img2col, it is fed pixel by pixel into the corresponding flash memory cell array of the in-memory computing chip, and the vector of the corresponding pixel of the current convolution feature map is output, with size 1 × 1 × C2, taking C2 = 96 as an example. The digital-domain software output feature map of the second-layer convolution, calib_out_2, serves as its calibration standard. The second-layer convolution output feature map feature_npu_2 computed by the chip is compared with calib_out_2, and the difference diff_out_2 is grouped by gray level or color level according to the index matrix of the second-layer convolution and averaged to obtain its compensation vector table. For example, with gray levels the table consists of 26 compensation vectors of dimension C2 = 96; with color levels it consists of 216 compensation vectors of dimension C2 = 96. As in step 6, the calibrated feature map feature_calib_2 of the second-layer convolution and the post-processed feature map feature_act_2 are then obtained.
8. In the same way as steps 6 and 7, compare the chip output feature map feature_npu_i of each subsequent convolutional or fully connected layer of the neural network model with the corresponding digital-domain software feature map calib_out_i, and group the difference diff_out_i by gray level or color level according to that layer's index matrix and average within each group to obtain the layer's compensation vector table (where i labels the i-th convolutional layer).
9. Repeat the above steps to complete the computation of the compensation vector tables of all convolutional or fully connected layers of the neural network model.
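As referenced in item 4 above, a minimal img2col sketch for the rearrangement step might look as follows (illustrative numpy code, not the chip's actual data path):

```python
import numpy as np

def img2col(image: np.ndarray, k: int = 3, stride: int = 1) -> np.ndarray:
    """Rearrange each k x k x C neighborhood of a zero-padded image into a row
    vector, producing the input vectors fed to the flash-cell array.
    image: (H, W, C). Returns (H_out * W_out, k * k * C)."""
    h, w, c = image.shape
    pad = k // 2  # zero padding so a stride-1 convolution preserves size
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)))
    h_out = (h + 2 * pad - k) // stride + 1
    w_out = (w + 2 * pad - k) // stride + 1
    cols = np.empty((h_out * w_out, k * k * c), dtype=image.dtype)
    idx = 0
    for i in range(0, h_out * stride, stride):
        for j in range(0, w_out * stride, stride):
            cols[idx] = padded[i:i + k, j:j + k, :].ravel()
            idx += 1
    return cols

# A 3 x 3 kernel over an RGB image yields 27-dimensional input vectors,
# matching the 3 x 3 x 3 = 27 values described in item 4.
rgb = np.zeros((8, 8, 3), dtype=np.float32)
print(img2col(rgb).shape)  # (64, 27)
```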
In an optional embodiment, during calibration standard generation, the output feature map feature_npu_i of the in-memory computing chip is compared with the feature map calib_out_i of the corresponding digital-domain software convolution to obtain the difference diff_out_i = calib_out_i - feature_npu_i, which is grouped by gray level or color level according to the index matrix of the corresponding convolutional layer and averaged to yield that layer's compensation vector table vector_calib_conv_i (where i labels the i-th convolutional layer).
The calibrated feature map feature_calib_i of the current convolutional layer, after passing through the activation layer to give feature_act_i, is used as the input of the next convolutional layer for computing the next layer's compensation vector table.
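Putting the layer-by-layer scheme together, a hypothetical driver loop might look like the sketch below. Here chip_forward, soft_forward, and postprocess are illustrative stand-ins for the analog-domain array, the digital-domain software reference, and the activation/pooling post-processing, and build_compensation_table is reused from the earlier sketch:

```python
import numpy as np  # build_compensation_table from the earlier sketch is reused here

def calibrate_layer_by_layer(image, num_layers, chip_forward, soft_forward,
                             postprocess, index_matrices, num_levels=26):
    """Generate one compensation table per convolutional layer. Each layer's
    chip output is calibrated before it becomes the next layer's input, so
    chip errors do not accumulate across layers."""
    tables = []
    x = image
    for i in range(num_layers):
        feature_npu = chip_forward(i, x)     # analog-domain result
        calib_out = soft_forward(i, x)       # digital-domain reference
        table = build_compensation_table(calib_out - feature_npu,
                                         index_matrices[i], num_levels)
        tables.append(table)
        feature_calib = feature_npu + table[index_matrices[i]]  # per-pixel lookup
        x = postprocess(i, feature_calib)    # feature_act_i -> next layer's input
    return tables
```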
The embodiment of the invention further provides an error compensation method for an in-memory computing chip, where the in-memory computing chip is used to run the target neural network model and comprises a digital domain and an analog domain. The digital domain is implemented with digital circuitry and may be a processor such as a GPU; the analog domain uses analog circuitry and mainly comprises a flash memory cell array and its supporting circuits, which perform the matrix multiply-add operations of the neural network. The error compensation method is applied in the digital domain and, as shown in fig. 3, may include the following:
step S1: acquiring a corresponding index matrix according to the input characteristic diagram of the current convolutional layer;
specifically, for a convolution layer of the convolutional neural network model, a corresponding index matrix is obtained according to an input feature image of the convolution layer.
Step S2: according to the index matrix, obtaining a compensation vector corresponding to each pixel of the input characteristic diagram based on a pre-acquired index value-compensation vector corresponding relation table of the current convolutional layer;
specifically, the index value-compensation vector correspondence table of the current convolutional layer is obtained by the method shown in fig. 1;
step S3: obtaining a calculation result corresponding to the input characteristic diagram and output by a flash memory unit array corresponding to the current convolutional layer in the analog domain;
specifically, during analog-domain operation, the result of the current convolutional layer is output to the digital domain, which compensates and post-processes it and then feeds it back into the analog domain as the input feature map of the next convolutional layer; the next layer's analog-domain result is again sent to the digital domain, compensated and processed, and passed on, repeating until the entire convolutional neural network model has been computed.
Step S4: and compensating the calculation result according to the compensation vector.
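Condensing steps S1 to S4, the digital-domain runtime operation reduces to one table lookup per pixel plus an addition; a minimal numpy sketch (shapes and names illustrative):

```python
import numpy as np

def compensate_layer_output(feature_npu: np.ndarray,
                            index_matrix: np.ndarray,
                            comp_table: np.ndarray) -> np.ndarray:
    """Steps S2 and S4 in the digital domain: look up one compensation vector
    per pixel by its index value and add it to the analog-domain result.
    feature_npu: (H, W, C); index_matrix: (H, W) ints; comp_table: (levels, C)."""
    return feature_npu + comp_table[index_matrix]

# Example with 26 gray levels and a 32-channel feature map.
feature_npu = np.random.randn(64, 64, 32).astype(np.float32)
index_matrix = np.random.randint(0, 26, size=(64, 64))
comp_table = np.random.randn(26, 32).astype(np.float32)
print(compensate_layer_output(feature_npu, index_matrix, comp_table).shape)  # (64, 64, 32)
```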
With this technical scheme, the computation result of the in-memory computing chip is brought closer to the noise-free result, the computation precision is improved, and the chip can be applied to scenarios with higher precision requirements, such as image processing.
It is noted that the neural network model may be, but is not limited to, a direct-connection neural network model, a U-shaped neural network model, a residual-structure network model, a recurrent neural network model, an MLP model, a Transformer network model, or an attention-mechanism neural network model.
In addition, the neural network model may include, but is not limited to, operators such as 1D convolution, 2D convolution, 3D convolution, fully connected layers, transposed convolution, channel-separable convolution, and dilated convolution.
The input feature map may be, but is not limited to, a grayscale format, a color-transparency format (RGB_Alpha), a luminance-chrominance format (YUV), or a hue-saturation-value/lightness format (HSV, HSL).
In an alternative embodiment, this step S1 may include the following:
step I: (1) if the current convolutional layer is the first convolutional layer, zero-pad the input feature map according to the stride and kernel size of the current convolutional layer, in the manner of tensorflow's convolution operator padding;
(2) if the current convolutional layer is not the first convolutional layer and the layer above it is a convolutional layer, zero-pad the average matrix of the previous convolutional layer according to the stride and kernel size of the current convolutional layer, in the manner of tensorflow's convolution operator padding;
(3) if the current convolutional layer is not the first convolutional layer and the layer above it is not a convolutional layer, zero-pad the pre-mask matrix of the last non-convolutional layer according to the stride and kernel size of the current convolutional layer, in the manner of tensorflow's convolution operator padding;
step II: traversing the zero-padded image pixel by pixel and computing, channel by channel, the average over a preset region around each pixel to obtain the average matrix of the current convolutional layer;
step III: calculating the index matrix of the current convolutional layer from the average matrix.
For the specific details of the technique for obtaining the index matrix, reference is made to the above description, and details are not repeated here.
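A minimal single-channel sketch of steps I to III follows; the 26-level quantization mean // 10 is one plausible choice of mapping from neighborhood mean to index value, which the patent does not fix:

```python
import numpy as np

def gray_index_matrix(image: np.ndarray, k: int = 3, num_levels: int = 26):
    """Steps I-III for a single-channel 0..255 image: zero-pad, take the
    k x k neighborhood mean of every pixel, then quantize the mean into
    num_levels gray levels (here mean // 10, an assumed quantization)."""
    pad = k // 2
    padded = np.pad(image.astype(np.float32), pad)
    h, w = image.shape
    avg = np.empty((h, w), dtype=np.float32)
    for i in range(h):
        for j in range(w):
            avg[i, j] = padded[i:i + k, j:j + k].mean()
    index = np.minimum(avg.astype(np.int64) // 10, num_levels - 1)
    return avg, index

avg, index = gray_index_matrix(np.random.randint(0, 256, size=(32, 32)))
print(index.min(), index.max())  # values fall in 0..25
```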
In an optional embodiment, the error compensation method for the in-memory computing chip may further include: transmitting the compensated calculation result, as the input feature map of the next convolutional layer of the target neural network model, to the flash memory cell array corresponding to that layer in the analog domain for computation.
The embodiment of the present invention further provides an in-memory computing chip, where the in-memory computing chip includes: a digital domain and an analog domain, wherein the analog domain is used for executing matrix multiplication and addition operation of the target neural network model, and the digital domain is at least used for executing the error compensation method shown in the figure 3.
Of course, operations other than convolutional layers, such as sampling layers, fully-connected layers, etc., in convolutional neural networks may also be performed in the digital domain.
FIG. 4 illustrates a design framework of an in-memory computing chip that uses the compensation method in an embodiment of the invention. As shown in fig. 4, the data of a convolutional layer in the image is rearranged by img2col and fed into the storage array for analog-domain computation; in parallel, the data is converted into an index matrix, a compensation vector is retrieved from the compensation vector table (i.e., the correspondence table), and this vector is combined with the storage array's output, compensating it and improving the computational accuracy of the convolutional layer.
It should be noted that the in-memory computing chip provided by the embodiment of the invention may be applied to various electronic devices, such as smart phones, tablet devices, network set-top boxes, portable computers, desktop computers, personal digital assistants (PDAs), vehicle-mounted devices, smart wearable devices, toys, smart home control devices, pipeline device controllers, and the like. The smart wearable devices may include smart glasses, smart watches, smart bracelets, etc.
In order to make the present application better understood by those skilled in the art, the following describes a specific implementation process in the embodiment of the present invention with reference to fig. 4 and 5:
in an actual application scenario, in inference mode, the process of calibrating the output of the in-memory computing chip using the compensation vector table of each convolutional layer is as follows:
1. compute the several index matrices of the input image according to the neural network model; the index matrices may be based on gray levels or color levels;
2. according to the position of the current convolutional layer in the neural network model, rearrange the RGB image or the output feature map of the preceding network by img2col and feed it pixel by pixel into the storage array of the in-memory computing chip corresponding to that convolution;
3. for the convolution output feature map, look up pixel by pixel the corresponding gray level or color level in the index matrix of the current convolutional layer;
4. use the gray level or color level to index the compensation vector table, determine the compensation vector, and calibrate the storage array's output pixel by pixel; the feature map feature_act_i obtained after post-processing (the activation layer, etc.) is then used as the input calib_in_i of the next convolutional layer;
5. the above steps can be expressed by the following formula, where j is the convolutional layer index and P_i denotes the i-th pixel:

calib_in(j+1) = ReLU[ G_{NPU_j}( F_{img2col}(P_i) ) + vector_calib_conv(j) ]

where F_{img2col} rearranges the elements participating in the convolution computation into columns for input to the storage array, and G_{NPU_j} is the transfer function of the j-th layer of the in-memory NPU.
The embodiments of the invention are illustrated below by taking a super-resolution network running on an in-memory computing chip as an example. Note that the super-resolution network model is only one application scenario of the invention: the invention can be applied to AI tasks including, but not limited to, image classification, object detection, semantic segmentation, image super-resolution, image denoising, and image dynamic-range fusion.
Referring to fig. 5, the first-layer convolution kernel size is (k1, k1) = 3 × 3, with stride s1 = 1 and output feature map channel count C1 = 28; the second-layer kernel is (k2, k2) = 3 × 3, stride s2 = 1, C2 = 48; the third-layer kernel is (k3, k3) = 3 × 3, stride s3 = 2, C3 = 96; the fourth-layer kernel is (k4, k4) = 3 × 3, stride s4 = 1, C4 = 96; the fifth-layer kernel is (k5, k5) = 3 × 3, stride s5 = 1, C5 = 96; the sixth-layer kernel is (k6, k6) = 3 × 3, stride s6 = 1, C6 = 96; the seventh-layer kernel is (k7, k7) = 3 × 3, stride s7 = 1, C7 = 48. Concat splices the input images along the channel dimension. D2Sx4 is short for depth_to_space_x4: it unfolds the feature map from the channel dimension into the spatial dimensions, turning an H/4 × W/4 × 48 feature map into an H × W × 3 feature map; D2Sx2 (depth_to_space_x2) likewise turns an H/2 × W/2 × 12 feature map into an H × W × 3 feature map. All convolutional layers run in the analog domain of the in-memory computing chip and all other operators run in the digital domain, so the compensation flow is described for the convolutional layers. The input image (LR) is short for low resolution (low_resolution); the output image (SR) is short for super resolution (super_resolution).
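For reference, depth_to_space (D2S) can be sketched in numpy as below; this mirrors the usual channel-to-space rearrangement and is illustrative rather than the chip's implementation:

```python
import numpy as np

def depth_to_space(x: np.ndarray, block: int) -> np.ndarray:
    """Rearrange channels into spatial positions: an (H, W, C * block^2)
    feature map becomes (H * block, W * block, C)."""
    h, w, c = x.shape
    c_out = c // (block * block)
    x = x.reshape(h, w, block, block, c_out)
    x = x.transpose(0, 2, 1, 3, 4)  # interleave block rows and columns
    return x.reshape(h * block, w * block, c_out)

# D2Sx4 in fig. 5: an H/4 x W/4 x 48 map unfolds to H x W x 3.
x = np.zeros((16, 16, 48), dtype=np.float32)
print(depth_to_space(x, 4).shape)  # (64, 64, 3)
```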
The compensation scheme of the super-resolution network model on the in-memory computing chip is analyzed using the index matrix calculation technique, the calibration standard generation technique, the convolutional-layer compensation vector table generation technique, and the inference-mode compensation flow described above. Since the kernel sizes and strides of the first- and second-layer convolutions are the same, both use the first index matrix. Because the third-layer convolution has stride s3 = 2, a second index matrix is computed; and since the kernels of the third through seventh layers are all 3 × 3 and the strides of the fourth through seventh layers are all 1, the third through seventh convolutions all use the second index matrix.
The calibration reference images are fed in sequence into the super-resolution network model in the digital domain (this may be the digital domain of the in-memory computing chip, but to reduce chip area and power consumption it may instead be realized on a PC, a server, or the like), and the feature map output by each convolutional layer is taken as that layer's calibration standard. The calibration reference images are then fed in sequence into the super-resolution network model on the in-memory computing chip; layer by layer, the compensation vector table is computed and the current convolutional layer's output feature map is calibrated, post-processed, and fed into the next convolutional layer. The super-resolution network model in this example has 7 convolutional layers, so each convolutional layer has one compensation vector table.
In inference mode, the image to be super-resolved is fed into the in-memory computing chip onto which the neural network model has been programmed; for each convolutional layer's output feature map, the index matrix is retrieved pixel by pixel and the correct compensation vector is selected to calibrate the output feature map.
In summary, the compensation scheme provided by the embodiments of the invention greatly improves the operational accuracy of in-memory computing chips in application fields, such as image processing, that demand high-precision neural network models; it makes in-memory computing chips practical for high-precision image processing and helps exploit their advantages of low power consumption and high compute capability.
It is noted that the computer-readable storage medium is not limited to phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
The embodiment of the invention utilizes an input image to calculate a compensation index matrix of a neural network operator on a storage-computation integrated chip, wherein the format of the input image is not limited to a gray image, a color transparency image (RGB _ Alpha), a gray scale chrominance image format (YUV), a hue saturation brightness image format (HSV, HSL) and the like, and different image formats are adopted to calculate the index matrix;
the embodiment of the invention is not limited to computing the compensation index matrices, calibration references, and compensation vectors from an input image; for example, in a video-streaming task, the per-layer statistical analysis may instead analyze the feature map output by the previous layer's convolution for the previous frame. That is, different pixels are analyzed statistically, each pixel retrieves its compensation vector from the compensation vector table according to the index matrix, and this table lookup compensates the result of each convolutional layer or other AI operator (e.g., fully connected layers, 3D convolution, transposed convolution) so that the result is closer to the noise-free computation result.
The compensation vector table required by each convolutional layer is computed for the input image with reference to the structure of the neural network model; the method may compute the mean, maximum, and minimum of the pixel neighborhood at each scale of the input image, and is therefore not limited to the pixel-by-pixel index matrix computation technique;
in addition, for each convolutional layer, fully connected layer, and the like, the embodiment of the invention pads zeros around the input image and the average matrix and computes the index matrix from a 3 × 3 neighborhood pixel by pixel; it is not limited to this neighborhood size, which may be arbitrary, nor to the number of rows of zeros padded around the input image and the average matrix, nor to zero padding itself: the data padded around the extended image may be arbitrary;
of course, the applicable neural network structures of the embodiment of the invention are not limited to neural network operators such as 2D convolution, 3D convolution, fully connected layers, transposed convolution, channel-separable convolution, and dilated convolution, nor to direct-connection neural networks, U-shaped neural networks, residual-structure networks, attention networks, recurrent neural networks, MLPs, Transformer networks, and the like;
in the embodiment of the invention, the compensation vector is obtained by pixel-by-pixel table lookup in the inference stage; any adoption of this table-lookup compensation method and chip design scheme falls within the protection scope of the invention, regardless of how the compensation vector table is obtained;
in addition, to avoid error propagation and accumulation, the embodiment of the invention computes the compensation vector tables layer by layer: the output feature map of the current layer is calibrated and post-processed before being fed into the next layer;
the calibration standard may be computed from a full-frame input image, or from several randomly selected pixels and their surrounding neighborhood blocks for each gray level or color level.
Based on the same inventive concept, the embodiments of the present application further provide an error compensation apparatus for a memory computing chip, which can be used to implement the methods described in the above embodiments, as described in the following embodiments. Because the principle of solving the problem of the error compensation device for the memory computing chip is similar to that of the method, the implementation of the error compensation device for the memory computing chip can refer to the implementation of the method, and repeated details are omitted. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or a combination of software and hardware is also possible and contemplated.
FIG. 6 is a block diagram of an error compensation apparatus for a memory computing chip according to an embodiment of the present invention. The error compensation device for the in-memory computing chip comprises: a calibration standard obtaining module 10, an index matrix obtaining module 20, an in-memory calculation result obtaining module 30, and a compensation vector table generating module 40.
The calibration standard obtaining module 10 inputs the sample image into the target neural network model, and takes the feature map output by each convolutional layer as the calibration standard of the convolutional layer;
the index matrix obtaining module 20 obtains a corresponding index matrix according to the input feature map of the current convolutional layer corresponding to the sample image;
the in-memory computation result obtaining module 30 obtains a computation result of the current convolutional layer corresponding to the input feature map, which is output by an in-memory computation chip, wherein the in-memory computation chip is used for realizing the operation of the target neural network model, and the input feature map is input into a flash memory unit array corresponding to the current convolutional layer in an analog domain of the in-memory computation chip for operation to obtain the computation result;
the compensation vector table generating module 40 compares the calibration standard with the calculation result, and then generates an index value-compensation vector corresponding relation table corresponding to the current convolutional layer according to the index matrix and the comparison result, where the index value-compensation vector corresponding relation table is used for compensating the calculation result output by the analog domain in the digital domain of the memory calculation chip when the memory calculation chip implements the target neural network model operation.
In the embodiment of the invention, the calculation result is compensated according to the compensation vector, so that the calculation result of the storage and calculation integrated chip is closer to the calculation result under the noise-free condition, the calculation precision is improved, and the storage and calculation integrated chip can be applied to scenes with higher precision requirements, such as image processing and the like.
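A minimal sketch of how the four modules of fig. 6 might compose; the class and method names are hypothetical, not the patent's concrete interfaces:

```python
class ErrorCompensationDevice:
    """Illustrative composition of the four modules of fig. 6."""

    def __init__(self, calib_module, index_module, npu_module, table_module):
        self.calib_module = calib_module  # calibration standard obtaining module 10
        self.index_module = index_module  # index matrix obtaining module 20
        self.npu_module = npu_module      # in-memory calculation result module 30
        self.table_module = table_module  # compensation vector table module 40

    def build_tables(self, sample_image, num_layers):
        tables = []
        for i in range(num_layers):
            calib_out = self.calib_module.calibration_standard(sample_image, i)
            index = self.index_module.index_matrix(sample_image, i)
            feature_npu = self.npu_module.chip_result(sample_image, i)
            # Group the differences by index value and average per class.
            tables.append(self.table_module.make_table(calib_out - feature_npu, index))
        return tables
```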
The apparatuses, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or implemented by a product with certain functions. A typical implementation device is an electronic device, which may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
In a typical example, the electronic device specifically includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor executes the program to implement the steps of the error compensation method for the in-memory computing chip described above.
Referring now to FIG. 7, shown is a schematic diagram of an electronic device 600 suitable for use in implementing embodiments of the present application.
As shown in fig. 7, the electronic apparatus 600 includes a central processing unit (CPU) 601 that can perform various appropriate operations and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data necessary for the operation of the system 600. The CPU 601, ROM 602, and RAM 603 are connected to one another via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a cathode ray tube (CRT) or liquid crystal display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as necessary, so that a computer program read therefrom is installed into the storage section 608 as needed.
In particular, the processes described above with reference to the flowcharts may be implemented as a computer software program according to an embodiment of the present invention. For example, an embodiment of the present invention includes a computer-readable storage medium on which a computer program is stored, which computer program, when executed by a processor, implements the steps of the error compensation method for a memory computing chip described above.
In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer-readable medium does not include a transitory computer-readable medium such as a modulated data signal or a carrier wave.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (18)

1. An error compensation method for a memory computing chip, comprising:
inputting the sample image into a target neural network model, and taking a characteristic diagram output by each convolution layer as a calibration standard of the convolution layer;
acquiring a corresponding index matrix according to an input characteristic diagram of the current convolution layer corresponding to the sample image;
obtaining a calculation result of a current convolution layer corresponding to the input feature map, which is output by an in-memory calculation chip, wherein the in-memory calculation chip is used for realizing target neural network model operation, and the input feature map is input into a flash memory unit array of the in-memory calculation chip in an analog domain corresponding to the current convolution layer for operation to obtain the calculation result;
and after the calibration standard is compared with the calculation result, generating an index value-compensation vector corresponding relation table corresponding to the current convolutional layer according to the index matrix and the comparison result, wherein the index value-compensation vector corresponding relation table is used for compensating the calculation result output by the analog domain in the digital domain of the memory calculation chip when the memory calculation chip realizes the operation of the target neural network model.
2. The error compensation method for the in-memory computing chip according to claim 1, wherein the neural network model is a direct connection neural network model, a U-shaped neural network model, a residual structure network model, a recurrent neural network model, an MLP model, a Transformer network model, or an attention-mechanism neural network model.
3. The error compensation method for the in-memory computing chip according to claim 1, wherein the neural network model comprises: 1D convolution, 2D convolution, 3D convolution, fully connected, transposed convolution, channel-separable convolution, and dilated convolution neural network operators.
4. The method of claim 1, wherein the sample image is in a grayscale image format, a color transparency image format, a grayscale chrominance image format, or a hue saturation value image format.
5. The method of claim 1, wherein after comparing the calibration standard with the calculation result, generating an index value-compensation vector correspondence table corresponding to the current convolutional layer according to the index matrix and the comparison result, comprises:
comparing the calibration standard with the calculation result;
and generating an index value-compensation vector corresponding relation table corresponding to the current convolutional layer according to the index matrix and the comparison result.
6. The method of claim 1, wherein the obtaining the corresponding index matrix according to the input feature map of the current convolutional layer of the corresponding sample image comprises:
if the current convolutional layer is the first convolutional layer, zero-padding the input feature map according to the stride and convolution kernel size of the current convolutional layer, in the manner of tensorflow's convolution operator padding;
if the current convolutional layer is not the first convolutional layer and the layer above it is a convolutional layer, zero-padding the average matrix of the previous convolutional layer according to the stride and convolution kernel size of the current convolutional layer, in the manner of tensorflow's convolution operator padding;
if the current convolutional layer is not the first convolutional layer and the layer above it is not a convolutional layer, zero-padding the pre-mask matrix of the last non-convolutional layer according to the stride and convolution kernel size of the current convolutional layer, in the manner of tensorflow's convolution operator padding;
traversing the image subjected to zero padding pixel by pixel, and calculating the average value of a preset area around each pixel point by channels to obtain an average value matrix of the current convolutional layer;
and calculating an index matrix of the current convolutional layer according to the average value matrix.
7. The method of claim 5, wherein if the step size of the next convolutional layer is equal to 1 and the convolutional kernel size of the next convolutional layer is the same as that of the current convolutional layer, the index matrix of the next convolutional layer is the same as that of the current convolutional layer.
8. The method of claim 6, wherein after comparing the calibration standard with the calculation result, generating a corresponding index value-compensation vector correspondence table according to the index matrix and the comparison result comprises:
and after comparing the calibration standard with the calculation result, classifying and averaging the difference values according to the index matrix to obtain a corresponding index value-compensation vector corresponding relation table.
9. An error compensation method for an in-memory computing chip, wherein the in-memory computing chip is used for implementing a target neural network model operation, and comprises: a digital domain and an analog domain, the error compensation method being applied in the digital domain, the error compensation method comprising:
acquiring a corresponding index matrix according to the input characteristic diagram of the current convolutional layer;
according to the index matrix, obtaining a compensation vector corresponding to each pixel of the input characteristic diagram based on a pre-acquired index value-compensation vector corresponding relation table of the current convolutional layer;
obtaining a calculation result corresponding to the input characteristic diagram and output by a flash memory unit array corresponding to the current convolutional layer in the analog domain;
and compensating the calculation result according to the compensation vector.
10. The error compensation method for the in-memory computing chip according to claim 9, wherein the neural network model is a direct-connection neural network model, a U-shaped neural network model, a residual-structure network model, a recurrent neural network model, an MLP model, a Transformer network model, or an attention-mechanism neural network model.
11. The error compensation method for the in-memory computing chip according to claim 9, wherein the neural network model comprises: 1D convolution, 2D convolution, 3D convolution, fully connected, transposed convolution, channel-separable convolution, and dilated convolution neural network operators.
12. The method of claim 9, wherein the input feature map is in a grayscale map format, a color transparency map format, a grayscale chroma image format, or a hue saturation value image format.
13. The method of claim 9, wherein the obtaining a corresponding index matrix according to the input feature map of the current convolutional layer comprises:
if the current convolutional layer is the first convolutional layer, zero-padding the input feature map according to the stride and convolution kernel size of the current convolutional layer, in the manner of tensorflow's convolution operator padding;
if the current convolutional layer is not the first convolutional layer and the layer above it is a convolutional layer, zero-padding the average matrix of the previous convolutional layer according to the stride and convolution kernel size of the current convolutional layer, in the manner of tensorflow's convolution operator padding;
if the current convolutional layer is not the first convolutional layer and the layer above it is not a convolutional layer, zero-padding the pre-mask matrix of the last non-convolutional layer according to the stride and convolution kernel size of the current convolutional layer, in the manner of tensorflow's convolution operator padding;
traversing the image subjected to zero padding pixel by pixel, and calculating the average value of the surrounding area of each pixel point by channels to obtain the average value matrix of the current convolution layer;
and calculating an index matrix of the current convolutional layer according to the average value matrix.
14. The method of claim 13, further comprising:
and transmitting the compensated calculation result, as the input feature map of the next convolutional layer of the target neural network model, to the flash memory cell array corresponding to the next convolutional layer in the analog domain for calculation.
15. An in-memory computing chip, comprising: a digital domain and an analog domain, the analog domain being configured to perform a matrix multiply-add operation of a target neural network model, the digital domain being configured to perform the error compensation method of any one of claims 9 to 14.
16. An error compensation apparatus for a memory computing chip, comprising:
the calibration standard acquisition module is used for inputting the sample image into the target neural network model and taking the characteristic diagram output by each convolutional layer as the calibration standard of the convolutional layer;
the index matrix acquisition module is used for acquiring a corresponding index matrix according to the input characteristic diagram of the current convolutional layer corresponding to the sample image;
the in-memory calculation result acquisition module is used for acquiring a calculation result of a current convolutional layer corresponding to the input feature map and output by an in-memory calculation chip, wherein the in-memory calculation chip is used for realizing target neural network model operation, and the input feature map is input into a flash memory unit array corresponding to the current convolutional layer in an analog domain of the in-memory calculation chip for operation to obtain the calculation result;
and the compensation vector table generating module is used for generating an index value-compensation vector corresponding relation table corresponding to the current convolutional layer according to the index matrix and the comparison result after comparing the calibration standard with the calculation result, and the index value-compensation vector corresponding relation table is used for compensating the calculation result output by the analog domain in the digital domain of the memory calculation chip when the memory calculation chip realizes the operation of the target neural network model.
17. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the method for error compensation of an in-memory computing chip according to any one of claims 1 to 8 are implemented when the program is executed by the processor.
18. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method for error compensation of an in-memory computing chip of any one of claims 1 to 8.
CN202210357661.4A 2022-04-07 2022-04-07 Error compensation method, device, chip and equipment for in-memory computing chip Active CN114723044B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210357661.4A CN114723044B (en) 2022-04-07 2022-04-07 Error compensation method, device, chip and equipment for in-memory computing chip

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210357661.4A CN114723044B (en) 2022-04-07 2022-04-07 Error compensation method, device, chip and equipment for in-memory computing chip

Publications (2)

Publication Number Publication Date
CN114723044A true CN114723044A (en) 2022-07-08
CN114723044B CN114723044B (en) 2023-04-25

Family

ID=82242128

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210357661.4A Active CN114723044B (en) 2022-04-07 2022-04-07 Error compensation method, device, chip and equipment for in-memory computing chip

Country Status (1)

Country Link
CN (1) CN114723044B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200152179A1 (en) * 2018-11-14 2020-05-14 Sri International Time-frequency convolutional neural network with bottleneck architecture for query-by-example processing
CN111200485A (en) * 2018-11-16 2020-05-26 中兴通讯股份有限公司 Method and device for extracting broadband error calibration parameters and computer readable storage medium
CN110070503A (en) * 2019-04-05 2019-07-30 马浩鑫 Scale calibration method, system and medium based on convolutional neural networks
CN111950718A (en) * 2019-05-16 2020-11-17 北京知存科技有限公司 Method for realizing progressive CNN operation by using storage and computation integrated chip
CN114026572A (en) * 2019-06-26 2022-02-08 ams国际有限公司 Error compensation in analog neural networks
CN111754439A (en) * 2020-06-28 2020-10-09 北京百度网讯科技有限公司 Image processing method, device, equipment and storage medium
CN111814973A (en) * 2020-07-18 2020-10-23 福州大学 Memory computing system suitable for neural ordinary differential equation network computing
CN113222107A (en) * 2021-03-09 2021-08-06 北京大学 Data processing method, device, equipment and storage medium

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
TING-WU CHIN ET AL: "Layer-compensated pruning for resource-constrained convolutional neural networks", Computer Vision and Pattern Recognition *
YIN Wenfeng et al.: "Research progress on convolutional neural network compression and acceleration techniques", Computer Systems & Applications *
XU Weimin et al.: "Design of a convolution computing unit based on NOR Flash", Information Technology and Network Security *
XU Xin et al.: "A highly parallel design method for convolutional neural network accelerators", Journal of Harbin Institute of Technology *
LI Changming et al.: "Disease image segmentation algorithm fusing local linearity and convolutional networks", Computer Simulation *
TAN Taizhe et al.: "Traffic sign recognition using convolutional neural networks and RPN", Computer Engineering and Applications *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024032220A1 (en) * 2022-08-10 2024-02-15 华为技术有限公司 In-memory computing circuit-based neural network compensation method, apparatus and circuit
CN117539500A (en) * 2024-01-09 2024-02-09 南京大学 In-memory computing system optimizing deployment framework based on error calibration and working method thereof
CN117539500B (en) * 2024-01-09 2024-04-30 南京大学 In-memory computing system optimizing deployment framework based on error calibration and working method thereof

Also Published As

Publication number Publication date
CN114723044B (en) 2023-04-25

Similar Documents

Publication Publication Date Title
CN114723044B (en) Error compensation method, device, chip and equipment for in-memory computing chip
US20190347828A1 (en) Target detection method, system, and non-volatile storage medium
US9984445B2 (en) Tone mapping
US8452094B2 (en) Real-time image generator
CN111523521A (en) Remote sensing image classification method for double-branch fusion multi-scale attention neural network
US9569822B2 (en) Removing noise from an image via efficient patch distance computations
US8855411B2 (en) Opacity measurement using a global pixel set
US7298899B2 (en) Image segmentation method, image segmentation apparatus, image processing method, and image processing apparatus
Parihar et al. Fusion‐based simultaneous estimation of reflectance and illumination for low‐light image enhancement
CN111489322B (en) Method and device for adding sky filter to static picture
CN111291826A (en) Multi-source remote sensing image pixel-by-pixel classification method based on correlation fusion network
CN114444679A (en) Method and system for quantizing binarization input model and computer readable storage medium
CN110428473B (en) Color image graying method of confrontation generation network based on auxiliary variable
Zhou et al. A two‐stage hue‐preserving and saturation improvement color image enhancement algorithm without gamut problem
CN112926646B (en) Data batch normalization method, computing device, and computer-readable storage medium
CN115205168A (en) Image processing method, device, electronic equipment, storage medium and product
US20220318950A1 (en) Video enhancement method and apparatus, and electronic device and storage medium
JP2863072B2 (en) Image gradation conversion method and image processing device
CN111784558A (en) Image processing method and device, electronic equipment and computer storage medium
CN115019022B (en) Contour detection method based on double-depth fusion network
US20240185570A1 (en) Undecimated image processing method and device
CN113763496B (en) Method, apparatus and computer readable storage medium for image coloring
US20230008856A1 (en) Neural network facilitating fixed-point emulation of floating-point computation
CN117876394A (en) Image processing method, electronic device and storage medium
CN116127118A (en) Method, device, electronic equipment and storage medium for searching similar images

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant