CN114819097A - Model quantization method, electronic device, medium, and program product - Google Patents


Info

Publication number: CN114819097A (application number CN202210523024.XA)
Authority: CN (China)
Prior art keywords: operator, quantized, data, coefficient, quantization
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Other languages: Chinese (zh)
Inventors: 章小龙 (Zhang Xiaolong), 武大伟 (Wu Dawei)
Current and original assignee: ARM Technology China Co Ltd (the listed assignee may be inaccurate)
Application filed by ARM Technology China Co Ltd

Classifications

    • G06N3/044 — Recurrent networks, e.g. Hopfield networks
    • G06N3/045 — Combinations of networks
    • G06N3/065 — Analogue means (physical realisation of neural networks using electronic means)
    • G06N3/082 — Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections


Abstract

The present application relates to the field of machine learning technologies, and in particular, to a model quantization method, an electronic device, a medium, and a program product. The model comprises a first operation, a second operation, and a third operation, where the third operation is an associated operation of the first operation and the second operation. The method comprises: acquiring a first quantization coefficient and first quantized data of the first operation, and a second quantization coefficient and second quantized data of the second operation; determining the first quantization coefficient or the second quantization coefficient as a common quantization coefficient based on the relative magnitudes of the two coefficients; quantizing the first quantized data or the second quantized data based on the common quantization coefficient; and obtaining the operation result of the third operation based on the quantization result. In this way, both the accuracy of the model and the speed at which the electronic device runs the model can be improved.

Description

Model quantization method, electronic device, medium, and program product
Technical Field
The present application relates to the field of machine learning technologies, and in particular, to a model quantization method, an electronic device, a medium, and a program product.
Background
With the rapid development of artificial intelligence (AI) technology, neural networks (e.g., deep neural networks) have in recent years been widely used in fields such as computer vision, speech, natural language processing, and reinforcement learning. However, as neural network algorithms grow more complex and model structures become larger, the computational-resource and memory requirements of devices that run neural network models rise accordingly.
Therefore, when a neural network model is deployed on an electronic device with limited computational and storage resources (an embedded electronic device such as a mobile phone), the model generally needs to be quantized to reduce its memory footprint and increase data processing speed. However, quantizing a neural network model converts the weight matrix data and input data from a high-precision order of magnitude to a low-precision order of magnitude, which introduces a precision loss in the operation. In addition, each operator in current neural networks generally has multiple sub-operators that are quantized independently; because the input data and weight matrix data that determine each sub-operator's quantization coefficient differ, the quantization coefficients of some sub-operators may be inconsistent, and when the quantization coefficients of sub-operators participating in the same operation are inconsistent, a greater precision loss occurs.
Disclosure of Invention
The purpose of the present application is to provide a model quantization method, an electronic device, and a medium.
A first aspect of the present application provides a model quantization method applied to an electronic device, where the model is a GRU model or an LSTM model and comprises a first operation, a second operation, and a third operation, the third operation being at least an associated operation of the first operation and the second operation. The method comprises: acquiring a first quantization coefficient and first quantized data of the first operation, and a second quantization coefficient and second quantized data of the second operation; determining a common quantization coefficient based on the relative magnitudes of the first quantization coefficient and the second quantization coefficient; quantizing the first quantized data or the second quantized data based on the common quantization coefficient; and obtaining the operation result of the third operation based on the quantization result.
The method provided by the embodiments of the present application can improve the operation precision of the third operation. In some embodiments, the first operation may be the operation corresponding to the first operator described below, the second operation may be the operation corresponding to the second operator described below, and the third operation may be the operation corresponding to the third operator described below.
In a possible implementation of the first aspect, determining a common quantization coefficient based on the magnitude relationship between the first quantization coefficient and the second quantization coefficient includes: taking the larger of the first quantization coefficient and the second quantization coefficient as the common quantization coefficient.
In this embodiment of the application, the common quantization coefficient is one of the first and second quantization coefficients, so the electronic device only needs to requantize either the first quantized data or the second quantized data. This reduces how often the electronic device, while running the model, must inversely quantize quantized data into floating-point data and then requantize the floating-point data into quantized data, thereby increasing the speed at which the electronic device runs the model.
In a possible implementation of the first aspect, the quantizing the first quantized data or the second quantized data based on the common quantized coefficient includes: quantizing the first quantized data of the first operation into third quantized data based on the second quantized coefficient in a case where the common quantized coefficient is the second quantized coefficient; in a case where the common quantized coefficient is the first quantized coefficient, the second quantized data of the second operation is quantized into fourth quantized data based on the first quantized coefficient.
In one possible implementation of the first aspect, the quantizing the first quantized data of the first operation into third quantized data based on the second quantization coefficient includes: and inversely quantizing the first quantized data into corresponding first floating point data according to the first quantized coefficient, and quantizing the first floating point data into third quantized data according to the second quantized coefficient.
In one possible implementation of the first aspect, the quantizing the second quantized data of the second operation into fourth quantized data based on the first quantization coefficient includes: and inversely quantizing the second quantized data into corresponding second floating point data according to the second quantized coefficient, and quantizing the second floating point data into fourth quantized data according to the first quantized coefficient.
In a possible implementation of the first aspect, the obtaining an operation result of the third operation based on the quantization result includes: under the condition that the common quantization coefficient is a second quantization coefficient, obtaining an operation result of a third operation according to third quantization data, the second quantization data and the second quantization coefficient; and obtaining an operation result of a third operation according to the first quantized data, the fourth quantized data and the first quantized coefficient under the condition that the common quantized coefficient is the first quantized coefficient.
In one possible implementation of the first aspect, the third operation is an associated operation of the first operation and the second operation in the following sense: the first quantized data is output data of the first operation, the second quantized data is output data of the second operation, and the input data of the third operation includes the first quantized data and the second quantized data.
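Under the quantization conventions given by formulas (11) to (13) later in this description (Q = round(F / S), F = Q × S), the implementation of the first aspect can be sketched in Python as follows; the helper names (`requantize`, `add_with_common_coefficient`) are illustrative and do not come from the application:

```python
import numpy as np

def requantize(q, s_from, s_to):
    """Inversely quantize with the old coefficient, then quantize with the new one."""
    return np.round(q * s_from / s_to).astype(np.int32)

def add_with_common_coefficient(q1, s1, q2, s2):
    """Take the larger quantization coefficient as the common coefficient,
    requantize the other operand, and add in the quantized domain."""
    s_common = max(s1, s2)
    if s_common == s2:
        q1 = requantize(q1, s1, s2)  # third quantized data
    else:
        q2 = requantize(q2, s2, s1)  # fourth quantized data
    return q1 + q2, s_common

# Operand 1: quantized value 20 with coefficient 0.01; operand 2: 100 with 0.005
q_sum, s = add_with_common_coefficient(np.array([20]), 0.01, np.array([100]), 0.005)
print(q_sum * s)  # dequantized result, approximately [0.7]
```

Since the common coefficient is always one of the two existing coefficients, only one operand needs the dequantize/requantize round trip, which matches the speed argument made above.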
In a second aspect, the present application provides a model quantization apparatus, comprising:
a common-quantization-coefficient determining module, configured to acquire a first quantization coefficient and first quantized data of a first operation and a second quantization coefficient and second quantized data of a second operation, and to determine a common quantization coefficient based on the relative magnitudes of the first and second quantization coefficients; a data quantization module, configured to quantize the first quantized data or the second quantized data based on the common quantization coefficient to obtain a quantization result; and an operation module, configured to obtain the operation result of a third operation based on the quantization result.
In a third aspect, the present application provides an electronic device comprising: a memory for storing instructions to be executed by one or more processors of the electronic device; and a processor, being one of the one or more processors of the electronic device, for executing the instructions to cause the electronic device to implement any of the model quantization methods provided by the first aspect and its various possible implementations.
In a fourth aspect, embodiments of the present application provide a readable medium storing instructions that, when executed by an electronic device, cause the electronic device to implement any of the model quantization methods provided by the first aspect and its various possible implementations.
In a fifth aspect, embodiments of the present application provide a computer program product comprising instructions that, when executed by an electronic device, cause the electronic device to implement any of the model quantization methods provided by the first aspect and its various possible implementations.
Drawings
FIG. 1A illustrates an application diagram of a GRU model, according to some embodiments of the present application;
FIG. 1B illustrates a structural schematic diagram of a GRU model, according to some embodiments of the present application;
FIG. 2 illustrates a schematic structural diagram of an LSTM model, according to some embodiments of the present application;
FIG. 3 illustrates a schematic diagram of a model quantization scenario, according to some embodiments of the present application;
FIG. 4 illustrates a flow diagram of a GRU model quantization method, according to some embodiments of the present application;
FIG. 5 illustrates a flow diagram of an LSTM model quantization method, according to some embodiments of the present application;
FIG. 6 illustrates a schematic diagram of a model quantization apparatus, according to some embodiments of the present application;
FIG. 7 illustrates a block diagram of an electronic device 10, according to some embodiments of the present application;
FIG. 8 illustrates a block diagram of a SOC1100, according to some embodiments of the present application.
Detailed Description
The illustrative embodiments of the present application include, but are not limited to, a model quantization method, apparatus, electronic device, medium, and computer program product. Embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Since the present application relates to a neural network model and a model quantization, some contents related to the present application are explained below in order to more clearly illustrate the scheme of the present application.
(1) Neural network model
The neural network model is a complex network system formed by a large number of widely interconnected processing units (called neurons); it is a core technique of artificial intelligence and a branch of that field. Neural network models have a very wide range of applications, for example: data mining, data classification, computer vision, natural language processing (NLP), biometric recognition, search engines, medical diagnosis, stock market analysis, DNA sequencing, speech and handwriting recognition, strategic games, and robotics. Neural network models include, but are not limited to, convolutional neural network models, recurrent neural network models, deep neural network models, and the like.
(2) Convolutional Neural Network (CNN)
A convolutional neural network is a multilayer neural network in which each layer is composed of several two-dimensional planes and each plane of several independent neurons; the neurons of each plane share weights, and weight sharing reduces the number of parameters in the network. Currently, in a convolutional neural network, the convolution operation performed by a processor usually converts the convolution of the input signal features with the weights into a matrix multiplication between a feature-map matrix and a weight-coefficient matrix.
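That conversion can be illustrated with a minimal im2col-style sketch (single channel, stride 1, no padding; the function name and shapes are illustrative, and, as is common in deep learning, "convolution" here computes a cross-correlation):

```python
import numpy as np

def im2col_matmul_conv(x, w):
    """2-D convolution (valid, stride 1) expressed as a matrix multiplication:
    unfold the input patches into rows of a feature-map matrix,
    flatten the kernel into a weight-coefficient vector."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    patches = np.array([x[i:i + kh, j:j + kw].ravel()
                        for i in range(oh) for j in range(ow)])  # feature-map matrix
    return (patches @ w.ravel()).reshape(oh, ow)                 # matmul, then reshape

x = np.arange(16.0).reshape(4, 4)
w = np.ones((2, 2))
y = im2col_matmul_conv(x, w)  # each output element is the sum of a 2x2 window
```

With an all-ones 2×2 kernel, `y[0, 0]` is `0 + 1 + 4 + 5 = 10`, which makes the patch-unfolding easy to verify by hand.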
(3) Recurrent Neural Network (RNN) model
A recurrent neural network takes sequence data as input (e.g., a piece of voice data), recurses in the evolution direction of the sequence, and connects all nodes (recurrent units) in a chain. The RNN model can be applied to fields such as speech recognition, language modeling, and machine translation, as well as to various kinds of time-series prediction.
The RNN models include at least one of: the Bidirectional Recurrent Neural Network (Bi-RNN) model, the Long Short-Term Memory (LSTM) model, and the Gated Recurrent Unit (GRU) model.
(4) Gated Recurrent Unit (GRU) model
The GRU model is one of the recurrent neural network (RNN) models. Like the Long Short-Term Memory (LSTM) model, it was proposed to solve the problems of long-term memory and of gradients in backpropagation.
Fig. 1A shows an application diagram of a GRU model 100. As shown in fig. 1A, the input data of the GRU model 100 is the speech to be recognized, and the output data of the GRU model 100 is the text corresponding to the speech to be recognized.
In other embodiments, the GRU model 100 may also be used for text translation between languages; for example, the input data of the GRU model 100 may be Chinese text and the output data the corresponding English text. The GRU model 100 may also be used for image classification; for example, the input data may be multiple frames of images and the output data the image type of each frame. It is to be understood that the GRU model 100 is primarily used for processing and predicting sequence data, and the present application does not specifically limit the tasks to which the GRU model 100 is applied.
Fig. 1B shows a schematic structural diagram of a GRU model 100. As shown in Fig. 1B, the GRU model 100 includes n GRU networks, namely GRU1, GRU2, …, GRUt-1, GRUt, …, GRUn, where GRU1 denotes the GRU network at the 1st time step, GRU2 the GRU network at the 2nd time step, GRUt-1 the GRU network at the (t-1)-th time step, GRUt the GRU network at the t-th time step, and GRUn the GRU network at the n-th time step. A time step describes the temporal order of the input data.
For example, as shown in FIG. 1B, {x_1, x_2, …, x_{t-1}, x_t, …, x_n} is the voice data to be recognized, where x_1 is the speech input data of the GRU network (GRU1) at the 1st time step, x_2 the speech input data of the GRU network (GRU2) at the 2nd time step, x_t the input data of the GRU network (GRUt) at the t-th time step, …, and x_n the speech input data of the GRU network (GRUn) at the n-th time step. The output data {h_1, h_2, …, h_{t-1}, h_t, …, h_n} of the GRU model 100 can be the text corresponding to the speech to be recognized, where h_1 is the output data of the GRU network (GRU1) at the 1st time step, h_2 the output data of the GRU network (GRU2) at the 2nd time step, h_t the output data of the GRU network (GRUt) at the t-th time step, …, and h_n the output data of the GRU network (GRUn) at the n-th time step.
It is to be understood that the input data or the output data of the GRU network at each time step may be a matrix, a tensor, a vector, etc., and for convenience of description, in the following description of the embodiments, the input data or the output data of the GRU network is taken as an example of a matrix, and data processing related to input or output of each layer of the neural network is described. It will be appreciated that since the GRU model 100 is primarily used to process and predict sequence data, the GRU network at the current time step needs to combine the output data at the previous time step, process the input data at the current time step and generate the output data at the current time step.
For example, in the calculation of the GRU network at the t-th time step of the GRU model 100, the output data h_t of the GRU network (GRUt) at the t-th time step may be determined from the input data x_t of the GRU network (GRUt) at the t-th time step, the output data h_{t-1} of the GRU network (GRUt-1) at the (t-1)-th time step, the weight coefficient matrices, the error coefficient matrices, and so on. Specifically, the output data h_t of the GRU network (GRUt) at the t-th time step can be calculated by the following formulas:

z_t = σ(x_t × W_xz + h_{t-1} × W_hz + b_z)    (1)
r_t = σ(x_t × W_xr + h_{t-1} × W_hr + b_r)    (2)
c_t = tanh(x_t × W_x + (h_{t-1} · r_t) × W_h + b_c)    (3)
h_t = (1 - z_t) · c_t + z_t · h_{t-1}    (4)

In formulas (1) to (4), W_xz, W_hz, W_xr, W_hr, W_x, and W_h denote the weight coefficient matrices of GRUt, and b_z, b_r, and b_c denote the error coefficient matrices of GRUt; x_t denotes the input data of the t-th time step and h_{t-1} the output data of the (t-1)-th time step. σ denotes the sigmoid function, calculated as σ(x) = 1 / (1 + e^(-x)), and tanh denotes the hyperbolic tangent function, calculated as tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x)). z_t denotes the gate value of the update gate at the t-th time step, r_t the gate value of the reset gate at the t-th time step, and c_t the intermediate output result of the t-th time step. "·" denotes element-wise multiplication of the entries with the same row and column indices in two matrices, and "×" denotes matrix multiplication.
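Formulas (1) to (4) can be sketched as a floating-point reference implementation in NumPy; the dimensions and the random weights below are illustrative only:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, W_xz, W_hz, b_z, W_xr, W_hr, b_r, W_x, W_h, b_c):
    """One GRU time step following formulas (1)-(4)."""
    z_t = sigmoid(x_t @ W_xz + h_prev @ W_hz + b_z)        # update gate, (1)
    r_t = sigmoid(x_t @ W_xr + h_prev @ W_hr + b_r)        # reset gate, (2)
    c_t = np.tanh(x_t @ W_x + (h_prev * r_t) @ W_h + b_c)  # intermediate output, (3)
    return (1.0 - z_t) * c_t + z_t * h_prev                # output h_t, (4)

rng = np.random.default_rng(0)
d_in, d_h = 4, 3
x_t, h_prev = rng.standard_normal((1, d_in)), np.zeros((1, d_h))
params = [rng.standard_normal(s) for s in
          [(d_in, d_h), (d_h, d_h), (d_h,), (d_in, d_h), (d_h, d_h), (d_h,),
           (d_in, d_h), (d_h, d_h), (d_h,)]]
h_t = gru_step(x_t, h_prev, *params)
```

Because z_t lies in (0, 1) and c_t in (-1, 1), the output h_t stays bounded, which is why the quantization ranges of such operators can be estimated in advance.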
(5) Long Short-Term Memory (LSTM) model
The LSTM model is a special RNN model. In the LSTM model 200, the states of the neural network layers and the outputs of the hidden layers may be transferred between the neural network layers; that is, the output of each hidden layer is related to the input of the input layer, the output of the previous neural network layer, and the state of the previous neural network layer. The neural network layer of the hidden layer can filter the input data through a gate structure, forgetting what is unimportant and selecting what is important. Correspondingly, the gate structure comprises a forget gate, an input (selection) gate, and an output gate. Therefore, LSTM can perform better on longer sequences than an ordinary RNN.
FIG. 2 shows a schematic diagram of an LSTM model 200. The LSTM model 200 includes an input layer 110, hidden layers 120-1 through 120-t (hereinafter referred to as neural network layers 120-1 through 120-t), and an output layer 130. The data input by the input layer 110 comprises x_1, x_2, x_3, …, x_t, and the data output by the output layer 130 comprises h_1, h_2, h_3, …, h_t. The state c and the output h of each neural network layer 120 may be passed between the neural network layers 120, e.g., states c_1, c_2 and outputs h_1, h_2.
For example, during the operation of the neural network layer 120-t of the LSTM model 200, the data h_t output by the output layer 130 may be determined from the input data x_t of the input layer 110, the data h_{t-1} previously output by the output layer 130, the state C_{t-1} of the neural network layer 120-(t-1), the weight coefficient matrices, the error coefficient matrices, and so on. Specifically, the data h_t output by the output layer 130 can be calculated by the following formulas:

i_t = σ(W_xi × x_t + W_hi × h_{t-1} + b_i)    (5)
g_t = ψ(W_xg × x_t + W_hg × h_{t-1} + b_g)    (6)
f_t = σ(W_xf × x_t + W_hf × h_{t-1} + b_f)    (7)
o_t = σ(W_xo × x_t + W_ho × h_{t-1} + b_o)    (8)
C_t = f_t * C_{t-1} + i_t * g_t    (9)
h_t = o_t * ψ(C_t)    (10)

In formulas (5) to (10), W_xi, W_hi, W_xg, W_hg, W_xf, W_hf, W_xo, and W_ho denote the weight coefficient matrices of the neural network layer 120-t, and b_i, b_g, b_f, and b_o denote its error coefficient matrices; x_t denotes the input data of the input layer 110 and h_{t-1} the data previously output by the output layer 130. σ denotes the sigmoid function, calculated as σ(x) = 1 / (1 + e^(-x)), and ψ denotes the tanh function, calculated as ψ(x) = (e^x - e^(-x)) / (e^x + e^(-x)). i_t denotes the gate value of the input gate at the t-th time step, g_t the candidate state, f_t the gate value of the forget gate, o_t the gate value of the output gate, and C_{t-1} the state of the neural network layer 120-(t-1). "×" denotes matrix multiplication, and "*" denotes the element-wise (dot product) multiplication of the matrices.
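Formulas (5) to (10) can likewise be sketched in NumPy; the dimensions, the dictionary layout of the weights, and the random initialization are illustrative only:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following formulas (5)-(10).
    W maps gate name -> (input weight, recurrent weight); b maps gate name -> bias."""
    i_t = sigmoid(x_t @ W['i'][0] + h_prev @ W['i'][1] + b['i'])  # input gate, (5)
    g_t = np.tanh(x_t @ W['g'][0] + h_prev @ W['g'][1] + b['g'])  # candidate, (6)
    f_t = sigmoid(x_t @ W['f'][0] + h_prev @ W['f'][1] + b['f'])  # forget gate, (7)
    o_t = sigmoid(x_t @ W['o'][0] + h_prev @ W['o'][1] + b['o'])  # output gate, (8)
    c_t = f_t * c_prev + i_t * g_t                                # cell state, (9)
    h_t = o_t * np.tanh(c_t)                                      # output, (10)
    return h_t, c_t

rng = np.random.default_rng(1)
d_in, d_h = 4, 3
W = {k: (rng.standard_normal((d_in, d_h)), rng.standard_normal((d_h, d_h)))
     for k in 'igfo'}
b = {k: np.zeros(d_h) for k in 'igfo'}
h_t, c_t = lstm_step(rng.standard_normal((1, d_in)), np.zeros((1, d_h)),
                     np.zeros((1, d_h)), W, b)
```

The state c_t and the output h_t returned here are exactly the quantities that, per FIG. 2, are passed on to the neural network layer of the next time step.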
(6) Data quantization (i.e., converting floating-point data to quantized data)
May refer to converting a large data type number (e.g., a 32-bit floating point number) to a smaller data type number (e.g., 8-bit symmetrically quantized data or asymmetrically quantized data). For example, a data quantization formula for converting floating point numbers into quantized data is as follows:
the quantized data Q can be calculated by the following formula:
Figure BDA0003642692350000061
Figure BDA0003642692350000062
wherein, F max Representing the maximum value of the floating point number. F min Representing the minimum value of a floating point number. Q max Representing the maximum value of the quantized data. Q min Representing the minimum value of the quantized data. F represents a floating point number. S denotes a floating-point-quantized range conversion coefficient (i.e., a quantized coefficient).
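A sketch of formulas (11) and (12) for 8-bit quantization; the floating-point range endpoints chosen here are illustrative and happen to give S = 0.01, the coefficient used in the numeric example later in this description:

```python
import numpy as np

def quantize(f, f_min, f_max, q_min=-128, q_max=127):
    """Quantize floating-point data per formulas (11)-(12):
    S = (F_max - F_min) / (Q_max - Q_min), Q = round(F / S)."""
    s = (f_max - f_min) / (q_max - q_min)
    q = np.clip(np.round(f / s), q_min, q_max).astype(np.int8)
    return q, s

# Range [-1.28, 1.27] over int8 gives S = 2.55 / 255 = 0.01
q, s = quantize(np.array([0.2, 0.5, -0.1]), f_min=-1.28, f_max=1.27)
# q is [20, 50, -10]
```

The clip keeps out-of-range values representable in the target integer type; the rounding step is where the precision loss discussed in the Background originates.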
(7) Data inverse quantization (i.e., converting quantized data to floating-point data)
May refer to converting a number of a smaller data type (e.g., 8-bit symmetrically quantized data or asymmetrically quantized data) to a number of a large data type (e.g., 32-bit floating point number). For example, the data dequantization formula for converting quantized data to floating point numbers is as follows:
the floating point number F may be calculated by the following formula:
F=Q*S (13)
where Q represents quantized data. S denotes a floating-point-quantized range conversion coefficient (i.e., a quantized coefficient).
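Formula (13) in code, continuing the same conventions (the sample values are illustrative):

```python
import numpy as np

def dequantize(q, s):
    """Inverse quantization per formula (13): F = Q * S."""
    return q.astype(np.float64) * s

f = dequantize(np.array([20, 50, -10], dtype=np.int8), 0.01)
# f approximates the original floating-point values [0.2, 0.5, -0.1]
```

Note that dequantization only recovers the original values up to the rounding error of formula (12), i.e., to within half a quantization step S.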
(8) Operator
The neural network model is composed of individual computing units called operators (Op for short). During the operation of the neural network model, an operator corresponds to the computational logic of a neural network layer. By operation type, the operators in a neural network model can be divided into linear operators and non-linear operators.
Non-linear operator: a non-linear function, also called a non-linear mapping; an operator that does not satisfy the linearity condition.
Linear operator: an operator that is a linear function and satisfies the linearity condition.
Taking the t-th GRU network of the GRU model 100 as an example, in formula (1), x_t × W_xz + h_{t-1} × W_hz + b_z can be an operator, and it is a linear operator; x_t × W_xz is also a linear operator, and h_{t-1} × W_hz + b_z is also a linear operator. σ(x_t × W_xz + h_{t-1} × W_hz + b_z) is a non-linear operator: σ(·) is a non-linear operator, while its argument x_t × W_xz + h_{t-1} × W_hz + b_z is a linear operator.
(9) Model quantization
The model quantization is a process of converting weight matrix data, input data, output data, and the like of each node in the neural network model from a high-precision quantization level to a low-precision quantization level and performing an operation, for example, converting the weight matrix data, the input data, the output data, and the like from a 32-bit single-precision floating point number (FP32) to 8-bit integer data (INT 8).
Taking the model quantization of the GRU model 100 as an example, the GRU model 100 quantizes input data to be input to the GRU model 100, weight matrix data relating to the GRU model 100, data output from the GRU model 100, and the like, and for example, converts input data, weight matrix data, and output data of the GRU model 100 from 32-bit single-precision floating point number (FP32) to 8-bit integer data (INT8) before the GRU model 100 is operated.
As mentioned above, the quantized neural network model may suffer a loss of precision in operation. For example, since the weight matrix data, input data, and so on in the GRU model 100 are converted from a high-precision order of magnitude to a low-precision quantization level, a precision loss occurs in the operation. Moreover, each operator in a neural network generally has multiple sub-operators that are quantized independently; because the input data and weight matrix data that determine each sub-operator's quantization coefficient differ, the quantization coefficients of some sub-operators may be inconsistent. When the quantization coefficients of sub-operators participating in the same operation are inconsistent, a greater precision loss occurs, so the precision of the data finally output by the model is low.
Specifically, for example, in the operation of formula (1), when calculating the operator x_t × W_xz + h_{t-1} × W_hz, the quantization coefficient of the quantized data of the sub-operator x_t × W_xz is inconsistent with that of the sub-operator h_{t-1} × W_hz. Directly performing a matrix addition on the quantized data of x_t × W_xz and of h_{t-1} × W_hz therefore yields inaccurate quantized data for x_t × W_xz + h_{t-1} × W_hz, resulting in low accuracy of the quantized model.
For example, in the non-quantized model, the floating-point data of the operator x_t × W_xz in formula (1) is 0.2 and the floating-point data of the operator h_{t-1} × W_hz is 0.5. Performing a matrix addition on the floating-point data of the two sub-operators gives 0.7 as the floating-point data of x_t × W_xz + h_{t-1} × W_hz.
In the quantized model, the quantized data of the operator x_t × W_xz is 20 with quantization coefficient 0.01 (20 × 0.01 = 0.2), and the quantized data of the operator h_{t-1} × W_hz is 100 with quantization coefficient 0.005 (100 × 0.005 = 0.5). Performing a matrix addition directly on the quantized data of the two sub-operators gives 120 as the quantized data of x_t × W_xz + h_{t-1} × W_hz.
It is easy to see that either choice of coefficient is wrong: taking 0.01, the quantization coefficient of x_t × W_xz, as the quantization coefficient of x_t × W_xz + h_{t-1} × W_hz and inversely quantizing gives floating-point data 1.2; taking 0.005, the quantization coefficient of h_{t-1} × W_hz, gives floating-point data 0.6. Both deviate from the true floating-point value 0.7. The quantized data calculated for x_t × W_xz + h_{t-1} × W_hz thus has low operation precision, causing the precision loss of the quantized GRU model.
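The numeric example above can be checked directly (plain Python, with scalars standing in for the matrices):

```python
# Directly adding quantized data whose quantization coefficients differ
# produces a wrong result, whichever coefficient is used to dequantize.
x_w = 0.2   # floating-point value of x_t x W_xz
h_w = 0.5   # floating-point value of h_{t-1} x W_hz
s_x, s_h = 0.01, 0.005                         # mismatched quantization coefficients
q_x, q_h = round(x_w / s_x), round(h_w / s_h)  # 20 and 100
q_sum = q_x + q_h                              # naive quantized addition -> 120
print(q_sum * s_x, q_sum * s_h)  # about 1.2 and 0.6, but the true sum is 0.7
```

Both candidate dequantizations are far from 0.7, confirming that the naive addition is meaningless when the two operands live on different quantization scales.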
In order to solve the above problem, the present application provides a model quantization method, before performing an operation corresponding to a third operator on quantized data of a first operator and quantized data of a second operator, inverse quantization may be performed on first quantized data of the first operator and first quantized data of the second operator according to a quantization coefficient of the first operator and a quantization coefficient of the second operator, so as to obtain floating point data of the first operator and floating point data of the second operator.
Then, by comparing the magnitudes of the quantization coefficients of the first operator and the second operator, the larger of the two quantization coefficients is determined to serve as the common quantization coefficient of the first operator and the second operator, and the floating-point data of the first operator and the floating-point data of the second operator are quantized according to the common quantization coefficient to obtain the second quantized data of the first operator and the second quantized data of the second operator. Then, according to the second quantized data of the first operator and the second quantized data of the second operator, the operation of the third operator is executed to generate the quantized data of the third operator. The third operator is the corresponding operation of the first operator and the second operator having an association relation. For example, in the above formula (1), the first operator is x_t×W_xz, the second operator is h_{t-1}×W_hz, and the third operator is the addition of the first operator and the second operator, x_t×W_xz + h_{t-1}×W_hz.
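As a scalar sketch of the scheme just described (matrix operands reduced to scalars; the function name is illustrative):

```python
def add_with_common_coefficient(q1, s1, q2, s2):
    """Align both operands to the larger quantization coefficient, then add.

    q1, q2: first quantized data of the first and second operators.
    s1, s2: their quantization coefficients. Returns the third operator's
    quantized data and the common quantization coefficient.
    """
    common = max(s1, s2)        # larger coefficient becomes the common one
    f1, f2 = q1 * s1, q2 * s2   # inverse quantization to floating point
    rq1 = round(f1 / common)    # second quantized data of the first operator
    rq2 = round(f2 / common)    # second quantized data of the second operator
    return rq1 + rq2, common

q3, s3 = add_with_common_coefficient(20, 0.01, 100, 0.005)
# q3 * s3 now recovers the unquantized result 0.7
```

With the numbers from the worked example, the result is (70, 0.01), and 70 × 0.01 dequantizes to the correct 0.7.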
It can be understood that the third operator is a corresponding operation in which the first operator and the second operator have an association relationship, and means that an operation corresponding to the third operator uses an operation result corresponding to the first operator and an operation result corresponding to the second operator as input data, or an operation corresponding to the first operator and an operation corresponding to the second operator are sub-operations of an operation corresponding to the third operator. Furthermore, in some embodiments, when the output data of the first operation and the second operation is the input data of the third operation, or the first operation and the second operation are sub-operations of the third operation, the third operation may be referred to as an associated operation of the first operation and the second operation.
It is understood that the above-mentioned quantization coefficient of the first operator refers to the product of the quantization coefficient of the weight coefficient matrix of the first operator and the quantization coefficient of the input data matrix, and the above-mentioned quantization coefficient of the second operator refers to the product of the quantization coefficient of the weight coefficient matrix of the second operator and the quantization coefficient of the output data matrix of the previous time step.
Based on this scheme, the calculation result of the third operator is more accurate, and when the whole model adjusts the quantization coefficients of its operators in this way, the quantization precision of the neural network model can be effectively improved. In addition, because the data processed in a GRU model or an LSTM model is time-dependent, the first operator and the second operator would otherwise need to be re-quantized at every time step. With the method provided by the embodiments of the present application, only the data of one of the first operator and the second operator needs to be re-scaled, which reduces the number of times quantized data must be dequantized into floating-point data and then quantized again, and increases the speed at which the electronic device runs the GRU model or the LSTM model.
The model quantization method can be suitable for the quantized neural network model, avoids the precision loss of the quantized neural network model by optimizing the quantized neural network model, and improves the quantization precision of the quantized neural network model.
For example, in the GRU model that has undergone initial quantization, when calculating the operator x_t×W_xz + h_{t-1}×W_hz, the quantization coefficient 0.01 of the sub-operator x_t×W_xz (i.e., the first operator) and the quantization coefficient 0.005 of the sub-operator h_{t-1}×W_hz (i.e., the second operator) may be used to inverse-quantize the first quantized data 20 of x_t×W_xz and the first quantized data 100 of h_{t-1}×W_hz, generating the floating-point data 0.2 corresponding to the first quantized data of x_t×W_xz and the floating-point data 0.5 corresponding to the first quantized data of h_{t-1}×W_hz.
By comparing the quantization coefficient 0.01 of the operator x_t×W_xz with the quantization coefficient 0.005 of the operator h_{t-1}×W_hz, the larger of the two quantization coefficients may be taken as the common quantization coefficient. The floating-point data of x_t×W_xz and of h_{t-1}×W_hz are then quantized according to the common quantization coefficient 0.01, generating second quantized data 20 for x_t×W_xz and second quantized data 50 for h_{t-1}×W_hz. According to the second quantized data 20 of x_t×W_xz and the second quantized data 50 of h_{t-1}×W_hz, the quantized data 70 of the main operator x_t×W_xz + h_{t-1}×W_hz is generated. It can be seen that, taking the common quantization coefficient 0.01 as the quantization coefficient of the main operator x_t×W_xz + h_{t-1}×W_hz, inverse-quantizing its quantized data 70 generates floating-point data 0.7, which is consistent with the floating-point data 0.7 of the main operator x_t×W_xz + h_{t-1}×W_hz in the non-quantized model above. This shows that the model quantization method provided by the present application can improve the operation precision of the quantized model, and thus the accuracy of the output data of the quantized model.
The following describes the technical solution of model quantization in detail with reference to fig. 1 to 7.
FIG. 3 shows a schematic diagram of a model quantization scenario according to an embodiment of the present application. As shown in fig. 3, the scenario includes a first electronic device 10, a second electronic device 20, and a user A.
As shown in fig. 3, the first electronic device 10 may capture the to-be-recognized voice of the user a, wherein the to-be-recognized voice may be "small a, what is the weather today". The first electronic device 10 may input the speech to be recognized as input data to the quantized GRU model 100 by running the quantized GRU model 100, and output data of the quantized GRU model 100 is characters corresponding to the speech to be recognized.
As shown in fig. 3, the first electronic device 10 may receive quantized coefficients, model coefficients, etc. of the GRU model 100 transmitted by the second electronic device 20. The quantization coefficients, model coefficients, may be used for quantization by the GRU model 100. The quantization coefficient may be a quantization coefficient of a weight coefficient matrix, a quantization coefficient of an input data matrix, or the like. The model coefficients may be a weight coefficient matrix, an error coefficient matrix, or the like.
Specifically, for example, the second electronic device 20 may transmit floating-point data of the weight coefficient matrix, floating-point data of the error coefficient matrix, quantization coefficients of the weight coefficient matrix, quantization coefficients of the input data matrix, and the like in the GRU model 100 to the first electronic device 10. The second electronic device 20 may calculate, according to the above formula (12), the quantized data of the weight coefficient matrix and the quantized data of the error coefficient matrix from the floating-point data of the weight coefficient matrix, the floating-point data of the error coefficient matrix, the quantization coefficient of the weight coefficient matrix, and the like in the GRU model 100.
In some embodiments, when the first electronic device 10 quantizes the GRU model 100, in order to improve the accuracy of the quantized GRU model 100 run by the first electronic device 10, the first electronic device 10 not only converts the weight matrix data, input data, etc. input into the GRU model 100 from high-precision floating-point data into corresponding low-precision quantized data, but may also compare the quantization coefficient of the first operator in the GRU model 100 with the quantization coefficient of the second operator, take the larger of the two as a common quantization coefficient, and re-determine the quantized data of the first operator or the second operator in the GRU model 100 according to the common quantization coefficient, so that the quantization coefficient corresponding to the quantized data of the first operator in the GRU model 100 is consistent with the quantization coefficient corresponding to the quantized data of the second operator (that is, both are the common quantization coefficient). In this way, the quantized data of the third operator in the GRU model 100, calculated by the first electronic device 10 from the re-determined quantized data of the first operator and the second operator, is more accurate, and the quantization accuracy of the GRU model 100 is improved.
Therefore, it is understood that, by operating the GRU model 100 with higher quantization accuracy provided by the present application, the GRU model 100 with higher quantization accuracy takes the data of the speech to be recognized (for example, "small a, what weather is today") as input data, and performs an operation on the data of the speech to be recognized, so that the generated text corresponding to the speech to be recognized is more accurate.
The first electronic device 10 in the present application may include, but is not limited to: smart mobile phones, televisions, tablet computers, bracelets, Head Mounted Display (HMD), Augmented Reality (AR) devices, Mixed Reality (MR) devices, cellular phones (cellular phones), smart phones (smart phones), Personal Digital Assistants (PDAs), tablet computers, in-vehicle electronic devices, laptop computers (laptop computers), Personal Computers (PCs), monitoring devices, robots, in-vehicle terminals, autonomous vehicles, and the like. Of course, in the following embodiments, the specific form of the first electronic device 10 is not limited at all.
The second electronic device 20 of the present application may include, but is not limited to: laptop computers, handheld computers, notebook computers, desktop computers, ultra-mobile personal computers (UMPCs), servers, and the like. As an example, the server may be a cloud server, which may be a hardware server, or may be embedded in a virtualization environment, for example, the server may be a virtual machine executing on a hardware server including one or more other virtual machines. In addition, the server can also be an independent server, the independent server has all software and hardware resources of the whole server, and can automatically distribute and implement various services, such as generation of quantization coefficients and model coefficients of the neural network model of the application.
It is understood that the neural network model referred to in the technical solution of the present application may be a recurrent neural network model, for example, a GRU model, an LSTM model, a Bi-RNN model, or the like. But also convolutional neural network models, deep learning models, etc. It is understood that, according to the practical application, the neural network model to which the model quantization is applicable is not particularly limited.
The following describes in detail a specific process of quantifying the GRU model 100 according to the present application, taking the neural network model as the GRU model 100 as an example.
In conjunction with the scenario of fig. 3, fig. 4 shows a flow chart of a method of quantifying the GRU model 100. As shown in fig. 4, the executing subject of the method may be the first electronic device 10, and the method includes the following steps:
S401, acquiring first quantized data of a first operator, first quantized data of a second operator, a quantization coefficient of the first operator, and a quantization coefficient of the second operator in the GRU model 100, wherein a third operator in the GRU model 100 is the corresponding operation of the first operator and the second operator having an association relation.
In some embodiments, the first electronic device 10 obtains the first quantized data of the first operator, the first quantized data of the second operator, the quantization coefficient of the first operator, and the quantization coefficient of the second operator in the GRU model 100, where the third operator is the associated operation of the first operator and the second operator.
In some embodiments, the first operator and/or the second operator may be a linear operator or a non-linear operator. The third operator may be a linear operator or a non-linear operator. It is understood that, according to practical applications, the operator types of the first operator, the second operator, and the third operator are not specifically limited in the present application.
Specifically, for example, in the operation of the t-th GRU network of the GRU model 100, the first operator may be the operator x_t×W_xz of formula (1), the second operator may be the operator h_{t-1}×W_hz + b_z of formula (1), and the third operator may be the operator x_t×W_xz + h_{t-1}×W_hz + b_z of formula (1). The first operator may also be the operator σ() of formula (1), the second operator may also be the operator x_t×W_xz + h_{t-1}×W_hz + b_z of formula (1), and the corresponding third operator may be the operator σ(x_t×W_xz + h_{t-1}×W_hz + b_z) of formula (1).
Similarly, in the operation of the t-th GRU network of the GRU model 100, the first operator may be the operator x_t×W_xr of formula (2), the second operator may be the operator h_{t-1}×W_hr + b_r of formula (2), and the third operator may be the operator x_t×W_xr + h_{t-1}×W_hr + b_r of formula (2). The first operator may also be the operator σ() of formula (2), the second operator may also be the operator x_t×W_xr + h_{t-1}×W_hr + b_r of formula (2), and the corresponding third operator may be the operator σ(x_t×W_xr + h_{t-1}×W_hr + b_r) of formula (2).
Similarly, in the operation of the t-th GRU network of the GRU model 100, the first operator may be the operator x_t×W_x of formula (3), the second operator may be the operator (h_{t-1}·r_t)×W_h + b_c of formula (3), and the third operator may be the operator x_t×W_x + (h_{t-1}·r_t)×W_h + b_c of formula (3). The first operator may also be the operator tanh() of formula (3), the second operator may also be the operator x_t×W_x + (h_{t-1}·r_t)×W_h + b_c of formula (3), and the corresponding third operator may be the operator tanh(x_t×W_x + (h_{t-1}·r_t)×W_h + b_c) of formula (3).
Similarly, in the operation of the t-th GRU network of the GRU model 100, the first operator may be the operator (1-z_t)·c_t of formula (4), the second operator may be the operator z_t·h_{t-1} of formula (4), and the third operator may be the operator (1-z_t)·c_t + z_t·h_{t-1} of formula (4).
In some embodiments, the first electronic device 10 may obtain floating point data of the weight coefficient matrix, floating point data of the error coefficient matrix, quantized coefficients of the weight coefficient matrix, quantized coefficients of the input data matrix, etc. in the GRU model 100 from the second electronic device 20. The first electronic device 10 may also obtain floating point data of the input data matrix. The first electronic device 10 may calculate the quantized data of the weight coefficient matrix, the quantized data of the error coefficient matrix, and the quantized data of the input data matrix from the floating point data of the weight coefficient matrix, the floating point data of the error coefficient matrix, the quantized coefficient of the weight coefficient matrix, the quantized coefficient of the input data matrix, and the floating point data of the input data matrix by the above formula (12). For example, the quantized data of the input data matrix is equal to the floating point data of the input data matrix divided by the quantized coefficients of the input data matrix.
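A minimal sketch of the quantization step described here. The division-by-coefficient form follows the last sentence above; rounding to the nearest integer is an assumption, since formula (12) itself is not reproduced in this excerpt:

```python
def quantize_matrix(float_matrix, scale):
    # Quantized data equals the floating-point data divided by the
    # quantization coefficient; round-to-nearest is an assumed rounding mode.
    return [[round(v / scale) for v in row] for row in float_matrix]

x_t = [[0.20, -0.15], [0.05, 0.10]]   # floating-point input data matrix
q_x = quantize_matrix(x_t, 0.01)      # [[20, -15], [5, 10]]
```

Dequantization is the inverse: multiplying the quantized data by the same coefficient recovers the floating-point data approximately.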
In other embodiments, the second electronic device 20 may calculate the quantized data of the weight coefficient matrix and the quantized data of the error coefficient matrix according to the floating point data of the weight coefficient matrix, the floating point data of the error coefficient matrix and the quantized coefficient of the weight coefficient matrix by the above formula (12). The first electronic device 10 may then obtain quantized data of the weight coefficient matrix, quantized data of the error coefficient matrix, etc. in the GRU model 100 directly from the second electronic device 20.
In some embodiments, the first electronic device 10 may calculate the first quantized data of the first operator and the first quantized data of the second operator from the quantized data of the weight coefficient matrix, the quantized data of the error coefficient matrix, and the quantized data of the input data matrix of the GRU model 100.
For example, in the operation of the t-th GRU network of the GRU model 100, the first operator may be the operator x_t×W_xz of formula (1), and the second operator may be the operator h_{t-1}×W_hz + b_z of formula (1). Specifically, the first electronic device 10 may, based on the quantized data of the weight coefficient matrix W_xz and the quantized data of the input data matrix x_t of the t-th time step, perform a matrix multiplication operation on the quantized data of the input data matrix x_t of the t-th time step and the quantized data of the weight coefficient matrix W_xz, so as to calculate the first quantized data of the first operator x_t×W_xz.
Specifically, the first electronic device 10 may also, based on the quantized data of the weight coefficient matrix W_hz, the quantized data of the output data matrix h_{t-1} of the (t-1)-th time step, and the quantized data of the error coefficient matrix b_z, first perform a matrix multiplication operation on the quantized data of the output data matrix h_{t-1} of the (t-1)-th time step and the quantized data of the weight coefficient matrix W_hz, and then perform a matrix summation of the result of that matrix multiplication with the quantized data of the error coefficient matrix b_z, finally calculating the first quantized data of the second operator h_{t-1}×W_hz + b_z.
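With toy integer matrices (the shapes and values are illustrative assumptions, not taken from the patent), the two computations above look like:

```python
import numpy as np

x_t   = np.array([[2, 1]])     # quantized input data matrix, time step t
W_xz  = np.array([[3], [4]])   # quantized weight coefficient matrix
h_tm1 = np.array([[5]])        # quantized output data matrix, time step t-1
W_hz  = np.array([[2]])        # quantized weight coefficient matrix
b_z   = np.array([[1]])        # quantized error coefficient matrix

# First quantized data of the first operator x_t×W_xz: matrix multiplication.
first_q_first_operator = x_t @ W_xz            # [[10]]

# First quantized data of the second operator h_{t-1}×W_hz + b_z:
# matrix multiplication followed by matrix summation with b_z.
first_q_second_operator = h_tm1 @ W_hz + b_z   # [[11]]
```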
In some embodiments, the first electronic device 10 may calculate the quantization coefficients of the first operator and the quantization coefficients of the second operator from the quantization coefficients of the weight coefficient matrix of the GRU model 100 and the quantization coefficients of the input data matrix.
For example, in the operation of the t-th GRU network of the GRU model 100, the first operator may be the operator x_t×W_xz of formula (1), and the second operator may be the operator h_{t-1}×W_hz + b_z of formula (1). Specifically, the first electronic device 10 may, based on the quantization coefficient of the weight coefficient matrix W_xz and the quantization coefficient of the input data matrix x_t of the t-th time step, perform a multiplication operation on the quantization coefficient of the input data matrix x_t of the t-th time step and the quantization coefficient of the weight coefficient matrix W_xz, so as to calculate the quantization coefficient of the first operator x_t×W_xz. Specifically, the first electronic device 10 may also, based on the quantization coefficient of the weight coefficient matrix W_hz and the quantization coefficient of the output data matrix h_{t-1} of the (t-1)-th time step, perform a multiplication operation on the quantization coefficient of the output data matrix h_{t-1} of the (t-1)-th time step and the quantization coefficient of the weight coefficient matrix W_hz, so as to calculate the quantization coefficient of the second operator h_{t-1}×W_hz + b_z.
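For per-tensor (scalar) quantization coefficients, the two products described above reduce to ordinary multiplications; the coefficient values below are illustrative, chosen to reproduce the 0.01 and 0.005 of the running example:

```python
s_x   = 0.02   # quantization coefficient of the input data matrix x_t
s_Wxz = 0.5    # quantization coefficient of the weight matrix W_xz
s_h   = 0.01   # quantization coefficient of the output data matrix h_{t-1}
s_Whz = 0.5    # quantization coefficient of the weight matrix W_hz

# Coefficient of the first operator x_t×W_xz is the product of its factors' coefficients.
s_first = s_x * s_Wxz    # 0.01

# Coefficient of the second operator h_{t-1}×W_hz + b_z, likewise.
s_second = s_h * s_Whz   # 0.005
```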
In some other embodiments, when the first operator and/or the second operator is the main operator of other operators, the first quantized data of the first operator and/or the first quantized data of the second operator may also be quantized data calculated according to the model quantization method provided in the present application. For example, in the operation of the t-th GRU network of the GRU model 100, the first operator may be the operator σ() of formula (1), and the second operator may be the operator x_t×W_xz + h_{t-1}×W_hz + b_z of formula (1), wherein the second operator x_t×W_xz + h_{t-1}×W_hz + b_z may be the main operator of the sub-operator x_t×W_xz and the sub-operator h_{t-1}×W_hz + b_z. The quantized data of the second operator may be obtained by matrix summation of the quantized data of the operator x_t×W_xz and the quantized data of the operator h_{t-1}×W_hz + b_z, wherein the quantization coefficient corresponding to the quantized data of x_t×W_xz is the same as the quantization coefficient corresponding to the quantized data of h_{t-1}×W_hz + b_z.
S402, judging whether the quantization coefficient of the first operator in the GRU model 100 is larger than that of the second operator, and executing the step S403 under the condition that the quantization coefficient of the first operator in the GRU model 100 is larger than that of the second operator. In the case where the quantization coefficient of the first operator is not greater than the quantization coefficient of the second operator, step S404 is performed.
In some embodiments, the first electronic device 10 may select, as the common quantization coefficient of the first operator and the second operator, a larger quantization coefficient of the first operator or the quantization coefficient of the second operator by determining whether the quantization coefficient of the first operator in the GRU model 100 is larger than the quantization coefficient of the second operator, so that when the quantized data of the first operator and the quantized data of the second operator perform the operation of the third operator, the quantization coefficients corresponding to the quantized data of the first operator and the quantized data of the second operator are the same, thereby improving the operation accuracy of the quantized GRU model 100.
And S403, under the condition that the quantization coefficient of the first operator in the GRU model 100 is larger than that of the second operator, taking the quantization coefficient of the first operator in the GRU model 100 as a common quantization coefficient of the first operator and the second operator, and determining the precision coefficient of the second operator.
In order to improve the operation precision of model quantization, before the quantized data of the first operator and the quantized data of the second operator perform the operation of the third operator, the first electronic device 10 may perform inverse quantization on the first quantized data of the first operator and the first quantized data of the second operator to obtain floating point data of the first operator and floating point data of the second operator, and then quantize the floating point data of the first operator and the floating point data of the second operator according to the determined common quantization coefficient of the first operator and the second operator to obtain second quantized data of the first operator and the second quantized data of the second operator. The first electronic device 10 then performs an operation of a third operator on the basis of the second quantized data of the first operator and the second quantized data of the second operator, generating quantized data of the third operator.
It is understood that if the quantization coefficient of the first operator is used as the common quantization coefficient of the first operator and the second operator, and the quantization coefficient used when the first quantized data of the first operator is dequantized is the same as the quantization coefficient used when the floating point data of the first operator is quantized, the first quantized data of the first operator is the same as the second quantized data of the first operator. Therefore, the first electronic device 10 does not need to perform the operations of inverse quantization and quantization on the first quantized data of the first operator. Also, the first electronic device 10 may determine the second quantized data of the second operator from the precision coefficient of the second operator by determining the precision coefficient of the second operator.
Specifically, the precision coefficient of the second operator may be used to make the quantization coefficient used for generating the second quantized data of the second operator consistent with the quantization coefficient of the first quantized data of the first operator. The first electronic device 10 may calculate the precision coefficient of the second operator from the quantization coefficient of the first operator and the quantization coefficient of the second operator. Specifically, the precision coefficient A_2 of the second operator can be calculated by the following equation (14):
A_2 = S_2 / S_1 (14)
In formula (14), S_1 represents the quantization coefficient of the first operator, and S_2 represents the quantization coefficient of the second operator.
And S405, determining second quantized data of a second operator according to the precision coefficient of the second operator in the GRU model 100 and the first quantized data of the second operator.
In some embodiments, the first electronic device 10 may determine the second quantized data of the second operator according to the precision coefficient of the second operator in the GRU model 100 and the first quantized data of the second operator, wherein the quantization coefficient corresponding to the second quantized data of the second operator is the same as the quantization coefficient corresponding to the first quantized data of the first operator. Specifically, the second quantized data Q′_2 of the second operator can be calculated by the following equation (15):
Q′_2 = Q_2 * A_2 (15)
In formula (15), Q_2 represents the first quantized data of the second operator, and A_2 represents the precision coefficient of the second operator.
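Equations (14) and (15) together replace the explicit dequantize-then-requantize round trip for the second operator with a single rescaling. A scalar sketch (function name illustrative), using the quantization coefficients 0.01 and 0.005 and the first quantized data 100 from the running example:

```python
def second_quantized_data(q2, s1, s2):
    """When s1 > s2, rescale the second operator's first quantized data
    with the precision coefficient A2 = s2/s1 (formulas (14) and (15))."""
    a2 = s2 / s1          # precision coefficient of the second operator
    return round(q2 * a2)

q2_new = second_quantized_data(100, 0.01, 0.005)  # 100 * 0.5 = 50
```

This is the same value as dequantizing 100 with coefficient 0.005 (giving 0.5) and requantizing with the common coefficient 0.01, but with one multiplication instead of two.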
And S406, determining the quantized data of the third operator in the GRU model 100 according to the first quantized data of the first operator and the second quantized data of the second operator in the GRU model 100.
In some embodiments, the third operator in the GRU model 100 is an association operation of the first operator with the second operator. The first electronic device 10 may determine quantized data of a third operator according to first quantized data of a first operator and second quantized data of a second operator in the GRU model 100, where a quantized coefficient corresponding to the first quantized data of the first operator is the same as a quantized coefficient corresponding to the second quantized data of the second operator.
For example, in the operation of the t-th GRU network of the GRU model 100, the first operator may be the operator x_t×W_xz of formula (1), the second operator may be the operator h_{t-1}×W_hz + b_z of formula (1), and the third operator may be the operator x_t×W_xz + h_{t-1}×W_hz + b_z. Specifically, the quantized data of the third operator x_t×W_xz + h_{t-1}×W_hz + b_z may be obtained by matrix addition of the first quantized data of the first operator x_t×W_xz and the second quantized data of the second operator h_{t-1}×W_hz + b_z.
S404, in the case where the quantization coefficient of the first operator in the GRU model 100 is not greater than the quantization coefficient of the second operator, judging whether the quantization coefficient of the first operator in the GRU model 100 is equal to the quantization coefficient of the second operator.
In the case where the quantization coefficient of the first operator in the GRU model 100 is not larger than the quantization coefficient of the second operator, and the quantization coefficient of the first operator in the GRU model 100 is equal to the quantization coefficient of the second operator, step S407 is performed. In the case where the quantization coefficient of the first operator in the GRU model 100 is not larger than the quantization coefficient of the second operator, and the quantization coefficient of the first operator in the GRU model 100 is not equal to the quantization coefficient of the second operator, step S408 is performed.
S407, in the case where the quantization coefficient of the first operator in the GRU model 100 is not greater than the quantization coefficient of the second operator, and the quantization coefficient of the first operator in the GRU model 100 is equal to the quantization coefficient of the second operator, determining the quantized data of the third operator in the GRU model 100 according to the first quantized data of the first operator and the first quantized data of the second operator in the GRU model 100.
And S408, under the condition that the quantization coefficient of the first operator is not larger than that of the second operator and the quantization coefficient of the first operator is not equal to that of the second operator, taking the quantization coefficient of the second operator in the GRU model 100 as a common quantization coefficient of the first operator and the second operator, and determining the precision coefficient of the first operator.
In order to improve the operation precision of model quantization, when the first operator and the second operator perform corresponding operation of the third operator, the first electronic device 10 may perform inverse quantization on the first quantized data of the first operator and the first quantized data of the second operator to obtain floating point data of the first operator and floating point data of the second operator, and then quantize the floating point data of the first operator and the floating point data of the second operator according to the determined common quantization coefficients of the first operator and the second operator to obtain second quantized data of the first operator and second quantized data of the second operator. It is understood that if the quantization coefficient of the second operator is used as the common quantization coefficient of the first operator and the second operator, and the quantization coefficient used when the first quantized data of the second operator is dequantized is the same as the quantization coefficient used when the floating point data of the second operator is quantized, the first quantized data of the second operator is the same as the second quantized data of the second operator. Therefore, the first electronic device 10 does not need to perform the operations of inverse quantization and quantization on the first quantized data of the second operator.
In some embodiments, the precision coefficient of the first operator is used to make the quantization coefficient corresponding to the generated second quantized data of the first operator consistent with the quantization coefficient corresponding to the first quantized data of the second operator. The first electronic device 10 may calculate the precision coefficient of the first operator from the quantization coefficient of the first operator and the quantization coefficient of the second operator. Specifically, the precision coefficient A_1 of the first operator can be calculated by the following equation (16):
Figure BDA0003642692350000131
in the formula (16), S 1 Representing the quantized coefficient of the first operator, S 2 Representing the quantized coefficients of the second operator.
And S409, determining second quantized data of the first operator in the GRU model 100 according to the precision coefficient of the first operator in the GRU model 100 and the first quantized data of the first operator.
In some embodiments, the first electronic device 10 may determine the second quantized data of the first operator according to the precision coefficient of the first operator in the GRU model 100 and the first quantized data of the first operator, wherein the quantized coefficient corresponding to the second quantized data of the first operator is the same as the quantized coefficient corresponding to the first quantized data of the second operator. Specifically, the second quantized data Q′ 1 of the first operator can be calculated by the following formula (17):
Q′ 1 =Q 1 *A 1 (17)
In formula (17), Q 1 represents the first quantized data of the first operator, and A 1 represents the precision coefficient of the first operator.
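Formulas (16) and (17) can be checked numerically under the same assumed convention Q = round(F × S); the concrete scale values below are illustrative only:

```python
import numpy as np

S1, S2 = 2.0, 8.0              # quantization coefficients; S2 > S1, so S2 is common
A1 = S2 / S1                   # precision coefficient of the first operator, formula (16)

Q1 = np.array([4, -2, 6], dtype=np.int32)       # first quantized data of the first operator
Q1_prime = np.round(Q1 * A1).astype(np.int32)   # formula (17): Q'1 = Q1 * A1

# Q'1 represents the same floating-point values, now at scale S2.
assert np.allclose(Q1 / S1, Q1_prime / S2)
```

Multiplying the quantized integers by A 1 thus re-expresses the first operator's data at the second operator's (common) scale without going back to floating point explicitly.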
And S410, determining the quantized data of a third operator in the GRU model 100 according to the first quantized data of the second operator and the second quantized data of the first operator in the GRU model 100.
In some embodiments, the third operator in the GRU model 100 is an association operation of the first operator with the second operator. The first electronic device 10 may determine quantized data of a third operator according to the first quantized data of the second operator and the second quantized data of the first operator in the GRU model 100, where a quantized coefficient corresponding to the first quantized data of the second operator is the same as a quantized coefficient corresponding to the second quantized data of the first operator.
For example, in the operation of the tth GRU network of the GRU model 100, the first operator may be the operator x t ×W xz of formula (1), the second operator may be the operator h t-1 ×W hz +b z of formula (1), and the third operator may be the operator x t ×W xz +h t-1 ×W hz +b z. Specifically, the quantized data of the third operator x t ×W xz +h t-1 ×W hz +b z may be obtained by matrix addition of the second quantized data of the first operator x t ×W xz and the first quantized data of the second operator h t-1 ×W hz +b z.
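Once both operands share the common quantization coefficient, the third operator's quantized data is a plain element-wise (matrix) addition in the integer domain. A toy check, again assuming Q = round(F × S) with illustrative values:

```python
import numpy as np

S = 8.0                                  # common quantization coefficient
f1 = np.array([2.0, -1.0, 3.0])          # floating-point data of the first operator
f2 = np.array([1.0,  5.0, -2.0])         # floating-point data of the second operator

q1 = np.round(f1 * S).astype(np.int32)
q2 = np.round(f2 * S).astype(np.int32)
q3 = q1 + q2                             # quantized data of the third operator

# Dequantizing the sum recovers the floating-point sum (exact here,
# because f1 * S and f2 * S are integers).
assert np.allclose(q3 / S, f1 + f2)
```

If the two operands were added at different scales instead, the integer sum would dequantize to the wrong value, which is exactly the error the common coefficient avoids.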
As can be seen from the above description, in order to improve the operation precision of model quantization, the first electronic device 10 may compare the quantization coefficient of the first operator with the quantization coefficient of the second operator, take the larger of the two as the common quantization coefficient, and re-determine the quantized data of the first operator or the second operator according to the common quantization coefficient, so that the quantization coefficient corresponding to the quantized data of the first operator is consistent with the quantization coefficient corresponding to the quantized data of the second operator (i.e., both are the common quantization coefficient). In this way, the first electronic device 10 calculates the quantized data of the third operator more accurately from the re-determined quantized data of the first operator and the second operator, improving the quantization precision of the GRU model 100.
It should be understood that fig. 4 specifically illustrates the process of quantizing the GRU model 100. In other embodiments, the model quantization method of the present application is also applicable to the LSTM model 200. The following describes in detail the process of quantizing the LSTM model 200 of the present application, taking the neural network model being the LSTM model 200 as an example.

In connection with the scenario of fig. 3, fig. 5 shows a flow chart of a method of quantizing the LSTM model 200. As shown in fig. 5, the execution subject of the method may be the first electronic device 10, and the method includes the following steps:
S501, obtaining first quantized data of a first operator, first quantized data of a second operator, a quantization coefficient of the first operator and a quantization coefficient of the second operator in the LSTM model 200, wherein a third operator in the LSTM model 200 is a corresponding operation of the first operator and the second operator having an association relationship.
In some embodiments, the first electronic device 10 obtains the first quantized data of the first operator, the first quantized data of the second operator, the quantized coefficients of the first operator, and the quantized coefficients of the second operator in the LSTM model 200.
Specifically, for example, during the operation of the neural network layer 120-t of the LSTM model 200, the first operator may be the operator W xi ×x t of formula (5), the second operator may be the operator W hi ×h t-1 +b i of formula (5), and the third operator may be the operator W xi ×x t +W hi ×h t-1 +b i of formula (5). The first operator may also be the operator σ() of formula (5), the second operator may also be the operator W xi ×x t +W hi ×h t-1 +b i of formula (5), and the corresponding third operator may be the operator σ(W xi ×x t +W hi ×h t-1 +b i ).

Similarly, during the operation of the neural network layer 120-t of the LSTM model 200, the first operator may be the operator W xg ×x t of formula (6), the second operator may be the operator W hg ×h t-1 +b g of formula (6), and the third operator may be the operator W xg ×x t +W hg ×h t-1 +b g of formula (6). The first operator may also be the operator ψ() of formula (6), the second operator may also be the operator W xg ×x t +W hg ×h t-1 +b g of formula (6), and the corresponding third operator may be the operator ψ(W xg ×x t +W hg ×h t-1 +b g ).

Similarly, during the operation of the neural network layer 120-t of the LSTM model 200, the first operator may be the operator W xf ×x t of formula (7), the second operator may be the operator W hf ×h t-1 +b f of formula (7), and the third operator may be the operator W xf ×x t +W hf ×h t-1 +b f of formula (7). The first operator may also be the operator σ() of formula (7), the second operator may also be the operator W xf ×x t +W hf ×h t-1 +b f of formula (7), and the corresponding third operator may be the operator σ(W xf ×x t +W hf ×h t-1 +b f ).

Similarly, during the operation of the neural network layer 120-t of the LSTM model 200, the first operator may be the operator W xo ×x t of formula (8), the second operator may be the operator W ho ×h t-1 +b o of formula (8), and the third operator may be the operator W xo ×x t +W ho ×h t-1 +b o of formula (8). The first operator may also be the operator σ() of formula (8), the second operator may also be the operator W xo ×x t +W ho ×h t-1 +b o of formula (8), and the corresponding third operator may be the operator σ(W xo ×x t +W ho ×h t-1 +b o ).
Similarly, during the operation of the neural network layer 120-t of the LSTM model 200, the first operator may be the operator f t *C t-1 of formula (9), the second operator may be the operator i t *g t of formula (9), and the third operator may be the operator f t *C t-1 +i t *g t of formula (9).
The process of determining the first quantized data of the first operator, the first quantized data of the second operator, the quantization coefficient of the first operator, and the quantization coefficient of the second operator in the LSTM model 200 specifically refers to step S401 in fig. 4, and is not described herein again.
And S502, determining whether the quantization coefficient of the first operator in the LSTM model 200 is greater than the quantization coefficient of the second operator, and executing step S503 in the case where the quantization coefficient of the first operator in the LSTM model 200 is greater than the quantization coefficient of the second operator. In the case where the quantization coefficient of the first operator is not greater than the quantization coefficient of the second operator, step S504 is executed.
In some embodiments, the first electronic device 10 may select the larger of the quantization coefficient of the first operator and the quantization coefficient of the second operator in the LSTM model 200 as the common quantization coefficient of the first operator and the second operator, so that when the quantized data of the first operator and the quantized data of the second operator participate in the operation of the third operator, the quantization coefficient corresponding to the quantized data of the first operator and the quantization coefficient corresponding to the quantized data of the second operator are the same, thereby improving the operation precision of the quantized model.
And S503, under the condition that the quantization coefficient of the first operator in the LSTM model 200 is larger than that of the second operator, taking the quantization coefficient of the first operator in the LSTM model 200 as a common quantization coefficient of the first operator and the second operator, and determining the precision coefficient of the second operator. For details, refer to step S403 in fig. 4, which is not described herein again.
And S505, determining second quantized data of a second operator according to the precision coefficient of the second operator in the LSTM model 200 and the first quantized data of the second operator. For details, refer to step S405 of fig. 4, which is not described herein.
S506, determining the quantized data of the third operator in the LSTM model 200 according to the first quantized data of the first operator and the second quantized data of the second operator in the LSTM model 200.
In some embodiments, the third operator in LSTM model 200 is an associative operation of the first operator with the second operator. The first electronic device 10 may determine quantized data of a third operator according to first quantized data of a first operator and second quantized data of a second operator of the LSTM model 200, where a quantized coefficient corresponding to the first quantized data of the first operator and a quantized coefficient corresponding to the second quantized data of the second operator are the same.
For example, during the operation of the neural network layer 120-t of the LSTM model 200, the first operator may be the operator W xi ×x t of formula (5), the second operator may be the operator W hi ×h t-1 +b i of formula (5), and the third operator may be the operator W xi ×x t +W hi ×h t-1 +b i of formula (5). Specifically, the quantized data of the third operator W xi ×x t +W hi ×h t-1 +b i may be obtained by matrix addition of the first quantized data of the first operator W xi ×x t and the second quantized data of the second operator W hi ×h t-1 +b i.
And S504, in the case where the quantization coefficient of the first operator in the LSTM model 200 is not greater than the quantization coefficient of the second operator, determining whether the quantization coefficient of the first operator in the LSTM model 200 is equal to the quantization coefficient of the second operator.
In the case where the quantization coefficient of the first operator in the LSTM model 200 is not larger than the quantization coefficient of the second operator, and the quantization coefficient of the first operator in the LSTM model 200 is equal to the quantization coefficient of the second operator, step S507 is performed. In the case where the quantization coefficient of the first operator in the LSTM model 200 is not greater than the quantization coefficient of the second operator, and the quantization coefficient of the first operator in the LSTM model 200 is not equal to the quantization coefficient of the second operator, step S508 is performed.
And S507, in the case where the quantization coefficient of the first operator in the LSTM model 200 is not greater than the quantization coefficient of the second operator and the quantization coefficient of the first operator in the LSTM model 200 is equal to the quantization coefficient of the second operator, determining the quantized data of the third operator in the LSTM model 200 from the first quantized data of the first operator and the first quantized data of the second operator in the LSTM model 200.
And S508, under the condition that the quantization coefficient of the first operator is not larger than that of the second operator and the quantization coefficient of the first operator is not equal to that of the second operator, taking the quantization coefficient of the second operator in the LSTM model 200 as a common quantization coefficient of the first operator and the second operator, and determining the precision coefficient of the first operator. For details, refer to step S408 of fig. 4, which is not described herein again.
S509, determining second quantized data of the first operator in the LSTM model 200 according to the precision coefficient of the first operator in the LSTM model 200 and the first quantized data of the first operator. For details, refer to step S409 in fig. 4, which is not described herein.
And S510, determining the quantized data of the third operator in the LSTM model 200 according to the first quantized data of the second operator and the second quantized data of the first operator in the LSTM model 200.
In some embodiments, the third operator in LSTM model 200 is an associative operation of the first operator with the second operator. The first electronic device 10 may determine quantized data of a third operator according to the first quantized data of the second operator and the second quantized data of the first operator in the LSTM model 200, where a quantized coefficient corresponding to the first quantized data of the second operator is the same as a quantized coefficient corresponding to the second quantized data of the first operator.
For example, during the operation of the neural network layer 120-t of the LSTM model 200, the first operator may be the operator W xf ×x t of formula (7), the second operator may be the operator W hf ×h t-1 +b f of formula (7), and the third operator may be the operator W xf ×x t +W hf ×h t-1 +b f of formula (7). Specifically, the quantized data of the third operator W xf ×x t +W hf ×h t-1 +b f may be obtained by matrix addition of the second quantized data of the first operator W xf ×x t and the first quantized data of the second operator W hf ×h t-1 +b f.
As can be seen from the above description, in order to improve the operation precision of the LSTM model 200 quantization, the first electronic device 10 may compare the quantization coefficients of the first operator and the second operator in the LSTM model 200, use the larger quantization coefficient of the quantization coefficients of the first operator and the second operator as a common quantization coefficient, and re-determine the quantization data of the first operator or the second operator in the LSTM model 200 according to the common quantization coefficient, so that the quantization coefficient corresponding to the quantization data of the first operator is consistent with the quantization coefficient corresponding to the quantization data of the second operator (i.e., both are the common quantization coefficient), so that the first electronic device 10 calculates the quantization data of the third operator more accurately according to the re-determined quantization data of the first operator and the second operator, thereby improving the quantization precision of the LSTM model 200.
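The branch structure of steps S502 to S510 can be condensed into one helper function. This is a sketch under the assumed convention Q = round(F × S); `fuse_add` is an illustrative name, not an identifier from the patent:

```python
import numpy as np

def fuse_add(q1, s1, q2, s2):
    """Rescale the operand with the smaller coefficient to the common
    (larger) coefficient, then add in the integer domain (S502-S510)."""
    if s1 > s2:                                    # branch S503/S505/S506
        q2 = np.round(q2 * (s1 / s2)).astype(np.int32)
        common = s1
    elif s1 < s2:                                  # branch S508/S509/S510
        q1 = np.round(q1 * (s2 / s1)).astype(np.int32)
        common = s2
    else:                                          # S507: equal coefficients, no rescaling
        common = s1
    return q1 + q2, common

q3, s = fuse_add(np.array([4, -2], dtype=np.int32), 2.0,
                 np.array([8, 40], dtype=np.int32), 8.0)
assert s == 8.0 and list(q3) == [24, 32]
```

The returned common coefficient travels with the sum, so the result can serve as an input to a subsequent operator at a known scale.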
The embodiment of the application also provides a model quantization device, which is used for realizing the model quantization method in the embodiments.
In particular, fig. 6 illustrates a schematic structural diagram of a model quantization apparatus 300, according to some embodiments of the present application.

As shown in fig. 6, the model quantization apparatus 300 includes:
the common quantization coefficient determining module 301 is configured to obtain a first quantization coefficient of a first operation (e.g., an operation corresponding to the first operator), first quantization data of the first operation, a second quantization coefficient of a second operation (e.g., an operation corresponding to the second operator), and second quantization data of the second operation, and determine the common quantization coefficient according to a magnitude relationship between the first quantization coefficient and the second quantization coefficient. For example, in some embodiments, the common quantized coefficient determination module 301 may be configured to perform the related operations of the foregoing steps 401 to 404, 408, 501 to 504, and 508.
A data quantization module 302, configured to re-quantize the first quantized data and/or the second quantized data based on the common quantization coefficient. For example, in some embodiments, the data quantization module 302 may be configured to perform the related operations of the foregoing steps 405, 409, 505, 509.
The operation module 303 is configured to obtain an operation result of a third operation (for example, an operation corresponding to the third operator) based on data obtained by re-quantizing the first quantized data and/or the second quantized data, where the third operation is a correlation operation between the first operation and the second operation. For example, in some embodiments, the operation module 303 may be configured to perform the related operations of the foregoing steps 406, 410, 506, and 510.
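A minimal, dependency-free sketch of how the three modules of the apparatus 300 might cooperate; the class and method names below are illustrative, not taken from the patent:

```python
class ModelQuantizer:
    # Module 301: determine the common quantization coefficient
    # from the two operations' coefficients.
    def determine_common_coefficient(self, s1, s2):
        return max(s1, s2)

    # Module 302: re-quantize data from its own scale to the common scale
    # (assuming the linear convention Q = round(F * S)).
    def requantize(self, q, s_old, s_new):
        return [round(v * s_new / s_old) for v in q]

    # Module 303: the third operation (here, an element-wise addition).
    def run_third_operation(self, q1, q2):
        return [a + b for a, b in zip(q1, q2)]

mq = ModelQuantizer()
s = mq.determine_common_coefficient(2.0, 8.0)
q1 = mq.requantize([4, -2], 2.0, s)          # first operator's data at the common scale
q3 = mq.run_third_operation(q1, [8, 40])     # second operator already at scale 8.0
assert q3 == [24, 32]
```

The split mirrors the module boundaries of fig. 6: coefficient selection, re-quantization, and the associated operation each stay independently testable.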
Fig. 7 is a block diagram illustrating a first electronic device 10 according to one embodiment of the present application. In one embodiment, the first electronic device 10 may include one or more processors 1004, system control logic 1008 coupled to at least one of the processors 1004, system memory 1012 coupled to the system control logic 1008, non-volatile memory (NVM)1016 coupled to the system control logic 1008, and a network interface 1020 coupled to the system control logic 1008.
In some embodiments, processor 1004 may include one or more single-core or multi-core processors. In some embodiments, the processor 1004 may include any combination of general-purpose processors and special-purpose processors (e.g., graphics processors, application processors, baseband processors, etc.). In embodiments where the first electronic device 10 employs an eNB (enhanced Node B) or RAN (Radio Access Network) controller, the processor 1004 may be configured to perform the methods of the various embodiments described herein.
In some embodiments, system control logic 1008 may include any suitable interface controllers to provide any suitable interface to at least one of processors 1004 and/or any suitable device or component in communication with system control logic 1008.
In some embodiments, system control logic 1008 may include one or more memory controllers to provide an interface to system memory 1012. System memory 1012 may be used to load and store data and/or instructions. In some embodiments, the system memory 1012 may include any suitable volatile memory, such as suitable dynamic random access memory (DRAM).
The NVM 1016 may include one or more tangible, non-transitory computer-readable media for storing data and/or instructions. In some embodiments, the NVM 1016 may include any suitable non-volatile memory, such as flash memory, and/or any suitable non-volatile storage device, such as at least one of an HDD (Hard Disk Drive), CD (Compact Disc) Drive, DVD (Digital Versatile Disc) Drive.
The NVM 1016 may comprise a portion of a storage resource on the apparatus on which the first electronic device 10 is mounted, or it may be accessible by, but not necessarily a part of, the device. The NVM 1016 may be accessed over a network, for example, via the network interface 1020.
In particular, the system memory 1012 and the NVM 1016 may include: a temporary copy and a permanent copy of instructions 1024. The instructions 1024 may include: instructions that, when executed by at least one of the processors 1004, cause the first electronic device 10 to implement the method shown in fig. 4 or fig. 5. In some embodiments, the instructions 1024, hardware, firmware, and/or software components thereof may additionally/alternatively be disposed in the system control logic 1008, the network interface 1020, and/or the processor 1004.
The network interface 1020 may include a transceiver for providing a radio interface for the first electronic device 10 to communicate with any other suitable device (e.g., front end module, antenna, etc.) over one or more networks. In some embodiments, the network interface 1020 may be integrated with other components of the first electronic device 10. For example, the network interface 1020 may be integrated with at least one of the processors 1004, the system memory 1012, the NVM 1016, and a firmware device (not shown) having instructions that, when executed by at least one of the processors 1004, cause the first electronic device 10 to implement the method shown in fig. 4 or fig. 5.
The network interface 1020 may further include any suitable hardware and/or firmware to provide a multiple-input multiple-output radio interface. For example, network interface 1020 may be a network adapter, a wireless network adapter, a telephone modem, and/or a wireless modem.
In one embodiment, at least one of the processors 1004 may be packaged together with logic for one or more controllers of system control logic 1008 to form a System In Package (SiP). In one embodiment, at least one of the processors 1004 may be integrated on the same die with logic for one or more controllers of system control logic 1008 to form a system on a chip (SoC).
The first electronic device 10 may further include: input/output (I/O) devices 1032.
Fig. 8 shows a block diagram of a SoC (System on Chip) 1100, according to an embodiment of the present application. The SoC 1100 may be provided in the first electronic device 10. In fig. 8, like parts have the same reference numerals. In addition, the dashed boxes are optional features of more advanced SoCs. In fig. 8, the SoC 1100 includes: an interconnect unit 1150 coupled to the application processor 1110; a system agent unit 1170; a bus controller unit 1180; an integrated memory controller unit 1240; and a direct memory access (DMA) unit 1160.
It is to be appreciated that as used herein, the term module may refer to or include an Application Specific Integrated Circuit (ASIC), an electronic circuit, a processor (shared, dedicated, or group) and/or memory that executes one or more software or firmware programs, a combinational logic circuit, and/or other suitable hardware components that provide the described functionality, or may be part of such hardware components.
It is to be appreciated that in various embodiments of the present application, the processor may be a microprocessor, a digital signal processor, a microcontroller, or the like, and/or any combination thereof. According to another aspect, the processor may be a single-core processor, a multi-core processor, the like, and/or any combination thereof.
Embodiments of the mechanisms disclosed herein may be implemented in hardware, software, firmware, or a combination of these implementations. Embodiments of the application may be implemented as computer programs or program code executing on programmable systems comprising at least one processor, a storage system (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device.
Program code may be applied to input instructions to perform the functions described herein and generate output information. The output information may be applied to one or more output devices in a known manner. For purposes of this application, a processing system includes any system having a processor such as, for example, a Digital Signal Processor (DSP), a microcontroller, an Application Specific Integrated Circuit (ASIC), or a microprocessor.
The program code may be implemented in a high level procedural or object oriented programming language to communicate with a processing system. The program code can also be implemented in assembly or machine language, if desired. Indeed, the mechanisms described in this application are not limited in scope to any particular programming language. In any case, the language may be a compiled or interpreted language.
In some cases, the disclosed embodiments may be implemented in hardware, firmware, software, or any combination thereof. The disclosed embodiments may also be implemented as instructions carried by or stored on one or more transitory or non-transitory machine-readable (e.g., computer-readable) storage media, which may be read and executed by one or more processors. For example, the instructions may be distributed via a network or via other computer-readable media. Thus, a machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer), including, but not limited to, floppy diskettes, optical disks, compact disc read-only memories (CD-ROMs), magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), erasable programmable read-only memories (EPROMs), electrically erasable programmable read-only memories (EEPROMs), magnetic or optical cards, flash memory, or tangible machine-readable storage used in the transmission of information over the Internet via electrical, optical, acoustical, or other forms of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.). Thus, a machine-readable medium includes any type of machine-readable medium suitable for storing or transmitting electronic instructions or information in a form readable by a machine (e.g., a computer).
In the drawings, some features of the structures or methods may be shown in a particular arrangement and/or order. However, it is to be understood that such specific arrangement and/or ordering may not be required. Rather, in some embodiments, the features may be arranged in a manner and/or order different from that shown in the illustrative figures. In addition, the inclusion of a structural or methodological feature in a particular figure is not meant to imply that such feature is required in all embodiments, and in some embodiments may not be included or may be combined with other features.
It should be noted that, in the embodiments of the apparatuses in the present application, each unit/module is a logical unit/module, and physically, one logical unit/module may be one physical unit/module, or may be a part of one physical unit/module, and may also be implemented by a combination of multiple physical units/modules, where the physical implementation manner of the logical unit/module itself is not the most important, and the combination of the functions implemented by the logical unit/module is the key to solve the technical problem provided by the present application. Furthermore, in order to highlight the innovative part of the present application, the above-mentioned device embodiments of the present application do not introduce units/modules which are not so closely related to solve the technical problems presented in the present application, which does not indicate that no other units/modules exist in the above-mentioned device embodiments.
It is noted that, in the examples and descriptions of this patent, relational terms such as first and second, and the like, are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
While the present application has been shown and described with reference to certain preferred embodiments thereof, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present application.

Claims (10)

1. A model quantization method is applied to electronic equipment and is characterized in that the model is a GRU model or an LSTM model and comprises a first operation, a second operation and a third operation, wherein the third operation is at least the correlation operation of the first operation and the second operation;
and the method comprises:
acquiring a first quantized coefficient of the first operation, first quantized data of the first operation, a second quantized coefficient of the second operation, and second quantized data of the second operation;
determining a common quantization coefficient based on the magnitude relation of the first quantization coefficient and the second quantization coefficient;
quantizing the first quantized data or the second quantized data based on the common quantization coefficient, and obtaining an operation result of the third operation based on the quantization result.
2. The method according to claim 1, wherein the determining a common quantized coefficient based on the magnitude relationship between the first quantized coefficient and the second quantized coefficient comprises:
and taking the larger one of the first quantized coefficient and the second quantized coefficient as the common quantized coefficient.
3. The method of claim 2, wherein quantizing the first quantized data or the second quantized data based on the common quantized coefficient comprises:
quantizing the first quantized data of the first operation into third quantized data based on the second quantized coefficient in a case where the common quantized coefficient is the second quantized coefficient;
quantizing the second quantized data of the second operation into fourth quantized data based on the first quantized coefficient in a case where the common quantized coefficient is the first quantized coefficient.
4. The method of claim 3, wherein the quantizing the first quantized data of the first operation to third quantized data based on the second quantized coefficients comprises:
and inversely quantizing the first quantized data into corresponding first floating point data according to the first quantized coefficient, and quantizing the first floating point data into third quantized data according to the second quantized coefficient.
5. The method of claim 3, wherein the quantizing the second quantized data of the second operation to fourth quantized data based on the first quantized coefficient comprises:
and inversely quantizing the second quantized data into corresponding second floating point data according to the second quantized coefficient, and quantizing the second floating point data into the fourth quantized data according to the first quantized coefficient.
6. The method according to any one of claims 3 to 5, wherein obtaining the operation result of the third operation based on the quantization result comprises:
obtaining an operation result of the third operation according to the third quantized data, the second quantized data, and the second quantized coefficient under the condition that the common quantized coefficient is the second quantized coefficient;
and obtaining an operation result of the third operation according to the first quantized data, the fourth quantized data and the first quantized coefficient when the common quantized coefficient is the first quantized coefficient.
7. The method of claim 1, wherein the third operation being an associated operation of the first operation and the second operation comprises:
the first quantized data being output data of the first operation, the second quantized data being output data of the second operation, and the input data of the third operation comprising the first quantized data and the second quantized data.
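Claims 6 and 7 together describe running the associated third operation on a single common coefficient. A minimal end-to-end sketch, assuming the third operation is an elementwise addition and assuming illustrative scale values (neither is specified in the claims):

```python
import numpy as np

def quantize(x, scale):
    # Map floating-point data to int8 under the given quantization coefficient.
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

# Hypothetical first and second quantization coefficients.
scale_a, scale_b = 0.05, 0.1

# First and second quantized data: outputs of the first and second operations.
qa = quantize(np.array([0.5, 1.0]), scale_a)   # [10, 20]
qb = quantize(np.array([0.3, -0.2]), scale_b)  # [ 3, -2]

# Choose the second coefficient as the common quantization coefficient and
# re-express the first operation's output with it (the third quantized data).
common = scale_b
qa_common = np.clip(np.round(qa.astype(np.float32) * scale_a / common),
                    -128, 127).astype(np.int8)

# The third operation (here: addition) now runs entirely in the integer
# domain on one shared scale; the integer result together with `common`
# represents the floating-point operation result.
q_sum = qa_common.astype(np.int32) + qb.astype(np.int32)
print(q_sum * common)  # [0.8 0.8]
```

The design point the claims rely on is that integer addition is only meaningful when both operands share one coefficient, which is why one operand must be requantized first.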
8. An electronic device, comprising:
a memory storing instructions for execution by one or more processors of the electronic device; and
a processor, being one of the one or more processors of the electronic device, configured to execute the instructions to cause the electronic device to implement the model quantization method of any one of claims 1 to 7.
9. A readable medium having instructions stored thereon which, when executed by an electronic device, cause the electronic device to implement the model quantization method of any one of claims 1 to 7.
10. A computer program product comprising instructions which, when executed by an electronic device, cause the electronic device to implement the model quantization method of any one of claims 1 to 7.
CN202210523024.XA 2022-05-13 2022-05-13 Model quantization method, electronic device, medium, and program product Pending CN114819097A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210523024.XA CN114819097A (en) 2022-05-13 2022-05-13 Model quantization method, electronic device, medium, and program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210523024.XA CN114819097A (en) 2022-05-13 2022-05-13 Model quantization method, electronic device, medium, and program product

Publications (1)

Publication Number Publication Date
CN114819097A true CN114819097A (en) 2022-07-29

Family

ID=82516088

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210523024.XA Pending CN114819097A (en) 2022-05-13 2022-05-13 Model quantization method, electronic device, medium, and program product

Country Status (1)

Country Link
CN (1) CN114819097A (en)

Similar Documents

Publication Publication Date Title
US20240104378A1 (en) Dynamic quantization of neural networks
US20210004663A1 (en) Neural network device and method of quantizing parameters of neural network
US11429838B2 (en) Neural network device for neural network operation, method of operating neural network device, and application processor including the neural network device
US20190164043A1 (en) Low-power hardware acceleration method and system for convolution neural network computation
CN111652368A (en) Data processing method and related product
KR20180073118A (en) Convolutional neural network processing method and apparatus
CN112673383A (en) Data representation of dynamic precision in neural network cores
US11574239B2 (en) Outlier quantization for training and inference
US20220329807A1 (en) Image compression method and apparatus thereof
WO2020061884A1 (en) Composite binary decomposition network
JP7190799B2 (en) Low Precision Deep Neural Networks Enabled by Compensation Instructions
CN114207625A (en) System-aware selective quantization for performance-optimized distributed deep learning
CN114612996A (en) Method for operating neural network model, medium, program product, and electronic device
CN112085175A (en) Data processing method and device based on neural network calculation
US20200250523A1 (en) Systems and methods for optimizing an artificial intelligence model in a semiconductor solution
WO2021057926A1 (en) Method and apparatus for training neural network model
WO2021081854A1 (en) Convolution operation circuit and convolution operation method
CN114819097A (en) Model quantization method, electronic device, medium, and program product
CN115457365A (en) Model interpretation method and device, electronic equipment and storage medium
US11861452B1 (en) Quantized softmax layer for neural networks
CN113902107A (en) Data processing method, readable medium and electronic device for neural network model full connection layer
CN112884144A (en) Network quantization method and device, electronic equipment and storage medium
US11263517B1 (en) Flexible weight expansion
CN110852202A (en) Video segmentation method and device, computing equipment and storage medium
CN116188875B (en) Image classification method, device, electronic equipment, medium and product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination