CN110764696B - Vector information storage and updating method and device, electronic equipment and storage medium

Vector information storage and updating method and device, electronic equipment and storage medium

Info

Publication number
CN110764696B
CN110764696B
Authority
CN
China
Prior art keywords
vector
error
quantization
value
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910916421.1A
Other languages
Chinese (zh)
Other versions
CN110764696A (en)
Inventor
黄明飞
王海涛
姚宏贵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Open Intelligent Machine Shanghai Co ltd
Original Assignee
Open Intelligent Machine Shanghai Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Open Intelligent Machine Shanghai Co ltd
Priority to CN201910916421.1A
Publication of CN110764696A
Application granted
Publication of CN110764696B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/0608 Saving storage space on storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0638 Organizing or formatting or addressing of data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F 3/0671 In-line storage system

Abstract

The invention discloses a method and a device for storing and updating vector information, an electronic device and a storage medium. The vector information storage method comprises the following steps: setting the total error and the information amount of the stored information; quantizing each vector component of the vector to an effective value that matches the information amount, all the effective values forming a quantized vector; and storing the quantized vector. The quantization of a vector component comprises: calculating an extension error based on the current total error; quantizing the vector component, according to the extension error, to an effective value that matches the information amount; and updating the total error, the updated total error being equal to the sum of the total error before updating and the quantization error. By quantizing the vector and using a dithering technique during quantization to distribute the error across different quantized values, the invention improves the precision of the average value within a region, so the vector can be stored as a low-precision quantized vector that occupies little space, reducing the storage space and the amount of computation and accelerating calculation.

Description

Vector information storage and updating method and device, electronic equipment and storage medium
Technical Field
The invention belongs to the field of information storage, and particularly relates to a method and a device for storing and updating vector information, electronic equipment and a storage medium.
Background
Low-end devices cannot store large amounts of data or perform high-precision calculations because their storage space and computing power are limited. This is especially true for machine learning and, in particular, deep learning algorithms: both training and inference require a large amount of computation, and the model parameters are numerous and demand substantial storage, so low-end devices cannot train or run such models. Reducing the amount of data or the computational precision can ease these difficulties to some extent, but a model trained in this way often performs poorly at inference. Reducing the storage space occupied by data and the amount of computation, so that algorithms can be widely deployed on low-end devices and the utilization of computing devices can be improved, is therefore an urgent problem.
Disclosure of Invention
The technical problem to be solved by the present invention is to overcome the defect that the prior art, constrained by the storage space and computing power of the device, cannot store large amounts of data or perform high-precision computation, and to provide a vector information storage and updating method, apparatus, electronic device and storage medium that reduce both the storage space occupied by data and the amount of computation.
The invention solves the technical problems through the following technical scheme:
the invention provides a vector information storage method, which comprises the following steps:
setting an initial value of the total error and the information amount of the stored information;
quantizing each vector component of the vector to an effective value that matches the size of the information amount, all the effective values forming a quantized vector of the vector;
storing the quantized vector;
wherein the step of quantizing a vector component of the vector to an effective value that matches the size of the information amount comprises:
calculating an extension error when the vector component is quantized, based on the current total error;
quantizing the vector component, according to the extension error, to an effective value that matches the size of the information amount;
updating the total error, the updated total error being equal to the sum of the total error before updating and a quantization error, the quantization error being equal to the difference between the vector component and the value obtained after inverse quantization of the effective value.
Preferably, the step of quantizing the vector component, according to the extension error, to an effective value that matches the size of the information amount specifically comprises:
calculating according to the following formulas
Q=(P+E’-B)×S
Q’=g(Q)
wherein:
Q is used to represent the initial quantization value;
P is used to represent the vector component;
E’ is used to represent the extension error;
S is a preset quantization scale factor;
B is a preset quantization offset value;
Q’ is used to represent the effective value;
g() is used to represent a function that maps Q to an effective value matching the size of the information amount;
the value obtained after inverse quantization of the effective value is (Q’/S)+B.
Preferably, the size of the information amount comprises a number of bits.
The invention also provides a quantized vector updating and information storage method, comprising the following steps:
setting an initial value of the total error, the information amount of the stored information, and the change amount of each vector component of the quantized vector;
quantizing the change amount of each vector component to an effective value that matches the size of the information amount;
applying the quantized effective value of the corresponding change amount to each vector component to obtain updated vector components, all the updated vector components forming an updated quantized vector;
storing the updated quantized vector;
wherein the step of quantizing the change amount of a vector component to an effective value that matches the size of the information amount comprises:
calculating an extension error when the change amount is quantized, based on the current total error;
quantizing the change amount, according to the extension error, to an effective value that matches the size of the information amount;
updating the total error, the updated total error being equal to the sum of the total error before updating and a quantization error, the quantization error being equal to the difference between the change amount and the value obtained after inverse quantization of the effective value.
Preferably, the step of quantizing the change amount, according to the extension error, to an effective value that matches the size of the information amount specifically comprises:
calculating according to the following formulas
ΔQ=(Δ+E’-B)×S
ΔQ’=g(ΔQ)
wherein:
ΔQ is used to represent the initial quantization value;
Δ is used to represent the change amount;
E’ is used to represent the extension error;
S is a preset quantization scale factor;
B is a preset quantization offset value;
ΔQ’ is used to represent the effective value;
g() is used to represent a function that maps ΔQ to an effective value matching the size of the information amount;
the value obtained after inverse quantization of the effective value is (ΔQ’/S)+B.
Preferably, the size of the information amount comprises a number of bits.
The present invention also provides a vector information storage apparatus, comprising:
a first setting module, configured to set an initial value of the total error and the information amount of the stored information;
a first quantization module, configured to quantize each vector component of a vector to an effective value that matches the size of the information amount, all the effective values forming a quantized vector of the vector;
a first storage module, configured to store the quantized vector;
wherein the first quantization module, when quantizing a vector component of the vector to an effective value that matches the size of the information amount, is specifically configured to perform:
calculating an extension error when the vector component is quantized, based on the current total error;
quantizing the vector component, according to the extension error, to an effective value that matches the size of the information amount;
updating the total error, the updated total error being equal to the sum of the total error before updating and a quantization error, the quantization error being equal to the difference between the vector component and the value obtained after inverse quantization of the effective value.
Preferably, when quantizing the vector component, according to the extension error, to an effective value that matches the size of the information amount, the first quantization module is specifically configured to perform:
calculating according to the following formulas
Q=(P+E’-B)×S
Q’=g(Q)
wherein:
Q is used to represent the initial quantization value;
P is used to represent the vector component;
E’ is used to represent the extension error;
S is a preset quantization scale factor;
B is a preset quantization offset value;
Q’ is used to represent the effective value;
g() is used to represent a function that maps Q to an effective value matching the size of the information amount;
the value obtained after inverse quantization of the effective value is (Q’/S)+B.
Preferably, the size of the information amount comprises a number of bits.
The invention also provides an apparatus for quantized vector updating and information storage, comprising:
a second setting module, configured to set an initial value of the total error, the information amount of the stored information, and the change amount of each vector component of the quantized vector;
a second quantization module, configured to quantize the change amount of each vector component to an effective value that matches the size of the information amount;
a vector updating module, configured to apply the quantized effective value of the corresponding change amount to each vector component to obtain updated vector components, all the updated vector components forming an updated quantized vector;
a second storage module, configured to store the updated quantized vector;
wherein the second quantization module, when quantizing the change amount of a vector component to an effective value that matches the size of the information amount, is specifically configured to perform:
calculating an extension error when the change amount is quantized, based on the current total error;
quantizing the change amount, according to the extension error, to an effective value that matches the size of the information amount;
updating the total error, the updated total error being equal to the sum of the total error before updating and a quantization error, the quantization error being equal to the difference between the change amount and the value obtained after inverse quantization of the effective value.
Preferably, when quantizing the change amount, according to the extension error, to an effective value that matches the size of the information amount, the second quantization module is specifically configured to perform:
calculating according to the following formulas
ΔQ=(Δ+E’-B)×S
ΔQ’=g(ΔQ)
wherein:
ΔQ is used to represent the initial quantization value;
Δ is used to represent the change amount;
E’ is used to represent the extension error;
S is a preset quantization scale factor;
B is a preset quantization offset value;
ΔQ’ is used to represent the effective value;
g() is used to represent a function that maps ΔQ to an effective value matching the size of the information amount;
the value obtained after inverse quantization of the effective value is (ΔQ’/S)+B.
Preferably, the size of the information amount comprises a number of bits.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, which when executed by the processor implements the method as described above.
The invention also provides a computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method as described above.
On the basis of common knowledge in the field, the above preferred conditions may be combined arbitrarily to obtain preferred embodiments of the invention.
The positive effects of the invention are as follows: by quantizing the vector and using a dithering technique during quantization to distribute the error across different quantized values, the precision of the average value within a region is greatly improved, so the vector can be stored as a low-precision, space-saving quantized vector. Overall, this reduces storage space, reduces the amount of computation and accelerates calculation; even devices with limited storage space and computing power can train and run machine-learning models, which improves overall device performance, reduces power consumption and cost, and improves cost-effectiveness. The invention can be used in software and hardware designs aimed at improving performance. In inference, the dithering technique allows the computational precision to be lowered, reducing storage space, accelerating calculation, and lowering power consumption and cost. In training, besides the advantages above, it can reduce overfitting and improve training robustness; it also reduces recognition errors when sampling introduces a fixed pattern into the sample data.
Drawings
FIG. 1 is a flowchart illustrating a vector information storage method according to a preferred embodiment 1 of the present invention;
FIG. 2 is a flowchart illustrating step 12 of preferred embodiment 1 of the present invention;
FIG. 3 is a flowchart of a vector updating and information storing method according to the preferred embodiment 2 of the present invention;
FIG. 4 is a flowchart illustrating step 22 of the preferred embodiment 2 of the present invention;
FIG. 5 is a schematic block diagram of a vector information storage apparatus according to preferred embodiment 3 of the present invention;
FIG. 6 is a schematic block diagram of a vector update and information storage device according to the preferred embodiment 4 of the present invention;
fig. 7 is a schematic structural diagram of an electronic device according to embodiment 5 of the present invention.
Detailed Description
The invention is further illustrated by the following examples, which are not intended to limit the scope of the invention.
Example 1
As shown in fig. 1, a vector information storage method includes:
step 11: setting an initial value of the total error and the information quantity of the stored information; in this embodiment, the information amount may include the number of bits.
Step 12: quantizing each vector component of the vector to an effective value that matches the size of the information amount, all the effective values forming the quantized vector of the vector.
Step 13: storing the quantized vector.
As shown in fig. 2, the step 12 of quantizing a vector component of the vector to an effective value that matches the size of the information amount specifically includes:
Step 121: the extension error for the quantization of the vector component is calculated based on the current total error. For the first vector component of the vector, the corresponding current total error is the initial value; for the second vector component, it is the total error after the first update; and so on. In this embodiment, the formula for calculating the extension error may be:
E’=f(E)
wherein:
E’ is used to represent the extension error and may be an integer;
E is used to represent the current total error;
f() is used to represent any error extension method, depending on the particular "dithering" algorithm used.
Step 122: the vector component is quantized, according to the extension error, to an effective value that matches the size of the information amount. In this embodiment, the calculation may be performed according to the following formulas:
Q=(P+E’-B)×S
Q’=g(Q)
wherein:
Q is used to represent the initial quantization value;
P is used to represent the vector component;
S is a preset quantization scale factor, whose specific value is determined by the actual quantization method, such as the maximum information entropy method or the maximum-minimum method;
B is a preset quantization offset value, whose specific value is determined by the actual quantizer, such as the maximum information entropy method or the maximum-minimum method;
Q’ is used to represent the effective value;
g() is used to represent a function that maps Q to an effective value matching the size of the information amount; it may, for example, round Q or discard its lower bits.
Step 123: the total error is updated; the updated total error is equal to the sum of the total error before updating and the quantization error, and the quantization error is equal to the difference between the vector component and the value obtained after inverse quantization of the effective value. In this embodiment, the value obtained after inverse quantization of the effective value is (Q’/S)+B.
The formula for updating the total error is:
E+=P-[(Q’/S)+B]
Step 12 may repeat steps 121-123 until all vector components of the vector are quantized.
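For illustration, the loop of steps 121-123 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation itself: the name quantize_vector is hypothetical, f defaults to the fixed-scale dithering f(E) = 0.5 × E used in the worked example below, and g is taken to be rounding clamped to the n-bit range.

```python
def quantize_vector(P, S, B, n_bits, f=lambda E: 0.5 * E):
    """Quantize each component of P to an n-bit effective value,
    diffusing the accumulated total error E into later components."""
    q_max = 2 ** (n_bits - 1) - 1              # e.g. 7 for 4 bits
    E = 0.0                                    # initial value of the total error
    Q_prime = []
    for p in P:
        E_ext = f(E)                           # step 121: extension error E' = f(E)
        Q = (p + E_ext - B) * S                # step 122: initial quantization value
        q = max(-q_max, min(q_max, round(Q)))  # g(): round, then clamp to n bits
        E += p - ((q / S) + B)                 # step 123: total error update
        Q_prime.append(q)
    return Q_prime
```

Note that Python's round() rounds halves to even, which happens to agree with the worked example below (3.5 rounds to 4); other admissible choices of g, such as discarding low bits, could be substituted.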
The vector information storage method of this embodiment quantizes the vector to the information amount of the stored information and introduces a dithering technique during quantization, distributing the total error across the quantization of different vector components. Although the precision of a single component is reduced, the precision of the average value of the whole vector is unaffected and may even improve, so the vector can be stored as a low-precision, space-saving quantized vector. Overall, this reduces storage space, reduces the amount of computation and accelerates calculation; even devices with limited storage space and computing power can train and run machine-learning models, improving overall device performance while reducing power consumption and cost and improving cost-effectiveness. The method can also be used in software and hardware designs aimed at improving performance.
The vector in this embodiment may be a one-dimensional vector or a two-dimensional vector. The following describes the vector information storage method of the present embodiment with a two-dimensional vector as an example.
Suppose information is to be stored for a two-dimensional vector P, a 2×3 matrix shown as an image in the original, whose first two components are P11 = 1024 and P12 = 512;
E is preset to 0.0, and the information amount of the stored information is 4 bits;
when calculating the extension error, the total error is spread to the subsequent data using a fixed scale (here 0.5), i.e. f(E) = 0.5 × E;
quantization uses the maximum-minimum method (Max = 1024.0, Min = -1024.0) at 4 bits, mapping [-1024.0, 1024.0] to [-7, 7], so that B = 0.0 and S = (7+7)/(1024+1024) = 0.0068359375;
for the first vector component P11The quantization process is as follows:
E’=f(E)=0.5*E=0
P11=1024
Q11=(P11+E’-B)*S=(1024+0-0.0)*0.0068359375=6.99999232
Q11rounding off to obtain Q11’=7
And E, updating: e ═ E + P11-[Q11’/S]+B=0+1024-((7/0.0068359375)+0.0)=0
For the second vector component P12, the quantization process is:
E’ = f(E) = 0.5 × E = 0
P12 = 512
Q12 = (P12 + E’ - B) × S = (512 + 0 - 0.0) × 0.0068359375 = 3.5
Q12 is rounded to obtain Q12’ = 4
E = E + P12 - [(Q12’/S) + B] = 0 + 512 - ((4/0.0068359375) + 0.0) = -73.14288
Similarly, Q13’ = 1, Q21’ = 7, Q22’ = 2 and Q23’ = -7 can be obtained;
that is, the vector P is finally quantized and stored as:
Q’ = [[7, 4, 1], [7, 2, -7]]
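As a purely hypothetical check of the worked numbers above, running the quantize_vector sketch from step 12 on the first two components reproduces Q11’ = 7 and Q12’ = 4:

```python
S = (7 + 7) / (1024 + 1024)   # 0.0068359375, maximum-minimum method
B = 0.0
print(quantize_vector([1024.0, 512.0], S, B, n_bits=4))  # -> [7, 4]
```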
example 2
As shown in fig. 3, a quantized vector updating and information storage method includes:
step 21: an initial value of the total error, the information amount of the stored information, and the amount of change of each vector component of the quantized vector are set. In this embodiment, the information amount may include the number of bits.
Step 22: the change amount of each vector component is quantized to an effective value that matches the size of the information amount.
Step 23: the quantized effective value of the corresponding change amount is applied to each vector component to obtain updated vector components, all the updated vector components forming an updated quantized vector.
Step 24: storing the updated quantized vector.
As shown in fig. 4, the step 22 of quantizing the change amount of a vector component to an effective value that matches the size of the information amount specifically includes:
Step 221: the extension error for the quantization of the change amount is calculated based on the current total error. For the change amount of the first vector component of the quantized vector, the corresponding current total error is the initial value; for the change amount of the second vector component, it is the total error after the first update; and so on. In this embodiment, the formula for calculating the extension error may be:
E’=f(E)
wherein:
E’ is used to represent the extension error and may be an integer;
E is used to represent the current total error;
f() is used to represent any error extension method, depending on the particular "dithering" algorithm used.
Step 222: the change amount is quantized, according to the extension error, to an effective value that matches the size of the information amount. In this embodiment, the calculation may be performed according to the following formulas:
ΔQ=(Δ+E’-B)×S
ΔQ’=g(ΔQ)
wherein:
ΔQ is used to represent the initial quantization value;
Δ is used to represent the change amount;
S is a preset quantization scale factor, whose specific value is determined by the actual quantization method, such as the maximum information entropy method or the maximum-minimum method;
B is a preset quantization offset value, whose specific value is determined by the actual quantizer, such as the maximum information entropy method or the maximum-minimum method;
ΔQ’ is used to represent the effective value;
g() is used to represent a function that maps ΔQ to an effective value matching the size of the information amount; it may, for example, round ΔQ or discard its lower bits.
Step 223: the total error is updated; the updated total error is equal to the sum of the total error before updating and the quantization error, and the quantization error is equal to the difference between the change amount and the value obtained after inverse quantization of the effective value. In this embodiment, the value obtained after inverse quantization of the effective value is (ΔQ’/S)+B.
The formula for updating the total error is:
E+=Δ-[(ΔQ’/S)+B]
Step 22 may repeat steps 221-223 until all change amounts are quantized.
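Under the same assumptions as the sketch in embodiment 1 (hypothetical names, f(E) = 0.5 × E as the dithering function, rounding clamped to n bits as g), steps 221-223 together with the component update of step 23 might be sketched as:

```python
def update_quantized_vector(Q_prime, deltas, S, B, n_bits, f=lambda E: 0.5 * E):
    """Quantize each change amount to n bits, diffusing the accumulated
    total error E, and apply it to the stored quantized components."""
    q_max = 2 ** (n_bits - 1) - 1
    E = 0.0                                      # initial value of the total error
    updated = []
    for q, d in zip(Q_prime, deltas):
        E_ext = f(E)                             # step 221: extension error E' = f(E)
        dQ = (d + E_ext - B) * S                 # step 222: initial quantization value
        dq = max(-q_max, min(q_max, round(dQ)))  # g(): round, then clamp
        E += d - ((dq / S) + B)                  # step 223: total error update
        updated.append(q + dq)                   # step 23: change the stored component
    return updated
```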
The quantized vector updating and information storage method of this embodiment quantizes the updates to the information amount of the stored information and introduces a dithering technique when quantizing the change amounts, distributing the total error across the quantization of the different change amounts. Although the precision of a single updated component is reduced, the precision of the average value of the whole vector is unaffected and may even improve, so the updated quantized vector can be stored in a low-precision, space-saving form. Overall, this reduces storage space, reduces the amount of computation and accelerates calculation; even devices with limited storage space and computing power can train and run machine-learning models, improving overall device performance while reducing power consumption and cost and improving cost-effectiveness. The method can also be used in software and hardware designs aimed at improving performance.
The vector in this embodiment may be one-dimensional or two-dimensional. The following describes the quantized vector updating and information storage method of this embodiment with a two-dimensional vector as an example.
Suppose, again for the vector P of embodiment 1, that the method of embodiment 1 is first used for storage, so the quantized vector is stored as:
Q’ = [[7, 4, 1], [7, 2, -7]]
Assume further that the change amount Δ is a 2×3 matrix, shown as an image in the original, whose first two components are Δ11 = 120 and Δ12 = 80;
as before, E is preset to 0.0 and the information amount of the stored information is 4 bits;
when calculating the extension error, the total error is spread to the subsequent data using a fixed scale (here 0.5), i.e. f(E) = 0.5 × E;
quantization uses the maximum-minimum method (Max = 1024.0, Min = -1024.0) at 4 bits, mapping [-1024.0, 1024.0] to [-7, 7], so that B = 0.0 and S = (7+7)/(1024+1024) = 0.0068359375;
for the first change, the quantization process is:
E’=f(E)=0.5*E=0
△Q11=(△11+E’-B)*S=(120+0-0.0)*0.0068359375=0.8203125
△Q11rounding off to obtain △ Q11’=1
Then the updated Q11’=7+1=8
E=E+△11-[△Q11’/S]+B=0+120-((1/0.0068359375)+0.0)=-26.28572
For the second change amount, the quantization process is:
E’ = f(E) = 0.5 × E = -13.14286
ΔQ12 = (Δ12 + E’ - B) × S = (80 - 13.14286 - 0.0) × 0.0068359375 = 0.457031
ΔQ12 is rounded to obtain ΔQ12’ = 0
The updated Q12’ = 4 + 0 = 4
E = E + Δ12 - [(ΔQ12’/S) + B] = -26.28572 + 80 - ((0/0.0068359375) + 0.0) = 53.71428
Similarly, the updated Q13’ = 2, Q21’ = 7, Q22’ = 2 and Q23’ = -7 can be obtained;
that is, Q’ is updated and stored as:
Q’ = [[8, 4, 2], [7, 2, -7]]
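Again as a hypothetical check against the numbers above, updating the first two stored components with Δ11 = 120 and Δ12 = 80 using the update_quantized_vector sketch reproduces 8 and 4 (S and B as in embodiment 1):

```python
S = (7 + 7) / (1024 + 1024)   # 0.0068359375
B = 0.0
print(update_quantized_vector([7, 4], [120.0, 80.0], S, B, n_bits=4))  # -> [8, 4]
```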
example 3
As shown in fig. 5, a vector information storage device includes: a first setting module 31, a first quantization module 32 and a first storage module 33.
The first setting module 31 is used for setting an initial value of the total error and the information quantity of the stored information; in this embodiment, the information amount may include the number of bits.
The first quantization module 32 is configured to quantize each vector component of a vector to an effective value that matches the size of the information amount, all the effective values forming a quantized vector of the vector.
The first storage module 33 is configured to store the quantized vector.
The first quantization module 32, when quantizing a vector component of the vector to an effective value that matches the size of the information amount, is specifically configured to perform:
calculating the extension error for the quantization of the vector component based on the current total error; for the first vector component of the vector, the corresponding current total error is the initial value, for the second vector component it is the total error after the first update, and so on. In this embodiment, the formula for calculating the extension error may be:
E’=f(E)
wherein:
E’ is used to represent the extension error and may be an integer;
E is used to represent the current total error;
f() is used to represent any error extension method, depending on the particular "dithering" algorithm used;
quantizing the vector component, according to the extension error, to an effective value that matches the size of the information amount; in this embodiment, the calculation may be performed according to the following formulas:
Q=(P+E’-B)×S
Q’=g(Q)
wherein:
Q is used to represent the initial quantization value;
P is used to represent the vector component;
S is a preset quantization scale factor, whose specific value is determined by the actual quantization method, such as the maximum information entropy method or the maximum-minimum method;
B is a preset quantization offset value, whose specific value is determined by the actual quantizer, such as the maximum information entropy method or the maximum-minimum method;
Q’ is used to represent the effective value;
g() is used to represent a function that maps Q to an effective value matching the size of the information amount; it may, for example, round Q or discard its lower bits;
updating the total error, the updated total error being equal to the sum of the total error before updating and a quantization error, the quantization error being equal to the difference between the vector component and the value obtained after inverse quantization of the effective value; in this embodiment, the value obtained after inverse quantization of the effective value is (Q’/S)+B.
The formula for updating the total error is:
E+=P-[(Q’/S)+B]
The vector information storage device quantizes the vector to the information amount of the stored information and introduces a dithering technique during quantization, distributing the total error across the quantization of different vector components. Although the precision of a single component is reduced, the precision of the average value of the whole vector is unaffected and may even improve, so the vector can be stored as a low-precision, space-saving quantized vector. Overall, this reduces storage space, reduces the amount of computation and accelerates calculation; even devices with limited storage space and computing power can train and run machine-learning models, improving overall device performance while reducing power consumption and cost and improving cost-effectiveness.
Example 4
As shown in fig. 6, an apparatus for quantization vector update and information storage includes: a second setting module 41, a second quantization module 42, a vector update module 43 and a second storage module 44.
The second setting module 41 is used for setting an initial value of the total error, the information quantity of the stored information and the change quantity of each vector component of the quantization vector; in this embodiment, the information amount may include the number of bits.
The second quantization module 42 is configured to quantize the change amount of each vector component to an effective value corresponding to the magnitude of the information amount.
The vector updating module 43 is configured to apply the quantized effective value of the corresponding change amount to each vector component to obtain updated vector components, all the updated vector components forming an updated quantized vector.
The second storage module 44 is configured to store the updated quantization vector.
The second quantization module 42, when quantizing the change amount of a vector component to an effective value that matches the size of the information amount, is specifically configured to perform:
calculating the extension error for the quantization of the change amount based on the current total error; for the change amount of the first vector component of the quantized vector, the corresponding current total error is the initial value, for the change amount of the second vector component it is the total error after the first update, and so on. In this embodiment, the formula for calculating the extension error may be:
E’=f(E)
wherein:
E’ is used to represent the extension error and may be an integer;
E is used to represent the current total error;
f() is used to represent any error extension method, depending on the particular "dithering" algorithm used;
quantizing the change amount, according to the extension error, to an effective value that matches the size of the information amount; in this embodiment, the calculation may be performed according to the following formulas:
ΔQ=(Δ+E’-B)×S
ΔQ’=g(ΔQ)
wherein:
ΔQ is used to represent the initial quantization value;
Δ is used to represent the change amount;
S is a preset quantization scale factor, whose specific value is determined by the actual quantization method, such as the maximum information entropy method or the maximum-minimum method;
B is a preset quantization offset value, whose specific value is determined by the actual quantizer, such as the maximum information entropy method or the maximum-minimum method;
ΔQ’ is used to represent the effective value;
g() is used to represent a function that maps ΔQ to an effective value matching the size of the information amount; it may, for example, round ΔQ or discard its lower bits;
updating the total error, the updated total error being equal to the sum of the total error before updating and a quantization error, the quantization error being equal to the difference between the change amount and the value obtained after inverse quantization of the effective value; in this embodiment, the value obtained after inverse quantization of the effective value is (ΔQ’/S)+B.
The formula for updating the total error is:
E+=Δ-[(ΔQ’/S)+B]
The quantized vector updating and information storage device of this embodiment quantizes the updates to the information amount of the stored information and introduces a dithering technique when quantizing the change amounts, distributing the total error across the quantization of the different change amounts. The precision of a single updated component is reduced without affecting, and possibly even improving, the precision of the average value of the whole vector, so the updated quantized vector can be stored in a low-precision, space-saving form. Overall, this reduces storage space, reduces the amount of computation and accelerates calculation; even devices with limited storage space and computing power can train and run machine-learning models, improving overall device performance while reducing power consumption and cost and improving cost-effectiveness.
For different applications, the present invention may selectively apply some or all of the above embodiments 1 to 4. For example:
application 1: and converting the high-precision model into an n-bit quantization model.
For an existing high-precision model, the kernel parameters are quantized to n bits, and low-precision calculation is used to increase calculation speed and reduce power consumption. The parameters Pij (i ∈ [1, m], j ∈ [1, n]) of a convolution (kernel filter) and/or fully connected layer in the model can be processed with the vector information storage method or device above to obtain an n-bit quantized parameter model. For example, the weight coefficients of the convolutional layers in MobileNet can be converted to 4 bits in this way; the resulting model is more accurate in use than one quantized to 4 bits with a simpler rounding algorithm.
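As an illustrative sketch only, assuming the hypothetical quantize_vector function from embodiment 1 and NumPy for the matrix handling, a layer's 2-D weight matrix could be converted row by row, with S and B calibrated by the maximum-minimum method over the whole matrix:

```python
import numpy as np

def quantize_layer_weights(W, n_bits=4):
    """Quantize a 2-D weight matrix row by row with the dithered
    storage method; S and B come from the maximum-minimum method."""
    q_max = 2 ** (n_bits - 1) - 1
    w_max, w_min = float(W.max()), float(W.min())
    S = 2 * q_max / (w_max - w_min)   # maps [w_min, w_max] onto [-q_max, q_max]
    B = (w_max + w_min) / 2.0         # offset: centre of the value range
    Wq = np.array([quantize_vector(row, S, B, n_bits) for row in W],
                  dtype=np.int8)
    return Wq, S, B
```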
Application 2: and (4) performing n-bit low-precision inference calculation.
In low-precision inference calculation, the weights of the n-bit quantization model are used; for the inputs and outputs, the vector information storage method or device above can be used to obtain n-bit quantized input and output vectors. Of course, depending on the actual situation, only the weights or only the inputs and outputs may be quantized. Besides using the quantization model for inference, the input parameters can also be quantized with the method of the invention, with higher precision than quantizing them to 4 bits with a simple rounding algorithm. In practice, the method of the invention can improve the precision of almost all operators that perform many multiply-add operations on data, such as networks with convolution, attention, and fully connected operators.
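A minimal sketch of this inference case, again hypothetical and building on the sketches above, with zero offsets (B = 0) assumed for both weights and inputs so that the integer accumulator dequantizes by a single division:

```python
def quantized_dense(Wq, S_w, x, n_bits=4):
    """Hypothetical n-bit dense layer: quantize the input vector with the
    dithered method, multiply-accumulate in integers, then dequantize."""
    q_max = 2 ** (n_bits - 1) - 1
    S_x = q_max / float(np.abs(x).max())   # symmetric max-min scale for the input
    xq = np.array(quantize_vector(x, S_x, 0.0, n_bits))
    acc = Wq.astype(np.int32) @ xq         # integer multiply-add
    return acc / (S_w * S_x)               # dequantize: W.x ~ acc / (S_w * S_x)
```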
Application 3: better quantitative models are trained directly.
To obtain a better quantization model, n-bit weight coefficients are used when training the model. To avoid the pattern problem introduced by fixed quantization methods such as truncation or rounding, a dithering technique can be used to introduce random errors while keeping the average value within a block highly accurate. In training, the dithering technique can be applied to the inputs and outputs of an operator as well as to the quantization of the operator's weight parameters. During training, the vector information storage method or device and the quantized vector updating and information storage method or device can be used together, or either can be used alone; the model's quantization parameters are updated with the quantized vector updating and information storage method or device. For example, for the face recognition model MFN, a model trained this way has higher inference precision and generalization ability at int8 precision than at Float32 precision.
Application 4: an n-bit hardware inference accelerator.
In the design of a fixed-point computing unit, if the precision of the calculation result register is higher than that of the result to be finally stored, an error accumulation and dispersion module based on the dithering technique can be designed to randomly disperse the accumulated error to the other components of the output vector. Likewise, if the hardware has a vector quantization processing unit, an error accumulation and dispersion module can be designed to do the same. With this operation, inference accuracy can be improved at the same number of bits, or the number of bits of the data can be reduced at the same inference accuracy. The method of the invention allows AI calculation to approach the accuracy of Float32 calculation with only a few bits, which is a low-cost, low-power approach for hardware. For example, a current Float32 AI chip design can be turned into an int8 chip, and an existing int8 chip can be redesigned to improve the calculation accuracy of an AI network or to further reduce hardware cost by reducing the number of bits; adding the invention to the IC yields a hardware solution with higher cost-effectiveness.
Example 5
Fig. 7 is a schematic structural diagram of an electronic device according to embodiment 5 of the present invention. The electronic device comprises a memory, a processor and a computer program stored on the memory and executable on the processor, which when executed by the processor implements the method of embodiment 1 or 2. The electronic device 50 shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiment of the present invention.
As shown in fig. 7, the electronic device 50 may be embodied in the form of a general purpose computing device, which may be, for example, a server device. The components of the electronic device 50 may include, but are not limited to: the at least one processor 51, the at least one memory 52, and a bus 53 connecting the various system components (including the memory 52 and the processor 51).
The bus 53 includes a data bus, an address bus, and a control bus.
The memory 52 may include volatile memory, such as random access memory (RAM) 521 and/or cache memory 522, and may further include read-only memory (ROM) 523.
Memory 52 may also include a program/utility 525 having a set (at least one) of program modules 524, such program modules 524 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
The processor 51 executes various functional applications and data processing, such as the methods provided in embodiments 1 or 2 of the present invention, by running a computer program stored in the memory 52.
The electronic device 50 may also communicate with one or more external devices 54 (e.g., a keyboard, a pointing device, etc.). Such communication may occur through an input/output (I/O) interface 55. The electronic device 50 may also communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) via the network adapter 56. As shown, the network adapter 56 communicates with the other modules of the electronic device 50 over the bus 53. It should be appreciated that, although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 50, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives, data backup storage systems, etc.
It should be noted that although in the above detailed description several units/modules or sub-units/modules of the electronic device are mentioned, such a division is merely exemplary and not mandatory. Indeed, the features and functionality of two or more of the units/modules described above may be embodied in one unit/module according to embodiments of the invention. Conversely, the features and functions of one unit/module described above may be further divided into embodiments by a plurality of units/modules.
Example 6
The present embodiment provides a computer-readable storage medium on which a computer program is stored, which program, when being executed by a processor, realizes the steps of the method provided in embodiment 1 or 2.
More specific examples of the readable storage medium include, but are not limited to: a portable disk, a hard disk, random access memory, read-only memory, erasable programmable read-only memory, an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In a possible implementation, the invention can also be implemented in the form of a program product comprising program code means for causing a terminal device to carry out the steps of implementing the method according to embodiment 1 or 2 when said program product is run on said terminal device.
Where program code for carrying out the invention is written in any combination of one or more programming languages, the program code may be executed entirely on the user device, partly on the user device, as a stand-alone software package, partly on the user device and partly on a remote device or entirely on the remote device.
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that these are by way of example only, and that the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications are within the scope of the invention.

Claims (14)

1. A vector information storage method, comprising:
setting an initial value of the total error and the information amount of the stored information;
quantizing each vector component of the vector to an effective value that matches the size of the information amount, all the effective values forming a quantized vector of the vector;
storing the quantized vector;
wherein the step of quantizing a vector component of the vector to an effective value that matches the size of the information amount comprises:
calculating an extension error when the vector component is quantized, based on the current total error;
quantizing the vector component, according to the extension error, to an effective value that matches the size of the information amount;
updating the total error, the updated total error being equal to the sum of the total error before updating and a quantization error, the quantization error being equal to the difference between the vector component and the value obtained after inverse quantization of the effective value.
2. The vector information storage method according to claim 1, wherein the step of quantizing the vector component, according to the extension error, to an effective value that matches the size of the information amount specifically comprises:
calculating according to the following formulas
Q=(P+E’-B)×S
Q’=g(Q)
wherein:
Q is used to represent the initial quantization value;
P is used to represent the vector component;
E’ is used to represent the extension error;
S is a preset quantization scale factor;
B is a preset quantization offset value;
Q’ is used to represent the effective value;
g() is used to represent a function that maps Q to an effective value matching the size of the information amount;
the value obtained after inverse quantization of the effective value is (Q’/S)+B.
3. The vector information storage method of claim 1, wherein the size of the information amount comprises a number of bits.
4. A quantized vector updating and information storage method, comprising the following steps:
setting an initial value of the total error, the information amount of the stored information, and the change amount of each vector component of the quantized vector;
quantizing the change amount of each vector component to an effective value that matches the size of the information amount;
applying the quantized effective value of the corresponding change amount to each vector component to obtain updated vector components, all the updated vector components forming an updated quantized vector;
storing the updated quantized vector;
wherein the step of quantizing the change amount of a vector component to an effective value that matches the size of the information amount comprises:
calculating an extension error when the change amount is quantized, based on the current total error;
quantizing the change amount, according to the extension error, to an effective value that matches the size of the information amount;
updating the total error, the updated total error being equal to the sum of the total error before updating and a quantization error, the quantization error being equal to the difference between the change amount and the value obtained after inverse quantization of the effective value.
5. The method of claim 4, wherein the step of quantizing the change amount, according to the extension error, to an effective value that matches the size of the information amount specifically comprises:
calculating according to the following formulas
ΔQ=(Δ+E’-B)×S
ΔQ’=g(ΔQ)
wherein:
ΔQ is used to represent the initial quantization value;
Δ is used to represent the change amount;
E’ is used to represent the extension error;
S is a preset quantization scale factor;
B is a preset quantization offset value;
ΔQ’ is used to represent the effective value;
g() is used to represent a function that maps ΔQ to an effective value matching the size of the information amount;
the value obtained after inverse quantization of the effective value is (ΔQ’/S)+B.
6. The method of claim 4, wherein the size of the information amount comprises a number of bits.
7. A vector information storage apparatus, comprising:
a first setting module, configured to set an initial value of the total error and the information amount of the stored information;
a first quantization module, configured to quantize each vector component of a vector to an effective value that matches the size of the information amount, all the effective values forming a quantized vector of the vector;
a first storage module, configured to store the quantized vector;
wherein the first quantization module, when quantizing a vector component of the vector to an effective value that matches the size of the information amount, is specifically configured to perform:
calculating an extension error when the vector component is quantized, based on the current total error;
quantizing the vector component, according to the extension error, to an effective value that matches the size of the information amount;
updating the total error, the updated total error being equal to the sum of the total error before updating and a quantization error, the quantization error being equal to the difference between the vector component and the value obtained after inverse quantization of the effective value.
8. The vector information storage device of claim 7, wherein, when quantizing the vector component, according to the extension error, to an effective value that matches the size of the information amount, the first quantization module is specifically configured to perform:
calculating according to the following formulas
Q=(P+E’-B)×S
Q’=g(Q)
wherein:
Q is used to represent the initial quantization value;
P is used to represent the vector component;
E’ is used to represent the extension error;
S is a preset quantization scale factor;
B is a preset quantization offset value;
Q’ is used to represent the effective value;
g() is used to represent a function that maps Q to an effective value matching the size of the information amount;
the value obtained after inverse quantization of the effective value is (Q’/S)+B.
9. The vector information storage device of claim 7, wherein the size of the information amount comprises a number of bits.
10. An apparatus for quantized vector updating and information storage, comprising:
a second setting module, configured to set an initial value of the total error, the information amount of the stored information, and the change amount of each vector component of the quantized vector;
a second quantization module, configured to quantize the change amount of each vector component to an effective value that matches the size of the information amount;
a vector updating module, configured to apply the quantized effective value of the corresponding change amount to each vector component to obtain updated vector components, all the updated vector components forming an updated quantized vector;
a second storage module, configured to store the updated quantized vector;
wherein the second quantization module, when quantizing the change amount of a vector component to an effective value that matches the size of the information amount, is specifically configured to perform:
calculating an extension error when the change amount is quantized, based on the current total error;
quantizing the change amount, according to the extension error, to an effective value that matches the size of the information amount;
updating the total error, the updated total error being equal to the sum of the total error before updating and a quantization error, the quantization error being equal to the difference between the change amount and the value obtained after inverse quantization of the effective value.
11. The apparatus for quantized vector updating and information storage of claim 10, wherein, when quantizing the change amount according to the extension error to an effective value that matches the information amount, the second quantization module is specifically configured to:
perform the calculation according to the following formulas:
ΔQ = (Δ + E' - B) × S
ΔQ' = g(ΔQ)
wherein:
ΔQ is used to represent the initial quantization value;
Δ is used to represent the change amount of the vector component;
E' is used to represent the extension error;
S is a preset quantization scale factor;
B is a preset quantization offset value;
ΔQ' is used to represent the effective value;
g() is used to represent a function that constrains ΔQ to an effective value that matches the information amount; and
the value obtained after inverse quantization of the effective value is (ΔQ'/S) + B.
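As with claim 8, a short worked check with assumed values (S = 255, B = 0, Δ = -0.10, and a carried-over extension error E' ≈ -0.0182; the numbers are illustrative only):

ΔQ = (-0.10 + (-0.0182) - 0) × 255 ≈ -30.14
ΔQ' = g(ΔQ) = -30 (rounded to the assumed signed range)
inverse quantization: (-30 / 255) + 0 ≈ -0.1176
quantization error: -0.10 - (-0.1176) ≈ 0.0176, which is added to the total error.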
12. The apparatus for quantized vector updating and information storage of claim 10, wherein the information amount comprises a number of bits.
13. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any one of claims 1 to 6 when executing the program.
14. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
CN201910916421.1A 2019-09-26 2019-09-26 Vector information storage and updating method and device, electronic equipment and storage medium Active CN110764696B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910916421.1A CN110764696B (en) 2019-09-26 2019-09-26 Vector information storage and updating method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910916421.1A CN110764696B (en) 2019-09-26 2019-09-26 Vector information storage and updating method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110764696A CN110764696A (en) 2020-02-07
CN110764696B true CN110764696B (en) 2020-10-16

Family

ID=69330504

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910916421.1A Active CN110764696B (en) 2019-09-26 2019-09-26 Vector information storage and updating method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110764696B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102034219A (en) * 2010-12-16 2011-04-27 北京航空航天大学 Reversible image watermarking method utilizing context modeling and generalizing expansion
CN103294813A (en) * 2013-06-07 2013-09-11 北京捷成世纪科技股份有限公司 Sensitive image search method and device
CN104766269A (en) * 2015-04-16 2015-07-08 山东大学 Spread transform dither modulation watermarking method based on JND brightness model

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6944350B2 (en) * 1999-12-17 2005-09-13 Utah State University Method for image coding by rate-distortion adaptive zerotree-based residual vector quantization and system for effecting same
KR101508704B1 (en) * 2008-08-19 2015-04-03 한국과학기술원 Apparatus and method for transmitting and receving in multiple antenna system
EP2449682B1 (en) * 2009-07-02 2018-05-09 Unify GmbH & Co. KG Method for vector quantization of a feature vector
WO2012035781A1 (en) * 2010-09-17 2012-03-22 パナソニック株式会社 Quantization device and quantization method
CN103176951A (en) * 2013-04-09 2013-06-26 厦门大学 Method for balancing accuracy and calculated amount of multifunctional sensor signal reconstruction
CN104104390B (en) * 2013-04-10 2017-04-19 华为技术有限公司 Signal compression method, signal reconstruction method, and correlation apparatus and system
EP3149971B1 (en) * 2014-05-30 2018-08-29 Qualcomm Incorporated Obtaining sparseness information for higher order ambisonic audio renderers
US10070094B2 (en) * 2015-10-14 2018-09-04 Qualcomm Incorporated Screen related adaptation of higher order ambisonic (HOA) content
CN107645662B (en) * 2017-10-19 2020-06-12 电子科技大学 Color image compression method

Also Published As

Publication number Publication date
CN110764696A (en) 2020-02-07

Similar Documents

Publication Publication Date Title
CN111652368B (en) Data processing method and related product
CN110378468B (en) Neural network accelerator based on structured pruning and low bit quantization
CN110717585B (en) Training method of neural network model, data processing method and related product
JP2021072103A (en) Method of quantizing artificial neural network, and system and artificial neural network device therefor
US20220164666A1 (en) Efficient mixed-precision search for quantizers in artificial neural networks
US20200293893A1 (en) Jointly pruning and quantizing deep neural networks
CN113780549A (en) Quantitative model training method, device, medium and terminal equipment for overflow perception
TW202022798A (en) Method of processing convolution neural network
Langroudi et al. Alps: Adaptive quantization of deep neural networks with generalized posits
US20220222533A1 (en) Low-power, high-performance artificial neural network training accelerator and acceleration method
CN110764696B (en) Vector information storage and updating method and device, electronic equipment and storage medium
CN111209083B (en) Container scheduling method, device and storage medium
CN112001495B (en) Neural network optimization method, system, device and readable storage medium
CN114065913A (en) Model quantization method and device and terminal equipment
CN114595627A (en) Model quantization method, device, equipment and storage medium
CN111240606A (en) Storage optimization method and system based on secure memory
CN112183744A (en) Neural network pruning method and device
CN115062777B (en) Quantization method, quantization device, equipment and storage medium of convolutional neural network
US20230342613A1 (en) System and method for integer only quantization aware training on edge devices
WO2023201424A1 (en) System and method for adaptation of containers for floating-point data for training of a machine learning model
CN115965047A (en) Data processor, data processing method and electronic equipment
CN116957007A (en) Feature quantization method, device, medium and program product for neural network training
CN114565076A (en) Adaptive incremental streaming quantile estimation method and device
CN116910311A (en) Graph network construction method and device, electronic equipment and storage medium
CN115951858A (en) Data processor, data processing method and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant