CN111340226B

CN111340226B - Training and testing method, device and equipment for quantized neural network model

Info

Publication number: CN111340226B
Application number: CN202010155203.3A
Authority: CN
Inventors: 吴玉东; 吴一超; 梁鼎; 于志鹏; 吕元昊
Original assignee: Beijing Sensetime Technology Development Co Ltd
Current assignee: Beijing Sensetime Technology Development Co Ltd
Priority date: 2020-03-06
Filing date: 2020-03-06
Publication date: 2022-01-25
Anticipated expiration: 2040-03-06
Also published as: CN111340226A

Abstract

The embodiments of the application provide a training method, a testing method, a device and equipment for a quantized neural network model. The training method comprises the following steps: acquiring training data, wherein the training data includes image samples and labels, and each label comprises an identity of the image sample; inputting the training data into a quantized neural network model for quantization processing to obtain quantized features of the image samples; and training the quantized neural network model according to the quantized features to obtain the trained quantized neural network model. By implementing the embodiments of the application, the trained quantized neural network model uses low-bit quantization, so that the amount of computation is reduced and the computation speed is improved, and the model can be more conveniently deployed on terminal equipment; when face recognition is performed with the quantized neural network model, the quantization error can be minimized and the recognition precision of the model is improved.

Description

Training and testing method, device and equipment for quantized neural network model

Technical Field

The application relates to the field of image processing, and in particular to a method, a device and equipment for training and testing a quantized neural network model.

Background

Face recognition technology is widely applied in practice, and face recognition precision has improved greatly in recent years thanks to deep learning and convolutional neural networks. However, as neural network models become more accurate, their hierarchies become deeper and deeper, so that complex neural networks are difficult to deploy on mobile terminals and real-time inference cannot be achieved.

Therefore, a training method of a quantized neural network model is needed to solve the above technical problems.

Disclosure of Invention

The embodiment of the application provides a training and testing method, a device, equipment and a storage medium of a quantitative neural network model, which can solve the problems in the prior art, train the quantitative neural network model, simplify a complex neural network, facilitate the deployment of the complex neural network at a mobile terminal, and improve the recognition precision by adopting the quantitative neural network model to perform face recognition.

In a first aspect, an embodiment of the present application provides a training method for a quantized neural network model, where the method includes: acquiring training data; the training data comprises image samples and labels; the label comprises an identification of the image sample; inputting the training data into a quantitative neural network model for quantitative processing to obtain quantitative characteristics of the image sample; and training the quantitative neural network model according to the quantitative features to obtain the trained quantitative neural network model.

Firstly, acquiring training data, wherein the training data comprises an image sample and a label, and the label comprises an identity of the image sample; then, inputting the training data into a quantitative neural network model for quantitative processing to obtain quantitative characteristics; and finally, according to the quantization characteristics, performing back propagation by adopting a loss function, and training the quantization neural network model towards the direction of reducing the loss function value so as to minimize the loss function value, thereby finally obtaining the trained quantization neural network model. Therefore, when the trained quantization neural network model is used for face recognition, the quantization neural network model can perform quantization processing on the image, so that the operation amount is reduced and the operation rate is improved in the whole image processing process, and the method is more conveniently applied to terminal equipment; and training the quantitative neural network model by adopting a loss function so as to minimize errors and improve the identification precision and the identification accuracy of the quantitative neural network model.

Based on the first aspect, in a possible embodiment, the label further comprises a full-precision feature of the image sample; the method further comprises the following steps: and inputting the image sample into a full-precision neural network for processing to obtain the full-precision characteristic.

It can be understood that the quantized neural network model is represented with a low bit width, so quantization processing with this model introduces a quantization error. Calculating the quantization error, and thereby reducing its influence on face recognition, requires a full-precision feature as a reference. The label therefore also includes the full-precision feature of the image sample, which is obtained in advance by inputting the image sample into a full-precision neural network.

Based on the first aspect, in a possible implementation manner, the training the quantitative neural network model according to the quantitative features to obtain a trained quantitative neural network model includes: obtaining a loss function value of the quantitative neural network model according to the quantitative feature and the full-precision feature; and training the quantitative neural network model according to the loss function value of the quantitative neural network model to obtain the trained quantitative neural network model.

It can be understood that, when back propagation is performed on the quantized neural network model using the loss function, the model is trained according to the value of the loss function in the direction that reduces that value, so as to minimize the loss function.

Based on the first aspect, in a possible implementation manner, the obtaining a value of a loss function of the quantized neural network model according to the quantized feature and the full-precision feature includes: obtaining an individual error of each quantization feature according to the quantization feature and the full-precision feature; the individual error is the intra-class deviation before and after single quantization characteristic quantization processing; the class represents a set of a plurality of quantized features corresponding to image samples with the same identity; obtaining a first parameter of the loss function corresponding to each quantization feature according to the individual error of each quantization feature; and obtaining the value of the loss function according to the first parameter.

It can be seen that, when solving the value of the loss function, it is necessary to first solve an individual error of each quantization feature according to the quantization feature and the full-precision feature, then solve according to the individual error of each quantization feature, obtain a first parameter value of the loss function corresponding to each quantization feature, and finally solve according to the value of the first parameter to obtain the value of the loss function. The individual error in the present application is an intra-class deviation before and after quantization processing of a single quantization feature, and a class represents a set of all single quantization features corresponding to image samples having the same identity. For example, in face recognition, the quantized features corresponding to face images of the same person (the same identity) are of one type.

It should be noted that the individual error changes the structure of the intra-class features and thus affects their compactness. When the quantized neural network model is used for face recognition, the individual error is a key factor affecting recognition accuracy. The individual error is therefore introduced into the loss function in the form of an angle margin, so that the loss function (through its first parameter) contains an individual error term. Back propagation with this loss function trains the quantized neural network model so that the quantization error is minimized and the recognition accuracy of the model is improved. Therefore, by implementing the embodiments of the application, performing face recognition with the quantized neural network model improves recognition precision, and the model is better suited to terminal equipment.
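As an illustrative sketch only, the way an individual error could enter a loss function as an angle margin can be shown with an ArcFace-style additive angular margin softmax. The additive form, the `scale` factor, the `class_weights` matrix and the function name are assumptions made for illustration and are not taken from the application; `indiv_err` is assumed to hold a precomputed individual error (in radians) for each sample.

```python
import numpy as np


def margin_softmax_loss(q_feats, class_weights, labels, indiv_err, scale=32.0):
    """Softmax loss with a per-sample angular margin (hypothetical form).

    q_feats:       (N, D) quantized features
    class_weights: (C, D) one weight vector per identity (assumed)
    labels:        (N,) integer identity index of each sample
    indiv_err:     (N,) individual error of each sample, in radians
    """
    q_feats = np.asarray(q_feats, dtype=np.float64)
    class_weights = np.asarray(class_weights, dtype=np.float64)
    labels = np.asarray(labels)
    indiv_err = np.asarray(indiv_err, dtype=np.float64)

    # Project features and class weights onto the unit hypersphere.
    feats = q_feats / np.linalg.norm(q_feats, axis=1, keepdims=True)
    weights = class_weights / np.linalg.norm(class_weights, axis=1, keepdims=True)

    # Angle between every feature and every class weight.
    theta = np.arccos(np.clip(feats @ weights.T, -1.0, 1.0))

    # Add the individual error as a margin on the true-class angle only,
    # capped at pi so the cosine stays monotonic.
    rows = np.arange(len(labels))
    theta[rows, labels] = np.minimum(theta[rows, labels] + indiv_err, np.pi)

    # Numerically stable log-softmax over the scaled cosines.
    logits = scale * np.cos(theta)
    logits -= logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[rows, labels].mean())
```

In this sketch a larger individual error makes the true-class logit smaller, so the loss value grows and training is pushed toward features whose individual error is small.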

Based on the first aspect, in a possible implementation manner, the obtaining an individual error of each quantization feature according to the quantization feature and the full-precision feature includes: obtaining a quantization error of each quantization feature and a class error of each quantization feature according to the quantization feature and the full-precision feature; the quantization error is the deviation of a single quantization characteristic before and after quantization processing; the class error is the inter-class deviation before and after the quantization processing of a single quantization characteristic; and obtaining the individual error of each quantization feature according to the quantization error and the class error.

It can be seen that the quantization error and the class error of each quantized feature are first obtained according to the quantized feature and the full-precision feature, and the individual error of each quantized feature is then obtained from these two errors. In the present application, for a single quantized feature, the quantization error is decomposed, according to its influence within a class and between classes, into a class error and an individual error. The quantization error is the deviation of the single quantized feature before and after quantization, and the class error is the inter-class deviation of the single quantized feature before and after quantization. The class error affects inter-class separability, and the individual error affects intra-class compactness.

Based on the first aspect, in a possible implementation manner, in an angle space, a quantization error of each quantization feature is an included angle between the full-precision feature corresponding to each quantization feature and the quantization feature; the class error of each quantization feature is an included angle between a class center of the full-precision feature and a class center of the quantization feature; the class center of the full-precision feature is the average value of all full-precision features corresponding to the image samples with the same identity marks, and the class center of the quantitative feature is the average value of all quantitative features corresponding to the image samples with the same identity marks; the obtaining the individual error of each quantization feature according to the quantization error and the class error comprises: and subtracting the quantization error of each quantization feature from the class error of each quantization feature, and then taking an absolute value to obtain an individual error of each quantization feature.

It can be seen that the present application relies on the hypersphere distribution theory of face recognition, namely that features are all distributed on a hypersphere with a fixed radius. In the angle space, for each quantized feature, the quantization error is defined as the included angle between the quantized feature and the corresponding full-precision feature, the class error is defined as the included angle between the class center of the quantized features and the class center of the full-precision features, and the individual error is the absolute value of the difference between the quantization error and the class error. Therefore, for each quantized feature, the quantization error and the class error are first obtained according to the quantized feature and the full-precision feature, and the individual error is then obtained from these two errors.
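The three angular errors defined above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the implementation of the application: the function names, the array layout (one feature per row) and the use of the plain arithmetic mean as the class center are choices made here for clarity.

```python
import numpy as np


def angle(a, b, eps=1e-12):
    """Included angle (radians) between vectors a and b along the last axis."""
    a = a / np.maximum(np.linalg.norm(a, axis=-1, keepdims=True), eps)
    b = b / np.maximum(np.linalg.norm(b, axis=-1, keepdims=True), eps)
    return np.arccos(np.clip(np.sum(a * b, axis=-1), -1.0, 1.0))


def angular_errors(fp_feats, q_feats, ids):
    """Return (quantization error, class error, individual error) per sample.

    fp_feats: (N, D) full-precision features
    q_feats:  (N, D) quantized features of the same samples
    ids:      (N,) identity of each sample
    """
    fp_feats = np.asarray(fp_feats, dtype=np.float64)
    q_feats = np.asarray(q_feats, dtype=np.float64)
    ids = np.asarray(ids)

    # Quantization error: angle between each feature pair.
    quant_err = angle(fp_feats, q_feats)

    # Class error: angle between the two class centers of each identity.
    class_err = np.empty_like(quant_err)
    for identity in np.unique(ids):
        mask = ids == identity
        fp_center = fp_feats[mask].mean(axis=0)
        q_center = q_feats[mask].mean(axis=0)
        class_err[mask] = angle(fp_center, q_center)

    # Individual error: absolute difference of the two angles.
    indiv_err = np.abs(quant_err - class_err)
    return quant_err, class_err, indiv_err
```

As a sanity check on the definitions, if quantization rotates every feature of a class by the same angle, the quantization error equals the class error for each sample and the individual error is zero.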

In a second aspect, an embodiment of the present application further provides a training apparatus for a quantized neural network model, including:

the acquisition module is used for acquiring training data; the training data comprises image samples and labels; the label comprises an identification of the image sample;

the quantization module is used for inputting the training data into a quantization neural network model for quantization processing to obtain quantization characteristics of the image sample;

and the training module is used for training the quantitative neural network model by adopting a loss function according to the quantitative characteristics to obtain the trained quantitative neural network model.

Based on the second aspect, in a possible embodiment, the label further comprises a full-precision feature of the image sample; the device further comprises: and the preprocessing module is used for inputting the image sample into a full-precision neural network for processing to obtain the full-precision characteristic.

Based on the second aspect, in a possible implementation manner, the training module is specifically configured to: obtaining a loss function value of the quantitative neural network model according to the quantitative feature and the full-precision feature; and training the quantitative neural network model according to the loss function value of the quantitative neural network model to obtain the trained quantitative neural network model.

Based on the second aspect, in a possible embodiment, the training module is further configured to: obtaining an individual error of each quantization feature according to the quantization feature and the full-precision feature; the individual error is the intra-class deviation before and after single quantization characteristic quantization processing; the class represents a set of a plurality of quantized features corresponding to image samples with the same identity; obtaining a first parameter of the loss function corresponding to each quantization feature according to the individual error of each quantization feature; and obtaining the value of the loss function according to the first parameter.

Based on the second aspect, in a possible embodiment, the training module is further configured to: obtaining a quantization error of each quantization feature and a class error of each quantization feature according to the quantization feature and the full-precision feature; the quantization error is the deviation of a single quantization characteristic before and after quantization processing; the class error is the inter-class deviation before and after the quantization processing of a single quantization characteristic; and obtaining the individual error of each quantization feature according to the quantization error and the class error.

Based on the second aspect, in a possible implementation manner, in an angle space, a quantization error of each quantization feature is an included angle between the full-precision feature corresponding to each quantization feature and the quantization feature; the class error of each quantization feature is an included angle between a class center of the full-precision feature and a class center of the quantization feature; the class center of the full-precision feature is the average value of all full-precision features corresponding to the image samples with the same identity marks, and the class center of the quantitative feature is the average value of all quantitative features corresponding to the image samples with the same identity marks; the training module is further configured to: and subtracting the quantization error of each quantization feature from the class error of each quantization feature, and then taking an absolute value to obtain an individual error of each quantization feature.

Each functional module in the apparatus provided in the embodiment of the present application is specifically configured to implement the method described in the first aspect.

In a third aspect, an embodiment of the present application provides a method for testing a quantized neural network model, including: acquiring a face image; inputting the face image into a trained quantization neural network model, and processing to obtain quantization characteristics corresponding to the face image; wherein the quantitative neural network model is the quantitative neural network model of any one of claims 1-6; and comparing the quantization characteristics corresponding to the face image with the characteristics prestored in a database to obtain a face recognition result.

It can be understood that, when the trained quantized neural network model is used for face recognition, firstly, a face image is obtained, where the face image may be one face image or multiple images of each person shot from different angles, then the face image is input into the trained quantized neural network model to be processed, so as to obtain quantized features corresponding to the face image, where the quantized neural network model is the quantized neural network model described in any one of the embodiments of the first aspect, and finally, the quantized features corresponding to the face image are compared with features pre-stored in a database, so as to obtain a face recognition result. By implementing the embodiment of the application, the face recognition is carried out by adopting the quantitative neural network model, so that the recognition precision and the recognition accuracy can be improved.

It should be noted that, in the embodiment of the present application, the testing of the quantized neural network model may include verifying whether the trained quantized neural network model is reliable, that is, whether the accuracy rate meets the requirement, and whether further adjustment of the network parameters is needed, and may also include an application of performing face recognition using the trained quantized neural network model or the trained and tested quantized neural network model. That is, for the first aspect and various possible implementation manners, the implementation manner mentioned in the third aspect may also be adopted to test and/or apply the trained quantitative neural network model.

In a fourth aspect, an embodiment of the present application provides a testing apparatus for quantizing a neural network model, including:

the acquisition module is used for acquiring a face image;

the quantization neural network module is used for inputting the face image into the trained quantization neural network model for processing to obtain quantization characteristics corresponding to the face image; wherein the quantitative neural network model is the quantitative neural network model of any one of claims 1-6;

and the result output module is used for comparing the quantized features corresponding to the face image with the features prestored in the database to obtain a face recognition result.

Each functional module in the apparatus provided in the embodiment of the present application is specifically configured to implement the method described in the third aspect.

In a fifth aspect, an embodiment of the present application provides a training apparatus for a quantized neural network model, including a processor, a communication interface, and a memory; the memory is configured to store instructions, the processor is configured to execute the instructions, and the communication interface is configured to receive or transmit data; wherein the processor executes the instructions to perform the method as described in the first aspect or any specific implementation manner of the first aspect.

In a sixth aspect, embodiments of the present application provide a non-volatile storage medium for storing program instructions, which, when applied to a training apparatus for quantifying a neural network model, can be used to implement the method described in the first aspect.

In a seventh aspect, the present application provides a computer program product, which includes program instructions, and when the computer program product is executed by a training apparatus for quantifying a neural network model, the apparatus executes the method of the first aspect. The computer program product may be a software installation package, which may be downloaded and executed on a training apparatus for quantifying a neural network model, in case the method provided using any of the possible designs of the first aspect described above is required, to implement the method of the first aspect.

In an eighth aspect, an embodiment of the present application provides a test apparatus for quantizing a neural network model, including a processor, a communication interface, and a memory; the memory is configured to store instructions, the processor is configured to execute the instructions, and the communication interface is configured to receive or transmit data; wherein the processor, when executing the instructions, performs the method as described in the third aspect or any specific implementation manner of the third aspect.

In a ninth aspect, embodiments of the present application provide a non-volatile storage medium for storing program instructions, which, when applied to a test apparatus for quantifying a neural network model, can be used to implement the method described in the third aspect.

In a tenth aspect, the present application provides a computer program product, which includes program instructions, and when the computer program product is executed by a testing apparatus for quantifying a neural network model, the apparatus executes the method of the third aspect. The computer program product may be a software installation package, which, in case it is desired to use the method provided by any of the possible designs of the aforementioned third aspect, may be downloaded and executed on a test device quantifying the neural network model to implement the method of the third aspect.

It can be seen that the present application provides a training and testing method for a quantized neural network model, where the quantized neural network model is used in open-set face recognition. During training, training data is first acquired, wherein the training data comprises image samples and labels, the labels comprise the identity and the full-precision feature of each image sample, and the full-precision feature is obtained in advance by inputting the image sample into a full-precision neural network model. The training data is then input into the quantized neural network model for quantization processing to obtain quantized features. Next, a quantization error and a class error of each quantized feature are obtained according to the quantized features and the full-precision features, wherein the quantization error is the deviation of a single quantized feature before and after quantization processing, and the class error is the inter-class deviation of a single quantized feature before and after quantization processing. In this calculation, based on the hypersphere distribution theory of face recognition, the quantization error is defined as the included angle between each full-precision feature and the corresponding quantized feature, and the class error is defined as the included angle between the class center of the full-precision features and the class center of the quantized features. The individual error of each quantized feature is then calculated from its quantization error and class error and is introduced into a loss function as an angle margin. Finally, the quantized neural network model is trained according to the value of this loss function, which contains the individual error of each quantized feature, to obtain the trained quantized neural network model.

Therefore, in the embodiments of the application, for a single quantized feature, the quantization error is decomposed in an angle space into a class error and an individual error, wherein the class error affects inter-class separability and the individual error affects intra-class compactness. The individual error is introduced into a loss function as an angle margin, and the quantized neural network model is trained with this loss function. Because the quantized neural network model uses a low bit width, using it for face recognition reduces the amount of computation and improves computational efficiency, so that the model can be more conveniently deployed on terminal equipment; in addition, the quantization error of the model can be minimized and its recognition precision improved.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings based on these drawings without creative effort.

FIG. 1 is a schematic diagram of a training method of a quantized neural network model according to an embodiment of the present disclosure;

FIG. 2 is a schematic diagram of another training method for a quantized neural network model according to an embodiment of the present disclosure;

FIG. 3 is a schematic diagram of features of an image sample provided by an embodiment of the present application;

FIG. 4 is a diagram illustrating the relationship between errors provided by an embodiment of the present application;

FIG. 5 is a diagram illustrating a method for testing a quantized neural network model according to an embodiment of the present disclosure;

FIG. 6 is a block diagram of a system architecture according to an embodiment of the present application;

FIG. 7 is a schematic diagram of a training apparatus for a quantized neural network model according to an embodiment of the present disclosure;

FIG. 8 is a schematic diagram of a training apparatus for a quantized neural network model according to an embodiment of the present disclosure;

FIG. 9 is a schematic diagram of a testing apparatus for a quantized neural network model according to an embodiment of the present disclosure;

FIG. 10 is a schematic diagram of a testing apparatus for a quantized neural network model according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. The described embodiments are only a part of the embodiments of the present application, and not all of them. All other embodiments that can be derived by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application.

It is to be understood that the terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only, and is not intended to be limiting of the application. As used in the examples of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.

It is noted that, as used in this specification and the appended claims, the term "comprises" and any variations thereof are intended to cover non-exclusive inclusions. For example, a system, article, or apparatus that comprises a list of elements/components is not limited to only those elements/components but may alternatively include other elements/components not expressly listed or inherent to such system, article, or apparatus.

It is also understood that the term "if" may be interpreted as "when", "upon", "in response to determining", "in response to detecting" or "in the case of …", depending on the context.

Face recognition is divided into open-set tasks and closed-set tasks, and most face recognition in real life is based on the open-set task. For ease of understanding, the difference between the two is briefly introduced below.

The main difference between open-set face recognition and closed-set face recognition lies in how the face recognition model is trained and tested. In the closed-set case, all faces used for testing belong to people in the training set, and people outside the training set cannot appear. In this case, training the face recognition model is equivalent to training a classification model, which only needs to classify each person in the training set well.

However, in most application scenarios, people outside the training set are likely to be present, and the open-set approach is needed. In the open-set case, the people used for testing are not in the training set, so training the face recognition model is actually equivalent to training a feature extractor, and the training set is no longer needed once the feature extractor is trained. For example, for face recognition in a company, photos of the people of the company are not needed during training, because only a feature extractor has to be trained; the model can be trained with a training set from the internet, and photos of the people of the company are needed only during testing. Suppose that library stores photos of the people of the company, and face-a is a photo of a face to be recognized. After the feature extractor is trained with the training set, each face stored in library is input into the trained feature extractor to obtain the feature vector corresponding to that face, which yields a feature vector library of these people. To recognize face-a, face-a is input into the feature extractor to obtain its feature vector, and this feature vector is then compared with each feature vector in the feature vector library to judge whether face-a is a face in library.
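The library lookup described above can be sketched as follows. This is a minimal illustration, assuming cosine similarity as the comparison and a fixed acceptance threshold; the `extractor` callable, the `threshold` value and both function names are hypothetical and do not come from the application.

```python
import numpy as np


def _unit(v):
    """Flatten a feature and scale it to unit length."""
    v = np.asarray(v, dtype=np.float64).ravel()
    return v / max(np.linalg.norm(v), 1e-12)


def build_feature_library(extractor, library_faces):
    """Map each name in the library to the feature vector of its photo."""
    return {name: _unit(extractor(face)) for name, face in library_faces.items()}


def recognize(extractor, face, feature_library, threshold=0.5):
    """Return (name, similarity); name is None if no entry is close enough."""
    query = _unit(extractor(face))
    best_name, best_sim = None, -1.0
    for name, feat in feature_library.items():
        sim = float(query @ feat)  # cosine similarity of unit vectors
        if sim > best_sim:
            best_name, best_sim = name, sim
    if best_sim < threshold:
        return None, best_sim  # face-a is judged not to be in the library
    return best_name, best_sim
```

Because only the feature extractor and the feature library are used at this stage, new people can be enrolled by adding entries to the library without retraining the model.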

Along with the continuous deepening of the neural network hierarchy, the memory occupation amount of the neural network is larger and larger, and due to the limitations of memory consumption and operation speed, the complex neural network is difficult to deploy in a mobile terminal. As one of the technologies for solving this problem, the quantization technology can not only reduce the data bandwidth and reduce the memory usage, but also accelerate the inference.

Current quantization research is mainly oriented to common closed-set tasks, such as ImageNet, whereas most applications in daily life are face recognition based on open-set tasks, and quantization research aimed at open-set face recognition is very scarce. Open-set face recognition belongs to the category of metric learning: different photos of the same identity need to be mapped near the same point of the metric space, and photos of different identities should be mapped as far apart as possible. Open-set face recognition is therefore more sensitive to the errors caused by quantization (namely quantization errors), which can greatly reduce face recognition accuracy.

In order to reduce quantization errors and improve the accuracy of face recognition, referring to fig. 1, an embodiment of the present application provides a training method for a quantized neural network model, where the method includes, but is not limited to, the following steps:

S101, acquiring training data, wherein the training data comprises image samples and labels, and the labels comprise identity identifiers of the image samples.

The training data comprises image samples and labels. The image samples cover a plurality of different identities, and there may be many image samples of the same identity; for example, when the image samples are face images, they include face images of a plurality of people, and the face images of each person include images shot from different angles. The label includes an identity identifier of the image sample; for example, for face images the identifier may be an ID, where the same ID identifies the plurality of image samples corresponding to the same person and different IDs identify image samples corresponding to different persons.

In one embodiment, the label further comprises full-precision features of the image sample, and the full-precision features of the image sample can be obtained by inputting the image sample into a full-precision neural network model in advance. In the process of training the full-precision neural network model, data exists in a full-precision form, addition, multiplication and other operations are performed in the full-precision form, and the trained full-precision neural network model is finally obtained through multiple times of training, so that in the process of inputting the image sample into the trained full-precision neural network model, the obtained characteristics are also full-precision characteristics.
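As an illustration of the training data described above, the following sketch stores each image sample together with its identity ID and its precomputed full-precision feature, and groups the samples by identity. The class name, field names and file names are hypothetical and serve only the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List

@dataclass
class TrainingSample:
    image_path: str            # hypothetical reference to the image sample
    identity_id: int           # same ID for all images of one person
    fp_feature: List[float]    # full-precision feature computed in advance

def group_by_identity(samples):
    """Collect the samples of each identity, i.e. of each class."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample.identity_id].append(sample)
    return dict(groups)

# Two images of identity 0 and one image of identity 1 (made-up values).
samples = [
    TrainingSample("a0.jpg", 0, [1.0, 0.0]),
    TrainingSample("a1.jpg", 0, [0.9, 0.1]),
    TrainingSample("b0.jpg", 1, [0.0, 1.0]),
]
groups = group_by_identity(samples)
```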

S102, inputting the training data into a quantized neural network model for quantization processing to obtain quantized features of the image samples.

The image samples and the labels are input into the quantized neural network model for quantization processing. The quantization processing performed by the quantized neural network model on the image samples is essentially a process of extracting the features of each image sample: data in the quantized neural network model is represented in low-bit form, weights and activations are quantized into low-precision fixed-point integers, and efficient bit operations or integer operations are performed, thereby obtaining the quantized features.

The structure of the quantized neural network model is similar to or the same as that of a conventional neural network model, and comprises an input layer, convolutional layers, pooling layers, a fully connected layer, an output layer and the like. The main function of a convolutional layer is to extract image features. A convolutional layer comprises a plurality of convolution kernels, each convolution kernel extracts a specific feature, and different convolution kernels extract different features, so that multidimensional features of the image can be extracted after an image sample passes through the convolutional layer. For example, when an image sample of a human face is input, one convolution kernel may extract features of the eyes, another convolution kernel may extract features of the mouth, and so on. The main function of a pooling layer is dimensionality reduction: it extracts features again from the output of the convolutional layer, reducing the dimension of the features or the size of the feature map output by the convolutional layer. During operations such as convolution and pooling, data such as weights and activations exist in low-bit form rather than full-precision form, and operations such as convolution and multiplication are performed on this low-bit data, so that inputting an image sample into the quantized neural network model yields quantized features.
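The present application does not fix a particular quantization scheme. As a minimal sketch, assuming symmetric uniform quantization with one scale per tensor, weights and activations can be mapped to low-bit integers and multiplied with integer arithmetic as follows; the function names, the 4-bit width and the sample values are assumptions for illustration.

```python
def quantize(values, num_bits=4):
    """Map floats to signed integers in [-(2^(b-1) - 1), 2^(b-1) - 1].

    Returns the integers and the scale needed to map them back.
    """
    q_max = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    scale = max_abs / q_max if max_abs > 0 else 1.0
    q = [max(-q_max, min(q_max, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Approximate reconstruction of the original floats."""
    return [x * scale for x in q]

def quantized_dot(qa, scale_a, qb, scale_b):
    """Integer multiply-accumulate, rescaled once at the end."""
    acc = sum(x * y for x, y in zip(qa, qb))
    return acc * scale_a * scale_b

# Made-up weights and activations.
weights = [0.7, -0.32, 0.0, 0.14]
activations = [0.7, 0.2, -0.4, 0.1]
q_w, s_w = quantize(weights)
q_a, s_a = quantize(activations)
approx = quantized_dot(q_w, s_w, q_a, s_a)
```

The difference between `dequantize(q_w, s_w)` and `weights` is the rounding loss introduced by the low-bit representation; it is this kind of deviation, accumulated through the layers, that shows up as the quantization error of the output features.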

S103, training the quantized neural network model according to the quantized features to obtain the trained quantized neural network model.

In the present application, back propagation is performed on the quantized neural network model with a loss function according to the quantized features, in order to minimize the output value of the loss function. A smaller output value of the loss function indicates that the output result is closer to the target value; when face recognition is performed with the quantized neural network model, a smaller loss function output value means higher recognition accuracy of the model. Therefore, during training, the loss function is back-propagated to update the parameters of the preceding layers so that the output value of the loss function decreases, and through continued training the output value is finally minimized, yielding the trained quantized neural network model.

In essence, when the loss function output value is minimum, that is, when the quantized feature obtained by inputting the image sample into the quantized neural network model has the highest similarity with the full-precision feature obtained by inputting the image sample into the full-precision neural network model, the recognition accuracy is also the highest when the face recognition is performed using the quantized neural network model.

It should be noted that, when the quantized neural network model is used to quantize an image, the amount of computation is reduced and the computation speed is increased, but a quantization error is generated at the same time. In order to reduce the influence of quantization errors and improve the recognition precision of the quantized neural network model, the present application redefines, in an angle space, the quantization error generated in the face recognition process of the open-set task and analyzes its influence on open-set face recognition precision; decomposes the quantization error into a class error and an individual error by analyzing its influence on the intra-class and inter-class distributions; and proposes a new loss function. Referring to fig. 2, the details are as follows:

S1031, obtaining the quantization error of each quantized feature and the class error of each quantized feature according to the quantized features and the full-precision features, and further obtaining the individual error of each quantized feature.

According to the hypersphere distribution theory of face recognition, the quantized features of face recognition are distributed, after normalization, on a hypersphere with a fixed radius, and the quantization error can be regarded as a rotation of the original distribution. Rotation consistency means that intra-class compactness is not affected by the rotation of a class, so that the feature recognition rate can be kept the same as that of the full-precision neural network model. The quantization error generated in the face recognition process of the open-set task is redefined in the angle space (the angle space on the hypersphere). For a single feature, the quantization error (QEs) is defined as the included angle between the full-precision (FP) feature and the quantized feature, that is, the deviation of the single feature before and after quantization processing. The quantization error is decomposed into a class error and an individual error: the class error is the inter-class deviation of a single quantized feature before and after quantization processing, and the individual error is the intra-class deviation of a single quantized feature before and after quantization processing. Analyzing the influence of quantization errors on the intra-class and inter-class distributions gives the following results: the class error is the overall rotation of a class caused by quantization and does not change the separability between classes; the individual error is the intra-class deviation of each quantized feature and represents the change of the intra-class structure, which is what essentially affects the performance of the quantized neural network model.

Referring to fig. 3, fig. 3 is a schematic diagram of features provided in an embodiment of the present application. The left side is a full-precision feature structure 100 corresponding to an image sample of an identity (the image sample of the identity is input into a full-precision neural network model to obtain a full-precision feature), the full-precision feature structure 100 is a part of a hypersphere, a small circle in the figure represents a single full-precision feature vector, a large circle represents a class center 11 of the full-precision feature structure 100, the class center 11 is obtained by averaging all the single full-precision feature vectors, and a star in the figure represents a sample feature 12 in the full-precision feature structure 100 (the sample feature corresponds to a sample feature 22 in a quantized feature structure 200 in the right side).

The right side is a quantized feature structure 200 corresponding to the image samples of the same identity (the image samples of the same identity are input into the quantized neural network model to obtain quantized features). The quantized feature structure 200 is also a part of the same hypersphere; a small circle in the figure represents a single quantized feature vector, a large circle represents the class center 21 of the quantized feature structure 200, which is obtained by averaging all quantized feature vectors, and the star represents a sample feature 22 in the quantized feature structure 200 (the sample feature 22 corresponds to the sample feature 12 in the full-precision feature structure 100 on the left).

Referring to the schematic diagram shown in fig. 3, how to obtain the quantization error of each quantization feature and the class error of each quantization feature according to the quantization feature and the full-precision feature, and how to obtain the individual error of each quantization feature according to the quantization error of each quantization feature and the class error of each quantization feature are described below.

The included angle formed at the sphere center by the lines connecting the sphere center to the class center 11 of the full-precision feature structure 100 and to the class center 21 of the quantized feature structure 200 is called the class error, namely θ₁ in the figure. The class error θ₁ characterizes the overall deviation of the class (the quantized features of the image samples corresponding to the identity) before and after quantization. It should be noted that the class error is the inter-class deviation of a single quantized feature before and after quantization processing, so the class error of an individual sample feature is equal to the class error of every other sample feature of the same class. The class error affects inter-class separability (i.e. separability between the features corresponding to images of different identities). The included angle formed at the sphere center by the lines connecting the sphere center to the sample feature 12 and to the sample feature 22 is called the quantization error, namely θ₂ in the figure. The quantization error θ₂ characterizes the deviation of the sample feature 22 before and after quantization (the quantization error is the angle between the full-precision feature and the quantized feature). The absolute value of the difference between the quantization error θ₂ and the class error θ₁ is called the individual error, |θ₂ - θ₁|. The individual error is the intra-class deviation of the sample feature 22 before and after quantization, and it affects intra-class compactness.

In FIG. 3, θ₁ and θ₂ are drawn only schematically. In practice, the class centers and the sample features are all located on a hypersphere of the same fixed radius; θ₁ represents the angle between the two class centers, and θ₂ represents the angle between the features of the same sample before and after quantization. Once θ₁ and θ₂ are determined, the angle between the two class centers is the class error, the angle between the two sample features is the quantization error, and the absolute value of the difference between the quantization error and the class error is the individual error.
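The three errors defined above can be computed as in the following sketch. It assumes that the features of one identity are given as plain lists of floats that are already normalized, and that the class center is the arithmetic mean of those features as described for fig. 3; the function names and the two-dimensional example features are illustrative only.

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def angle(a, b):
    """Angle in radians between two vectors, seen from the sphere center."""
    a, b = normalize(a), normalize(b)
    cos = sum(x * y for x, y in zip(a, b))
    return math.acos(max(-1.0, min(1.0, cos)))

def class_center(features):
    """Mean of all feature vectors of one identity."""
    dim = len(features[0])
    return [sum(f[k] for f in features) / len(features) for k in range(dim)]

def decompose_errors(fp_features, q_features):
    """Return (class_error, [(quantization_error, individual_error), ...])."""
    theta_1 = angle(class_center(fp_features), class_center(q_features))
    per_sample = []
    for fp, q in zip(fp_features, q_features):
        theta_2 = angle(fp, q)
        per_sample.append((theta_2, abs(theta_2 - theta_1)))
    return theta_1, per_sample

def unit(deg):
    """Point on the unit circle, used as a stand-in for a hypersphere."""
    r = math.radians(deg)
    return [math.cos(r), math.sin(r)]

fp = [unit(-10), unit(10)]           # full-precision features of one class
rotated = [unit(20), unit(40)]       # whole class rotated by 30 degrees
distorted = [unit(20), unit(50)]     # rotation plus intra-class change
```

For `rotated`, every feature moves by the same angle as the class center, so the individual errors are zero and only a class error remains. For `distorted`, the class center moves by 35 degrees while the two samples move by 30 and 40 degrees, so each sample has an individual error of 5 degrees.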

Referring to fig. 4, fig. 4 is a schematic diagram of a relationship between quantization errors, class errors, and individual errors provided by an embodiment of the present application. In fig. 4, similarly, the large circle on the left represents the class center 11 of the full-precision feature structure 100, the star represents one sample feature 12 in the full-precision feature structure 100, the large circle on the right represents the class center 21 of the quantized feature structure 200, and the star located near the bottom represents the sample feature 22 corresponding to the sample feature 12. To express the meaning of each error more clearly, each error is expressed here as a Euclidean distance: the line segment c represents the quantization error of the quantized feature, the line segment a represents the class error, and the line segment b represents the individual error. If there were no individual error but only a class error, the sample feature 22 would be located at the position of the sample feature 23; because the individual error affects intra-class compactness and changes the intra-class structure, the feature that actually corresponds to the sample feature 12 is located at the position of the sample feature 22.

S1032, obtaining a value of the first parameter corresponding to each quantization feature in the loss function according to the individual error of each quantization feature, and further obtaining a value of the loss function.

Because the individual error changes the intra-class structure and affects intra-class compactness, it in turn affects the recognition accuracy of the quantized neural network model when the model is used for face recognition. Therefore, the individual error is added as a factor to the first parameter of the loss function, in other words the individual error is introduced into the loss function as an angle margin, so as to reduce the quantization error and improve the recognition accuracy of the quantized neural network model. With the individual error of each quantized feature obtained as described above, the value of the first parameter corresponding to each quantized feature in the loss function can be obtained from the individual error of that quantized feature, and the value of the loss function is then obtained from the value of the corresponding first parameter. The first parameter of the loss function refers to an angle parameter in the loss function; different loss functions have different angle parameters, which must be analyzed according to the specific form of the loss function.

For example, in face recognition, the mainstream cosine-based softmax cross-entropy loss function is commonly used:

[formula not reproduced in this text]

where w_j represents the j-th weight vector in the fully connected layer (j ranges over [0, number of categories - 1]), f_i represents the i-th quantized feature vector among the sample features, the angle parameter of the loss function is the angle between w_j and f_i, and the parameter m is a preset penalty term, typically taken to be 0.5.

In order to improve the recognition precision and recognition rate of the quantized neural network model, the individual error |θ₂ - θ₁| is added to the angle parameter of the cosine-based softmax cross-entropy loss function to obtain the first parameter [formula not reproduced in this text], where λ is a hyper-parameter: the larger λ is, the larger the penalty term, and the smaller λ is, the smaller the penalty term; in a specific application its magnitude is adjusted according to the actual situation. The value of the first parameter is then substituted into the loss function to obtain the new value of the loss function. Thus the value of the first parameter of the loss function, and in turn the value of the loss function, can be obtained from the individual error.
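The formulas themselves do not survive in this text. Purely as a hedged sketch, one form that is consistent with the surrounding definitions (an angle between w_j and f_i, a penalty term m typically taken as 0.5, and the individual error scaled by λ and added to the angle parameter) is an additive-angular-margin softmax. The symbol φ for the angle, the scale factor s, the label y_i, the sample count N, and the choice of applying the margin to the target class only are all assumptions, not statements of the original formulas.

```latex
% Assumed baseline: cosine-based softmax with an additive angular margin m.
% \phi_{j,i}: angle between w_j and f_i; y_i: identity label of sample i;
% s: scale factor; N: number of samples (notation assumed for this sketch).
L = -\frac{1}{N} \sum_{i=1}^{N} \log
    \frac{e^{s \cos(\phi_{y_i,i} + m)}}
         {e^{s \cos(\phi_{y_i,i} + m)} + \sum_{j \neq y_i} e^{s \cos \phi_{j,i}}}

% Assumed first parameter: the target angle plus the scaled individual error.
\phi'_{y_i,i} = \phi_{y_i,i} + \lambda \, \lvert \theta_2 - \theta_1 \rvert

% New loss: the first parameter replaces the target angle.
L' = -\frac{1}{N} \sum_{i=1}^{N} \log
    \frac{e^{s \cos(\phi'_{y_i,i} + m)}}
         {e^{s \cos(\phi'_{y_i,i} + m)} + \sum_{j \neq y_i} e^{s \cos \phi_{j,i}}}
```

Under this assumed form, a sample with a larger individual error receives a larger effective margin and therefore a larger loss, which is in line with the described role of λ as the weight of the penalty term.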

S1033, training the quantitative neural network model according to the loss function value of the quantitative neural network model, and obtaining the trained quantitative neural network model.

The loss function of the quantized neural network model is used for back propagation, and the quantized neural network model is trained according to the value of the loss function, with the aim of minimizing that value. The smaller the value output by the loss function, the closer the obtained quantized features are to the full-precision features, and the higher the recognition precision and accuracy when the quantized neural network model is used for face recognition. Training therefore proceeds in the direction that reduces the value of the loss function until that value is minimized, and the trained quantized neural network model is obtained.
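To make the training signal concrete, the following sketch evaluates a margin-based softmax loss for a single sample and shows that a larger individual error yields a larger loss value. It assumes an additive-angular-margin form in which the margin m plus λ times the individual error is added to the target-class angle only; the scale s, the default values and the function name are hypothetical, and the sketch does not handle target angles pushed beyond π.

```python
import math

def margin_softmax_loss(angles, target, individual_error,
                        m=0.5, lam=1.0, s=16.0):
    """Cross-entropy over scaled cosines with an angular margin on the target.

    angles: angle in radians between the feature and each class weight vector.
    The margin m + lam * individual_error is added to the target angle only
    (an assumption of this sketch).
    """
    logits = []
    for j, a in enumerate(angles):
        if j == target:
            a = a + m + lam * individual_error
        logits.append(s * math.cos(a))
    # Log-sum-exp with the maximum subtracted for numerical stability.
    top = max(logits)
    log_sum = math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - (logits[target] - top)

# Made-up angles between one quantized feature and three class weights.
angles = [0.6, 1.3, 1.5]
small = margin_softmax_loss(angles, target=0, individual_error=0.0)
large = margin_softmax_loss(angles, target=0, individual_error=0.2)
```

Here `large` exceeds `small`, so gradient descent on such a loss is pushed toward parameters that shrink the individual error as well as the target angle itself.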

It can be seen that the embodiment of the application provides a training method for a quantized neural network model. Based on the hypersphere distribution theory of face recognition (i.e. features are all distributed on a hypersphere with a fixed radius), the method defines the quantization error in the angle space and, for a single quantized feature, decomposes the quantization error into a class error and an individual error, where the class error affects inter-class separability and the individual error affects intra-class compactness. The individual error is introduced into a loss function as an angle margin, and the quantized neural network model is trained with this loss function to obtain the trained quantized neural network model. When face recognition is performed with the quantized neural network model, the quantization error can be minimized and the recognition precision of the model is improved. The quantized neural network model, which can also be called a low-bit face recognition model, performs quantization processing with a low bit width, which reduces the amount of computation in image processing, improves the operation rate, and makes the model more convenient to apply to terminal equipment.

Referring to fig. 5, the embodiment of the present application further provides a testing method for a quantized neural network model, which includes, but is not limited to, the following steps:

S201, acquiring a face image.

A face image is acquired. The face image may be a single image, or a plurality of face images of different people shot at different angles; the corresponding face images are acquired according to the specific conditions of the test scenario or application scenario. The manner of acquiring the face image is not particularly limited in the present application.

S202, inputting the face image into the trained quantized neural network model for processing to obtain the quantized features corresponding to the face image.

The acquired face image is input into the trained quantized neural network model and processed to obtain the quantized features of the face image. The quantized neural network model is obtained by training on image samples of a plurality of different identities and their labels, where the labels comprise the identity identifiers of the image samples. The model is trained with a loss function into which the individual error is introduced as an angle margin; the individual error is the key factor affecting the recognition precision of the quantized neural network model. The individual error is obtained from the quantization error and the class error. According to the hypersphere distribution theory, image features are distributed on a hypersphere; for a single feature sample, the quantization error in the angle space is the angular deviation between the quantized feature and the full-precision feature, and the class error is the angular deviation between the mean of all quantized features and the mean of all full-precision features of the image samples with the same identity.

It should be noted that the quantized neural network model operates on low-bit data during training, which makes the neural network model more convenient to deploy, reduces the amount of computation, and improves the image processing rate.

S203, comparing the quantization characteristics of the face image with the characteristics prestored in the database to obtain a face recognition result.

The database is pre-stored with the characteristics of the face image, the quantized characteristics output by the quantized neural network model are compared with the characteristics stored in the database, and the final result is output. Wherein, the output result can be similarity (percentage or decimal), attribute, name, success/failure, etc.

In a specific embodiment, the features pre-stored in the database may be the full-precision features corresponding to the face image. The quantized features of the face image are matched against the full-precision features of the face image pre-stored in the database, and the similarity between the quantized features output by the quantized neural network model and the full-precision features is calculated and output, thereby achieving the purpose of the test.

In a specific application scenario, a company employee can realize attendance card punching in a face recognition mode, the features pre-stored in the database can be the features corresponding to the face images of all employees in the company, the face images are the face images of any employee, the quantitative features of the face images of the employees output by the quantitative neural network model are compared or matched with the features of the face images of all employees in the company pre-stored in the database, the names of the employees are finally output, and the face recognition card punching is completed.

It should be noted that S203 in this embodiment is optional. In practical applications, the quantized neural network model may be used to obtain a required result (for example, performing face recognition, obtaining face recognition accuracy, and the like) in other manners, which is not limited in this application.

It can be seen that the embodiment of the application provides a method for testing a quantitative neural network, and firstly, a face image is obtained; then inputting the face image into the trained quantized neural network model for processing to obtain the quantized features corresponding to the face image; and finally, comparing or matching the quantized features of the face image with features prestored in a database to obtain an output result, wherein the output result can be the accuracy, the image name, pass/fail and the like. Therefore, by implementing the embodiment of the application, after the quantitative neural network model is obtained by training the loss function containing the individual error, the quantitative neural network model is used for face recognition, so that the accuracy and the correct rate of the face recognition are improved, and the quantitative neural network model adopts low bit bandwidth for quantization processing, so that the operation amount in the image processing process is reduced, and the memory consumption is reduced, so that the quantitative neural network model can be more conveniently applied to the terminal equipment.

Referring to fig. 6, fig. 6 shows a system architecture 500 provided by an embodiment of the present application. In the system architecture 500, a data acquisition device 510 is configured to acquire training data. In the embodiment of the present application, the training data includes image samples and labels, where a label includes the identity identifier of an image sample and the full-precision feature of the image sample; the full-precision feature of an image sample is obtained by inputting the image sample into a full-precision neural network model in advance, and the image samples include a plurality of images of a plurality of identities.

After the training data is collected, the data acquisition device 510 stores the training data in the database 520, and the training device 530 trains the quantized neural network model 541 based on the training data maintained in the database 520. Specifically, the training device 530 obtains the training data from the database 520 and performs quantization processing on the image samples to obtain their quantized features; the training device 530 obtains the quantization errors and class errors from the quantized features and the full-precision features, and from these obtains the individual errors; it obtains the value of the first parameter of the loss function by adding the individual error to the corresponding parameter of the loss function (the term first parameter is used to distinguish it from the parameter of the loss function before the individual error is added); it calculates the value of the loss function from the value of the first parameter, performs back propagation according to the value of the loss function, and repeats the training until the output value of the loss function is minimized, thereby obtaining the trained quantized neural network model 541.

After the quantized neural network model 541 is trained, the quantized neural network model 541 may be tested, and taking face recognition as an example, the testing process is described as follows: the face image is acquired by the data acquisition device 510, stored in the database 520, and then input to the trained quantized neural network model 541 through the I/O interface 544, or alternatively, the terminal device 560 inputs the face image into the quantized neural network model 541 through the I/O interface 544, the quantized neural network model 541 quantizes the face image to obtain quantized features, and outputs the quantized features to the other auxiliary modules 543, the other auxiliary modules 543 compare the quantized features output by the quantized neural network model 541 with the features stored in the data storage system 550, and outputs a final result through the I/O interface 544, where the output result may be a face recognition accuracy, a pass/fail, an attribute, a name, and the like. For example, the above-mentioned enterprise employee punches a card through face recognition, and the output result is the employee name; passengers in the high-speed rail station get in the station to take the vehicle through face recognition, and the output result is passing (the gate is opened) or not passing (the gate is in a closed state).

The other assistance module 543 performs subsequent processing on the output of the quantized neural network model 541 to arrive at the desired final result. For example, in the above test on the quantized neural network model 541, the other auxiliary modules 543 are used for comparing and calculating the features output by the quantized neural network model 541 with the features pre-stored in the data storage system 550 to obtain the identification rate, or the attribute, name, pass/fail, etc.

The quantized neural network model 541 trained by the training device 530 may be applied to different systems or devices, such as the execution device 540 shown in fig. 6, where the execution device 540 may be an application server or a server in a cloud server cluster. In fig. 6, the execution device 540 is configured with an input/output (I/O) interface 544 for data interaction with the terminal device 560, and a user may input data to the I/O interface 544 through the terminal device 560. It should be noted that the training device 530 may exist independently of the execution device 540, or may be integrated inside the execution device 540, which is not limited in the present application.

In the process of processing the input data by the execution device 540 or performing relevant processing such as calculation by the calculation module 542 of the execution device 540, the execution device 540 may call data, codes, and the like in the data storage system 550 for corresponding processing, and may store data, instructions, and the like obtained by corresponding processing in the data storage system 550. For example, during testing of the quantitative neural network model 541, other ancillary modules 543 in the execution device 540 call features stored in the data storage system 550, compare with the output of the quantitative neural network model 541, calculate, etc.

It should be noted that the training device 530 may generate corresponding quantized neural network models 541 for different targets or different tasks based on different training data, and the corresponding quantized neural network models 541 may be used to achieve the corresponding targets or complete the corresponding tasks, so as to provide the user with the required results.

Referring to fig. 7, fig. 7 is a schematic diagram of a training apparatus 60 for quantizing a neural network model according to an embodiment of the present application, where the apparatus 60 may include:

an obtaining module 601, configured to obtain training data; the training data includes image samples and labels; the label comprises an identity of the image sample;

the quantization module 602 is configured to input training data into a quantization neural network model for quantization processing to obtain quantization features of an image sample;

the training module 603 is configured to train the quantized neural network model according to the quantized features, so as to obtain a trained quantized neural network model.
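This section does not specify the quantizer used inside the quantized neural network model. As a minimal sketch of what low-bit quantization of a set of values can look like, the following assumes symmetric uniform quantization; the function name `quantize_uniform` and the parameter `num_bits` are hypothetical and are not taken from the application.

```python
def quantize_uniform(values, num_bits=4):
    """Symmetric uniform quantization of a list of floats (assumed scheme)."""
    qmax = 2 ** (num_bits - 1) - 1              # largest integer level, e.g. 7 for 4 bits
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        return [0.0 for _ in values]
    scale = max_abs / qmax                      # step between adjacent levels
    # Map each value to the nearest integer level, clamped to [-qmax, qmax].
    levels = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return [level * scale for level in levels]  # de-quantized values


features = [0.8, -0.31, 0.05, -0.7]
quantized = quantize_uniform(features, num_bits=4)
print(quantized)
```

Each output value lies within half a quantization step of its input, which is the deviation that the quantization error discussed in the embodiments is meant to capture.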

In a possible embodiment, the label further comprises full-precision features of the image sample; the apparatus 60 further comprises: the preprocessing module 604 is configured to input the image sample into a full-precision neural network for processing, so as to obtain a full-precision feature.

In a possible embodiment, the training module 603 is specifically configured to: obtain a value of the loss function of the quantized neural network model according to the quantization features and the full-precision features; and train the quantized neural network model according to the value of the loss function to obtain the trained quantized neural network model.

In a possible embodiment, the training module 603 is further configured to: obtain an individual error of each quantization feature according to the quantization features and the full-precision features, where the individual error is the intra-class deviation of a single quantization feature before and after the quantization processing, and a class is the set of quantization features corresponding to the image samples with the same identity; obtain a first parameter of the loss function corresponding to each quantization feature according to the individual error of that quantization feature; and obtain the value of the loss function according to the first parameter.

In a possible embodiment, the training module 603 is further configured to: obtain the quantization error and the class error of each quantization feature according to the quantization features and the full-precision features, where the quantization error is the deviation of a single quantization feature before and after the quantization processing, and the class error is the inter-class deviation of a single quantization feature before and after the quantization processing; and obtain the individual error of each quantization feature according to the quantization error and the class error.

In a possible embodiment, in the angular space, the quantization error of each quantization feature is the included angle between that quantization feature and its corresponding full-precision feature; the class error of each quantization feature is the included angle between the class center of the full-precision features and the class center of the quantization features; the class center of the full-precision features is the average of all full-precision features corresponding to the image samples with the same identity, and the class center of the quantization features is the average of all quantization features corresponding to the image samples with the same identity; the training module 603 is further configured to: subtract the class error of each quantization feature from the quantization error of that quantization feature, and then take the absolute value to obtain the individual error of each quantization feature.
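The angular-space definitions above can be sketched for a single class as follows. This is an illustrative sketch only: the helper names `angle_between`, `class_center`, and `individual_errors` are hypothetical, and the toy 2-D features are made up for the example.

```python
import math


def angle_between(u, v):
    """Included angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard against rounding
    return math.acos(cos)


def class_center(features):
    """Element-wise average of the feature vectors of one identity."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def individual_errors(full_precision, quantized):
    """Individual error of every sample in ONE class (same identity)."""
    # Class error: angle between the two class centers.
    class_error = angle_between(class_center(full_precision), class_center(quantized))
    errors = []
    for fp, q in zip(full_precision, quantized):
        quantization_error = angle_between(fp, q)        # per-sample angle
        errors.append(abs(quantization_error - class_error))
    return errors


# Toy features of two image samples that share one identity.
fp_features = [[1.0, 0.0], [0.0, 1.0]]
q_features = [[1.0, 1.0], [0.0, 1.0]]
print(individual_errors(fp_features, q_features))
```

In this toy case the class centers differ by a small angle, so a sample whose own quantization error is zero still receives a non-zero individual error equal to the class error.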

The functional modules of the apparatus 60 are used to implement the method described in the embodiment of fig. 1 or fig. 2, and specific contents may refer to descriptions in relevant contents of the embodiment of fig. 1 or fig. 2, and for brevity of description, no further description is given here.

Referring to fig. 8, fig. 8 is a schematic diagram of a training device for a quantized neural network model according to an embodiment of the present application, where the training device may be implemented in an application server 700 and at least includes: a processor 710, a communication interface 720, and a memory 730, with the processor 710, the communication interface 720, and the memory 730 coupled by a bus 740. Wherein:

the processor 710 is configured to execute the obtaining module 601, the quantization module 602, the training module 603, and the preprocessing module 604 in fig. 7 by calling the program code in the memory 730. In practical applications, the processor 710 may include one or more general-purpose processors, where a general-purpose processor may be any type of device capable of processing electronic instructions, including a Central Processing Unit (CPU), a microprocessor, a microcontroller, a main processor, a controller, an Application Specific Integrated Circuit (ASIC), and so on. The processor 710 reads the program code stored in the memory 730 and cooperates with the communication interface 720 to execute part or all of the steps of the training method for the quantized neural network model according to the above embodiments of the present application.

The communication interface 720 may be a wired interface (e.g., an Ethernet interface) for communicating with other computing nodes or devices. When the communication interface 720 is a wired interface, it may employ a protocol family over TCP/IP, such as the RAAS protocol, the Remote Function Call (RFC) protocol, the Simple Object Access Protocol (SOAP), the Simple Network Management Protocol (SNMP), the Common Object Request Broker Architecture (CORBA) protocol, and distributed protocols, among others.

The memory 730 may store program code as well as program data. The program code includes code of the obtaining module 601, code of the quantization module 602, code of the training module 603, and code of the preprocessing module 604. The program data includes: image samples, labels, quantization features, loss functions, quantization errors, individual errors, and the like. In practical applications, the memory 730 may include a volatile memory, such as a Random Access Memory (RAM); it may also include a non-volatile memory, such as a Read-Only Memory (ROM), a flash memory, a Hard Disk Drive (HDD), or a Solid-State Drive (SSD); it may also include a combination of the above types of memory.

Referring to fig. 9, fig. 9 is a schematic diagram of a testing apparatus 80 for quantizing a neural network model according to an embodiment of the present application, where the apparatus 80 may include:

an obtaining module 801, configured to obtain a face image;

the quantization neural network module 802 is configured to input the face image into the trained quantized neural network model for processing, so as to obtain quantization features of the face image; the quantized neural network model is obtained by training on image samples of a plurality of different identities and their labels, where the labels include the identities of the image samples, and the model is trained with a loss function into which the individual errors are introduced as angle margins.

It should be noted that the individual error is key to the recognition accuracy of the quantized neural network model. The individual error is derived from the quantization error and the class error. The quantization error is the deviation between a quantization feature and the corresponding full-precision feature. According to hypersphere distribution theory, image features are distributed on a hypersphere; for a single feature sample, the quantization error in the angular space is therefore the angular deviation between the quantization feature and the full-precision feature, and the class error is the angular deviation between the average of the quantization features and the average of the full-precision features of the image samples with the same identity.

the result output module 803 is configured to compare the quantization features of the face image with the features pre-stored in the database to obtain a face recognition result.
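This section states that the individual error enters the loss function as an angle margin but does not give the formula. The following is a minimal sketch under the assumption of an additive angular margin on the target class (in the style of margin-based softmax losses); the function name `margin_softmax_loss`, the scale value, and the sample cosines are hypothetical.

```python
import math


def margin_softmax_loss(cosines, target, margin, scale=30.0):
    """Cross-entropy over scaled cosine logits, with an additive angular
    margin applied to the target class only (assumed form of the loss)."""
    logits = []
    for j, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))
        if j == target:
            theta = math.acos(c)
            # Per-sample margin, e.g. the individual error of this sample.
            c = math.cos(min(math.pi, theta + margin))
        logits.append(scale * c)
    peak = max(logits)                                   # numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


# Cosine similarity between one feature and three class weight vectors.
cosines = [0.8, 0.1, -0.2]
print(margin_softmax_loss(cosines, target=0, margin=0.0))
print(margin_softmax_loss(cosines, target=0, margin=0.3))
```

Under this assumed form, a sample with a larger individual error receives a larger margin and therefore a larger loss for the same cosine similarity, which pushes training to pull that sample closer to its class center.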

In a possible implementation, the features pre-stored in the database may be the full-precision features corresponding to the face image. The quantization features of the face image are matched against these pre-stored full-precision features, and the similarity between the quantization features produced by the quantized neural network model and the full-precision features is output, thereby achieving the purpose of testing.

In a possible implementation, the output result may be a similarity (as a percentage or a decimal), an attribute, a name, success/failure, and so on. For example, in a specific application scenario, company employees check in for attendance by face recognition. The features pre-stored in the database may be the features corresponding to the face images of all employees of the company, and the face image is the face image of any one employee. The quantization features of that employee's face image output by the quantized neural network model are compared or matched with the pre-stored features of all employees, and finally the name of the employee is output to complete the attendance check by face recognition.
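The comparison step in the attendance scenario above can be sketched as a nearest-match search. This is an illustrative sketch only: cosine similarity as the matching measure, the threshold value, and the names `cosine_similarity`, `identify`, and the sample database entries are assumptions, not details given in the application.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def identify(query, database, threshold=0.5):
    """Return (name, similarity) of the best match, or (None, similarity)
    when even the best match falls below the threshold."""
    best_name, best_sim = None, -1.0
    for name, feature in database.items():
        sim = cosine_similarity(query, feature)
        if sim > best_sim:
            best_name, best_sim = name, sim
    if best_sim < threshold:
        return None, best_sim
    return best_name, best_sim


# Pre-stored features of all employees (made-up values).
database = {
    "employee_a": [0.9, 0.1, 0.0],
    "employee_b": [0.0, 0.2, 0.95],
}
# Quantization feature of the face image to be recognized.
print(identify([0.8, 0.2, 0.1], database))
```

Returning the similarity alongside the name lets the caller report any of the output forms mentioned above (a similarity value, a name, or success/failure).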

The functional modules of the apparatus 80 are used for implementing the method described in the embodiment of fig. 5, and specific contents may refer to the description in the related contents of the embodiment of fig. 5, and for brevity of the description, are not repeated here.

Referring to fig. 10, the present application provides a schematic structural diagram of another test device for a quantized neural network model. The test device of this embodiment may be implemented in a cloud server 900 of a cloud service cluster and at least includes: at least one computing node 910 and at least one storage node 920. Wherein:

the computing node 910 includes one or more processors 911, a communication interface 912, and a memory 913; the processor 911, the communication interface 912, and the memory 913 may be coupled via a bus 914.

The processor 911 includes one or more general-purpose processors and is configured to execute the obtaining module 801, the quantization neural network module 802, and the result output module 803 of fig. 9 by calling the program code in the memory 913. A general-purpose processor may be any type of device capable of processing electronic instructions, including a Central Processing Unit (CPU), a microprocessor, a microcontroller, a main processor, a controller, an Application Specific Integrated Circuit (ASIC), and the like. The processor 911 may be dedicated to a single computing node 910 or may be shared with other computing nodes 910. The processor 911 reads the program code stored in the memory 913 and cooperates with the communication interface 912 to perform some or all of the steps of the method performed by the test apparatus for the quantized neural network model in the above embodiments of the present application.

The communication interface 912 may be a wired interface (e.g., an Ethernet interface) for communicating with other computing nodes or users. When the communication interface 912 is a wired interface, it may employ a protocol family over TCP/IP, such as the RAAS protocol, the Remote Function Call (RFC) protocol, the Simple Object Access Protocol (SOAP), the Simple Network Management Protocol (SNMP), the Common Object Request Broker Architecture (CORBA) protocol, and distributed protocols, among others.

The memory 913 may include a volatile memory, such as a Random Access Memory (RAM); it may also include a non-volatile memory, such as a Read-Only Memory (ROM), a flash memory, a Hard Disk Drive (HDD), or a Solid-State Drive (SSD); it may also include a combination of the above types of memory.

The storage node 920 includes one or more storage controllers 921 and a storage array 922. The storage controller 921 and the storage array 922 may be connected by a bus 923.

The storage controller 921 includes one or more general-purpose processors, where a general-purpose processor may be any type of device capable of processing electronic instructions, including a CPU, a microprocessor, a microcontroller, a main processor, a controller, an ASIC, and the like. The storage controller 921 may be dedicated to a single storage node 920 or may be shared with the computing node 910 or other storage nodes 920. It is understood that in this embodiment each storage node includes one storage controller; in other embodiments, a plurality of storage nodes may share one storage controller, which is not limited herein.

The storage array 922 may include multiple memories. Each memory may be a non-volatile memory, such as a ROM, a flash memory, an HDD, or an SSD, and the storage array may also include a combination of the above kinds of memory. For example, the storage array may be composed of a plurality of HDDs or a plurality of SSDs, or of both HDDs and SSDs. With the aid of the storage controller 921, the plurality of memories are combined in various ways to form a memory group, thereby providing higher storage performance than a single memory and providing data backup. Optionally, the storage array 922 may include one or more data centers. The plurality of data centers may be located at the same site or at different sites, which is not limited herein. The storage array 922 may store program code and program data. The program code includes code of the obtaining module 801, code of the quantization neural network module 802, and code of the result output module 803. The program data includes: face images, quantization features of the face images, recognition accuracy, and the like.

The present application also provides a computer-readable storage medium, where a computer program is stored, where the computer program is executed by hardware (for example, a processor, etc.) to implement part or all of the steps of any one of the methods performed by the training apparatus for quantizing a neural network model in the present application.

The embodiments of the present application also provide a computer program product, which, when being read and executed by a computer, causes a training apparatus for a quantitative neural network model to perform part or all of the steps of the training method for a quantitative neural network model in the embodiments of the present application.

Embodiments of the present application also provide a computer-readable storage medium, where a computer program is stored, where the computer program is executed by hardware (for example, a processor, etc.) to implement part or all of steps of any one of the methods performed by the test apparatus for quantifying a neural network model in the embodiments of the present application.

Embodiments of the present application also provide a computer program product, which, when being read and executed by a computer, causes a testing apparatus for quantifying a neural network model to perform part or all of the steps of the testing method for quantifying a neural network model in the embodiments of the present application.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented, in whole or in part, by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a network of computers, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, digital subscriber line) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, storage disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk, SSD), among others. In the above embodiments, the description of each embodiment has its own emphasis; for parts that are not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electric, mechanical or other form of connection.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiments of the present application.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk. While the present application has been described with reference to specific embodiments, its scope is not limited thereto, and those skilled in the art can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed herein. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A training method of a quantitative neural network model is characterized by comprising the following steps:

acquiring training data; the training data comprises image samples and labels; the label comprises the identity and full-precision characteristics of the image sample;

inputting the training data into a quantitative neural network model for quantitative processing to obtain quantitative characteristics of the image sample;

obtaining an individual error of each quantization feature according to the quantization feature and the full-precision feature; the individual error is the intra-class deviation before and after single quantization characteristic quantization processing; the class represents a set of a plurality of quantized features corresponding to image samples with the same identity;

and training the quantitative neural network model according to the individual error to obtain a trained quantitative neural network model.

2. The method of claim 1, further comprising:

and inputting the image sample into a full-precision neural network for processing to obtain the full-precision characteristic.

3. The method of claim 2, wherein the training the quantitative neural network model according to the individual error to obtain a trained quantitative neural network model comprises:

obtaining a first parameter corresponding to each quantization feature in a loss function of the quantization neural network model according to the individual error of each quantization feature;

obtaining a value of the loss function according to the first parameter;

and training the quantitative neural network model according to the loss function value of the quantitative neural network model to obtain the trained quantitative neural network model.

4. The method of claim 3, wherein obtaining an individual error for each quantized feature from the quantized feature and the full-precision feature comprises:

obtaining a quantization error of each quantization feature and a class error of each quantization feature according to the quantization feature and the full-precision feature; the quantization error is the deviation of a single quantization characteristic before and after quantization processing; the class error is the inter-class deviation before and after the quantization processing of a single quantization characteristic;

and obtaining the individual error of each quantization feature according to the quantization error and the class error.

5. The method according to claim 4, characterized in that, in angular space,

the quantization error of each quantization feature is an included angle between the full-precision feature corresponding to each quantization feature and the quantization feature;

the class error of each quantization feature is an included angle between a class center of the full-precision feature and a class center of the quantization feature; the class center of the full-precision feature is the average value of all full-precision features corresponding to the image samples with the same identity marks, and the class center of the quantitative feature is the average value of all quantitative features corresponding to the image samples with the same identity marks;

the obtaining the individual error of each quantization feature according to the quantization error and the class error comprises:

and subtracting the quantization error of each quantization feature from the class error of each quantization feature, and then taking an absolute value to obtain an individual error of each quantization feature.

6. A test method of a quantized neural network model, comprising:

acquiring a face image;

inputting the face image into a trained quantization neural network model, and processing to obtain quantization characteristics corresponding to the face image; wherein the quantitative neural network model is the quantitative neural network model of any one of claims 1-5;

and comparing the quantization characteristics corresponding to the face image with the characteristics prestored in a database to obtain a face recognition result.

7. A training apparatus for a quantized neural network model, comprising:

the acquisition module is used for acquiring training data; the training data comprises image samples and labels; the label comprises the identity and full-precision characteristics of the image sample;

the training module is used for obtaining an individual error of each quantization feature according to the quantization feature and the full-precision feature; the individual error is the intra-class deviation before and after single quantization characteristic quantization processing; the class represents a set of a plurality of quantized features corresponding to image samples with the same identity;

and the training module is used for training the quantitative neural network model according to the individual error to obtain a trained quantitative neural network model.

8. The apparatus of claim 7, wherein the label further comprises full-precision features of the image sample; the device further comprises:

and the preprocessing module is used for inputting the image sample into a full-precision neural network for processing to obtain the full-precision characteristic.

9. The apparatus of claim 8, wherein the training module is further configured to:

obtaining a value of the loss function according to the first parameter;

10. The apparatus of claim 9, wherein the training module is further configured to:

11. The device according to claim 10, characterized in that in angular space,

the training module is further configured to:

12. A test apparatus for quantifying a neural network model, comprising:

the acquisition module is used for acquiring a face image;

the quantization neural network module is used for inputting the face image into the trained quantization neural network model for processing to obtain quantization characteristics corresponding to the face image; wherein the quantitative neural network model is the quantitative neural network model of any one of claims 1-5;

13. A computer-readable storage medium comprising program instructions which, when run on a computer, cause the computer to perform the method of any one of claims 1-6.

14. A training device for quantizing a neural network model, the device comprising a memory for storing information and data, a processor, and a communication interface for receiving or transmitting information and data, the processor for invoking the information and data stored in the memory and performing the method of any one of claims 1-5.

15. A test apparatus for quantizing a neural network model, the apparatus comprising a memory for storing information and data, a processor, and a communication interface for receiving or transmitting information and data, the processor for invoking the information and data stored in the memory and performing the method of claim 6.