CN115171201B - Face information identification method, device and equipment based on binary neural network - Google Patents

Face information identification method, device and equipment based on binary neural network Download PDF

Info

Publication number
CN115171201B
CN115171201B CN202211092937.7A CN202211092937A CN115171201B CN 115171201 B CN115171201 B CN 115171201B CN 202211092937 A CN202211092937 A CN 202211092937A CN 115171201 B CN115171201 B CN 115171201B
Authority
CN
China
Prior art keywords
gradient
neural network
value
binary
binary neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211092937.7A
Other languages
Chinese (zh)
Other versions
CN115171201A (en
Inventor
陈鹏
陈宇
胡启昶
李腾
李发成
张如高
虞正华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Moshi Intelligent Technology Co ltd
Original Assignee
Suzhou Moshi Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Moshi Intelligent Technology Co ltd filed Critical Suzhou Moshi Intelligent Technology Co ltd
Priority to CN202211092937.7A priority Critical patent/CN115171201B/en
Publication of CN115171201A publication Critical patent/CN115171201A/en
Application granted granted Critical
Publication of CN115171201B publication Critical patent/CN115171201B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a face information identification method, device and equipment based on a binary neural network, in particular to the technical field of neural networks. The method comprises the following steps: acquiring a target image, extracting an image characteristic value of the target image, and inputting the image characteristic value into a binary neural network; carrying out binary quantization processing on the image characteristic value to obtain a quantization value of the image characteristic value; determining the numerical value of a gradient adjusting coefficient according to the quantization value of the image characteristic value, wherein the gradient adjusting coefficient is used for controlling the probability of binary inversion of the weight in the binary neural network in the training process, and the binary inversion refers to that the weight corresponds to different quantization values before and after single iteration; carrying out gradient updating on the weights by using the gradient adjusting coefficient to obtain the weights after the training is finished, wherein the weights are used for forming a binary neural network after the training is finished; and inputting the image to be recognized into the trained binary neural network, and recognizing the face information in the image to be recognized.

Description

Face information identification method, device and equipment based on binary neural network
Technical Field
The invention relates to the technical field of neural networks, in particular to a face information identification method, a face information identification device and face information identification equipment based on a binary neural network.
Background
Face information recognition is widely applied to systems such as information encryption, system security, identity authentication and the like. Face recognition systems based on deep neural networks face a need for high computational complexity.
A binary neural network is a neural network that employs binary quantization. The binary quantization refers to discretizing the weight and the feature map of the neural network into two states including-1 and 1 (or 0 and 1). Therefore, in order to reduce the computational complexity of the deep neural network, face information recognition may be performed based on a binary neural network.
When the weights and the characteristic values in the neural network are subjected to binary quantization, for many neural networks, the precision loss is large. How to reduce the precision loss of the binary neural network during training, thereby improving the performance of the binary neural network, is a prerequisite for deploying the binary neural network in the practical application of face information recognition.
Disclosure of Invention
The application provides a face information recognition method, a face information recognition device and face information recognition equipment based on a binary neural network, and the training precision of the binary neural network is improved, so that the recognition precision of the trained binary neural network on face information in an image is improved. The technical scheme is as follows.
In one aspect, a face information recognition method based on a binary neural network is provided, and the method includes:
acquiring a target image, extracting an image characteristic value of the target image, and inputting the image characteristic value into a binary neural network;
performing binary quantization processing on the image characteristic value to obtain a quantization value of the image characteristic value;
determining the numerical value of a gradient adjusting coefficient according to the quantization value of the image characteristic value, wherein the gradient adjusting coefficient is used for controlling the probability of binary inversion of the weight in the binary neural network in the training process, and the binary inversion refers to that the weight corresponds to different quantization values before and after single iteration;
performing gradient updating on the weights by using the gradient adjusting coefficients to obtain the weights after training is completed, wherein the weights are used for forming the binary neural network after training is completed;
and inputting the image to be recognized into the trained binary neural network, and recognizing the face information in the image to be recognized.
In still another aspect, there is provided a face information recognition apparatus based on a binary neural network, the apparatus including:
the characteristic value input module is used for acquiring a target image, extracting an image characteristic value of the target image and inputting the image characteristic value into a binary neural network;
the binary quantization module is used for performing binary quantization processing on the image characteristic value to obtain a quantization value of the image characteristic value;
a numerical value determining module, configured to determine a numerical value of a gradient adjustment coefficient according to a quantization value of the image feature value, where the gradient adjustment coefficient is used to control a probability that a weight in the binary neural network undergoes binary inversion in a training process, and the binary inversion refers to that the weight corresponds to different quantization values before and after a single iteration;
the first gradient updating module is used for performing gradient updating on the weights by using the gradient adjusting coefficients to obtain the weights after training is completed, and the weights are used for forming the binary neural network after training is completed;
and the face information recognition module is used for inputting the image to be recognized into the trained binary neural network and recognizing the face information in the image to be recognized.
In a possible implementation manner, the numerical value determining module is further configured to:
determining the numerical value of the gradient adjustment coefficient as a first gradient adjustment coefficient value under the condition that the quantized value of the image characteristic value is a first quantized value;
determining the numerical value of the gradient adjustment coefficient as a second gradient adjustment coefficient value under the condition that the quantized value of the image characteristic value is a second quantized value;
wherein the first gradient adjustment coefficient value is different from the second gradient adjustment coefficient value.
In one possible implementation manner, the first gradient updating module is further configured to:
gradient updating weights in the binary neural network using the following formula:
Figure 215224DEST_PATH_IMAGE001
wherein,
Figure 722428DEST_PATH_IMAGE002
for the weight, is>
Figure 450082DEST_PATH_IMAGE003
For a quantized value of said weight>
Figure 359132DEST_PATH_IMAGE004
For the image characteristic value, < > or>
Figure 897561DEST_PATH_IMAGE005
For a quantized value of the image characteristic value>
Figure 892062DEST_PATH_IMAGE006
Adjust the coefficient for the gradient, < >>
Figure 174138DEST_PATH_IMAGE007
For a number of layers of said binary neural network, in>
Figure 937695DEST_PATH_IMAGE008
For training the number of iterations, ->
Figure 912604DEST_PATH_IMAGE009
For a loss function of the binary neural network>
Figure 128822DEST_PATH_IMAGE010
Is the learning rate.
In one possible implementation, the apparatus further includes: a second gradient update module;
and the second gradient updating module is used for performing gradient updating on the gradient adjusting coefficient so as to obtain the trained gradient adjusting coefficient.
In a possible implementation manner, the second gradient updating module is further configured to:
obtaining a gradient magnitude synchronization coefficient, wherein the gradient magnitude synchronization coefficient is used for ensuring that the magnitude of the gradient adjustment coefficient is synchronous with the magnitude of the gradient of the weight;
and performing gradient updating on the gradient adjusting coefficient by using the gradient magnitude synchronous coefficient to obtain the trained gradient adjusting coefficient.
In a possible implementation manner, the second gradient updating module is further configured to:
determining the gradient magnitude synchronization coefficient using the following equation:
Figure 962392DEST_PATH_IMAGE011
wherein,
Figure 783718DEST_PATH_IMAGE012
for the gradient magnitude synchronization coefficient, based on the gradient magnitude of the image data>
Figure 991845DEST_PATH_IMAGE013
And the number of the weights in the current layer of the binary neural network is used as the weight of the current layer of the binary neural network.
In still another aspect, a computer device is provided, where the computer device includes a processor and a memory, where the memory stores at least one instruction, at least one program, a code set, or a set of instructions, and the at least one instruction, at least one program, a code set, or a set of instructions is loaded and executed by the processor to implement the above-mentioned face information recognition method based on a binary neural network.
In still another aspect, a computer-readable storage medium is provided, and at least one instruction is stored in the storage medium, and the at least one instruction is loaded and executed by a processor to implement the above-mentioned face information recognition method based on a binary neural network.
In yet another aspect, a computer program product is provided, as well as a computer program product or a computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions, so that the computer device executes the face information identification method based on the binary neural network.
The technical scheme provided by the application can comprise the following beneficial effects:
in a binary neural network, the difference between binary quantized data and original data is large, so that a large error exists in gradient calculation, and the error causes that the weight frequently undergoes binary inversion in training, namely the weight corresponds to different quantized values before and after single iteration, so that the optimization direction of the neural network is difficult to converge to a correct direction, and the precision of neural network training is reduced. Based on the technical scheme, in the training process of the binary neural network, a gradient adjustment coefficient is introduced, the numerical value of the gradient adjustment coefficient is related to the quantization value of the image characteristic value used in the training process, and when the weight training is carried out in a gradient updating mode, the gradient adjustment coefficient can control the probability of binary inversion of the weight in the training process, so that the training precision of the neural network can be improved, and the recognition precision of the trained binary neural network on the face information in the image is improved.
Drawings
In order to more clearly illustrate the detailed description of the present application or the technical solutions in the prior art, the drawings needed to be used in the detailed description of the present application or the prior art description will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic diagram illustrating training of a binary neural network, according to an example embodiment.
Fig. 2 is a flowchart illustrating a method of a face information recognition method based on a binary neural network according to an exemplary embodiment.
FIG. 3 is a diagram illustrating gradient adjustment coefficients learned by different layers according to an exemplary embodiment.
FIG. 4 is a diagram illustrating weight flip probabilities in a binary neural network training process, according to an exemplary embodiment.
FIG. 5 is a diagram illustrating a comparison of training accuracy for different training modes according to an example embodiment.
FIG. 6 is a diagram illustrating a comparison of training accuracy for different training modes according to an example embodiment.
Fig. 7 is a block diagram illustrating a configuration of a face information recognition apparatus based on a binary neural network according to an exemplary embodiment.
FIG. 8 is a schematic diagram of a computer device provided in accordance with an exemplary embodiment of the present application.
Detailed Description
The technical solutions of the present application will be described clearly and completely with reference to the accompanying drawings, and it is to be understood that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be understood that "indication" mentioned in the embodiments of the present application may be a direct indication, an indirect indication, or an indication of an association relationship. For example, a indicates B, which may mean that a directly indicates B, e.g., B may be obtained by a; it may also mean that a indicates B indirectly, for example, a indicates C, and B may be obtained by C; it can also mean that there is an association between a and B.
In the description of the embodiments of the present application, the term "correspond" may indicate that there is a direct correspondence or an indirect correspondence between the two, may also indicate that there is an association between the two, and may also indicate and is indicated, configure and is configured, and the like.
In the embodiment of the present application, "predefining" may be implemented by saving a corresponding code, table, or other manners that may be used to indicate related information in advance in a device (for example, including a terminal device and a network device), and the present application is not limited to a specific implementation manner thereof.
Before describing the various embodiments shown herein, several concepts related to the present application will be described.
1) Neural network
The Neural Network is a mathematical model for simulating the structure and function of biological Neural Networks, such as Deep Neural Networks (DNN), convolutional Neural Networks (CNN), recurrent Neural Networks (RNN), and so on.
2) Binary quantization
In recent years, the performance of the neural network is remarkably improved in the fields of computer vision, natural language processing and the like, and the neural network is widely applied to practical applications such as automatic driving, smart phones and the like. However, the high computational load of the neural network poses a great challenge to its deployment on the mobile side. Binary quantization is an effective and extreme model compression means, and can reduce the complexity of a neural network, thereby causing wide attention.
The binary quantization is to discretize the weight (weight) or the characteristic value (feature map) of the neural network into two states including-1 and 1 (or 0 and 1), so as to achieve the purposes of improving the network calculation efficiency and reducing the energy consumption.
In particular, after the weights and the eigenvalues in the neural network are quantized bi-directionally, the calculation of the neural network (convolution operation or full concatenation operation) can use bit operation calculation (if only the weights are quantized, the network calculation uses addition and subtraction operation instead of multiplication). The calculation method using bit operation is very suitable for a computer system compared with the original numerical calculation method, and can bring various benefits such as calculation speed and energy consumption.
The neural network for performing binary quantization is called a binary neural network, and the binary neural network faces a huge challenge in the aspect of precision loss.
The amount of information that a binary neural network can carry is very limited. When the weights and eigenvalues in the neural network are binary quantized, for many neural networks, the accuracy is lost to an unacceptable step. The performance of the binary neural network is improved, the precision loss of the task is reduced, and the method is a prerequisite condition that the binary neural network can be deployed in practical application.
In order to solve the problems, some related works of the known binary neural network aim at special scenes, and a quantization algorithm of binary quantization is separately designed, so that the quantization algorithm is more suitable for the specific scenes; others are constrained using additional penalty functions.
In the technical solutions provided by the above related works, the reason for the accuracy degradation of the binary neural network is not analyzed from the perspective of the binary neural network optimization.
Based on the above situation, in the embodiment of the present application, a training strategy of a binary neural network based on state awareness is proposed, and the training strategy theoretically analyzes the reason that the precision of the binary neural network training is reduced: one of the reasons for instability of binary quantization training is frequent inversion of the binary state of data during training.
Illustratively, with reference to FIG. 1 in conjunction, FIG. 1 indicates the problem faced by a binary neural network in training. Where w and L represent the weight and network loss function, respectively, and t represents the number of iterations of the training.
In a binary neural network, w is quantized binary, depending on the sign (positive or negative) of the data, to determine whether it is quantized to 1 or-1. Because the binary neural network only has two states of-1 and 1, the information loss is large, and further the gradient calculation is inaccurate (when the gradient is calculated, a quantized value is used, and the actual optimization direction is greatly different).
For example, in the t-th iteration shown in fig. 1, the original gradient direction (dotted line head 101) of the weight w is desirably updated in the negative direction (the value decreases), but after quantization, the quantization value is-1, and the gradient direction (dotted line head 102) here is desirably updated in the positive direction in the next iteration of the weight w. Thus, in the t +1 th iteration, the weight w is updated to a positive number (solid arrow 103). Similarly, in the iteration t +1 shown in fig. 1, the actual gradient of the weight w may not match the quantized gradient direction, and the original gradient direction of the weight w (dotted head 104) is preferably updated (the value is increased) in the positive direction next time, whereas the quantized value is 1 after quantization, and the gradient direction (dotted head 105) in this case is preferably updated in the negative direction next time, so that the probability is updated again to a negative value (solid arrow 106). The problem that the actual optimization direction is inconsistent with the quantized optimization direction causes instability of the binary neural network training. A large amount of computing resources are wasted in the wrong update direction, resulting in a reduction in the final training accuracy.
Therefore, it can be seen from the problems faced by the above binary neural network in training that, in the binary neural network, because the expression capability is limited, the difference between the binary quantized data and the original data is large, and a large error is caused during gradient calculation. Such errors make it difficult for the network's optimization direction to converge to the correct direction, and in practice, the manifestation is the frequent repeated inversion of binary data. Some of the repeated inversions are caused by calculation errors after quantization, and belong to invalid inversions, so that the efficiency of network training is reduced.
Correspondingly, the embodiment of the application provides a technical scheme of using different gradient adjustment coefficients for different binary states, so that invalid turnover is reduced, the purposes of stable training and task precision improvement are achieved, and the trained binary neural network can be applied to face information recognition of images.
The technical solutions provided in the present application are further described below with reference to the following examples.
Fig. 2 is a flowchart illustrating a method of a face information recognition method based on a binary neural network according to an exemplary embodiment. The method is performed by a computer device. As shown in fig. 2, the face information recognition method based on the binary neural network may include the following steps:
step 201, acquiring a target image, extracting an image characteristic value of the target image, and inputting the image characteristic value into a binary neural network.
The image feature value refers to a feature value extracted from image data of a target image. Wherein the target image is an image containing face information.
In the embodiment of the application, the image characteristic values are input into the binary neural network, and the binary neural network is exemplarily described to be applied to face information recognition in the field of computer vision.
Step 202, performing binary quantization processing on the image characteristic value to obtain a quantization value of the image characteristic value.
The binary quantization processing is a processing method of quantizing an image feature value into one of a first quantized value and a second quantized value. Such as: the sign of the image feature value is taken, and if the sign is positive, the quantized value of the image feature value is a first quantized value "1", and if the sign is negative, the quantized value of the image feature value is a second quantized value "-1".
It is understood that the embodiment of the present application is only exemplified by a binary neural network, and the technical solution thereof can be similarly applied to other multivalued quantization networks.
Step 203, determining a value of a gradient adjustment coefficient according to the quantization value of the image characteristic value, wherein the gradient adjustment coefficient is used for controlling the probability of binary inversion of the weight in the binary neural network in the training process, and the binary inversion refers to that the weight corresponds to different quantization values before and after a single iteration.
In the computer device, the corresponding relation between the quantized value of the image characteristic value and the numerical value of the gradient adjusting coefficient is prestored, and after the quantized value of the image characteristic value is obtained, the numerical value of the gradient adjusting coefficient corresponding to the quantized value of the image characteristic value is determined according to the quantized value of the image characteristic value obtained currently based on the corresponding relation.
And 204, performing gradient updating on the weights by using the gradient adjusting coefficients to obtain the trained weights, wherein the weights are used for forming the trained binary neural network.
From a mathematical point of view, the direction of the gradient is the direction in which the function increases most rapidly, and the opposite direction of the gradient is the direction in which the function decreases most rapidly. Therefore, in order to minimize the loss function of the neural network (including the binary neural network) and obtain the optimal weights of the neural network, the weights need to be updated in a gradient manner, and the weights need to be continuously optimized in the opposite direction of the gradient.
When the weights are updated in a gradient manner, the weights before and after a single iteration may correspond to different quantization values, so that binary inversion may occur on the weights, and the binary inversion is highly likely to be caused by errors in the quantized gradient calculation and belongs to invalid inversion.
In the embodiment of the application, a new parameter, namely a gradient adjustment coefficient is introduced, and the new parameter is applied to the process of updating the gradient of the weight, and the gradient adjustment coefficient is used for adjusting the quantized gradient calculation so as to control the probability of binary inversion of the weight based on the adjusted gradient.
And step 205, inputting the image to be recognized into the trained binary neural network, and recognizing the face information in the image to be recognized.
In the embodiment of the application, the trained binary neural network can be used for recognizing the face information in the image.
To sum up, in the binary neural network, the difference between the binary quantized data and the original data is large, which causes a large error in gradient calculation, and this error causes that the weights will frequently undergo binary inversion during training, i.e. before and after a single iteration, the weights correspond to different quantized values, so that the optimization direction of the neural network is difficult to converge to the correct direction, resulting in a decrease in the precision of neural network training. Based on the technical scheme, in the training process of the binary neural network, a gradient adjustment coefficient is introduced, the numerical value of the gradient adjustment coefficient is related to the quantization value of the image characteristic value used in the training process, and when weight training is performed in a gradient updating mode, the gradient adjustment coefficient can control the probability of binary inversion of the weight in the training process, so that the training precision of the neural network can be improved, and the recognition precision of the trained binary neural network on the face information in the image is improved.
Since the numerical value of the gradient adjustment coefficient is related to the quantized value of the image feature value, and different quantized values can be understood as different states, the technical scheme provided by the application in the binary neural network training can be called as binary neural network training based on state perception.
Next, such state-aware binary neural network training will be described.
Giving an image characteristic value to be quantized, and when calculating the gradient of the image characteristic value, determining the numerical value of a gradient adjustment coefficient as a first gradient adjustment coefficient value under the condition that the quantization value of the image characteristic value is a first quantization value; under the condition that the quantized value of the image characteristic value is a second quantized value, determining the numerical value of the gradient adjustment coefficient as a second gradient adjustment coefficient value; wherein the first gradient adjustment coefficient value is different from the second gradient adjustment coefficient value.
That is, if the image feature value is recorded as
Figure 633042DEST_PATH_IMAGE014
Setting different gradient adjustment coefficients tau according to different states (1 or-1) −1 And τ 1 . As shown in equation 1:
Figure 319238DEST_PATH_IMAGE015
(formula 1)
Wherein,
Figure 995070DEST_PATH_IMAGE016
is a characteristic value of the image, is based on>
Figure 374099DEST_PATH_IMAGE017
For a quantized value of an image characteristic value>
Figure 751860DEST_PATH_IMAGE018
For the gradient adjustment factor, is selected>
Figure 241747DEST_PATH_IMAGE019
Is a loss function of the binary neural network. It should be understood that when τ is −1 Is equal to tau 1 Then equation 1 degenerates to the traditional binary neural network optimization algorithm.
Accordingly, the weights in the binary neural network are updated in a gradient using equation 2:
Figure 506506DEST_PATH_IMAGE020
(formula 2)
Wherein,
Figure 56436DEST_PATH_IMAGE021
is weighted, in>
Figure 672225DEST_PATH_IMAGE022
Is the quantized value of the weight>
Figure 965803DEST_PATH_IMAGE023
Number of layers for a binary neural network>
Figure 350648DEST_PATH_IMAGE024
For training the number of iterations, ->
Figure 805900DEST_PATH_IMAGE025
Is the learning rate.
In the following, compared with the traditional implementation, that is, the binary neural network training based on state consistency, the binary neural network training based on state perception can suppress frequent invalid rollover caused by gradient calculation errors in the binary neural network for proving.
If equation 3 is set:
Figure 659718DEST_PATH_IMAGE026
(formula)3)
Equation 2 can be simplified to
Figure 491408DEST_PATH_IMAGE027
As can be seen from simplified equation 2, the weight binary inversion occurs only when equation 4 holds:
Figure 996339DEST_PATH_IMAGE028
(formula 4)
Assuming that the initial state of the image feature value is-1, the probability of the weight being subjected to binary inversion is analyzed as follows.
a. When | τ −1| = |τ 1 When |, the probability of binary inversion of the weight from the t-th iteration to the t +1 iterations is as shown in formula 5:
Figure 622492DEST_PATH_IMAGE029
(equation 5)>
Wherein N is the number of all weights, and
Figure 212873DEST_PATH_IMAGE030
the number of weights to satisfy equation 6.
Figure 51516DEST_PATH_IMAGE031
(formula 6)
Equations 4 and 6 mean that if the absolute value of the gradient learning rate product is larger than the absolute value of the current weight, the probability that the quantization value is inverted is larger. Since in equation 4 and equation 6, the weight w of the current time t Fixed in the update, whether the quantized value of the next weight is inverted depends on the gradient update amount b of the current time t (i.e., the product of the gradient magnitude and the learning rate). In the technical scheme provided by the application, the quantity b is artificially updated in the gradient t Besides, a factor based on state perception is additionally added
Figure 207691DEST_PATH_IMAGE032
Figure 457276DEST_PATH_IMAGE032
Can affect the probability of flipping of the quantized values in the binary quantization.
The difficulty degree of the quantized value of the weight can be adjusted by adjusting the value of the gradient adjusting coefficient tau used by the quantized values (-1 and 1) in different states, so that the probability of continuous inversion of the quantized value of the weight in the binary neural network is suppressed. It can be understood that, in the technical solution of the present application, binary inversion is allowed to occur to the quantized values of the weights, because this is a normal training process, but it is necessary to suppress meaningless continuous inversion.
It can be inferred that the probability of the successive and repeated flipping from the t-th iteration to t +2 iterations is shown in equation 7:
Figure 331691DEST_PATH_IMAGE033
(equation 7)
b. When | τ −1 | ≠ |τ 1 In |, the continuous flipping probability of the state-aware-based binary neural network training is smaller than that of the state-aware-based binary neural network training, see the following formula 8 and formula 9:
Figure 708446DEST_PATH_IMAGE034
(formula 8)
Figure 984706DEST_PATH_IMAGE035
(formula 9)
Wherein A refers to the current iteration (A) t Representing the t-th iteration) of the set of quantization value flips. For example, in the above formula, the union of the t-th iteration and the t + 1-th iteration flip set is taken to represent the case of consecutive flip in two consecutive iterations.
The two formulas have the meanings that for the data with the quantized values of 1 and-1 which are very clear in the network optimization process, the updated values of the data cannot be influenced by the extra gradient adjustment factors, and the parameters are always optimized to the correct positions under the influence of the gradient reduction of the network; and for the weight of random swing of the quantization value between 1 and-1 due to the unknown gradient direction, the overturning difficulty of different quantization values is changed by setting a gradient adjusting coefficient tau, and the weight of the part is actively guided to a deviation value (1 or-1). Therefore, in the binary neural network optimization process, frequent invalid turnover of the quantization value of the weight can be inhibited, the training process is stabilized, and the training efficiency is improved.
In the following, the validity of the binary neural network training based on state perception in the present application is proved through experiments.
First, τ is fixed 1 =1, find optimal τ through network training −1 FIG. 3 shows the learned τ of the different layers of resnet-18 −1 Distribution diagram of (c). In fig. 3, the ordinate represents the different channels (channels) and the abscissa is τ −1 The size distribution of (a); sub-graphs a, b, c and d in fig. 3 are statistics of layers 14, 15, 16 and 17 of resnet-18, respectively. From fig. 3, it can be seen that the optimum | τ is −1 L is not equal to | τ 1 And l, namely the binary neural network training based on state perception can be closer to actual data.
Secondly, the weight turnover probability in the actual binary neural network training process is tracked, and fig. 4 is statistics of the turnover rate of the quantized value of the Resnet-18 layer 10 weight. In fig. 4, the ordinate represents the rollover probability and the abscissa represents the number of training epochs (epochs). m refers to the number of consecutive flips, e.g., m =2 represents 2 consecutive flips (only two consecutive flips were analyzed in the above theoretical demonstration); m =3 represents 3 consecutive flips. The probability of continuous flipping expresses to some extent the probability of invalid training occurring. From fig. 4, it can be seen that the binary neural network training based on state awareness (SA in the legend) has a lower probability of flipping compared to the binary neural network training based on state consistency.
Then, the training accuracy of the binary neural network training based on state perception is compared with that of the binary neural network training based on state consistency, see fig. 5. In FIG. 5, base + SA represents binary neural network training based on state perception, base represents binary neural network training based on state consistency, top-1 represents that an accurate value is the first name of a predicted value, and Top-5 represents that the accurate value belongs to the first five names of the predicted values. From fig. 5, it can be seen that the training precision of the binary neural network based on state perception is improved compared with the training precision of the binary neural network based on state consistency.
Finally, the state-aware based binary neural network training is compared with the accuracy of other related works to improve the accuracy of the binary neural network, see fig. 6. In fig. 6, the column SA-BNN represents the effect of state-aware binary neural network training, the column FP represents the performance of a full-precision network, and the other columns show the precision achieved by other related tasks. From fig. 6, it can be seen that the binary neural network training based on state perception has higher precision compared with other related work for improving the precision of the binary neural network.
In an exemplary embodiment, the gradient adjustment coefficients are learnable parameters obtained by training and learning of the network.
That is, before the binary neural network is actually applied to face information recognition, the training method of the binary neural network may further include the steps of:
and carrying out gradient updating on the gradient adjusting coefficient to obtain the trained gradient adjusting coefficient.
In a possible implementation manner, a gradient magnitude synchronization coefficient is obtained, and the gradient magnitude synchronization coefficient is used for ensuring that the magnitude of the gradient adjustment coefficient is synchronous with the magnitude of the gradient of the weight; and performing gradient updating on the gradient adjusting coefficient by using the gradient magnitude synchronous coefficient to obtain the trained gradient adjusting coefficient.
Extensive practice and theoretical analysis in the field of deep learning indicates that neural networks can converge to better local optima when the magnitude of each layer of gradient in the deep learning network is similar. According to this theory, the gradient introduced in equation 1 is adjusted for the coefficient (τ) −1 And τ 1 ) When optimization learning is carried out, the gradient updating magnitude is controlled, and the whole binary neural network can be converged to a better state.
Formally, the magnitude relation between the gradient of the original weight w and the gradient of the newly introduced gradient adjustment coefficient τ in the binary neural network is represented by equation 10:
Figure 155924DEST_PATH_IMAGE036
(equation 10)
According to the foregoing theoretical analysis, it is required that | | | R | =1 to make the magnitude of the gradient adjustment coefficient τ similar to the magnitude of the gradient of other parameters in the binary neural network.
Optionally, using the following equation 11, the gradient magnitude synchronization coefficient is determined:
Figure 517636DEST_PATH_IMAGE037
(formula 11)
Wherein,
Figure 432502DEST_PATH_IMAGE038
is a gradient magnitude synchronization coefficient>
Figure 563269DEST_PATH_IMAGE039
The number of weights in the current layer of the binary neural network.
In the implementation of equation 1, the gradient adjustment coefficient τ is associated with each quantized value of the layer, and its gradient is a sum of all quantized values of the layer, so its gradient quantization is higher than the quantized value of each individual weight. For correction, i | | R | =1, when the gradient adjustment coefficient τ is updated, a fixed gradient magnitude synchronization coefficient g is additionally multiplied.
In summary, in the neural network optimization process, the gradient descent of each learnable parameter is optimized, and when the magnitude of each learnable parameter is approximate, the learnable parameter can converge to a better local optimal point.
It should be noted that the method embodiments described above may be implemented alone or in combination, and the present application is not limited thereto.
Fig. 7 is a block diagram illustrating a configuration of a face information recognition apparatus based on a binary neural network according to an exemplary embodiment. The device comprises:
a feature value input module 701, configured to acquire a target image, extract an image feature value of the target image, and input the image feature value into a binary neural network;
a binary quantization module 702, configured to perform binary quantization processing on the image feature value to obtain a quantization value of the image feature value;
a numerical value determining module 703, configured to determine, according to the quantized value of the image feature value, a numerical value of a gradient adjustment coefficient, where the gradient adjustment coefficient is used to control a probability that a weight in the binary neural network undergoes binary inversion in a training process, where the binary inversion refers to that the weight corresponds to different quantized values before and after a single iteration;
a first gradient updating module 704, configured to perform gradient updating on the weights by using the gradient adjustment coefficients to obtain the trained weights, where the weights are used to form the trained binary neural network;
a face information recognition module 705, configured to input the image to be recognized into the trained binary neural network, and recognize face information in the image to be recognized.
In a possible implementation manner, the value determining module 703 is further configured to:
determining the numerical value of the gradient adjustment coefficient as a first gradient adjustment coefficient value under the condition that the quantized value of the image characteristic value is a first quantized value;
determining the numerical value of the gradient adjustment coefficient to be a second gradient adjustment coefficient value under the condition that the quantized value of the image characteristic value is a second quantized value;
wherein the first gradient adjustment coefficient value is different from the second gradient adjustment coefficient value.
In a possible implementation manner, the first gradient updating module 704 is further configured to:
gradient updating weights in the binary neural network using the following formula:
Figure 653191DEST_PATH_IMAGE040
wherein,
Figure 502199DEST_PATH_IMAGE041
for said weight, based on>
Figure 955177DEST_PATH_IMAGE042
For a quantized value of said weight>
Figure 206030DEST_PATH_IMAGE043
For the image characteristic value, < > or>
Figure 719051DEST_PATH_IMAGE044
For a quantized value of the image characteristic value>
Figure 55354DEST_PATH_IMAGE045
Adjust the coefficient for the gradient, < >>
Figure 312023DEST_PATH_IMAGE046
For a number of layers of said binary neural network, in>
Figure 417382DEST_PATH_IMAGE047
For the number of training iterations>
Figure 350572DEST_PATH_IMAGE048
For a loss function of the binary neural network>
Figure 908592DEST_PATH_IMAGE049
Is the learning rate. />
In one possible implementation, the apparatus further includes: a second gradient update module;
and the second gradient updating module is used for performing gradient updating on the gradient adjusting coefficient so as to obtain the trained gradient adjusting coefficient.
In a possible implementation manner, the second gradient updating module is further configured to:
obtaining a gradient magnitude synchronization coefficient, wherein the gradient magnitude synchronization coefficient is used for ensuring that the magnitude of the gradient adjustment coefficient is synchronous with the magnitude of the gradient of the weight;
and performing gradient updating on the gradient adjusting coefficient by using the gradient magnitude synchronous coefficient to obtain the trained gradient adjusting coefficient.
In a possible implementation manner, the second gradient updating module is further configured to:
determining the gradient magnitude synchronization coefficient using the following equation:
Figure 968952DEST_PATH_IMAGE050
wherein,
Figure 928818DEST_PATH_IMAGE051
for the gradient magnitude synchronization coefficient, based on the gradient magnitude of the image data>
Figure 783641DEST_PATH_IMAGE052
And the number of the weights in the current layer of the binary neural network is used as the weight of the current layer of the binary neural network.
It should be noted that: the face information recognition device based on the binary neural network provided in the above embodiment is only illustrated by the division of the above functional modules, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules to complete all or part of the above described functions. In addition, the apparatus and method embodiments provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments for details, which are not described herein again.
Please refer to fig. 8, which is a schematic diagram of a computer device according to an exemplary embodiment of the present application, the computer device includes a memory and a processor, the memory is used for storing a computer program, and the computer program is executed by the processor, so as to implement the face information recognition method based on the binary neural network.
The processor may be a Central Processing Unit (CPU). The Processor may also be other general purpose processors, digital Signal Processors (DSPs), application Specific Integrated Circuits (ASICs), field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components, or a combination thereof.
The memory, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the methods of the embodiments of the present invention. The processor executes various functional applications and data processing of the processor by executing non-transitory software programs, instructions and modules stored in the memory, that is, the method in the above method embodiment is realized.
The memory may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created by the processor, and the like. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and such remote memory may be coupled to the processor via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
In an exemplary embodiment, a computer-readable storage medium is also provided for storing at least one computer program, which is loaded and executed by a processor to implement all or part of the steps of the above method. For example, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (8)

1. A face information identification method based on a binary neural network is characterized by comprising the following steps:
acquiring a target image, extracting an image characteristic value of the target image, and inputting the image characteristic value into a binary neural network;
carrying out binary quantization processing on the image characteristic value to obtain a quantization value of the image characteristic value;
determining a numerical value of a gradient adjustment coefficient as a first gradient adjustment coefficient value under the condition that the quantization value of the image characteristic value is a first quantization value, determining the numerical value of the gradient adjustment coefficient as a second gradient adjustment coefficient value under the condition that the quantization value of the image characteristic value is a second quantization value, wherein the gradient adjustment coefficient is used for adjusting the quantized gradient calculation so as to control the probability of binary inversion of the weight in the binary neural network in the training process, the binary inversion refers to the fact that the weight corresponds to different quantization values before and after a single iteration, and the first gradient adjustment coefficient value is different from the second gradient adjustment coefficient value;
performing gradient updating on the weights by using the gradient adjusting coefficients to obtain the weights after training is completed, wherein the weights are used for forming the binary neural network after training is completed;
and inputting the image to be recognized into the trained binary neural network, and recognizing the face information in the image to be recognized.
2. The method of claim 1, wherein the gradient updating the weights in the binary neural network using the gradient adjustment coefficients to complete the training of the weights comprises:
gradient updating weights in the binary neural network using the following formula:
Figure 910414DEST_PATH_IMAGE001
wherein,
Figure 11094DEST_PATH_IMAGE002
for said weight, based on>
Figure 99267DEST_PATH_IMAGE003
For the quantized value of the weight, <' >>
Figure 805055DEST_PATH_IMAGE004
For the image characteristic value,
Figure 140221DEST_PATH_IMAGE005
For a quantized value of the image characteristic value>
Figure 682192DEST_PATH_IMAGE006
Adjust the coefficient for the gradient, < >>
Figure 26586DEST_PATH_IMAGE007
For the number of layers of the binary neural network, is->
Figure 134350DEST_PATH_IMAGE008
For training the number of iterations, ->
Figure 356864DEST_PATH_IMAGE009
For a loss function of the binary neural network, <' >>
Figure 120552DEST_PATH_IMAGE010
Is the learning rate.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
and carrying out gradient updating on the gradient adjusting coefficient to obtain the trained gradient adjusting coefficient.
4. The method of claim 3, wherein the gradient updating the gradient adjustment coefficients comprises:
acquiring a gradient magnitude synchronization coefficient, wherein the gradient magnitude synchronization coefficient is used for ensuring that the magnitude of the gradient adjustment coefficient is synchronous with the magnitude of the gradient of the weight;
and performing gradient updating on the gradient adjusting coefficient by using the gradient magnitude synchronous coefficient to obtain the trained gradient adjusting coefficient.
5. The method of claim 4, wherein obtaining gradient magnitude synchronization coefficients comprises:
determining the gradient magnitude synchronization coefficient using the following equation:
Figure 65374DEST_PATH_IMAGE011
wherein,
Figure 230907DEST_PATH_IMAGE012
for the gradient magnitude synchronization factor->
Figure 235772DEST_PATH_IMAGE013
And the number of the weights in the current layer of the binary neural network is used as the weight of the current layer of the binary neural network.
6. A face information recognition device based on a binary neural network is characterized in that the device comprises:
the characteristic value input module is used for acquiring a target image, extracting an image characteristic value of the target image and inputting the image characteristic value into a binary neural network;
the binary quantization module is used for carrying out binary quantization processing on the image characteristic value in the binary neural network to obtain a quantization value of the image characteristic value;
a numerical value determining module, configured to determine, when the quantization value of the image feature value is a first quantization value, a numerical value of a gradient adjustment coefficient as a first gradient adjustment coefficient value, and determine, when the quantization value of the image feature value is a second quantization value, the numerical value of the gradient adjustment coefficient as a second gradient adjustment coefficient value, where the gradient adjustment coefficient is used to adjust quantized gradient calculation so as to control a probability of binary inversion occurring in a weight in the binary neural network during training, where the binary inversion refers to that the weight corresponds to different quantization values before and after a single iteration, and the first gradient adjustment coefficient value is different from the second gradient adjustment coefficient value;
the first gradient updating module is used for performing gradient updating on the weights by using the gradient adjusting coefficients to obtain the weights after training is completed, and the weights are used for forming the binary neural network after training is completed;
and the face information recognition module is used for inputting the image to be recognized into the trained binary neural network and recognizing the face information in the image to be recognized.
7. A computer device comprising a processor and a memory, wherein the memory stores at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by the processor to implement the method for recognizing face information based on binary neural network as claimed in any one of claims 1 to 5.
8. A computer-readable storage medium, having at least one instruction, at least one program, a set of codes, or a set of instructions stored therein, which is loaded and executed by a processor to implement the method for recognizing face information based on binary neural network as claimed in any one of claims 1 to 5.
CN202211092937.7A 2022-09-08 2022-09-08 Face information identification method, device and equipment based on binary neural network Active CN115171201B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211092937.7A CN115171201B (en) 2022-09-08 2022-09-08 Face information identification method, device and equipment based on binary neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211092937.7A CN115171201B (en) 2022-09-08 2022-09-08 Face information identification method, device and equipment based on binary neural network

Publications (2)

Publication Number Publication Date
CN115171201A CN115171201A (en) 2022-10-11
CN115171201B true CN115171201B (en) 2023-04-07

Family

ID=83481475

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211092937.7A Active CN115171201B (en) 2022-09-08 2022-09-08 Face information identification method, device and equipment based on binary neural network

Country Status (1)

Country Link
CN (1) CN115171201B (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7398259B2 (en) * 2002-03-12 2008-07-08 Knowmtech, Llc Training of a physical neural network
WO2021056112A1 (en) * 2019-09-24 2021-04-01 Huawei Technologies Co., Ltd. Training method for quantizing the weights and inputs of a neural network
CN112150497B (en) * 2020-10-14 2024-08-06 浙江大学 Local activation method and system based on binary neural network

Also Published As

Publication number Publication date
CN115171201A (en) 2022-10-11

Similar Documents

Publication Publication Date Title
US10332507B2 (en) Method and device for waking up via speech based on artificial intelligence
US11373087B2 (en) Method and apparatus for generating fixed-point type neural network
EP3971772A1 (en) Model training method and apparatus, and terminal and storage medium
US9400955B2 (en) Reducing dynamic range of low-rank decomposition matrices
CN109034371B (en) Deep learning model reasoning period acceleration method, device and system
CN108416440A (en) A kind of training method of neural network, object identification method and device
US20240071070A1 (en) Algorithm and method for dynamically changing quantization precision of deep-learning network
CN115239593A (en) Image restoration method, image restoration device, electronic device, and storage medium
EP4239585A1 (en) Video loop recognition method and apparatus, computer device, and storage medium
US20230316738A1 (en) Binary neural network-based local activation method and system
US20230252294A1 (en) Data processing method, apparatus, and device, and computer-readable storage medium
CN111178514A (en) Neural network quantification method and system
KR102505946B1 (en) Method and system for training artificial neural network models
KR20220059194A (en) Method and apparatus of object tracking adaptive to target object
CN113505797A (en) Model training method and device, computer equipment and storage medium
CN111275005A (en) Drawn face image recognition method, computer-readable storage medium and related device
JP6942203B2 (en) Data processing system and data processing method
WO2022083165A1 (en) Transformer-based automatic speech recognition system incorporating time-reduction layer
CN113851113A (en) Model training method and device and voice awakening method and device
CN115171201B (en) Face information identification method, device and equipment based on binary neural network
CN114021697A (en) End cloud framework neural network generation method and system based on reinforcement learning
CN111476731B (en) Image correction method, device, storage medium and electronic equipment
CN116957024A (en) Method and device for reasoning by using neural network model
CN110874553A (en) Recognition model training method and device
CN114065920A (en) Image identification method and system based on channel-level pruning neural network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant