CN109145986B

CN109145986B - Large-scale face recognition method

Info

Publication number: CN109145986B
Application number: CN201810956517.6A
Authority: CN
Inventors: 杨世杰; 黄坤山; 彭文瑜; 林玉山; 杨表
Original assignee: Foshan Tongguan Technology Co ltd; Foshan Nanhai Guangdong Technology University CNC Equipment Cooperative Innovation Institute
Current assignee: Foshan Tongguan Technology Co ltd; Foshan Nanhai Guangdong Technology University CNC Equipment Cooperative Innovation Institute
Priority date: 2018-08-21
Filing date: 2018-08-21
Publication date: 2021-12-24
Anticipated expiration: 2038-08-21
Also published as: CN109145986A

Abstract

A large-scale face recognition method comprises the following steps: step one: training on the data and cleaning the training data; step two: setting the network; step three: input settings, in which 5 key points are used to align the face and the face is normalized to a 112 × 112 color image; step four: output settings; step five: optimizing the Softmax loss function to obtain a new Softmax loss function, and improving the accuracy of face recognition with the new Softmax loss function. The invention provides a large-scale face recognition method that improves face recognition accuracy, thereby enabling large-scale face recognition.

Description

Large-scale face recognition method

Technical Field

The invention relates to the technical field of face recognition, in particular to a large-scale face recognition method.

Background

Face recognition identity authentication is widely used in security, access control, attendance checking and similar fields because it is convenient, fast and contactless. Convolutional neural networks have a large feature-representation capacity and are therefore suitable for large-scale face recognition. However, face recognition must account for intra-class variation caused by factors such as a person's facial expression, pose, age, position and occlusions, and for inter-class variation arising between different identities under differing external lighting, backgrounds and the like. Both kinds of variation are highly complex and nonlinear. They are generally handled with a Softmax loss function, but the Softmax loss only separates features of different classes and does not pull together features belonging to the same class, so the resulting features are not discriminative enough for face recognition.

Disclosure of Invention

The invention aims to provide a large-scale face recognition method, which improves the face recognition precision and further realizes large-scale face recognition.

In order to achieve the purpose, the invention adopts the following technical scheme:

a large-scale face recognition method comprises the following steps:

step one: training on the data and cleaning the training data;

step two: setting the network;

step three: input settings, in which 5 key points are used to align the face and the face is normalized to a 112 × 112 color image;

step four: output settings;

step five: optimizing the Softmax loss function to obtain a new Softmax loss function, and improving the accuracy of face recognition according to the new Softmax loss function, wherein the Softmax loss function is defined as follows:

where m represents the number of samples, n represents the number of sample classes, s represents the scale coefficient, and m + θ_y represents the angular increment, in which the margin m is 0.5.
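The symbol definitions above are consistent with an additive angular margin loss. As a hedged reconstruction rather than the formula as filed, and assuming that form, the loss could be written as follows. The margin is written m_a here, a label introduced only to avoid the clash with the sample count m; y_i (the class label of the ith sample) and θ_j (the angle between the normalized feature and the normalized weight of class j) are likewise notational assumptions.

```latex
L = -\frac{1}{m}\sum_{i=1}^{m}\log
\frac{e^{s\cos(\theta_{y_i}+m_a)}}
     {e^{s\cos(\theta_{y_i}+m_a)}+\sum_{j=1,\,j\neq y_i}^{n}e^{s\cos\theta_j}},
\qquad m_a = 0.5
```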

Preferably, in step five, the optimization of the Softmax loss function includes a primary optimization and a secondary optimization, and the primary optimization includes the following steps:

setting the bias to 0, and adding a margin to the projection angle to obtain the primary optimization formula as follows:

Preferably, the secondary optimization comprises adding a fixed value to the angle.

Preferably, in step one, the training comprises training with a VGG network.

Preferably, in step four, the output settings include using the optimal structure (Convolution -> BN -> Dropout -> FullyConnected -> BN) to connect the last convolution layer to the feature vector.

Advantageous effects:

By optimizing the traditional Softmax loss function, θ is expressed as the angle between the normalized w and x. Where the classes were previously separated by only a single line, the boundary is now enlarged, which facilitates classification. Meanwhile, the training data are cleaned and an optimal network structure is selected, so the recognition accuracy is greatly improved, and with it the efficiency and accuracy of large-scale face recognition.

Drawings

Fig. 1 is a flow chart of the present invention for improving the accuracy of large-scale face recognition.

Detailed Description

The technical solution of the invention is further described below through specific embodiments with reference to the accompanying drawings.

As shown in fig. 1, the large-scale face recognition method of this embodiment includes the following steps:

Step one: training on the data and cleaning the training data. The training set used for training with the VGG network covers 8000 people and contains 3,000,000 images; the test set contains 160,000 images of 500 people. The images vary in quality and span pose, age, lighting, race, occupation, etc. The training data is cleaned.

Step two: setting the network. Various network settings were verified using the MXNet framework, with VGG2 as the training data and Softmax as the loss function. The batch size is set to 512; the learning rate starts at 0.1 and is divided by 10 at 100k, 140k and 160k iterations; the momentum is 0.9 and the weight decay is 5e-4.
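The step-wise learning rate described in step two can be sketched in a few lines of plain Python. This is only an illustration: the function name, the dictionary keys, and the choice that the drop takes effect exactly at the listed iteration are assumptions, not details given in the text.

```python
def learning_rate(iteration, base_lr=0.1,
                  milestones=(100_000, 140_000, 160_000), factor=0.1):
    """Step schedule: multiply base_lr by `factor` once per milestone reached."""
    reached = sum(1 for step in milestones if iteration >= step)
    return base_lr * factor ** reached


# Remaining hyperparameters quoted in the text; key names are illustrative.
TRAIN_CONFIG = {"batch_size": 512, "momentum": 0.9, "weight_decay": 5e-4}

if __name__ == "__main__":
    for it in (0, 99_999, 100_000, 140_000, 160_000):
        print(it, learning_rate(it))
```

The rate stays at 0.1 up to iteration 99,999 and is then divided by 10 at each of the three milestones.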

Step three: input settings. The face is aligned using 5 key points and normalized to a 112 × 112 color image. To preserve the resolution of the feature map, this embodiment replaces the 7 × 7 convolution with stride 2 by a 3 × 3 convolution with stride 1. The output of the last convolution layer is 7 × 7.
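One common way to align a face from 5 key points is to fit a similarity transform (rotation, uniform scale, translation) that maps the detected points onto fixed reference positions in the 112 × 112 crop. The sketch below uses a closed-form least-squares fit over complex numbers. It is an illustration under assumptions, not the method as filed: the reference coordinates in `TEMPLATE_112` follow a template commonly seen in open-source face recognition code and are not given in this document, and the function names and sample landmarks are hypothetical.

```python
# Reference (x, y) positions of left eye, right eye, nose tip, left and right
# mouth corners in a 112 x 112 crop. ASSUMPTION: values taken from a template
# commonly used in open-source code; the document itself specifies none.
TEMPLATE_112 = [
    (38.2946, 51.6963),
    (73.5318, 51.5014),
    (56.0252, 71.7366),
    (41.5493, 92.3655),
    (70.7299, 92.2041),
]


def fit_similarity(src, dst):
    """Least-squares similarity transform mapping src points onto dst points.

    Points are treated as complex numbers, so the transform is z -> a*z + b,
    where complex a encodes rotation and scale and complex b the translation.
    """
    zs = [complex(x, y) for x, y in src]
    ws = [complex(x, y) for x, y in dst]
    z_mean = sum(zs) / len(zs)
    w_mean = sum(ws) / len(ws)
    num = sum((w - w_mean) * (z - z_mean).conjugate() for z, w in zip(zs, ws))
    den = sum(abs(z - z_mean) ** 2 for z in zs)
    a = num / den
    b = w_mean - a * z_mean
    return a, b


def apply_similarity(a, b, point):
    """Map one (x, y) point through z -> a*z + b."""
    w = a * complex(*point) + b
    return (w.real, w.imag)


if __name__ == "__main__":
    # Hypothetical detected landmarks in the source image (pixel coordinates).
    detected = [(210.0, 180.0), (290.0, 176.0), (252.0, 228.0),
                (218.0, 270.0), (286.0, 268.0)]
    a, b = fit_similarity(detected, TEMPLATE_112)
    print("scale:", abs(a))
    print([apply_similarity(a, b, p) for p in detected])
```

The fitted pair (a, b) would then be handed to an image library to warp the source image into the 112 × 112 crop; that warping step is outside this sketch.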

Step four: output settings. The last convolution layer is connected to the feature vector through the optimal structure (Convolution -> BN -> Dropout -> FullyConnected -> BN).

Step five: optimizing the Softmax loss function to obtain a new Softmax loss function, and improving the accuracy of face recognition according to the new Softmax loss function, wherein the definition of the common Softmax loss function is as follows:
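For reference, the standard Softmax loss is usually written in the form below. The symbols x_i (feature of the ith sample), y_i (its class label), and W_j and b_j (weight vector and bias of class j) are notation assumed here, while m and n follow this document's usage (sample count and class count). The second identity shows why each logit depends only on the angle θ_j once the bias is dropped and W_j and x_i are normalized.

```latex
L_{\text{softmax}} = -\frac{1}{m}\sum_{i=1}^{m}\log
\frac{e^{W_{y_i}^{\top}x_i + b_{y_i}}}{\sum_{j=1}^{n}e^{W_j^{\top}x_i + b_j}},
\qquad
W_j^{\top}x_i = \lVert W_j\rVert\,\lVert x_i\rVert\cos\theta_j
```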

The traditional Softmax cannot optimize the inter-class and intra-class distances of the feature points. For computational convenience the bias is set to 0, so that θ denotes the angle between w and x after normalization. In this form of Softmax, the boundary between classes is just a single line, and points that fall near the boundary can degrade the generalization ability of the whole model. To widen this boundary, points of different classes should be pushed as far apart as possible, which can be achieved by adding a margin to the projection angle. The function after the primary optimization is therefore as follows:

After the primary optimization, the Softmax function fails to converge in the initial stage of training, so a secondary optimization is performed in which a fixed value is added to the angle. The function after the secondary optimization is as follows:

After optimization, the angular distance acts on the angle more directly than the cosine distance does.
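The effect of adding a fixed value to the angle can be illustrated with a minimal per-sample sketch. It assumes the loss takes the additive angular margin form, with the target logit s·cos(θ_y + margin) and all other logits s·cos(θ_j). The function name and the scale value `s=64.0` are illustrative assumptions; only the margin of 0.5 comes from the text.

```python
import math


def margin_softmax_loss(cosines, target, s=64.0, margin=0.5):
    """Loss for one sample, given cos(theta_j) for every class j.

    ASSUMPTION: additive angular margin form; s=64.0 is an illustrative scale.
    """
    # Recover the target angle, clamping to the valid domain of acos.
    theta_y = math.acos(max(-1.0, min(1.0, cosines[target])))
    logits = [s * c for c in cosines]
    logits[target] = s * math.cos(theta_y + margin)
    # Log-sum-exp with the maximum subtracted for numerical stability.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


if __name__ == "__main__":
    cosines = [0.8, 0.3, -0.1]  # hypothetical cosines for three classes
    print("without margin:", margin_softmax_loss(cosines, 0, margin=0.0))
    print("with margin 0.5:", margin_softmax_loss(cosines, 0, margin=0.5))
```

With the margin applied, the same sample incurs a larger loss, so training keeps pushing its feature closer to its class center. The sketch does not treat the case where θ_y + margin exceeds π, which a full implementation would need to handle.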

The technical principle of the present invention is described above in connection with specific embodiments. The description is made for the purpose of illustrating the principles of the invention and should not be construed in any way as limiting the scope of the invention. Based on the explanations herein, those skilled in the art will be able to conceive of other embodiments of the present invention without inventive effort, which would fall within the scope of the present invention.

Claims

1. A large-scale face recognition method, characterized in that the method comprises the following steps:

step two: setting the network, in which various network settings are verified using the MXNet framework, with VGG2 as the training data and Softmax as the loss function; the batch size is set to 512, the learning rate starts at 0.1 and is divided by 10 at 100k, 140k and 160k iterations, the momentum is 0.9, and the weight decay is set to 5e-4;

step four: output settings, including connecting the last convolution layer to the feature vector using the optimal structure Convolution -> BN -> Dropout -> FullyConnected -> BN;

step five: optimizing the Softmax loss function to obtain a new Softmax loss function, and improving the accuracy of face recognition according to the new Softmax loss function, wherein the method comprises the following steps:

optimizing the Softmax loss function comprises a primary optimization and a secondary optimization, wherein the primary optimization comprises the following steps:

；

where m represents the number of samples, i indexes the ith sample, n represents the number of sample classes, j indexes the jth sample class, s represents the scale coefficient,

representing a fixed value, representing the angle of projection of the feature points,

representing the included angle of the feature point after projection normalization;

and performing secondary optimization on the formula after the primary optimization, wherein the secondary optimization comprises adding a fixed value on the angle as follows:

；

where m represents the number of samples, i indexes the ith sample, n represents the number of sample classes, j indexes the jth sample class, s represents the scale coefficient, m + θ_y indicates the angular increment, in which m = 0.5 represents a fixed value, and θ_y represents the angle of the feature point after projection normalization.