WO2021218060A1 - Face recognition method and device based on deep learning

Info

Publication number: WO2021218060A1
Application number: PCT/CN2020/122220
Authority: WO
Inventors: 张芳健; 刘军; 程炜; 裴炜冬; 李六武
Original assignee: 深圳英飞拓智能技术有限公司
Priority date: 2020-04-29
Filing date: 2020-10-20
Publication date: 2021-11-04
Also published as: CN111639535A; CN111639535B

Abstract

A face recognition method and device based on deep learning. The method comprises: obtaining a face image training sample and a face image to be detected (S101); extracting a face training feature from the face image training sample, and extracting a face feature to be detected from said face image (S102); constructing a convolutional neural network model, and training on the face training feature of the face image training sample by using the convolutional neural network model to obtain a face recognition model (S103); and comparing said face feature of said face image according to the trained face recognition model so as to recognize said face image (S104). The method makes the inter-class spacing more uniform, allows more categories of training data to be trained, and enables large-scale face data training, thereby improving both the efficiency and the performance of face recognition.

Description

Face recognition method and device based on deep learning

This application claims the priority of the Chinese patent application with application number 202010358934.8 filed at the Chinese Patent Office on April 29, 2020, the entire content of which is incorporated into this application by reference.

Technical field

The present invention relates to the technical field of image processing, in particular to a face recognition method, device and readable storage medium based on deep learning.

Background technique

With the development of face recognition technology, face recognition products have become widely used in daily life. At present, the main recognition function of face recognition technology is based on the convolutional neural network (CNN): a convolutional neural network is trained on a large face image data set so that, once training converges, the network is able to recognize faces. Because many products now need to cover millions of people, training the network model has become more difficult. For this reason, many current training methods use the softmax classification activation function to define identity, so that model training becomes the iterative optimization of the softmax classification loss function (softmax loss), which reduces the complexity of training and improves its effect.

Model training with the softmax loss described above can correctly distinguish faces of different categories, but it cannot make the interval between different categories large enough, so the face recognition effect is not ideal. To increase the facial feature interval between different categories, the RegularFace method is often used. RegularFace can ensure that a certain safety interval is formed between different categories and can control the distribution of the categories, but it has the following problems: 1. When training with the RegularFace loss function, if a certain category has many more training samples, it may interfere strongly with the inter-class distances and make the distances between categories non-uniform; 2. In the early stage of training, the model has not yet formed a good classification function, that is, the category center points represented by the W parameter of the convolutional layer are not sufficiently separated, which prolongs training; 3. When the number of training sample categories is large, the amount of computation needed to find the cosine distance between classes becomes very large, which makes training difficult or impossible to run on most current computers.

In view of this, it is necessary to propose further improvements to the current face recognition technology.

Technical problem

In order to solve at least one of the above technical problems, the main purpose of the present invention is to provide a face recognition method, device and readable storage medium based on deep learning.

Technical solutions

In order to achieve the above objective, the first technical solution adopted by the present invention is to provide a face recognition method based on deep learning, including:

Obtain face image training samples and face images to be tested;

Extract face training features from the face image training samples, and extract face features to be tested from the face image to be tested;

Construct a convolutional neural network model, and use the convolutional neural network model to train on the face training features of the face image training samples to obtain a face recognition model, where the training specifically includes performing first-stage training of the convolutional neural network model using the Arcface loss function until the convolutional neural network model reaches a convergent state, and performing second-stage training of the convolutional neural network model using the intra-class and inter-class loss function;

According to the trained face recognition model, compare the face features to be tested of the face image to be tested, so as to recognize the face image to be tested.

Wherein, the comparison of the face features to be tested in the face image to be tested according to the trained face recognition model to recognize the face image to be tested specifically includes:

According to the trained face recognition model, compare the face features of the face images to be tested;

When the comparison is successful, obtain the face ID corresponding to the face training feature in the face recognition model; and

Use the face ID as the recognition result of the face picture to be tested.

Wherein, after obtaining the face ID corresponding to the face training feature in the face recognition model, the method further includes:

Check whether the face ID obtained from the successful comparison is unique;

When the face ID is unique, use the cosine distance between the face feature to be tested and the successfully compared face training feature of the face recognition model to identify whether the face feature to be tested and the compared face ID belong to the same person.

Wherein, the first stage training of the convolutional neural network model by using the Arcface loss function to obtain the convolutional neural network model convergence state includes:

Normalize the face training features and the weight parameters of the fully connected layer in the convolutional neural network model respectively, and calculate the Arcface loss function in the loss layer of the convolutional neural network model;

The Arcface loss function is used to guide the convolutional neural network model for training, and the convergence state of the convolutional neural network model is obtained.

Wherein, the second-stage training of the convolutional neural network model using the intra-class and inter-class loss function includes:

Normalize the input parameters of the loss layer in the convolutional neural network model;

According to the input parameters of the loss layer and the weight parameters of the fully connected layer, the intra-class and inter-class loss functions are calculated;

Use the intra-class and inter-class loss function to guide the training of the convolutional neural network model, and obtain the face recognition model.

Wherein, the face training features and the weight parameters of the fully connected layer in the convolutional neural network model are respectively normalized, and the Arcface loss function is calculated in the loss layer of the convolutional neural network model, which specifically includes:

Normalize the face training features and the weight parameters of the fully connected layer in the convolutional neural network model to obtain the cosine distance between the face training features and the corresponding weight parameters;

Calculate the inverse trigonometric function through the cosine distance to obtain the angle of the feature category;

Increase the angle of the feature category by the interval value to get the angle of the modified feature category;

The Arcface loss function is formed according to the angle of the feature category and the angle of the modified feature category.

Wherein, the intra-class and inter-class loss function is calculated according to the input parameters of the loss layer and the weight parameters of the fully connected layer, which specifically includes:

Perform a negative logarithmic transformation on the angle of the modified feature category to obtain the intra-class angular distance;

According to the input parameters of the loss layer and the weight parameters of the fully connected layer, calculate the mean value of the inter-class distance and the variance of the inter-class distance of the feature categories respectively, and obtain the inter-class distance from the sum of the mean value of the inter-class distance and the variance of the inter-class distance;

Obtain the intra-class and inter-class loss function from the sum of the intra-class angular distance and the inter-class distance.

In order to achieve the above objective, the second technical solution adopted by the present invention is to provide a face recognition device based on deep learning, including:

The acquisition module is used to acquire training samples of face pictures and face pictures to be tested;

The extraction module is used to extract face training features from face image training samples, and to extract face features to be tested from face images to be tested;

The construction module is used to build a convolutional neural network model, and use the convolutional neural network model to train on the face training features of the face image training samples to obtain a face recognition model, wherein the training specifically includes performing first-stage training of the convolutional neural network model using the Arcface loss function until the convolutional neural network model reaches a convergent state, and performing second-stage training of the convolutional neural network model using the intra-class and inter-class loss function;

The recognition module is used to compare the face features to be tested of the face image to be tested according to the trained face recognition model, so as to recognize the face image to be tested.

In order to achieve the above objective, the third technical solution adopted by the present invention is to provide an electronic device including: a memory, a processor, and a computer program stored on the memory and capable of running on the processor. When the processor executes the computer program, it realizes the steps in the above method.

In order to achieve the above objective, the fourth technical solution adopted by the present invention is to provide a readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps in the above method are implemented.

Beneficial effect

The technical solution of the present invention first obtains face image training samples and a face image to be tested, then extracts face training features from the face image training samples and extracts face features to be tested from the face image to be tested, then constructs a convolutional neural network model and uses it to train on the face training features of the face image training samples to obtain a face recognition model, and finally compares the face features to be tested of the face image to be tested according to the trained face recognition model, so as to recognize the face image to be tested. By implementing the technical solution of the present invention, the spacing between classes can be made more uniform, more categories of training data can be trained, and large-scale face data training can be realized. In this way, both the efficiency and the effect of face recognition can be improved.

Description of the drawings

In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the following briefly introduces the drawings that need to be used in the description of the embodiments or the prior art. Obviously, the drawings in the following description are only some embodiments of the present invention. For those of ordinary skill in the art, other drawings can be obtained from the structures shown in these drawings without creative work.

FIG. 1 is a method flowchart of a face recognition method based on deep learning according to the first embodiment of the present invention;

FIG. 2 is a specific flowchart of using the convolutional neural network model to train on the face training features of the face image training samples according to the present invention;

FIG. 3 is a schematic diagram of the face training feature distribution when the Arcface loss function is used to guide the training of the convolutional neural network model;

FIG. 4 is a schematic diagram of the face training feature distribution when the intra-class and inter-class loss function is used to guide the training of the convolutional neural network model;

FIG. 5 is a schematic diagram of calculating the cosine distance between some categories and all other categories according to the present invention;

FIG. 6 is a block diagram of modules of a face recognition device based on deep learning according to a third embodiment of the present invention;

FIG. 7 is a block diagram of modules of an electronic device according to a fourth embodiment of the present invention.

The realization of the objectives, functional characteristics and advantages of the present invention will be further described in conjunction with the embodiments and with reference to the accompanying drawings.

Embodiments of the present invention

The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, rather than all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.

It should be noted that the descriptions related to "first", "second", etc. in the present invention are only for descriptive purposes, and cannot be understood as indicating or implying their relative importance or implicitly indicating the number of indicated technical features. Therefore, the features defined with "first" and "second" may explicitly or implicitly include at least one of the features. In addition, the technical solutions of the various embodiments can be combined with each other, but only on the basis that the combination can be realized by a person of ordinary skill in the art. When a combination of technical solutions is contradictory or cannot be realized, it should be considered that such a combination does not exist and is not within the protection scope of the present invention.

In the prior art, face recognition feature training takes a long time because the center points of the feature categories are not sufficiently separated. In contrast, the present invention provides a face recognition method based on deep learning that makes the spacing between classes more uniform, allows more categories of training data to be trained, and enables large-scale face data training. For the specific implementation of the face recognition method based on deep learning, please refer to the following embodiments.

Please refer to FIG. 1, which is a method flowchart of a face recognition method based on deep learning according to a first embodiment of the present invention. In the embodiment of the present invention, the face recognition method based on deep learning includes:

S101. Obtain a training sample of a face picture and a face picture to be tested.

Specifically, this embodiment is applied to face recognition scenarios such as office building deployment control, construction site monitoring, and access control check-in. Before face recognition, multiple face image training samples are acquired and used to form a face recognition database, and the face recognition database is used to recognize the face image to be tested. For both the face image training samples and the face image to be tested, a preset face region can be extracted from the captured image to facilitate the subsequent face comparison.

S102. Extracting face training features from the face image training samples, and extracting face features to be tested from the face image to be tested.

Specifically, in this embodiment, feature extraction is performed on the training sample of the face image and the face image to be tested, and the extracted features may be the face region, nose, eyes, eyebrows, mouth, and so on.

S103. Construct a convolutional neural network model, and use the convolutional neural network model to train on the face training features of the face image training samples to obtain a face recognition model, where the training specifically includes performing first-stage training of the convolutional neural network model using the Arcface loss function until the convolutional neural network model reaches a convergent state, and performing second-stage training of the convolutional neural network model using the intra-class and inter-class loss function.

Specifically, in this embodiment, the training first uses the Arcface loss function to perform first-stage training of the convolutional neural network model until the model reaches a convergent state. The Arcface loss function can effectively train a face recognition neural network model with good performance: in the training data, the intra-class distance is small and the inter-class distance has a sufficient interval, so better classification is achieved. The second-stage training of the convolutional neural network model then uses the intra-class and inter-class loss function, which is based on the RegularFace loss function with improvements; its ultimate goal is to make the categories evenly distributed. The convolutional neural network model includes a loss layer, a fully connected layer, a pooling layer, and several convolutional layers. The face training feature passes through the last fully connected layer (FC) of the convolutional neural network model to produce the FC output used for classification, and the weight parameter of the fully connected layer FC is denoted weight.

S104: Compare the face features to be tested in the face image to be tested according to the trained face recognition model, so as to recognize the face image to be tested.

Specifically, the face feature to be tested of the face picture to be tested is compared with the face training feature of the face recognition model, and the face picture to be tested is identified according to the comparison result.

Further, the comparison of the features of the face to be tested in the face image to be tested according to the trained face recognition model to recognize the face image to be tested specifically includes:

According to the trained face recognition model, compare the face features to be tested of the face image to be tested;

When the comparison is successful, obtain the face ID corresponding to the face training feature in the face recognition model; and

Use the face ID as the recognition result of the face image to be tested.

Specifically, this embodiment performs the comparison by traversal. When a comparison succeeds, the face ID corresponding to the face training feature in the face recognition model is obtained, and that face ID is the recognition result of the face image to be tested; when a comparison fails, the face feature to be tested is compared with the next face training feature in the face recognition model, until the matching face training feature is found. If no comparison succeeds, a recognition failure is returned for the face image to be tested.
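The traversal comparison described above can be sketched roughly as follows. This is a minimal illustration, not the implementation of this embodiment: the function name `identify`, the gallery layout (a list of face ID and feature pairs), and the threshold of 0.5 are all assumptions introduced here, and the "cosine distance" of the text is computed as the cosine of the angle between two feature vectors.

```python
import numpy as np

def identify(probe, gallery, threshold=0.5):
    """Traverse the gallery and return the first matching face ID, or None.

    probe     : 1-D feature vector of the face image to be tested
    gallery   : list of (face_id, feature) pairs (hypothetical layout)
    threshold : hypothetical minimum cosine value counted as a match
    """
    probe = probe / np.linalg.norm(probe)
    for face_id, feature in gallery:
        feature = feature / np.linalg.norm(feature)
        # Cosine of the angle between the two normalized features.
        if float(probe @ feature) >= threshold:
            return face_id          # comparison succeeded
    return None                     # no comparison succeeded

gallery = [("alice", np.array([1.0, 0.0])), ("bob", np.array([0.0, 1.0]))]
result = identify(np.array([0.1, 0.9]), gallery)
```

A return value of `None` corresponds to the recognition failure case; a real system would more likely search for the best match rather than stop at the first one above the threshold.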

Further, after obtaining the face ID corresponding to the face training feature in the face recognition model, the method further includes:

Check whether the face ID obtained from the successful comparison is unique;

When the face ID is unique, use the cosine distance between the face feature to be tested and the successfully compared face training feature of the face recognition model to identify whether the face feature to be tested and the compared face ID belong to the same person.

In this embodiment, when the face ID is unique, the comparison can be regarded as a 1:1 comparison. The cosine distance between the face feature to be tested and the successfully compared face training feature of the face recognition model is evaluated: when the cosine distance is within the set range, the face feature to be tested and the compared face ID are identified as the same person; when it is outside the set range, they are identified as different people.
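As an illustration of the 1:1 check, the sketch below treats the "set range" as a lower bound on the cosine value. The bound of 0.5 and the function names are assumptions made for this example only; the source does not state the range.

```python
import numpy as np

def cosine_value(a, b):
    """Cosine of the angle between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_same_person(probe, enrolled, min_cosine=0.5):
    """1:1 verification: True when the cosine value is within the set range.

    min_cosine is a hypothetical threshold; in practice it would be tuned
    on validation data.
    """
    return cosine_value(probe, enrolled) >= min_cosine

same = is_same_person(np.array([1.0, 2.0]), np.array([2.0, 4.0]))
different = is_same_person(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```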

Please refer to FIG. 2 to FIG. 5. FIG. 2 is a specific flowchart of using the convolutional neural network model to train on the face training features of the face image training samples according to the present invention; FIG. 3 is a schematic diagram of the face training feature distribution when the Arcface loss function is used to guide the training of the convolutional neural network model; FIG. 4 is a schematic diagram of the face training feature distribution when the intra-class and inter-class loss function is used to guide the training of the convolutional neural network model; FIG. 5 is a schematic diagram of calculating the cosine distance between some categories and all other categories according to the present invention.

Further, the first-stage training of the convolutional neural network model by using the Arcface loss function to obtain the state of convergence of the convolutional neural network model includes:

S131. Normalize the face training features and the weight parameters of the fully connected layer in the convolutional neural network model, respectively, and calculate the Arcface loss function in the loss layer of the convolutional neural network model;

S132: Use the Arcface loss function to guide the convolutional neural network model for training, and obtain a convergent state of the convolutional neural network model.

Specifically, in this embodiment, the Arcface loss function is formed by calculating the angle information between the face training feature and the weight parameter of the fully connected layer, which can increase the angle interval of different categories.

Further, said normalizing the face training features and the weight parameters of the fully connected layer in the convolutional neural network model respectively, and calculating the Arcface loss function in the loss layer of the convolutional neural network model, specifically includes:

Specifically, the FC output of the fully connected layer can be regarded as the inner product of the feature and the weight parameter weight. When the face training feature $x_i$ and the weight parameter weight of the fully connected layer in the convolutional neural network model are both normalized, the cosine distance between the face training feature and each fully connected layer weight parameter $W_j$ is obtained. The specific formula is as follows:

$$W_j^T x_i = \lVert W_j \rVert \, \lVert x_i \rVert \cos\theta_j = \cos\theta_j \qquad (\lVert W_j \rVert = \lVert x_i \rVert = 1)$$

where $W_j^T x_i$ represents the FC output of the fully connected layer, $x_i$ represents the i-th face training feature, and $\theta_j$ represents the angle between $x_i$ and $W_j$.
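The normalization step can be illustrated with a short NumPy sketch. The array shapes and the function names `cosine_logits` and `angles` are assumptions for this example: features are stored one per row, and each row of the weight matrix is the weight vector of one category.

```python
import numpy as np

def cosine_logits(features, weight):
    """Return cos(theta_j) for every feature/category pair.

    features : (N, D) array, one face training feature per row
    weight   : (C, D) array, one FC weight vector W_j per row
    """
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weight / np.linalg.norm(weight, axis=1, keepdims=True)
    # With unit-length rows, the inner product is exactly the cosine.
    return np.clip(x @ w.T, -1.0, 1.0)

def angles(features, weight):
    """Recover theta_j with the inverse trigonometric function (arccos)."""
    return np.arccos(cosine_logits(features, weight))

features = np.array([[3.0, 0.0], [1.0, 1.0]])
weight = np.array([[1.0, 0.0], [0.0, 2.0]])
cos_theta = cosine_logits(features, weight)
```

The clipping guards against values that drift slightly outside [-1, 1] through floating-point rounding, which would otherwise make `arccos` return NaN.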

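The angle θ can be obtained by applying the inverse trigonometric (arccosine) function to the cosine distance. In each iteration, the label information of the training data is used to extract the angle at the position of the category to which the face training image belongs, an interval value m is added to that angle, and the modified angle and the cosine distances are put back into the softmax classification loss function to form the final Arcface loss function L. In its standard form, with m usually taken as 0.5:

$$L = -\frac{1}{N}\sum_{i=1}^{N} \log \frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)} + \sum_{j=1,\, j\neq y_i}^{C} e^{s\cos\theta_j}}$$

where N is the number of training samples in the batch, C is the number of categories, $y_i$ is the category label of the i-th sample, s represents the modulus of $x_i$, and m represents the angular interval value.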

The Arcface loss function L trains on the angle information between the face training features and adds the angular interval value m, which achieves better classification and increases the interval between different categories. FIG. 3 shows the features (before normalization) after training guided by the Arcface loss function, with their representation in the multi-dimensional space reduced to a two-dimensional space: the features of the same ID are basically gathered in the same angular range, and there is a certain interval between different IDs.
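A forward-only NumPy sketch of an Arcface-style loss is given below for illustration. It follows the commonly published form of the loss (margin m added to the target angle, logits scaled by s, then softmax cross-entropy); the scale s = 64 is an assumption borrowed from common practice, not a value stated here, and the sketch does not handle the case where θ + m exceeds π.

```python
import numpy as np

def arcface_loss(features, weight, labels, s=64.0, m=0.5):
    """Mean Arcface-style loss over a batch (forward pass only).

    features : (N, D) face training features
    weight   : (C, D) FC weight vectors, one per category
    labels   : (N,) integer category labels
    s, m     : feature scale (assumed value) and angular interval value
    """
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weight / np.linalg.norm(weight, axis=1, keepdims=True)
    cos = np.clip(x @ w.T, -1.0, 1.0)
    theta = np.arccos(cos)
    rows = np.arange(len(labels))
    logits = cos.copy()
    # Add the interval value m to the angle of the labelled category only.
    logits[rows, labels] = np.cos(theta[rows, labels] + m)
    logits *= s
    # Numerically stable log-softmax.
    logits -= logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[rows, labels].mean())

feats = np.array([[1.0, 0.0], [0.0, 1.0]])
centers = np.array([[1.0, 0.0], [0.0, 1.0]])
loss = arcface_loss(feats, centers, np.array([0, 1]), s=4.0, m=0.5)
```

With m = 0 the function reduces to ordinary softmax cross-entropy on the scaled cosines, so the margin strictly raises the loss for the same inputs and pushes the network toward larger angular separation.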

The above classification loss function softmax loss can perform iterative optimization on training. In this way, the complexity of training can be reduced and the processing efficiency can be improved.

Further, the second-stage training of the convolutional neural network model using the intra-class and inter-class loss function includes:

S133: Normalize the input parameters of the loss layer in the convolutional neural network model;

S134: Calculate the intra-class and inter-class loss function according to the input parameters of the loss layer and the weight parameters of the fully connected layer;

S135: Use the intra-class and inter-class loss function to guide the training of the convolutional neural network model to obtain a face recognition model.

In this embodiment, S133 can also be executed in advance. After the first-stage training, a face recognition model with good performance has been obtained and good classification is achieved. However, the face recognition model can still be improved, because the Arcface loss function only ensures that there are sufficient intervals between categories and cannot make the intervals evenly distributed over the entire feature space. To obtain uniformly distributed categories, this embodiment performs the second stage of training after the above steps. The training in this stage is based on the RegularFace loss function with improvements, and its final goal is to make the categories evenly distributed. Please refer to FIG. 4 for details.

The normalized face training features and the weight parameter weight are used as the inputs of the loss function at this stage. The first task is to find the information that characterizes the center of each category. As with the Arcface loss function, the inner product of the normalized weight parameter W and the face training feature x represents the cosine distance between the two: the closer the cosine value is to 1, the greater the probability that the face training feature x belongs to the category at the position of the weight parameter W. At the same time, the cosine distance between a weight parameter W vector and itself is equal to 1, so the weight parameter W can be regarded as the center of each category. Since training with the Arcface loss function has already been completed, the positions represented by the weight parameter W are sufficiently credible. This step trains these positions toward a more uniform distribution without increasing the intra-class distance.

Further, calculating the intra-class and inter-class loss function according to the input parameters of the loss layer and the weight parameters of the fully connected layer specifically includes:

Perform a negative logarithmic transformation on the angle of the modified feature category to obtain the intra-class angular distance;

According to the input parameters of the loss layer and the weight parameters of the fully connected layer, calculate the mean value of the inter-class distance and the variance of the inter-class distance of the feature categories respectively, and obtain the inter-class distance from the sum of the mean value of the inter-class distance and the variance of the inter-class distance;

Obtain the intra-class and inter-class loss function from the sum of the intra-class angular distance and the inter-class distance.

Specifically, the intra-class and inter-class loss function consists of two parts. The first part represents only the intra-class distance and constrains only the intra-class distance during training; it is the first term on the right side of the equal sign in the following equation, namely $L_s(\theta, W)$. The second part acts only on the inter-class distance; it is the second term on the right side of the equation, namely $L_r(W)$. The specific formula is as follows:

$$L(\theta, W) = L_s(\theta, W) + L_r(W)$$

Here, $L_s$ represents the intra-class distance and nothing else. In this embodiment, the intra-class distance is measured using angle information. The normalized inner product described above yields the cosine values; according to the label information in each iteration, the cosine value between each face training feature and the weight parameter W of its own category is taken out, and a negative logarithmic transformation is applied to obtain the final $L_s$ value.

Applying a coefficient k greater than 1 to the angular distance θ constrains the intra-class angular distance more strongly and ensures that the larger the angular distance, the more pronounced the effect. The purpose of taking the negative logarithm of the cosine distance is to make the cosine value converge to 1, that is, to make the intra-class angle converge to 0.
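One possible reading of the intra-class term is sketched below. It is an assumption-laden illustration rather than the formula of this embodiment: the coefficient k is taken to multiply the angle, the value k = 1.2 is arbitrary, and the scaled angle is capped just below π/2 (a choice made here, not in the source) so that the logarithm always receives a positive argument.

```python
import numpy as np

def intra_class_loss(features, weight, labels, k=1.2, eps=1e-6):
    """Mean of -log(cos(k * theta)) between each feature and its own center.

    features : (N, D) face training features
    weight   : (C, D) FC weight vectors, regarded as category centers
    labels   : (N,) integer category labels
    k        : coefficient greater than 1 applied to the angle (assumed form)
    """
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weight / np.linalg.norm(weight, axis=1, keepdims=True)
    # Cosine between each feature and the center of its labelled category.
    cos_target = np.clip(np.sum(x * w[labels], axis=1), -1.0, 1.0)
    theta = np.arccos(cos_target)
    # Assumption: cap k * theta below pi/2 to keep the cosine positive.
    scaled = np.minimum(k * theta, np.pi / 2 - eps)
    return float(np.mean(-np.log(np.cos(scaled))))

centers = np.array([[1.0, 0.0], [0.0, 1.0]])
loss = intra_class_loss(np.array([[1.0, 1.0]]), centers, [0])
```

The loss is zero when every feature lies exactly on its category center and grows as the angle widens, faster for larger k, which matches the qualitative behaviour described above.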

For the inter-class term, each class takes the cosine distance to its closest other class as its inter-class distance. All inter-class information can therefore be selected from the set of inter-class values corresponding to each category. $L_r(W)$ is the weighted sum of two terms: the first term is the mean value of the inter-class distance over all C categories, and the second term is the variance of the inter-class distance, where $\lambda_1$ and $\lambda_2$ are the weight coefficients of the mean value and the variance, respectively. As $L_r$ decreases, the inter-class cosine values become smaller and the differences among them also shrink, so that the classes become uniformly spaced.
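The mean-plus-variance structure of the inter-class term can be sketched as follows, under stated assumptions: the inter-class value of a class is taken to be the largest cosine between its weight vector and any other class's weight vector (that is, its nearest neighbour), and the default weights of 1.0 for `lam1` and `lam2` are placeholders.

```python
import numpy as np

def inter_class_loss(weight, lam1=1.0, lam2=1.0):
    """lam1 * mean + lam2 * variance of each class's nearest-class cosine.

    weight     : (C, D) FC weight vectors, regarded as category centers
    lam1, lam2 : weight coefficients (placeholder values)
    """
    w = weight / np.linalg.norm(weight, axis=1, keepdims=True)
    cos = w @ w.T                      # (C, C) pairwise cosines
    np.fill_diagonal(cos, -np.inf)     # ignore each class's match with itself
    nearest = cos.max(axis=1)          # cosine to the closest other class
    return float(lam1 * nearest.mean() + lam2 * nearest.var())

def unit_vectors(degrees):
    rad = np.deg2rad(degrees)
    return np.stack([np.cos(rad), np.sin(rad)], axis=1)

even = inter_class_loss(unit_vectors([0, 120, 240]))
uneven = inter_class_loss(unit_vectors([0, 10, 180]))
```

Three centers spread evenly at 120 degrees give a lower value than three centers with two of them crowded together, since both the mean and the variance of the nearest-class cosine are smaller in the even case.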

$L_s$ and $L_r$ above represent the intra-class distance and the inter-class distance respectively, and their functions do not overlap, so they do not interfere with each other during training.

With the intra-class and inter-class loss function constructed through the above steps, a model with a more uniform class distribution can be trained and a better face recognition effect can be obtained. The intra-class and inter-class loss function is implemented in a computer program. When calculating the inter-class distance corresponding to each class, the weight parameter weight is multiplied by its own transpose to obtain the cosine distance between each class and every other class. Because the number of categories is large when training on a large data set, this matrix multiplication requires very large computing resources, so it can be performed in batches to reduce the amount of computation processed at one time. Please refer to FIG. 5, where the dashed box represents the calculation of the cosine distance between some categories and all other categories.
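The batched computation can be sketched as below. Instead of forming the full C × C cosine matrix at once, a block of rows (some categories) is multiplied against all categories at a time, as in the dashed box of FIG. 5. The function name and the default batch size are assumptions for this example.

```python
import numpy as np

def nearest_class_cosine(weight, batch_size=1024):
    """Cosine to the closest other class for every class, computed in batches.

    weight     : (C, D) FC weight vectors, one per category
    batch_size : number of categories processed per block (assumed value)
    """
    w = weight / np.linalg.norm(weight, axis=1, keepdims=True)
    num_classes = w.shape[0]
    nearest = np.empty(num_classes)
    for start in range(0, num_classes, batch_size):
        stop = min(start + batch_size, num_classes)
        # Only a (stop - start, C) block is held in memory at one time.
        block = w[start:stop] @ w.T
        # Exclude each class's cosine with itself.
        block[np.arange(stop - start), np.arange(start, stop)] = -np.inf
        nearest[start:stop] = block.max(axis=1)
    return nearest

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 4))
nearest = nearest_class_cosine(W, batch_size=3)
```

Peak memory for the cosine block drops from C × C to batch_size × C values, while the result is the same as the unbatched product up to floating-point rounding.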

In summary, the embodiments of the present invention have at least the following advantages:

1. Training in stages, first use the Arcface method to train the neural network model, so that the model already has a good classification function, and the position of its category is more credible. Subsequently, the improved regularface training is carried out, using the parameters trained in the previous stage to uniformize the distance between classes, and at the same time keep the distance within the class sufficiently small.

2. Training with the intra-class and inter-class distance loss function. The function is divided into a part that characterizes only the intra-class distance and a part that characterizes only the inter-class distance, and the sum of the two guides the second training stage. The two parts do not overlap in function, so they do not interfere with each other, which makes the distribution of inter-class distances more uniform.

3. The mean and the variance of the inter-class distances are calculated from the parameters of the final fully connected layer and used as part of the loss function, which expresses the purpose of training more directly.

4. The minimum inter-class distance corresponding to each category is calculated in batches, which reduces the amount of computation performed at one time, ensures that the computation does not exceed the computer's resources, and enables training on large-scale data.
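As an illustration of the first training stage named in item 1, the following NumPy sketch computes an Arcface-style loss for one batch: features and class weights are normalized, the cosine values are converted to angles, a margin is added to the angle of the true class, and a negative log softmax is taken. The function name, the (C, D) weight layout, and the default margin of 0.5 and scale of 64 are assumptions and are not specified in this document.

```python
import numpy as np

def arcface_loss(features, weight, labels, margin=0.5, scale=64.0):
    """Mean Arcface-style loss for a batch (forward pass only).

    features: (N, D) array, weight: (C, D) array, labels: (N,) integer array.
    margin and scale are placeholder values. Simplification: angles that
    exceed pi after the margin is added are not treated specially.
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weight / np.linalg.norm(weight, axis=1, keepdims=True)
    cos = np.clip(f @ w.T, -1.0, 1.0)              # (N, C) cosine values
    theta = np.arccos(cos)                         # angles to every class
    rows = np.arange(len(labels))
    theta[rows, labels] += margin                  # margin on the true class only
    logits = scale * np.cos(theta)
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[rows, labels].mean()

rng = np.random.default_rng(0)
W = rng.normal(size=(5, 8))
labels = np.array([0, 1, 2, 3, 4])
X = W[labels] + 0.01 * rng.normal(size=(5, 8))
loss_without_margin = arcface_loss(X, W, labels, margin=0.0)
loss_with_margin = arcface_loss(X, W, labels, margin=0.5)
print(loss_without_margin, loss_with_margin)
```

The margin makes the loss larger for the same features, so the network has to pull each feature closer to its class weight to reach the same loss value.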

Please refer to FIG. 6, which is a block diagram of modules of a face recognition apparatus based on deep learning according to a third embodiment of the present invention. In an embodiment of the present invention, the face recognition device based on deep learning includes:

The acquisition module 101 is used to acquire face image training samples and face images to be tested;

The extraction module 102 is used for extracting face training features from the face image training samples, and extracting face features to be tested from the face image to be tested;

The construction module 103 is used to construct a convolutional neural network model, and to use the convolutional neural network model to train the face training features of the face image training samples to obtain a face recognition model, wherein the training specifically includes using the Arcface loss function to perform first-stage training of the convolutional neural network model to obtain a converged state of the convolutional neural network model, and using the intra-class and inter-class loss function to perform second-stage training of the convolutional neural network model;

The recognition module 104 is configured to compare the face features to be tested of the face image to be tested according to the trained face recognition model, so as to recognize the face image to be tested.

In this embodiment, the acquisition module 101 acquires the face image training samples and the face image to be tested; the extraction module 102 extracts the face training features from the face image training samples and the face features to be tested from the face image to be tested; the construction module 103 constructs a convolutional neural network model and uses it to train the face training features of the face image training samples to obtain a face recognition model; and the recognition module 104 compares the face features to be tested of the face image to be tested according to the trained face recognition model, so as to recognize the face image to be tested. By implementing the technical solution of the present invention, the spacing between classes can be made more uniform, more types of training data can be trained, and large-scale face data training can be realized. In this way, both the efficiency and the effect of face recognition can be improved.
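The comparison carried out by the recognition module 104 can be sketched as a nearest-match search by cosine value. This is a minimal illustration under stated assumptions: the stored features are held in a matrix with one row per known face, the acceptance threshold of 0.5 is an arbitrary placeholder, and the names identify, gallery and face_ids are hypothetical.

```python
import numpy as np

def identify(query, gallery, face_ids, threshold=0.5):
    """Return (face_id or None, best cosine value) for one query feature.

    query: (D,) feature to be tested; gallery: (M, D) stored features;
    face_ids: list of M identifiers. threshold is a placeholder value.
    """
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ q                       # cosine value against every stored feature
    best = int(np.argmax(sims))
    if sims[best] >= threshold:        # comparison succeeded
        return face_ids[best], float(sims[best])
    return None, float(sims[best])     # no stored face is close enough

gallery = np.array([[1.0, 0.0], [0.0, 1.0]])
ids = ["id_a", "id_b"]
print(identify(np.array([0.9, 0.1]), gallery, ids))
print(identify(np.array([-1.0, -1.0]), gallery, ids))
```

In practice the threshold would be chosen on validation data; the value used here only makes the example runnable.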

Wherein, the recognition module 104 is specifically used for:

Use the face ID as the recognition result of the face image to be tested.

Wherein, the recognition module 104 is also used for:

Wherein, the construction module 103 is used for:

Wherein, the construction module 103 is also used for:

According to the input parameters of the loss layer and the weight parameters of the fully connected layer, the mean value of the inter-class distance of the feature category and the variance of the inter-class distance are calculated respectively;

According to the sum of the mean value of the inter-class distance and the variance of the inter-class distance, the intra-class inter-class loss function is obtained.

Please refer to FIG. 7, which is a module block diagram of an electronic device according to a fourth embodiment of the present invention. The electronic device can be used to implement the face recognition method based on deep learning in the foregoing embodiment. As shown in FIG. 7, the electronic device mainly includes: a memory 301, a processor 302, a bus 303, and a computer program stored on the memory 301 and executable on the processor 302. The memory 301 and the processor 302 are connected by the bus 303. When the processor 302 executes the computer program, it implements the face recognition method based on deep learning in the foregoing embodiment. There may be one or more processors.

The memory 301 may be a high-speed random access memory (RAM, Random Access Memory), or a non-volatile memory, such as a magnetic disk memory. The memory 301 is used to store executable program code, and the processor 302 is coupled with the memory 301.

Further, an embodiment of the present application also provides a readable storage medium, which may be provided in the electronic device of any of the above embodiments; the readable storage medium may be the memory in the embodiment shown in FIG. 7.

A computer program is stored on the readable storage medium, and when the program is executed by a processor, the deep learning-based face recognition method in the foregoing embodiment is implemented. Further, the readable storage medium may also be a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a RAM, a magnetic disk, an optical disk, or any other medium that can store program code.

In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are only illustrative. The division of modules is only a division by logical function, and there may be other divisions in actual implementation; for example, multiple modules or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the displayed or discussed mutual coupling, direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or modules, and may be in electrical, mechanical or other forms.

The modules described as separate components may or may not be physically separate, and the components displayed as modules may or may not be physical modules, that is, they may be located in one place, or they may be distributed on multiple network modules. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

In addition, the functional modules in the various embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The above-mentioned integrated modules can be implemented in the form of hardware or software functional modules.

If the integrated module is implemented in the form of a software function module and sold or used as an independent product, it can be stored in a computer readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a readable storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods in the various embodiments of the present application. The aforementioned readable storage medium includes: a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, an optical disk, and other media that can store program code.

It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of action combinations, but those skilled in the art should know that this application is not limited by the described sequence of actions, because according to this application some steps can be performed in another order or at the same time. In addition, those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily all required by this application.

In the above-mentioned embodiments, the description of each embodiment has its own focus. For a part that is not described in detail in an embodiment, reference may be made to related descriptions of other embodiments.

The above descriptions are only the preferred embodiments of the present invention and do not limit the scope of the present invention. Under the conception of the technical solution of the present invention, equivalent structural transformations made by using the content of the description and drawings of the present invention, or direct or indirect applications in other related technical fields, are included in the scope of patent protection of the present invention.

Claims

A face recognition method based on deep learning, characterized in that the face recognition method based on deep learning includes:

Obtaining face image training samples and a face image to be tested;

Extracting face training features from the face image training samples, and extracting face features to be tested from the face image to be tested;

Constructing a convolutional neural network model, and using the convolutional neural network model to train the face training features of the face image training samples to obtain a face recognition model, wherein the training specifically includes using the Arcface loss function to perform first-stage training of the convolutional neural network model to obtain a converged state of the convolutional neural network model, and using the intra-class and inter-class loss function to perform second-stage training of the convolutional neural network model;

Comparing the face features to be tested of the face image to be tested according to the trained face recognition model, so as to recognize the face image to be tested.
The face recognition method based on deep learning according to claim 1, wherein comparing the face features to be tested of the face image to be tested according to the trained face recognition model, so as to recognize the face image to be tested, includes:

Comparing the face features to be tested of the face image to be tested according to the trained face recognition model;

When the comparison is successful, obtaining the face ID corresponding to the face training feature in the face recognition model; and

Using the face ID as the recognition result of the face image to be tested.
The face recognition method based on deep learning according to claim 2, wherein after obtaining the face ID corresponding to the face training feature in the face recognition model, the method further comprises:

Checking whether only one face ID is obtained from the face recognition model after the comparison;

When there is only one face ID, using the cosine distance between the face feature to be tested and the successfully compared face training feature of the face recognition model to determine whether the face feature to be tested and the compared face ID belong to the same person.
The face recognition method based on deep learning according to claim 1, wherein using the Arcface loss function to perform first-stage training of the convolutional neural network model to obtain a converged state of the convolutional neural network model comprises:

Normalizing the face training features and the weight parameters of the fully connected layer in the convolutional neural network model respectively, and calculating the Arcface loss function in the loss layer of the convolutional neural network model;

Using the Arcface loss function to guide the training of the convolutional neural network model, and obtaining the converged state of the convolutional neural network model.
The face recognition method based on deep learning according to claim 4, wherein using the intra-class and inter-class loss function to perform second-stage training of the convolutional neural network model comprises:

Normalizing the input parameters of the loss layer in the convolutional neural network model;

Calculating the intra-class and inter-class loss function according to the input parameters of the loss layer and the weight parameters of the fully connected layer;

Using the intra-class and inter-class loss function to guide the training of the convolutional neural network model, and obtaining the face recognition model.
The face recognition method based on deep learning according to claim 5, wherein normalizing the face training features and the weight parameters of the fully connected layer in the convolutional neural network model respectively, and calculating the Arcface loss function in the loss layer of the convolutional neural network model, includes:

Normalizing the face training features and the weight parameters of the fully connected layer in the convolutional neural network model to obtain the cosine distance between the face training features and the corresponding weight parameters;

Applying the inverse trigonometric function (arccosine) to the cosine distance to obtain the angle of the feature category;

Increasing the angle of the feature category by the interval value to obtain the angle of the modified feature category;

Forming the Arcface loss function according to the angle of the feature category and the angle of the modified feature category.
The face recognition method based on deep learning according to claim 6, characterized in that calculating the intra-class and inter-class loss function according to the input parameters of the loss layer and the weight parameters of the fully connected layer specifically includes:

Performing a negative logarithmic transformation on the angle of the modified feature category to obtain the intra-class angular distance;

Calculating the mean value of the inter-class distance and the variance of the inter-class distance of the feature categories according to the input parameters of the loss layer and the weight parameters of the fully connected layer, and obtaining the inter-class distance according to the sum of the mean value of the inter-class distance and the variance of the inter-class distance;

Obtaining the intra-class and inter-class loss function according to the sum of the intra-class angular distance and the inter-class distance.
A face recognition device based on deep learning, characterized in that the face recognition device based on deep learning comprises:

The acquisition module is used to acquire face image training samples and a face image to be tested;

The extraction module is used to extract face training features from the face image training samples, and to extract face features to be tested from the face image to be tested;

The construction module is used to construct a convolutional neural network model, and to use the convolutional neural network model to train the face training features of the face image training samples to obtain a face recognition model, wherein the training specifically includes using the Arcface loss function to perform first-stage training of the convolutional neural network model to obtain a converged state of the convolutional neural network model, and using the intra-class and inter-class loss function to perform second-stage training of the convolutional neural network model;

The recognition module is used to compare the face features to be tested of the face image to be tested according to the trained face recognition model, so as to recognize the face image to be tested.
An electronic device comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.
A readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.