CN116895093B - Face recognition method, device, equipment and computer readable storage medium


Info

Publication number: CN116895093B
Authority: CN (China)
Prior art keywords: face, image, detected, detection, data
Legal status: Active (granted)
Application number: CN202311155058.9A
Other languages: Chinese (zh)
Other versions: CN116895093A
Inventors: 葛沅 (Ge Yuan), 赵雅倩 (Zhao Yaqian), 史宏志 (Shi Hongzhi), 温东超 (Wen Dongchao), 张英杰 (Zhang Yingjie)
Current Assignee: Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee: Suzhou Inspur Intelligent Technology Co Ltd
Application filed by: Suzhou Inspur Intelligent Technology Co Ltd
Priority application: CN202311155058.9A
Published as: CN116895093A (application), CN116895093B (grant)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168: Feature extraction; Face representation
    • G06V 40/172: Classification, e.g. identification
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/74: Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/761: Proximity, similarity or dissimilarity measures
    • G06V 10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level, of extracted features
    • G06V 10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Abstract

The invention relates to the field of face recognition, and particularly discloses a face recognition method, a device, equipment and a computer readable storage medium.

Description

Face recognition method, device, equipment and computer readable storage medium
Technical Field
The present invention relates to the field of face recognition, and in particular to a face recognition method, device, equipment, and computer-readable storage medium.
Background
Face recognition technology is a branch of computer vision. It is built on techniques such as object classification, object localization, semantic segmentation, and instance segmentation, and is essentially an image processing technology that extracts facial features from face pictures through a deep neural network and retrieves similar faces from a face database. It can be applied in scenarios such as unlocking a mobile phone screen with a face, face payment, and fraud detection.
Face recognition comparison matches a captured face picture against the face data in a face database. Comparing the face picture with a single piece of face data already requires a great deal of computation, and determining whether face data matching the face picture exists in a massive face database requires far more, which leaves the face recognition efficiency of existing face recognition technology low.
How to improve face recognition efficiency is thus a technical problem that needs to be solved by those skilled in the art.
Disclosure of Invention
The invention aims to provide a face recognition method, device, equipment, and computer-readable storage medium for improving face recognition efficiency.
In order to solve the technical problems, the invention provides a face recognition method, which comprises the following steps:
Training to obtain a face detection model and a face feature extraction model;
extracting face detection features of each face image in a face image dataset by using the face detection model, extracting face feature vectors of each face image by using the face feature extraction model, and carrying out association storage on the face detection features of the face image, the face feature vectors of the face image and identity information of the face image to obtain a face database;
detecting by using the face detection model to obtain face detection characteristics of an image to be detected, and extracting face characteristic vectors of the image to be detected by using the face characteristic extraction model;
screening out face data to be compared matched with the face detection features of the images to be detected from the face database, and carrying out similarity calculation on the face feature vectors of the images to be detected and the face feature vectors in the face data to be compared to obtain face recognition results of the images to be detected;
the face recognition result includes: the identity information of the image to be detected, or a notification that the identity information of the image to be detected was not found.
In some implementations, the face detection features include: at least one of an age detection result, a gender detection result, an accessory detection result, and a hairstyle detection result.
In some implementations, extracting face detection features using the face detection model includes:
performing enhanced feature extraction on an input image to obtain image feature parameters of the input image;
inputting the image characteristic parameters of the input image into a face detection prediction function to obtain the face detection characteristics of the input image;
the input image is the face image or the image to be detected.
In some implementations, performing enhanced feature extraction on the input image to obtain the image feature parameters of the input image includes:
performing N-stage extraction on the input image through N depthwise separable convolutions, and outputting N feature maps with different compression ratios;
adjusting the feature maps to the same number of channels, and then sequentially performing feature fusion on the channel-adjusted feature maps in order of compression ratio from large to small to obtain N corresponding feature fusion results;
performing enhanced feature extraction on each feature fusion result to obtain a feature layer corresponding to each feature fusion result;
outputting the feature layers corresponding to the input image as the image feature parameters of the input image;
wherein N is a positive integer greater than 1.
In some implementations, performing enhanced feature extraction on each feature fusion result to obtain the feature layer corresponding to each feature fusion result includes:
performing feature extraction on each feature fusion result with M convolution kernels of different sizes, and outputting a feature set corresponding to each feature fusion result;
concatenating and stacking the feature sets corresponding to each feature fusion result to obtain the feature layer corresponding to that feature fusion result;
wherein M is a positive integer greater than 1.
In some implementations, the inputting the image feature parameters of the input image into a face detection prediction function to obtain the face detection feature of the input image includes:
inputting the image feature parameters of the input image into the face detection prediction function, and outputting a face detection result predicted value of the input image;
and performing prediction result correction and non-maximum suppression filtering on the face detection result predicted value of the input image to obtain the face detection features of the input image.
In some implementations, the face detection features are of a plurality of categories;
training to obtain the face detection model, including:
superposing the prediction loss functions corresponding to each category of face detection feature to obtain an overall loss function;
and performing gradient descent training on the face detection model to be trained using the overall loss function to obtain a face detection model that detects every category of face detection feature.
In some implementations, the inputting the image feature parameters of the input image into a face detection prediction function to obtain the face detection feature of the input image includes:
inputting the image characteristic parameters of the input image into a face attribute prediction function to obtain the face attribute characteristics of the input image;
the step of screening the face data to be compared matched with the face detection characteristics of the image to be detected from the face database comprises the following steps:
and screening the face data to be compared matched with the face attribute characteristics of the image to be detected from the face database.
In some implementations, the inputting the image feature parameters of the input image into a face detection prediction function to obtain the face detection feature of the input image further includes:
and inputting the image feature parameters of the input image into a face coordinate prediction function to obtain a face coordinate detection result of the input image.
In some implementations, extracting face feature vectors of the input image using the face feature extraction model includes:
and extracting a standard face image from the input image based on the face coordinate detection result of the input image, and extracting a face feature vector of the input image from the standard face image corresponding to the input image by using the face feature extraction model.
In some implementations, the face attribute features include at least one of an accessory detection result and a hairstyle detection result;
the step of inputting the image characteristic parameters of the input image into a face attribute prediction function to obtain the face attribute characteristics of the input image comprises the following steps:
and obtaining the face attribute characteristics of the input image according to the face key point coordinate detection result in the face detection characteristics of the input image and the pixel value change information of the input image.
In some implementations, when the face attribute feature is the glasses-wearing detection result among the accessory detection results, obtaining the face attribute feature of the input image according to the face key point coordinate detection result in the face detection features of the input image and the pixel value variation information of the input image includes the following steps (a code sketch follows):
obtaining the facial feature coordinates of the input image according to the face key point coordinate detection result of the input image;
locating the eyes of the input image according to its facial feature coordinates to obtain the eye coordinates of the input image, and generating a glasses frame surrounding the eye positions according to the eye coordinates;
calculating a first difference value between the pixel values inside the glasses frame of the input image and the pixel values outside the glasses frame; if the first difference value exceeds an eye difference threshold, determining that the glasses-wearing information of the input image is that glasses are worn; if the first difference value does not exceed the eye difference threshold, determining that the glasses-wearing information of the input image is that glasses are not worn.
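A minimal sketch of this eye-region check, assuming a grayscale numpy image and eye-center key points already produced by the face key point detection; the margin and threshold values are illustrative, not taken from the patent:

```python
import numpy as np

def wears_glasses(gray, left_eye, right_eye, margin=15, threshold=12.0):
    """Heuristic glasses check: compare mean intensity inside a box around
    both eyes (the 'glasses frame') with a ring just outside it.
    gray: 2-D grayscale image; left_eye/right_eye: (x, y) key points."""
    h, w = gray.shape
    (lx, ly), (rx, ry) = left_eye, right_eye
    x0, x1 = max(int(min(lx, rx)) - margin, 0), min(int(max(lx, rx)) + margin, w)
    y0, y1 = max(int(min(ly, ry)) - margin, 0), min(int(max(ly, ry)) + margin, h)
    inner = gray[y0:y1, x0:x1].astype(np.float64)
    ox0, oy0 = max(x0 - margin, 0), max(y0 - margin, 0)
    ox1, oy1 = min(x1 + margin, w), min(y1 + margin, h)
    outer = gray[oy0:oy1, ox0:ox1].astype(np.float64)
    # mean of the ring = (outer sum - inner sum) / (outer area - inner area)
    ring_mean = (outer.sum() - inner.sum()) / max(outer.size - inner.size, 1)
    first_difference = abs(inner.mean() - ring_mean)
    return first_difference > threshold  # True: glasses are worn
```

The forehead/bangs check described next follows the same pattern, with a forehead frame and a forehead difference threshold in place of the glasses frame and eye difference threshold.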
In some implementations, when the face attribute feature is the hairstyle detection result, obtaining the face attribute feature of the input image according to the face key point coordinate detection result in the face detection features of the input image and the pixel value variation information of the input image includes:
obtaining the facial feature coordinates of the input image according to the face key point coordinate detection result of the input image;
locating the forehead of the input image according to its facial feature coordinates to obtain the forehead coordinates of the input image, and generating a forehead frame surrounding the forehead position according to the forehead coordinates;
calculating a second difference value between the pixel values inside the forehead frame of the input image and the pixel values outside the forehead frame; if the second difference value exceeds a forehead difference threshold, determining that the hairstyle detection result of the input image is that there are bangs; if the second difference value does not exceed the forehead difference threshold, determining that the hairstyle detection result is that there are no bangs.
In some implementations, the face attribute features include an accessory detection result and a hairstyle detection result;
screening the face data to be compared matched with the face detection features of the image to be detected from the face database, and performing similarity calculation on the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected, includes:
screening, from the face database, face data consistent with both the accessory detection result and the hairstyle detection result of the image to be detected as the face data to be compared, and performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared;
if target face data whose similarity score reaches a similarity threshold is obtained, determining the identity information in the target face data as the face recognition result of the image to be detected; if no target face data is obtained, re-screening, from the face database, face data consistent with either the accessory detection result or the hairstyle detection result of the image to be detected as the face data to be compared, and performing the similarity calculation again;
if target face data is obtained, determining the identity information in the target face data as the face recognition result of the image to be detected; if not, taking the remaining face data in the face database as the face data to be compared and performing the similarity calculation;
if target face data is obtained, determining the identity information in the target face data as the face recognition result of the image to be detected; otherwise, outputting the face recognition result of the image to be detected as face identity not detected.
In some implementations, the face attribute features include an age detection result and a gender detection result;
screening the face data to be compared matched with the face detection features of the image to be detected from the face database, and performing similarity calculation on the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected, includes:
screening, from the face database, face data consistent with both the age detection result and the gender detection result of the image to be detected as the face data to be compared, and performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared;
if target face data whose similarity score reaches a similarity threshold is obtained, determining the identity information in the target face data as the face recognition result of the image to be detected; if no target face data is obtained, re-screening, from the face database, face data consistent with either the age detection result or the gender detection result of the image to be detected as the face data to be compared, and performing the similarity calculation again;
if target face data is obtained, determining the identity information in the target face data as the face recognition result of the image to be detected; if not, taking the remaining face data in the face database as the face data to be compared and performing the similarity calculation;
if target face data is obtained, determining the identity information in the target face data as the face recognition result of the image to be detected; otherwise, outputting the face recognition result of the image to be detected as face identity not detected.
In some implementations, extracting face detection features using the face detection model includes:
outputting anchor boxes of different scales for an input image using the network detection branches, and calling the face detection model to perform forward propagation calculation on the input image to obtain candidate face detection results corresponding to each anchor box on the input image;
filtering out, from the candidate face detection results on the input image, those whose confidence score is below a confidence threshold, and suppressing, with a non-maximum suppression algorithm, adjacent box regression results whose intersection-over-union exceeds the non-maximum filtering threshold, to obtain the face detection result of the input image;
The input image is the face image or the image to be detected.
In some implementations, there are multiple types of face detection features;
screening the face data to be compared matched with the face detection features of the image to be detected from the face database, and performing similarity calculation on the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected, includes the following steps (a code sketch follows this list):
creating all subsets of the full set of face detection feature types;
selecting one face detection feature subset, screening from the face database the face data to be compared matched with all face detection features in that subset, and performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected;
if target face data whose similarity score reaches a similarity threshold is obtained, determining the identity information in the target face data as the face recognition result of the image to be detected;
if no target face data is obtained and the number of unused face detection feature subsets is not zero, selecting another face detection feature subset and returning to the step of screening from the face database the face data to be compared matched with all face detection features in the subset and performing the similarity calculation;
if no target face data is obtained and the number of unused face detection feature subsets is zero, taking the remaining face data in the face database as the face data to be compared and performing the similarity calculation;
if target face data is obtained, determining the identity information in the target face data as the face recognition result of the image to be detected;
if no target face data is obtained, outputting the face recognition result of the image to be detected as face identity not detected.
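A sketch of this coarse-to-fine retrieval loop under stated assumptions: each database record is a dict holding an attribute table, an embedding, and identity information, and the cosine metric and 0.6 threshold are illustrative. The subset ordering and the skipping of already-compared records correspond to the two refinements described next:

```python
from itertools import combinations
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def recognize(probe_attrs, probe_vec, database, threshold=0.6):
    """Coarse-to-fine 1:N search: try attribute subsets from most to
    fewest features, then fall back to the rest of the database.
    Each record: {"attrs": {...}, "vec": ndarray, "identity": str}."""
    keys = list(probe_attrs)
    subsets = [set(c) for n in range(len(keys), 0, -1)
               for c in combinations(keys, n)]      # largest subsets first
    compared = set()                                # skip already-compared records
    for subset in subsets:
        for r in database:
            if id(r) in compared:
                continue
            if all(r["attrs"].get(k) == probe_attrs[k] for k in subset):
                compared.add(id(r))
                if cosine(probe_vec, r["vec"]) >= threshold:
                    return r["identity"]            # target face data found
    for r in database:                              # remaining face data
        if id(r) not in compared and cosine(probe_vec, r["vec"]) >= threshold:
            return r["identity"]
    return None                                     # face identity not detected
```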
In some implementations, selecting one of the face detection feature subsets includes:
selecting the face detection feature subsets in descending order of the number of face detection features they contain.
In some implementations, screening from the face database the face data to be compared matched with all face detection features in the face detection feature subset, and performing similarity calculation on the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected, includes:
screening from the face database the face data to be compared matched with all face detection features in the face detection feature subset, excluding face data that has already been compared, and then performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected.
In some implementations, there are multiple types of face detection features;
screening the face data to be compared matched with the face detection features of the image to be detected from the face database, and performing similarity calculation on the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected, includes:
taking all types of face detection features as the screening condition, screening from the face database the face data to be compared matched with all face detection features in the screening condition, and performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected;
if target face data whose similarity score reaches a similarity threshold is obtained, determining the identity information in the target face data as the face recognition result of the image to be detected;
if no target face data is obtained and more than one face detection feature remains in the screening condition, removing one face detection feature from the screening condition and returning to the step of screening from the face database the face data to be compared matched with all face detection features in the screening condition and performing the similarity calculation;
if no target face data is obtained and only one face detection feature remains in the screening condition, taking the remaining face data in the face database as the face data to be compared and performing the similarity calculation;
if target face data is obtained, determining the identity information in the target face data as the face recognition result of the image to be detected;
if no target face data is obtained, outputting the face recognition result of the image to be detected as face identity not detected.
In some implementations, screening from the face database the face data to be compared matched with all face detection features in the screening condition, and performing similarity calculation on the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected, includes:
screening from the face database the face data to be compared matched with all face detection features in the screening condition, excluding face data that has already been compared, and then performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected.
In order to solve the above technical problem, the present invention further provides a face recognition device, including:
the model training unit is used for training to obtain a face detection model and a face feature extraction model;
the database building unit is used for extracting the face detection characteristics of each face image in the face image data set by utilizing the face detection model, extracting the face characteristic vector of each face image by utilizing the face characteristic extraction model, and carrying out association storage on the face detection characteristics of the face image, the face characteristic vector of the face image and the identity information of the face image to obtain a face database;
the face detection unit is used for detecting the face detection characteristics of the image to be detected by using the face detection model, and extracting the face characteristic vectors of the image to be detected by using the face characteristic extraction model;
the face recognition unit is used for screening face data to be compared, which is matched with the face detection features of the image to be detected, from the face database, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared to obtain a face recognition result of the image to be detected;
The face recognition result includes: the identity information of the image to be detected, or a notification that the identity information of the image to be detected was not found.
To solve the above technical problem, the present invention further provides face recognition equipment, comprising:
a memory for storing a computer program;
a processor for executing the computer program, wherein the computer program, when executed by the processor, implements the steps of the face recognition method according to any one of the above.
To solve the above technical problem, the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the face recognition method according to any one of the above.
According to the face recognition method provided by the invention, a face detection model and a face feature extraction model are trained to obtain, respectively, the face detection features and the face feature vectors of the face images in the face image dataset. When the face detection model performs face detection on an image to be detected, the face database is screened with the detected face detection features to obtain the face data to be compared, which narrows the range of face data to compare; similarity calculation is then performed only between the face feature vector of the image to be detected and the face data to be compared. Compared with the current full-database similarity calculation, this reduces the computation required for face data comparison and thus improves face recognition efficiency.
The invention also provides a face recognition device, equipment, and computer-readable storage medium, which have the above beneficial effects; details are not repeated here.
Drawings
To describe the embodiments of the invention or the prior art more clearly, the drawings used in that description are briefly introduced below. The drawings described below are only some embodiments of the invention; other drawings can be derived from them by a person skilled in the art without inventive effort.
Fig. 1 is a flowchart of a face recognition method according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a face detection network according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a face recognition device according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of face recognition equipment according to an embodiment of the present invention.
Detailed Description
The core of the invention is to provide a face recognition method, device, equipment, and computer-readable storage medium for improving face recognition efficiency.
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The first embodiment of the present invention is described below.
For ease of understanding, the applicable scenarios of the invention are described first.
Face recognition scenarios can be divided into 1:1 face recognition and 1:N face recognition. 1:1 face recognition is generally used for identity verification scenarios, while 1:N face recognition is used for scenarios such as unlocking a mobile phone screen with a face, face payment, and fraud detection. Face recognition scenarios take many forms, but all are ultimately realized based on face feature retrieval technology. Compared with other biometric technologies, face recognition can acquire face information and complete identity verification and retrieval without the user directly touching a device or actively cooperating, even without the user noticing. The face recognition method provided by the embodiments of the invention is mainly applicable to 1:N face recognition scenarios.
The face recognition method provided by the embodiments of the invention can be applied to any device or equipment with a face recognition function, including but not limited to mobile phones, tablet computers, and computers.
The second embodiment of the present invention will be described below.
Fig. 1 is a flowchart of a face recognition method according to an embodiment of the present invention.
As shown in fig. 1, the face recognition method provided by the embodiment of the invention includes:
s101: training to obtain a face detection model and a face feature extraction model.
S102: extracting the face detection characteristics of each face image in the face image data set by using the face detection model, extracting the face characteristic vectors of each face image by using the face characteristic extraction model, and carrying out association storage on the face detection characteristics of the face image, the face characteristic vectors of the face image and the identity information of the face image to obtain a face database.
S103: and detecting by using a face detection model to obtain face detection characteristics of the image to be detected, and extracting face characteristic vectors of the image to be detected by using a face characteristic extraction model.
S104: and screening the face data to be compared, which is matched with the face detection characteristics of the image to be detected, from the face database, and carrying out similarity calculation on the face characteristic vector of the image to be detected and the face characteristic vector in the face data to be compared to obtain the face recognition result of the image to be detected.
The face recognition result includes: the identity information of the image to be detected, or a notification that the identity information of the image to be detected was not found.
In a specific implementation, for S101, the face detection features may include, but are not limited to, at least one of an age detection result, a gender detection result, an accessory detection result, and a hairstyle detection result; other types of detection results, such as a makeup detection result, may also be included.
The face detection model detects salient face features, i.e., face detection features, on an input image (a face image or an image to be detected); these features serve as screening conditions that help select, in the face database, face data consistent with the possible identity of the image to be detected. A traditional face recognition scheme uses a face detection model to extract face coordinate data from the input image and then extracts a face feature vector based on those coordinates with a face feature extraction model; because the face feature vector of the image to be detected must then be compared for similarity against the face feature vectors of all face data in the face database, the computation time is long and face recognition efficiency is low. The face recognition method provided by the invention likewise uses the face detection model to extract salient face detection features from the input image, which not only assists the face feature extraction model in extracting face feature vectors but also uses the face detection features to reduce the number of face data to be compared before the similarity calculation, significantly shortening the similarity calculation step and improving the overall efficiency of face recognition.
The face feature extraction model is used to extract the face feature vector of an input image (a face image or an image to be detected) so that the similarity of two face images can be calculated. In application, if the similarity between the face feature vector of the image to be detected and the face feature vector of some face data in the face database is higher than a similarity threshold, or is the highest, the face identity of the image to be detected can be considered to be the identity information of that face data.
The process of training the face detection model and the face feature extraction model may include: the method comprises a data acquisition process, a data preprocessing process, a model training process and a model test evaluation process.
In the data acquisition process, a certain number of face images must be collected, covering different angles and conditions such as frontal views, side views, and occlusion. Either a face image dataset with annotation files is selected, or the face images are annotated to generate files for verification. Face images used to train the face detection model must be annotated with details such as the picture name, the face format of each picture, and the coordinates of each face bounding box; face images used to train the face feature extraction model must be annotated with details such as the picture name and the key point coordinates and scores of each image.
In the data preprocessing process, the collected face images need to be preprocessed, including resizing them to the size required by the model, so that subsequent model training can proceed.
In the model training process, the dataset and model parameters are loaded, and model training parameters such as the training period (epoch), batch size (batch_size), and learning rate (learning_rate) are set according to the actual situation. Samples are selected from the training dataset and fed into the model to compute the loss function, the model weights are updated, and iteration continues until the preset number of epochs is reached or the model converges, after which the trained model is output.
In the model test evaluation process, the trained model is tested and evaluated by using the test data set so as to check the performance and the accuracy of the model.
For S102, the face detection features of each face image are obtained with the face detection model, the face feature vectors of each face image are extracted with the face feature extraction model, and the face detection features, face feature vectors, and identity information of each face image are stored in association to obtain the face database.
The identity information of a face image is known in advance and can be identification information such as a name or an ID card number.
In the face image dataset, if a person has only one face image, the face feature vector of that image is stored in association directly. If the same person has multiple face images, there are multiple face feature vectors; in this case, the face feature vectors of all face images of the same person can be concatenated together, or the elements at corresponding positions of the feature vectors can be averaged, so that the resulting mean feature vector keeps the same dimension.
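For instance, a minimal numpy sketch of the averaging option (the 512-dimensional embeddings and the three photos are assumptions for illustration):

```python
import numpy as np

# three embeddings from three photos of the same person (random placeholders)
embeddings = [np.random.rand(512) for _ in range(3)]
mean_vec = np.mean(np.stack(embeddings), axis=0)  # still 512-dimensional
```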
When creating the face database, a corresponding face information table may be created for each face image to store the face detection features. For example, let the vector X be the face information table of a face, X = {x1, x2, x3, x4}, where the list elements x1, x2, x3, and x4 represent gender, age group, whether glasses are worn, and whether there are bangs, respectively. When the face detection features of the image to be detected are extracted, a corresponding face information table is also created for the image to be detected and used to screen the face data to be compared. Other ways of storing the face detection features of the face images are also possible.
For S103 and S104: compared with the related-art scheme of performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors of all face data in the face database, the face recognition method provided by the embodiment of the invention first uses the face detection model to detect the face detection features of the image to be detected, uses those features to screen the face database for the matching face data to be compared, and only then performs similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared. This reduces the amount of data involved in the similarity calculation and effectively improves face recognition efficiency.
For example, if the configured face detection feature type is the age detection result, the age detection result of the image to be detected is detected with the face detection model, face data with the same age detection result is screened out of the face database and defined as the face data to be compared, and similarity calculation is performed between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared; if no face data to be compared meets the similarity threshold, the search range is enlarged.
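The following sketch puts the information table and this screening step together; the field names, the cosine metric, the 0.6 threshold, and the toy data are assumptions for illustration:

```python
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def best_match(candidates, probe_vec, threshold=0.6):
    """Return the identity of the best-scoring candidate above the
    threshold, or None so the caller can widen the search range."""
    best = max(candidates, key=lambda r: cosine_similarity(probe_vec, r["vec"]),
               default=None)
    if best is not None and cosine_similarity(probe_vec, best["vec"]) >= threshold:
        return best["identity"]
    return None

# toy face database: information table X, embedding, identity per record
face_db = [{"X": {"age_group": "20-30"}, "vec": np.random.rand(128),
            "identity": f"id_{i:04d}"} for i in range(1000)]
probe_X, probe_vec = {"age_group": "20-30"}, np.random.rand(128)

# screen by age group first; compare embeddings only within that slice
subset = [r for r in face_db if r["X"]["age_group"] == probe_X["age_group"]]
match = best_match(subset, probe_vec)
if match is None:  # enlarge the search range to the remaining data
    rest = [r for r in face_db if r["X"]["age_group"] != probe_X["age_group"]]
    match = best_match(rest, probe_vec)
```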
When the face detection model extracts the face detection features of the input image, the candidate results are generated from a plurality of prior (anchor) boxes. In S102 and S103, extracting the face detection features using the face detection model includes the following (a code sketch follows the list):
outputting anchor boxes of different scales for the input image using the network detection branches, and calling the face detection model to perform forward propagation calculation on the input image to obtain the candidate face detection results corresponding to each anchor box on the input image;
filtering out, from the candidate face detection results on the input image, those whose confidence score is below a confidence threshold, and suppressing, with a non-maximum suppression algorithm, adjacent box regression results whose intersection-over-union exceeds the non-maximum filtering threshold, to obtain the face detection result of the input image;
The input image is a face image or an image to be detected.
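A standard confidence-filter-plus-NMS pass in numpy, as a sketch; conventional NMS suppresses neighbours whose IoU with the kept box exceeds the filtering threshold, and the threshold values here are illustrative:

```python
import numpy as np

def filter_and_nms(boxes, scores, conf_thresh=0.5, iou_thresh=0.4):
    """Drop low-confidence candidates, then greedily suppress overlapping
    boxes. boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,)."""
    keep = scores >= conf_thresh          # confidence filtering
    boxes, scores = boxes[keep], scores[keep]
    order = scores.argsort()[::-1]        # best-scoring box first
    kept = []
    while order.size > 0:
        i = order[0]
        kept.append(i)
        # IoU of the kept box against the remaining candidates
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(xx2 - xx1, 0) * np.maximum(yy2 - yy1, 0)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = ((boxes[order[1:], 2] - boxes[order[1:], 0]) *
                 (boxes[order[1:], 3] - boxes[order[1:], 1]))
        iou = inter / (area_i + areas - inter + 1e-9)
        order = order[1:][iou <= iou_thresh]   # keep only low-overlap boxes
    return boxes[kept], scores[kept]
```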
In addition, because face data in the face database may be outdated or of poor acquisition quality, the face recognition method provided by the embodiment of the invention may further include using the image to be detected, once its identity information is determined, to quality-check and update the corresponding face data in the face database. If the standard face image in the image to be detected is clearer, or the corresponding face data in the face database is old, the face data of the image to be detected can replace the corresponding face data in the face database.
According to the face recognition method provided by the embodiment of the invention, the face detection model and the face feature extraction model are trained to obtain, respectively, the face detection features and the face feature vectors of the face images in the face image dataset. When the face detection model performs face detection on the image to be detected, the face database is screened with the detected face detection features to obtain the face data to be compared, narrowing the range of face data to compare; similarity calculation is then performed between the face feature vector of the image to be detected and the face data to be compared. Compared with the current full-database similarity calculation, this reduces the computation of face data comparison and thus improves face recognition efficiency. In a 1:N comparison experiment on 100,000 entries of the IJBC face verification dataset, an existing face recognition scheme that uses the face detection model only to extract face coordinate data and then extracts face feature vectors from those coordinates with the face feature extraction model takes 6.58 to 6.72 seconds on average to compare one picture, while the face recognition method provided by the embodiment of the invention shortens the average comparison time per picture to about 0.53 seconds. The face recognition method provided by the embodiment of the invention thus significantly improves face recognition speed.
The following describes a third embodiment of the present invention.
Fig. 2 is a schematic diagram of a face detection network according to an embodiment of the present invention.
In the related art, the ability of the face detection model to extract face detection information remains to be improved. Therefore, building on the above embodiment, in the face recognition method provided by the embodiment of the present invention, extracting the face detection features using the face detection model may include:
performing enhanced feature extraction on an input image to obtain the image feature parameters of the input image;
inputting the image feature parameters of the input image into the face detection prediction function to obtain the face detection features of the input image;
the input image is a face image or an image to be detected.
In some implementations, performing enhanced feature extraction on the input image to obtain the image feature parameters of the input image may include:
performing N-stage extraction on the input image through N depthwise separable convolutions, and outputting N feature maps with different compression ratios;
adjusting the feature maps to the same number of channels, and then sequentially performing feature fusion on the channel-adjusted feature maps in order of compression ratio from large to small to obtain N corresponding feature fusion results;
performing enhanced feature extraction on each feature fusion result to obtain a feature layer corresponding to each feature fusion result;
outputting the feature layers corresponding to the input image as the image feature parameters of the input image;
wherein N is a positive integer greater than 1.
As shown in fig. 2, an embodiment of the present invention provides a network architecture of a face detection model, where the face detection model may include: a compression extraction network 201, a feature fusion network 202, an enhanced feature extraction network 203, and a prediction calculation network 204.
The compression extraction network 201 performs multi-stage extraction on the input image through multiple depthwise separable convolutions, obtaining feature maps with different compression ratios.
The feature fusion network 202 adjusts the feature maps of different compression ratios to the same number of channels and then performs feature fusion sequentially in order of compression ratio from large to small. Specifically, starting from the feature map with the largest compression ratio, the adjusted feature map is taken as one of the feature fusion results; it is up-sampled and fused with the preceding feature map to obtain a feature fusion result, which is in turn up-sampled and fused with the feature map before it, so that the number of output feature fusion results equals the number of feature maps.
To further enrich the extracted features, each feature fusion result is passed through the enhanced feature extraction network 203 to obtain the feature layer corresponding to each feature fusion result. This step may include: performing feature extraction on each feature fusion result with M convolution kernels of different sizes and outputting a feature set corresponding to each feature fusion result; and concatenating and stacking the feature sets corresponding to each feature fusion result to obtain the feature layer corresponding to that result, where M is a positive integer greater than 1.
The prediction calculation network 204 inputs the image feature parameters of the input image into the face detection prediction functions to obtain the face detection features of the input image, that is, it obtains the face detection features from the feature layers and the prediction function corresponding to each face detection feature. Each face detection feature has a prediction function matching how it is obtained; these generally fall into two types. A classification prediction produces a class result, e.g., the gender detection result (male/female); a regression prediction produces a value, e.g., the age detection result (age detection can also be implemented as classification prediction by dividing ages into age groups). The feature parameters of the feature layers are input into the prediction function corresponding to each face detection feature to obtain the corresponding predicted value, and the predicted value is decoded to obtain the prediction result of the face detection feature.
Assuming the input image size is 720 × 1280 × 3, the compression extraction network 201 performs three-stage feature extraction through three depthwise separable convolutions. The first stage has compression ratio scale = 8 and output feature size (1, 64, 90, 160), where 64 is the fixed output dimension of that stage, 160 = 1280/8, and 90 = 720/8. The second stage has compression ratio scale = 16 and output feature size (1, 128, 45, 80), where 128 is the fixed output dimension, 80 = 1280/16, and 45 = 720/16. The third stage has compression ratio scale = 32 and output feature size (1, 256, 23, 40), where 256 is the fixed output dimension, 40 = 1280/32, and 23 = 720/32 rounded up. These three output features correspond to the last three feature maps of the compression extraction network 201 in fig. 2 and are denoted the first feature map C3, the second feature map C4, and the third feature map C5, respectively.
After obtaining the first feature map C3, the second feature map C4, and the third feature map C5, the feature fusion network 202 adjusts their channel numbers to be identical (e.g., 64 in fig. 2). Specifically, it adjusts the channel numbers of C3, C4, and C5 with 1 × 1 convolution kernels and takes the adjusted third feature map C5 as the third feature fusion result P5. The adjusted third feature map is up-sampled to the size of the adjusted second feature map and fused with it to obtain the second feature fusion result P4. P4 is convolved with a 3 × 3 kernel, up-sampled to the size of the adjusted first feature map, and fused with it to obtain the first feature fusion result P3, which is then convolved with a 3 × 3 kernel. Finally, the three fused branches, denoted P3', P4', and P5', with sizes (1, 64, 90, 160), (1, 64, 45, 80), and (1, 64, 23, 40) respectively, are passed to the enhanced feature extraction network 203.
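A PyTorch sketch of this fusion path with the shapes given above; the layer names and the use of nearest-neighbour up-sampling are assumptions, not specified by the text:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureFusion(nn.Module):
    """FPN-style fusion of C3/C4/C5 into P3'/P4'/P5' as described above;
    the 64 output channels follow the text, the naming is our own."""
    def __init__(self, in_ch=(64, 128, 256), out_ch=64):
        super().__init__()
        # 1x1 kernels adjust every map to the same channel count
        self.lateral = nn.ModuleList([nn.Conv2d(c, out_ch, 1) for c in in_ch])
        # 3x3 convolutions applied to the fused P4 and P3
        self.smooth = nn.ModuleList([nn.Conv2d(out_ch, out_ch, 3, padding=1)
                                     for _ in range(2)])

    def forward(self, c3, c4, c5):
        p5 = self.lateral[2](c5)                      # adjusted C5 is P5
        p4 = self.lateral[1](c4) + F.interpolate(p5, size=c4.shape[-2:])
        p4 = self.smooth[1](p4)
        p3 = self.lateral[0](c3) + F.interpolate(p4, size=c3.shape[-2:])
        p3 = self.smooth[0](p3)
        return p3, p4, p5

# shapes from the text: C3 (1,64,90,160), C4 (1,128,45,80), C5 (1,256,23,40)
fuse = FeatureFusion()
p3, p4, p5 = fuse(torch.zeros(1, 64, 90, 160),
                  torch.zeros(1, 128, 45, 80),
                  torch.zeros(1, 256, 23, 40))
# p3, p4, p5 now have sizes (1,64,90,160), (1,64,45,80), (1,64,23,40)
```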
The enhanced feature extraction network 203 further strengthens feature extraction on the feature layers. It may use three parallel convolution branches: the first a 3 × 3 convolution, the second two stacked 3 × 3 convolutions replacing a 5 × 5 convolution, and the third three stacked 3 × 3 convolutions replacing a 7 × 7 convolution. The results of the three branches are concatenated and stacked, and three feature layers are output: the feature layer S3 of the first predicted feature, the feature layer S4 of the second predicted feature, and the feature layer S5 of the third predicted feature, with sizes (1, 64, 90, 160), (1, 64, 45, 80), and (1, 64, 23, 40), respectively.
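A sketch of one such three-branch enhancement module; the 32/16/16 channel split is an assumption, since the text only fixes the 64 channels in and out:

```python
import torch
import torch.nn as nn

class EnhanceModule(nn.Module):
    """Three parallel branches: one 3x3 conv, two stacked 3x3 convs (5x5
    receptive field), three stacked 3x3 convs (7x7 receptive field); the
    branch outputs are concatenated back to 64 channels."""
    def __init__(self, ch=64):
        super().__init__()
        self.branch1 = nn.Conv2d(ch, ch // 2, 3, padding=1)
        self.branch2a = nn.Conv2d(ch, ch // 4, 3, padding=1)
        self.branch2b = nn.Conv2d(ch // 4, ch // 4, 3, padding=1)
        self.branch3 = nn.Conv2d(ch // 4, ch // 4, 3, padding=1)

    def forward(self, x):
        y1 = self.branch1(x)                  # one 3x3 conv -> 32 channels
        y2 = self.branch2b(self.branch2a(x))  # two 3x3 convs -> 16 channels
        y3 = self.branch3(y2)                 # a third 3x3 conv -> 16 channels
        return torch.cat([y1, y2, y3], dim=1)  # 32 + 16 + 16 = 64

# keeps both the spatial size and the channel count, e.g. S3 = EnhanceModule()(p3)
```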
The feature layer S3 of the first predicted feature, the feature layer S4 of the second predicted feature, and the feature layer S5 of the third predicted feature are input into the prediction calculation network 204 and calculated with the face detection prediction function corresponding to each face detection feature to be computed, yielding the predicted values of the face detection features.
To improve detection efficiency and precision, inputting the image feature parameters of the input image into the face detection prediction function to obtain the face detection features of the input image may include:
Inputting image characteristic parameters of an input image into a face detection prediction function, and outputting a face detection result prediction value of the input image;
and carrying out prediction result correction and non-maximum value inhibition and filtration on the face detection result predicted value of the input image to obtain the face detection characteristics of the input image.
Taking the above-listed network parameters as an example, when the predicted values of the face detection features output by the prediction calculation network 204 are decoded, three effective feature layers are obtained from the predicted values of the face detection features, and the prior frame coordinates are adjusted to obtain a centred prediction frame whose converted coordinates are the upper-left corner and the lower-right corner. After the prior frames are adjusted, the corresponding face detection result predicted values are obtained. Prediction frames with a high overlap ratio are then removed from the decoding result using non-maximum suppression (NMS).
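A sketch of this decode-and-filter step, assuming SSD-style prior boxes in (cx, cy, w, h) form with variances 0.1/0.2 (a common convention, not taken from the patent) and torchvision's NMS operator:

```python
import torch
from torchvision.ops import nms

def decode_and_filter(loc, priors, scores, iou_thr=0.4, score_thr=0.5):
    # loc: (N, 4) predicted offsets; priors: (N, 4) as (cx, cy, w, h).
    cxcy = priors[:, :2] + loc[:, :2] * 0.1 * priors[:, 2:]   # adjust centres
    wh = priors[:, 2:] * torch.exp(loc[:, 2:] * 0.2)          # adjust sizes
    # Convert to upper-left / lower-right corner coordinates.
    boxes = torch.cat([cxcy - wh / 2, cxcy + wh / 2], dim=1)
    keep = scores > score_thr                  # drop low-confidence frames
    boxes, scores = boxes[keep], scores[keep]
    return boxes[nms(boxes, scores, iou_thr)]  # drop highly overlapping frames
```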
The face detection model is described by one or more network model description files. The face detection model training algorithm loads the training set file and the network model description files, builds the face detection model to be trained according to the network model description files, and initializes the weight parameters of the built face detection model. The initial weight parameters of the face detection model are parameters obtained through a model pre-training method.
When training the face detection model, a gradient descent method may be employed. Specifically, a gradient descent algorithm is initialized, and the training period (epoch), batch size (batch_size) and learning rate (learning_rate) are set. One training period (epoch) refers to the process in which all data samples in the data set pass through the neural network once, forward and backward, i.e. one complete round of training. During gradient updating, one batch of sample data is taken at a time for training; the batch size (batch_size) setting is determined primarily by the configuration of the compute nodes. Samples are then selected from the face image data set to form a batch of training samples, the loss of the loss function is calculated on the training samples, and the weights of the detection model are updated using the gradient descent method. The processes of selecting samples, calculating the loss function and updating the model weights are repeated until the preset epoch value is reached, the iteration ends, and the trained face detection model is output.
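A minimal gradient-descent training loop along these lines (a sketch assuming PyTorch and plain SGD; `model`, `dataset` and `loss_fn` are placeholders defined elsewhere):

```python
import torch
from torch.utils.data import DataLoader

def train(model, dataset, loss_fn, epochs=50, batch_size=32, lr=1e-2):
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    for epoch in range(epochs):          # one epoch = full pass over the data
        for images, targets in loader:   # one batch per gradient update
            loss = loss_fn(model(images), targets)   # calculate the loss
            optimizer.zero_grad()
            loss.backward()              # compute gradients
            optimizer.step()             # gradient-descent weight update
    return model
```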
Based on the face detection model provided by the embodiment of the invention, the image characteristic parameters required by face detection characteristic detection of an input image (a face image or an image to be detected) can be subjected to enhanced characteristic extraction, the calculation of the face detection characteristic can be accelerated, and the accuracy of the face detection characteristic detection is improved.
The fourth embodiment of the present invention will be described below.
In order to further improve the detection effect of the face detection model and reduce the training task amount of the model and the calculation amount during face detection, in the face recognition method provided by the embodiment of the invention, when the types of the face detection features are multiple, the training is performed to obtain the face detection model, which may include:
superposing face detection prediction functions corresponding to the face detection features to obtain an overall loss function;
and performing gradient descent training on the face detection model to be trained by using the overall loss function to obtain the face detection model for detecting the face detection characteristics of each type.
In a specific implementation, on the basis of the face detection model provided in the third embodiment of the present invention, different prediction calculation networks 204 may be set for different types of face detection features, where each prediction calculation network 204 corresponds to different face detection prediction functions and loss functions, and each prediction calculation network 204 is also connected to an output end of one enhanced feature extraction network 203. And superposing the loss function of each face detection characteristic as the overall loss function of the face detection model, and performing global optimization by taking the overall loss function as the loss function of the face detection model to obtain the face detection model capable of detecting different face detection characteristics.
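The superposition itself reduces to summing the per-feature losses; a one-line sketch (the optional per-task weights are an assumption, not part of the text):

```python
def overall_loss(losses, weights=None):
    # losses: list of per-feature loss values from each prediction network 204.
    weights = weights or [1.0] * len(losses)
    return sum(w * l for w, l in zip(weights, losses))
```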
In order to realize detection of face detection features such as age, gender and the like, inputting image feature parameters of an input image into a face detection prediction function to obtain the face detection features of the input image, the method can comprise the following steps:
inputting image characteristic parameters of an input image into a face attribute prediction function to obtain face attribute characteristics of the input image;
screening the face data to be compared matched with the face detection characteristics of the image to be detected from the face database, wherein the face data to be compared comprises the following steps:
and screening the face data to be compared, which are matched with the face attribute characteristics of the image to be detected, from the face database.
Face attribute features such as age, gender, worn accessories and hairstyle can be used as screening conditions to screen out face data that are closer to the face in the image to be detected. The face data to be compared are screened out from the face database through at least one face detection feature and used as the face data on which similarity calculation with the image to be detected is performed preferentially; if no face data meeting the similarity threshold is obtained among them, similarity calculation with the image to be detected is carried out on the remaining face data.
The following describes a loss function in a face prediction function corresponding to a face attribute feature detected by using a face detection model.
The age detection result may divide ages into a plurality of age groups, for example six age groups, such as 0-3 years old, 4-12 years old, 13-18 years old, 19-30 years old, 30-60 years old and over 60 years old, or more age groups. When the face database is established, the age group of each face image is identified as the face age detection result of that face image. When face detection is performed on the image to be detected, the face detection model is used to detect which age group the image to be detected falls in as the face age detection result of the image to be detected. The loss function of the age detection result may be:
$$L_{\text{age}} = -\sum_{k=1}^{C_2} y_k \log(p_k)$$

In the loss function of the age detection result, $C_2$ is the number of classifications of the age detection results; $y_k = 0$ indicates that the face age detection result is not within the $k$-th age group, and $y_k = 1$ indicates that it is; $p_k$ is the $k$-th output of the face age-group detection task module, i.e. the softmax function output value, and is the prediction result of the model output. For example, with sample true value $y = [0,0,0,0,1,0]$ and predicted value $p = [0.1,0.1,0.1,0.1,0.5,0.1]$, the loss is $L_{\text{age}} = -\log 0.5$.
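A quick numeric check of this worked example in plain Python (assuming only the standard cross-entropy form reconstructed above):

```python
import math

y = [0, 0, 0, 0, 1, 0]               # true age-group one-hot label
p = [0.1, 0.1, 0.1, 0.1, 0.5, 0.1]   # model softmax output
loss = -sum(yk * math.log(pk) for yk, pk in zip(y, p))
print(loss, -math.log(0.5))          # both are about 0.693
```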
The age detection result can also adopt a mode of marking the actual age of the face. The default age value range is the integers 0 to 100, an age softmax loss function is set, and age detection is treated as a classification over the integer values 0 to 100. The softmax function normalizes the set of results output by the neural network and is expressed as follows:

$$p_k = \frac{e^{z_k}}{\sum_{j=0}^{C_3-1} e^{z_j}}$$

where $z$ is a 1×101 feature vector of classification scores, $C_3 = 101$ is the number of classifications, $j$ is the index running from 0 to 100, and $z_k$ is the $k$-th result of the prediction calculation network of the face attribute detection model. $p_k$ indicates the probability that the sample belongs to age $k$ and is the age prediction result of the last output layer of the model.
The mean $m$ of the softmax outputs $p_k$ is defined as the expectation of the predicted age distribution, i.e. the mean of the age predictors:

$$m = \sum_{k=0}^{C_3-1} k \cdot p_k$$

An age mean loss function is added in the following form:

$$L_{\text{mean}} = (m - y_{\text{age}})^2$$

where $y_{\text{age}}$ is the true age label.
The age-group softmax loss function is added in the following form:

$$L_{\text{softmax}} = -\sum_{k=0}^{C_3-1} y_k \log(p_k)$$

where $p_k$ is the softmax function output value, $C_3 = 101$, and $y_k$ reflects the true age sample label: the element corresponding to the true age takes the value 1 and the rest take the value 0.
The loss function of the age detection result under this scheme is then defined as $L_{\text{age}} = L_{\text{softmax}} + L_{\text{mean}}$.
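A sketch of this age scheme (assuming PyTorch; the expectation-based mean and the squared-error mean loss follow the reconstruction above, which may differ from the patent's exact formulas):

```python
import torch
import torch.nn.functional as F

def age_loss(logits, true_age):
    # logits: (1, 101) raw scores; true_age: integer label in 0..100.
    p = F.softmax(logits, dim=1)
    ages = torch.arange(101, dtype=p.dtype)
    m = (p * ages).sum()                     # mean m of the age predictors
    mean_loss = (m - true_age) ** 2          # age mean loss
    softmax_loss = F.cross_entropy(logits, torch.tensor([true_age]))
    return softmax_loss + mean_loss          # loss of the age detection result
```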
The face gender detection result may be classified into male and female. When the face database is established, the gender of each face image is identified as the face gender detection result of that face image. When face detection is performed on the image to be detected, the gender of the image to be detected is detected using the face detection model as the face gender detection result of the image to be detected. The loss function of the face gender detection result may be:
$$L_{\text{gender}} = -\sum_{k=1}^{C_4} y_k \log(p_k)$$

In the loss function of the face gender detection result, $C_4 = 2$, corresponding to male and female respectively; $y_k = 0$ indicates that the corresponding face gender detection result is wrong, and $y_k = 1$ indicates that it is correct; $p_k$ is the $k$-th output of the prediction calculation network corresponding to the face gender detection result, i.e. the softmax function output value, and is the prediction result of the model output. For example, if the true sample is $y = [0, 1]$, the probability of male is 0 and the probability of female is 1. With a first model prediction $p = [0.1, 0.9]$ and a second model prediction $p = [0.7, 0.3]$, the first loss function value is $-\log 0.9$ and the second is $-\log 0.3$; the first loss function value is smaller, so its prediction result is closer to the real sample.
By applying the face detection model provided by the embodiment of the invention, not only can various face coordinate detection results be detected simultaneously, but various face attribute features can also be detected simultaneously. The face attribute features can be used, before the similarity calculation of face recognition, to screen out the face data to be compared in the face database so as to reduce the amount of face data subjected to similarity calculation. That is, the face detection model can be used both to generate a standard face image for extracting the face feature vector that supports similarity calculation, and to detect face attribute features that narrow the range of face data requiring similarity calculation with the image to be detected, thereby improving both face recognition efficiency and face recognition accuracy.
Through the face attribute detection model, some remarkable face attribute features on the input image can be directly detected. When the face attribute features include an age detection result and a gender detection result, screening in a face database according to the face attribute features of the image to be detected to obtain face data to be compared, and performing similarity calculation on a face feature vector corresponding to the image to be detected and a face feature vector in the face data to be compared to obtain a face recognition result of the image to be detected, which may include:
screening face data which are consistent with the age detection result of the image to be detected and the sex detection result of the image to be detected from a face database to serve as face data to be compared, and carrying out similarity calculation on face feature vectors of the image to be detected and face feature vectors in the face data to be compared;
if the target face data with the similarity score reaching the similarity threshold value is obtained, determining the identity information of the image to be detected in the target face data as the face recognition result of the image to be detected; if the target face data is not obtained, re-screening, from the face database, the face data consistent with the age detection result of the image to be detected together with the face data consistent with the gender detection result of the image to be detected (i.e. data matching either condition) as the face data to be compared, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vectors in the face data to be compared, as shown in the sketch after this list;
If the target face data are obtained, the identity information of the image to be detected is determined in the target face data and is used as the face recognition result of the image to be detected; if the target face data is not obtained, taking the remaining face data in the face database as the face data to be compared, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared;
if the target face data are obtained, the identity information of the image to be detected is determined in the target face data and is used as the face recognition result of the image to be detected; and if the target face data is not obtained, outputting a face recognition result of the image to be detected as that the face identity is not detected.
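A sketch of this coarse-to-fine fallback (the record layout and helper names are illustrative assumptions; `similarity` stands for the similarity calculation between feature vectors):

```python
def recognize(probe_vec, probe, database, threshold, similarity):
    # Screening conditions, from strictest to loosest.
    conditions = [
        lambda r: r["age"] == probe["age"] and r["gender"] == probe["gender"],
        lambda r: r["age"] == probe["age"] or r["gender"] == probe["gender"],
        lambda r: True,                        # remaining face data
    ]
    compared = set()
    for cond in conditions:
        batch = [r for r in database if cond(r) and r["id"] not in compared]
        compared.update(r["id"] for r in batch)
        hits = [(similarity(probe_vec, r["vec"]), r) for r in batch]
        hits = [(s, r) for s, r in hits if s >= threshold]
        if hits:                               # target face data obtained
            return max(hits, key=lambda t: t[0])[1]["identity"]
    return None                                # face identity not detected
```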
The fifth embodiment of the present invention will be described below.
On the basis of the foregoing embodiment, in the face recognition method provided by the embodiment of the present invention, inputting the image feature parameters of the input image into the face detection prediction function to obtain the face detection feature of the input image may further include:
and inputting the image characteristic parameters of the input image into a face coordinate prediction function to obtain a face coordinate detection result of the input image.
The face detection model provided by the embodiment of the invention not only detects the face attribute characteristics used for screening the face data to be compared, but also can be used for detecting the face coordinate detection result of the input image, and the face coordinate detection result can be used for better positioning the face position on the input image.
Based on this, extracting the face feature vector of the input image using the face feature extraction model may include: extracting a standard face image from the input image based on the face coordinate detection result of the input image, and extracting the face feature vector of the input image from the standard face image corresponding to the input image by using the face feature extraction model. When the face feature vector is extracted, it is not extracted directly on the face image or the image to be detected; instead, a standard face image is first extracted from the input image, and the face feature vector is then extracted on the basis of the standard face image, so that the consistency and accuracy of the face feature vector data are ensured.
Specifically, when the face image is processed in S102 or the image to be detected is processed in S103, the system loads the face detection model and the face feature extraction model, reads and preprocesses the input image, and then sends it into the face detection model to obtain a series of prior frames, five sense organs key point coordinates and confidence scores. Coordinate correction is performed on the prior frames to obtain the face detection frames. After the face detection frames are obtained, the intersection over union (IOU, the ratio of the intersection and union of two bounding boxes) can be calculated according to a non-maximum suppression (NMS) algorithm to filter out detection frames with a high degree of overlap, and detection frames with low confidence (score) are filtered out at the same time, to obtain the final face detection features, which include a face classification result (confidence score that a face is/is not present), a face frame position detection result (bbox) and a face key point coordinate detection result (landmark). The face classification result not only indicates whether a corresponding detection frame contains a face; its face confidence can also indicate the face image quality, i.e. the higher the confidence, the higher the face image quality. The face frame position detection result (bbox) is the frame regression result in the corresponding detection frame and is used to identify the face contour position in the corresponding detection frame; the face contour position can be identified through the upper-left and lower-right coordinates, 4 coordinate values in total. The face key point coordinate detection result (landmark) is the detection result of the coordinates of the face key points in the corresponding detection frame, for example the five sense organs coordinates; specifically, the two eye coordinates, the nose coordinates and the boundary point coordinates on both sides of the mouth corners can be used as the face key point coordinates, 10 coordinate values in total.
In the process of performing face correction and clipping on the image to be detected based on the face detection features to obtain a standard face image (standard front face image), the geometric structure of the face in the image to be detected is obtained, and the aligned standard face image is obtained through translation, scaling and rotation by combining the face key point coordinate detection result (landmark) in the face detection result with the standard face reference points. In the process of extracting the face feature vector from the standard face image using the face feature extraction model, the standard face image is sent into the face feature extraction model to obtain a multi-dimensional feature vector. The feature vector dimension output by each face feature extraction model is fixed.
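A sketch of this landmark-based alignment step (assuming OpenCV; the five reference points shown are a commonly used 112×112 face template, not values from the patent):

```python
import cv2
import numpy as np

# Standard face reference points: eyes, nose tip, mouth corners (112x112).
REFERENCE = np.float32([[38.3, 51.7], [73.5, 51.5], [56.0, 71.7],
                        [41.5, 92.4], [70.7, 92.2]])

def align_face(image, landmarks):
    # landmarks: (5, 2) detected key points in the original image.
    M, _ = cv2.estimateAffinePartial2D(np.float32(landmarks), REFERENCE)
    # The similarity transform combines translation, scaling and rotation.
    return cv2.warpAffine(image, M, (112, 112))   # standard face image
```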
It can be understood that the image to be detected may not contain a face, or the face may be too blurred to be identified. In that case, the detection result of the face detection model for the image to be detected is that no face is present, and the face recognition result of the image to be detected may be directly output as undetected face information.
The face classification result in the face coordinate detection result may use the following loss function:

$$L_{\text{cls}} = -\sum_{k=1}^{C_1} y_k \log(p_k)$$

In the loss function of the face classification result, $C_1 = 2$, corresponding to the two classification results of face and no face; $y_k = 0$ indicates that the corresponding face detection result is wrong, and $y_k = 1$ indicates that it is correct; $p_k$ is the $k$-th output of the prediction calculation network 204 for the face classification result, i.e. the softmax function output value, and is the prediction result of the model output. For example, if the true sample is $y = [0, 1]$, the probability that no face is detected is 0 and the probability that a face is detected is 1. With a first model prediction $p = [0.1, 0.9]$ and a second model prediction $p = [0.7, 0.3]$, the first loss function value is $-\log 0.9$ and the second is $-\log 0.3$; the first loss function value is smaller, so its prediction result is closer to the real sample.
The face frame position detection result (bbox) identifies the face contour position through the upper-left and lower-right coordinates, 4 coordinate values in total, and may adopt a coordinate regression loss $L_{\text{bbox}}$, where $L_{\text{bbox}}$ is the loss function of the face frame position detection task module (for example, a smooth-L1 loss over the 4 coordinate values).
The face key point coordinate detection result (landmark) identifies the coordinates of the face key points in the detection frame, such as the five sense organs coordinates; specifically, the two eye coordinates, the nose coordinates and the boundary point coordinates on both sides of the mouth corners can be used as the face key point coordinates, 10 coordinate values in total. The loss function of the face key point coordinate detection task module is $L_{\text{landmark}}$ (for example, a smooth-L1 loss over the 10 coordinate values).
By applying the face detection model provided by the embodiment of the invention, various face coordinate detection results such as the face classification result, the face frame position detection result and the face key point coordinate detection result can be detected at the same time. Compared with a traditional face detection model that detects only one face detection result, the face detection model not only achieves efficient and accurate face detection feature detection through enhanced feature extraction, but also improves model training efficiency and model calculation efficiency.
The sixth embodiment of the present invention will be described.
By detecting the face coordinates in the input image, the remarkable face attribute characteristics such as accessories, hairstyles and the like can be indirectly detected. When the face attribute features include at least one of a decoration detection result and a hairstyle detection result, extracting the face attribute features by using the face detection model may include:
and obtaining the face attribute characteristics of the input image according to the face key point coordinate detection result in the face detection characteristics of the input image and the pixel value change information of the input image.
In the embodiment of the invention, whether accessories are worn and remarkable hairstyle information (such as whether the forehead is covered by bangs) can also be used as marker features for face recognition. Both of these marker features can be detected through their significant color difference from the surrounding skin.
When the face attribute feature is a glasses wearing detection result in the accessory detection result, obtaining a detection result of the face attribute feature of the input image according to a face key point coordinate detection result in the face detection feature of the input image and pixel value change information of the input image may include:
obtaining the human five-sense organ coordinates of the input image according to the human face key point coordinate detection result of the input image;
performing eye positioning on the input image according to the human five-sense organ coordinates of the input image to obtain eye coordinates of the input image, and generating an eyeglass frame surrounding the eye positions of the input image according to the eye coordinates of the input image;
calculating a first difference value between pixel values in a glasses frame of the input image and pixel values outside the glasses frame of the input image; if the first difference value exceeds the eye difference threshold value, determining that the glasses wearing information of the input image is that glasses are worn; and if the first difference value does not exceed the eye difference threshold value, determining that the glasses wearing information of the input image is that the glasses are not worn.
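A sketch of this first-difference test (assuming a grayscale NumPy image with the eye away from the image border; the ring width `pad` is an illustrative parameter):

```python
import numpy as np

def wears_glasses(gray, eye, w, h, margin=0.1, eye_diff_threshold=64, pad=5):
    # gray: 2-D pixel matrix; eye: (x, y) eye centre; w, h: face frame size.
    x, y = int(eye[0]), int(eye[1])
    dx, dy = int(margin * w), int(margin * h)
    inner = gray[y - dy:y + dy, x - dx:x + dx].astype(np.float64)
    outer = gray[y - dy - pad:y + dy + pad,
                 x - dx - pad:x + dx + pad].astype(np.float64)
    # Mean of the ring of pixels just outside the glasses frame.
    ring_mean = (outer.sum() - inner.sum()) / (outer.size - inner.size)
    return abs(inner.mean() - ring_mean) > eye_diff_threshold
```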
For other types of accessories, such as whether ear studs or a necklace are worn, the corresponding area can be positioned first and the presence of the accessory determined according to the change in pixel values.
When the face attribute feature is a hairstyle detection result, obtaining a detection result of the face attribute feature of the input image according to a face key point coordinate detection result in the face detection feature of the input image and pixel value change information of the input image may include:
obtaining the human five-sense organ coordinates of the input image according to the human face key point coordinate detection result of the input image;
performing forehead positioning on the input image according to the human five sense organs coordinates of the input image to obtain the forehead coordinates of the input image, and generating a forehead frame surrounding the forehead position of the input image according to the forehead coordinates of the input image;
calculating a second difference value between a pixel value in the forehead frame of the input image and a pixel value outside the forehead frame of the input image; if the second difference value exceeds the forehead difference threshold value, determining that the hairstyle detection result of the input image is with bang; if the second difference value does not exceed the forehead difference threshold value, determining that the hairstyle detection result of the input image is that no bang exists.
For other types of hairstyle information, such as straight hair and curly hair, further determinations can be made according to the change of pixel values in the hair area.
When the face database is created, after the face detection result of each face image is detected using the face detection model, the eye area and the forehead area are positioned through the obtained detection frame and face key point coordinates, and whether the face wears glasses is judged by checking whether the pixel values of the eye area change abruptly. Whether there are bangs is determined by checking whether more than c% (c may be 80) of the pixels in the forehead area are lower than (128,48,48) (taking RGB channel order as an example). By adding the face glasses and bangs tags, the search range of the face library can be further narrowed in the subsequent comparison.
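The bangs test reduces to a darkness-ratio check over the forehead frame; a sketch assuming a NumPy RGB array, with the threshold (128,48,48) and the ratio c taken from the text above:

```python
import numpy as np

def has_bangs(forehead, threshold=(128, 48, 48), c=0.80):
    # forehead: (H, W, 3) RGB pixels inside the forehead frame.
    dark = np.all(forehead < np.array(threshold), axis=-1)
    return dark.mean() > c   # more than c of the pixels are darker than skin
```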
When performing the 1:N face comparison, the face additional information of the image to be detected is obtained in the same way to serve as the screening condition for the face data to be compared.
When the face attribute features include an accessory detection result and a hairstyle detection result, screening in a face database according to the face attribute features of the image to be detected to obtain face data to be compared, and performing similarity calculation on the face feature vector corresponding to the image to be detected and the face feature vector in the face data to be compared to obtain a face recognition result of the image to be detected, which may include:
screening face data which are consistent with both the accessory detection result of the image to be detected and the hairstyle detection result of the image to be detected from a face database to be used as face data to be compared, and carrying out similarity calculation on face feature vectors of the image to be detected and face feature vectors in the face data to be compared;
if the target face data with the similarity score reaching the similarity threshold value is obtained, determining the identity information of the image to be detected in the target face data as the face recognition result of the image to be detected; if the target face data is not obtained, re-screening, from the face database, the face data consistent with the accessory detection result of the image to be detected together with the face data consistent with the hairstyle detection result of the image to be detected (i.e. data matching either condition) as the face data to be compared, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vectors in the face data to be compared;
If the target face data are obtained, the identity information of the image to be detected is determined in the target face data and is used as the face recognition result of the image to be detected; if the target face data is not obtained, taking the remaining face data in the face database as the face data to be compared, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared;
if the target face data are obtained, the identity information of the image to be detected is determined in the target face data and is used as the face recognition result of the image to be detected; and if the target face data is not obtained, outputting a face recognition result of the image to be detected as that the face identity is not detected.
The seventh embodiment of the present invention will be described.
When there are multiple types of face attribute features, one or more of them may be adopted as the screening condition for screening the face data to be compared. It can be understood that the more screening conditions there are, the less face data to be compared is screened out; however, because of detection errors, the screened face data may not contain the face data matched with the image to be detected, in which case new face data to be compared can be screened out by adjusting the screening conditions.
In the face recognition method provided by the embodiment of the present invention, when the types of the face attribute features are plural, the face data to be compared, which is matched with the face attribute features of the image to be detected, is screened out from the face database, and the similarity calculation is performed on the face feature vector of the image to be detected and the face feature vector in the face data to be compared, so as to obtain the face recognition result of the image to be detected, which may include:
creating all face attribute feature subsets of all kinds of face attribute features;
selecting a face attribute feature subset, screening face data to be compared, which is matched with all face attribute features in the face attribute feature subset, from a face database, and carrying out similarity calculation on face feature vectors of an image to be detected and face feature vectors in the face data to be compared to obtain a face recognition result of the image to be detected;
if the target face data with the similarity score reaching the similarity threshold value is obtained, determining the identity information of the image to be detected in the target face data as a face recognition result of the image to be detected;
if the target face data is not obtained and the number of face attribute feature subsets not yet taken is not zero, selecting another face attribute feature subset and returning to the step of screening, from the face database, the face data to be compared which are matched with all face attribute features in the face attribute feature subset, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected;
If the target face data is not obtained and the number of the face attribute feature subsets which are not taken is zero, taking the remaining face data in the face database as the face data to be compared, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared;
if the target face data are obtained, the identity information of the image to be detected is determined in the target face data and is used as the face recognition result of the image to be detected;
and if the target face data is not obtained, outputting a face recognition result of the image to be detected as that the face identity is not detected.
When there are multiple types of face attribute features, different subsets of the full set of face attribute features can be created to serve as screening conditions for the face data to be compared. The face attribute feature subsets may be selected in order of the number of face attribute features they contain, from most to fewest; subsets containing the same number of face attribute features may be selected in random order. Selecting a face attribute feature subset may then include: sequentially selecting the face attribute feature subsets in order of the number of face attribute features from most to fewest.
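A sketch of this subset ordering (assuming Python's itertools; subsets are yielded from most to fewest attributes, with ties in arbitrary order):

```python
from itertools import combinations

def attribute_subsets(features):
    # features: dict mapping attribute name -> detected value.
    names = list(features)
    for size in range(len(names), 0, -1):        # most to fewest attributes
        for subset in combinations(names, size):
            yield {n: features[n] for n in subset}
```

For four attribute types this yields the 15 non-empty subsets in screening order.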
Because the face data corresponding to different face attribute feature subsets may overlap, screening the face data to be compared which are matched with all face attribute features in the face attribute feature subset from the face database, and performing similarity calculation on the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected, may include: screening the face data to be compared which are matched with all face attribute features in the face attribute feature subset from the face database, deleting the face data that have already been compared, and then carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected. The already-compared face data are removed during each screening, avoiding repeated similarity calculation and the resulting increase in the amount of calculation.
The eighth embodiment of the present invention will be described.
In addition to the screening sequence described in the foregoing embodiment, in the face recognition method provided by the embodiment of the present invention, when the types of the face attribute features are plural, the face data to be compared, which is matched with the face attribute features of the image to be detected, is screened out from the face database, and the similarity calculation is performed on the face feature vector of the image to be detected and the face feature vector in the face data to be compared, so as to obtain the face recognition result of the image to be detected, which may further include:
screening face data to be compared, which are matched with all face attribute features in the screening conditions, from a face database by taking all face attribute features of all types as screening conditions, and carrying out similarity calculation on face feature vectors of images to be detected and face feature vectors in the face data to be compared to obtain face recognition results of the images to be detected;
if the target face data with the similarity score reaching the similarity threshold value is obtained, determining the identity information of the image to be detected in the target face data as a face recognition result of the image to be detected;
if the target face data is not obtained and the number of face attribute features in the screening condition is not one, removing one face attribute feature from the screening condition and returning to the step of screening, from the face database, the face data to be compared which are matched with all face attribute features in the screening condition, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected;
If the target face data is not obtained and the number of face attribute features in the screening condition is one, taking the remaining face data in the face database as the face data to be compared, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared;
if the target face data are obtained, the identity information of the image to be detected is determined in the target face data and is used as the face recognition result of the image to be detected;
and if the target face data is not obtained, outputting a face recognition result of the image to be detected as that the face identity is not detected.
When the types of the face attribute features are multiple, the face attribute features can be removed from all the face attribute features to serve as the face data screening condition to be compared. For example, when the face attribute features include a face age detection result, a face gender detection result, glasses wearing information, and face hairstyle information, screening face data to be compared, which is matched with the face attribute features of the image to be detected, in a face database, and performing similarity calculation on a face feature vector of the image to be detected and a face feature vector in the face data to be compared to obtain a face recognition result of the image to be detected, which may include:
Screening face data which are consistent with the face age detection result, the face gender detection result, the glasses wearing information and the face hairstyle information of the image to be detected from a face database to serve as face data to be compared, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared;
if the target face data with the similarity score reaching the similarity threshold value is obtained, determining the identity information of the image to be detected in the target face data as a face recognition result of the image to be detected; if the target face data is not obtained, the face data which is consistent with the face age detection result, the face gender detection result and the glasses wearing information of the image to be detected is rescreened from the face database to be used as the face data to be compared, and the similarity calculation is carried out on the face feature vector of the image to be detected and the face feature vector in the face data to be compared;
if the target face data with the similarity score reaching the similarity threshold value is obtained, determining the identity information of the image to be detected in the target face data as a face recognition result of the image to be detected; if the target face data is not obtained, re-screening the face data which is consistent with the face age detection result and the face gender detection result of the image to be detected from a face database as the face data to be compared, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared;
If the target face data with the similarity score reaching the similarity threshold value is obtained, determining the identity information of the image to be detected in the target face data as a face recognition result of the image to be detected; if the target face data is not obtained, re-screening the face data consistent with the face gender detection result of the image to be detected from a face database as the face data to be compared, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared;
if the target face data with the similarity score reaching the similarity threshold value is obtained, determining the identity information of the image to be detected in the target face data as a face recognition result of the image to be detected; and if the target face data is not obtained, outputting a face recognition result of the image to be detected as that the face identity is not detected.
Because the face data corresponding to different screening conditions may overlap, screening the face data to be compared which are matched with all face attribute features in the screening condition from the face database, and performing similarity calculation on the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected, may include: screening the face data to be compared which are matched with all face attribute features in the screening condition from the face database, deleting the face data that have already been compared, and then carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected. The already-compared face data are removed during each screening, avoiding repeated similarity calculation and the resulting increase in the amount of calculation.
The following describes an embodiment nine of the present invention.
On the basis of the above embodiments, the present invention provides a face recognition method that can be implemented.
After training a face detection model, a face attribute detection model and a face feature extraction model, each face image and corresponding identity information in a face image dataset are sequentially read, and anchor boxes (anchor boxes) with different scales are output by utilizing network detection branches.
Let the vector X be the face information table of each face, X = {x1, x2, x3, x4}, where the information list elements x1, x2, x3, x4 respectively represent gender, age group, whether glasses are worn, and whether there are bangs. For example, it may be designed that x1=1 represents male and x1=0 female; the x2 value represents the age group estimate, with 0-5 representing the six age groups respectively; x3=1 if glasses are detected, otherwise x3=0; x4=1 if there are bangs and x4=0 if not. The system initializes X = {0, 0, 0, 0}.
The face detection model is called to perform forward propagation calculation, obtaining a series of face detection results including prior frames, key points and confidence scores; the coordinates are further corrected and converted to obtain the face frame position detection result (bbox) and the face key point coordinate detection result (landmark), from which the five sense organs coordinates are obtained. Forward propagation calculation is carried out using the face attribute detection model to obtain the age detection result and the gender detection result of the face image. The face attribute detection model and the face detection model can adopt the same model, detecting the face detection result and the face attribute features by setting the loss functions of different prediction calculation networks.
Poor detection results are then filtered out. Score filtering and/or non-maximum suppression face frame filtering may be employed. Score filtering: a confidence threshold is set, and detection results with scores lower than the confidence threshold are filtered out. Non-maximum suppression (NMS) face frame filtering: the detection frames (bbox) are sorted by confidence score from high to low, the detection frame with the highest confidence score is found, its intersection over union (IOU) with each of the other detection frames is calculated, and adjacent detection frames whose IOU exceeds a preset non-maximum suppression threshold are suppressed; the retained detection frames form the face detection output result. The detected gender and age group are filled into the face supplementary information list elements x1 and x2.
The upper left vertex of the face image is taken as the origin (0,0) of the coordinate system, with the horizontal direction as the x-axis (positive to the right) and the vertical direction as the y-axis (positive downward). The detection frame is taken as a rectangle, and its four vertex coordinates are obtained, with the upper left corner marked (X1, Y1) and the lower right corner marked (X2, Y2). (X1, Y1) is the upper left corner of the rectangle, X2-X1 is the width w of the rectangle, and Y2-Y1 is the height h of the rectangle.
The five sense organs coordinates are acquired according to the face key point coordinate detection result, and eye positioning is performed on the face image. Assume the left eye is marked (LX1, LY1) and the right eye (LX2, LY2). (LX1-0.1w, LY1-0.1h) is taken as the upper-left coordinate of the left-eye glasses frame and (LX1+0.1w, LY1+0.1h) as its lower-right coordinate; the pixel values within the frame are taken as the left glasses frame matrix elements, and all element values in the left glasses frame matrix are recorded. Likewise, (LX2-0.1w, LY2-0.1h) is taken as the upper-left coordinate of the right-eye glasses frame and (LX2+0.1w, LY2+0.1h) as its lower-right coordinate, and all element values in the right glasses frame matrix are recorded.
Each element in the eye rectangular frame matrix is compared in turn with the mean of its four adjacent elements; if the difference between a point and the mean of its neighborhood elements exceeds 64, the point is regarded as a numerical mutation. If more than q (q=5%) of the elements in the eye rectangular frame mutate, it is judged that glasses are worn on the face image, and the supplementary information list element x3=1; otherwise x3 is set to 0.
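A sketch of this numerical-mutation test (assuming a grayscale NumPy matrix; `np.roll` wraps at the edges, which is an acceptable approximation for an interior eye frame):

```python
import numpy as np

def frame_has_mutation(frame, diff=64, q=0.05):
    # frame: 2-D matrix of pixel values inside the eye rectangular frame.
    f = frame.astype(np.float64)
    neighbour_mean = (np.roll(f, 1, 0) + np.roll(f, -1, 0) +
                      np.roll(f, 1, 1) + np.roll(f, -1, 1)) / 4.0
    mutated = np.abs(f - neighbour_mean) > diff
    return mutated.mean() > q   # over q of the elements mutate -> glasses
```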
The forehead position is defined in the following manner: the upper-left corner of the forehead frame is (0.5×(X1+LX1), Y1) and the lower-right corner is (0.5×(X2+LX2), 0.5×(LY1+LY2)). The forehead frame circled in this way is expressed in matrix form, with the pixel values of the forehead frame as the elements of the forehead frame matrix. Taking an RGB image as an example, if more than c (e.g., c=80%) of the forehead frame matrix elements are lower than (128,48,48), it is determined that the face has bangs, and the supplementary information list element x4=1; otherwise x4 is set to 0.
And obtaining the geometric structure of the face according to the face frame position detection result (bbox) and the face key point coordinate detection result (landmark), and performing image matrix similarity transformation based on translation, scaling and rotation to obtain a cut standard face image (standard front face image).
Forward propagation calculation is carried out using the face feature extraction model, and the face feature vector is extracted from the standard face image.
All face images in the face image data set are read in turn and the above steps repeated to complete detection and recognition of all faces in the face image data set. A face database file A and a face database file B are created. File A stores the face information: for each face image it stores, by row, the corresponding identity information, face frame position detection result (bbox), confidence (score) and face feature vector, and the face information list containing the age detection result, gender detection result, glasses wearing information and bangs information is written into database file A row by row in the face image reading order. If there are N face images, file A contains N rows. File B records the index information of database file A: each row records the identity information of the corresponding face feature vector in database A, together with the face feature vector, detection frame, confidence, and the size and offset of the face information list, thereby obtaining the face database.
Carrying out the 1:N face comparison and face library template optimization:
When an image to be detected is read, a mass 1:N retrieval against the face database is required, and the step of screening the face data to be compared in the face database is executed first. The face information table of each face data is obtained, and the face information table corresponding to the image to be detected is obtained using the face detection model.
A feature vector set whose gender, age group, glasses prediction and bangs prediction are all identical to those of the image to be detected is screened out from the face database. Assume this set is M1 and the feature vector of the face to be compared is X. The cosine distances between X and all feature vectors in the M1 set are calculated, and if a similarity score exceeds the same-person threshold, the comparison result is returned. Otherwise, the next comparison is executed.
A feature vector set whose gender, age group and glasses prediction are identical to those of the image to be detected is screened out from the face database. Assume this set is M2 and the feature vector of the face to be compared is X. The cosine distances between X and all feature vectors in the M2 set are calculated, and if a similarity score exceeds the same-person threshold, the comparison result is returned. Otherwise, the next comparison is executed.
A feature vector set whose gender and age group are identical to those of the image to be detected is screened out from the face database. Assume this set is M3 and the feature vector of the face to be compared is X. The cosine distances between X and all feature vectors in the M3 set are calculated, and if a similarity score exceeds the same-person threshold, the comparison result is returned. Otherwise, the next comparison is executed.
A feature vector set whose gender is identical to that of the image to be detected is screened out from the face database. Assume this set is M4 and the feature vector of the face to be compared is X. The cosine distances between X and all feature vectors in the M4 set are calculated, and if a similarity score exceeds the same-person threshold, the comparison result is returned. Otherwise, a 1:N comparison over all feature vectors of the face database is executed.
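One screening round of this cascade can be sketched as follows (assuming NumPy; cosine similarity is used as the comparison score, and the candidate layout is an illustrative assumption):

```python
import numpy as np

def best_match(x, candidates, threshold):
    # x: feature vector to compare; candidates: list of (identity, vector).
    best_id, best_score = None, -1.0
    for identity, v in candidates:
        score = np.dot(x, v) / (np.linalg.norm(x) * np.linalg.norm(v))
        if score > best_score:
            best_id, best_score = identity, score
    # Return a hit only if it exceeds the same-person threshold.
    return best_id if best_score >= threshold else None
```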
If the image to be detected finds the corresponding identity information in the above steps, the face library template may be optimized in the following situation: assume the matched face template in the face database has face frame area bbox1 and confidence score1, while the image to be detected has face frame area bbox2 and confidence score2. If score2 is greater than k times score1 (k may be 2), or score2 is greater than t times score1 (t may be 1.2) and bbox2 is less than or equal to h times bbox1 (h may be 1.2), the detection effect of the face to be compared is considered better, and the original face template (corresponding face data) in the face database is replaced with the face data of the image to be detected.
If the face to be compared finds the corresponding identity information in the above steps, there is also the following situation: if score2 is greater than m times score1 (m may be 1) and the storage time of the face data in the face database is longer than n days (n may be 365), the original face template (corresponding face data) in the face database is replaced with the face data of the image to be detected.
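The two replacement rules combine into a single predicate; a sketch with the example values of k, t, h, m and n given above (`area1`/`area2` denote the face frame areas and `stored_days` the template age, illustrative names):

```python
def should_replace(score1, area1, score2, area2, stored_days,
                   k=2.0, t=1.2, h=1.2, m=1.0, n=365):
    # Rule 1: the new detection is clearly better than the stored template.
    better = score2 > k * score1 or (score2 > t * score1 and area2 <= h * area1)
    # Rule 2: the stored template is stale and the new detection is no worse.
    stale = score2 > m * score1 and stored_days > n
    return better or stale
```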
Various embodiments of the face recognition method are detailed above, and on the basis of the embodiments, the invention also discloses a face recognition device, equipment and a computer readable storage medium corresponding to the method.
The following describes the device embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a face recognition device according to an embodiment of the present invention.
As shown in fig. 3, the face recognition device provided by the embodiment of the present invention includes:
a model training unit 301, configured to train to obtain a face detection model and a face feature extraction model;
the database creating unit 302 is configured to extract a face detection feature of each face image in the face image dataset by using the face detection model, extract a face feature vector of each face image by using the face feature extraction model, and store the face detection feature of the face image, the face feature vector of the face image, and identity information of the face image in a correlated manner to obtain a face database;
A face detection unit 303, configured to detect a face detection feature of an image to be detected using a face detection model, and extract a face feature vector of the image to be detected using a face feature extraction model;
the face recognition unit 304 is configured to screen face data to be compared, which is matched with the face detection feature of the image to be detected, from the face database, and perform similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared, so as to obtain a face recognition result of the image to be detected;
the face recognition result comprises the following steps: the identity information of the image to be detected or the notification of the identity information of the image to be detected is not detected.
In some implementations, the face detection features include: at least one of an age detection result, a sex detection result, a decoration detection result, and a hairstyle detection result.
In some implementations, the library-building unit 302 or the face-detection unit 303 extracts the face-detection features using the face-detection model, may include:
extracting the enhanced features of the input image to obtain image feature parameters of the input image;
inputting image characteristic parameters of an input image into a face detection prediction function to obtain face detection characteristics of the input image;
The input image is a face image or an image to be detected.
The database unit 302 or the face detection unit 303 performs enhanced feature extraction on the input image to obtain image feature parameters of the input image, which may include:
carrying out N-stage feature extraction on an input image through N depthwise separable convolutions, and outputting N feature maps with different compression ratios;
after the feature images are adjusted to be the same in channel number, sequentially carrying out feature fusion on the feature images with the channel number adjusted according to the sequence of the compression ratio from large to small to obtain N corresponding feature fusion results;
respectively carrying out enhanced feature extraction on each feature fusion result to obtain a feature layer corresponding to each feature fusion result;
outputting the feature layers corresponding to the input image as the image feature parameters of the input image;
wherein N is a non-1 positive integer.
The library building unit 302 or the face detection unit 303 respectively performs enhanced feature extraction on each feature fusion result to obtain a feature layer corresponding to each feature fusion result, which may include:
respectively extracting the characteristics of each characteristic fusion result by using M convolution kernels with different sizes, and outputting a characteristic set corresponding to each characteristic fusion result;
splicing and stacking the corresponding feature sets to obtain feature layers corresponding to the feature fusion results;
Wherein M is a non-1 positive integer.
In some implementations, the library unit 302 or the face detection unit 303 inputs the image feature parameters of the input image into a face detection prediction function to obtain the face detection feature of the input image, including:
inputting image characteristic parameters of an input image into a face detection prediction function, and outputting a face detection result prediction value of the input image;
and carrying out prediction result correction and non-maximum value inhibition and filtration on the face detection result predicted value of the input image to obtain the face detection characteristics of the input image.
In some implementations, there are multiple types of face detection features;
the model training unit 301 training to obtain the face detection model may include:
superimposing the face detection prediction functions corresponding to each face detection feature to obtain an overall loss function;
performing gradient descent training on the face detection model to be trained by using the overall loss function, to obtain a face detection model that detects every type of face detection feature. A sketch of such a combined objective follows.
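A minimal sketch of the superimposed (multi-task) objective, assuming PyTorch tensors and criteria; the attribute names and the equal weighting are assumptions, since the text only states that the per-feature prediction functions are superimposed:

```python
def overall_loss(predictions, targets, loss_fns, weights=None):
    """Superimpose the per-feature losses into one training objective.

    `predictions` and `targets` map attribute names (e.g. 'age',
    'gender', 'keypoints') to tensors; `loss_fns` maps the same names
    to criteria such as nn.CrossEntropyLoss() or nn.SmoothL1Loss().
    """
    weights = weights or {name: 1.0 for name in loss_fns}
    return sum(weights[n] * loss_fns[n](predictions[n], targets[n])
               for n in loss_fns)

# Typical gradient descent step with the combined objective:
#   loss = overall_loss(model(images), batch_targets, loss_fns)
#   optimizer.zero_grad(); loss.backward(); optimizer.step()
```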
In some implementations, the library building unit 302 or the face detection unit 303 inputting the image feature parameters of the input image into a face detection prediction function to obtain the face detection features of the input image may include:
inputting the image feature parameters of the input image into a face attribute prediction function to obtain the face attribute features of the input image;
correspondingly, screening the face data to be compared that matches the face detection features of the image to be detected from the face database includes:
screening, from the face database, the face data to be compared that matches the face attribute features of the image to be detected.
In some implementations, the library building unit 302 or the face detection unit 303 inputting the image feature parameters of the input image into a face detection prediction function to obtain the face detection features of the input image may further include:
inputting the image feature parameters of the input image into a face coordinate prediction function to obtain a face coordinate detection result for the input image.
The library building unit 302 or the face detection unit 303 extracting the face feature vector of the input image by using the face feature extraction model may include:
extracting a standard face image from the input image based on the face coordinate detection result of the input image, and extracting the face feature vector of the input image from the standard face image corresponding to the input image by using the face feature extraction model, as sketched below.
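An illustrative sketch of this crop-and-embed step; the 112x112 standard size and the `embed_model` callable are assumptions, not details fixed by the embodiments:

```python
import cv2
import numpy as np

def extract_standard_face(image: np.ndarray, box, size=(112, 112)) -> np.ndarray:
    """Crop the detected face box and resize it to the standard input
    size expected by the feature extraction model."""
    x1, y1, x2, y2 = (int(v) for v in box)
    face = image[max(y1, 0):y2, max(x1, 0):x2]
    return cv2.resize(face, size)

def face_feature_vector(embed_model, image, box):
    """Run the feature extraction model on the standard face image;
    `embed_model` is a placeholder for the trained extractor."""
    standard_face = extract_standard_face(image, box)
    return embed_model(standard_face)  # e.g. a 512-d embedding
```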
In some implementations, the face attribute features include at least one of an accessory detection result and a hairstyle detection result;
the library building unit 302 or the face detection unit 303 inputting the image feature parameters of the input image into a face attribute prediction function to obtain the face attribute features of the input image may include:
obtaining the face attribute features of the input image according to the face key point coordinate detection result in the face detection features of the input image and the pixel value variation information of the input image.
In some implementations, when the face attribute feature is the glasses wearing detection result among the accessory detection results, the library building unit 302 or the face detection unit 303 obtaining the detection result of the face attribute feature of the input image according to the face key point coordinate detection result in the face detection features of the input image and the pixel value variation information of the input image may include:
obtaining the five-sense-organ coordinates of the input image according to the face key point coordinate detection result of the input image;
performing eye positioning on the input image according to the five-sense-organ coordinates of the input image to obtain the eye coordinates of the input image, and generating a glasses frame (a box surrounding the eye positions of the input image) according to the eye coordinates of the input image;
calculating a first difference value between the pixel values inside the glasses frame of the input image and the pixel values outside the glasses frame; if the first difference value exceeds the eye difference threshold, determining that the glasses wearing information of the input image is that glasses are worn; if the first difference value does not exceed the eye difference threshold, determining that the glasses wearing information of the input image is that glasses are not worn.
In some implementations, when the face attribute feature is the hairstyle detection result, the library building unit 302 or the face detection unit 303 obtaining the detection result of the face attribute feature of the input image according to the face key point coordinate detection result in the face detection features of the input image and the pixel value variation information of the input image may include:
obtaining the five-sense-organ coordinates of the input image according to the face key point coordinate detection result of the input image;
performing forehead positioning on the input image according to the five-sense-organ coordinates of the input image to obtain the forehead coordinates of the input image, and generating a forehead frame surrounding the forehead position of the input image according to the forehead coordinates of the input image;
calculating a second difference value between the pixel values inside the forehead frame of the input image and the pixel values outside the forehead frame; if the second difference value exceeds the forehead difference threshold, determining that the hairstyle detection result of the input image is with bangs; if the second difference value does not exceed the forehead difference threshold, determining that the hairstyle detection result of the input image is without bangs.
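Both attribute checks reduce to comparing mean pixel intensity inside and outside a landmark-derived box against a threshold. A sketch under assumed names (the boxes come from the eye and forehead positioning above; the 25.0 thresholds are illustrative, not values given in the text):

```python
import numpy as np

def region_contrast(gray: np.ndarray, box) -> float:
    """Difference between the mean pixel value inside a box and the mean
    pixel value outside it, on a grayscale image."""
    x1, y1, x2, y2 = box
    mask = np.zeros(gray.shape, dtype=bool)
    mask[y1:y2, x1:x2] = True
    return abs(float(gray[mask].mean()) - float(gray[~mask].mean()))

def wears_glasses(gray, eye_box, eye_threshold=25.0) -> bool:
    """First difference value compared against the eye difference threshold."""
    return region_contrast(gray, eye_box) > eye_threshold

def has_bangs(gray, forehead_box, forehead_threshold=25.0) -> bool:
    """Second difference value compared against the forehead difference threshold."""
    return region_contrast(gray, forehead_box) > forehead_threshold
```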
In some implementations, the face attribute features include an accessory detection result and a hairstyle detection result;
the face recognition unit 304 screening, from the face database, the face data to be compared that matches the face detection features of the image to be detected, and performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected, may include the following cascade (sketched after this list):
screening, from the face database, face data consistent with both the accessory detection result and the hairstyle detection result of the image to be detected as the face data to be compared, and performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared;
if target face data whose similarity score reaches the similarity threshold is obtained, determining the identity information of the image to be detected from the target face data as the face recognition result of the image to be detected; if no target face data is obtained, re-screening, from the face database, the face data consistent with the accessory detection result of the image to be detected and the face data consistent with the hairstyle detection result of the image to be detected (that is, data matching either attribute) as the face data to be compared, and performing the similarity calculation again;
if target face data is obtained, determining the identity information of the image to be detected from the target face data as the face recognition result of the image to be detected; if no target face data is obtained, taking the remaining face data in the face database as the face data to be compared and performing the similarity calculation again;
if target face data is obtained, determining the identity information of the image to be detected from the target face data as the face recognition result of the image to be detected; if no target face data is obtained, outputting the face recognition result of the image to be detected as face identity not detected.
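A sketch of this three-round fallback; the record layout, `match_fn`, and the 0.6 threshold are assumptions:

```python
def recognize_with_fallback(query_vec, database, accessory, hairstyle,
                            match_fn, threshold=0.6):
    """Three screening rounds: both attributes match, then either
    attribute matches, then the remaining records. `database` is a list
    of dicts with 'accessory', 'hairstyle', 'vector' and 'identity'
    keys; `match_fn` scores two feature vectors."""
    rounds = [
        lambda r: r["accessory"] == accessory and r["hairstyle"] == hairstyle,
        lambda r: r["accessory"] == accessory or r["hairstyle"] == hairstyle,
        lambda r: True,  # remaining face data, no attribute filter
    ]
    compared = set()
    for keep in rounds:
        for rec in database:
            if id(rec) in compared or not keep(rec):
                continue
            compared.add(id(rec))
            if match_fn(query_vec, rec["vector"]) >= threshold:
                return rec["identity"]  # identity information found
    return None  # face identity not detected
```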
In some implementations, the face attribute features include an age detection result and a gender detection result;
the face recognition unit 304 screening, from the face database, the face data to be compared that matches the face detection features of the image to be detected, and performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected, may include the same cascade with age and gender as the screening attributes:
screening, from the face database, face data consistent with both the age detection result and the gender detection result of the image to be detected as the face data to be compared, and performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared;
if target face data whose similarity score reaches the similarity threshold is obtained, determining the identity information of the image to be detected from the target face data as the face recognition result of the image to be detected; if no target face data is obtained, re-screening, from the face database, the face data consistent with the age detection result of the image to be detected and the face data consistent with the gender detection result of the image to be detected as the face data to be compared, and performing the similarity calculation again;
if target face data is obtained, determining the identity information of the image to be detected from the target face data as the face recognition result of the image to be detected; if no target face data is obtained, taking the remaining face data in the face database as the face data to be compared and performing the similarity calculation again;
if target face data is obtained, determining the identity information of the image to be detected from the target face data as the face recognition result of the image to be detected; if no target face data is obtained, outputting the face recognition result of the image to be detected as face identity not detected.
The face recognition unit 304, the library building unit 302, or the face detection unit 303 extracting face detection features by using the face detection model may include:
outputting anchor boxes of different scales for the input image by using the network detection branches, and invoking the face detection model to perform forward propagation on the input image to obtain candidate face detection results corresponding to each anchor box on the input image;
filtering out, from the candidate face detection results on the input image, candidates whose confidence scores are lower than the confidence threshold, and suppressing redundant adjacent box regression results through a non-maximum suppression algorithm according to their intersection-over-union against the non-maximum suppression threshold, to obtain the face detection result of the input image (see the sketch below);
the input image is the face image or the image to be detected.
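A sketch of the confidence filter plus standard non-maximum suppression; the 0.5 and 0.4 thresholds are illustrative. Standard NMS keeps a candidate only while its IoU with an already-kept, higher-scoring box stays below the suppression threshold:

```python
import numpy as np

def iou(box, boxes):
    """Intersection-over-union of one box against an array of boxes,
    all given as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(np.asarray(box)) + area(boxes) - inter + 1e-12)

def filter_detections(boxes, scores, conf_thresh=0.5, nms_thresh=0.4):
    """Drop low-confidence candidates, then apply non-maximum
    suppression: keep the highest-scoring box and discard neighbours
    whose IoU with it reaches the suppression threshold."""
    keep = scores >= conf_thresh
    boxes, scores = boxes[keep], scores[keep]
    order = np.argsort(scores)[::-1]
    kept = []
    while order.size:
        best = order[0]
        kept.append(best)
        rest = order[1:]
        order = rest[iou(boxes[best], boxes[rest]) < nms_thresh]
    return boxes[kept], scores[kept]
```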
In some implementations, there are multiple types of face detection features;
the face recognition unit 304 screening, from the face database, the face data to be compared that matches the face detection features of the image to be detected, and performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected, may include:
creating all subsets of the full set of face detection feature types;
selecting one face detection feature subset, screening, from the face database, the face data to be compared that matches all face detection features in the subset, and performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected;
if target face data whose similarity score reaches the similarity threshold is obtained, determining the identity information of the image to be detected from the target face data as the face recognition result of the image to be detected;
if no target face data is obtained and the number of untried face detection feature subsets is not zero, selecting another face detection feature subset and returning to the step of screening, from the face database, the face data to be compared that matches all face detection features in the subset and performing the similarity calculation;
if no target face data is obtained and the number of untried face detection feature subsets is zero, taking the remaining face data in the face database as the face data to be compared and performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared;
if target face data is obtained, determining the identity information of the image to be detected from the target face data as the face recognition result of the image to be detected;
if no target face data is obtained, outputting the face recognition result of the image to be detected as face identity not detected.
The face recognition unit 304 selecting a face detection feature subset may include:
selecting the face detection feature subsets sequentially in descending order of the number of face detection features they contain.
The face recognition unit 304 screening, from the face database, the face data to be compared that matches all face detection features in the face detection feature subset, and performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected, includes:
screening, from the face database, the face data to be compared that matches all face detection features in the subset, deleting the face data that has already been compared, and then performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected. A sketch of this subset-ordered search follows.
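A sketch of the subset-ordered search with already-compared records skipped; the record layout, `match_fn`, and threshold are assumptions:

```python
from itertools import combinations

def ordered_feature_subsets(features):
    """All non-empty subsets of the detected attribute features, visited
    from the largest (most restrictive) to the smallest."""
    names = list(features)
    for size in range(len(names), 0, -1):
        for combo in combinations(names, size):
            yield {k: features[k] for k in combo}

def subset_search(query_vec, database, features, match_fn, threshold=0.6):
    """Try each subset as a filter, skipping records already compared;
    fall back to the never-compared remainder of the database."""
    compared = set()
    for subset in ordered_feature_subsets(features):
        for rec in database:
            if id(rec) in compared:
                continue
            if all(rec["attrs"].get(k) == v for k, v in subset.items()):
                compared.add(id(rec))
                if match_fn(query_vec, rec["vector"]) >= threshold:
                    return rec["identity"]
    for rec in database:  # remaining face data
        if id(rec) not in compared and \
                match_fn(query_vec, rec["vector"]) >= threshold:
            return rec["identity"]
    return None  # face identity not detected
```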
In some implementations, there are multiple types of face detection features;
the face recognition unit 304 screening, from the face database, the face data to be compared that matches the face detection features of the image to be detected, and performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected, may include:
taking all types of face detection features as the screening condition, screening, from the face database, the face data to be compared that matches all face detection features in the screening condition, and performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected;
if target face data whose similarity score reaches the similarity threshold is obtained, determining the identity information of the image to be detected from the target face data as the face recognition result of the image to be detected;
if no target face data is obtained and the number of face detection features in the screening condition is not one, removing one face detection feature from the screening condition and returning to the step of screening, from the face database, the face data to be compared that matches all face detection features in the screening condition and performing the similarity calculation;
if no target face data is obtained and the number of face detection features in the screening condition is one, taking the remaining face data in the face database as the face data to be compared and performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared;
if target face data is obtained, determining the identity information of the image to be detected from the target face data as the face recognition result of the image to be detected;
if no target face data is obtained, outputting the face recognition result of the image to be detected as face identity not detected.
The face recognition unit 304 screening, from the face database, the face data to be compared that matches all face detection features in the screening condition, and performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected, may include:
screening, from the face database, the face data to be compared that matches all face detection features in the screening condition, deleting the face data that has already been compared, and then performing similarity calculation between the face feature vector of the image to be detected and the face feature vectors in the face data to be compared to obtain the face recognition result of the image to be detected. A sketch of this progressive relaxation follows.
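A sketch of the condition-relaxation variant; which attribute is dropped first is unspecified in the text, so dropping in reverse insertion order is an assumption here:

```python
def relaxed_search(query_vec, database, features, match_fn, threshold=0.6):
    """Start with every detected attribute as the screening condition,
    drop one attribute per unsuccessful round until one remains, then
    fall back to the rest of the database."""
    condition = dict(features)
    compared = set()
    while condition:
        for rec in database:
            if id(rec) in compared:
                continue
            if all(rec["attrs"].get(k) == v for k, v in condition.items()):
                compared.add(id(rec))
                if match_fn(query_vec, rec["vector"]) >= threshold:
                    return rec["identity"]
        if len(condition) == 1:
            break
        condition.popitem()  # relax: remove one face detection feature
    for rec in database:  # remaining face data as the final round
        if id(rec) not in compared and \
                match_fn(query_vec, rec["vector"]) >= threshold:
            return rec["identity"]
    return None  # face identity not detected
```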
Since the embodiments of the apparatus portion correspond to the embodiments of the method portion, reference may be made to the description of the method embodiments, and details are not repeated here.
An eleventh embodiment of the present invention will be described.
Fig. 4 is a schematic structural diagram of a face recognition device according to an embodiment of the present invention.
As shown in fig. 4, the face recognition device provided by the embodiment of the present invention includes:
a memory 410 for storing a computer program 411;
a processor 420 for executing the computer program 411, where the computer program 411, when executed by the processor 420, implements the steps of the face recognition method according to any one of the embodiments described above.
Processor 420 may include one or more processing cores, such as a 3-core or 8-core processor. The processor 420 may be implemented in at least one hardware form among a digital signal processor (DSP), a field-programmable gate array (FPGA), and a programmable logic array (PLA). Processor 420 may also include a main processor and a coprocessor: the main processor, also called the central processing unit (CPU), processes data in the awake state, while the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 420 may be integrated with a graphics processing unit (GPU), which is responsible for rendering the content to be displayed on the display screen. In some embodiments, the processor 420 may also include an artificial intelligence (AI) processor for handling computing operations related to machine learning.
Memory 410 may include one or more computer-readable storage media, which may be non-transitory. Memory 410 may also include high-speed random access memory as well as non-volatile memory, such as one or more magnetic disk storage devices or flash storage devices. In this embodiment, the memory 410 is at least used for storing a computer program 411 that, once loaded and executed by the processor 420, implements the relevant steps of the face recognition method disclosed in any one of the foregoing embodiments. In addition, the resources stored in the memory 410 may further include an operating system 412, data 413, and the like, stored either transiently or persistently. The operating system 412 may be Windows. The data 413 may include, but is not limited to, the data involved in the above-described method.
In some embodiments, the face recognition device may further include a display 430, a power supply 440, a communication interface 450, an input-output interface 460, a sensor 470, and a communication bus 480.
Those skilled in the art will appreciate that the structure shown in fig. 4 is not limiting of the face recognition device and may include more or fewer components than shown.
The face recognition device provided by the embodiment of the present invention includes the memory and the processor described above; when the processor executes the program stored in the memory, it can implement the face recognition method, achieving the same effects as described for the method.
The twelfth embodiment of the present invention will be described below.
It should be noted that the apparatus and device embodiments described above are merely exemplary. For example, the division into modules is merely a division by logical function; in actual implementation there may be other divisions, for example multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the couplings, direct couplings, or communication connections shown or discussed between components may be indirect couplings or communication connections via interfaces, devices, or modules, and may be electrical, mechanical, or in other forms. The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules; they may be located in one place or distributed over multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in each embodiment of the present invention may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated modules may be implemented in hardware or in software functional modules.
The integrated modules, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or in whole or in part, may be embodied in the form of a software product stored in a storage medium, which is used to perform all or part of the steps of the methods according to the embodiments of the present invention.
To this end, an embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored; the computer program, when executed by a processor, implements the steps of the face recognition method described above.
The computer-readable storage medium may include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
The computer program contained in the computer-readable storage medium provided in this embodiment, when executed by the processor, can implement the steps of the face recognition method described above, with the same effects.
The face recognition method, apparatus, device, and computer-readable storage medium provided by the present invention have been described above in detail. The embodiments in this description are described in a progressive manner, each focusing on its differences from the others, so identical or similar parts among the embodiments may be referred to one another. The apparatus, device, and computer-readable storage medium of the embodiments are described more briefly because they correspond to the methods of the embodiments; for the relevant parts, refer to the method section. It should be noted that those skilled in the art can make various modifications and adaptations of the invention without departing from its principles, and these modifications and adaptations are intended to fall within the scope of the invention as defined in the following claims.
It should also be noted that, in this specification, relational terms such as first and second are used solely to distinguish one entity or action from another, and do not necessarily require or imply any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," and any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

Claims (19)

1. A face recognition method, comprising:
training to obtain a face detection model and a face feature extraction model;
extracting face detection features of each face image in a face image dataset by using the face detection model, extracting face feature vectors of each face image by using the face feature extraction model, and carrying out association storage on the face detection features of the face image, the face feature vectors of the face image and identity information of the face image to obtain a face database;
detecting by using the face detection model to obtain face detection characteristics of an image to be detected, and extracting face characteristic vectors of the image to be detected by using the face characteristic extraction model;
screening out face data to be compared matched with the face detection features of the images to be detected from the face database, and carrying out similarity calculation on the face feature vectors of the images to be detected and the face feature vectors in the face data to be compared to obtain face recognition results of the images to be detected;
the face recognition result includes: the identity information of the image to be detected, or a notification that no identity information was detected for the image to be detected;
the face detection features are of a plurality of types;
the screening, from the face database, of the face data to be compared matched with the face detection features of the image to be detected, and performing similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared to obtain the face recognition result of the image to be detected, comprises:
creating all face detection feature subsets of all kinds of face detection features;
selecting one face detection feature subset, screening the face data to be compared matched with all face detection features in the face detection feature subset from the face database, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared to obtain a face recognition result of the image to be detected;
if target face data with similarity score reaching a similarity threshold is obtained, determining identity information of the image to be detected in the target face data as a face recognition result of the image to be detected;
if the target face data is not obtained and the number of the face detection feature subsets which have not been taken is not zero, selecting another face detection feature subset and returning to the step of screening, from the face database, the face data to be compared matched with all face detection features in the face detection feature subset and performing similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared to obtain the face recognition result of the image to be detected;
if the target face data is not obtained and the number of the face detection feature subsets which have not been taken is zero, taking the remaining face data in the face database as the face data to be compared, and performing similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared;
if the target face data are obtained, the identity information of the image to be detected is determined in the target face data to serve as a face recognition result of the image to be detected;
if the target face data is not obtained, outputting a face recognition result of the image to be detected as that the face identity is not detected;
extracting face detection features by using the face detection model comprises the following steps:
performing enhanced feature extraction on an input image to obtain image feature parameters of the input image;
inputting the image characteristic parameters of the input image into a face detection prediction function to obtain the face detection characteristics of the input image;
the input image is the face image or the image to be detected;
the step of inputting the image feature parameters of the input image into a face detection prediction function to obtain the face detection feature of the input image comprises the following steps:
Inputting the image characteristic parameters of the input image into a face attribute prediction function to obtain the face attribute characteristics of the input image;
the step of screening the face data to be compared matched with the face detection characteristics of the image to be detected from the face database comprises the following steps:
screening the face data to be compared matched with the face attribute characteristics of the image to be detected from the face database;
the inputting of the image feature parameters of the input image into a face detection prediction function to obtain the face detection features of the input image further comprises:
inputting the image characteristic parameters of the input image into a face coordinate prediction function to obtain a face coordinate detection result of the input image;
when the face attribute feature is a glasses wearing detection result in the accessory detection result, obtaining a detection result of the face attribute feature of the input image according to a face key point coordinate detection result in the face detection features of the input image and pixel value variation information of the input image comprises:
obtaining the five-sense-organ coordinates of the input image according to the face key point coordinate detection result of the input image;
performing eye positioning on the input image according to the five-sense-organ coordinates of the input image to obtain the eye coordinates of the input image, and generating a glasses frame surrounding the eye positions of the input image according to the eye coordinates of the input image;
calculating a first difference value between pixel values within the glasses frame of the input image and pixel values outside the glasses frame of the input image; if the first difference value exceeds an eye difference degree threshold, determining that the glasses wearing information of the input image is that glasses are worn; and if the first difference value does not exceed the eye difference degree threshold, determining that the glasses wearing information of the input image is that glasses are not worn.
2. The face recognition method of claim 1, wherein the face detection features comprise: at least one of an age detection result, a gender detection result, an accessory detection result, and a hairstyle detection result.
3. The face recognition method according to claim 1, wherein the performing the enhanced feature extraction on the input image to obtain the image feature parameters of the input image comprises:
performing N-stage progressive extraction on the input image through N depthwise separable convolutions, and outputting N feature maps with different compression ratios;
after adjusting the feature maps to the same number of channels, sequentially performing feature fusion on the channel-adjusted feature maps in descending order of compression ratio to obtain N corresponding feature fusion results;
performing enhanced feature extraction on each feature fusion result respectively to obtain a feature layer corresponding to each feature fusion result;
outputting the feature layers corresponding to the input image as the image feature parameters of the input image;
wherein N is a positive integer greater than 1.
4. A face recognition method according to claim 3, wherein the step of performing enhanced feature extraction on each of the feature fusion results to obtain a feature layer corresponding to each of the feature fusion results includes:
respectively extracting the characteristics of each characteristic fusion result by using M convolution kernels with different sizes, and outputting a characteristic set corresponding to each characteristic fusion result;
concatenating and stacking the feature set corresponding to each feature fusion result to obtain the feature layer corresponding to that feature fusion result;
wherein M is a positive integer greater than 1.
5. The face recognition method according to claim 1, wherein the step of inputting the image feature parameters of the input image into a face detection prediction function to obtain the face detection feature of the input image includes:
Inputting the image characteristic parameters of the input image into a face detection prediction function, and outputting a face detection result predicted value of the input image;
and performing prediction result correction and non-maximum suppression filtering on the face detection result predicted value of the input image to obtain the face detection features of the input image.
6. The face recognition method according to claim 1, wherein the kinds of the face detection features are plural;
training to obtain the face detection model, including:
superimposing the face detection prediction functions corresponding to each face detection feature to obtain an overall loss function;
and carrying out gradient descent training on the face detection model to be trained by using the overall loss function to obtain the face detection model for detecting the face detection characteristics of each type.
7. The face recognition method according to claim 1, wherein extracting the face feature vector of the input image using the face feature extraction model includes:
and extracting a standard face image from the input image based on the face coordinate detection result of the input image, and extracting a face feature vector of the input image from the standard face image corresponding to the input image by using the face feature extraction model.
8. The face recognition method according to claim 1, wherein the face attribute features comprise at least one of an accessory detection result and a hairstyle detection result;
the step of inputting the image characteristic parameters of the input image into a face attribute prediction function to obtain the face attribute characteristics of the input image comprises the following steps:
and obtaining the face attribute characteristics of the input image according to the face key point coordinate detection result in the face detection characteristics of the input image and the pixel value change information of the input image.
9. The face recognition method according to claim 8, wherein when the face attribute feature is a hairstyle detection result, the obtaining the detection result of the face attribute feature of the input image according to the face key point coordinate detection result in the face detection feature of the input image and the pixel value variation information of the input image includes:
obtaining the five-sense-organ coordinates of the input image according to the face key point coordinate detection result of the input image;
performing forehead positioning on the input image according to the five-sense-organ coordinates of the input image to obtain the forehead coordinates of the input image, and generating a forehead frame surrounding the forehead position of the input image according to the forehead coordinates of the input image;
calculating a second difference value between a pixel value in the forehead frame of the input image and a pixel value outside the forehead frame of the input image; if the second difference value exceeds the forehead difference threshold, determining that the hairstyle detection result of the input image is with bangs; and if the second difference value does not exceed the forehead difference threshold, determining that the hairstyle detection result of the input image is without bangs.
10. The face recognition method according to claim 8, wherein the face attribute features comprise an accessory detection result and a hairstyle detection result;
the screening, from the face database, of the face data to be compared matched with the face detection features of the image to be detected, and performing similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared to obtain the face recognition result of the image to be detected, comprises:
screening face data which are consistent with both the accessory detection result of the image to be detected and the hairstyle detection result of the image to be detected from the face database to serve as the face data to be compared, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared;
If target face data with similarity score reaching a similarity threshold is obtained, determining identity information of the image to be detected in the target face data as a face recognition result of the image to be detected; if the target face data is not obtained, re-screening the face data which is consistent with the accessory detection result of the image to be detected and the face data which is consistent with the hairstyle detection result of the image to be detected from the face database, taking the face data as the face data to be compared, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared;
if the target face data are obtained, the identity information of the image to be detected is determined in the target face data to serve as a face recognition result of the image to be detected; if the target face data is not obtained, taking the remaining face data in the face database as the face data to be compared, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared;
if the target face data are obtained, the identity information of the image to be detected is determined in the target face data to serve as a face recognition result of the image to be detected; and if the target face data is not obtained, outputting a face recognition result of the image to be detected as that the face identity is not detected.
11. The face recognition method according to claim 1, wherein the face attribute features include an age detection result and a gender detection result;
the screening, from the face database, of the face data to be compared matched with the face detection features of the image to be detected, and performing similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared to obtain the face recognition result of the image to be detected, comprises:
screening face data which are consistent with both the age detection result of the image to be detected and the gender detection result of the image to be detected from the face database as the face data to be compared, and performing similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared;
if target face data with a similarity score reaching a similarity threshold is obtained, determining the identity information of the image to be detected in the target face data as a face recognition result of the image to be detected; if the target face data is not obtained, re-screening face data which is consistent with the age detection result of the image to be detected and face data which is consistent with the gender detection result of the image to be detected from the face database as the face data to be compared, and performing similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared;
If the target face data are obtained, the identity information of the image to be detected is determined in the target face data to serve as a face recognition result of the image to be detected; if the target face data is not obtained, taking the remaining face data in the face database as the face data to be compared, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared;
if the target face data are obtained, the identity information of the image to be detected is determined in the target face data to serve as a face recognition result of the image to be detected; and if the target face data is not obtained, outputting a face recognition result of the image to be detected as that the face identity is not detected.
12. The face recognition method according to claim 1, wherein extracting face detection features using the face detection model comprises:
outputting anchor boxes of different scales for an input image by using the network detection branches, and invoking the face detection model to perform forward propagation calculation on the input image to obtain candidate face detection results corresponding to each anchor box on the input image;
filtering out candidate face detection results with confidence scores lower than a confidence threshold from the candidate face detection results on the input image, and suppressing redundant adjacent box regression results through a non-maximum suppression algorithm according to their intersection-over-union against the non-maximum suppression threshold, to obtain the face detection result of the input image;
the input image is the face image or the image to be detected.
13. The method of claim 12, wherein said selecting one of said face detection feature subsets comprises:
selecting the face detection feature subsets sequentially in descending order of the number of face detection features they contain.
14. The face recognition method according to claim 12, wherein the step of screening the face database for the face data to be compared that matches all the face detection features in the face detection feature subset, and performing similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared to obtain the face recognition result of the image to be detected includes:
and screening the face data to be compared matched with all face detection features in the face detection feature subset from the face database, deleting the compared face data to be compared, and then carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared to obtain the face recognition result of the image to be detected.
15. The face recognition method according to claim 1, wherein the kinds of face detection features are plural;
the screening, from the face database, of the face data to be compared matched with the face detection features of the image to be detected, and performing similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared to obtain the face recognition result of the image to be detected, comprises:
taking all types of face detection features as screening conditions, screening the face data to be compared matched with all the face detection features in the screening conditions from the face database, and carrying out similarity calculation on the face feature vectors of the image to be detected and the face feature vectors in the face data to be compared to obtain a face recognition result of the image to be detected;
if target face data with similarity score reaching a similarity threshold is obtained, determining identity information of the image to be detected in the target face data as a face recognition result of the image to be detected;
if the target face data is not obtained and the number of the face detection features in the screening condition is not one, removing one face detection feature from the screening condition and returning to the step of screening, from the face database, the face data to be compared matched with all the face detection features in the screening condition and performing similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared to obtain the face recognition result of the image to be detected;
If the target face data is not obtained and the number of face detection features in the screening condition is one, taking the remaining face data in the face database as the face data to be compared, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared;
if the target face data are obtained, the identity information of the image to be detected is determined in the target face data to serve as a face recognition result of the image to be detected;
and if the target face data is not obtained, outputting a face recognition result of the image to be detected as that the face identity is not detected.
16. The face recognition method according to claim 15, wherein the step of screening the face data to be compared, which is matched with all face detection features in the screening condition, from the face database, and performing similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared, to obtain a face recognition result of the image to be detected, includes:
and screening the face data to be compared matched with all face detection features in the screening conditions from the face database, deleting the compared face data to be compared, and then carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared to obtain the face recognition result of the image to be detected.
17. A face recognition device, comprising:
the model training unit is used for training to obtain a face detection model and a face feature extraction model;
the database building unit is used for extracting the face detection characteristics of each face image in the face image data set by utilizing the face detection model, extracting the face characteristic vector of each face image by utilizing the face characteristic extraction model, and carrying out association storage on the face detection characteristics of the face image, the face characteristic vector of the face image and the identity information of the face image to obtain a face database;
the face detection unit is used for detecting the face detection characteristics of the image to be detected by using the face detection model, and extracting the face characteristic vectors of the image to be detected by using the face characteristic extraction model;
the face recognition unit is used for screening face data to be compared, which is matched with the face detection features of the image to be detected, from the face database, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared to obtain a face recognition result of the image to be detected;
The face recognition result includes: the identity information of the image to be detected, or a notification that no identity information was detected for the image to be detected;
the face detection features are of a plurality of types;
the screening, from the face database, of the face data to be compared matched with the face detection features of the image to be detected, and performing similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared to obtain the face recognition result of the image to be detected, comprises:
creating all face detection feature subsets of all kinds of face detection features;
selecting one face detection feature subset, screening the face data to be compared matched with all face detection features in the face detection feature subset from the face database, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared to obtain a face recognition result of the image to be detected;
if target face data with similarity score reaching a similarity threshold is obtained, determining identity information of the image to be detected in the target face data as a face recognition result of the image to be detected;
if the target face data is not obtained and the number of the face detection feature subsets which have not been taken is not zero, selecting another face detection feature subset and returning to the step of screening, from the face database, the face data to be compared matched with all face detection features in the face detection feature subset and performing similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared to obtain the face recognition result of the image to be detected;
if the target face data is not obtained and the number of the face detection feature subsets which are not taken is zero, taking the remaining face data in the face database as the face data to be compared, and carrying out similarity calculation on the face feature vector of the image to be detected and the face feature vector in the face data to be compared;
if the target face data are obtained, the identity information of the image to be detected is determined in the target face data to serve as a face recognition result of the image to be detected;
if the target face data is not obtained, outputting a face recognition result of the image to be detected as that the face identity is not detected;
Extracting face detection features by using the face detection model comprises the following steps:
performing enhanced feature extraction on an input image to obtain image feature parameters of the input image;
inputting the image characteristic parameters of the input image into a face detection prediction function to obtain the face detection characteristics of the input image;
the input image is the face image or the image to be detected;
the step of inputting the image feature parameters of the input image into a face detection prediction function to obtain the face detection feature of the input image comprises the following steps:
inputting the image characteristic parameters of the input image into a face attribute prediction function to obtain the face attribute characteristics of the input image;
the step of screening the face data to be compared matched with the face detection characteristics of the image to be detected from the face database comprises the following steps:
screening the face data to be compared matched with the face attribute characteristics of the image to be detected from the face database;
the inputting of the image feature parameters of the input image into a face detection prediction function to obtain the face detection features of the input image further comprises:
inputting the image characteristic parameters of the input image into a face coordinate prediction function to obtain a face coordinate detection result of the input image;
When the face attribute feature is a glasses wearing detection result in the accessory detection result, obtaining a detection result of the face attribute feature of the input image according to a face key point coordinate detection result in the face detection feature of the input image and pixel value change information of the input image, wherein the detection result comprises:
obtaining the five-sense-organ coordinates of the input image according to the face key point coordinate detection result of the input image;
performing eye positioning on the input image according to the five-sense-organ coordinates of the input image to obtain the eye coordinates of the input image, and generating a glasses frame surrounding the eye positions of the input image according to the eye coordinates of the input image;
calculating a first difference value between pixel values within the glasses frame of the input image and pixel values outside the glasses frame of the input image; if the first difference value exceeds an eye difference degree threshold, determining that the glasses wearing information of the input image is that glasses are worn; and if the first difference value does not exceed the eye difference degree threshold, determining that the glasses wearing information of the input image is that glasses are not worn.
18. A face recognition device, comprising:
A memory for storing a computer program;
a processor for executing the computer program, which when executed by the processor performs the steps of the face recognition method according to any one of claims 1 to 16.
19. A computer readable storage medium having stored thereon a computer program, which when executed by a processor performs the steps of the face recognition method according to any one of claims 1 to 16.
CN202311155058.9A 2023-09-08 2023-09-08 Face recognition method, device, equipment and computer readable storage medium Active CN116895093B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311155058.9A CN116895093B (en) 2023-09-08 2023-09-08 Face recognition method, device, equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN116895093A CN116895093A (en) 2023-10-17
CN116895093B (en) 2024-01-23

Family

ID=88315159

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311155058.9A Active CN116895093B (en) 2023-09-08 2023-09-08 Face recognition method, device, equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN116895093B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105868716A (en) * 2016-03-29 2016-08-17 中国科学院上海高等研究院 Method for human face recognition based on face geometrical features
CN108846694A (en) * 2018-06-06 2018-11-20 厦门集微科技有限公司 Elevator card placement method and device, and computer readable storage medium
CN109255289A (en) * 2018-07-27 2019-01-22 电子科技大学 Cross-age face recognition method based on a unified generative model
CN110838102A (en) * 2019-10-24 2020-02-25 开望(杭州)科技有限公司 Intelligent image uploading method
CN112528261A (en) * 2020-12-30 2021-03-19 楚天龙股份有限公司 Method and device for identifying user identity of SIM card
CN112949468A (en) * 2021-02-26 2021-06-11 深圳壹账通智能科技有限公司 Face recognition method and device, computer equipment and storage medium
CN114065163A (en) * 2021-11-09 2022-02-18 中芯智能时代(深圳)有限公司 Display mainboard and terminal with face identification and identity verification functions
CN115966006A (en) * 2022-12-22 2023-04-14 山西警察学院 Cross-age face recognition system based on deep learning model

Also Published As

Publication number Publication date
CN116895093A (en) 2023-10-17

Similar Documents

Publication Publication Date Title
CN107527318B (en) Hair style replacement method based on generation countermeasure network model
Johnson et al. Sparse coding for alpha matting
CN109472199B (en) Image fusion classification method and device
EP4293567A1 (en) Three-dimensional face reconstruction method and apparatus, device, and storage medium
CN109002763B (en) Method and device for simulating human face aging based on homologous continuity
CN109784283A (en) Based on the Remote Sensing Target extracting method under scene Recognition task
CN113762138B (en) Identification method, device, computer equipment and storage medium for fake face pictures
EP4047509A1 (en) Facial parsing method and related devices
CN107784288A (en) A kind of iteration positioning formula method for detecting human face based on deep neural network
CN111696080B (en) Face fraud detection method, system and storage medium based on static texture
CN111768457B (en) Image data compression method, device, electronic equipment and storage medium
CN109214298A (en) A kind of Asia women face value Rating Model method based on depth convolutional network
CN110866938B (en) Full-automatic video moving object segmentation method
CN114187624A (en) Image generation method, image generation device, electronic equipment and storage medium
CN111353385B (en) Pedestrian re-identification method and device based on mask alignment and attention mechanism
CN112906813A (en) Flotation condition identification method based on density clustering and capsule neural network
CN112101195A (en) Crowd density estimation method and device, computer equipment and storage medium
CN114565602A (en) Image identification method and device based on multi-channel fusion and storage medium
CN113392791A (en) Skin prediction processing method, device, equipment and storage medium
CN112949707A (en) Cross-mode face image generation method based on multi-scale semantic information supervision
CN113378812A (en) Digital dial plate identification method based on Mask R-CNN and CRNN
Zhou et al. Attention transfer network for nature image matting
CN116895093B (en) Face recognition method, device, equipment and computer readable storage medium
CN116912918B (en) Face recognition method, device, equipment and computer readable storage medium
CN116798041A (en) Image recognition method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant