WO2021232985A1 - Face recognition method and apparatus, computer device, and storage medium - Google Patents

Face recognition method and apparatus, computer device, and storage medium

Info

Publication number
WO2021232985A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
image
face
face image
model
Prior art date
Application number
PCT/CN2021/085978
Other languages
English (en)
French (fr)
Inventor
许剑清
沈鹏程
李绍欣
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Publication of WO2021232985A1
Priority to US 17/744,260 (US11816880B2)



Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/22 - Matching criteria, e.g. proximity measures
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V 10/751 - Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G06V 10/761 - Proximity, similarity or dissimilarity measures
    • G06V 10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/7715 - Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V 10/776 - Validation; Performance evaluation
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 - Feature extraction; Face representation
    • G06V 40/172 - Classification, e.g. identification

Definitions

  • the embodiments of the present application relate to the field of computer technology, and in particular, to a face recognition method, device, computer equipment, and storage medium.
  • Face recognition is a kind of biometric recognition technology based on human facial features.
  • With the rapid development of artificial intelligence technology, artificial-intelligence-based face recognition is used more and more widely in daily life, for example to verify user identity in scenarios such as face-recognition payment and face-recognition login. In such scenarios, inaccurate face recognition means that the security of identity verification cannot be guaranteed. Therefore, how to improve the accuracy of face recognition has become an urgent problem to be solved.
  • the embodiments of the present application provide a face recognition method, device, computer equipment, and storage medium, which can improve the accuracy of face recognition.
  • the technical solution includes the following contents.
  • In one aspect, a face recognition method includes: performing feature extraction on a target face image to obtain a first feature image corresponding to the target face image and a first feature vector corresponding to the first feature image, the first feature image being used to represent the face features of the target face image; processing the first feature image to obtain a first feature value corresponding to the first feature image, the first feature value being used to indicate the uncertainty corresponding to the first feature image, where the uncertainty refers to the degree of difference between the face features included in the first feature image and the face features in the target face image; obtaining the similarity between the target face image and a template face image according to the first feature vector, the first feature value, a second feature vector, and a second feature value, where the second feature vector is the feature vector corresponding to the second feature image of the template face image, the second feature value is the feature value corresponding to the second feature image, and the second feature value is used to represent the uncertainty corresponding to the second feature image, that is, the degree of difference between the face features included in the second feature image and the face features in the template face image; and determining that the target face image matches the template face image in a case where the similarity is greater than a preset threshold.
  • In another aspect, a face recognition device includes: a feature extraction module, configured to perform feature extraction on a target face image to obtain a first feature image corresponding to the target face image and a first feature vector corresponding to the first feature image, the first feature image being used to represent the face features of the target face image; a feature value acquisition module, configured to process the first feature image to obtain a first feature value corresponding to the first feature image, the first feature value being used to indicate the uncertainty corresponding to the first feature image, where the uncertainty refers to the degree of difference between the face features included in the first feature image and the face features in the target face image; and a similarity acquisition module, configured to obtain the similarity between the target face image and a template face image according to the first feature vector, the first feature value, a second feature vector, and a second feature value, where the second feature vector is the feature vector corresponding to the second feature image of the template face image, the second feature value is the feature value corresponding to the second feature image, and the second feature value is used to indicate the uncertainty corresponding to the second feature image.
  • In another aspect, a computer device includes a processor and a memory, the memory storing at least one instruction, the at least one instruction being loaded and executed by the processor to implement the above face recognition method.
  • a computer-readable storage medium is provided, and at least one instruction is stored in the computer-readable storage medium, and the at least one instruction is loaded and executed by a processor to implement the face recognition method as described above.
  • With the method, device, computer equipment, and storage medium provided by the embodiments of the present application, the first feature image corresponding to the target face image and the first feature vector and first feature value corresponding to the first feature image are acquired; the similarity between the target face image and the template face image is obtained according to the first feature vector, the first feature value, and the second feature vector and second feature value corresponding to the second feature image of the template face image; and, in a case where the similarity is greater than the preset threshold, it is determined that the target face image matches the template face image.
  • The first feature value represents the uncertainty corresponding to the first feature image, the second feature value represents the uncertainty corresponding to the second feature image, and this uncertainty represents the degree of difference between a feature image and the corresponding face image.
  • In addition, the features of the target face image are mapped into a hyperspherical space to obtain the first feature image corresponding to the target face image. The hyperspherical space is more consistent with the feature space of human faces, so extracting face features in the hyperspherical space makes the extracted features more accurate and further improves the accuracy of face recognition.
  • During training, a sample face image and the sample feature vector corresponding to the sample face image are obtained, the feature extraction sub-model is called to extract the predicted feature image and predicted feature vector of the sample face image, and the feature extraction sub-model is trained accordingly.
  • The central feature vector of the face identifier to which the sample face image belongs is obtained, the prediction sub-model is called to obtain the predicted feature value corresponding to the predicted feature image, a third loss value is obtained according to the predicted feature vector, the central feature vector, and the predicted feature value, and the prediction sub-model is trained according to the third loss value.
  • After training, the face recognition model including the feature extraction sub-model and the prediction sub-model can be used for face recognition.
  • Because the prediction sub-model is introduced, when the similarity between the target face image and the template face image is obtained, the impact of the feature values output by the prediction sub-model on the similarity, that is, the impact of the uncertainty of the feature images on the similarity, is also considered, instead of only the feature vectors corresponding to the feature images. This effectively reduces the situation in which interference factors in the face image prevent the feature vector from accurately representing the face, which improves the accuracy of face recognition and reduces the false positive rate of face recognition.
  • Furthermore, the feature extraction sub-model is trained first and then kept unchanged, and the prediction sub-model is trained according to the sample feature vector and the central feature vector of the face identifier to which the sample face image belongs. Therefore, in some embodiments, the training process of the face recognition model is divided into a training phase for the feature extraction sub-model and a training phase for the prediction sub-model.
  • By reusing the sample face images obtained for training the feature extraction sub-model, the prediction sub-model can be trained without retraining a new feature extraction sub-model and without re-collecting sample face images.
  • Fig. 1 is a schematic diagram of a face recognition model provided by an embodiment of the present application.
  • Fig. 2 is a schematic diagram of another face recognition model provided by an embodiment of the present application.
  • FIG. 3 is a flowchart of a face recognition method provided by an embodiment of the present application.
  • FIG. 4 is a flowchart of another face recognition method provided by an embodiment of the present application.
  • FIG. 5 is a flowchart of another face recognition method provided by an embodiment of the present application.
  • FIG. 6 shows a face recognition result provided by an embodiment of the present application and a face recognition result provided by the related art.
  • FIG. 7 is a flowchart of a method for training a face recognition model provided by an embodiment of the present application.
  • FIG. 8 is a flowchart of a training model and a deployment model provided by an embodiment of the present application.
  • FIG. 9 is a flowchart of a training feature extraction sub-model provided by an embodiment of the present application.
  • FIG. 10 is a flowchart of training a prediction sub-model provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a face recognition device provided by an embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of another face recognition device provided by an embodiment of the present application.
  • FIG. 13 is a schematic structural diagram of a terminal provided by an embodiment of the present application.
  • FIG. 14 is a schematic structural diagram of a server provided by an embodiment of the present application.
  • the terms “first”, “second”, etc. used in this application can be used herein to describe various concepts, but unless otherwise specified, these concepts are not limited by these terms. These terms are only used to distinguish one concept from another.
  • For example, the first feature image may be referred to as the second feature image, and similarly, the second feature image may be referred to as the first feature image.
  • multiple refers to two or more than two.
  • the multiple face images are any integer face images greater than or equal to two, such as two face images, three face images, and the like.
  • Each refers to each of at least one.
  • For example, each face identifier refers to each of multiple face identifiers; if the multiple face identifiers are three face identifiers, then each face identifier refers to each of the three face identifiers.
  • Artificial Intelligence (AI) refers to using digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • artificial intelligence is a comprehensive technology of computer science, which attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a similar way to human intelligence.
  • Artificial intelligence is to study the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
  • Artificial intelligence technology is a comprehensive discipline, covering a wide range of fields, including both hardware-level technology and software-level technology.
  • Basic artificial intelligence technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, and mechatronics.
  • Artificial intelligence software technology includes natural language processing technology and machine learning.
  • Machine Learning is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other subjects. It specializes in studying how computers simulate or implement human learning behaviors in order to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve their own performance.
  • Machine learning is the core of artificial intelligence, the fundamental way to make computers intelligent, and its applications cover all fields of artificial intelligence.
  • Machine learning and deep learning usually include artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, teaching learning and other technologies.
  • Computer Vision is a science that studies how to make machines "see"; more specifically, it refers to using cameras and computers instead of human eyes to identify, track, and measure targets, and to further perform graphics processing so that the processed image is more suitable for human eyes to observe or for transmission to instruments for detection.
  • Computer vision technology usually includes image processing, image recognition, image semantic understanding, image retrieval, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, virtual reality, augmented reality, simultaneous localization and mapping, and other technologies, and also includes common biometric recognition technologies such as face recognition and fingerprint recognition.
  • The face recognition method provided in the embodiments of the present application involves artificial intelligence technology and computer vision technology, and is described through the embodiments below.
  • the embodiment of the present application provides a face recognition method, and the face recognition method is executed by a computer device.
  • the computer equipment calls the face recognition model to realize the recognition of the face in the face image.
  • the computer device is a terminal, and the terminal is a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc.
  • In some embodiments, the computer device is a server; the server is an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud storage, network services, cloud communications, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms.
  • the method provided in the embodiment of the present application can be applied in any scene of face recognition.
  • the terminal pre-stores the user's template face image.
  • When the terminal detects that an online payment is to be made, it needs to verify the identity of the current user. The terminal collects the currently input target face image, calls the face recognition model provided in the embodiments of this application, and processes the collected target face image and the pre-stored template face image through the following steps 401-408 to obtain the similarity between the target face image and the template face image. In a case where the similarity is greater than the preset threshold, it is determined that the target face image matches the template face image, that is, the user corresponding to the target face image is the user corresponding to the template face image; the current user then passes identity verification and has the authority to complete the online payment.
  • In other embodiments, the face recognition method provided in the embodiments of the present application is applied to an access control system, an application that logs in through face recognition, or other systems that authenticate a user's identity through face recognition.
  • the face recognition model 11 provided in this embodiment of the present application includes: a feature extraction sub-model 101 and a prediction sub-model 102. Among them, the feature extraction sub-model 101 and the prediction sub-model 102 are connected. The feature extraction sub-model 101 is used to extract feature images and feature vectors corresponding to the face image, and the prediction sub-model 102 is used to obtain corresponding feature values according to the feature images.
  • the feature extraction sub-model 101 includes: a feature extraction layer 111 and a feature mapping layer 121.
  • the feature extraction layer 111 is connected to the feature mapping layer 121.
  • the feature extraction layer 111 is used to extract a corresponding feature image based on the face image
  • the feature mapping layer 121 is used to obtain a corresponding feature vector based on the feature image.
  • the face recognition model 22 provided in this embodiment of the present application includes: a feature extraction sub-model 201, a prediction sub-model 202, and a loss acquisition sub-model 203.
  • the feature extraction sub-model 201 includes: a feature extraction layer 211 and a feature mapping layer 221. Among them, the feature extraction layer 211 is connected to the feature mapping layer 221. The feature extraction layer 211 is used to extract a corresponding feature image based on the face image, and the feature mapping layer 221 is used to obtain a corresponding feature vector based on the feature image.
  • Fig. 3 is a flowchart of a face recognition method provided by an embodiment of the present application.
  • the embodiments of the present application are executed by a computer device. Referring to FIG. 3, the method includes the following steps.
  • the computer device When the computer device obtains the target face image, it performs feature extraction on the target face image to obtain the first feature image corresponding to the target face image and the first feature vector corresponding to the first feature image.
  • the first feature image refers to an image that represents the facial features of the target facial image.
  • the facial features of the facial image include the depth features, texture features, color features, etc. of the facial image.
  • the first feature vector refers to a vector representing the feature of the target face image, for example, the first feature vector is a multi-dimensional vector.
  • the computer device When the computer device obtains the first feature image corresponding to the target face image, it processes the first feature image to obtain the first feature value corresponding to the first feature image.
  • the first characteristic value is used to indicate the uncertainty corresponding to the first characteristic image.
  • Uncertainty refers to the unreliability of a processing result caused by errors in the processing procedure; it represents the degree to which the first feature image can accurately describe the face features, that is, the unreliability corresponding to the first feature image.
  • The uncertainty refers to the degree of difference between the face features included in the first feature image and the face features in the target face image. The smaller the first feature value, the more accurately the first feature image describes the face features in the target face image, and the smaller the difference between the face features included in the first feature image and the face features in the target face image; the larger the first feature value, the less accurately the first feature image describes the face features in the target face image, and the greater the difference between the face features included in the first feature image and the face features in the target face image.
  • The face recognition process in the embodiments of this application refers to recognizing a target face image against a template face image to determine whether the target face image matches the template face image, where the template face image is a pre-stored face image and the target face image is the currently acquired image on which face recognition needs to be performed.
  • the computer device obtains the second feature vector corresponding to the second feature image of the template face image and the second feature value corresponding to the second feature image, according to the first feature vector , The first feature value, the second feature vector, and the second feature value to obtain the similarity between the target face image and the template face image.
  • the second feature image refers to an image representing the features of the template face image
  • The second feature vector refers to a vector representing the features of the template face image, that is, the second feature vector is the feature vector corresponding to the second feature image of the template face image.
  • The second feature value refers to the feature value corresponding to the second feature image and is used to indicate the uncertainty corresponding to the second feature image.
  • The uncertainty corresponding to the second feature image represents the degree to which the second feature image can accurately describe the face features, that is, the degree of difference between the face features included in the second feature image and the face features in the template face image.
  • A smaller similarity means that the probability that the target face image matches the template face image is smaller.
  • When the computer device obtains the similarity between the target face image and the template face image, it compares the similarity with a preset threshold. If the similarity is greater than the preset threshold, it is determined that the target face image matches the template face image, and face recognition passes. If the similarity is not greater than the preset threshold, it is determined that the target face image does not match the template face image; in this case, the target face image is matched against the next template face image, until it is determined that the target face image matches some template face image, in which case face recognition passes, or until it is determined that the target face image does not match any of the stored template face images, in which case face recognition fails.
  • the preset threshold is set by the computer device by default, or the preset threshold is set by the developer through the computer device.
  • The method provided by the embodiments of the present application obtains the first feature image corresponding to the target face image and the first feature vector and first feature value corresponding to the first feature image, obtains the similarity between the target face image and the template face image according to the first feature vector, the first feature value, and the second feature vector and second feature value corresponding to the second feature image of the template face image, and determines that the target face image matches the template face image if the similarity is greater than the preset threshold.
  • The first feature value represents the uncertainty corresponding to the first feature image, the second feature value represents the uncertainty corresponding to the second feature image, and the uncertainty represents the degree of difference between a feature image and the corresponding face image.
  • Therefore, the influence of the uncertainty of the feature images on the similarity is also considered, instead of only the feature vectors corresponding to the feature images. This effectively reduces the situation in which interference factors in the face image prevent the feature vector from accurately representing the face, which improves the accuracy of face recognition and reduces the false positive rate of face recognition.
  • Fig. 4 is a flowchart of another face recognition method provided by an embodiment of the present application.
  • the embodiments of the present application are executed by a computer device. Referring to FIG. 4, the method includes the following steps.
  • the computer device calls the feature extraction layer in the face recognition model to perform feature extraction on the target face image to obtain a first feature image corresponding to the target face image.
  • the face recognition model is a model pre-trained by the computer device, or a model trained by other devices and uploaded to the computer device.
  • the structure of the face recognition model and the functions of each part are shown in Figure 1, which will not be repeated here.
  • the computer device When the computer device obtains the target face image to be recognized, it calls the feature extraction layer in the face recognition model to perform feature extraction on the target face image to obtain the first feature image corresponding to the target face image.
  • The feature extraction layer in the embodiments of the present application can map the features of the target face image into a hyperspherical space to obtain the first feature image corresponding to the target face image, so that the features represented in the first feature image conform to the distribution of the hyperspherical space.
  • The hyperspherical space refers to a spherical space of more than two dimensions.
  • In some embodiments, the radius of the hyperspherical space is set by default by the computer device. Compared with a two-dimensional Euclidean space, the hyperspherical space is more consistent with the feature space of human faces, so extracting features of the face image in the hyperspherical space makes the extracted face features more accurate.
  • the feature extraction layer is a convolutional neural network (CNN, Convolutional Neural Network), which can perform convolution (Convolution) calculations, nonlinear activation function (Relu) calculations, and pooling (Pooling) calculations and other operations.
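  • As an illustration only, the following is a minimal PyTorch sketch of such a feature extraction layer, with the resulting feature image projected onto a hypersphere of radius r. The network depth, layer widths, and the name HypersphereFeatureExtractor are assumptions for demonstration; the embodiments above only state that the layer performs convolution, non-linear activation, and pooling and maps features into a hyperspherical space.

```python
import torch
import torch.nn as nn

class HypersphereFeatureExtractor(nn.Module):
    """Minimal sketch of a feature extraction layer: convolution + ReLU + pooling,
    with the resulting feature image projected onto a hypersphere of radius r."""

    def __init__(self, radius: float = 64.0):
        super().__init__()
        self.radius = radius
        # Illustrative convolutional backbone (depth and widths are assumptions).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )

    def forward(self, face_image: torch.Tensor) -> torch.Tensor:
        # face_image: (batch, 3, H, W) -> feature image: (batch, 64, H/4, W/4)
        feature_image = self.backbone(face_image)
        # Rescale each sample so its flattened features lie on a hypersphere of
        # radius r; this is one simple way to make the features conform to the
        # distribution of the hyperspherical space.
        flat = feature_image.flatten(1)
        norm = flat.norm(dim=1, keepdim=True).clamp_min(1e-12)
        scale = (self.radius / norm).view(-1, 1, 1, 1)
        return feature_image * scale
```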
  • the first feature image refers to an image representing the features of the target face image.
  • the features of the face image include the depth feature, texture feature, color feature, etc. of the face image.
  • In some embodiments, the computer device collects a face image in the current scene through a configured camera and uses the face image as the target face image, or crops the face image to obtain the target face image.
  • For example, when the operation of face recognition is triggered, the computer device detects the trigger operation of face recognition and shoots through the configured camera to obtain a target face image that includes a face.
  • In other embodiments, the computer device obtains the target face image uploaded by another device, downloads the target face image from another device, or obtains the target face image in another way, which is not limited in the embodiments of this application.
  • the computer device calls the feature mapping layer in the face recognition model to perform feature mapping on the first feature image to obtain a first feature vector corresponding to the first feature image.
  • the feature mapping layer is a fully connected mapping network, or the feature mapping layer is another form of network, which is not limited in the embodiment of the present application.
  • the computer device When the computer device obtains the first feature image corresponding to the target face image, it calls the feature mapping layer in the face recognition model to perform feature mapping on the first feature image to obtain the first feature vector corresponding to the first feature image .
  • the first feature vector is obtained by mapping the first feature image, and the first feature vector refers to a vector used to represent features of the target face image.
  • The first feature vector is a multi-dimensional vector. For example, the first feature vector is a 1×n-dimensional vector and includes feature values of n dimensions.
  • The feature extraction sub-model in the face recognition model in the embodiments of this application includes a feature extraction layer and a feature mapping layer. Therefore, in the above steps 401-402, the process of obtaining the first feature image corresponding to the target face image and the first feature vector corresponding to the first feature image is described by taking as an example the feature extraction layer processing the target face image and the feature mapping layer processing the first feature image.
  • In other embodiments, the feature extraction sub-model is a sub-model in another form, and the first feature image and the first feature vector are obtained by calling the feature extraction sub-model to perform feature extraction on the target face image; a minimal sketch of how such a sub-model could be composed is given below.
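  • The sketch below, which reuses the HypersphereFeatureExtractor from the previous example, shows one plausible way the feature extraction sub-model could combine a feature extraction layer with a fully connected feature mapping layer to return both the feature image and the 1×n feature vector. The embedding dimension, input resolution, and class name are assumptions made only so the example runs.

```python
import torch
import torch.nn as nn

class FeatureExtractionSubModel(nn.Module):
    """Sketch: feature extraction layer + fully connected feature mapping layer.
    Returns the feature image and its corresponding 1 x n feature vector."""

    def __init__(self, feature_dim: int = 512, image_size: int = 112, radius: float = 64.0):
        super().__init__()
        self.extraction_layer = HypersphereFeatureExtractor(radius=radius)
        # After two 2x poolings the feature image is 64 x (image_size/4) x (image_size/4).
        flat_dim = 64 * (image_size // 4) * (image_size // 4)
        self.mapping_layer = nn.Linear(flat_dim, feature_dim)  # fully connected mapping

    def forward(self, face_image: torch.Tensor):
        feature_image = self.extraction_layer(face_image)               # first/second feature image
        feature_vector = self.mapping_layer(feature_image.flatten(1))   # 1 x n feature vector per sample
        return feature_image, feature_vector
```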
  • the computer device calls the prediction sub-model in the face recognition model to process the first feature image to obtain the first feature value corresponding to the first feature image.
  • the prediction sub-model is connected to the feature extraction layer in the feature extraction sub-model.
  • In some embodiments, the prediction sub-model is a convolutional neural network (CNN); for example, the network is formed by connecting multiple fully connected layers, or is a residual network (ResNet), or the like, which is not limited in the embodiments of the present application.
  • the prediction sub-model is called to process the first feature image to obtain the first feature value corresponding to the first feature image.
  • the first feature value is used to indicate the uncertainty of the face feature in the target face image described by the first feature image. For related content of the uncertainty, refer to step 302 above.
  • The first feature image is a feature image obtained by mapping the face image into the hyperspherical space, and the features represented in the first feature image conform to the distribution of the hyperspherical space; the first feature value also conforms to the distribution of the hyperspherical space and is used to represent the uncertainty with which the first feature image describes the face features in the target face image in the hyperspherical space.
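  • For illustration, a minimal sketch of a prediction sub-model that maps a feature image to a single feature value is given below. Keeping the output strictly positive with a softplus, and the small pooled head, are assumptions; the embodiments only state that the prediction sub-model outputs a feature value representing the uncertainty of the feature image.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PredictionSubModel(nn.Module):
    """Sketch: maps a feature image to one scalar feature value (uncertainty)."""

    def __init__(self, in_channels: int = 64):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),   # pool the feature image to 32 x 1 x 1
            nn.Flatten(),
            nn.Linear(32, 1),
        )

    def forward(self, feature_image: torch.Tensor) -> torch.Tensor:
        # Softplus keeps the predicted feature value positive; larger values are
        # read here as higher uncertainty of the feature image.
        return F.softplus(self.head(feature_image)).squeeze(-1)
```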
  • In the embodiments of the present application, the description takes the case where step 402 is performed before step 403 as an example, that is, the first feature vector corresponding to the first feature image is obtained first, and then the first feature value corresponding to the first feature image is obtained. In another embodiment, step 403 is performed before step 402, that is, the first feature value corresponding to the first feature image is obtained first, and then the first feature vector corresponding to the first feature image is obtained.
  • the computer device calls the feature extraction layer to perform feature extraction on the template face image to obtain a second feature image corresponding to the template face image.
  • the face recognition process in the embodiments of the present application refers to recognizing the target face image and the template face image to determine whether the target face image matches the template face image.
  • the template face image refers to the face image pre-stored by the computer device
  • the target face image refers to the image currently acquired by the computer device that requires face recognition.
  • The matching of the target face image with the template face image means that the face in the target face image and the face in the template face image belong to the same person.
  • For example, take a computer device running an application that logs in through face recognition: the computer device stores the collected face image as a template face image, and later, when the account is logged in to through the application, the user's identity can be verified against the template face image.
  • Similarly, the computer device corresponding to an access control system stores the face image as a template face image, and when subsequent face recognition verification is performed, the user's identity can be verified against the template face image.
  • The computer device obtains the pre-stored template face image, calls the feature extraction layer in the face recognition model, and performs feature extraction on the template face image to obtain the second feature image corresponding to the template face image.
  • The second feature image refers to an image that represents the features of the template face image.
  • the feature extraction layer in the embodiment of the present application can map the features of the template face image to the hyperspherical space to obtain the second feature image corresponding to the template face image.
  • the hyperspherical space in this step 404 is the same as the hyperspherical space in the aforementioned step 401.
  • the computer device calls the feature mapping layer to perform feature mapping on the second feature image to obtain a second feature vector corresponding to the second feature image.
  • the computer device When the computer device obtains the second feature image corresponding to the template face image, it calls the feature mapping layer in the face recognition model to perform feature mapping on the second feature image to obtain the second feature vector corresponding to the second feature image .
  • the second feature vector refers to a vector used to represent the features of the template face image.
  • The second feature vector is a multi-dimensional vector; for example, the second feature vector is a 1×n-dimensional vector and includes feature values of n dimensions.
  • the second feature vector is obtained by mapping the second feature image.
  • The feature extraction sub-model in the face recognition model in the embodiments of this application includes a feature extraction layer and a feature mapping layer. Therefore, in steps 404-405, the process of obtaining the second feature image corresponding to the template face image and the second feature vector corresponding to the second feature image is described by taking as an example the feature extraction layer processing the template face image and the feature mapping layer processing the second feature image.
  • In other embodiments, the feature extraction sub-model is a sub-model in another form, and feature extraction is performed on the template face image through the feature extraction sub-model to obtain the second feature image and the second feature vector.
  • the computer device calls the prediction sub-model to process the second feature image to obtain the second feature value corresponding to the second feature image.
  • The second feature value is used to indicate the uncertainty with which the second feature image describes the face features in the template face image.
  • the implementation process of this step 406 and related content are similar to the above-mentioned step 403, and will not be repeated here.
  • In the embodiments of the present application, the description takes the case where step 405 is performed before step 406 as an example, that is, the second feature vector corresponding to the second feature image is obtained first, and then the second feature value corresponding to the second feature image is obtained. In another embodiment, step 406 is performed before step 405, that is, the second feature value corresponding to the second feature image is obtained first, and then the second feature vector corresponding to the second feature image is obtained.
  • steps 401-403 are executed first, and then steps 404-406 are executed as an example for description.
  • steps 404-406 are executed first, and then steps 401-403 are executed.
  • In another embodiment, the computer device processes the template face image before this round of face recognition to obtain the second feature vector and second feature value corresponding to the template face image and stores them; in that case, the computer device does not need to perform the above steps 404-406 and directly obtains the stored second feature vector and second feature value.
  • each sub-model of the face recognition model can perform parallel processing on the target face image and the template face image.
  • For example, while the feature extraction sub-model in the face recognition model processes the target face image, the prediction sub-model in the face recognition model can process the template face image, thereby achieving parallel processing of the target face image and the template face image and improving the processing efficiency of the face recognition model.
  • the computer device obtains the similarity between the target face image and the template face image according to the first feature vector, the first feature value, the second feature vector, and the second feature value.
  • When the computer device obtains the first feature vector, the first feature value, the second feature vector, and the second feature value, it obtains the similarity between the target face image and the template face image according to the first feature vector, the first feature value, the second feature vector, and the second feature value.
  • The greater the similarity between the target face image and the template face image, the greater the probability that the face in the target face image and the face in the template face image belong to the same person, that is, the greater the probability that the target face image matches the template face image; the smaller the similarity between the target face image and the template face image, the smaller the probability that the face in the target face image and the face in the template face image belong to the same person, that is, the smaller the probability that the target face image matches the template face image.
  • the computer device uses a similarity algorithm to calculate the first feature vector, the first feature value, the second feature vector, and the second feature value to obtain the difference between the target face image and the template face image. Similarity.
  • the similarity algorithm is as follows:
  • Sim represents the similarity between the target face image and the template face image
  • k i represents the first feature value corresponding to the target face image
  • k j represents the second feature value corresponding to the template face image
  • i and j are positive integers used to indicate the target face image and the template face image, respectively
  • d represents the dimension of the feature vector output by the feature mapping layer
  • r represents the radius in the hyperspherical space to which the feature of the face image is mapped.
  • ⁇ i represents the first feature vector corresponding to the target face image
  • ⁇ j represents the second feature vector corresponding to the template face image.
  • m and ⁇ are the preset parameters of the Bessel function
  • m! represents the factorial of m
  • Γ(·) represents the gamma function.
  • the computer device determines that the target face image matches the template face image when the similarity is greater than the preset threshold.
  • When the computer device obtains the similarity between the target face image and the template face image, it compares the similarity with a preset threshold. If the similarity is greater than the preset threshold, it is determined that the face in the target face image and the face in the template face image belong to the same person, that is, the target face image matches the template face image; if the similarity is not greater than the preset threshold, it is determined that the face in the target face image and the face in the template face image do not belong to the same person, that is, the target face image does not match the template face image.
  • the preset threshold is determined according to the false alarm rate of face recognition required in actual application scenarios, or the preset threshold is set by default by the computer device, or the preset threshold is set by the developer through the computer device.
  • the computer device uses a judgment algorithm to judge the similarity to determine whether the target face image matches the template face image.
  • the judgment algorithm is as follows:
  • L_out is the judgment result of the computer device, and th is the preset threshold.
  • the embodiment of the present application only uses the computer device to recognize the target face image and a template face image as an example to illustrate the process of face recognition.
  • In another embodiment, the computer device stores multiple template face images. After obtaining the target face image to be recognized, the computer device traverses the multiple template face images and performs the steps of the embodiment shown in FIG. 4 on each template face image that is traversed, until it is determined that the target face image matches one of the multiple template face images, or until it is determined that the target face image does not match any of the multiple template face images.
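  • A minimal sketch of this traversal over stored template face images is shown below. It assumes a helper extract_features that returns the feature vector and feature value of an image, reuses the illustrative uncertainty_aware_similarity above, and returns the face identifier of the matched template (see FIG. 5 and the recognition result described below); these names are not part of the patent text.

```python
def recognize(target_image, templates, extract_features, threshold):
    """templates: iterable of (face_id, template_image). Returns the matching
    face_id, or None if the target matches none of the stored templates."""
    mu_t, k_t = extract_features(target_image)            # first feature vector / value
    for face_id, template_image in templates:
        mu_p, k_p = extract_features(template_image)      # second feature vector / value
        sim = uncertainty_aware_similarity(mu_t, mu_p, k_t, k_p)
        if sim > threshold:                                # threshold decision
            return face_id                                 # target matches this template
    return None                                            # no stored template matched
```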
  • FIG. 5 is a flowchart of another face recognition method provided by an embodiment of the application.
  • the target face image 5101 is input into the feature extraction layer 5201 to obtain the first feature image 5103
  • the template face image 5102 Input the feature extraction layer 5201 to obtain the second feature image 5104.
  • the first feature image 5103 is input into the prediction sub-model 5202 to obtain a first feature value 5105
  • the second feature image 5104 is input into the prediction sub-model 5202 to obtain a second feature value 5106.
  • the first feature image 5103 is input into the feature mapping layer 5203 to obtain a first feature vector 5107
  • the second feature image 5104 is input into the feature mapping layer 5203 to obtain a second feature vector 5108.
  • the similarity 5109 between the target face image 5101 and the template face image 5102 is obtained, and according to the similarity 5109
  • the recognition result 5110 is obtained, that is, when the similarity 5109 is greater than the preset threshold, the recognition result 5110 is that the target face image 5101 matches the template face image 5102; when the similarity 5109 is not greater than the preset threshold, the recognition result 5110 is the target face image 5101 does not match the template face image 5102.
  • the template face image corresponds to a face identifier, and the face identifier is used to indicate the identity of the user.
  • In a case where the recognition result 5110 is that the target face image 5101 matches the template face image 5102, the recognition result 5110 also includes the face identifier corresponding to the template face image 5102, to indicate that the target face image 5101 is a face image of the user corresponding to that face identifier.
  • In the related art, the first feature vector corresponding to the collected target face image and the second feature vector corresponding to the template face image are extracted respectively, the similarity between the target face image and the template face image is obtained according to the first feature vector and the second feature vector, and whether the target face image matches the template face image is determined according to the similarity, so as to determine whether face recognition passes.
  • However, when interference factors are present in the face image, the extracted feature vectors are not accurate enough, which in turn leads to lower accuracy of face recognition.
  • Fig. 6 shows the result of face recognition according to the method provided by the embodiments of the present application and the method provided by related technologies.
  • the face image 601 matches the face image 602
  • the face image 603 matches the face image 604
  • the face image 605 matches the face image 606,
  • the face image 607 matches the face image 608.
  • the preset threshold in the related art is 0.179
  • the preset threshold in this application is -1373.377.
  • the embodiment of the present application only uses the computer device to call the feature extraction sub-model and the prediction sub-model in the face recognition model to process the image as an example for illustration.
  • In other embodiments, the computer device uses other methods to perform feature extraction on the target face image to obtain the first feature image corresponding to the target face image and the first feature vector corresponding to the first feature image, and to process the first feature image to obtain the first feature value corresponding to the first feature image.
  • The method provided by the embodiments of the present application calls the feature extraction sub-model and the prediction sub-model in the face recognition model to obtain the first feature vector and the first feature value corresponding to the first feature image, obtains the similarity between the target face image and the template face image according to the first feature vector, the first feature value, and the second feature vector and second feature value corresponding to the template face image, and determines that the target face image matches the template face image when the similarity is greater than the preset threshold.
  • In addition, the features of the target face image are mapped into the hyperspherical space to obtain the first feature image corresponding to the target face image. The hyperspherical space is more consistent with the feature space of human faces, so extracting face features in the hyperspherical space makes the extracted features more accurate and further improves the accuracy of face recognition.
  • Before face recognition is performed through the face recognition model, the face recognition model needs to be trained first; the training process is detailed in the following embodiments.
  • Fig. 7 is a flowchart of a method for training a face recognition model provided by an embodiment of the present application.
  • the embodiment of the present application is executed by a computer device. Referring to FIG. 7, the method includes the following steps.
  • the computer device obtains a sample face image and a sample feature vector corresponding to the sample face image.
  • the computer device obtains a sample face image used for training a face recognition model, and a sample feature vector corresponding to the sample face image.
  • the sample face image is an image including a face
  • the sample feature vector corresponding to the sample face image is a vector used to represent the features of the sample face image.
  • the sample feature vector is used to represent the face identifier to which the sample face image belongs. Taking user 1 and user 2 as examples, the sample feature vector corresponding to any sample face image including the face of user 1 is the sample feature vector a, the sample feature vector corresponding to any sample face image including the face of user 2 is the sample feature vector b.
  • the sample face image is a sample face image pre-stored in a computer device, or a sample face image downloaded from another device by a computer device, or a sample face image uploaded to the computer device by a developer or other device image.
  • the sample feature vector corresponding to the sample face image is the sample feature vector marked by the developer for the sample face image, or the sample feature vector obtained by other methods, which is not limited in the embodiment of the application.
  • the computer device calls the feature extraction layer in the face recognition model to perform feature extraction on the sample face image to obtain a predicted feature image corresponding to the sample face image.
  • the face recognition model is a model used for face recognition.
  • the structure of the face recognition model and the functions of each part are shown in Fig. 1, which will not be repeated here.
  • the feature extraction layer is a convolutional neural network, and the convolutional neural network can perform operations such as convolution calculation, nonlinear activation function calculation, and pooling calculation.
  • the feature extraction layer is another form of network, which is not limited in the embodiment of the present application.
  • the computer device When the computer device obtains the sample face image, it calls the feature extraction layer in the face recognition model to perform feature extraction on the sample face image to obtain the predicted feature image corresponding to the sample face image.
  • the predicted feature image refers to an image that represents the feature of the sample face image.
  • the feature extraction layer in the embodiment of the present application maps the features of the sample face image to the hyperspherical space to obtain the predicted feature image corresponding to the sample face image.
  • Hyperspherical space refers to a spherical space of more than two dimensions. Compared with a two-dimensional Euclidean space, the hyperspherical space is more consistent with the feature space of human faces, and extracting face features in the hyperspherical space makes the extracted face features more accurate.
  • the computer device calls the feature mapping layer in the face recognition model to perform feature mapping on the predicted feature image to obtain the predicted feature vector corresponding to the predicted feature image.
  • the feature mapping layer is a fully connected mapping network, or the feature mapping layer is another form of network, which is not limited in the embodiment of the present application.
  • When the computer device obtains the predicted feature image corresponding to the sample face image, it calls the feature mapping layer in the face recognition model to perform feature mapping on the predicted feature image to obtain the predicted feature vector corresponding to the predicted feature image.
  • the predicted feature vector refers to a vector used to represent the feature of the sample face image, and the predicted feature vector is obtained by mapping the predicted feature image.
  • the feature extraction sub-model in the face recognition model includes a feature extraction layer and a feature mapping layer. Therefore, the above steps 702-703 take processing the sample face image through the feature extraction layer and processing the predicted feature image through the feature mapping layer as an example to illustrate the process of obtaining the predicted feature image corresponding to the sample face image and the predicted feature vector corresponding to the predicted feature image.
  • In other embodiments, the feature extraction sub-model takes another form, and feature extraction is performed on the sample face image through that feature extraction sub-model to obtain the predicted feature image and the predicted feature vector. A sketch of the two-layer form described above is given below.
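The following is a minimal, PyTorch-style sketch of such a feature extraction sub-model, combining a convolutional feature extraction layer, a fully connected feature mapping layer, and the mapping onto a hypersphere mentioned above. The backbone layout, layer sizes, feature dimension, and radius are illustrative assumptions, not values specified by the application.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureExtractionSubModel(nn.Module):
    """Hypothetical feature extraction sub-model: extraction layer + mapping layer."""

    def __init__(self, feature_dim: int = 512, radius: float = 64.0):
        super().__init__()
        # Feature extraction layer: a small convolutional network that turns a face
        # image into a predicted feature image (a stack of feature maps).
        self.feature_extraction_layer = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((7, 7)),
        )
        # Feature mapping layer: a fully connected mapping network that maps the
        # predicted feature image to a predicted feature vector.
        self.feature_mapping_layer = nn.Linear(64 * 7 * 7, feature_dim)
        self.radius = radius

    def forward(self, face_image: torch.Tensor):
        feature_image = self.feature_extraction_layer(face_image)
        flattened = feature_image.flatten(start_dim=1)
        feature_vector = self.feature_mapping_layer(flattened)
        # Map the feature onto a hypersphere of radius r via L2 normalization.
        feature_vector = self.radius * F.normalize(feature_vector, dim=1)
        return feature_image, feature_vector
```

For example, `FeatureExtractionSubModel()(torch.randn(1, 3, 112, 112))` would return a 64-channel predicted feature image and a 512-dimensional predicted feature vector lying on a sphere of radius 64.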
  • In step 704, the computer device trains the feature extraction sub-model according to the difference between the predicted feature vector and the sample feature vector.
  • the predicted feature vector is a vector that is predicted by the face recognition model and represents the feature of the sample face image
  • the sample feature vector is a real vector that represents the feature of the sample face image. Therefore, when the computer device obtains the predicted feature vector and the sample feature vector, it trains the feature extraction sub-model in the face recognition model according to the difference between the predicted feature vector and the sample feature vector, that is, it trains the feature extraction layer and the feature mapping layer, so that the difference between the predicted feature vector obtained through the feature extraction layer and the feature mapping layer and the sample feature vector becomes smaller and smaller.
  • the computer device obtains the first loss value between the predicted feature vector and the sample feature vector, and trains the feature extraction sub-model according to the first loss value.
  • the first loss value represents the difference between the predicted feature vector and the sample feature vector.
  • the computer device obtains the first loss function, and calculates the prediction feature vector and the sample feature vector according to the first loss function to obtain the first loss value.
  • the first loss function refers to a function used to obtain the loss between the prediction feature vector and the sample feature vector.
  • the face recognition model further includes a loss acquisition sub-model, and the loss acquisition sub-model is connected with the feature extraction sub-model.
  • the loss acquisition sub-model includes a weight vector corresponding to each face identifier.
  • the computer device calls the loss acquisition sub-model, weights the predicted feature vector according to the weight vector corresponding to the face identifier to which the sample face image belongs to obtain the weighted feature vector corresponding to the predicted feature vector, and obtains a second loss value between the weighted feature vector and the sample feature vector.
  • the loss acquisition sub-model is used to acquire the corresponding loss value according to the feature vector.
  • the loss acquisition sub-model is connected to the feature extraction sub-model.
  • When the feature extraction sub-model includes a feature extraction layer and a feature mapping layer, the loss acquisition sub-model is connected with the feature mapping layer in the feature extraction sub-model.
  • the loss acquisition sub-model is a classification network.
  • the classification network is a softmax (logistic regression) network or various types of softmax networks with margins added, or the loss acquisition sub-model is in other forms. The embodiment does not limit this.
  • the weight vector corresponding to each face identifier is used to represent the weight of the feature vector corresponding to the face image corresponding to the face identifier.
  • For example, the predicted feature vector corresponding to the sample face image is a 1×n-dimensional vector, that is, the predicted feature vector includes feature values of n dimensions. The weight vector corresponding to the face identifier is also a 1×n-dimensional vector and includes weight values of n dimensions, and the n weight values respectively represent the weights of the feature values of the corresponding dimensions in the predicted feature vector.
  • the weight vector corresponding to the face identifier to which the sample face image belongs is determined, the loss acquisition sub-model is called, and the predicted feature vector is weighted according to that weight vector to obtain the weighted feature vector corresponding to the predicted feature vector; that is, the feature value of each dimension in the predicted feature vector is multiplied by the corresponding weight value in the weight vector to obtain the weighted feature vector. One common instantiation of this sub-model is sketched after the description of the second loss function below.
  • the loss acquisition sub-model further includes a second loss function.
  • the computer device obtains the second loss function, and calculates the weighted feature vector and the sample feature vector according to the second loss function to obtain the second loss value.
  • the second loss function refers to a function used to obtain the loss between the weighted feature vector and the sample feature vector.
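As a concrete illustration, the loss acquisition sub-model can be read as a softmax classification layer (consistent with the softmax or margin-softmax networks mentioned above) that holds one weight vector per face identifier; each logit is the sum of the element-wise products between the predicted feature vector and one identity's weight vector. The sketch below assumes that the sample feature vector plays the role of an identity label, so the second loss reduces to softmax cross-entropy; the normalization and the absence of a margin are assumptions rather than the application's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LossAcquisitionSubModel(nn.Module):
    """Hypothetical loss acquisition sub-model: one weight vector per face identifier."""

    def __init__(self, feature_dim: int = 512, num_identities: int = 10000):
        super().__init__()
        # One 1 x feature_dim weight vector for each face identifier.
        self.weight = nn.Parameter(torch.randn(num_identities, feature_dim) * 0.01)

    def forward(self, predicted_feature_vector: torch.Tensor, face_id: torch.Tensor):
        # Each logit is the dot product (sum of element-wise products) between the
        # normalized predicted feature vector and one identity's weight vector.
        logits = F.linear(F.normalize(predicted_feature_vector, dim=1),
                          F.normalize(self.weight, dim=1))
        # Second loss value: softmax cross-entropy against the face identifier.
        return F.cross_entropy(logits, face_id)
```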
  • the computer device uses a gradient descent method to optimize the feature extraction sub-model and the loss acquisition sub-model to achieve the purpose of training the feature extraction sub-model and the loss acquisition sub-model.
  • the gradient descent method is a stochastic gradient descent method, a stochastic gradient descent method with a momentum term, an Adagrad method (Adaptive Gradient, adaptive gradient descent method), etc., which are not limited in the embodiment of the present application.
  • the above steps 701-704 only take obtaining the predicted feature vector based on the sample face image and then training the feature extraction sub-model and the loss acquisition sub-model as an example to describe how the feature extraction sub-model is trained based on the sample face image and the sample feature vector corresponding to the sample face image.
  • the computer device adopts other methods to train the feature extraction sub-model based on the sample face image and the sample feature vector corresponding to the sample face image.
  • the computer device trains the feature extraction sub-model and the loss acquisition sub-model based on multiple sample face images and sample feature vectors corresponding to the multiple sample face images.
  • any two sample face images belong to the same face identifier, or any two sample face images belong to different face identifiers, which is not limited in this application.
  • the computer device obtains multiple sample face images and sample feature vectors corresponding to the multiple sample face images, and inputs the multiple sample face images into the feature extraction sub-model of the face recognition model at the same time.
  • the face recognition model separately processes multiple sample face images, and trains the feature extraction sub-model and loss acquisition sub-model according to the obtained prediction feature vector and the corresponding sample feature vector.
  • the feature extraction sub-model and the loss acquisition sub-model in the face recognition model can perform parallel processing on multiple sample face images.
  • multiple sample face images include the first sample face image and the second sample face image.
  • while the loss acquisition sub-model in the face recognition model processes the first sample face image,
  • the feature extraction sub-model in the face recognition model can process the second sample face image, thereby achieving the effect of parallel processing of multiple sample face images and improving the processing efficiency of the face recognition model.
  • the training process of the face recognition model corresponds to a condition for terminating the training model.
  • the condition for terminating the training of the model is that the number of iterations of the model reaches a preset number, or the loss of the model is less than a first preset value, which is not limited in this embodiment of the application. For example, when the number of iterations of training the feature extraction sub-model and the loss acquisition sub-model reaches the preset number of times, the training of the feature extraction sub-model and the loss acquisition sub-model is completed.
  • When the first loss value or the second loss value acquired by the computer device is less than the first preset value, it indicates that the loss values of the feature extraction sub-model and the loss acquisition sub-model have converged, and the training of the feature extraction sub-model and the loss acquisition sub-model is completed. An illustrative training loop with such termination conditions is sketched below.
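A minimal training-loop sketch that ties the two sub-models above to these termination conditions follows; the optimizer choice, learning rate, iteration limit, and loss threshold are illustrative assumptions.

```python
import torch

def train_feature_extraction(extractor, loss_head, loader,
                             max_iters: int = 100000, loss_threshold: float = 0.01):
    # `loader` is assumed to yield (face_images, face_ids) batches indefinitely.
    params = list(extractor.parameters()) + list(loss_head.parameters())
    optimizer = torch.optim.SGD(params, lr=0.1, momentum=0.9)
    for iteration, (images, face_ids) in enumerate(loader, start=1):
        _, predicted_feature_vector = extractor(images)
        loss = loss_head(predicted_feature_vector, face_ids)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Terminate when the preset iteration count is reached or the loss value
        # drops below the first preset value.
        if iteration >= max_iters or loss.item() < loss_threshold:
            break
```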
  • In some embodiments, the computer device pre-stores a trained feature extraction sub-model and the sample face images used to train the feature extraction sub-model, so the computer device does not need to perform the above steps 701-704.
  • By acquiring the sample face images used to train the feature extraction sub-model, the computer device performs the following steps 705-707 to complete the training of the prediction sub-model.
  • In step 705, the computer device obtains a central feature vector corresponding to the sample face image.
  • each sample face image corresponds to a face identification
  • each face identifier corresponds to a central feature vector, which represents the face feature corresponding to the face identifier; that is, the central feature vector can be used to represent the face features in the sample face image.
  • the computer device obtains feature vectors corresponding to multiple face images of the face identifier to which the sample face image belongs, and determines the central feature vector based on the multiple obtained feature vectors.
  • For example, the computer device determines multiple face images belonging to the face identifier to which the sample face image belongs, obtains the feature vectors corresponding to the multiple face images,
  • and averages the multiple feature vectors to obtain the central feature vector corresponding to the face identifier to which the sample face image belongs, as sketched below.
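A small sketch of this averaging scheme, reusing the hypothetical extractor defined earlier (the helper name and batching are assumptions):

```python
import torch

def central_feature_vector(extractor, face_images_of_identity):
    """Average the feature vectors of all face images that share one face identifier."""
    with torch.no_grad():
        vectors = [extractor(image.unsqueeze(0))[1] for image in face_images_of_identity]
    return torch.cat(vectors, dim=0).mean(dim=0)
```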
  • the computer device obtains the weight vector corresponding to the face identifier to which the sample face image belongs, and determines the weight vector corresponding to the sample face image as the central feature vector.
  • the loss acquisition sub-model includes the weight vector corresponding to each face identifier.
  • during the training of the feature extraction sub-model and the loss acquisition sub-model, each weight vector in the loss acquisition sub-model is continuously adjusted.
  • After training, the loss acquisition sub-model includes each trained weight vector. The computer device can then determine the face identifier to which the sample face image belongs, obtain the weight vector corresponding to the face identifier from the multiple weight vectors in the loss acquisition sub-model, and determine that weight vector as the central feature vector corresponding to the face identifier to which the sample face image belongs.
  • In step 706, the computer device calls the prediction sub-model to process the predicted feature image to obtain the predicted feature value corresponding to the predicted feature image.
  • the predicted feature value refers to the degree of difference between the face features included in the predicted feature image and the face features in the sample face image; that is, the predicted feature value is used to indicate the uncertainty with which the predicted feature image describes the facial features of the sample face image.
  • the implementation process of this step 706 and related content are similar to the above-mentioned step 403, and will not be repeated here.
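For orientation, a prediction sub-model of this kind can be as simple as a small head that maps the predicted feature image to one positive scalar; the pooling-plus-linear layout and the softplus activation below are assumptions, not the architecture defined in step 403 of the application.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PredictionSubModel(nn.Module):
    """Hypothetical prediction sub-model: feature image -> positive predicted feature value."""

    def __init__(self, in_channels: int = 64):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, 1)

    def forward(self, feature_image: torch.Tensor) -> torch.Tensor:
        pooled = self.pool(feature_image).flatten(start_dim=1)
        # Softplus keeps the predicted feature value strictly positive.
        return F.softplus(self.fc(pooled)) + 1e-6
```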
  • the above description only takes performing step 705 first and then performing step 706 as an example.
  • In some embodiments, step 706 is performed first, and then step 705 is performed.
  • In step 707, the computer device obtains a third loss value according to the predicted feature vector, the central feature vector, and the predicted feature value, and trains the prediction sub-model according to the third loss value.
  • the computer device obtains the predicted feature vector, central feature vector, and predicted feature value corresponding to the sample face image, obtains a third loss value based on the predicted feature vector, central feature vector, and predicted feature value, and trains the prediction sub-model in the face recognition model based on the third loss value, so that the predicted feature value corresponding to the predicted feature image output by the prediction sub-model is more accurate.
  • the third loss value represents the loss of the predicted feature value corresponding to the predicted feature image.
  • the computer device obtains the third loss function, and calculates the predicted feature vector, the central feature vector, and the predicted feature value according to the third loss function to obtain the third loss value.
  • the third loss function refers to a function used to obtain the loss of the predicted feature value.
  • the formula of the third loss function appears as an image in the published text; its symbols are defined as follows, and a hedged reconstruction is given after this paragraph: L_s represents the third loss value; κ represents the predicted feature value; r represents the radius of the hyperspherical space to which the features of the face image are mapped; μ represents the predicted feature vector and μ^T represents its transpose; w_c, with x ∈ c, represents the central feature vector corresponding to the sample face image; x represents the current sample face image; c represents the at least one face image corresponding to the face identifier to which the sample face image belongs; d represents the dimensionality of the feature vector output by the feature mapping layer; and the remaining term is a Bessel function.
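The exact published formula is not recoverable from the extracted text, but the symbols above (a positive concentration-like value κ, a hypersphere radius r, a predicted direction μ, a per-identity central vector w_c, a feature dimension d, and a Bessel function) are consistent with the negative log-likelihood of an r-radius von Mises-Fisher distribution. The following reconstruction is therefore an assumption offered for orientation, not the application's exact formula:

$$
L_s \;=\; -\Big[\big(\tfrac{d}{2}-1\big)\ln \kappa \;-\; \ln I_{\frac{d}{2}-1}(\kappa) \;+\; \kappa\, r\, \mu^{\mathsf T} w_c\Big] + \text{const}, \qquad x \in c,
$$

where \(I_{\frac{d}{2}-1}\) denotes the modified Bessel function of the first kind and the constant does not depend on κ or μ.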
  • the computer device obtains the target feature value according to the distance between the predicted feature vector and the center feature vector, and obtains the third loss value according to the difference between the target feature value and the predicted feature value.
  • the computer device can obtain the similarity between face images according to the feature vectors and feature values corresponding to the feature images. Therefore, the predicted feature value essentially represents the uncertainty of the match between the predicted feature vector corresponding to the sample face image and the central feature vector corresponding to the sample face image. The smaller the distance between the predicted feature vector and the central feature vector, the more similar the predicted feature vector and the central feature vector, that is, the better the predicted feature vector and the central feature vector match.
  • Therefore, the computer device can obtain a target feature value according to the distance between the predicted feature vector and the central feature vector, and the target feature value can represent the uncertainty of the match between the predicted feature vector and the central feature vector.
  • The greater the distance between the predicted feature vector and the central feature vector, the greater the uncertainty of the match between the predicted feature vector and the central feature vector, that is, the greater the target feature value; the smaller the distance between the predicted feature vector and the central feature vector, the smaller the uncertainty of the match, that is, the smaller the target feature value.
  • In the application process, the face identifier to which the face image to be recognized belongs cannot be known, and therefore the central feature vector corresponding to the face image cannot be known, so the computer device obtains the feature value according to the feature image. Therefore, in the training process of the prediction sub-model, it is necessary to ensure that the predicted feature value obtained by the prediction sub-model can represent the uncertainty of the match between the predicted feature vector corresponding to the sample face image and the central feature vector corresponding to the sample face image, that is, it is necessary to ensure that the difference between the predicted feature value and the target feature value is small.
  • the computer device obtains the third loss value according to the difference between the target feature value and the predicted feature value, and trains the prediction sub-model according to the third loss value, so that the difference between the target feature value and the predicted feature value becomes smaller and smaller, making the predicted feature value output by the prediction sub-model more and more accurate. A minimal sketch of this scheme follows.
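A sketch of this target-value scheme under stated assumptions (the mapping from distance to target value and the squared-error form of the third loss are illustrative choices, not the application's formula):

```python
import torch
import torch.nn.functional as F

def third_loss(predicted_feature_vector, central_feature_vector, predicted_feature_value):
    # Larger distance between the vectors -> larger uncertainty -> larger target value.
    distance = torch.norm(predicted_feature_vector - central_feature_vector, dim=1)
    target_feature_value = distance.detach()
    # Train the prediction sub-model so its output approaches the target feature value.
    return F.mse_loss(predicted_feature_value.squeeze(1), target_feature_value)
```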
  • the computer device uses a gradient descent method to optimize the predictor sub-model to achieve the purpose of training the predictor sub-model.
  • the gradient descent method is a stochastic gradient descent method, a stochastic gradient descent method with a momentum term, an Adagrad method (Adaptive Gradient, adaptive gradient descent method), etc., which are not limited in the embodiment of the present application.
  • for the gradient used to optimize the predicted feature value, see the following formula, which likewise appears as an image in the published text; its symbols are: r represents the radius of the hyperspherical space to which the features of the face image are mapped; μ represents the predicted feature vector corresponding to the sample face image and μ^T represents its transpose; w_c, with x ∈ c, represents the central feature vector corresponding to the sample face image; x represents the current sample face image; c represents the at least one face image corresponding to the face identifier to which the sample face image belongs; d represents the dimensionality of the feature vector output by the feature mapping layer; and the remaining term is a Bessel function. A hedged reconstruction follows.
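Under the von Mises-Fisher reconstruction assumed above, differentiating the loss with respect to the predicted feature value κ and using d/dκ ln I_ν(κ) = I_{ν+1}(κ)/I_ν(κ) + ν/κ gives the gradient below; this is an assumed form, not the published expression:

$$
\frac{\partial L_s}{\partial \kappa} \;=\; \frac{I_{\frac{d}{2}}(\kappa)}{I_{\frac{d}{2}-1}(\kappa)} \;-\; r\,\mu^{\mathsf T} w_c, \qquad x \in c.
$$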
  • In the above steps 705-707, the training of the prediction sub-model is achieved based on the sample feature vector and the central feature vector of the face identifier to which the sample face image belongs, while keeping the trained feature extraction sub-model unchanged.
  • the computer device uses other methods to train the predictor sub-model based on the sample feature vector and the center feature vector.
  • In some embodiments, other spatial distributions with closed-form solutions are adopted to model the feature distribution in the hyperspherical space, thereby simplifying the training process of the face recognition model.
  • the training of the face recognition model is divided into the training phase of the feature extraction sub-model and the training phase of the prediction sub-model.
  • the function of obtaining similarity is encapsulated as a similarity obtaining module
  • the function of comparing similarity with a preset threshold is encapsulated as a threshold comparison module.
  • the computer device deploys the trained feature extraction sub-model and prediction sub-model, together with the similarity acquisition module and the threshold comparison module, to obtain the face recognition model.
  • Figure 8 is a flow chart of a training model and a deployment model provided by an embodiment of the application. See Figure 8.
  • the steps are as follows: 801, train the feature extraction layer and the feature mapping layer; 802, train the prediction sub-model; 803, combine the feature extraction layer, the feature mapping layer, the prediction sub-model, the similarity acquisition module, and the threshold comparison module to form the face recognition model.
  • Fig. 9 is a flowchart of training a feature extraction sub-model provided by an embodiment of the present application, in which Fig. 9 divides the steps of training the feature extraction sub-model into multiple modules for description.
  • the sample data preparation module 901 is used to obtain the sample face image and the sample feature vector corresponding to the sample face image; the feature extraction module 902 is used to call the feature extraction layer to process the sample face image to obtain the predicted feature image; the feature mapping module 903 is used to call the feature mapping layer to process the predicted feature image to obtain the predicted feature vector; and the loss obtaining module 904 is used to process the predicted feature vector and the sample feature vector to obtain a loss value. When the loss value is obtained, it is judged whether the condition for terminating the training of the model is currently met.
  • If it is met, the training of the feature extraction sub-model is completed.
  • If not, the parameters of the feature extraction layer in the feature extraction module 902 and the parameters of the feature mapping layer in the feature mapping module 903 are optimized.
  • the condition for terminating the training model is that the number of iterations reaches the preset number, or the loss value is less than the preset value.
  • Fig. 10 is a flow chart of training a prediction sub-model provided by an embodiment of the present application, wherein Fig. 10 divides the steps of training the prediction sub-model into multiple modules for explanation.
  • the central feature vector obtaining module 1001 is used to obtain the central feature vector corresponding to the face identifier to which the sample face image belongs
  • the sample data preparation module 1002 is used to obtain the sample face image
  • the feature extraction module 1003 is used to call the feature extraction layer to process the sample face image to obtain the predicted feature image
  • the feature mapping module 1004 is used to call the feature mapping layer to process the predicted feature image to obtain the predicted feature vector
  • the prediction module 1005 is used to call the prediction sub-model to process the predicted feature image to obtain the predicted feature value
  • the loss value acquisition module 1006 is used to obtain the loss value corresponding to the predicted feature value according to the central feature vector, the predicted feature vector, and the predicted feature value.
  • When the loss value is obtained, it is judged whether the condition for terminating the training of the model is currently met. If it is met, the training of the prediction sub-model is completed. If not, the parameters of the prediction sub-model in the prediction module 1005 are optimized through the optimization module 1007. The condition for terminating the training of the model is that the number of iterations reaches the preset number, or the loss value is less than the preset value.
  • the method provided by the embodiment of the application obtains the sample face image and the sample feature vector corresponding to the sample face image, calls the feature extraction sub-model to extract the predicted feature image and the predicted feature vector of the sample face image, and trains the feature extraction sub-model according to the difference between the predicted feature vector and the sample feature vector.
  • The central feature vector of the face identifier to which the sample face image belongs is obtained, the prediction sub-model is called to obtain the predicted feature value corresponding to the predicted feature image, a third loss value is obtained according to the predicted feature vector, the central feature vector, and the predicted feature value, and the prediction sub-model is trained according to the third loss value.
  • face recognition can be performed through the face recognition model including the feature extraction sub-model and the predictive sub-model.
  • Since the prediction sub-model is introduced, when the similarity between the target face image and the template face image is obtained, the influence of the feature value output by the prediction sub-model on the similarity is considered, that is, the influence of the uncertainty of the feature image on the similarity is considered, instead of only considering the feature vector corresponding to the feature image.
  • This can effectively reduce the situation in which interference factors in the face image cause the feature vector to fail to accurately represent the characteristics of the face, which can improve the accuracy of face recognition and reduce the misjudgment rate of face recognition.
  • In addition, the feature extraction sub-model is trained first, and then, while the trained feature extraction sub-model remains unchanged, the prediction sub-model is trained according to the sample feature vector and the central feature vector of the face identifier to which the sample face image belongs. Therefore, in some embodiments, the training process of the face recognition model is divided into the training phase of the feature extraction sub-model and the training phase of the prediction sub-model.
  • When the feature extraction sub-model has already been trained, the prediction sub-model is trained by acquiring the sample face images used to train the feature extraction sub-model, without retraining a new feature extraction sub-model and without re-collecting sample face images.
  • the feature of the sample face image is mapped to the hyperspherical space to obtain the predicted feature image corresponding to the sample face image.
  • the hyperspherical space better matches the feature space of the human face. Therefore, performing face feature extraction in the hyperspherical space can make the extracted facial features more accurate and improve the accuracy of face recognition performed by the trained face recognition model.
  • FIG. 11 is a schematic structural diagram of a face recognition device provided by an embodiment of the present application. Referring to Figure 11, the device includes:
  • the feature extraction module 1101 is used to perform feature extraction on the target face image to obtain a first feature image corresponding to the target face image and a first feature vector corresponding to the first feature image.
  • the first feature image is used to represent the facial features of the target face image;
  • the feature value acquisition module 1102 is used to process the first feature image to obtain the first feature value corresponding to the first feature image.
  • the first feature value is used to represent the uncertainty corresponding to the first feature image, and the uncertainty refers to the degree of difference between the facial features included in the first feature image and the facial features in the target face image;
  • the similarity acquisition module 1103 is used to acquire the similarity between the target face image and the template face image according to the first feature vector, the first feature value, the second feature vector, and the second feature value.
  • the second feature vector is the feature vector corresponding to the second feature image of the template face image.
  • the second feature value is the feature value corresponding to the second feature image.
  • the second feature value is used to indicate the uncertainty corresponding to the second feature image.
  • the uncertainty corresponding to the second feature image refers to the degree of difference between the facial features included in the second feature image and the facial features in the template face image;
  • the determining module 1104 is configured to determine that the target face image matches the template face image when the similarity is greater than the preset threshold.
  • the device provided by the embodiment of the present application obtains the first feature image corresponding to the target face image and the first feature vector and the first feature value corresponding to the first feature image, and obtains the similarity between the target face image and the template face image according to the first feature vector, the first feature value, and the second feature vector and the second feature value corresponding to the second feature image of the template face image.
  • When the similarity is greater than the preset threshold, it is determined that the target face image matches the template face image.
  • the first feature value represents the uncertainty corresponding to the first feature image, the second feature value represents the uncertainty corresponding to the second feature image, and the uncertainty can represent the degree of difference between a feature image and a face image.
  • Therefore, when the similarity is obtained, the influence of the uncertainty of the feature images on the similarity is also considered, instead of only considering the feature vectors corresponding to the feature images, which can effectively reduce the situation in which interfering factors in the face image cause the feature vector to fail to accurately represent the characteristics of the face, thereby improving the accuracy of face recognition and reducing the misjudgment rate of face recognition.
  • the feature extraction module 1101 includes: a first feature extraction unit 1111, configured to call the feature extraction sub-model in the face recognition model and perform feature extraction on the target face image to obtain the first feature image corresponding to the target face image and the first feature vector corresponding to the first feature image.
  • the feature value obtaining module 1102 includes: a feature value obtaining unit 1112, configured to call the predictor sub-model in the face recognition model to process the first feature image to obtain the first feature image The corresponding first characteristic value.
  • the feature extraction sub-model includes a feature extraction layer and a feature mapping layer.
  • the first feature extraction unit 1111 is configured to: call the feature extraction layer to perform feature extraction on the target face image to obtain the first feature image corresponding to the target face image; and call the feature mapping layer to perform feature mapping on the first feature image to obtain the first feature vector corresponding to the first feature image.
  • the device further includes: a first training module 1105, configured to train the feature extraction sub-model according to the sample face image and the sample feature vector corresponding to the sample face image; and a second training module 1106, configured to train the prediction sub-model based on the sample feature vector and the central feature vector corresponding to the sample face image while keeping the trained feature extraction sub-model unchanged.
  • the central feature vector represents the facial features corresponding to the face identifier to which the sample face image belongs.
  • the first training module 1105 includes: a first acquiring unit 1115, configured to acquire a sample face image and a sample feature vector corresponding to the sample face image; a second feature extraction unit 1125, configured to call the feature extraction sub-model to perform feature extraction on the sample face image to obtain the predicted feature image corresponding to the sample face image and the predicted feature vector corresponding to the predicted feature image; and a first training unit 1135, configured to train the feature extraction sub-model according to the difference between the predicted feature vector and the sample feature vector.
  • the feature extraction sub-model includes a feature extraction layer and a feature mapping layer.
  • the second feature extraction unit 1125 is also used to: call the feature extraction layer to perform feature extraction on the sample face image to obtain the predicted feature image corresponding to the sample face image; and call the feature mapping layer to perform feature mapping on the predicted feature image to obtain the predicted feature vector corresponding to the predicted feature image.
  • the first training unit 1135 is further used to: obtain a first loss value between the predicted feature vector and the sample feature vector, and the first loss value represents the difference between the predicted feature vector and the sample feature vector. The difference; according to the first loss value, train the feature extraction sub-model.
  • the face recognition model further includes a loss acquisition sub-model, and the loss acquisition sub-model includes a weight vector corresponding to each face identifier.
  • the first training unit 1135 is also used to: call the loss acquisition sub-model, and weight the predicted feature vector according to the weight vector corresponding to the face identifier to which the sample face image belongs to obtain the weighted feature vector corresponding to the predicted feature vector; obtain a second loss value between the weighted feature vector and the sample feature vector, where the second loss value represents the difference between the weighted feature vector and the sample feature vector; and train the feature extraction sub-model and the loss acquisition sub-model according to the second loss value.
  • the second training module 1106 includes: a second obtaining unit 1116, configured to obtain a central feature vector corresponding to a sample face image, the central feature vector representing a face feature corresponding to a face identifier;
  • the feature value acquisition unit 1126 is used to call the prediction sub-model to process the predicted feature image to obtain the predicted feature value corresponding to the predicted feature image.
  • the predicted feature value is used to indicate the uncertainty corresponding to the predicted feature image, and the uncertainty corresponding to the predicted feature image refers to the degree of difference between the facial features included in the predicted feature image and the facial features in the sample face image; the loss value acquisition unit 1136 is used to obtain a third loss value according to the predicted feature vector, the central feature vector, and the predicted feature value, where the third loss value represents the loss of the predicted feature value corresponding to the predicted feature image; and the second training unit 1146 is used to train the prediction sub-model according to the third loss value.
  • the loss value obtaining unit 1136 is further configured to: obtain the target feature value according to the distance between the predicted feature vector and the central feature vector; and obtain the third loss value according to the difference between the target feature value and the predicted feature value.
  • the second obtaining unit 1116 is further configured to: obtain feature vectors corresponding to multiple face images, where the multiple face images are face images corresponding to the face identifiers to which the sample face images belong; Determine the central feature vector from the acquired multiple feature vectors.
  • the second obtaining unit 1116 is further configured to: obtain a weight vector corresponding to the face identifier to which the sample face image belongs; and determine the weight vector corresponding to the sample face image as the central feature vector.
  • the feature extraction module 1101 is further configured to perform feature extraction on the template face image to obtain a second feature image corresponding to the template face image and a second feature vector corresponding to the second feature image;
  • the feature value acquisition module 1102 is also used to process the second feature image to obtain the second feature value corresponding to the second feature image.
  • the first feature extraction unit 1111 is also used to call the feature extraction sub-model in the face recognition model to perform feature extraction on the template face image to obtain the second feature image corresponding to the template face image and the second feature vector corresponding to the second feature image.
  • the feature value obtaining unit 1112 is further configured to call the predictive sub-model in the face recognition model to process the second feature image to obtain the second feature value corresponding to the second feature image.
  • the feature extraction sub-model includes a feature extraction layer and a feature mapping layer.
  • the first feature extraction unit 1111 is also used to: call the feature extraction layer to perform feature extraction on the template face image to obtain the second feature image corresponding to the template face image; and call the feature mapping layer to perform feature mapping on the second feature image to obtain the second feature vector corresponding to the second feature image.
  • It should be noted that, when the face recognition device provided in the above embodiment performs face recognition,
  • only the division of the above functional modules is used as an example for illustration. In practical applications, the above functions may be allocated to and completed by different functional modules as needed, that is, the internal structure of the computer device is divided into different functional modules to complete all or part of the functions described above.
  • the face recognition device provided in the foregoing embodiment and the face recognition method embodiment belong to the same concept, and the specific implementation process is detailed in the method embodiment, and will not be repeated here.
  • FIG. 13 shows a schematic structural diagram of a terminal 1300 provided by an exemplary embodiment of the present application.
  • the terminal 1300 can be used to execute the steps executed by the computer device in the aforementioned face recognition method.
  • the terminal 1300 includes a processor 1301 and a memory 1302.
  • the processor 1301 includes one or more processing cores, such as a 4-core processor, an 8-core processor, and so on.
  • the processor 1301 is implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array).
  • the processor 1301 includes a main processor and a coprocessor.
  • the main processor is a processor used to process data in the awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor used to process data in the standby state.
  • In some embodiments, the processor 1301 is integrated with a GPU (Graphics Processing Unit, image processor).
  • the GPU is used to render and draw content that needs to be displayed on the display screen.
  • the processor 1301 further includes an AI (Artificial Intelligence) processor, and the AI processor is used to process computing operations related to machine learning.
  • the memory 1302 includes one or more computer-readable storage media, which are non-transitory.
  • the memory 1302 also includes high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices and flash memory storage devices.
  • the non-transitory computer-readable storage medium in the memory 1302 is used to store at least one instruction, and the at least one instruction is used by the processor 1301 to implement what is executed by the computer device in the above-mentioned face recognition method. Method steps.
  • the terminal 1300 may optionally further include: a peripheral device interface 1303 and at least one peripheral device.
  • the processor 1301, the memory 1302, and the peripheral device interface 1303 are connected by a bus or a signal line.
  • Each peripheral device is connected to the peripheral device interface 1303 through a bus, a signal line or a circuit board.
  • the peripheral device includes: a camera assembly 1304.
  • the camera assembly 1304 is used to capture images or videos.
  • the camera assembly 1304 includes a front camera and a rear camera.
  • the front camera is set on the front panel of the terminal 1300
  • the rear camera is set on the back of the terminal 1300.
  • In some embodiments, there are at least two rear cameras, each of which is any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so as to realize the background blur function through fusion of the main camera and the depth-of-field camera, and to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions through fusion of the main camera and the wide-angle camera.
  • the camera assembly 1304 also includes a flash.
  • the flash is a single-color temperature flash or a dual-color temperature flash. Dual color temperature flash refers to a combination of warm light flash and cold light flash, which can be used for light compensation under different color temperatures.
  • the structure shown in FIG. 13 does not constitute a limitation on the terminal 1300.
  • the terminal 1300 may include more or fewer components than shown in the figure, or combine some components, or use a different component arrangement.
  • FIG. 14 is a schematic structural diagram of a server provided by an embodiment of the present application.
  • the server 1400 may vary greatly due to differences in configuration or performance, and includes one or more processors (Central Processing Units, CPU) 1401 and one or more memories 1402, where at least one instruction is stored in the memory 1402, and the at least one instruction is loaded and executed by the processor 1401 to implement the methods provided in the foregoing method embodiments.
  • the server also has components such as a wired or wireless network interface, a keyboard, and an input/output interface, and the server also includes other components for implementing device functions, which will not be repeated here.
  • the server 1400 can be used to execute the steps performed by the computer device in the aforementioned face recognition method.
  • the embodiment of the present application also provides a computer device for face recognition.
  • the computer device includes a processor and a memory.
  • the memory stores at least one instruction, and the at least one instruction is loaded and executed by the processor to implement the following method steps of the face recognition method:
  • feature extraction is performed on the target face image to obtain a first feature image corresponding to the target face image and a first feature vector corresponding to the first feature image, and the first feature image is used to represent the facial features of the target face image;
  • the first feature image is processed to obtain the first feature value corresponding to the first feature image, the first feature value is used to indicate the uncertainty corresponding to the first feature image, and the uncertainty refers to the degree of difference between the facial features included in the first feature image and the facial features in the target face image;
  • according to the first feature vector, the first feature value, a second feature vector, and a second feature value, the similarity between the target face image and the template face image is obtained, where the second feature vector is the feature vector corresponding to the second feature image of the template face image, the second feature image is used to represent the facial features of the template face image, the second feature value is the feature value corresponding to the second feature image, the second feature value is used to represent the uncertainty corresponding to the second feature image, and the uncertainty corresponding to the second feature image refers to the degree of difference between the facial features included in the second feature image and the facial features in the template face image;
  • when the similarity is greater than the preset threshold, it is determined that the target face image matches the template face image.
  • the feature extraction is performed on the target face image to obtain the first feature image corresponding to the target face image and the first feature vector corresponding to the first feature image, including:
  • the feature extraction sub-model in the face recognition model is called to perform feature extraction on the target face image to obtain the first feature image corresponding to the target face image and the first feature vector corresponding to the first feature image.
  • processing the first characteristic image to obtain the first characteristic value corresponding to the first characteristic image includes:
  • the prediction sub-model in the face recognition model is called to process the first feature image to obtain the first feature value corresponding to the first feature image.
  • the feature extraction sub-model includes a feature extraction layer and a feature mapping layer.
  • the feature extraction sub-model in the face recognition model is called to perform feature extraction on the target face image to obtain the corresponding target face image.
  • the first feature image and the first feature vector corresponding to the first feature image include:
  • before the prediction sub-model in the face recognition model is called to process the first feature image to obtain the first feature value corresponding to the first feature image, the method further includes:
  • training the feature extraction sub-model according to the sample face image and the sample feature vector corresponding to the sample face image includes:
  • the feature extraction sub-model is called to perform feature extraction on the sample face image to obtain the predicted feature image corresponding to the sample face image and the predicted feature vector corresponding to the predicted feature image;
  • the feature extraction sub-model is trained according to the difference between the predicted feature vector and the sample feature vector.
  • the feature extraction sub-model includes a feature extraction layer and a feature mapping layer.
  • the feature extraction sub-model is called to perform feature extraction on the sample face image to obtain the predicted feature image and predicted feature corresponding to the sample face image.
  • the prediction feature vector corresponding to the image includes:
  • training the feature extraction sub-model according to the difference between the predicted feature vector and the sample feature vector includes:
  • a first loss value between the predicted feature vector and the sample feature vector is obtained, where the first loss value represents the difference between the predicted feature vector and the sample feature vector;
  • the feature extraction sub-model is trained according to the first loss value.
  • the face recognition model also includes a loss acquisition sub-model, and the loss acquisition sub-model includes a weight vector corresponding to each face identifier.
  • training the feature extraction sub-model includes:
  • the loss acquisition sub-model is called, and the predicted feature vector is weighted according to the weight vector corresponding to the face identifier to which the sample face image belongs, to obtain the weighted feature vector corresponding to the predicted feature vector; a second loss value between the weighted feature vector and the sample feature vector is obtained, where the second loss value represents the difference between the weighted feature vector and the sample feature vector;
  • the feature extraction sub-model and the loss acquisition sub-model are trained according to the second loss value.
  • training the prediction sub-model according to the sample feature vector and the center feature vector corresponding to the sample face image includes:
  • the prediction sub-model to process the predicted feature image to obtain the predicted feature value corresponding to the predicted feature image.
  • the predicted feature value is used to represent the uncertainty corresponding to the predicted feature image.
  • the uncertainty corresponding to the predicted feature image refers to the degree of difference between the facial features included in the predicted feature image and the facial features in the sample face image;
  • according to the predicted feature vector, the central feature vector, and the predicted feature value, a third loss value is obtained, where the third loss value represents the loss of the predicted feature value corresponding to the predicted feature image;
  • the prediction sub-model is trained according to the third loss value.
  • obtaining the third loss value according to the predicted feature vector, the center feature vector, and the predicted feature value includes:
  • a target feature value is obtained according to the distance between the predicted feature vector and the central feature vector, and the third loss value is obtained according to the difference between the target feature value and the predicted feature value.
  • obtaining the central feature vector corresponding to the sample face image includes:
  • feature vectors corresponding to multiple face images are obtained, where the multiple face images are face images corresponding to the face identifier to which the sample face image belongs, and the central feature vector is determined based on the obtained multiple feature vectors.
  • obtaining the central feature vector corresponding to the sample face image includes:
  • a weight vector corresponding to the face identifier to which the sample face image belongs is obtained, and the weight vector corresponding to the sample face image is determined as the central feature vector.
  • the method before obtaining the similarity between the target face image and the template face image according to the first feature vector, the first feature value, the second feature vector, and the second feature value, the method also include:
  • the second characteristic image is processed to obtain the second characteristic value corresponding to the second characteristic image.
  • the feature extraction is performed on the template face image to obtain the second feature image corresponding to the template face image and the second feature vector corresponding to the second feature image, including:
  • the feature extraction sub-model in the face recognition model is called to perform feature extraction on the template face image to obtain a second feature image corresponding to the template face image and a second feature vector corresponding to the second feature image.
  • processing the second feature image to obtain the second feature value corresponding to the second feature image includes:
  • the prediction sub-model in the face recognition model is called to process the second feature image to obtain the second feature value corresponding to the second feature image.
  • the feature extraction sub-model includes a feature extraction layer and a feature mapping layer.
  • the feature extraction sub-model in the face recognition model is called to perform feature extraction on the template face image to obtain the corresponding template face image.
  • the second feature image and the second feature vector corresponding to the second feature image include:
  • the embodiment of the present application also provides a computer-readable storage medium, the computer-readable storage medium stores at least one instruction, and the at least one instruction is loaded and executed by a processor to implement the following method steps of the face recognition method:
  • feature extraction is performed on the target face image to obtain a first feature image corresponding to the target face image and a first feature vector corresponding to the first feature image, and the first feature image is used to represent the facial features of the target face image;
  • the first feature image is processed to obtain the first feature value corresponding to the first feature image, the first feature value is used to indicate the uncertainty corresponding to the first feature image, and the uncertainty refers to the degree of difference between the facial features included in the first feature image and the facial features in the target face image;
  • according to the first feature vector, the first feature value, a second feature vector, and a second feature value, the similarity between the target face image and the template face image is obtained, where the second feature vector is the feature vector corresponding to the second feature image of the template face image, the second feature image is used to represent the facial features of the template face image, the second feature value is the feature value corresponding to the second feature image, the second feature value is used to represent the uncertainty corresponding to the second feature image, and the uncertainty corresponding to the second feature image refers to the degree of difference between the facial features included in the second feature image and the facial features in the template face image;
  • when the similarity is greater than the preset threshold, it is determined that the target face image matches the template face image.
  • the feature extraction is performed on the target face image to obtain the first feature image corresponding to the target face image and the first feature vector corresponding to the first feature image, including:
  • the feature extraction sub-model in the face recognition model is called to perform feature extraction on the target face image to obtain the first feature image corresponding to the target face image and the first feature vector corresponding to the first feature image.
  • processing the first characteristic image to obtain the first characteristic value corresponding to the first characteristic image includes:
  • the prediction sub-model in the face recognition model is called to process the first feature image to obtain the first feature value corresponding to the first feature image.
  • the feature extraction sub-model includes a feature extraction layer and a feature mapping layer.
  • the feature extraction sub-model in the face recognition model is called to perform feature extraction on the target face image to obtain the corresponding target face image.
  • the first feature image and the first feature vector corresponding to the first feature image include:
  • before the prediction sub-model in the face recognition model is called to process the first feature image to obtain the first feature value corresponding to the first feature image, the method further includes:
  • training the feature extraction sub-model according to the sample face image and the sample feature vector corresponding to the sample face image includes:
  • the feature extraction sub-model is called to perform feature extraction on the sample face image to obtain the predicted feature image corresponding to the sample face image and the predicted feature vector corresponding to the predicted feature image;
  • the feature extraction sub-model is trained according to the difference between the predicted feature vector and the sample feature vector.
  • the feature extraction sub-model includes a feature extraction layer and a feature mapping layer.
  • the feature extraction sub-model is called to perform feature extraction on the sample face image to obtain the predicted feature image and predicted feature corresponding to the sample face image.
  • the prediction feature vector corresponding to the image includes:
  • training the feature extraction sub-model according to the difference between the predicted feature vector and the sample feature vector includes:
  • a first loss value between the predicted feature vector and the sample feature vector is obtained, where the first loss value represents the difference between the predicted feature vector and the sample feature vector;
  • the feature extraction sub-model is trained according to the first loss value.
  • the face recognition model also includes a loss acquisition sub-model, and the loss acquisition sub-model includes a weight vector corresponding to each face identifier.
  • training the feature extraction sub-model includes:
  • the loss acquisition sub-model is called, and the predicted feature vector is weighted according to the weight vector corresponding to the face identifier to which the sample face image belongs, to obtain the weighted feature vector corresponding to the predicted feature vector;
  • a second loss value between the weighted feature vector and the sample feature vector is obtained, where the second loss value represents the difference between the weighted feature vector and the sample feature vector, and the feature extraction sub-model and the loss acquisition sub-model are trained according to the second loss value.
  • training the prediction sub-model according to the sample feature vector and the center feature vector corresponding to the sample face image includes:
  • the prediction sub-model to process the predicted feature image to obtain the predicted feature value corresponding to the predicted feature image.
  • the predicted feature value is used to represent the uncertainty corresponding to the predicted feature image.
  • the uncertainty corresponding to the predicted feature image refers to the degree of difference between the facial features included in the predicted feature image and the facial features in the sample face image;
  • according to the predicted feature vector, the central feature vector, and the predicted feature value, a third loss value is obtained, where the third loss value represents the loss of the predicted feature value corresponding to the predicted feature image;
  • the prediction sub-model is trained according to the third loss value.
  • obtaining the third loss value according to the predicted feature vector, the center feature vector, and the predicted feature value includes:
  • a target feature value is obtained according to the distance between the predicted feature vector and the central feature vector, and the third loss value is obtained according to the difference between the target feature value and the predicted feature value.
  • obtaining the central feature vector corresponding to the sample face image includes:
  • feature vectors corresponding to multiple face images are obtained, where the multiple face images are face images corresponding to the face identifier to which the sample face image belongs, and the central feature vector is determined based on the obtained multiple feature vectors.
  • obtaining the central feature vector corresponding to the sample face image includes:
  • a weight vector corresponding to the face identifier to which the sample face image belongs is obtained, and the weight vector corresponding to the sample face image is determined as the central feature vector.
  • the method before obtaining the similarity between the target face image and the template face image according to the first feature vector, the first feature value, the second feature vector, and the second feature value, the method also include:
  • the second characteristic image is processed to obtain the second characteristic value corresponding to the second characteristic image.
  • the feature extraction is performed on the template face image to obtain the second feature image corresponding to the template face image and the second feature vector corresponding to the second feature image, including:
  • the feature extraction sub-model in the face recognition model is called to perform feature extraction on the template face image to obtain a second feature image corresponding to the template face image and a second feature vector corresponding to the second feature image.
  • processing the second feature image to obtain the second feature value corresponding to the second feature image includes:
  • the prediction sub-model in the face recognition model is called to process the second feature image to obtain the second feature value corresponding to the second feature image.
  • the feature extraction sub-model includes a feature extraction layer and a feature mapping layer.
  • the feature extraction sub-model in the face recognition model is called to perform feature extraction on the template face image to obtain the corresponding template face image.
  • the second feature image and the second feature vector corresponding to the second feature image include:
  • An embodiment of the present application also provides a computer program, which includes at least one instruction, and the at least one instruction is loaded and executed by a processor to implement the method steps of the aforementioned face recognition method.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Image Analysis (AREA)
  • Collating Specific Patterns (AREA)

Abstract

一种人脸识别方法、装置、计算机设备及存储介质,属于计算机技术领域。该方法包括:获取目标人脸图像对应的第一特征图像以及第一特征图像对应的第一特征向量和第一特征数值,根据第一特征向量、第一特征数值以及模板人脸图像的第二特征图像对应的第二特征向量和第二特征数值,获取目标人脸图像和模板人脸图像之间的相似度,在相似度大于预设阈值的情况下确定目标人脸图像与模板人脸图像匹配。由于在获取相似度时考虑了特征图像的不确定度对相似度的影响,而不是仅考虑特征向量,能够减少人脸图像中存在干扰因素导致特征向量无法准确表示人脸特征的情况,能够提高人脸识别的准确率,能够保证人脸识别进行身份验证的安全性。

Description

人脸识别方法、装置、计算机设备及存储介质
本申请要求于2020年5月22日提交的申请号为2020104388312、发明名称为“人脸识别方法、装置、计算机设备及存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请实施例涉及计算机技术领域,特别涉及一种人脸识别方法、装置、计算机设备及存储介质。
背景技术
人脸识别是基于人的脸部特征进行身份识别的一种生物特征识别技术,随着人工智能技术的飞速发展,基于人工智能的人脸识别在日常生活中的应用越来越广泛,实现了在人脸识别支付、人脸识别登录应用等场景下对用户身份进行监控。在人脸识别支付、人脸识别登录应用等场景下,人脸识别的不准确,将导致身份验证的安全性无法得到保障,因此,如何提高人脸识别的准确性成为一个亟待解决的问题。
发明内容
本申请实施例提供了一种人脸识别方法、装置、计算机设备及存储介质,能够提高人脸识别的准确率。所述技术方案包括如下内容。
一方面,提供了一种人脸识别方法,所述方法包括:对目标人脸图像进行特征提取,得到所述目标人脸图像对应的第一特征图像及所述第一特征图像对应的第一特征向量,所述第一特征图像用于表示所述目标人脸图像的人脸特征;对所述第一特征图像进行处理,得到所述第一特征图像对应的第一特征数值,所述第一特征数值用于表示所述第一特征图像对应的不确定度,所述不确定度是指所述第一特征图像所包括的人脸特征与所述目标人脸图像中的人脸特征之间的差异程度;根据所述第一特征向量、所述第一特征数值、第二特征向量以及第二特征数值,获取所述目标人脸图像和所述模板人脸图像之间的相似度,所述第二特征向量为模板人脸图像的第二特征图像对应的特征向量,所述第二特征数值为所述第二特征图像对应的特征数值,所述第二特征数值用于表示所述第二特征图像对应的不确定度,所述第二特征图像对应的不确定度是指所述第二特征图像所包括的人脸特征与所述模板人脸图像中的人脸特征之间的差异程度;在所述相似度大于预设阈值的情况下,确定所述目标人脸图像与所述模板人脸图像匹配。
另一方面,提供了一种人脸识别装置,所述装置包括:特征提取模块,用于对目标人脸图像进行特征提取,得到所述目标人脸图像对应的第一特征图像及所述第一特征图像对应的第一特征向量,所述第一特征图像用于表示所述目标人脸图像的人脸特征;特征数值获取模块,用于对所述第一特征图像进行处理,得到所述第一特征图像对应的第一特征数值,所述第一特征数值用于表示所述第一特征图像对应的不确定度,所述不确定度是指所述第一特征图像所包括的人脸特征与所述目标人脸图像中的人脸特征之间的差异程度;相似度获取模块,用于根据所述第一特征向量、所述第一特征数值、第二特征向量以及第二特征数值,获取所述目标人脸图像和所述模板人脸图像之间的相似度,所述第二特征向量为模板人脸图像的第二特征图像对应的特征向量,所述第二特征数值为所述第二特征图像对应的特征数值,所述 第二特征数值用于表示所述第二特征图像对应的不确定度,所述第二特征图像对应的不确定度是指所述第二特征图像所包括的人脸特征与所述模板人脸图像中的人脸特征之间的差异程度;确定模块,用于在所述相似度大于预设阈值的情况下,确定所述目标人脸图像与所述模板人脸图像匹配。
另一方面,提供了一种计算机设备,所述计算机设备包括处理器和存储器,所述存储器中存储有至少一条指令,所述至少一条指令由所述处理器加载并执行以实现如所述人脸识别方法。
再一方面,提供了一种计算机可读存储介质,所述计算机可读存储介质中存储有至少一条指令,所述至少一条指令由处理器加载并执行,以实现如所述人脸识别方法。
本申请实施例提供的方法、装置、计算机设备及存储介质,获取目标人脸图像对应的第一特征图像以及第一特征图像对应的第一特征向量和第一特征数值,根据第一特征向量、第一特征数值以及模板人脸图像的第二特征图像对应的第二特征向量和第二特征数值,获取目标人脸图像和模板人脸图像之间的相似度,在相似度大于预设阈值的情况下确定目标人脸图像与模板人脸图像匹配。由于第一特征数值表示第一特征图像对应的不确定度,第二特征数值表示第二特征图像对应的不确定度,而不确定度能够表示特征图像与人脸图像之间的差异程度,因此在获取目标人脸图像和模板人脸图像的相似度时,还考虑了特征图像的不确定度对相似度的影响,而不是仅考虑特征图像对应的特征向量,因此能够有效减少由于人脸图像中存在干扰因素导致特征向量无法准确表示人脸的特征的情况,能够提高人脸识别的准确率,降低人脸识别的误判率。
并且,本申请实施例中,将目标人脸图像的特征映射到超球面空间中,得到该目标人脸图像对应的第一特征图像。由于相比于二维的欧式空间,超球面空间更加符合人脸的特征空间,因此在超球面空间上对人脸进行特征提取,能够使提取到的人脸特征更加准确,能够进一步提高人脸识别的准确率。
并且,获取样本人脸图像和样本人脸图像对应的样本特征向量,调用特征提取子模型提取样本人脸图像的预测特征图像和预测特征向量,根据预测特征向量和样本特征向量之间的差异,训练特征提取子模型。获取样本人脸图像所属人脸标识的中心特征向量,调用预测子模型获取预测特征图像对应的预测特征数值,根据预测特征向量、中心特征向量和预测特征数值获取第三损失值,根据第三损失值训练预测子模型。后续就能够通过包括该特征提取子模型和预测子模型的人脸识别模型,来进行人脸识别,由于引入预测子模型,因此在获取目标人脸图像和模板人脸图像之间的相似度时,还考虑了预测子模型输出的特征数值对相似度的影响,也即是考虑了特征图像的不确定度对相似度的影响,而不是仅考虑特征图像对应的特征向量,因此能够有效减少由于人脸图像中存在干扰因素导致特征向量无法准确表示人脸的特征的情况,能够提高人脸识别的准确率,降低人脸识别的误判率。
并且,根据样本人脸图像和样本人脸图像对应的样本特征向量,训练特征提取子模型,在保持训练后的特征提取子模型不变的情况下,根据样本特征向量和样本人脸图像所属人脸标识的中心特征向量,训练预测子模型。因此在一些实施例中,对人脸识别模型的训练过程分为特征提取子模型的训练阶段和预测子模型的训练阶段,则在特征提取子模型训练好的情况下,通过获取训练该特征提取子模型的样本人脸图像,对预测子模型进行训练,无需重新训练新的特征提取子模型,也无需重新收集样本人脸图像。
附图说明
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请实施例的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还能够根据这些附图获得其他的附图。
图1是本申请实施例提供的一种人脸识别模型的示意图;
图2是本申请实施例提供的另一种人脸识别模型的示意图;
图3是本申请实施例提供的一种人脸识别方法的流程图;
图4是本申请实施例提供的另一种人脸识别方法的流程图;
图5是本申请实施例提供的另一种人脸识别方法的流程图;
图6是本申请实施例提供的一种人脸识别的结果以及相关技术提供的一种人脸识别的结果;
图7是本申请实施例提供的一种人脸识别模型训练方法的流程图;
图8是本申请实施例提供的一种训练模型和部署模型的流程图;
图9是本申请实施例提供的一种训练特征提取子模型的流程图;
图10是本申请实施例提供的一种训练预测子模型的流程图;
图11是本申请实施例提供的一种人脸识别装置的结构示意图;
图12是本申请实施例提供的另一种人脸识别装置的结构示意图;
图13是本申请实施例提供的一种终端的结构示意图;
图14是本申请实施例提供的一种服务器的结构示意图。
具体实施方式
为使本申请实施例的目的、技术方案和优点更加清楚,下面将结合附图对本申请实施方式作进一步地详细描述。
本申请所使用的术语“第一”、“第二”等可在本文中用于描述各种概念,但除非特别说明,这些概念不受这些术语限制。这些术语仅用于将一个概念与另一个概念区分。举例来说,在不脱离本申请的范围的情况下,在一些实施例中,将第一特征图像称为第二特征图像,且类似地,将第二特征图像称为第一特征图像。其中,多个是指两个或者两个以上,例如,多个人脸图像是两个人脸图像、三个人脸图像等任一大于等于二的整数个人脸图像。每个是指至少一个中的每一个,例如,每个人脸标识是指多个人脸标识中的每一个人脸标识,若多个人脸标识为3个人脸标识,则每个人脸标识是指3个人脸标识中的每一个人脸标识。
人工智能(Artificial Intelligence,AI)是利用数字计算机或者数字计算机控制的机器模拟、延伸和扩展人的智能,感知环境、获取知识并使用知识获得最佳结果的理论、方法、技术及应用系统。换句话说,人工智能是计算机科学的一个综合技术,它企图了解智能的实质,并生产出一种新的能以人类智能相似的方式做出反应的智能机器。人工智能也就是研究各种智能机器的设计原理与实现方法,使机器具有感知、推理与决策的功能。
人工智能技术是一门综合学科,涉及领域广泛,既有硬件层面的技术也有软件层面的技术。人工智能基础技术一般包括如传感器、专用人工智能芯片、云计算、分布式存储、大数据处理技术、操作/交互系统、机电一体化等技术。人工智能软件技术包括自然语言处理技术和机器学习。
机器学习(Machine Learning,ML)是一门多领域交叉学科,涉及概率论、统计学、逼近论、凸分析、算法复杂度理论等多门学科。专门研究计算机怎样模拟或实现人类的学习行为,以获取新的知识或技能,重新组织已有的知识结构使之不断改善自身的性能。机器学习是人工智能的核心,是使计算机具有智能的根本途径,其应用遍及人工智能的各个领域。机器学习和深度学习通常包括人工神经网络、置信网络、强化学习、迁移学习、归纳学习、示教学习等技术。
计算机视觉技术(Computer Vision,CV)是一门研究如何使机器“看”的科学,更进一步的说,就是指用摄影机和电脑代替人眼对目标进行识别、跟踪和测量等机器视觉,并进一步做图形处理,使电脑处理成为更适合人眼观察或传送给仪器检测的图像。作为一个科学学科,计算机视觉研究相关的理论和技术,试图建立能够从图像或者多维数据中获取信息的人工智能系统。计算机视觉技术通常包括图像处理、图像识别、图像语义理解、图像检索、视频处理、视频语义理解、视频内容/行为识别、三维物体重建、虚拟现实、增强现实、同步定位与地图构建等技术,还包括常见的人脸识别、指纹识别等生物特征识别技术。
本申请实施例提供的人脸识别方法涉及人工智能技术和计算机视觉技术,通过下述实施例提供的人脸识别方法进行说明。
本申请实施例提供了一种人脸识别方法,该人脸识别方法由计算机设备执行。该计算机设备调用人脸识别模型,实现了对人脸图像中人脸的识别。在一种可能实现方式中,该计算机设备为终端,终端是智能手机、平板电脑、笔记本电脑、台式计算机、智能音箱、智能手表等。在另一种可能实现方式中,该计算机设备为服务器,服务器是独立的物理服务器,或者,服务器是多个物理服务器构成的服务器集群或者分布式系统,又或者,服务器是提供云存储、网络服务、云通信、安全服务、CDN(Content Delivery Network,内容分发网络)、以及大数据和人工智能平台等基础云计算服务的云服务器。
本申请实施例提供的方法,能够应用于人脸识别的任一场景下。
例如,在通过人脸识别进行网上支付的场景下,终端预先存储用户的模板人脸图像,当终端检测到要进行网上支付时,需要对当前用户的身份进行验证,则终端采集当前输入的目标人脸图像,调用本申请实施例提供的人脸识别模型,通过下述步骤401-408,分别对采集到的目标人脸图像和预先存储的模板人脸图像进行处理,获取目标人脸图像和模板人脸图像之间的相似度,在相似度大于预设阈值的情况下,确定目标人脸图像与模板人脸图像匹配,也即是,该目标人脸图像对应的用户是该模板人脸图像对应的用户,则当前用户的身份验证通过,当前用户具备完成网上支付的权限。在相似度不大于预设阈值的情况下,确定目标人脸图像与模板人脸图像不匹配,也即是该目标人脸图像对应的用户不是该模板人脸图像对应的用户,则当前用户的身份验证不通过,网上支付失败。除此之外,在另一些实施例中,本申请实施例提供的人脸识别方法,应用于门禁系统、通过人脸识别完成登录的应用或者需要人脸识别来认证用户身份的其他系统中等,来通过人脸识别对用户的身份进行验证。
在一种可能实现方式中,如图1所示,本申请实施例提供的人脸识别模型11包括:特征提取子模型101和预测子模型102。其中,特征提取子模型101和预测子模型102连接,特征提取子模型101用于提取人脸图像对应的特征图像和特征向量,预测子模型102用于根据特征图像获取对应的特征数值。
在一些实施例中，该特征提取子模型101包括：特征提取层111和特征映射层121。其中，特征提取层111与特征映射层121连接，特征提取层111用于根据人脸图像提取对应的特征图像，特征映射层121用于根据特征图像获取对应的特征向量。
在另一种可能实现方式中,如图2所示,本申请实施例提供的人脸识别模型22包括:特征提取子模型201、预测子模型202和损失获取子模型203。
其中,特征提取子模型201与预测子模型202连接,特征提取子模型201还与损失获取子模型203连接,特征提取子模型201用于提取人脸图像对应的特征图像和特征向量,预测子模型202用于根据特征图像获取对应的特征数值,损失获取子模型203用于根据特征向量获取对应的损失值。
在一些实施例中,该特征提取子模型201包括:特征提取层211和特征映射层221。其中,特征提取层211与特征映射层221连接,特征提取层211用于根据人脸图像提取对应的特征图像,特征映射层221用于根据特征图像获取对应的特征向量。
图3是本申请实施例提供的一种人脸识别方法的流程图。本申请实施例由计算机设备执行,参见图3,该方法包括以下步骤。
301、对目标人脸图像进行特征提取,得到目标人脸图像对应的第一特征图像及第一特征图像对应的第一特征向量。
计算机设备获取到目标人脸图像时,对该目标人脸图像进行特征提取,得到该目标人脸图像对应的第一特征图像,以及该第一特征图像对应的第一特征向量。
其中,第一特征图像是指表示目标人脸图像的人脸特征的图像,例如,人脸图像的人脸特征包括人脸图像的深度特征、纹理特征、色彩特征等。第一特征向量是指表示目标人脸图像的特征的向量,例如,该第一特征向量为多维向量。
302、对第一特征图像进行处理,得到第一特征图像对应的第一特征数值。
当计算机设备获取到目标人脸图像对应的第一特征图像,对该第一特征图像进行处理,得到该第一特征图像对应的第一特征数值。其中,第一特征数值用于表示第一特征图像对应的不确定度。不确定度是指由于处理过程中存在误差,对处理结果的不可信赖的程度,能一定程度上代表第一特征图像能准确描述人脸特征的程度,也即是,第一特征图像对应的不确定度是指第一特征图像所包括的人脸特征与目标人脸图像中的人脸特征之间的差异程度。该第一特征数值越小,表示该第一特征图像描述目标人脸图像中人脸特征的准确程度越大,第一特征图像所包括的人脸特征与目标人脸图像中的人脸特征之间的差异程度越小;该第一特征数值越大,表示该第一特征图像描述该目标人脸图像中人脸特征的准确程度越小,第一特征图像所包括的人脸特征与目标人脸图像中的人脸特征之间的差异程度越大。
303、根据第一特征向量、第一特征数值、第二特征向量以及第二特征数值,获取目标人脸图像和模板人脸图像之间的相似度。
本申请实施例的人脸识别过程,是指对目标人脸图像和模板人脸图像进行识别,以确定该目标人脸图像是否与模板人脸图像匹配,其中模板人脸图像是指预先存储的人脸图像,目标人脸图像是指当前获取的、需要进行人脸识别的图像。为了将目标人脸图像与模板人脸图像进行匹配,计算机设备获取模板人脸图像的第二特征图像对应的第二特征向量以及该第二特征图像对应的第二特征数值,根据第一特征向量、第一特征数值、第二特征向量和第二特征数值,获取目标人脸图像和模板人脸图像之间的相似度。
其中,第二特征图像是指表示模板人脸图像的特征的图像,第二特征向量是指表示模板 人脸图像的特征的向量,也即是,第二特征向量是指模板人脸图像的第二特征图像对应的特征向量。第二特征数值是指第二特征图像对应的特征数值,第二特征数值用于表示第二特征图像对应的不确定度,第二特征图像对应的不确定度代表第二特征图像能准确描述人脸特征的程度,也即是,第二特征图像对应的不确定度是指第二特征图像所包括的人脸特征与模板人脸图像中的人脸特征之间的差异程度。
其中,目标人脸图像和模板人脸图像之间的相似度越大,表示目标人脸图像与模板人脸图像匹配的概率越大,目标人脸图像和模板人脸图像之间的相似度越小,表示目标人脸图像与模板人脸图像匹配的概率越小。
304、在相似度大于预设阈值的情况下,确定目标人脸图像与模板人脸图像匹配。
当计算机设备获取到目标人脸图像和模板人脸图像之间的相似度,将该相似度与预设阈值进行比较,若相似度大于该预设阈值,则确定目标人脸图像与模板人脸图像匹配,则人脸识别通过。若相似度不大于该预设阈值,则确定目标人脸图像与模板人脸图像不匹配,此时继续将目标人脸图像与下一个模板人脸图像进行匹配,直至确定目标人脸图像与某一模板人脸图像匹配,则人脸识别通过,或者,直至确定目标人脸图像与存储的每个模板人脸图像均不匹配,则人脸识别失败。其中,该预设阈值由计算机设备默认设置,或者,该预设阈值由开发人员通过计算机设备自行设置。
本申请实施例提供的方法,获取目标人脸图像对应的第一特征图像,以及第一特征图像对应的第一特征向量和第一特征数值,根据第一特征向量、第一特征数值,以及模板人脸图像的第二特征图像对应的第二特征向量和第二特征数值,获取目标人脸图像和模板人脸图像之间的相似度,在相似度大于预设阈值的情况下,确定目标人脸图像与模板人脸图像匹配。由于第一特征数值表示第一特征图像对应的不确定度,第二特征数值表示第二特征图像对应的不确定度,而不确定度能够表示特征图像与人脸图像之间的差异程度,因此在获取目标人脸图像和模板人脸图像之间的相似度时,还考虑了特征图像的不确定度对相似度的影响,而不是仅考虑特征图像对应的特征向量,因此能够有效减少由于人脸图像中存在干扰因素导致特征向量无法准确表示人脸的特征的情况,能够提高人脸识别的准确率,降低人脸识别的误判率。
图4是本申请实施例提供的另一种人脸识别方法的流程图。本申请实施例由计算机设备执行,参见图4,该方法包括以下步骤。
401、计算机设备调用人脸识别模型中的特征提取层,对目标人脸图像进行特征提取,得到目标人脸图像对应的第一特征图像。
其中,该人脸识别模型为该计算机设备预先训练的模型,或者由其他设备训练好之后上传至该计算机设备中的模型。人脸识别模型的结构以及各部分的功能参见图1,此处不再赘述。
当计算机设备获取到待识别的目标人脸图像时,调用该人脸识别模型中的特征提取层,对该目标人脸图像进行特征提取,得到该目标人脸图像对应的第一特征图像。其中,本申请实施例中的特征提取层,能够将目标人脸图像的特征映射到超球面空间中,得到该目标人脸图像对应的第一特征图像,使得该第一特征图像中所表示的特征符合超球面空间的分布。超球面空间是指高于二维的球面空间,在一些实施例中,该超球面空间的半径由计算机设备默认设置。由于相比于二维的欧式空间,超球面空间更加符合人脸的特征空间,因此在超球面 空间上对人脸图像进行特征提取,能够使提取到的人脸特征更加准确。
在一种可能实现方式中,该特征提取层为卷积神经网络(CNN,Convolutional Neural Network),该卷积神经网络能够执行卷积(Convolution)计算、非线性激活函数(Relu)计算、池化(Pooling)计算等操作。或者,该特征提取层为其他形式的网络,本申请实施例对此不做限定。
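作为示意，下面给出一个特征提取层的最小代码草图（使用PyTorch，网络深度、通道数等均为示例性假设，并非本申请限定的实现），仅用于说明卷积计算、非线性激活函数计算、池化计算如何组合得到特征图像：

```python
import torch
import torch.nn as nn

class FeatureExtractionLayer(nn.Module):
    """示意性的特征提取层：输入人脸图像，输出特征图像（通道×高×宽）。"""
    def __init__(self, out_channels: int = 64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),   # 卷积计算
            nn.ReLU(inplace=True),                         # 非线性激活函数计算
            nn.MaxPool2d(2),                               # 池化计算
            nn.Conv2d(32, out_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )

    def forward(self, face_image: torch.Tensor) -> torch.Tensor:
        # face_image: (batch, 3, H, W) -> 特征图像: (batch, out_channels, H/4, W/4)
        return self.backbone(face_image)
```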
其中,第一特征图像是指表示目标人脸图像的特征的图像,例如,人脸图像的特征包括人脸图像的深度特征、纹理特征、色彩特征等。
在一种可能实现方式中,计算机设备通过配置的摄像头采集当前场景下的人脸图像,将该人脸图像作为目标人脸图像,或者,对人脸图像进行裁剪处理得到目标人脸图像。在一些实施例中,当用户需要进行人脸识别时,触发对人脸识别的操作,计算机设备检测到对人脸识别的触发操作,通过配置的摄像头进行拍摄,从而获取包括人脸的目标人脸图像。
在另一种可能实现方式中,计算机设备获取其他设备上传的目标人脸图像,或者,该计算机设备从其他设备中下载该目标人脸图像,或者,采用其他方式获取该目标人脸图像,本申请实施例对此不做限定。
402、计算机设备调用人脸识别模型中的特征映射层,对第一特征图像进行特征映射,得到第一特征图像对应的第一特征向量。
在一些实施例中,该特征映射层为全连接映射网络,或者,该特征映射层为其他形式的网络,本申请实施例对此不做限定。
当计算机设备获取到目标人脸图像对应的第一特征图像,则调用人脸识别模型中的特征映射层,对该第一特征图像进行特征映射,得到该第一特征图像对应的第一特征向量。其中,该第一特征向量由该第一特征图像映射得到,该第一特征向量是指用于表示目标人脸图像的特征的向量,在一些实施例中,该第一特征向量为多维向量,例如该第一特征向量为1×n维的向量,则第一特征向量中包括n个维度的特征值。
需要说明的是,本申请实施例中,人脸识别模型中的特征提取子模型包括特征提取层和特征映射层,因此上述步骤401-402,以特征提取层对目标人脸图像进行处理,以及特征映射层对第一特征图像进行处理为例,说明得到目标人脸图像对应的第一特征图像,以及第一特征图像对应的第一特征向量的过程。而在另一实施例中,特征提取子模型为其他形式的子模型,使得通过调用该特征提取子模型对目标人脸图像进行特征提取,能够得到第一特征图像以及第一特征向量即可。
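与上述特征提取层配合，特征映射层可示意性地写成一个全连接映射网络，将特征图像展平后映射为1×n维的特征向量。以下仅为草图，其中维数n、输入尺寸in_features以及用L2归一化近似“映射到半径为r的超球面”的做法均为假设：

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureMappingLayer(nn.Module):
    """示意性的特征映射层：输入特征图像，输出 1×n 维特征向量。"""
    def __init__(self, in_features: int, n: int = 512, r: float = 1.0):
        super().__init__()
        self.fc = nn.Linear(in_features, n)  # 全连接映射
        self.r = r

    def forward(self, feature_image: torch.Tensor) -> torch.Tensor:
        flat = torch.flatten(feature_image, start_dim=1)  # 展平特征图像
        vec = self.fc(flat)                               # 得到 1×n 维特征向量
        # 假设性的做法：L2 归一化后乘以 r，使特征向量落在半径为 r 的超球面上
        return self.r * F.normalize(vec, dim=1)
```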
403、计算机设备调用人脸识别模型中的预测子模型,对第一特征图像进行处理,得到第一特征图像对应的第一特征数值。
本申请实施例中,该预测子模型与特征提取子模型中的特征提取层连接。在一些实施例中,该预测子模型为卷积神经网络(CNN,Convolutional Neural Network),该卷积神经网络为多个全连接层相连接的网络,或者,该卷积神经网络为ResNet(Residual Net,残差网络)形式的网络等,本申请实施例对此不做限定。
当计算机设备获取到目标人脸图像对应的第一特征图像,则调用该预测子模型对第一特征图像进行处理,得到该第一特征图像对应的第一特征数值。其中,第一特征数值用于表示第一特征图像描述目标人脸图像中人脸特征的不确定度,关于不确定度的相关内容参见上述步骤302。在一些实施例中,该第一特征图像为人脸图像映射在超球面空间的特征图像,该第 一特征图像中所表示的特征符合超球面空间的分布,则该第一特征数值也为符合超球面空间的分布的第一特征数值,用于表示超球面空间的第一特征图像描述目标人脸图像中人脸特征的不确定度。
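预测子模型可示意性地实现为“池化+若干全连接层”的小网络，对特征图像回归出一个标量特征数值。以下草图中的层数、隐藏维数以及用softplus保证输出为正，均为示例性假设：

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PredictionSubModel(nn.Module):
    """示意性的预测子模型：输入特征图像，输出表示不确定度的特征数值。"""
    def __init__(self, in_channels: int = 64, hidden: int = 128):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)   # 将特征图像压缩为通道向量
        self.mlp = nn.Sequential(
            nn.Linear(in_channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 1),
        )

    def forward(self, feature_image: torch.Tensor) -> torch.Tensor:
        x = self.pool(feature_image).flatten(1)
        k = F.softplus(self.mlp(x))           # 输出为正的标量特征数值
        return k.squeeze(-1)
```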
需要说明的是,本申请实施例中,仅以先执行步骤402再执行步骤403为例进行说明,也即是,先获取第一特征图像对应的第一特征向量,再获取第一特征图像对应的第一特征数值。而在另一实施例中,先执行步骤403再执行步骤402,也即是,先获取第一特征图像对应的第一特征数值,再获取第一特征图像对应的第一特征向量。
404、计算机设备调用特征提取层,对模板人脸图像进行特征提取,得到模板人脸图像对应的第二特征图像。
本申请实施例的人脸识别过程,是指对目标人脸图像和模板人脸图像进行识别,以确定该目标人脸图像是否与模板人脸图像匹配。其中,模板人脸图像是指计算机设备预先存储的人脸图像,目标人脸图像是指计算机设备当前获取的、需要进行人脸识别的图像。目标人脸图像与模板人脸图像匹配,是指目标人脸图像中的人脸与模板人脸图像中的人脸属于同一个人。例如,以计算机设备中运行通过人脸识别进行登录的应用为例,当用户在该应用中注册帐号时,该用户输入人脸图像,则计算机设备将该人脸图像作为模板人脸图像并存储下来,后续通过该应用登录该帐号时,就能够根据该模板人脸图像对用户身份进行验证。或者,以通过人脸识别进行网上支付为例,当用户将网上支付的验证方式设置为人脸识别时,该用户输入人脸图像,则计算机设备将该人脸图像作为模板人脸图像并存储下来,后续进行网上支付时,就能够根据该模板人脸图像对用户身份进行验证,或者,以门禁系统通过人脸识别验证用户身份为例,当用户将门禁系统的验证方式设置为人脸识别验证时,该用户输入人脸图像,则门禁系统对应的计算机设备将该人脸图像作为模板人脸并存储下来,后续进行人脸识别验证时,就能够根据该模板人脸图像对用户身份进行验证。
因此,计算机设备获取预先存储的模板人脸图像,调用该人脸识别模型中的特征提取层,对该模板人脸图像进行特征提取,得到该模板人脸图像对应的第二特征图像,其中,第二特征图像是指表示模板人脸图像的特征的图像。其中,本申请实施例中的特征提取层,能够将模板人脸图像的特征映射到超球面空间中,得到该模板人脸图像对应的第二特征图像。该步骤404中的超球面空间与上述步骤401中的超球面空间相同。
405、计算机设备调用特征映射层,对第二特征图像进行特征映射,得到第二特征图像对应的第二特征向量。
当计算机设备获取到模板人脸图像对应的第二特征图像,则调用人脸识别模型中的特征映射层,对该第二特征图像进行特征映射,得到该第二特征图像对应的第二特征向量。其中,该第二特征向量是指用于表示模板人脸图像的特征的向量,在一些实施例中,该第二特征向量为多维向量,例如该第二特征向量为1×n维的向量,则第二特征向量中包括n个维度的特征值。该第二特征向量由该第二特征图像映射得到。
需要说明的是,本申请实施例中人脸识别模型中的特征提取子模型包括特征提取层和特征映射层,因此步骤404-405以特征提取层对模板人脸图像进行处理,以及特征映射层对第二特征图像进行处理为例,说明得到模板人脸图像对应的第二特征图像,以及第二特征图像对应的第二特征向量的过程,而在另一实施例中,特征提取子模型为其他形式的子模型,使得通过该特征提取子模型对模板人脸图像进行特征提取,能够得到第二特征图像以及第二特 征向量即可。
406、计算机设备调用预测子模型,对第二特征图像进行处理,得到第二特征图像对应的第二特征数值。
第二特征数值用于表示第二特征图像描述模板人脸图像中人脸特征的不确定度。该步骤406的实现过程与相关内容和上述步骤403类似,在此不再一一赘述。
需要说明的是,本申请实施例中,仅以先执行步骤405再执行步骤406为例进行说明,也即是,先获取第二特征图像对应的第二特征向量,再获取第二特征图像对应的第二特征数值。而在另一实施例中,先执行步骤406再执行步骤405,也即是,先获取第二特征图像对应的第二特征数值,再获取第二特征图像对应的第二特征向量。
需要说明的是,本申请实施例中,仅以先执行步骤401-403,再执行步骤404-406为例进行说明。而在另一实施例中,先执行步骤404-406,再执行步骤401-403。或者计算机设备在本次人脸识别之前,预先对模板人脸图像进行处理,得到模板人脸图像对应的第二特征向量和第二特征数值,将该第二特征向量和第二特征数值存储下来,则计算机无需再执行上述步骤404-406,直接获取存储的第二特征向量和第二特征数值即可。或者,计算机设备获取到待识别的目标人脸图像后,获取预先存储的模板人脸图像,将该目标人脸图像和模板人脸图像以图像对的形式,同时输入至人脸识别模型中,由该人脸识别模型分别对目标人脸图像和模板人脸图像进行处理,得到第一特征向量、第一特征数值、第二特征向量和第二特征数值。其中,该人脸识别模型的各个子模型能够对目标人脸图像和模板人脸图像进行并行处理,例如,在人脸识别模型中的特征提取模型对目标人脸图像进行处理的同时,该人脸识别模型中的预测子模型能够对模板人脸图像进行处理,由此达到对目标人脸图像和模板人脸图像并行处理的效果,提高人脸识别模型的处理效率。
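对于“预先对模板人脸图像进行处理并存储第二特征向量和第二特征数值”的做法，一个示意性的注册流程草图如下（沿用上文的三个网络草图，函数名与缓存结构均为示例性假设）：

```python
import torch

template_cache = {}  # 人脸标识 -> (第二特征向量, 第二特征数值)

@torch.no_grad()
def enroll_template(face_id, template_image, extraction_layer, mapping_layer, prediction_model):
    """注册阶段：对模板人脸图像做一次前向计算，缓存第二特征向量和第二特征数值。"""
    feature_image = extraction_layer(template_image)     # 第二特征图像
    feature_vector = mapping_layer(feature_image)         # 第二特征向量
    feature_value = prediction_model(feature_image)        # 第二特征数值
    template_cache[face_id] = (feature_vector, feature_value)
```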
407、计算机设备根据第一特征向量、第一特征数值、第二特征向量以及第二特征数值,获取目标人脸图像和模板人脸图像之间的相似度。
当计算机设备获取到第一特征向量、第一特征数值、第二特征向量和第二特征数值,则根据该第一特征向量、第一特征数值、第二特征向量和第二特征数值,获取目标人脸图像和模板人脸图像之间的相似度。其中,目标人脸图像和模板人脸图像之间的相似度越大,表示目标人脸图像中的人脸和模板人脸图像中的人脸属于同一个人的概率越大,也即是,目标人脸图像与模板人脸图像匹配的概率越大;目标人脸图像和模板人脸图像之间的相似度越小,表示目标人脸图像中的人脸和模板人脸图像中的人脸属于同一个人的概率越小,也即是,目标人脸图像与模板人脸图像匹配的概率越小。
在一种可能实现方式中，计算机设备采用相似度算法对第一特征向量、第一特征数值、第二特征向量和第二特征数值进行计算，得到目标人脸图像和模板人脸图像之间的相似度。在一些实施例中，该相似度算法的具体公式见原申请附图PCTCN2021085978-appb-000001至appb-000005，其中：Sim表示目标人脸图像和模板人脸图像之间的相似度，k_i表示目标人脸图像对应的第一特征数值，k_j表示模板人脸图像对应的第二特征数值，i和j为正整数，用于指示目标人脸图像或者模板人脸图像；d表示特征映射层输出的特征向量的维数，r表示人脸图像的特征映射到的超球面空间的半径；p=k_i·μ_i+k_j·μ_j，μ_i表示目标人脸图像对应的第一特征向量，μ_j表示模板人脸图像对应的第二特征向量；公式中还用到贝塞尔函数，x表示目标人脸图像，m和α均为该贝塞尔函数的预设参数，m!表示m的阶乘，Γ(·)表示伽玛函数。
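相似度的精确闭式表达式以上述附图公式为准。为便于理解“特征向量与特征数值共同决定相似度”的思路，下面给出一个假设性的计算草图：它借用常见的von Mises-Fisher互似然形式，其中k充当浓度参数、贝塞尔函数通过scipy.special.ive计算；附图公式中的超球面半径r未在此体现，该草图并不等同于本申请附图中的公式：

```python
import numpy as np
from scipy.special import ive  # ive(v, x) = I_v(x) * exp(-x)，指数缩放的第一类修正贝塞尔函数

def log_c_d(kappa: float, d: int) -> float:
    """vMF 归一化常数的对数（示例性形式，仅演示贝塞尔函数的用法）。"""
    v = d / 2.0 - 1.0
    log_iv = np.log(ive(v, kappa)) + kappa          # log I_v(kappa)
    return v * np.log(kappa) - (d / 2.0) * np.log(2.0 * np.pi) - log_iv

def similarity(mu_i: np.ndarray, k_i: float, mu_j: np.ndarray, k_j: float) -> float:
    """假设性的相似度草图：由两个特征向量及其特征数值共同计算得分。"""
    d = mu_i.shape[0]
    p = k_i * mu_i + k_j * mu_j                     # 对应文中的 p = k_i·μ_i + k_j·μ_j
    return log_c_d(k_i, d) + log_c_d(k_j, d) - log_c_d(float(np.linalg.norm(p)), d)
```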
408、计算机设备在相似度大于预设阈值的情况下,确定目标人脸图像与模板人脸图像匹配。
当计算机设备获取到目标人脸图像和模板人脸图像之间的相似度,将该相似度与预设阈值进行比较,若相似度大于该预设阈值,则确定目标人脸图像中的人脸和模板人脸图像中的人脸属于同一个人,也即是目标人脸图像与模板人脸图像匹配。若相似度不大于该预设阈值,则确定目标人脸图像中的人脸和模板人脸图像中的人脸不属于同一个人,也即是目标人脸图像与模板人脸图像不匹配。其中,该预设阈值根据实际应用场景中要求的人脸识别的误报率来确定,或者该预设阈值由计算机设备默认设置,或者该预设阈值由开发人员通过计算机设备自行设置。
在一种可能实现方式中，计算机设备采用判断算法对相似度进行判断，来确定目标人脸图像是否与模板人脸图像匹配。在一些实施例中，该判断算法的具体公式见原申请附图PCTCN2021085978-appb-000006，其中L_out为计算机设备的判断结果，th为预设阈值，即当相似度大于th时判定目标人脸图像与模板人脸图像匹配，否则判定不匹配。
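阈值判断与多模板遍历的流程较为直接，可示意性写作如下草图（沿用上文的similarity草图；templates为“人脸标识到模板特征”的映射，函数名均为示例性假设）：

```python
def match(sim: float, th: float) -> bool:
    """相似度大于预设阈值 th 时，判定目标人脸图像与模板人脸图像匹配。"""
    return sim > th

def identify(mu_t, k_t, templates, th):
    """遍历多个模板（人脸标识 -> (μ_j, k_j)），返回匹配到的人脸标识；均不匹配时返回 None。"""
    for face_id, (mu_j, k_j) in templates.items():
        if match(similarity(mu_t, k_t, mu_j, k_j), th):
            return face_id
    return None
```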
需要说明的是,本申请实施例仅以计算机设备对目标人脸图像与一个模板人脸图片进行识别为例,说明人脸识别的过程。在另一实施例中,计算机设备存储有多个模板人脸图像,则计算机设备获取到待识别的目标人脸图像后,对多个模板人脸图像进行遍历,对遍历的每一个模板人脸图像执行图4所示的实施例的步骤,直至确定目标人脸图像与多个模板人脸图像中的某一模板人脸图像匹配,或者,直至确定目标人脸图像与多个模板人脸图像中的任一模板人脸图像均不匹配。
图5为本申请实施例提供的另一种人脸识别方法的流程图,参见图5,将目标人脸图像5101输入特征提取层5201中,得到第一特征图像5103,将模板人脸图像5102输入特征提取层5201中,得到第二特征图像5104。
将第一特征图像5103输入至预测子模型5202中,得到第一特征数值5105,将第二特征图像5104输入至预测子模型5202中,得到第二特征数值5106。
将第一特征图像5103输入至特征映射层5203中,得到第一特征向量5107,将第二特征图像5104输入至特征映射层5203中,得到第二特征向量5108。
根据第一特征数值5105、第二特征数值5106、第一特征向量5107和第二特征向量5108,获取目标人脸图像5101和模板人脸图像5102之间的相似度5109,根据该相似度5109得出识别结果5110,即相似度5109大于预设阈值时,识别结果5110为目标人脸图像5101和模板人脸图像5102匹配;相似度5109不大于预设阈值时,识别结果5110为目标人脸图像5101和模板人脸图像5102不匹配。在一些实施例中,模板人脸图像对应人脸标识,人脸标识用于表示用户的身份,则当识别结果5110为目标人脸图像5101和模板人脸图像5102匹配时,识别结果5110中还包括该模板人脸图像5102对应的人脸标识,来表示该目标人脸图像5101为该人脸标识对应的用户的人脸图像。
相关技术中,在进行人脸识别时,通过调用人脸识别模型,分别提取采集的目标人脸图像对应的第一特征向量和模板人脸图像对应的第二特征向量,根据第一特征向量和第二特征向量获取目标人脸图像和模板人脸图像之间的相似度,根据该相似度确定目标人脸图像与模板人脸图像是否匹配,从而确定人脸识别是否通过。但是,由于人脸图像中存在干扰因素,例如人脸图像中存在遮挡物或者人脸图像本身比较模糊等,导致提取的特征向量不够准确,进而导致人脸识别的准确率较低。
图6为根据本申请实施例提供的方法以及根据相关技术提供的方法进行人脸识别的结果。参见图6,人脸图像601与人脸图像602匹配,人脸图像603与人脸图像604匹配,人脸图像605与人脸图像606匹配,人脸图像607与人脸图像608匹配。相关技术中的预设阈值为0.179,本申请中的预设阈值为-1373.377。
相关技术的方法与本申请实施例的方法在上述四对人脸图像上的识别结果如下：
  • 人脸图像601与人脸图像602：相关技术得到的相似度为cosθ_1=0.127<0.179，判定不匹配，识别错误；本申请实施例得到特征数值k_1=970.013、k_2=412.385，相似度s(x_1,x_2)=-1364.021>-1373.377，判定匹配，识别正确。
  • 人脸图像603与人脸图像604：相关技术得到cosθ_2=0.102<0.179，判定不匹配，识别错误；本申请实施例得到k_3=401.687、k_4=877.605，相似度s(x_3,x_4)=-1368.452>-1373.377，判定匹配，识别正确。
  • 人脸图像605与人脸图像606：相关技术得到cosθ_3=0.154<0.179，判定不匹配，识别错误；本申请实施例得到k_5=1018.599、k_6=565.877，相似度s(x_5,x_6)=-1365.027>-1373.377，判定匹配，识别正确。
  • 人脸图像607与人脸图像608：相关技术得到cosθ_4=0.072<0.179，判定不匹配，识别错误；本申请实施例得到k_7=523.347、k_8=412.226，相似度s(x_7,x_8)=-1367.089>-1373.377，判定匹配，识别正确。
根据上述识别结果能够知道，与相关技术相比，采用本申请实施例提供的方法进行人脸识别，能够提高人脸识别的准确率。
需要说明的是,本申请实施例仅以计算机设备调用人脸识别模型中的特征提取子模型和预测子模型,对图像进行处理为例来说明。在另一实施例中,计算机设备采用其他方式,实现对目标人脸图像进行特征提取,得到目标人脸图像对应的第一特征图像,以及第一特征图像对应的第一特征向量,对第一特征图像进行处理,得到第一特征图像对应的第一特征数值。
本申请实施例提供的方法,调用人脸识别模型中的特征提取子模型和预测子模型,获取第一特征图像对应的第一特征向量和第一特征数值,根据第一特征向量、第一特征数值,以及模板人脸图像对应的第二特征向量和第二特征数值,获取目标人脸图像和模板人脸图像之间的相似度,在相似度大于预设阈值的情况下,确定目标人脸图像与模板人脸图像匹配。由于第一特征数值表示第一特征图像对应的不确定度,第二特征数值表示第二特征图像对应的 不确定度,而不确定度能够表示特征图像与人脸图像之间的差异程度,因此在获取目标人脸图像和模板人脸图像之间的相似度时,还考虑了特征图像的不确定度对相似度的影响,而不是仅考虑特征图像对应的特征向量,因此能够有效减少由于人脸图像中存在干扰因素导致特征向量无法准确表示人脸的特征的情况,能够提高人脸识别的准确率,降低人脸识别的误判率。
并且,本申请实施例中,将目标人脸图像的特征映射到超球面空间中,得到该目标人脸图像对应的第一特征图像。由于相比于二维的欧式空间,超球面空间更加符合人脸的特征空间,因此在超球面空间上对人脸进行特征提取,能够使提取到的人脸特征更加准确,能够进一步提高人脸识别的准确率。
在通过人脸识别模型进行人脸识别之前,需要先训练出人脸识别模型,训练过程详见下述实施例。
图7是本申请实施例提供的一种人脸识别模型训练方法的流程图。本申请实施例由计算机设备执行,参见图7,该方法包括以下步骤。
701、计算机设备获取样本人脸图像和样本人脸图像对应的样本特征向量。
计算机设备获取用于训练人脸识别模型的样本人脸图像,以及样本人脸图像对应的样本特征向量。其中,样本人脸图像为包括人脸的图像,样本人脸图像对应的样本特征向量为用于表示样本人脸图像的特征的向量。例如,该样本特征向量用于表示样本人脸图像所属的人脸标识,以用户1和用户2为例,包括用户1的人脸的任一样本人脸图像对应的样本特征向量均为样本特征向量a,包括用户2的人脸的任一样本人脸图像对应的样本特征向量均为样本特征向量b。
其中,样本人脸图像为计算机设备中预先存储的样本人脸图像,或者由计算机设备从其他设备中下载的样本人脸图像,或者为开发人员或者其他设备上传至该计算机设备中的样本人脸图像。其中,样本人脸图像对应的样本特征向量,为开发人员为样本人脸图像所标注的样本特征向量,或者由其他方式得到的样本特征向量,本申请实施例对此不做限定。
702、计算机设备调用人脸识别模型中的特征提取层,对样本人脸图像进行特征提取,得到样本人脸图像对应的预测特征图像。
人脸识别模型是用于进行人脸识别的模型,该人脸识别模型的结构以及各部分的功能参见图1,此处不再赘述。在一种可能实现方式中,该特征提取层为卷积神经网络,该卷积神经网络能够执行卷积计算、非线性激活函数计算、池化计算等操作。或者,该特征提取层为其他形式的网络,本申请实施例对此不做限定。
当计算机设备获取到样本人脸图像,则调用人脸识别模型中的特征提取层,对该样本人脸图像进行特征提取,得到该样本人脸图像对应的预测特征图像。该预测特征图像是指表示样本人脸图像的特征的图像。
其中,本申请实施例中的特征提取层,将样本人脸图像的特征映射到超球面空间中,得到该样本人脸图像对应的预测特征图像。超球面空间是指高于二维的球面空间。相比于二维的欧式空间,超球面空间更加符合人脸的特征空间,在超球面空间上对人脸进行特征提取,能够使提取到的人脸特征更加准确。
703、计算机设备调用人脸识别模型中的特征映射层,对预测特征图像进行特征映射,得到预测特征图像对应的预测特征向量。
在一些实施例中,该特征映射层为全连接映射网络,或者,该特征映射层为其他形式的网络,本申请实施例对此不做限定。
当计算机设备获取到样本人脸图像对应的预测特征图像,则调用人脸识别模型中的特征映射层,对该预测特征图像进行特征映射,得到该预测特征图像对应的预测特征向量。其中,该预测特征向量是指用于表示样本人脸图像的特征的向量,该预测特征向量由该预测特征图像映射得到。
需要说明的是,本申请实施例中,人脸识别模型中的特征提取子模型包括特征提取层和特征映射层,因此上述步骤702-703,以特征提取层对样本人脸图像进行处理,以及特征映射层对预测特征图像进行处理为例,说明得到样本人脸图像对应的预测特征图像,以及预测特征图像对应的预测特征向量的过程。而在另一实施例中,特征提取子模型为其他形式的子模型,使得通过该特征提取子模型对样本人脸图像进行特征提取,能够得到预测特征图像以及预测特征向量即可。
704、计算机设备根据预测特征向量和样本特征向量之间的差异,训练特征提取子模型。
其中,预测特征向量为通过人脸识别模型预测的、表示样本人脸图像的特征的向量,样本特征向量为真实的、表示样本人脸图像的特征的向量。因此,当计算机设备获取到预测特征向量和样本特征向量,则根据预测特征向量和样本特征向量之间的差异,来训练人脸识别模型中的特征提取子模型,也即是训练特征提取层和特征映射层,以使通过特征提取层和特征映射层得到的预测特征向量,与样本特征向量的差异越来越小。
在一种可能实现方式中,计算机设备获取预测特征向量和样本特征向量之间的第一损失值,根据该第一损失值,训练特征提取子模型。其中,第一损失值表示预测特征向量和样本特征向量之间的差异。
在一些实施例中,计算机设备获取第一损失函数,根据第一损失函数对预测特征向量和样本特征向量进行计算,得到第一损失值。其中,第一损失函数是指用于获取预测特征向量和样本特征向量之间的损失的函数。
在另一种可能实现方式中,人脸识别模型还包括损失获取子模型,损失获取子模型与特征提取子模型连接。该损失获取子模型包括每个人脸标识对应的权重向量。计算机设备调用损失获取子模型,按照样本人脸图像所属人脸标识对应的权重向量,对预测特征向量进行加权处理,得到预测特征向量对应的加权特征向量,获取加权特征向量和样本特征向量之间的第二损失值,根据第二损失值,训练特征提取子模型和损失获取子模型。其中,第二损失值表示加权特征向量和样本特征向量之间的差异。
损失获取子模型用于根据特征向量获取对应的损失值,该损失获取子模型与特征提取子模型连接,本申请实施例中,特征提取子模型包括特征提取层和特征映射层,则该损失获取子模型与该特征提取子模型中的特征映射层连接。在一些实施例中,该损失获取子模型为分类网络,如该分类网络为softmax(逻辑回归)网络或者各类添加margin(差额)类型的softmax网络,或者损失获取子模型为其他形式,本申请实施例对此不做限定。
其中,每个人脸标识对应的权重向量用于表示该人脸标识对应的人脸图像对应的特征向量的权重,在一些实施例中,样本人脸图像对应的预测特征向量为1×n维的向量,则预测特征向量中包括n个维度的特征值。则人脸标识对应的权重向量也为1×n维的向量,权重向量中包括n个维度的权重值,n个维度的权重值分别表示对应的预测特征向量中每个维度的特 征值的权重。
则计算机设备获取到样本人脸图像对应的预测特征图像后,在损失获取子模型包括的多个权重向量中,确定该样本人脸图像所属人脸标识对应的权重向量,调用损失获取子模型,按照该样本人脸图像所属人脸标识对应的权重向量,对预测特征向量进行加权处理,得到预测特征向量对应的加权特征向量。也即是,将预测特征向量中每个维度的特征值,分别与权重向量中对应的权重值进行相乘,得到加权特征向量。在一些实施例中,损失获取子模型还包括第二损失函数。计算机设备获取第二损失函数,根据第二损失函数对加权特征向量和样本特征向量进行计算,得到第二损失值。其中,第二损失函数是指用于获取加权特征向量和样本特征向量之间的损失的函数。
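作为示意，损失获取子模型中“按人脸标识对应的权重向量对预测特征向量逐维加权”的操作可以写成如下草图，其中身份数量、向量维数均为假设；第二损失函数的具体形式文中未限定，这里用均方误差仅作演示：

```python
import torch
import torch.nn as nn

class LossAcquisitionSubModel(nn.Module):
    """示意性的损失获取子模型：为每个人脸标识维护一个 1×n 的权重向量。"""
    def __init__(self, num_identities: int, n: int = 512):
        super().__init__()
        self.weight_vectors = nn.Parameter(torch.randn(num_identities, n))

    def forward(self, predicted_vec: torch.Tensor, face_id: torch.Tensor) -> torch.Tensor:
        w = self.weight_vectors[face_id]   # 样本所属人脸标识对应的权重向量
        return predicted_vec * w           # 逐维相乘得到加权特征向量

# 第二损失值：加权特征向量与样本特征向量之间的差异（此处用 MSE 仅为假设）
second_loss_fn = nn.MSELoss()
```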
在另一种可能实现方式中,计算机设备采用梯度下降法,对特征提取子模型和损失获取子模型进行优化,以达到训练特征提取子模型和损失获取子模型的目的。其中,该梯度下降法为随机梯度下降法、带动量项的随机梯度下降法、Adagrad法(Adaptive Gradient,自适应梯度下降法)等,本申请实施例对此不做限定。
需要说明的是,步骤701-704仅说明根据样本人脸图像获取预测特征向量,根据预测特征向量和样本特征向量之间的差异,训练特征提取子模型和损失获取子模型,由此实现根据样本人脸图像和样本人脸图像对应的样本特征向量,训练特征提取子模型。在另一实施例中计算机设备采用其他方式,实现根据样本人脸图像和样本人脸图像对应的样本特征向量,训练特征提取子模型。
需要说明的是,本申请实施例中仅以根据一个样本特征图像和该样本特征图像对应的样本特征向量,训练特征提取子模型和损失获取子模型为例进行说明。在实际训练过程中,计算机设备会根据多个样本人脸图像,以及该多个样本人脸图像对应的样本特征向量,训练特征提取子模型和损失获取子模型。其中,多个样本人脸图像之间的任两个样本人脸图像所属的人脸标识相同,或者,任两个样本人脸图像所属的人脸标识不同,本申请对此不做限定。
在一种可能实现方式中,计算机设备获取多个样本人脸图像,以及该多个样本人脸图像对应的样本特征向量,将多个样本人脸图像同时输入至人脸识别模型中的特征提取层,由该人脸识别模型分别对多个样本人脸图像进行处理,根据得到的预测特征向量和对应的样本特征向量,训练特征提取子模型和损失获取子模型。其中,该人脸识别模型中的特征提取子模型和损失获取子模型能够对多个样本人脸图像进行并行处理。例如多个样本人脸图像中包括第一样本人脸图像和第二样本人脸图像,在人脸识别模型中的损失获取子模型对第一样本人脸图像进行处理的同时,该人脸识别模型中的特征提取子模型能够对第二样本人脸图像进行处理,由此达到对多个样本人脸图像并行处理的效果,提高人脸识别模型的处理效率。
在另一种可能实现方式中，该人脸识别模型的训练过程对应有终止训练模型的条件，该终止训练模型的条件为模型的迭代次数达到预设次数，或者，模型的损失值小于第一预设数值，本申请实施例对此不做限定。例如，当训练特征提取子模型和损失获取子模型的迭代次数达到预设次数时，完成对特征提取子模型和损失获取子模型的训练。或者，当计算机设备获取的第一损失值或者第二损失值小于第一预设数值时，表示特征提取子模型和损失获取子模型的损失值收敛，则完成对特征提取子模型和损失获取子模型的训练。
需要说明的是,在另一实施例中,计算机设备中预先存储有训练好的特征提取子模型,以及训练该特征提取子模型所用到的样本人脸图像,则计算机设备无需执行上述步骤701-704, 通过获取训练该特征提取子模型所用到的样本人脸图像,执行下述步骤705-707,完成对预测子模型的训练即可。
705、计算机设备获取样本人脸图像对应的中心特征向量。
其中,每个样本人脸图像对应一个人脸标识,每个人脸标识对应一个中心特征向量,该中心特征向量表示人脸标识对应的人脸特征,也即是,该中心特征向量能够用于表示该样本人脸图像中的人脸特征。当计算机设备完成对人脸识别模型中的特征提取子模型和损失获取子模型的训练之后,计算机设备获取该样本人脸图像所属人脸标识的中心特征向量。
在一种可能实现方式中,计算机设备获取样本人脸图像所属人脸标识的多个人脸图像对应的特征向量,根据获取到的多个特征向量,确定中心特征向量。在训练特征提取子模型和损失获取子模型的过程中,计算机设备得到多个人脸图像对应的特征向量,则计算机设备确定该样本特征图像所属人脸标识的多个人脸图像,获取该多个人脸图像对应的多个特征向量,对获取到的该多个特征向量进行取均值操作,得到该样本人脸图像所属人脸标识对应的中心特征向量。
在另一种可能实现方式中,计算机设备获取样本人脸图像所属人脸标识对应的权重向量,将该样本人脸图像对应的权重向量确定为中心特征向量。
损失获取子模型包括每个人脸标识对应的权重向量。在训练特征提取子模型和损失获取子模型的过程中,不断调整损失获取子模型中的每个权重向量,当训练完成时,损失获取子模型中包括训练后的每个权重向量。则计算机设备能够确定该样本人脸图像所属的人脸标识,从损失获取子模型中的多个权重向量中,获取该人脸标识对应的权重向量,将该权重向量确定为该样本人脸图像所属人脸标识对应的中心特征向量。
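第一种获取中心特征向量的方式（对同一人脸标识的多个特征向量取均值）可示意为如下草图；第二种方式则相当于直接取出上文LossAcquisitionSubModel中weight_vectors里该人脸标识对应的那一行：

```python
import torch

def center_feature_vector(feature_vectors: torch.Tensor) -> torch.Tensor:
    """feature_vectors: (m, n)，同一人脸标识的 m 个特征向量；返回其均值作为中心特征向量。"""
    return feature_vectors.mean(dim=0)
```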
706、计算机设备调用预测子模型,对预测特征图像进行处理,得到预测特征图像对应的预测特征数值。
预测特征数值是指预测特征图像所包括的人脸特征与样本人脸图像中的人脸特征之间的差异程度,也即是,该预测特征数值用于表示预测特征图像描述样本人脸图像中人脸特征的不确定度。该步骤706的实现过程与相关内容和上述步骤403类似,在此不再一一赘述。
需要说明的是，本申请实施例仅以先执行步骤705，再执行步骤706为例进行说明，在另一实施例中，先执行步骤706，再执行步骤705。
707、计算机设备根据预测特征向量、中心特征向量和预测特征数值,获取第三损失值,根据第三损失值,训练预测子模型。
当计算机设备获取到样本人脸图像对应的预测特征向量、中心特征向量和预测特征数值,根据预测特征向量、中心特征向量和预测特征数值获取第三损失值,根据该第三损失值训练人脸识别模型中的预测子模型,以使该预测子模型输出的预测特征图像对应的预测特征数值更加准确。其中,第三损失值表示预测特征图像对应的预测特征数值的损失。
在一种可能实现方式中，计算机设备获取第三损失函数，根据第三损失函数对预测特征向量、中心特征向量和预测特征数值进行计算，得到第三损失值。其中，第三损失函数是指用于获取预测特征数值的损失的函数。在一些实施例中，该第三损失函数的公式见原申请附图PCTCN2021085978-appb-000007，其中：L_s表示第三损失值，k表示预测特征数值，r表示人脸图像的特征映射到的超球面空间的半径，μ表示预测特征向量，μ^T表示预测特征向量的转置，w_{x∈c}表示样本人脸图像对应的中心特征向量，x表示当前的样本人脸图像，c表示样本人脸图像所属人脸标识对应的至少一个人脸图像，d表示特征映射层输出的特征向量的维数，公式中还用到贝塞尔函数（见附图PCTCN2021085978-appb-000008）。
在另一种可能实现方式中,计算机设备根据预测特征向量和中心特征向量之间的距离,获取目标特征数值,根据目标特征数值和预测特征数值之间的差异,获取第三损失值。
由于预测特征数值用于表示样本特征图像描述人脸图像中人脸特征的不确定度,而实际应用场景中,计算机设备能够根据特征图像对应的特征向量和特征数值,获取人脸图像之间的相似度。因此,预测特征数值实质是表示样本人脸图像对应的预测特征向量与样本人脸图像对应的中心特征向量相匹配的不确定度,预测特征向量与中心特征向量之间的距离越小,预测特征向量与中心特征向量越相似,也即是预测特征向量与中心特征向量越匹配。
其中,计算机设备能够根据预测特征向量和中心特征向量之间的距离,获取目标特征数值,则该目标特征数值能够表示预测特征向量和中心特征向量相匹配的不确定度。预测特征向量和中心特征向量之间的距离越大,预测特征向量和中心特征向量相匹配的不确定度越大,也即是目标特征数值越大;预测特征向量和中心特征向量之间的距离越小,预测特征向量和中心特征向量相匹配的不确定度越小,也即是目标特征数值越小。
而在实际应用场景中无法得知待识别的人脸图像所属的人脸标识,因此也无法得知人脸图像对应的中心特征向量,因此计算机设备根据特征图像来获取特征数值。因此在预测子模型的训练过程中,需要保证预测子模型得到的预测特征数值,能够表示样本人脸图像对应的预测特征向量与样本人脸图像对应的中心特征向量相匹配的不确定度,也即是,需要保证预测特征数值与目标特征数值之间的差异较小。因此在一些实施例中,计算机设备根据目标特征数值和预测特征数值之间的差异获取第三损失值,根据第三损失值训练预测子模型,以使该目标特征数值和预测特征数值之间的差异越来越小,使该预测子模型输出的预测特征数值越来越准确。
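“由距离得到目标特征数值、再与预测特征数值比较”的思路可写成如下假设性草图；距离到目标特征数值的具体映射以及差异的度量方式文中未给出闭式，下面用线性映射和平方差仅作示意：

```python
import torch

def third_loss(predicted_vec, center_vec, predicted_k, scale: float = 1.0):
    """示意性的第三损失：预测特征向量离中心特征向量越远，目标特征数值越大。"""
    dist = torch.norm(predicted_vec - center_vec, dim=-1)  # 预测特征向量与中心特征向量的距离
    target_k = scale * dist                                 # 假设的单调映射得到目标特征数值
    return ((target_k - predicted_k) ** 2).mean()           # 目标特征数值与预测特征数值的差异
```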
在另一种可能实现方式中,计算机设备采用梯度下降法,对预测子模型进行优化,以达到训练预测子模型的目的。其中,该梯度下降法为随机梯度下降法、带动量项的随机梯度下降法、Adagrad法(Adaptive Gradient,自适应梯度下降法)等,本申请实施例对此不做限定。其中,特征数值的优化梯度,参见如下公式:
具体公式见原申请附图PCTCN2021085978-appb-000009至appb-000011，其中：r表示人脸图像的特征映射到的超球面空间的半径，μ表示样本人脸图像对应的预测特征向量，μ^T表示预测特征向量的转置，W_{x∈c}表示样本人脸图像对应的中心特征向量，x表示当前的样本人脸图像，c表示样本人脸图像所属人脸标识对应的至少一个人脸图像，d表示特征映射层输出的特征向量的维数，公式中同样用到贝塞尔函数。
需要说明的是,通过执行上述步骤705-707,实现了在保持训练后的特征提取子模型不变的情况下,根据样本特征向量和样本人脸图像所属人脸标识的中心特征向量,训练预测子模型。在另一实施例中,计算机设备采用其他方式,根据样本特征向量和中心特征向量,训练预测子模型。
需要说明的是,在另一些实施例中,本申请实施例采用其他具有闭式解的空间分布,对超球面空间的特征分布进行建模,来减少人脸识别模型的训练过程。
本申请实施例提供的方法中,将训练人脸识别模型分为特征提取子模型的训练阶段和预 测子模型的训练阶段。在一种可能实现方式中,将获取相似度的功能封装为相似度获取模块,将比较相似度与预设阈值的功能封装为阈值比较模块,则在一些实施例中,计算机设备将训练好的特征提取子模型、预测子模型以及相似度获取模块和阈值比较模块进行部署,得到人脸识别模型。图8是本申请实施例提供的一种训练模型和部署模型的流程图,参见图8,步骤如下:801、训练特征提取层与特征映射层;802、训练预测子模型;803、将特征提取层、特征映射层、预测子模型、相似度获取模块、阈值比较模块进行组合,构成人脸识别模型。
图9是本申请实施例提供的一种训练特征提取子模型的流程图,其中图9将训练特征提取子模型的步骤分为多个模块来进行说明,参见图9,样本数据准备模块901用于获取样本人脸图像和样本人脸图像对应的样本特征向量;特征提取模块902用于调用特征提取层对样本特征图像进行处理,以得到预测特征图像;特征映射模块903用于调用特征映射层对预测特征图像进行处理,以得到预测特征向量;损失获取模块904用于对预测特征向量和样本特征向量进行处理,以得到损失值。获取到损失值时,判断当前是否满足终止训练模型的条件,若满足,则完成对特征提取子模型的训练,若不满足,则通过优化模块905,对特征提取模块902中的特征提取层的参数,以及特征映射模块903中的特征映射层的参数进行优化。其中,终止训练模型的条件为迭代次数达到预设次数,或者损失值小于预设数值。
图10本申请实施例提供的一种训练预测子模型的流程图,其中图10将训练预测子模型的步骤分为多个模块来进行说明,参见图10,中心特征向量获取模块1001用于获取样本人脸图像所属人脸标识对应的中心特征向量,样本数据准备模块1002用于获取样本人脸图像,特征提取模块1003用于调用特征提取层对样本人脸图像进行处理,以得到预测特征图像,特征映射模块1004用于调用特征映射层对预测特征图像进行处理,以得到预测特征向量,预测模块1005用于调用预测子模型对预测特征图像进行处理,以得到预测特征数值,损失值获取模块1006用于根据中心特征向量、预测特征向量和预测特征数值,获取预测特征数值对应的损失值。获取到损失值时判断当前是否满足终止训练模型的条件,若满足,则完成对预测子模型的训练,若不满足,则通过优化模块1007,对预测模块1005中的预测子模型的参数进行优化。其中,终止训练模型的条件为迭代次数达到预设次数,或者损失值小于预设数值。
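把上述两个训练阶段串起来，保持训练好的特征提取子模型参数不变、只优化预测子模型，示意性的训练循环如下（沿用上文的third_loss草图，优化器、学习率以及center_vectors的组织方式均为示例性假设）：

```python
import torch

def train_prediction_submodel(extraction_layer, mapping_layer, prediction_model,
                              loader, center_vectors, epochs: int = 10):
    """示意性训练循环：冻结特征提取层与特征映射层，仅训练预测子模型。"""
    for p in extraction_layer.parameters():
        p.requires_grad_(False)
    for p in mapping_layer.parameters():
        p.requires_grad_(False)
    optimizer = torch.optim.SGD(prediction_model.parameters(), lr=0.01, momentum=0.9)

    for _ in range(epochs):
        for sample_image, face_id in loader:
            with torch.no_grad():
                feature_image = extraction_layer(sample_image)   # 预测特征图像
                predicted_vec = mapping_layer(feature_image)      # 预测特征向量
            predicted_k = prediction_model(feature_image)          # 预测特征数值
            loss = third_loss(predicted_vec, center_vectors[face_id], predicted_k)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```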
本申请实施例提供的方法,获取样本人脸图像和样本人脸图像对应的样本特征向量,调用特征提取子模型提取样本人脸图像的预测特征图像和预测特征向量,根据预测特征向量和样本特征向量之间的差异,训练特征提取子模型。获取样本人脸图像所属人脸标识的中心特征向量,调用预测子模型获取预测特征图像对应的预测特征数值,根据预测特征向量、中心特征向量和预测特征数值获取第三损失值,根据第三损失值训练预测子模型。后续即能够通过包括该特征提取子模型和预测子模型的人脸识别模型进行人脸识别,由于引入预测子模型,因此在获取目标人脸图像和模板人脸图像之间的相似度时,还考虑了预测子模型输出的特征数值对相似度的影响,也即是考虑了特征图像的不确定度对相似度的影响,而不是仅考虑特征图像对应的特征向量,因此能够有效减少由于人脸图像中存在干扰因素导致特征向量无法准确表示人脸的特征的情况,能够提高人脸识别的准确率,降低人脸识别的误判率。
并且,根据样本人脸图像和样本人脸图像对应的样本特征向量,训练特征提取子模型,在保持训练后的特征提取子模型不变的情况下,根据样本特征向量和样本人脸图像所属人脸标识的中心特征向量,训练预测子模型。因此在一些实施例中,对人脸识别模型的训练过程,分为特征提取子模型的训练阶段和预测子模型的训练阶段,则在特征提取子模型训练好的情 况下,通过获取训练该特征提取子模型的样本人脸图像,对预测子模型进行训练,无需重新训练新的特征提取子模型,也无需重新收集样本人脸图像。
并且,本申请实施例中,将样本人脸图像的特征映射到超球面空间中,得到该样本人脸图像对应的预测特征图像。由于相比于二维的欧式空间,超球面空间更加符合人脸的特征空间,因此在超球面空间上对人脸进行特征提取,能够使提取到的人脸特征更加准确,能够提高训练得到的人脸识别模型进行人脸识别时的准确率。
图11是本申请实施例提供的一种人脸识别装置的结构示意图。参见图11,该装置包括:
特征提取模块1101,用于对目标人脸图像进行特征提取,得到目标人脸图像对应的第一特征图像及第一特征图像对应的第一特征向量,第一特征图像用于表示目标人脸图像的人脸特征;
特征数值获取模块1102,用于对第一特征图像进行处理,得到第一特征图像对应的第一特征数值,第一特征数值用于表示第一特征图像对应的不确定度,不确定度是指第一特征图像所包括的人脸特征与目标人脸图像中的人脸特征之间的差异程度;
相似度获取模块1103,用于根据第一特征向量、第一特征数值、第二特征向量以及第二特征数值,获取目标人脸图像和模板人脸图像之间的相似度,第二特征向量为模板人脸图像的第二特征图像对应的特征向量,第二特征数值为第二特征图像对应的特征数值,第二特征数值用于表示第二特征图像对应的不确定度,第二特征图像对应的不确定度是指第二特征图像所包括的人脸特征与模板人脸图像中的人脸特征之间的差异程度;
确定模块1104,用于在相似度大于预设阈值的情况下,确定目标人脸图像与模板人脸图像匹配。
本申请实施例提供的装置,获取目标人脸图像对应的第一特征图像以及第一特征图像对应的第一特征向量和第一特征数值,根据第一特征向量、第一特征数值以及模板人脸图像的第二特征图像对应的第二特征向量和第二特征数值,获取目标人脸图像和模板人脸图像之间的相似度,在相似度大于预设阈值的情况下确定目标人脸图像与模板人脸图像匹配。由于第一特征数值表示第一特征图像对应的不确定度,第二特征数值表示第二特征图像对应的不确定度,而不确定度能够表示特征图像与人脸图像之间的差异程度,因此在获取目标人脸图像和模板人脸图像之间的相似度时,还考虑了特征图像的不确定度对相似度的影响,而不是仅考虑特征图像对应的特征向量,因此能够有效减少由于人脸图像中存在干扰因素导致特征向量无法准确表示人脸的特征的情况,能够提高人脸识别的准确率,降低人脸识别的误判率。
在一些实施例中,参见图12,特征提取模块1101,包括:第一特征提取单元1111,用于调用人脸识别模型中的特征提取子模型,对目标人脸图像进行特征提取,得到目标人脸图像对应的第一特征图像及第一特征图像对应的第一特征向量。
在一些实施例中,参见图12,特征数值获取模块1102,包括:特征数值获取单元1112,用于调用人脸识别模型中的预测子模型,对第一特征图像进行处理,得到第一特征图像对应的第一特征数值。
在一些实施例中,参见图12,特征提取子模型包括特征提取层和特征映射层,第一特征提取单元1111,用于:调用特征提取层,对目标人脸图像进行特征提取,得到目标人脸图像对应的第一特征图像;调用特征映射层,对第一特征图像进行特征映射,得到第一特征图像对应的第一特征向量。
在一些实施例中,参见图12,装置还包括:第一训练模块1105,用于根据样本人脸图像和样本人脸图像对应的样本特征向量,训练特征提取子模型;第二训练模块1106,用于在保持训练后的特征提取子模型不变的情况下,根据样本特征向量和样本人脸图像对应的中心特征向量,训练预测子模型,中心特征向量表示样本人脸图像所属人脸标识对应的人脸特征。
在一些实施例中,参见图12,第一训练模块1105,包括:第一获取单元1115,用于获取样本人脸图像和样本人脸图像对应的样本特征向量;第二特征提取单元1125,用于调用特征提取子模型,对样本人脸图像进行特征提取,得到样本人脸图像对应的预测特征图像及预测特征图像对应的预测特征向量;第一训练单元1135,用于根据预测特征向量和样本特征向量之间的差异,训练特征提取子模型。
在一些实施例中,参见图12,特征提取子模型包括特征提取层和特征映射层,第二特征提取单元1125,还用于:调用特征提取层,对样本人脸图像进行特征提取,得到样本人脸图像对应的预测特征图像;调用特征映射层,对预测特征图像进行特征映射,得到预测特征图像对应的预测特征向量。
在一些实施例中,参见图12,第一训练单元1135,还用于:获取预测特征向量和样本特征向量之间的第一损失值,第一损失值表示预测特征向量和样本特征向量之间的差异;根据第一损失值,训练特征提取子模型。
在一些实施例中,参见图12,人脸识别模型还包括损失获取子模型,损失获取子模型包括每个人脸标识对应的权重向量,第一训练单元1135,还用于:调用损失获取子模型,按照样本人脸图像所属人脸标识对应的权重向量对预测特征向量进行加权处理,得到预测特征向量对应的加权特征向量;获取加权特征向量和样本特征向量之间的第二损失值,第二损失值表示加权特征向量和样本特征向量之间的差异;根据第二损失值,训练特征提取子模型和损失获取子模型。
在一些实施例中,参见图12,第二训练模块1106,包括:第二获取单元1116,用于获取样本人脸图像对应的中心特征向量,中心特征向量表示人脸标识对应的人脸特征;特征数值获取单元1126,用于调用预测子模型,对预测特征图像进行处理,得到预测特征图像对应的预测特征数值,预测特征数值用于表示预测特征图像对应的不确定度,预测特征图像对应的不确定度是指预测特征图像所包括的人脸特征与样本人脸图像中的人脸特征之间的差异程度;损失值获取单元1136,用于根据预测特征向量、中心特征向量和预测特征数值,获取第三损失值,第三损失值表示预测特征图像对应的预测特征数值的损失;第二训练单元1146,用于根据第三损失值,训练预测子模型。
在一些实施例中,参见图12,损失值获取单元1136,还用于:根据预测特征向量和中心特征向量之间的距离,获取目标特征数值;根据目标特征数值和预测特征数值之间的差异,获取第三损失值。
在一些实施例中,参见图12,第二获取单元1116,还用于:获取多个人脸图像对应的特征向量,多个人脸图像为样本人脸图像所属人脸标识对应的人脸图像;根据获取到的多个特征向量,确定中心特征向量。
在一些实施例中,参见图12,第二获取单元1116还用于:获取样本人脸图像所属人脸标识对应的权重向量;将样本人脸图像对应的权重向量确定为中心特征向量。
在一些实施例中,参见图12,特征提取模块1101,还用于对模板人脸图像进行特征提取, 得到模板人脸图像对应的第二特征图像及第二特征图像对应的第二特征向量;特征数值获取模块1102,还用于对第二特征图像进行处理,得到第二特征图像对应的第二特征数值。
在一些实施例中,参见图12,第一特征提取单元1111,还用于调用人脸识别模型中的特征提取子模型,对模板人脸图像进行特征提取,得到模板人脸图像对应的第二特征图像及第二特征图像对应的第二特征向量。
在一些实施例中,参见图12,特征数值获取单元1112,还用于调用人脸识别模型中的预测子模型,对第二特征图像进行处理,得到第二特征图像对应的第二特征数值。
在一些实施例中,参见图12,特征提取子模型包括特征提取层和特征映射层,第一特征提取单元1111还用于:调用特征提取层,对模板人脸图像进行特征提取,得到模板人脸图像对应的第二特征图像;调用特征映射层,对第二特征图像进行特征映射,得到第二特征图像对应的第二特征向量。
需要说明的是:上述实施例提供的人脸识别装置在进行人脸识别时,仅以上述各功能模块的划分进行举例说明,在实际应用时的一些实施例中,根据需要而将上述功能分配由不同的功能模块完成,即将计算机设备的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。另外,上述实施例提供的人脸识别装置与人脸识别方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。
图13示出了本申请一个示例性实施例提供的终端1300的结构示意图。终端1300能够用于执行上述人脸识别方法中计算机设备所执行的步骤。
通常,终端1300包括有:处理器1301和存储器1302。
处理器1301包括一个或多个处理核心，比如4核心处理器、8核心处理器等。处理器1301采用DSP(Digital Signal Processing,数字信号处理)、FPGA(Field-Programmable Gate Array,现场可编程门阵列)、PLA(Programmable Logic Array,可编程逻辑阵列)中的至少一种硬件形式来实现。处理器1301包括主处理器和协处理器，主处理器是用于对在唤醒状态下的数据进行处理的处理器，也称CPU(Central Processing Unit,中央处理器)；协处理器是用于对在待机状态下的数据进行处理的低功耗处理器。在一些实施例中，处理器1301中集成有GPU(Graphics Processing Unit,图形处理器)，GPU用于负责显示屏所需要显示的内容的渲染和绘制。在一些实施例中，处理器1301还包括AI(Artificial Intelligence,人工智能)处理器，该AI处理器用于处理有关机器学习的计算操作。
存储器1302包括一个或多个计算机可读存储介质，该计算机可读存储介质是非暂态的。存储器1302还包括高速随机存取存储器，以及非易失性存储器，比如一个或多个磁盘存储设备、闪存存储设备。在一些实施例中，存储器1302中的非暂态的计算机可读存储介质用于存储至少一条指令，该至少一条指令由处理器1301加载并执行，以实现上述人脸识别方法中计算机设备所执行的方法步骤。
在一些实施例中,终端1300还可选包括有:外围设备接口1303和至少一个外围设备。处理器1301、存储器1302和外围设备接口1303之间通过总线或信号线相连。各个外围设备通过总线、信号线或电路板与外围设备接口1303相连。具体地,外围设备包括:摄像头组件1304。
摄像头组件1304用于采集图像或视频。在一些实施例中,摄像头组件1304包括前置摄像头和后置摄像头。通常,前置摄像头设置在终端1300的前面板,后置摄像头设置在终端 1300的背面。在一些实施例中,后置摄像头为至少两个,分别为主摄像头、景深摄像头、广角摄像头、长焦摄像头中的任意一种,以实现主摄像头和景深摄像头融合实现背景虚化功能、主摄像头和广角摄像头融合实现全景拍摄以及VR(Virtual Reality,虚拟现实)拍摄功能或者其它融合拍摄功能。在一些实施例中,摄像头组件1304还包括闪光灯。闪光灯是单色温闪光灯或双色温闪光灯。双色温闪光灯是指暖光闪光灯和冷光闪光灯的组合,能够用于不同色温下的光线补偿。
本领域技术人员能够理解,图13中示出的结构并不构成对终端1300的限定,在一些实施例中,该终端1300包括比图示更多或更少的组件,或者组合某些组件,或者采用不同的组件布置。
图14是本申请实施例提供的一种服务器的结构示意图,在一些实施例中,该服务器1400因配置或性能不同而产生比较大的差异,包括一个或一个以上处理器(Central Processing Units,CPU)1401和一个或一个以上的存储器1402,其中,存储器1402中存储有至少一条指令,至少一条指令由处理器1401加载并执行以实现上述各个方法实施例提供的方法。当然,该服务器还具有有线或无线网络接口、键盘以及输入输出接口等部件,以便进行输入输出,该服务器还包括其他用于实现设备功能的部件,在此不做赘述。
服务器1400能够用于执行上述人脸识别方法中计算机设备所执行的步骤。
本申请实施例还提供了一种用于人脸识别的计算机设备,该计算机设备包括处理器和存储器,存储器中存储有至少一条指令,该至少一条指令由处理器加载并执行,以实现下述人脸识别方法的方法步骤:
对目标人脸图像进行特征提取,得到目标人脸图像对应的第一特征图像及第一特征图像对应的第一特征向量,第一特征图像用于表示目标人脸图像的人脸特征;
对第一特征图像进行处理,得到第一特征图像对应的第一特征数值,第一特征数值用于表示第一特征图像对应的不确定度,不确定度是指第一特征图像所包括的人脸特征与目标人脸图像中的人脸特征之间的差异程度;
根据第一特征向量、第一特征数值、第二特征向量以及第二特征数值,获取目标人脸图像和模板人脸图像之间的相似度,第二特征向量为模板人脸图像的第二特征图像对应的特征向量,第二特征图像用于表示模板人脸图像的人脸特征,第二特征数值为第二特征图像对应的特征数值,第二特征数值用于表示第二特征图像对应的不确定度,第二特征图像对应的不确定度是指第二特征图像所包括的人脸特征与模板人脸图像中的人脸特征之间的差异程度;
在相似度大于预设阈值的情况下,确定目标人脸图像与模板人脸图像匹配。
在一种可能的实现方式中,对目标人脸图像进行特征提取,得到目标人脸图像对应的第一特征图像及第一特征图像对应的第一特征向量,包括:
调用人脸识别模型中的特征提取子模型,对目标人脸图像进行特征提取,得到目标人脸图像对应的第一特征图像及第一特征图像对应的第一特征向量。
在一种可能的实现方式中,对第一特征图像进行处理,得到第一特征图像对应的第一特征数值,包括:
调用人脸识别模型中的预测子模型,对第一特征图像进行处理,得到第一特征图像对应的第一特征数值。
在一种可能的实现方式中,特征提取子模型包括特征提取层和特征映射层,调用人脸识 别模型中的特征提取子模型,对目标人脸图像进行特征提取,得到目标人脸图像对应的第一特征图像及第一特征图像对应的第一特征向量,包括:
调用特征提取层,对目标人脸图像进行特征提取,得到目标人脸图像对应的第一特征图像;
调用特征映射层,对第一特征图像进行特征映射,得到第一特征图像对应的第一特征向量。
在一种可能的实现方式中,调用人脸识别模型中的预测子模型,对第一特征图像进行处理,得到第一特征图像对应的第一特征数值之前,该方法还包括:
根据样本人脸图像和样本人脸图像对应的样本特征向量,训练特征提取子模型;
在保持训练后的特征提取子模型不变的情况下,根据样本特征向量和样本人脸图像对应的中心特征向量,训练预测子模型,中心特征向量表示样本人脸图像所属人脸标识对应的人脸特征。
在一种可能的实现方式中,根据样本人脸图像和样本人脸图像对应的样本特征向量,训练特征提取子模型,包括:
获取样本人脸图像和样本人脸图像对应的样本特征向量;
调用特征提取子模型,对样本人脸图像进行特征提取,得到样本人脸图像对应的预测特征图像及预测特征图像对应的预测特征向量;
根据预测特征向量和样本特征向量之间的差异,训练特征提取子模型。
在一种可能的实现方式中,特征提取子模型包括特征提取层和特征映射层,调用特征提取子模型,对样本人脸图像进行特征提取,得到样本人脸图像对应的预测特征图像及预测特征图像对应的预测特征向量,包括:
调用特征提取层,对样本人脸图像进行特征提取,得到样本人脸图像对应的预测特征图像;
调用特征映射层,对预测特征图像进行特征映射,得到预测特征图像对应的预测特征向量。
在一种可能的实现方式中,根据预测特征向量和样本特征向量之间的差异,训练特征提取子模型,包括:
获取预测特征向量和样本特征向量之间的第一损失值,第一损失值表示预测特征向量和样本特征向量之间的差异;
根据第一损失值,训练特征提取子模型。
在一种可能的实现方式中,人脸识别模型还包括损失获取子模型,损失获取子模型包括每个人脸标识对应的权重向量,根据预测特征向量和样本特征向量之间的差异,训练特征提取子模型,包括:
调用损失获取子模型,按照样本人脸图像所属人脸标识对应的权重向量对预测特征向量进行加权处理,得到预测特征向量对应的加权特征向量;
获取加权特征向量和样本特征向量之间的第二损失值,第二损失值表示加权特征向量和样本特征向量之间的差异;
根据第二损失值，训练特征提取子模型和损失获取子模型。
在一种可能的实现方式中,在保持训练后的特征提取子模型不变的情况下,根据样本特 征向量和样本人脸图像对应的中心特征向量,训练预测子模型,包括:
获取样本人脸图像对应的中心特征向量;
调用预测子模型,对预测特征图像进行处理,得到预测特征图像对应的预测特征数值,预测特征数值用于表示预测特征图像对应的不确定度,预测特征图像对应的不确定度是指预测特征图像所包括的人脸特征与样本人脸图像中的人脸特征之间的差异程度;
根据预测特征向量、中心特征向量和预测特征数值,获取第三损失值,第三损失值表示预测特征图像对应的预测特征数值的损失;
根据第三损失值,训练预测子模型。
在一种可能的实现方式中,根据预测特征向量、中心特征向量和预测特征数值,获取第三损失值,包括:
根据预测特征向量和中心特征向量之间的距离,获取目标特征数值;
根据目标特征数值和预测特征数值之间的差异,获取第三损失值。
在一种可能的实现方式中,获取样本人脸图像对应的中心特征向量,包括:
获取多个人脸图像对应的特征向量,多个人脸图像为样本人脸图像所属人脸标识对应的人脸图像;
根据获取到的多个特征向量,确定中心特征向量。
在一种可能的实现方式中,获取样本人脸图像对应的中心特征向量,包括:
获取样本人脸图像所属人脸标识对应的权重向量;
将样本人脸图像对应的权重向量确定为中心特征向量。
在一种可能的实现方式中,根据第一特征向量、第一特征数值、第二特征向量以及第二特征数值,获取目标人脸图像和模板人脸图像之间的相似度之前,该方法还包括:
对模板人脸图像进行特征提取,得到模板人脸图像对应的第二特征图像及第二特征图像对应的第二特征向量;
对第二特征图像进行处理,得到第二特征图像对应的第二特征数值。
在一种可能的实现方式中,对模板人脸图像进行特征提取,得到模板人脸图像对应的第二特征图像及第二特征图像对应的第二特征向量,包括:
调用人脸识别模型中的特征提取子模型,对模板人脸图像进行特征提取,得到模板人脸图像对应的第二特征图像及第二特征图像对应的第二特征向量。
在一种可能的实现方式中,对第二特征图像进行处理,得到第二特征图像对应的第二特征数值,包括:
调用人脸识别模型中的预测子模型,对第二特征图像进行处理,得到第二特征图像对应的第二特征数值。
在一种可能的实现方式中,特征提取子模型包括特征提取层和特征映射层,调用人脸识别模型中的特征提取子模型,对模板人脸图像进行特征提取,得到模板人脸图像对应的第二特征图像及第二特征图像对应的第二特征向量,包括:
调用特征提取层,对模板人脸图像进行特征提取,得到模板人脸图像对应的第二特征图像;
调用特征映射层,对第二特征图像进行特征映射,得到第二特征图像对应的第二特征向量。
本申请实施例还提供了一种计算机可读存储介质,该计算机可读存储介质中存储有至少一条指令,该至少一条指令由处理器加载并执行,以实现下述人脸识别方法的方法步骤:
对目标人脸图像进行特征提取,得到目标人脸图像对应的第一特征图像及第一特征图像对应的第一特征向量,第一特征图像用于表示目标人脸图像的人脸特征;
对第一特征图像进行处理,得到第一特征图像对应的第一特征数值,第一特征数值用于表示第一特征图像对应的不确定度,不确定度是指第一特征图像所包括的人脸特征与目标人脸图像中的人脸特征之间的差异程度;
根据第一特征向量、第一特征数值、第二特征向量以及第二特征数值,获取目标人脸图像和模板人脸图像之间的相似度,第二特征向量为模板人脸图像的第二特征图像对应的特征向量,第二特征图像用于表示模板人脸图像的人脸特征,第二特征数值为第二特征图像对应的特征数值,第二特征数值用于表示第二特征图像对应的不确定度,第二特征图像对应的不确定度是指第二特征图像所包括的人脸特征与模板人脸图像中的人脸特征之间的差异程度;
在相似度大于预设阈值的情况下,确定目标人脸图像与模板人脸图像匹配。
在一种可能的实现方式中,对目标人脸图像进行特征提取,得到目标人脸图像对应的第一特征图像及第一特征图像对应的第一特征向量,包括:
调用人脸识别模型中的特征提取子模型,对目标人脸图像进行特征提取,得到目标人脸图像对应的第一特征图像及第一特征图像对应的第一特征向量。
在一种可能的实现方式中,对第一特征图像进行处理,得到第一特征图像对应的第一特征数值,包括:
调用人脸识别模型中的预测子模型,对第一特征图像进行处理,得到第一特征图像对应的第一特征数值。
在一种可能的实现方式中,特征提取子模型包括特征提取层和特征映射层,调用人脸识别模型中的特征提取子模型,对目标人脸图像进行特征提取,得到目标人脸图像对应的第一特征图像及第一特征图像对应的第一特征向量,包括:
调用特征提取层,对目标人脸图像进行特征提取,得到目标人脸图像对应的第一特征图像;
调用特征映射层,对第一特征图像进行特征映射,得到第一特征图像对应的第一特征向量。
在一种可能的实现方式中,调用人脸识别模型中的预测子模型,对第一特征图像进行处理,得到第一特征图像对应的第一特征数值之前,该方法还包括:
根据样本人脸图像和样本人脸图像对应的样本特征向量,训练特征提取子模型;
在保持训练后的特征提取子模型不变的情况下,根据样本特征向量和样本人脸图像对应的中心特征向量,训练预测子模型,中心特征向量表示样本人脸图像所属人脸标识对应的人脸特征。
在一种可能的实现方式中,根据样本人脸图像和样本人脸图像对应的样本特征向量,训练特征提取子模型,包括:
获取样本人脸图像和样本人脸图像对应的样本特征向量;
调用特征提取子模型,对样本人脸图像进行特征提取,得到样本人脸图像对应的预测特征图像及预测特征图像对应的预测特征向量;
根据预测特征向量和样本特征向量之间的差异,训练特征提取子模型。
在一种可能的实现方式中,特征提取子模型包括特征提取层和特征映射层,调用特征提取子模型,对样本人脸图像进行特征提取,得到样本人脸图像对应的预测特征图像及预测特征图像对应的预测特征向量,包括:
调用特征提取层,对样本人脸图像进行特征提取,得到样本人脸图像对应的预测特征图像;
调用特征映射层,对预测特征图像进行特征映射,得到预测特征图像对应的预测特征向量。
在一种可能的实现方式中,根据预测特征向量和样本特征向量之间的差异,训练特征提取子模型,包括:
获取预测特征向量和样本特征向量之间的第一损失值,第一损失值表示预测特征向量和样本特征向量之间的差异;
根据第一损失值,训练特征提取子模型。
在一种可能的实现方式中,人脸识别模型还包括损失获取子模型,损失获取子模型包括每个人脸标识对应的权重向量,根据预测特征向量和样本特征向量之间的差异,训练特征提取子模型,包括:
调用损失获取子模型,按照样本人脸图像所属人脸标识对应的权重向量对预测特征向量进行加权处理,得到预测特征向量对应的加权特征向量;
获取加权特征向量和样本特征向量之间的第二损失值,第二损失值表示加权特征向量和样本特征向量之间的差异;
根据第二损失值，训练特征提取子模型和损失获取子模型。
在一种可能的实现方式中,在保持训练后的特征提取子模型不变的情况下,根据样本特征向量和样本人脸图像对应的中心特征向量,训练预测子模型,包括:
获取样本人脸图像对应的中心特征向量;
调用预测子模型,对预测特征图像进行处理,得到预测特征图像对应的预测特征数值,预测特征数值用于表示预测特征图像对应的不确定度,预测特征图像对应的不确定度是指预测特征图像所包括的人脸特征与样本人脸图像中的人脸特征之间的差异程度;
根据预测特征向量、中心特征向量和预测特征数值,获取第三损失值,第三损失值表示预测特征图像对应的预测特征数值的损失;
根据第三损失值,训练预测子模型。
在一种可能的实现方式中,根据预测特征向量、中心特征向量和预测特征数值,获取第三损失值,包括:
根据预测特征向量和中心特征向量之间的距离,获取目标特征数值;
根据目标特征数值和预测特征数值之间的差异,获取第三损失值。
在一种可能的实现方式中,获取样本人脸图像对应的中心特征向量,包括:
获取多个人脸图像对应的特征向量,多个人脸图像为样本人脸图像所属人脸标识对应的人脸图像;
根据获取到的多个特征向量,确定中心特征向量。
在一种可能的实现方式中,获取样本人脸图像对应的中心特征向量,包括:
获取样本人脸图像所属人脸标识对应的权重向量;
将样本人脸图像对应的权重向量确定为中心特征向量。
在一种可能的实现方式中,根据第一特征向量、第一特征数值、第二特征向量以及第二特征数值,获取目标人脸图像和模板人脸图像之间的相似度之前,该方法还包括:
对模板人脸图像进行特征提取,得到模板人脸图像对应的第二特征图像及第二特征图像对应的第二特征向量;
对第二特征图像进行处理,得到第二特征图像对应的第二特征数值。
在一种可能的实现方式中,对模板人脸图像进行特征提取,得到模板人脸图像对应的第二特征图像及第二特征图像对应的第二特征向量,包括:
调用人脸识别模型中的特征提取子模型,对模板人脸图像进行特征提取,得到模板人脸图像对应的第二特征图像及第二特征图像对应的第二特征向量。
在一种可能的实现方式中,对第二特征图像进行处理,得到第二特征图像对应的第二特征数值,包括:
调用人脸识别模型中的预测子模型,对第二特征图像进行处理,得到第二特征图像对应的第二特征数值。
在一种可能的实现方式中,特征提取子模型包括特征提取层和特征映射层,调用人脸识别模型中的特征提取子模型,对模板人脸图像进行特征提取,得到模板人脸图像对应的第二特征图像及第二特征图像对应的第二特征向量,包括:
调用特征提取层,对模板人脸图像进行特征提取,得到模板人脸图像对应的第二特征图像;
调用特征映射层,对第二特征图像进行特征映射,得到第二特征图像对应的第二特征向量。
本申请实施例还提供了一种计算机程序,该计算机程序包括至少一条指令,该至少一条指令由处理器加载并执行,以实现上述人脸识别方法的方法步骤。
本领域普通技术人员能够理解实现上述实施例的全部或部分步骤通过硬件来完成,或者,通过程序来指令相关的硬件完成,该程序存储于一种计算机可读存储介质中,上述提到的存储介质是只读存储器、磁盘或光盘等。
以上所述仅为本申请实施例的可选实施例,并不用以限制本申请实施例,凡在本申请实施例的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。

Claims (20)

  1. 一种人脸识别方法,其中,由计算机设备执行,所述方法包括:
    对目标人脸图像进行特征提取,得到所述目标人脸图像对应的第一特征图像及所述第一特征图像对应的第一特征向量,所述第一特征图像用于表示所述目标人脸图像的人脸特征;
    对所述第一特征图像进行处理,得到所述第一特征图像对应的第一特征数值,所述第一特征数值用于表示所述第一特征图像对应的不确定度,所述第一特征图像对应的不确定度是指所述第一特征图像所包括的人脸特征与所述目标人脸图像中的人脸特征之间的差异程度;
    根据所述第一特征向量、所述第一特征数值、第二特征向量以及第二特征数值,获取所述目标人脸图像和所述模板人脸图像之间的相似度,所述第二特征向量为模板人脸图像的第二特征图像对应的特征向量,所述第二特征图像用于表示所述模板人脸图像的人脸特征,所述第二特征数值为所述第二特征图像对应的特征数值,所述第二特征数值用于表示所述第二特征图像对应的不确定度,所述第二特征图像对应的不确定度是指所述第二特征图像所包括的人脸特征与所述模板人脸图像中的人脸特征之间的差异程度;
    在所述相似度大于预设阈值的情况下,确定所述目标人脸图像与所述模板人脸图像匹配。
  2. 根据权利要求1所述的方法,其中,所述对目标人脸图像进行特征提取,得到所述目标人脸图像对应的第一特征图像及所述第一特征图像对应的第一特征向量,包括:
    调用人脸识别模型中的特征提取子模型,对目标人脸图像进行特征提取,得到所述目标人脸图像对应的第一特征图像及所述第一特征图像对应的第一特征向量。
  3. 根据权利要求2所述的方法,其中,所述对所述第一特征图像进行处理,得到所述第一特征图像对应的第一特征数值,包括:
    调用所述人脸识别模型中的预测子模型,对所述第一特征图像进行处理,得到所述第一特征图像对应的第一特征数值。
  4. 根据权利要求2所述的方法,其中,所述特征提取子模型包括特征提取层和特征映射层,所述调用人脸识别模型中的特征提取子模型,对目标人脸图像进行特征提取,得到所述目标人脸图像对应的第一特征图像及所述第一特征图像对应的第一特征向量,包括:
    调用所述特征提取层,对所述目标人脸图像进行特征提取,得到所述目标人脸图像对应的第一特征图像;
    调用所述特征映射层,对所述第一特征图像进行特征映射,得到所述第一特征图像对应的第一特征向量。
  5. 根据权利要求3所述的方法,其中,所述调用所述人脸识别模型中的预测子模型,对所述第一特征图像进行处理,得到所述第一特征图像对应的第一特征数值之前,所述方法还包括:
    根据样本人脸图像和所述样本人脸图像对应的样本特征向量,训练所述特征提取子模型;
    在保持训练后的特征提取子模型不变的情况下,根据所述样本特征向量和所述样本人脸图像对应的中心特征向量,训练所述预测子模型,所述中心特征向量表示所述样本人脸图像所属人脸标识对应的人脸特征。
  6. 根据权利要求5所述的方法,其中,所述根据样本人脸图像和所述样本人脸图像对应的样本特征向量,训练所述特征提取子模型,包括:
    获取所述样本人脸图像和所述样本人脸图像对应的样本特征向量;
    调用所述特征提取子模型,对所述样本人脸图像进行特征提取,得到所述样本人脸图像对应的预测特征图像及所述预测特征图像对应的预测特征向量;
    根据所述预测特征向量和所述样本特征向量之间的差异,训练所述特征提取子模型。
  7. 根据权利要求6所述的方法,其中,所述特征提取子模型包括特征提取层和特征映射层,所述调用所述特征提取子模型,对所述样本人脸图像进行特征提取,得到所述样本人脸图像对应的预测特征图像及所述预测特征图像对应的预测特征向量,包括:
    调用所述特征提取层,对所述样本人脸图像进行特征提取,得到所述样本人脸图像对应的预测特征图像;
    调用所述特征映射层,对所述预测特征图像进行特征映射,得到所述预测特征图像对应的预测特征向量。
  8. 根据权利要求6所述的方法,其中,所述根据所述预测特征向量和所述样本特征向量之间的差异,训练所述特征提取子模型,包括:
    获取所述预测特征向量和所述样本特征向量之间的第一损失值,所述第一损失值表示所述预测特征向量和所述样本特征向量之间的差异;
    根据所述第一损失值,训练所述特征提取子模型。
  9. 根据权利要求6所述的方法,其中,所述人脸识别模型还包括损失获取子模型,所述损失获取子模型包括每个人脸标识对应的权重向量,所述根据所述预测特征向量和所述样本特征向量之间的差异,训练所述特征提取子模型,包括:
    调用所述损失获取子模型,按照所述样本人脸图像所属人脸标识对应的权重向量对所述预测特征向量进行加权处理,得到所述预测特征向量对应的加权特征向量;
    获取所述加权特征向量和所述样本特征向量之间的第二损失值,所述第二损失值表示所述加权特征向量和所述样本特征向量之间的差异;
    根据所述第二损失值,训练所述特征提取子模型和所述损失获取子模型。
  10. 根据权利要求9所述的方法,其中,所述在保持训练后的特征提取子模型不变的情况下,根据所述样本特征向量和所述样本人脸图像对应的中心特征向量,训练所述预测子模型,包括:
    获取所述样本人脸图像对应的中心特征向量;
    调用所述预测子模型,对所述预测特征图像进行处理,得到所述预测特征图像对应的预测特征数值,所述预测特征数值用于表示所述预测特征图像对应的不确定度,所述预测特征图像对应的不确定度是指所述预测特征图像所包括的人脸特征与所述样本人脸图像中的人脸特征之间的差异程度;
    根据所述预测特征向量、所述中心特征向量和所述预测特征数值,获取第三损失值,所述第三损失值表示所述预测特征图像对应的预测特征数值的损失;
    根据所述第三损失值,训练所述预测子模型。
  11. 根据权利要求10所述的方法,其中,所述根据所述预测特征向量、所述中心特征向量和所述预测特征数值,获取第三损失值,包括:
    根据所述预测特征向量和所述中心特征向量之间的距离,获取目标特征数值;
    根据所述目标特征数值和所述预测特征数值之间的差异,获取所述第三损失值。
  12. 根据权利要求10所述的方法,其中,所述获取所述样本人脸图像对应的中心特征向量,包括:
    获取多个人脸图像对应的特征向量,所述多个人脸图像为所述样本人脸图像所属人脸标识对应的人脸图像;
    根据获取到的多个特征向量,确定所述中心特征向量。
  13. 根据权利要求10所述的方法,其中,所述获取所述样本人脸图像对应的中心特征向量,包括:
    获取所述样本人脸图像所属人脸标识对应的权重向量;
    将所述样本人脸图像对应的权重向量确定为所述中心特征向量。
  14. 根据权利要求1所述的方法,其中,所述根据所述第一特征向量、所述第一特征数值、第二特征向量以及第二特征数值,获取所述目标人脸图像和所述模板人脸图像之间的相似度之前,所述方法还包括:
    对所述模板人脸图像进行特征提取,得到所述模板人脸图像对应的第二特征图像及所述第二特征图像对应的第二特征向量;
    对所述第二特征图像进行处理,得到所述第二特征图像对应的第二特征数值。
  15. 根据权利要求14所述的方法,其中,所述对所述模板人脸图像进行特征提取,得到所述模板人脸图像对应的第二特征图像及所述第二特征图像对应的第二特征向量,包括:
    调用人脸识别模型中的特征提取子模型,对所述模板人脸图像进行特征提取,得到所述模板人脸图像对应的第二特征图像及所述第二特征图像对应的第二特征向量。
  16. 根据权利要求15所述的方法,其中,所述对所述第二特征图像进行处理,得到所述第二特征图像对应的第二特征数值,包括:
    调用所述人脸识别模型中的预测子模型,对所述第二特征图像进行处理,得到所述第二特征图像对应的第二特征数值。
  17. 根据权利要求15所述的方法,其中,所述特征提取子模型包括特征提取层和特征映射层,所述调用人脸识别模型中的特征提取子模型,对所述模板人脸图像进行特征提取,得到所述模板人脸图像对应的第二特征图像及所述第二特征图像对应的第二特征向量,包括:
    调用所述特征提取层,对所述模板人脸图像进行特征提取,得到所述模板人脸图像对应的第二特征图像;
    调用所述特征映射层,对所述第二特征图像进行特征映射,得到所述第二特征图像对应的第二特征向量。
  18. 一种人脸识别装置,其中,所述装置包括:
    特征提取模块,用于对目标人脸图像进行特征提取,得到所述目标人脸图像对应的第一特征图像及所述第一特征图像对应的第一特征向量,所述第一特征图像用于表示所述目标人脸图像的人脸特征;
    特征数值获取模块,用于对所述第一特征图像进行处理,得到所述第一特征图像对应的第一特征数值,所述第一特征数值用于表示所述第一特征图像对应的不确定度,所述不确定度是指所述第一特征图像所包括的人脸特征与所述目标人脸图像中的人脸特征之间的差异程 度;
    相似度获取模块,用于根据所述第一特征向量、所述第一特征数值、第二特征向量以及第二特征数值,获取所述目标人脸图像和所述模板人脸图像之间的相似度,所述第二特征向量为模板人脸图像的第二特征图像对应的特征向量,所述第二特征数值为所述第二特征图像对应的特征数值,所述第二特征数值用于表示所述第二特征图像对应的不确定度,所述第二特征图像对应的不确定度是指所述第二特征图像所包括的人脸特征与所述模板人脸图像中的人脸特征之间的差异程度;
    确定模块,用于在所述相似度大于预设阈值的情况下,确定所述目标人脸图像与所述模板人脸图像匹配。
  19. 一种计算机设备,其中,所述计算机设备包括处理器和存储器,所述存储器中存储有至少一条指令,所述至少一条指令由所述处理器加载并执行,以实现如权利要求1至17任一所述的人脸识别方法。
  20. 一种计算机可读存储介质,其中,所述计算机可读存储介质中存储有至少一条指令,所述至少一条指令由处理器加载并执行,以实现如权利要求1至17任一所述的人脸识别方法。
PCT/CN2021/085978 2020-05-22 2021-04-08 人脸识别方法、装置、计算机设备及存储介质 WO2021232985A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/744,260 US11816880B2 (en) 2020-05-22 2022-05-13 Face recognition method and apparatus, computer device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010438831.2A CN111340013B (zh) 2020-05-22 2020-05-22 人脸识别方法、装置、计算机设备及存储介质
CN202010438831.2 2020-05-22

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/744,260 Continuation US11816880B2 (en) 2020-05-22 2022-05-13 Face recognition method and apparatus, computer device, and storage medium

Publications (1)

Publication Number Publication Date
WO2021232985A1 true WO2021232985A1 (zh) 2021-11-25

Family

ID=71184961

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/085978 WO2021232985A1 (zh) 2020-05-22 2021-04-08 人脸识别方法、装置、计算机设备及存储介质

Country Status (3)

Country Link
US (1) US11816880B2 (zh)
CN (1) CN111340013B (zh)
WO (1) WO2021232985A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114580948A (zh) * 2022-03-15 2022-06-03 河北雄安睿天科技有限公司 一种水务年度预算分析系统

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111340013B (zh) * 2020-05-22 2020-09-01 腾讯科技(深圳)有限公司 人脸识别方法、装置、计算机设备及存储介质
CN111985310B (zh) * 2020-07-08 2023-06-30 华南理工大学 一种用于人脸识别的深度卷积神经网络的训练方法
CN111915480B (zh) * 2020-07-16 2023-05-23 抖音视界有限公司 生成特征提取网络的方法、装置、设备和计算机可读介质
CN111767900B (zh) * 2020-07-28 2024-01-26 腾讯科技(深圳)有限公司 人脸活体检测方法、装置、计算机设备及存储介质
CN111931153B (zh) * 2020-10-16 2021-02-19 腾讯科技(深圳)有限公司 基于人工智能的身份验证方法、装置和计算机设备
CN112241764B (zh) * 2020-10-23 2023-08-08 北京百度网讯科技有限公司 图像识别方法、装置、电子设备及存储介质
CN115424330B (zh) * 2022-09-16 2023-08-11 郑州轻工业大学 一种基于dfmn和dsd的单模态人脸活体检测方法
CN117522951B (zh) * 2023-12-29 2024-04-09 深圳市朗诚科技股份有限公司 鱼类监测方法、装置、设备及存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1886748A (zh) * 2003-10-01 2006-12-27 奥森泰克公司 用于手指生物测量处理的方法以及相关的手指生物测量传感器
CN101281598A (zh) * 2008-05-23 2008-10-08 清华大学 基于多部件多特征融合的人脸识别方法
CN104899579A (zh) * 2015-06-29 2015-09-09 小米科技有限责任公司 人脸识别方法和装置
CN109522872A (zh) * 2018-12-04 2019-03-26 西安电子科技大学 一种人脸识别方法、装置、计算机设备及存储介质
US20200110926A1 (en) * 2018-10-03 2020-04-09 Idemia Identity & Security France Parameter training method for a convolutional neural network and method for detecting items of interest visible in an image and for associating items of interest visible in an image
CN111340013A (zh) * 2020-05-22 2020-06-26 腾讯科技(深圳)有限公司 人脸识别方法、装置、计算机设备及存储介质

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008059197A (ja) * 2006-08-30 2008-03-13 Canon Inc 画像照合装置、画像照合方法、コンピュータプログラム及び記憶媒体
CN102254192B (zh) * 2011-07-13 2013-07-31 北京交通大学 基于模糊k近邻的三维模型半自动标注方法及系统
US8873813B2 (en) * 2012-09-17 2014-10-28 Z Advanced Computing, Inc. Application of Z-webs and Z-factors to analytics, search engine, learning, recognition, natural language, and other utilities
JP5787845B2 (ja) * 2012-08-24 2015-09-30 株式会社東芝 画像認識装置、方法、及びプログラム
US9576221B2 (en) * 2014-07-09 2017-02-21 Ditto Labs, Inc. Systems, methods, and devices for image matching and object recognition in images using template image classifiers
US10043058B2 (en) * 2016-03-09 2018-08-07 International Business Machines Corporation Face detection, representation, and recognition
CN107633207B (zh) * 2017-08-17 2018-10-12 平安科技(深圳)有限公司 Au特征识别方法、装置及存储介质
CN108805185B (zh) * 2018-05-29 2023-06-30 腾讯科技(深圳)有限公司 人脸识别方法、装置、存储介质及计算机设备
CN110232678B (zh) * 2019-05-27 2023-04-07 腾讯科技(深圳)有限公司 一种图像不确定度预测方法、装置、设备及存储介质
CN110956079A (zh) * 2019-10-12 2020-04-03 深圳壹账通智能科技有限公司 人脸识别模型构建方法、装置、计算机设备和存储介质

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1886748A (zh) * 2003-10-01 2006-12-27 奥森泰克公司 用于手指生物测量处理的方法以及相关的手指生物测量传感器
CN101281598A (zh) * 2008-05-23 2008-10-08 清华大学 基于多部件多特征融合的人脸识别方法
CN104899579A (zh) * 2015-06-29 2015-09-09 小米科技有限责任公司 人脸识别方法和装置
US20200110926A1 (en) * 2018-10-03 2020-04-09 Idemia Identity & Security France Parameter training method for a convolutional neural network and method for detecting items of interest visible in an image and for associating items of interest visible in an image
CN109522872A (zh) * 2018-12-04 2019-03-26 西安电子科技大学 一种人脸识别方法、装置、计算机设备及存储介质
CN111340013A (zh) * 2020-05-22 2020-06-26 腾讯科技(深圳)有限公司 人脸识别方法、装置、计算机设备及存储介质

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114580948A (zh) * 2022-03-15 2022-06-03 河北雄安睿天科技有限公司 一种水务年度预算分析系统
CN114580948B (zh) * 2022-03-15 2022-11-04 河北雄安睿天科技有限公司 一种水务年度预算分析系统

Also Published As

Publication number Publication date
US11816880B2 (en) 2023-11-14
CN111340013B (zh) 2020-09-01
US20220270348A1 (en) 2022-08-25
CN111340013A (zh) 2020-06-26

Similar Documents

Publication Publication Date Title
WO2021232985A1 (zh) 人脸识别方法、装置、计算机设备及存储介质
CN109902546B (zh) 人脸识别方法、装置及计算机可读介质
US11645506B2 (en) Neural network for skeletons from input images
CN108205655B (zh) 一种关键点预测方法、装置、电子设备及存储介质
CN107704838B (zh) 目标对象的属性识别方法及装置
KR20230021043A (ko) 객체 인식 방법 및 장치, 및 인식기 학습 방법 및 장치
CN109492627B (zh) 一种基于全卷积网络的深度模型的场景文本擦除方法
WO2020199611A1 (zh) 活体检测方法和装置、电子设备及存储介质
CN111444826B (zh) 视频检测方法、装置、存储介质及计算机设备
CN114333078B (zh) 活体检测方法、装置、电子设备及存储介质
Yang et al. PipeNet: Selective modal pipeline of fusion network for multi-modal face anti-spoofing
CN111767900A (zh) 人脸活体检测方法、装置、计算机设备及存储介质
CN111368672A (zh) 一种用于遗传病面部识别模型的构建方法及装置
CN111898561B (zh) 一种人脸认证方法、装置、设备及介质
CN111598168B (zh) 图像分类方法、装置、计算机设备及介质
CN110728319B (zh) 一种图像生成方法、装置以及计算机存储介质
CN115050064A (zh) 人脸活体检测方法、装置、设备及介质
CN113298158A (zh) 数据检测方法、装置、设备及存储介质
CN112818995A (zh) 图像分类方法、装置、电子设备及存储介质
CN117237547B (zh) 图像重建方法、重建模型的处理方法和装置
CN116433955A (zh) 对抗攻击的检测方法及系统
CN117011909A (zh) 人脸识别模型的训练方法、人脸识别的方法和装置
CN117011449A (zh) 三维面部模型的重构方法和装置、存储介质及电子设备
CN115311723A (zh) 活体检测方法、装置及计算机可读存储介质
CN114004265A (zh) 一种模型训练方法及节点设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21808186

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 25.04.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21808186

Country of ref document: EP

Kind code of ref document: A1