CN110866469A - Human face facial features recognition method, device, equipment and medium - Google Patents

Human face facial features recognition method, device, equipment and medium Download PDF

Info

Publication number
CN110866469A
CN110866469A (application CN201911047655.3A)
Authority
CN
China
Prior art keywords
facial
facial features
feature
features
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911047655.3A
Other languages
Chinese (zh)
Inventor
贺珂珂
葛彦昊
汪铖杰
李季檩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201911047655.3A priority Critical patent/CN110866469A/en
Publication of CN110866469A publication Critical patent/CN110866469A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 - Feature extraction; Face representation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 - Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a facial feature recognition method, apparatus, device, and medium. The method comprises the following steps: acquiring a facial feature image to be recognized; extracting facial-feature features from the image through a feature extraction model; determining, based on the similarity between the extracted features and a plurality of standard facial-feature features, the standard features that match the extracted features, where each standard feature is labeled with a corresponding facial-feature attribute type; acquiring the attribute type corresponding to the matched standard features; and taking that attribute type as the attribute type of the facial feature image to be recognized. The feature extraction model corresponds to the network structure used for extracting facial-feature information in a facial feature classification model, which is obtained by machine-learning training on a plurality of facial feature sample images, each carrying classification label information identifying the user it belongs to.

Description

Human face facial features recognition method, device, equipment and medium
Technical Field
The invention relates to the technical field of internet communication, and in particular to a facial feature recognition method, apparatus, device, and medium.
Background
With the development of internet communication technology, various biometric identification technologies have emerged; face recognition is one of them. Face recognition belongs to the field of computer vision and is a biometric technology that identifies a person based on facial feature information. As an important biological characteristic of the human body, the face is unique and difficult to copy, and face images are relatively convenient to acquire, so face recognition plays an important role in many fields.
The facial features of a human face (eyebrows, eyes, ears, nose, and mouth) often have corresponding attribute types. For example, eyebrows may be willow-leaf eyebrows, sword eyebrows, and so on, and eyes may be peach-blossom eyes, phoenix eyes, and so on. Finer-grained recognition of these facial features is therefore significant. However, existing recognition schemes require a large number of facial feature sample images to be labeled with specific attribute types, which causes a heavy labeling workload; for example, each eyebrow sample image must be labeled with its attribute type (e.g., willow-leaf eyebrow or sword eyebrow). In addition, attribute types can be labeled along different classification dimensions (an eyebrow sample image can be labeled by eyebrow shape or by eyebrow thickness), and differences between classification dimensions can make recognition difficult. These problems reduce the efficiency of building a model for identifying facial features. Accordingly, there is a need for a more effective facial feature recognition scheme.
Disclosure of Invention
In order to solve the problems of low efficiency and high cost in building a model for identifying facial features in the prior art, the invention provides a facial feature recognition method, apparatus, device, and medium.
In one aspect, the invention provides a facial feature recognition method, comprising the following steps:
acquiring a facial feature image to be recognized;
extracting facial-feature features from the image to be recognized through a feature extraction model;
determining the standard facial-feature features that match the extracted features based on the similarity between the extracted features and a plurality of standard facial-feature features, where each standard feature is labeled with a corresponding facial-feature attribute type;
acquiring the attribute type corresponding to the matched standard features;
taking that attribute type as the attribute type of the facial feature image to be recognized;
wherein the feature extraction model corresponds to the network structure used for extracting facial-feature information in a facial feature classification model, the classification model is obtained by machine-learning training on a plurality of facial feature sample images, and each sample image carries classification label information identifying the user it belongs to.
Another aspect provides a facial feature recognition apparatus, comprising:
an image acquisition module, for acquiring a facial feature image to be recognized;
a feature extraction module, for extracting facial-feature features from the image to be recognized through a feature extraction model;
a feature matching module, for determining the standard facial-feature features matching the extracted features based on the similarity between the extracted features and a plurality of standard facial-feature features, each standard feature being labeled with a corresponding facial-feature attribute type;
an attribute type acquisition module, for acquiring the attribute type corresponding to the matched standard features;
an attribute type labeling module, for taking that attribute type as the attribute type of the facial feature image to be recognized;
wherein the feature extraction model corresponds to the network structure used for extracting facial-feature information in a facial feature classification model, the classification model is obtained by machine-learning training on a plurality of facial feature sample images, and each sample image carries classification label information identifying the user it belongs to.
Another aspect provides an electronic device comprising a processor and a memory, wherein the memory stores at least one instruction, at least one program, a set of codes, or a set of instructions, which are loaded and executed by the processor to implement the facial feature recognition method as described above.
Another aspect provides a computer readable storage medium having stored therein at least one instruction, at least one program, set of codes, or set of instructions, which is loaded and executed by a processor to implement a facial feature recognition method as described above.
The facial feature recognition method, apparatus, device, and medium provided by the invention have the following technical effects:
A facial feature classification model is obtained through training, and the feature extraction model is constructed from the network structure in that model used for extracting feature information from facial feature images. The feature extraction model then produces the features of the facial feature image to be recognized. Based on the similarity between those features and a set of standard features, the attribute type of the matched standard feature is assigned to the image to be recognized. During training of the underlying model, the sample images do not need to be labeled with specific facial-feature attribute types, which reduces the difficulty and workload of manual labeling. Identifying attribute types by feature comparison allows fine-grained facial-feature information to be obtained accurately.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention; other drawings can be obtained from them by those skilled in the art without creative effort.
FIG. 1 is a schematic diagram of an application environment provided by an embodiment of the invention;
fig. 2 is a schematic flow chart of a face facial feature recognition method according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of acquiring an image of facial features to be recognized according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart of a method for training a facial feature classification model according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating an application scenario of a feature extraction model according to an embodiment of the present invention;
fig. 6 is a schematic diagram of an application scenario of a feature extraction model according to an embodiment of the present invention;
fig. 7 is a schematic diagram of an application scenario of a feature extraction model according to an embodiment of the present invention;
fig. 8 is a block diagram of a facial feature recognition apparatus according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings. The described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art without inventive effort based on these embodiments fall within the scope of the present invention.
It should be noted that the terms "comprises" and "comprising," and any variations thereof, in the description and claims of the present invention and the above-described drawings, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, system, article, or apparatus.
Referring to fig. 1, fig. 1 is a schematic diagram of an application environment according to an embodiment of the present invention, which may include a client 01 and a server 02, where the client and the server are connected through a network. The facial features image to be recognized can be sent to the server by the client. The server carries out image processing on the received facial feature image to be recognized so as to label the corresponding feature attribute type for the facial feature image. It should be noted that fig. 1 is only an example.
Specifically, the client 01 may include a physical device such as a smart phone, a desktop computer, a tablet computer, a notebook computer, a digital assistant, a smart wearable device, etc., or may include software running in the physical device, such as a web page provided by some service providers to the user, or may provide applications provided by the service providers to the user.
Specifically, the server 02 may include a server operating independently, or a distributed server, or a server cluster composed of a plurality of servers. The server 02 may comprise a network communication unit, a processor and a memory, etc. Specifically, the server 02 may provide a background service for the client.
In practical applications, the solution provided by the embodiment of the present invention relates to artificial intelligence, described in the following specific embodiments. Artificial Intelligence (AI) is the study and development of theories, methods, techniques, and application systems for simulating, extending, and expanding human intelligence. AI is a comprehensive discipline spanning a wide range of fields, covering both hardware-level and software-level technology. The basic AI infrastructure generally includes sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. AI software technology mainly includes computer vision, speech processing, natural language processing, and machine learning/deep learning. With continued research and progress, AI technology has been developed and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned and automatic driving, drones, robots, smart medical care, and smart customer service.
Machine Learning (ML) is a multi-disciplinary subject involving probability theory, statistics, approximation theory, convex analysis, and algorithmic complexity theory. It studies how a computer can simulate or realize human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied across all fields of AI. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
A specific embodiment of the facial feature recognition method of the present invention is described below. Fig. 2 is a schematic flow chart of the method. This specification provides the operation steps as described in the embodiment or flow chart, but more or fewer steps may be included based on conventional or non-creative labor. The order of steps recited in the embodiments is only one of many possible execution orders and does not represent the only one; in practice, a system or server product may execute the steps sequentially or in parallel (e.g., in a parallel-processor or multi-threaded environment) according to the methods shown in the embodiments or figures. Specifically, as shown in fig. 2 and 6, the method may include:
s201: acquiring a facial five sense organs image to be recognized;
In the embodiment of the invention, the facial feature image to be recognized may be captured by a device that images and records pictures using optical principles, such as a digital camera or the camera of a terminal device (e.g., the camera of smart glasses or a mobile phone). The acquired images may vary in size, so the acceptable size range of the image to be recognized can be relatively wide. The image may be produced directly by such a device, or obtained by preprocessing a directly captured image; the preprocessing may include denoising, grayscale conversion, and the like.
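As an illustration of the preprocessing mentioned above, the following sketch converts an RGB image to grayscale and applies a simple box-filter denoise in plain NumPy; this is a minimal sketch, and production systems would typically use dedicated library routines instead:

```python
import numpy as np

def to_grayscale(rgb: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 RGB image to grayscale using ITU-R BT.601 luma weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb @ weights

def denoise_box(gray: np.ndarray, k: int = 3) -> np.ndarray:
    """Very simple denoising: mean over a k x k neighborhood (box filter)."""
    padded = np.pad(gray, k // 2, mode="edge")
    out = np.zeros_like(gray, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + gray.shape[0], dx:dx + gray.shape[1]]
    return out / (k * k)
```

Both functions preserve the spatial shape of the input, so they can be chained before the image is passed on to detection.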
In a specific embodiment, as shown in fig. 3, the step of acquiring the image of the facial features to be recognized includes:
s301: collecting an image to be identified;
the image to be recognized may be an image shot by a digital camera or a camera, or the image to be recognized may also be an image captured from a video shot by the digital camera or the camera, or the image to be recognized may also be an image uploaded to a server by a user through a network. The image to be recognized may be a static image or a dynamic image. Of course, the obtaining manner of the image to be recognized may also include other possible manners, which is not limited in this embodiment of the present invention.
S302: carrying out face detection on the image to be recognized to obtain a face image;
Face detection determines whether any given image contains a face by searching the image with an appropriate strategy. Using face detection, images to be recognized that contain no face can be filtered out. For an image that does contain a face, the face region can be cut out to obtain a face image.
S303: carrying out face key point detection on the face image to obtain at least one facial feature image to be recognized;
Face keypoint detection locates the keypoints of the facial features on the face; the features may include at least one of eyebrows, eyes, ears, nose, and mouth. Using keypoint detection, the regions of the face image that contain keypoints can each be cut out, yielding at least one facial feature image to be recognized, while regions without keypoints are discarded. Each resulting image contains an eyebrow, eye, ear, nose, or mouth object. This reduces the interference of redundant features during recognition and reduces the amount of computation.
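The keypoint-based cropping described above can be sketched as follows. The landmark grouping (`FEATURE_GROUPS`) and the `margin` value are illustrative assumptions; real landmark detectors (e.g., 68-point models) define their own index layouts:

```python
import numpy as np

# Hypothetical grouping of landmark indices per facial feature.
FEATURE_GROUPS = {
    "eyebrow": [0, 1, 2],
    "eye":     [3, 4, 5],
    "nose":    [6, 7],
    "mouth":   [8, 9, 10],
}

def crop_feature_regions(face_img, landmarks, margin=2):
    """Cut one sub-image per facial feature from a face image.

    landmarks: sequence of (x, y) keypoints; each group's bounding box,
    expanded by `margin` pixels and clamped to the image, defines the crop.
    """
    h, w = face_img.shape[:2]
    crops = {}
    for name, idx in FEATURE_GROUPS.items():
        pts = np.asarray([landmarks[i] for i in idx])
        x0 = max(int(pts[:, 0].min()) - margin, 0)
        x1 = min(int(pts[:, 0].max()) + margin, w - 1)
        y0 = max(int(pts[:, 1].min()) - margin, 0)
        y1 = min(int(pts[:, 1].max()) + margin, h - 1)
        crops[name] = face_img[y0:y1 + 1, x0:x1 + 1]
    return crops
```

Each crop keeps only the region around one facial feature, matching the goal of reducing redundant features before recognition.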
In practical applications, after video data is acquired, frames containing facial features are extracted from it. Face detection is then performed on each frame to obtain a rectangular face bounding box. Facial-feature keypoints are located within that box, and the coordinate positions of the facial features are determined from the keypoints. The specific eyebrow, eye, ear, nose, or mouth position is derived from those coordinates, yielding the corresponding eyebrow, eye, ear, nose, or mouth image.
S202: extracting facial-feature features from the facial feature image to be recognized through a feature extraction model;
In the embodiment of the invention, the feature extraction model corresponds to the network structure used for extracting feature information from facial feature images in a facial feature classification model; the classification model is obtained by machine-learning training on a plurality of facial feature sample images, each carrying classification label information identifying the user it belongs to.
In a specific embodiment, extracting the facial-feature features from the image to be recognized through a feature extraction model includes: determining the target feature extraction model corresponding to the facial-feature object contained in the image to be recognized; and extracting the features from the image through that target model.
The facial-feature object may be an eyebrow, eye, ear, nose, or mouth object. The target feature extraction model corresponds to the network structure used for feature extraction in the facial feature classification model for the same object. For example, if the object in the image to be recognized is an eyebrow, the target feature extraction model extracts eyebrow features, and the classification model from whose network structure it is built also performs eyebrow classification.
In practical application, a received face image can be cut into at least one facial feature image to be recognized based on the facial-feature objects, and each image is then input into its corresponding target feature extraction model. For example, the eyebrow image is input to the model that extracts eyebrow features, and the eye image to the model that extracts eye features.
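A minimal sketch of routing each cropped facial-feature image to its corresponding target feature extraction model; the extractor callables here are stand-ins for the trained per-feature models and are purely illustrative:

```python
def extract_all_features(crops, extractors):
    """Route each cropped facial-feature image to its own extractor.

    crops: {feature_name: image}, e.g. as produced by keypoint cropping.
    extractors: {feature_name: callable} mapping each facial-feature
    object to the model trained for that object.
    """
    features = {}
    for name, image in crops.items():
        model = extractors[name]      # e.g. eyebrow crop -> eyebrow extractor
        features[name] = model(image)
    return features
```

The dictionary dispatch makes the one-model-per-feature design explicit: adding a new facial-feature object means registering one more extractor.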
In another specific embodiment, as shown in fig. 4 and 5, the training process of the facial feature classification model includes the following steps:
s401: acquiring a facial five sense organ sample image;
The facial feature sample image contains an eyebrow, eye, ear, nose, or mouth object. It may be obtained by cutting a face sample image based on face keypoint detection; the related cutting process described for step S303 applies here and is not repeated. Because the face sample images can also serve to train a face recognition model, the classification label information carried by the facial feature sample images can reuse the identity labels of the face sample images.
Specifically, a first quantity threshold characterizing the number of candidate users and a second quantity threshold characterizing the number of images per candidate user may be determined, and the facial feature sample images acquired according to these two thresholds. Setting these two dimensions, the number of candidate users and the number of images per user, constrains the acquisition of training samples. Further, the face sample images (used for training the face recognition model) may be obtained according to the same thresholds, for example images of N persons with 100 to 300 images per person. The face sample images are then cut based on face keypoint detection to obtain the facial feature sample images, reusing the face sample images and their label information.
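The two-threshold sampling described above might look like the following sketch, where `dataset` maps user identities to their images and the identity doubles as the classification label (consistent with the label reuse described in the text; the names here are illustrative):

```python
def sample_training_images(dataset, first_threshold, second_threshold):
    """Select training samples under two quantity thresholds.

    dataset: {user_id: [image, ...]} with identity labels reused as
    classification labels. Keep at most `first_threshold` users and at
    most `second_threshold` images per user.
    """
    samples = []
    for user_id in sorted(dataset)[:first_threshold]:
        for img in dataset[user_id][:second_threshold]:
            samples.append((img, user_id))   # label = user identity
    return samples
```

In practice the per-user image count would vary within a range (e.g., 100 to 300), but the thresholding logic is the same.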
S402: inputting the facial feature sample images into a neural network model for image classification training;
A size threshold may be determined; each facial feature sample image is resized according to that threshold, and the resized image is input into the neural network model for classification training. For example, the input size may be set to M × N pixels (e.g., 112 × 56 pixels), and the initial size of each sample image adjusted accordingly. The face image in step S303 may likewise be cut according to the size threshold to obtain the facial feature image to be recognized.
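A sketch of resizing an image to a fixed size threshold; nearest-neighbour sampling is used here only for simplicity, while real pipelines would typically use bilinear or bicubic interpolation:

```python
import numpy as np

def resize_nearest(img: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Nearest-neighbour resize to a fixed (out_h, out_w) size threshold."""
    in_h, in_w = img.shape[:2]
    rows = np.arange(out_h) * in_h // out_h   # source row index per output row
    cols = np.arange(out_w) * in_w // out_w   # source column index per output column
    return img[rows][:, cols]
```

For the 112 × 56 example from the text, this would be called as `resize_nearest(sample, 112, 56)` before the sample is fed to the network.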
The network structure of the neural network model comprises a residual network (ResNet18, composed of convolutional layers, pooling layers, and fully connected layers) for extracting facial-feature information, and a classifier (or classification layer) that classifies the facial feature based on that information. The network structure may be adjusted by adding or removing convolutional layers and is not limited to this configuration.
In practical application, an eyebrow sample image meeting the size threshold is input into the neural network model. ResNet18 then produces the feature of the eyebrow sample image, which may be 256-dimensional; it may specifically be the output of the conv5-3 layer (the third convolutional layer within the fifth convolutional block) of ResNet18. The classifier (or classification layer) then classifies based on this feature.
S403: in the training process, adjusting the model parameters of the neural network model until the classification result it outputs matches the classification label information of the input facial feature sample image;
The adjustment of model parameters proceeds as follows: a loss value is obtained from an intermediate value and a label value, where the intermediate value corresponds to the classification result output by the neural network model and the label value corresponds to the classification label of the input facial feature sample image; gradients of the model parameters are obtained from the loss value; and the parameters are updated using those gradients. The loss value arises in the forward phase of training, in which the model samples a batch of images for the current iteration; each image's forward pass produces a loss value whose magnitude reflects how well that image has been learned. In the backpropagation phase, the gradients of the parameters are computed from the loss value and the parameters are adjusted accordingly.
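The forward-loss / gradient / update cycle of S403 can be illustrated with a toy model; a single linear layer with squared error stands in for the actual ResNet18-plus-classifier network, so this is a sketch of the training loop, not of the patent's model:

```python
import numpy as np

def train_step(w, x, y, lr=0.1):
    """One forward/backward/update iteration on a linear model."""
    pred = x @ w                              # forward pass
    loss = float(((pred - y) ** 2).mean())    # loss value for this batch
    grad = 2 * x.T @ (pred - y) / len(x)      # gradient of loss w.r.t. w
    w = w - lr * grad                         # parameter update
    return w, loss
```

Repeating this step drives the loss down, which is the signal that the model's outputs are approaching the labels, the stopping condition described in S403.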
S404: taking the neural network model corresponding to the adjusted model parameters as the facial five sense organs classification model;
The trained facial feature classification model can output, for an image to be predicted, the user the image belongs to; for example, which user the eyebrow in the image belongs to. Because the classification model is obtained by machine-learning training on many labeled facial feature sample images, it has strong generalization ability, which improves the adaptability, reliability, and effectiveness of image classification.
In practical application, the feature extraction model corresponds to the network structure in the facial feature classification model that extracts feature information from facial feature images. Extracting the facial feature from the facial feature image to be recognized through the feature extraction model can be realized as follows: input the facial feature image to be recognized into the facial feature classification model and take the output of an intermediate layer, where the intermediate layer corresponds to the residual network (ResNet18) used in the classification model to extract feature information, specifically the output of the conv5-3 layer. This intermediate output is the facial feature.
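Taking an intermediate layer's output as the feature vector can be sketched as below. The sequential list-of-named-layers representation and the layer names are assumptions for illustration, not the actual ResNet18 implementation:

```python
def extract_features(layers, image, stop_at="conv5-3"):
    """Run the classification model's layers in order, but return the output
    of the named intermediate layer instead of the final classification."""
    x = image
    for name, layer in layers:
        x = layer(x)
        if name == stop_at:
            return x  # this intermediate output is the facial feature
    raise KeyError("layer %r not found in model" % stop_at)
```

The classifier head (everything after the chosen intermediate layer) is simply never executed, so the same trained model serves both classification and feature extraction.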
S203: determining the standard facial feature that matches the facial feature based on the similarity between the facial feature and a plurality of standard facial features, wherein each standard facial feature is labeled with a corresponding facial feature attribute type;
in the embodiment of the present invention, the same facial feature object may correspond to a plurality of standard facial features, each labeled with a corresponding facial feature attribute type. The facial feature objects indicated by different standard features differ from one another, and the object indicated by each standard feature is clearly visible in its associated image. For example, the standard features of the eyebrow object may include a standard feature for the willow-leaf eyebrow, one for the sword eyebrow, one for the arch eyebrow, and one for the high eyebrow, with "willow-leaf eyebrow", "sword eyebrow", "arch eyebrow", and "high eyebrow" respectively serving as the labeled attribute types of the corresponding standard features. To build the standard features, a standard facial feature image carrying a standard facial feature object is obtained and input into the feature extraction model, producing the standard facial feature; each standard facial feature image carries a type label describing the corresponding attribute type. Standard facial features may, of course, be defined along different dimensions, such as race, gender, shape, color, and so forth.
The similarity may be determined using at least one of the Euclidean distance (the real distance between two points in m-dimensional space, i.e., the natural length of a vector), the cosine similarity (which evaluates the similarity of two vectors by the cosine of the angle between them, with each vector plotted into a vector space, such as the common two-dimensional space, according to its coordinate values), and the relative entropy (Kullback-Leibler divergence, an asymmetric measure of the difference between two probability distributions).
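Minimal plain-Python implementations of the three measures named above, over equal-length vectors (for the KL divergence, both arguments are assumed to be probability distributions):

```python
import math

def euclidean_distance(a, b):
    """Real distance between two points in m-dimensional space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def kl_divergence(p, q):
    """Relative entropy: asymmetric difference between two distributions."""
    return sum(x * math.log(x / y) for x, y in zip(p, q) if x > 0)
```

Note that Euclidean distance and KL divergence are dissimilarities (smaller means more similar) while cosine similarity grows with similarity, so the matching step must select the minimum or maximum accordingly.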
In a specific embodiment, similarity calculation may be performed between the facial feature and each of the plurality of standard facial features to obtain a similarity set. For example, suppose the facial feature corresponds to the eyebrow object and the standard facial features are standard eyebrow features: standard eyebrow feature 1, standard eyebrow feature 2, and standard eyebrow feature 3. Computing the similarity between the facial feature and each standard eyebrow feature yields similarity 1 (for standard eyebrow feature 1), similarity 2 (for standard eyebrow feature 2), and similarity 3 (for standard eyebrow feature 3), which together constitute the similarity set. The maximum similarity is then determined within the set. For example, if similarity 1 is 70%, similarity 2 is 65%, and similarity 3 is 90%, the maximum similarity is similarity 3. The standard facial feature corresponding to the maximum similarity is determined as the matched standard facial feature, here standard eyebrow feature 3.
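The select-the-maximum matching just described can be sketched as a linear scan over a labeled gallery of standard features (function and variable names are illustrative, not from the patent):

```python
def match_standard_feature(query, gallery, similarity):
    """Return the attribute type and score of the standard feature most
    similar to the query feature. `gallery` is a list of
    (attribute_type, standard_feature) pairs."""
    best_label, best_score = None, float("-inf")
    for label, standard in gallery:
        score = similarity(query, standard)
        if score > best_score:  # keep the maximum similarity seen so far
            best_label, best_score = label, score
    return best_label, best_score
```

If a distance measure such as the Euclidean distance is used instead of a similarity, the comparison flips to keep the minimum.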
S204: acquiring the facial feature attribute type corresponding to the matched standard facial feature;
in the embodiment of the invention, the standard facial features may be defined by obtaining a standard facial feature image carrying a standard facial feature object and inputting it into the feature extraction model to obtain the standard facial feature. The standard facial feature image carries a type label describing the corresponding facial feature attribute type. The standard facial features and their corresponding attribute types can be stored in advance, so that after the matched standard facial feature is determined, its attribute type is obtained from the stored correspondence.
S205: taking the facial feature attribute type corresponding to the matched standard facial feature as the attribute type of the facial feature image to be recognized;
in the embodiment of the invention, the facial feature image to be recognized is labeled with the facial feature attribute type of the matched standard facial feature.
In practical application, at least one facial feature image to be recognized can be cropped from a received face image based on the facial feature objects, and each cropped image is then input into the corresponding feature extraction model. Following steps S202 to S205, each facial feature image to be recognized undergoes feature extraction, similarity calculation, and attribute type acquisition and labeling in turn, so that the face image receives accurate facial feature annotations, as shown in fig. 7, which can then be displayed to the user.
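Applying steps S202 to S205 per cropped facial feature image can be sketched end-to-end as follows. All of the croppers, extractors, and galleries here are hypothetical stand-ins for the face/key-point detection, the per-object feature extraction models, and the stored standard features:

```python
def annotate_face(face_image, croppers, extractors, galleries, similarity):
    """Per-feature pipeline of S202-S205: crop each facial feature image,
    extract its feature vector with the matching model, and label it with
    the attribute type of the most similar standard feature."""
    annotations = {}
    for organ, crop in croppers.items():  # e.g. "eyebrow", "eye", "nose"
        patch = crop(face_image)          # facial feature image to be recognized
        vec = extractors[organ](patch)    # facial feature via extraction model
        # gallery entry: (attribute_type, standard_feature); keep the best match
        best = max(galleries[organ], key=lambda item: similarity(vec, item[1]))
        annotations[organ] = best[0]      # label with its attribute type
    return annotations
```

Each facial feature object gets its own extractor and its own gallery, mirroring the per-object target feature extraction model described above.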
According to the technical scheme provided by the embodiment of the present specification, a facial feature classification model is obtained by training, and the feature extraction model is constructed from the network structure within that classification model that extracts image feature information of facial features. The facial feature of the image to be recognized is obtained through the feature extraction model, and the image is labeled, based on the similarity between its facial feature and the standard facial features, with the attribute type of the matched standard facial feature. During the training of the relevant models, specific facial feature attribute types do not need to be labeled on the sample images, which reduces the difficulty and workload of manual annotation. Identifying the user's facial feature attribute types by feature comparison makes it possible to obtain fine-grained facial feature information accurately.
An embodiment of the present invention further provides a facial feature recognition device. As shown in fig. 8, the device includes:
the image acquisition module 810: used for acquiring a facial feature image to be recognized;
the feature extraction module 820: used for extracting a facial feature from the facial feature image to be recognized through a feature extraction model;
the feature matching module 830: used for determining the standard facial feature matched with the facial feature based on the similarity between the facial feature and a plurality of standard facial features, each standard facial feature being labeled with a corresponding facial feature attribute type;
the facial feature attribute type acquisition module 840: used for acquiring the facial feature attribute type corresponding to the matched standard facial feature;
the facial feature attribute type labeling module 850: used for taking the facial feature attribute type corresponding to the matched standard facial feature as the attribute type of the facial feature image to be recognized;
wherein the feature extraction model corresponds to the network structure in the facial feature classification model used for extracting feature information of facial features, the facial feature classification model is obtained by machine-learning training on a plurality of facial feature sample images, and each facial feature sample image carries classification label information of the user to whom it belongs.
It should be noted that the device embodiments and the method embodiments are based on the same inventive concept.
An embodiment of the present invention provides an electronic device, which includes a processor and a memory, where the memory stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the facial feature recognition method provided in the foregoing method embodiment.
Further, fig. 9 shows a schematic hardware structure of an electronic device for implementing the facial feature recognition method provided by the embodiment of the present invention; the electronic device may form part of, or include, the facial feature recognition apparatus provided by the embodiment of the present invention. As shown in fig. 9, the electronic device 90 may include one or more processors 902 (shown here as 902a, 902b, …, 902n; the processors 902 may include, but are not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 904 for storing data, and a transmission device 906 for communication functions. In addition, the electronic device may also include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a power source, and/or a camera. It will be understood by those skilled in the art that the structure shown in fig. 9 is only an illustration and is not intended to limit the structure of the electronic device. For example, the electronic device 90 may also include more or fewer components than shown in fig. 9, or have a different configuration than shown in fig. 9.
It should be noted that the one or more processors 902 and/or other data processing circuitry described above may be referred to generally herein as "data processing circuitry". The data processing circuitry may be embodied in whole or in part in software, hardware, firmware, or any combination thereof. Further, the data processing circuit may be a single stand-alone processing module, or incorporated in whole or in part into any of the other elements in the electronic device 90 (or mobile device). As referred to in the embodiments of the application, the data processing circuit acts as a processor control (e.g. selection of a variable resistance termination path connected to the interface).
The memory 904 can be used for storing software programs and modules of application software, such as the program instructions/data storage devices corresponding to the method described in the embodiment of the present invention; the processor 902 executes various functional applications and data processing by running the software programs and modules stored in the memory 904, thereby implementing the above-mentioned facial feature recognition method. The memory 904 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 904 may further include memory located remotely from the processor 902, which may be connected to the electronic device 90 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 906 is used for receiving or sending data via a network. Specific examples of the network described above may include a wireless network provided by the communication provider of the electronic device 90. In one example, the transmission device 906 includes a network adapter (NIC) that can be connected to other network devices through a base station so as to communicate with the internet. In another example, the transmission device 906 can be a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the electronic device 90 (or mobile device).
Embodiments of the present invention further provide a storage medium, which may be disposed in an electronic device to store at least one instruction, at least one program, a code set, or an instruction set related to implementing the facial feature recognition method of the method embodiments, where the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the facial feature recognition method provided in the method embodiments.
Alternatively, in this embodiment, the storage medium may be located in at least one network server of a plurality of network servers of a computer network. Optionally, in this embodiment, the storage medium may include, but is not limited to: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk, which can store program codes.
It should be noted that: the precedence order of the above embodiments of the present invention is only for description, and does not represent the merits of the embodiments. And specific embodiments thereof have been described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the device and electronic apparatus embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference may be made to some descriptions of the method embodiments for relevant points.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. A facial feature recognition method, the method comprising:
acquiring a facial feature image to be recognized;
extracting a facial feature from the facial feature image to be recognized through a feature extraction model;
determining the standard facial feature matched with the facial feature based on the similarity between the facial feature and a plurality of standard facial features, wherein each standard facial feature is labeled with a corresponding facial feature attribute type;
acquiring the facial feature attribute type corresponding to the matched standard facial feature;
taking the facial feature attribute type corresponding to the matched standard facial feature as the attribute type of the facial feature image to be recognized;
wherein the feature extraction model corresponds to the network structure in the facial feature classification model used for extracting feature information of facial features, the facial feature classification model is obtained by machine-learning training on a plurality of facial feature sample images, and each facial feature sample image carries classification label information of the user to whom it belongs.
2. The method according to claim 1, wherein acquiring the facial feature image to be recognized comprises:
collecting an image to be recognized;
performing face detection on the image to be recognized to obtain a face image;
performing face key point detection on the face image to obtain at least one facial feature image to be recognized;
wherein the facial feature image to be recognized comprises an eyebrow object, an eye object, an ear object, a nose object, or a mouth object.
3. The method according to claim 1, wherein extracting the facial feature from the facial feature image to be recognized through a feature extraction model comprises:
determining a target feature extraction model corresponding to the facial feature object in the facial feature image to be recognized;
and extracting the facial feature from the facial feature image to be recognized through the target feature extraction model.
4. The method of claim 1, wherein determining the standard facial feature that matches the facial feature based on the similarity between the facial feature and a plurality of standard facial features comprises:
performing similarity calculation between the facial feature and each of the plurality of standard facial features to obtain a similarity set;
determining the maximum similarity in the similarity set;
and determining the standard facial feature corresponding to the maximum similarity as the matched standard facial feature.
5. The method according to claim 1, wherein the training process of the facial feature classification model comprises the following steps:
acquiring facial feature sample images;
inputting the facial feature sample images into a neural network model for image classification training;
in the training process, adjusting the model parameters of the neural network model until the classification result output by the neural network model matches the classification label information carried by the input facial feature sample images;
taking the neural network model with the adjusted model parameters as the facial feature classification model;
wherein each facial feature sample image comprises an eyebrow object, an eye object, an ear object, a nose object, or a mouth object.
6. The method of claim 5, wherein adjusting the model parameters comprises the following steps:
obtaining a loss value from an intermediate value and an annotation value, wherein the intermediate value corresponds to the classification result output by the neural network model and the annotation value corresponds to the classification label information carried by the input facial feature sample image;
obtaining the gradients of the model parameters from the loss value;
updating the model parameters based on the gradients of the model parameters.
7. The method of claim 5, wherein acquiring the facial feature sample images comprises:
determining a first quantity threshold and a second quantity threshold, wherein the first quantity threshold represents the number of candidate users and the second quantity threshold represents the number of images corresponding to the same candidate user;
acquiring the facial feature sample images according to the first quantity threshold and the second quantity threshold.
8. A facial feature recognition apparatus, the apparatus comprising:
an image acquisition module: used for acquiring a facial feature image to be recognized;
a feature extraction module: used for extracting a facial feature from the facial feature image to be recognized through a feature extraction model;
a feature matching module: used for determining the standard facial feature matched with the facial feature based on the similarity between the facial feature and a plurality of standard facial features, each standard facial feature being labeled with a corresponding facial feature attribute type;
a facial feature attribute type acquisition module: used for acquiring the facial feature attribute type corresponding to the matched standard facial feature;
a facial feature attribute type labeling module: used for taking the facial feature attribute type corresponding to the matched standard facial feature as the attribute type of the facial feature image to be recognized;
wherein the feature extraction model corresponds to the network structure in the facial feature classification model used for extracting feature information of facial features, the facial feature classification model is obtained by machine-learning training on a plurality of facial feature sample images, and each facial feature sample image carries classification label information of the user to whom it belongs.
9. An electronic device, comprising a processor and a memory, wherein at least one instruction, at least one program, set of codes, or set of instructions is stored in the memory, and wherein the at least one instruction, the at least one program, the set of codes, or the set of instructions is loaded and executed by the processor to implement the facial feature recognition method according to any one of claims 1-7.
10. A computer readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the method of facial feature recognition according to any one of claims 1-7.
CN201911047655.3A 2019-10-30 2019-10-30 Human face facial features recognition method, device, equipment and medium Pending CN110866469A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911047655.3A CN110866469A (en) 2019-10-30 2019-10-30 Human face facial features recognition method, device, equipment and medium


Publications (1)

Publication Number Publication Date
CN110866469A true CN110866469A (en) 2020-03-06

Family

ID=69653147

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911047655.3A Pending CN110866469A (en) 2019-10-30 2019-10-30 Human face facial features recognition method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN110866469A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111950421A (en) * 2020-08-05 2020-11-17 广东金杭科技有限公司 Face recognition system and trajectory tracking system
CN112241689A (en) * 2020-09-24 2021-01-19 北京澎思科技有限公司 Face recognition method and device, electronic equipment and computer readable storage medium
CN112836661A (en) * 2021-02-07 2021-05-25 Oppo广东移动通信有限公司 Face recognition method and device, electronic equipment and storage medium
CN112884961A (en) * 2021-01-21 2021-06-01 吉林省吉科软信息技术有限公司 Face recognition gate system for epidemic situation prevention and control
CN115376196A (en) * 2022-10-25 2022-11-22 上海联息生物科技有限公司 Image processing method, and financial privacy data security processing method and device

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108021905A (en) * 2017-12-21 2018-05-11 广东欧珀移动通信有限公司 image processing method, device, terminal device and storage medium
CN108345849A (en) * 2018-01-31 2018-07-31 深圳港云科技有限公司 A kind of face recognition method and its equipment
CN108510456A (en) * 2018-03-27 2018-09-07 华南理工大学 The sketch of depth convolutional neural networks based on perception loss simplifies method
WO2019071664A1 (en) * 2017-10-09 2019-04-18 平安科技(深圳)有限公司 Human face recognition method and apparatus combined with depth information, and storage medium
CN109727350A (en) * 2018-12-14 2019-05-07 深圳壹账通智能科技有限公司 A kind of Door-access control method and device based on recognition of face
CN109784154A (en) * 2018-12-10 2019-05-21 平安科技(深圳)有限公司 Emotion identification method, apparatus, equipment and medium based on deep neural network
CN109902660A (en) * 2019-03-18 2019-06-18 腾讯科技(深圳)有限公司 A kind of expression recognition method and device
CN110069985A (en) * 2019-03-12 2019-07-30 北京三快在线科技有限公司 Aiming spot detection method based on image, device, electronic equipment
CN110321437A (en) * 2019-05-27 2019-10-11 腾讯科技(深圳)有限公司 A kind of corpus data processing method, device, electronic equipment and medium




Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40022143

Country of ref document: HK

SE01 Entry into force of request for substantive examination