CN109214343B - Method and device for generating face key point detection model

Info

Publication number
CN109214343B
Authority
CN
China
Prior art keywords: face, neural network, detection model, key point, convolutional neural
Prior art date
Legal status
Active
Application number
CN201811075000.2A
Other languages
Chinese (zh)
Other versions
CN109214343A (en)
Inventor
邓启力
Current Assignee
Douyin Vision Co Ltd
Douyin Vision Beijing Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd filed Critical Beijing ByteDance Network Technology Co Ltd
Priority to CN201811075000.2A
Publication of CN109214343A
Application granted
Publication of CN109214343B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161: Detection; Localisation; Normalisation
    • G06V40/165: Detection; Localisation; Normalisation using facial parts and geometric relationships
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks

Abstract

The embodiment of the application discloses a method and a device for generating a face key point detection model. One embodiment of the method comprises: obtaining a sample set; inputting the face images in the sample set into a pre-trained first face key point detection model to obtain face key point detection results of the input face images; and using a machine learning method to train a second face key point detection model, with the face images in the sample set as input and the face key point detection results of the input face images as output. This embodiment yields a model for detecting face key points and enriches the ways in which such models can be generated.

Description

Method and device for generating face key point detection model
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a method and a device for generating a face key point detection model.
Background
Face key points are labels that locate facial features and contours in an image, and are mainly used to pinpoint key positions of a face, such as the facial contour, eyebrows, eyes, and lips. Face key point detection is an important step in face recognition, and the technology also has extremely wide practical applications, such as expression recognition and face animation synthesis. In particular, the recent boom in well-known internet applications such as live streaming, beautification, and special-effect cameras owes much to advances in face key point technology.
Related methods usually train a model for detecting face key points directly on the samples in a sample set together with their manual annotations.
Disclosure of Invention
The embodiment of the application provides a method and a device for generating a face key point detection model.
In a first aspect, an embodiment of the present application provides a method for generating a face keypoint detection model, where the method includes: acquiring a sample set, wherein samples in the sample set comprise face images; inputting the face images in the sample set into a pre-trained first face key point detection model to obtain a face key point detection result of the input face images; and training to obtain a second face key point detection model by using a machine learning method and taking the face images in the sample set as input and the face key point detection results of the input face images as output.
In some embodiments, the samples in the sample set further include annotation information of face key points in the face image; and after obtaining the face key point detection result of the input face image, the method further comprises the following steps: for a sample in a sample set, carrying out similarity calculation on a face key point detection result of a face image in the sample and the labeling information in the sample; and deleting the samples of which the similarity calculation results are smaller than a preset value in the sample set so as to update the sample set.
In some embodiments, training a second face key point detection model by using a machine learning method and taking a face image in a sample set as an input and a face key point detection result of the input face image as an output includes: extracting samples from the updated sample set, and executing the following training steps: inputting the face image in the extracted sample into a second convolutional neural network to obtain information output by the second convolutional neural network; inputting information output by the second convolutional neural network and a face key point detection result of the input face image into a pre-established loss function to obtain a loss value of the extracted sample; determining whether the second convolutional neural network is trained based on the comparison of the loss value and the target value; and in response to determining that the training of the second convolutional neural network is completed, determining the trained second convolutional neural network as a second face key point detection model.
In some embodiments, training the second face key point detection model by using a machine learning method, with the face images in the sample set as input and the face key point detection results of the input face images as output, further includes: in response to determining that the second convolutional neural network is not trained, updating parameters in the second convolutional neural network based on the loss values, re-extracting samples from the sample set, and continuing to perform the training step by using the second convolutional neural network with the updated parameters as the second convolutional neural network.
In some embodiments, the first face keypoint detection model is trained by: taking the face images in a target sample set as the input of a first convolutional neural network, taking the annotation information of the face key points in the input face images as the output of the first convolutional neural network, training the first convolutional neural network by using a machine learning method, and determining the trained first convolutional neural network as the first face key point detection model.
In some embodiments, the complexity of the model structure of the second face keypoint detection model is less than the complexity of the model structure of the first face keypoint detection model.
In a second aspect, an embodiment of the present application provides an apparatus for generating a face keypoint detection model, where the apparatus includes: an acquisition unit configured to acquire a sample set, wherein samples in the sample set include face images; the input unit is configured to input the face images in the sample set into a first face key point detection model trained in advance, and a face key point detection result of the input face images is obtained; and the training unit is configured to train to obtain a second face key point detection model by using a machine learning method and taking the face images in the sample set as input and taking the face key point detection results of the input face images as output.
In some embodiments, the samples in the sample set further include annotation information of face key points in the face image; and the apparatus further comprises: the calculating unit is configured to calculate the similarity between the face key point detection result of the face image in the sample and the labeling information in the sample for the sample in the sample set; and the deleting unit is configured to delete the samples of which the similarity calculation results are smaller than a preset value in the sample set so as to update the sample set.
In some embodiments, the training unit is further configured to: extracting samples from the updated sample set, and executing the following training steps: inputting the face image in the extracted sample into a second convolutional neural network to obtain information output by the second convolutional neural network; inputting information output by the second convolutional neural network and a face key point detection result of the input face image into a pre-established loss function to obtain a loss value of the extracted sample; determining whether the second convolutional neural network is trained based on the comparison of the loss value and the target value; and in response to determining that the training of the second convolutional neural network is completed, determining the trained second convolutional neural network as a second face key point detection model.
In some embodiments, the training unit is further configured to: and in response to determining that the second convolutional neural network is not trained, updating parameters in the second convolutional neural network based on the loss values, re-extracting samples from the sample set, and continuing to perform the training step by using the second convolutional neural network after updating the parameters as the second convolutional neural network.
In some embodiments, the first face keypoint detection model is trained by: taking the face images in a target sample set as the input of a first convolutional neural network, taking the annotation information of the face key points in the input face images as the output of the first convolutional neural network, training the first convolutional neural network by using a machine learning method, and determining the trained first convolutional neural network as the first face key point detection model.
In some embodiments, the complexity of the model structure of the second face keypoint detection model is less than the complexity of the model structure of the first face keypoint detection model.
In a third aspect, an embodiment of the present application provides a method for detecting face key points, the method including: acquiring a face image to be detected; and inputting the face image to be detected into the second face key point detection model generated by the method described in any embodiment of the first aspect, so as to generate a face key point detection result of the face image to be detected.
In a fourth aspect, an embodiment of the present application provides an apparatus for detecting face key points, the apparatus including: an acquisition unit configured to acquire a face image to be detected; and a generation unit configured to input the face image to be detected into the second face key point detection model generated by the method described in any embodiment of the first aspect, to generate a face key point detection result of the face image to be detected.
In a fifth aspect, an embodiment of the present application provides an electronic device, including: one or more processors; storage means having one or more programs stored thereon which, when executed by one or more processors, cause the one or more processors to implement a method as in any one of the embodiments of the first and third aspects above.
In a sixth aspect, the present application provides a computer-readable medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method according to any one of the first and third aspects.
According to the method and the device for generating the face key point detection model, the sample set is obtained, and samples can be extracted from the sample set to train the second convolutional neural network. Wherein the samples in the sample set comprise face images. Therefore, the face images in the sample set are input to the first face key point detection model trained in advance, and the face key point detection result of the input face images can be obtained. Then, by using a machine learning method, the face images in the sample set are used as input, and the face key point detection results of the input face images are used as output, so that a second face key point detection model can be obtained through training. Therefore, a model for detecting the key points of the human face can be obtained, and the method enriches the generation modes of the model.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a method for generating a face keypoint detection model according to the application;
FIG. 3 is a schematic diagram of an application scenario of a method for generating a face keypoint detection model according to the application;
FIG. 4 is a flow diagram of yet another embodiment of a method for generating a face keypoint detection model according to the present application;
FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for generating a face keypoint detection model according to the present application;
FIG. 6 is a flow diagram of one embodiment of a method for detecting face keypoints according to the present application;
FIG. 7 is a schematic structural diagram of an embodiment of an apparatus for detecting face keypoints according to the present application;
FIG. 8 is a block diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 illustrates an exemplary system architecture 100 to which the method for generating a face keypoint detection model or the apparatus for generating a face keypoint detection model of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may be installed with various communication client applications, such as an image processing application, an information browsing application, a video recording application, a video playing application, a voice interaction application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like.
The terminal devices 101, 102, and 103 may be hardware or software. When they are hardware, they may be various electronic devices with display screens, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like. When they are software, they can be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services) or as a single piece of software or software module, which is not specifically limited herein.
When the terminal devices 101, 102, 103 are hardware, an image capturing device may be mounted thereon. The image acquisition device can be various devices capable of realizing the function of acquiring images, such as a camera, a sensor and the like. The user can use the image capturing devices on the terminal devices 101, 102, 103 to capture facial images.
The server 105 may be a server that provides various services, such as a database server. The database server may store the sample set or obtain the sample set from other devices. A sample set may contain a plurality of samples, where a sample may comprise a face image. In addition, the database server may further store a pre-trained first face keypoint detection model. This model can be obtained by training a complex network; it has a large number of parameters and a large size, and requires substantial computing resources (such as memory and a GPU (Graphics Processing Unit)).
The server 105 may train a second convolutional neural network with a simpler structure by using a machine learning method based on the sample set and the first face keypoint detection model, and send a training result (e.g., the generated lightweight second face keypoint detection model) to the terminal devices 101, 102, and 103. In this way, the terminal devices 101, 102, 103 may apply the second face keypoint detection model for face keypoint detection.
The server 105 may be hardware or software. When the server is hardware, it may be implemented as a distributed cluster of multiple servers or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services) or as a single piece of software or software module, which is not specifically limited herein.
It should be noted that the method for generating the face keypoint detection model provided by the embodiment of the present application is generally executed by the server 105, and accordingly, the apparatus for generating the face keypoint detection model is generally disposed in the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for generating a face keypoint detection model according to the present application is shown. The method for generating the face key point detection model comprises the following steps:
step 201, a sample set is obtained.
In this embodiment, the execution subject (e.g., the server 105 shown in fig. 1) of the method for generating the face keypoint detection model may acquire the sample set in various ways. For example, the executing entity may obtain an existing sample set from another server (e.g., a database server) that stores the samples, through a wired connection or a wireless connection. As another example, a user may collect samples via a terminal device (e.g., terminal devices 101, 102, 103 shown in fig. 1); in this way, the execution entity may receive the samples collected by the terminal and store them locally, thereby generating a sample set. It should be noted that the wireless connection means may include, but is not limited to, a 3G/4G connection, a WiFi connection, a Bluetooth connection, a WiMAX connection, a Zigbee connection, a UWB (Ultra Wide Band) connection, and other wireless connection means now known or developed in the future.
Here, the sample set may include a plurality of samples, where a sample may comprise a face image. The face images may include images cropped from various images after face detection. For example, after face detection is performed on an image from the internet, position information indicating the area where the face object is located (i.e., the position of the face detection frame) is obtained; a screenshot of that area then yields a face image. In addition, a face image may also be an image in which the user directly photographs a human face.
Step 202, inputting the face images in the sample set to a pre-trained first face key point detection model to obtain a face key point detection result of the input face images.
In this embodiment, the executing entity may extract samples from the sample set obtained in step 201, and input the face images in the extracted samples to a first face key point detection model trained in advance, so as to obtain face key point detection results of the input face images. The first face key point detection model can be used to detect the positions (which can be expressed by coordinates) of the face key points in a face image. Here, the face key point detection result may be position information of the face key points (e.g., their coordinates). In practice, the face key points may be key points in the face (e.g., points with semantic information, or points that affect the facial contour or the shape of the facial features). For example, face keypoints may include, but are not limited to, corners of the eyes, corners of the mouth, points on the contour, and the like.
In this embodiment, the first face keypoint detection model may be obtained by performing supervised training on an existing model by using a machine learning method. Here, various existing models capable of extracting image features may be used for training. For example, models such as convolutional neural networks, deep neural networks, and the like may be used. In practice, a Convolutional Neural Network (CNN) is a feed-forward Neural Network, and its artificial neurons can respond to a part of surrounding cells within a coverage range, and have excellent performance on image processing, so that the Convolutional Neural Network can be used for extracting features of a sample image. Convolutional neural networks may include convolutional layers, pooling layers, fully-connected layers, and the like. Among other things, convolutional layers may be used to extract image features. The pooling layer may be used to down-sample (down sample) the incoming information.
In some optional implementations of this embodiment, the first face keypoint detection model may be obtained by training through the following steps: the method comprises the steps of taking a face image in a target sample set as input of a first convolutional neural network, taking label information of face key points in the input face image as output of the first convolutional neural network, training the first convolutional neural network by using a machine learning method, and determining the trained first convolutional neural network as a first face key point detection model.
Here, the target sample set used for training the first face keypoint detection model may be the sample set obtained in step 201, or may be another sample set, which is not limited herein. The target sample set may include a large number of samples. The sample can comprise a face image and a face key point position label of the face image.
Here, the first convolutional neural network for training the first face keypoint detection model may be established based on various existing structures (e.g., DenseBox, VGGNet, ResNet, SegNet, etc.). Also, the first convolutional neural network may use a more complex network structure. For example, a plurality of convolutional layers (e.g., 6 or 10 layers), a plurality of pooling layers, a plurality of fully-connected layers, and the like may be provided, and each convolutional layer may be provided with a plurality of convolution kernels (filters). It should be noted that the convolutional neural network used for training the first face keypoint detection model may further include other layers as needed, which is not limited herein.
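As an illustration only, a "more complex" first network of the kind just described could be sketched as follows. The patent prescribes neither a framework nor an exact topology, so PyTorch, the layer counts, the channel widths, the 128x128 input size, and the 68-point output are all assumptions of this sketch:

    import torch.nn as nn

    class TeacherKeypointNet(nn.Module):
        """Hypothetical 'first' CNN: many convolutional, pooling, and fully-connected layers."""
        def __init__(self, num_keypoints=68):  # 68 key points is an assumed convention
            super().__init__()
            blocks, in_ch = [], 3
            for out_ch in (32, 64, 128, 256):  # several convolutional layers ...
                blocks += [nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                           nn.ReLU(inplace=True),
                           nn.MaxPool2d(2)]    # ... each followed by down-sampling
                in_ch = out_ch
            self.features = nn.Sequential(*blocks)
            self.head = nn.Sequential(         # fully-connected regression head
                nn.Flatten(),
                nn.Linear(256 * 8 * 8, 512), nn.ReLU(inplace=True),
                nn.Linear(512, num_keypoints * 2))  # an (x, y) pair per key point

        def forward(self, x):  # x: (N, 3, 128, 128) batch of face images
            return self.head(self.features(x))
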
In some optional implementations of the embodiment, the first face keypoint detection model may be obtained through adversarial training. The execution subject may input the face images in the target sample set into a pre-trained initial face key point detection model, input the face key point detection results generated by the initial face key point detection model into a pre-established discrimination model, and perform adversarial training on the discrimination model and the initial face key point detection model to obtain the first face key point detection model.
Specifically, in the process of adversarial training, the discrimination model and the initial face key point detection model may be iteratively trained in alternation. For example, the parameters of the initial face key point detection model may first be fixed while the discrimination model is trained for the first time; then the parameters of the discrimination model after the first training are fixed while the initial face key point detection model is trained for the first time; then the parameters of the initial face key point detection model after the first training are fixed while the discrimination model is trained for the second time, and so on. The initial face key point detection model obtained after the final round of training is taken as the final first face key point detection model.
Here, the initial face key point detection model may be used to perform preliminary face key point detection on the face image. Various existing face keypoint detection models trained by using a face image as a sample can be used as the initial face keypoint detection model. For example, the initial face keypoint detection model may be obtained by performing supervised training on an existing convolutional neural network structure (e.g., DenseBox, VGGNet, ResNet, SegNet, etc.) by using a machine learning method and a face image.
Here, the above discrimination model may be used to determine whether the face key point detection result input to it is taken from a face image. In practice, if the discrimination model determines that the face key point detection result input to it is taken from the face image, it can output a certain preset numerical value (for example, 1); if it determines that the detection result input to it is not taken from the face image, it may output another preset value (e.g., 0). It should be noted that the discrimination model may be any of various existing models that can implement the classification function (e.g., a Naive Bayes Model (NBM), a Support Vector Machine (SVM), or a neural network including fully connected layers (FCs)).
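A minimal sketch of this alternating scheme, reusing the hypothetical TeacherKeypointNet from the earlier sketch as the initial detection model; the small fully-connected discriminator, the Adam optimizers, and the use of annotated key points as the "taken from a face image" positives are all assumptions of the sketch:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from torch import optim

    detector = TeacherKeypointNet()           # initial face key point detection model
    discriminator = nn.Sequential(            # judges whether a detection result
        nn.Linear(68 * 2, 128), nn.ReLU(),    # is taken from a face image
        nn.Linear(128, 1), nn.Sigmoid())
    opt_d = optim.Adam(discriminator.parameters(), lr=1e-4)
    opt_g = optim.Adam(detector.parameters(), lr=1e-4)

    def adversarial_round(images, annotations):
        real, fake_t = torch.ones(len(images), 1), torch.zeros(len(images), 1)
        # Round 1: fix the detection model, train the discrimination model.
        with torch.no_grad():
            fake = detector(images)
        d_loss = (F.binary_cross_entropy(discriminator(annotations), real)
                  + F.binary_cross_entropy(discriminator(fake), fake_t))
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()
        # Round 2: fix the discrimination model, train the detection model to fool it.
        g_loss = F.binary_cross_entropy(discriminator(detector(images)), real)
        opt_g.zero_grad(); g_loss.backward(); opt_g.step()
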
And 203, using a machine learning method to input the face images in the sample set, using the face key point detection results of the input face images as output, and training to obtain a second face key point detection model.
In this embodiment, the executing agent may use a machine learning method to train the second face keypoint detection model, with the face images in the sample set as input and the face keypoint detection results of the input face images as output. Here, various existing models (which may be referred to as initial models) may be used for training, for example convolutional neural networks, deep neural networks, and the like.
Here, the execution subject may extract samples in the sample set one by one for training. After each training, the initial model is updated. And then training the updated initial model by using the next sample until the initial model achieves the expected effect (for example, the predicted result is the same as or similar to the result output by the first face key point detection model). At this time, the final initial model may be determined as the second face keypoint detection model.
In addition, the execution subject may extract a plurality of samples from the sample set at once and perform training using them. After each training round, the initial model is updated, and a further plurality of samples is extracted to train the updated initial model, until the model achieves the expected effect.
In some optional implementations of this embodiment, the complexity of the model structure of the second face keypoint detection model may be less than the complexity of the model structure of the first face keypoint detection model. Here, the complexity may be characterized by the number of layers of the model, the number of parameters, and the like. For example, the fewer the number of layers, the less the complexity; the fewer the number of parameters, the less complex.
In some optional implementations of the present embodiment, the samples in the sample set further include annotation information of key points of a face in the face image. The labeling information may include coordinates of each face key point. After obtaining the face key point detection result of each face image output by the first face key point detection model, the executing body may further perform the following operations:
first, for a sample in the sample set, the executing entity may perform similarity calculation between a face key point detection result of a face image in the sample and annotation information in the sample. Here, the similarity calculation may be performed using various similarity calculation methods. Such as euclidean distance, cosine theorem, jackard similarity measure methods, etc.
Specifically, for each face key point in each sample, a distance between the coordinates of the face key point in the face key point detection result and the coordinates of the face key point in the annotation information may be calculated first. Then, the sum or the average of the distances of the face key points can be determined as the similarity calculation result of the sample.
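In symbols (the notation here is ours, not the patent's): if sample i has annotated key points p_{i,k} and detected key points p̂_{i,k}, for k = 1, ..., K, the similarity calculation result s_i just described is either the sum or the average of the per-point Euclidean distances, i.e., in LaTeX form:

    s_i = \sum_{k=1}^{K} \lVert \hat{p}_{i,k} - p_{i,k} \rVert_2 \quad \text{(sum)}
    \qquad \text{or} \qquad
    s_i = \frac{1}{K} \sum_{k=1}^{K} \lVert \hat{p}_{i,k} - p_{i,k} \rVert_2 \quad \text{(average)}
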
In the second step, the execution subject may delete the samples whose similarity calculation results are smaller than a preset value from the sample set, so as to update the sample set. Here, the preset value may be a numerical value set in advance by a technician based on extensive data statistics and experiments.
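A minimal sketch of this two-step filtering, assuming each sample carries NumPy arrays of shape (K, 2) under the hypothetical keys "detection" and "annotation", and using the average-distance variant of the similarity result:

    import numpy as np

    def similarity_result(detected, annotated):
        """Average Euclidean distance between corresponding key points."""
        return float(np.mean(np.linalg.norm(detected - annotated, axis=1)))

    def update_sample_set(samples, preset_value):
        """Delete samples whose similarity calculation result is smaller
        than the preset value, keeping the rest."""
        return [s for s in samples
                if similarity_result(s["detection"], s["annotation"]) >= preset_value]
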
In some optional implementations of this embodiment, on the basis of the foregoing implementation, the executing entity may obtain the second face keypoint detection model by training on the updated sample set using a machine learning method.
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the method for generating a face keypoint detection model according to the present embodiment. In the application scenario of fig. 3, a terminal device 301 used by a user (e.g., a technician) may have a model training application installed thereon. After the user opens the application and uploads the sample set or the storage path of the sample set, the server 302 providing background support for the application may run the method for generating a face keypoint detection model, including:
first, a sample set may be obtained. Wherein the samples in the sample set may comprise face images 303. Thereafter, the face images 303 in the sample set may be input to the first face keypoint detection model 304 trained in advance, and the face keypoint detection results 305 of the input face images may be obtained. Then, a machine learning method may be used to train the second face keypoint detection model 306 by using the face images in the sample set as input and using the face keypoint detection result 305 of the input face images as output.
The method provided by the above embodiment of the present application may extract a sample from the sample set by obtaining the sample set to perform training of the second convolutional neural network. Wherein, the samples in the sample set comprise face images. Therefore, the face images in the sample set are input to the first face key point detection model trained in advance, and the face key point detection result of the input face images can be obtained. Then, by using a machine learning method, the face images in the sample set are used as input, and the face key point detection results of the input face images are used as output, so that a second face key point detection model can be obtained through training. Therefore, a model for detecting the key points of the human face can be obtained, and the method enriches the generation modes of the model.
In addition, the face key point detection results output by the trained first face key point detection model are used as labels to train the second face key point detection model, so that the second face key point detection model can learn from the trained, well-performing first face key point detection model during training. Therefore, compared with a model obtained only through supervised learning on the sample set, the trained second face key point detection model improves the accuracy of face key point detection. Moreover, the trained second face key point detection model can have a lightweight structure suitable for a mobile terminal.
With further reference to fig. 4, a flow 400 of yet another embodiment of a method for generating a face keypoint detection model is shown. The process 400 of the method for generating a face keypoint detection model comprises the following steps:
step 401, a sample set is obtained.
In this embodiment, the execution subject of the method for generating a face keypoint detection model (e.g., the server 105 shown in fig. 1) may obtain a sample set. Here, the sample set may include a plurality of samples. The samples can include face images and labeling information of face key points in the face images. Here, the annotation information may be used to indicate the location of the face keypoints in the face image. The annotation information may include coordinates of each face keypoint.
Step 402, inputting the face images in the sample set into a pre-trained first face key point detection model to obtain a face key point detection result of the input face images.
In this embodiment, the executing entity may extract samples from the sample set obtained in step 401, and input the face images in the extracted samples to a first face key point detection model trained in advance, so as to obtain a face key point detection result of the input face images.
In this embodiment, the first face keypoint detection model may be obtained by training through the following steps: the method comprises the steps of taking a face image in a target sample set as input of a first convolutional neural network, taking label information of face key points in the input face image as output of the first convolutional neural network, training the first convolutional neural network by using a machine learning method, and determining the trained first convolutional neural network as a first face key point detection model.
Here, the target sample set used for training the first face keypoint detection model may be the sample set obtained in step 401, or may be another sample set, which is not limited herein. The target sample set may include a large number of samples. The sample can comprise a face image and a face key point position label of the face image.
Here, the first convolutional neural network for training the first face keypoint detection model may be established based on various existing structures (e.g., DenseBox, VGGNet, ResNet, SegNet, etc.), and may have a more complex network structure, for example multiple convolutional layers (e.g., 6 or 10 layers), multiple pooling layers, and multiple fully-connected layers, where each convolutional layer may be provided with a plurality of convolution kernels. It should be noted that the convolutional neural network used for training the first face keypoint detection model may further include other layers as needed, which is not limited herein.
And 403, for the samples in the sample set, performing similarity calculation on the face key point detection result of the face image in the sample and the labeling information in the sample.
In this embodiment, for a sample in the sample set, the executing entity may perform a similarity calculation between the face key point detection result of the face image in the sample and the annotation information in the sample. Here, various similarity calculation methods may be used, such as the Euclidean distance, cosine similarity, or the Jaccard similarity measure. Specifically, for each face key point in each sample, the distance between the coordinates of the face key point in the face key point detection result and its coordinates in the annotation information may be calculated first. Then, the sum or the average of the distances over the face key points can be determined as the similarity calculation result of the sample.
And step 404, deleting the samples in the sample set, of which the similarity calculation result is smaller than a preset value, so as to update the sample set.
In this embodiment, the execution subject may delete the samples whose similarity calculation results are smaller than a preset value from the sample set, so as to update the sample set. Here, the preset value may be a numerical value set in advance by a technician based on extensive data statistics and experiments.
At step 405, samples are extracted from the updated set of samples.
In this embodiment, the execution subject may extract samples from the updated sample set. Here, the manner of extracting samples and the number of samples to be extracted are not limited in the present application. For example, at least one sample may be randomly extracted, or samples whose face images have better definition (i.e., higher resolution) may be extracted. Then, the training steps of steps 406 to 409 may be performed as follows.
Step 406, inputting the face image in the extracted sample to a second convolutional neural network, so as to obtain information output by the second convolutional neural network.
In this embodiment, the executing entity may input the face image in the extracted sample to the second convolutional neural network, and obtain information output by the second convolutional neural network (for example, coordinates of key points of the face predicted by the second convolutional neural network).
Here, the second convolutional neural network may be a convolutional neural network using any of various existing structures (e.g., DenseBox, VGGNet, ResNet, SegNet, etc.). Also, the network structure of the second convolutional neural network may be simpler, for example including a small number of convolutional layers (e.g., one or two layers), a small number of pooling layers (e.g., one or two layers), and fully connected layers. It should be noted that the convolutional neural network used for training the second face keypoint detection model may further include other layers as needed, which is not limited herein.
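For contrast with the earlier teacher sketch, a "simpler" second convolutional neural network of the kind just described might look as follows; as before, PyTorch and every structural choice here are assumptions of the sketch rather than the patent's prescription:

    import torch.nn as nn

    class StudentKeypointNet(nn.Module):
        """Hypothetical 'second' CNN: two conv layers, two pooling layers, one FC head."""
        def __init__(self, num_keypoints=68):  # assumed point count, as in the teacher sketch
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2))
            self.head = nn.Sequential(
                nn.Flatten(),
                nn.Linear(32 * 32 * 32, num_keypoints * 2))  # far fewer layers and parameters

        def forward(self, x):  # x: (N, 3, 128, 128), as in the teacher sketch
            return self.head(self.features(x))
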
It should be noted that, in general, after training with the same data and method, a more complex model structure tends to perform better (e.g., achieve higher face key point detection accuracy) than a simpler one. Meanwhile, a model trained with a more complex structure has more parameters and a more involved computing process, so it occupies more computing resources and computes more slowly; such a model is not suitable for deployment on a mobile terminal. Conversely, a model trained with a simpler structure has fewer parameters and a simpler computing process, occupies fewer computing resources, processes faster, and can be deployed on a mobile terminal. However, if such a model is trained directly on the sample set, it usually performs poorly (e.g., its face keypoint detection accuracy is low).
It can be understood that, for the first convolutional neural network used for training the first face keypoint detection model, the number of convolutional layers, pooling layers, fully-connected layers, convolution kernels, parameters, and the like may be set according to actual requirements, which is not limited herein; the chosen configuration should allow the trained first face keypoint detection model to achieve the desired performance (e.g., a recognition accuracy reaching a set value). Meanwhile, the number of convolutional layers, pooling layers, fully-connected layers, convolution kernels, and parameters in the second convolutional neural network may be set according to factors such as the computing resources available on the mobile terminal, which is likewise not limited herein.
Step 407, inputting the information output by the second convolutional neural network and the face key point detection result of the input face image into a pre-established loss function to obtain the loss value of the extracted sample.
In this embodiment, the executing entity may input the information output by the second convolutional neural network and the face keypoint detection result of the input face image (i.e., the result output by the first face keypoint detection model) into a pre-established loss function, so as to obtain the loss value of the extracted sample. Here, the value of the loss function (i.e., the loss value) may be used to characterize the degree of difference between the information output by the second convolutional neural network (e.g., predicted coordinates of the face key points) and the detection result used as its label. The loss function is a non-negative real-valued function; in general, the smaller its value, the better the robustness of the model. The loss function may be set according to actual requirements.
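Since the patent leaves the loss function open ("set according to actual requirements"), one common concrete choice is the mean squared error between the second network's output and the first model's detection result; the following one-liner is such an assumed instantiation, not the patent's prescribed loss:

    import torch.nn.functional as F

    def sample_loss(student_output, teacher_detection):
        """Loss value: how far the second network's prediction is from the
        first model's face key point detection result (used as the label)."""
        return F.mse_loss(student_output, teacher_detection)
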
Step 408, determining whether the second convolutional neural network is trained based on the comparison of the loss value and the target value.
In this embodiment, the executing entity may determine whether the training of the second convolutional neural network is completed based on a comparison between the loss value and a target value. Here, if a plurality of (at least two) samples were extracted in step 405, the execution subject may compare the loss value of each sample with the target value, and thus determine whether the loss value of each sample is less than or equal to the target value. As an example, when multiple samples were extracted, the execution subject may determine that the training of the second convolutional neural network is complete if the loss value of every sample is less than or equal to the target value. As another example, the execution subject may count the proportion of samples whose loss value is less than or equal to the target value among the extracted samples, and determine that training is complete when this proportion reaches a preset sample proportion (e.g., 95%). It should be noted that the target value can generally represent the ideal case of the degree of inconsistency between the predicted value and the true value; that is, when the loss value is less than or equal to the target value, the predicted value may be considered close to the true value. The target value may be set according to actual demand.
It is noted that, in response to determining that the second convolutional neural network has been trained, the following step 409 may be performed. In response to determining that the second convolutional neural network is not trained, parameters in the second convolutional neural network may be updated based on the determined loss values, samples may be re-extracted from the sample set, and the training step may be continued using the second convolutional neural network with the updated parameters as the second convolutional neural network. It should be noted that the extraction method is not limited in this application; for example, when there are a large number of samples in the sample set, the execution subject may extract samples that have not been extracted before.
Here, the gradient of the loss value with respect to the model parameters may be found using a back propagation algorithm, and then the model parameters may be updated based on the gradient using a gradient descent algorithm. It should be noted that the back propagation algorithm, the gradient descent algorithm, and the machine learning method are well-known technologies that are currently widely researched and applied, and are not described herein again.
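Steps 405 to 409 can then be assembled into a loop of the following shape (an illustrative sketch: the batch iterator, the SGD optimizer, the MSE loss, and the stopping rule on a single batch's loss are all assumptions of this sketch):

    import torch.nn.functional as F
    from torch import optim

    def train_student(student, sample_batches, target_value, lr=1e-3):
        optimizer = optim.SGD(student.parameters(), lr=lr)  # gradient descent algorithm
        for images, teacher_detections in sample_batches:   # step 405: extract samples
            output = student(images)                        # step 406: forward pass
            loss = F.mse_loss(output, teacher_detections)   # step 407: loss value
            if loss.item() <= target_value:                 # step 408: compare with target
                return student                              # step 409: training complete
            optimizer.zero_grad()
            loss.backward()                                 # back propagation of the gradient
            optimizer.step()                                # update parameters, then continue
        return student
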
And 409, in response to the fact that the training of the second convolutional neural network is completed, determining the trained second convolutional neural network as a second face key point detection model.
In this embodiment, in response to determining that the training of the second convolutional neural network is completed, the executing entity may determine the trained second convolutional neural network as the second face keypoint detection model.
In this embodiment, since the structure of the second convolutional neural network used for training the second face keypoint detection model is simpler than the structure of the first convolutional neural network used for training the first face keypoint detection model, the complexity of the model structure of the second face keypoint detection model is less than the complexity of the model structure of the first face keypoint detection model.
As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the process 400 of the method for generating a face keypoint detection model in this embodiment involves a step of deleting samples from the sample set and a step of training a second convolutional neural network to obtain the second face keypoint detection model. Because the structure of the second convolutional neural network used for training the second face key point detection model is simpler than that of the first convolutional neural network used for training the first face key point detection model, using the second face key point detection model to detect face key points can, compared with the structurally complex first model, improve detection efficiency and reduce the occupancy of computing resources, making it suitable for deployment on a mobile terminal.
With further reference to fig. 5, as an implementation of the method shown in the above figures, the present application provides an embodiment of an apparatus for generating a face keypoint detection model, where the apparatus embodiment corresponds to the method embodiment shown in fig. 2, and the apparatus may be applied to various electronic devices in particular.
As shown in fig. 5, the apparatus 500 for generating a face keypoint detection model according to the present embodiment includes: an obtaining unit 501 configured to obtain a sample set, where samples in the sample set include a face image; an input unit 502 configured to input the face images in the sample set to a first face key point detection model trained in advance, and obtain a face key point detection result of the input face images; the training unit 503 is configured to train the face images in the sample set as input and the face key point detection results of the input face images as output by using a machine learning method to obtain a second face key point detection model.
In some optional implementation manners of this embodiment, the samples in the sample set may further include annotation information of the face key points in the face image. And the apparatus may further include a calculation unit and a deletion unit (not shown in the figure). The calculating unit may be configured to perform similarity calculation on a face key point detection result of a face image in a sample of the sample set and annotation information in the sample, for the sample. The deleting unit may be configured to delete the samples of which the similarity calculation result is smaller than a preset value from the sample set, so as to update the sample set.
In some optional implementations of this embodiment, the training unit 503 may be further configured to: extracting samples from the updated sample set, and executing the following training steps: inputting the face image in the extracted sample into a second convolutional neural network to obtain information output by the second convolutional neural network; inputting information output by the second convolutional neural network and a face key point detection result of the input face image into a pre-established loss function to obtain a loss value of the extracted sample; determining whether the second convolutional neural network is trained based on the comparison between the loss value and the target value; and in response to determining that the training of the second convolutional neural network is completed, determining the trained second convolutional neural network as a second face key point detection model.
In some optional implementations of this embodiment, the training unit 503 may be further configured to: and in response to determining that the second convolutional neural network is not trained, updating parameters in the second convolutional neural network based on the loss values, re-extracting samples from the sample set, and continuing to execute the training step by using the second convolutional neural network with the updated parameters as the second convolutional neural network.
In some optional implementations of this embodiment, the first face keypoint detection model may be obtained by training through the following steps: the method comprises the steps of taking a face image in a target sample set as input of a first convolutional neural network, taking label information of face key points in the input face image as output of the first convolutional neural network, training the first convolutional neural network by using a machine learning method, and determining the trained convolutional neural network as a first face key point detection model.
In some optional implementations of this embodiment, the complexity of the model structure of the second face keypoint detection model is less than the complexity of the model structure of the first face keypoint detection model.
The apparatus provided by the above embodiment of the present application obtains the sample set through the obtaining unit 501, and may extract a sample therefrom to perform training of the second convolutional neural network. Wherein, the samples in the sample set comprise face images. In this way, the input unit 502 inputs the face images in the sample set to the first face keypoint detection model trained in advance, and can obtain the face keypoint detection result of the input face images. Next, the training unit 503 takes the face images in the sample set as input and takes the face key point detection result of the input face images as output by using a machine learning method, so as to obtain a second face key point detection model through training. Therefore, a model for detecting the key points of the human face can be obtained, and the generation mode of the model is enriched.
Referring to fig. 6, a flowchart 600 of an embodiment of a method for detecting face keypoints provided by the present application is shown. The method for detecting the key points of the human face can comprise the following steps:
step 601, obtaining a face image to be detected.
In the present embodiment, the executing subject of the method for detecting face key points (for example, the terminal devices 101, 102, 103 shown in fig. 1) may acquire a face image to be detected. The image to be detected may be captured by an image acquisition device, such as a camera, mounted on the executing subject, or may be obtained by the executing subject from the internet or other electronic devices; the source of the image to be detected is not limited here.
Step 602, inputting the face image to be detected into the second face key point detection model, and generating a face key point detection result of the face image to be detected.
In this embodiment, the execution subject may store a second face keypoint detection model. The executing body may input the image to be detected obtained in step 601 into the second face key point detection model, so as to obtain a face key point detection result of the face image to be detected (i.e. position information of the face key point in the face image to be detected).
In this embodiment, the second face keypoint detection model may be generated by the method described in the embodiment of fig. 2. For a specific generation process, reference may be made to the related description of the embodiment in fig. 2, which is not described herein again.
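On the terminal side, applying the generated model then reduces to a single forward pass. A sketch, assuming the hypothetical StudentKeypointNet from the earlier sketch and a 128x128 RGB input tensor:

    import torch

    def detect_keypoints(model, face_image):
        """face_image: (3, 128, 128) float tensor of the face image to be detected."""
        model.eval()                                 # inference mode
        with torch.no_grad():
            coords = model(face_image.unsqueeze(0))  # add a batch dimension
        return coords.view(-1, 2)                    # (num_keypoints, 2) positions
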
It should be noted that the method for detecting face keypoints according to the present embodiment may be used to test the second face keypoint detection model generated in the foregoing embodiments, and the model can then be further optimized according to the test results. The method may also be a practical application of the second face keypoint detection model generated by the above embodiments: using this model to detect face key points helps improve the performance of face key point detection, and because it is a lightweight model, it can be deployed on a mobile terminal.
With continuing reference to fig. 7, as an implementation of the method illustrated in fig. 6 described above, the present application provides an embodiment of an apparatus for detecting key points of a human face. The embodiment of the device corresponds to the embodiment of the method shown in fig. 6, and the device can be applied to various electronic devices.
As shown in fig. 7, the apparatus 700 for detecting key points of a human face according to this embodiment includes: an acquisition unit 701 configured to acquire a face image to be detected; the generating unit 702 is configured to input the face image to be detected into the second face keypoint detection model generated by using the method described in the embodiment of fig. 2, and generate a face keypoint detection result of the face image to be detected.
It will be understood that the units described in the apparatus 700 correspond to the respective steps of the method described with reference to fig. 6. Thus, the operations, features, and resulting advantages described above with respect to the method also apply to the apparatus 700 and the units included therein, and are not described herein again.
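A rough sketch of how the unit decomposition of apparatus 700 might look in code follows; the class and method names and the minimal preprocessing are illustrative assumptions, not the application's implementation:

```python
import torch
from PIL import Image
from torchvision import transforms


class FaceKeypointDetectionApparatus:
    """Loose analogue of apparatus 700 and its two units."""

    def __init__(self, model: torch.nn.Module):
        self.model = model.eval()  # the second face key point detection model

    def acquisition_unit(self, path: str) -> Image.Image:
        # Corresponds to unit 701: acquire the face image to be detected.
        return Image.open(path).convert("RGB")

    def generating_unit(self, image: Image.Image) -> torch.Tensor:
        # Corresponds to unit 702: generate the detection result.
        batch = transforms.ToTensor()(image).unsqueeze(0)
        with torch.no_grad():
            return self.model(batch)
```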
Referring now to FIG. 8, shown is a block diagram of a computer system 800 suitable for use in implementing the electronic device of an embodiment of the present application. The electronic device shown in fig. 8 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 8, the computer system 800 includes a Central Processing Unit (CPU) 801 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data necessary for the operation of the system 800 are also stored. The CPU 801, ROM 802, and RAM 803 are connected to each other via a bus 804. An input/output (I/O) interface 805 is also connected to bus 804.
The following components are connected to the I/O interface 805: an input section 806 including a touch screen, a touch pad, and the like; an output section 807 including a display such as a liquid crystal display (LCD) and a speaker; a storage section 808 including a hard disk and the like; and a communication section 809 including a network interface card such as a LAN card or a modem. The communication section 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the I/O interface 805 as needed. A removable medium 811, such as a semiconductor memory, is mounted on the drive 810 as needed, so that a computer program read out therefrom can be installed into the storage section 808.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 809 and/or installed from the removable medium 811. When executed by the Central Processing Unit (CPU) 801, the computer program performs the above-described functions defined in the method of the present application.

It should be noted that the computer readable medium described herein can be a computer readable signal medium or a computer readable storage medium, or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.

In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or hardware. The described units may also be provided in a processor, which may be described as: a processor comprising an acquisition unit, an input unit, and a training unit. The names of these units do not, in some cases, limit the units themselves; for example, the acquisition unit may also be described as "a unit that acquires a sample set".
As another aspect, the present application also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: obtain a sample set; input the face images in the sample set into a pre-trained first face key point detection model to obtain the face key point detection results of the input face images; and train, using a machine learning method, a second face key point detection model, taking the face images in the sample set as input and the face key point detection results of the input face images as output.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention disclosed herein is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention, for example, arrangements in which the above features are replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (14)

1. A method for generating a face keypoint detection model, comprising:
acquiring a sample set, wherein samples in the sample set comprise a face image and annotation information of face key points in the face image;
inputting the face images in the sample set into a first face key point detection model trained in advance to obtain a face key point detection result of each input face image, and, after the face key point detection results of the input face images are obtained, for each sample in the sample set, performing a similarity calculation between the face key point detection result of the face image in the sample and the annotation information in the sample, and deleting from the sample set the samples whose similarity calculation result is smaller than a preset value, so as to update the sample set (a minimal illustrative sketch of this filtering step appears after the claims);
and training to obtain a second face key point detection model by using a machine learning method and taking the face images in the sample set as input and the face key point detection results of the input face images as output.
2. The method for generating a face key point detection model according to claim 1, wherein the training, by using a machine learning method and taking the face images in the sample set as input and the face key point detection results of the input face images as output, to obtain a second face key point detection model comprises:
extracting samples from the updated sample set, and executing the following training steps: inputting the face images in the extracted samples into a second convolutional neural network to obtain information output by the second convolutional neural network; inputting the information output by the second convolutional neural network and the face key point detection results of the input face images into a pre-established loss function to obtain a loss value for the extracted samples; determining whether training of the second convolutional neural network is complete based on a comparison of the loss value with a target value; and, in response to determining that training of the second convolutional neural network is complete, determining the trained second convolutional neural network as the second face key point detection model.
3. The method for generating a face key point detection model according to claim 2, wherein the training, by using a machine learning method and taking the face images in the sample set as input and the face key point detection results of the input face images as output, to obtain a second face key point detection model further comprises:
in response to determining that training of the second convolutional neural network is not complete, updating parameters in the second convolutional neural network based on the loss value, re-extracting samples from the sample set, and continuing to perform the training steps using the second convolutional neural network with the updated parameters as the second convolutional neural network.
4. The method for generating a face key point detection model according to claim 1, wherein the first face key point detection model is trained by:
the method comprises the steps of taking a face image in a target sample set as input of a first convolutional neural network, taking labeling information of face key points in the input face image as output of the first convolutional neural network, training the first convolutional neural network by using a machine learning method, and determining the trained convolutional neural network as a first face key point detection model.
5. The method for generating a face key point detection model according to any one of claims 1 to 4, wherein the complexity of the model structure of the second face key point detection model is lower than the complexity of the model structure of the first face key point detection model.
6. An apparatus for generating a face keypoint detection model, comprising:
the system comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is configured to acquire a sample set, wherein samples in the sample set comprise a human face image and annotation information of human face key points in the human face image;
an input unit configured to input the face images in the sample set into a first face key point detection model trained in advance to obtain a face key point detection result of each input face image, and, after the face key point detection results of the input face images are obtained, for each sample in the sample set, to perform a similarity calculation between the face key point detection result of the face image in the sample and the annotation information in the sample, to delete from the sample set the samples whose similarity calculation result is smaller than a preset value, and to update the sample set;
and a training unit configured to train, by using a machine learning method, a second face key point detection model, taking the face images in the sample set as input and the face key point detection results of the input face images as output.
7. The apparatus for generating a face key point detection model according to claim 6, wherein the training unit is further configured to:
extracting samples from the updated sample set, and executing the following training steps: inputting the face images in the extracted samples into a second convolutional neural network to obtain information output by the second convolutional neural network; inputting the information output by the second convolutional neural network and the face key point detection results of the input face images into a pre-established loss function to obtain a loss value for the extracted samples; determining whether training of the second convolutional neural network is complete based on a comparison of the loss value with a target value; and, in response to determining that training of the second convolutional neural network is complete, determining the trained second convolutional neural network as the second face key point detection model.
8. The apparatus for generating a face key point detection model according to claim 7, wherein the training unit is further configured to:
in response to determining that training of the second convolutional neural network is not complete, updating parameters in the second convolutional neural network based on the loss value, re-extracting samples from the sample set, and continuing to perform the training steps using the second convolutional neural network with the updated parameters as the second convolutional neural network.
9. The apparatus for generating a face key point detection model according to claim 6, wherein the first face key point detection model is trained by:
the method comprises the steps of taking a face image in a target sample set as input of a first convolutional neural network, taking labeling information of face key points in the input face image as output of the first convolutional neural network, training the first convolutional neural network by using a machine learning method, and determining the trained convolutional neural network as a first face key point detection model.
10. The apparatus for generating a face key point detection model according to any one of claims 6 to 9, wherein the complexity of the model structure of the second face key point detection model is lower than the complexity of the model structure of the first face key point detection model.
11. A method for detecting face keypoints, comprising:
acquiring a face image to be detected;
inputting the face image to be detected into the second face key point detection model generated by the method according to any one of claims 1 to 5, and generating a face key point detection result of the face image to be detected.
12. An apparatus for detecting face keypoints, comprising:
an acquisition unit configured to acquire a face image to be detected;
a generating unit configured to input the face image to be detected into a second face key point detection model generated by the method according to any one of claims 1 to 5, and generate a face key point detection result of the face image to be detected.
13. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-5 and 11.
14. A computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-5 and 11.
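As promised above, here is a minimal sketch of the similarity-based sample filtering recited in claims 1 and 6. The claims only require some similarity calculation compared against a preset value; the normalized-distance score and the sample layout below are illustrative assumptions:

```python
import numpy as np

def filter_samples(samples, preset_value=0.8):
    """Keep only samples whose detection result is similar enough to the
    human annotation, producing the updated sample set of claim 1.

    Each sample is assumed to be a dict with:
      - "detection": (K, 2) array of predicted key point coordinates
      - "annotation": (K, 2) array of annotated key point coordinates
    The similarity measure (1 minus the mean normalized distance) is an
    illustrative assumption, not the one fixed by the claims.
    """
    kept = []
    for sample in samples:
        pred = np.asarray(sample["detection"], dtype=float)
        gt = np.asarray(sample["annotation"], dtype=float)
        # Mean Euclidean distance, normalized by the annotation's spread
        # so the score is roughly scale-invariant.
        scale = np.linalg.norm(gt.max(axis=0) - gt.min(axis=0)) + 1e-8
        distance = np.linalg.norm(pred - gt, axis=1).mean() / scale
        similarity = 1.0 - distance
        if similarity >= preset_value:  # delete samples below the preset value
            kept.append(sample)
    return kept  # the updated sample set
```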
CN201811075000.2A 2018-09-14 2018-09-14 Method and device for generating face key point detection model Active CN109214343B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811075000.2A CN109214343B (en) 2018-09-14 2018-09-14 Method and device for generating face key point detection model

Publications (2)

Publication Number Publication Date
CN109214343A CN109214343A (en) 2019-01-15
CN109214343B true CN109214343B (en) 2021-03-09

Family

ID=64983493

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811075000.2A Active CN109214343B (en) 2018-09-14 2018-09-14 Method and device for generating face key point detection model

Country Status (1)

Country Link
CN (1) CN109214343B (en)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109740567A (en) * 2019-01-18 2019-05-10 北京旷视科技有限公司 Key point location model training method, localization method, device and equipment
CN109829431B (en) * 2019-01-31 2021-02-12 北京字节跳动网络技术有限公司 Method and apparatus for generating information
CN109858445B (en) * 2019-01-31 2021-06-25 北京字节跳动网络技术有限公司 Method and apparatus for generating a model
CN109902641B (en) * 2019-03-06 2021-03-02 中国科学院自动化研究所 Semantic alignment-based face key point detection method, system and device
CN110210526A (en) * 2019-05-14 2019-09-06 广州虎牙信息科技有限公司 Predict method, apparatus, equipment and the storage medium of the key point of measurand
CN110334587B (en) * 2019-05-23 2021-01-22 北京市威富安防科技有限公司 Training method and device of face key point positioning model and key point positioning method
CN110287955B (en) * 2019-06-05 2021-06-22 北京字节跳动网络技术有限公司 Target area determination model training method, device and computer readable storage medium
CN110288646A (en) * 2019-06-21 2019-09-27 北京邮电大学 A kind of human dimension calculation method and device based on image
CN110390291B (en) * 2019-07-18 2021-10-08 北京字节跳动网络技术有限公司 Data processing method and device and electronic equipment
CN110399927B (en) * 2019-07-26 2022-02-01 玖壹叁陆零医学科技南京有限公司 Recognition model training method, target recognition method and device
CN110415291A (en) * 2019-08-07 2019-11-05 清华大学 Image processing method and relevant device
CN110443222B (en) * 2019-08-14 2022-09-09 北京百度网讯科技有限公司 Method and device for training face key point detection model
CN110555426A (en) * 2019-09-11 2019-12-10 北京儒博科技有限公司 Sight line detection method, device, equipment and storage medium
CN110826421B (en) * 2019-10-18 2023-09-05 易视腾科技股份有限公司 Method and device for filtering faces with difficult gestures
CN110796089A (en) * 2019-10-30 2020-02-14 上海掌门科技有限公司 Method and apparatus for training face-changing model
CN111178172A (en) * 2019-12-13 2020-05-19 北京工业大学 Laboratory mouse sniffing action recognition method, module and system
CN111353581A (en) * 2020-02-12 2020-06-30 北京百度网讯科技有限公司 Lightweight model acquisition method and device, electronic equipment and storage medium
CN111368685B (en) * 2020-02-27 2023-09-29 北京字节跳动网络技术有限公司 Method and device for identifying key points, readable medium and electronic equipment
CN111860199B (en) * 2020-06-28 2022-09-27 上海芯翌智能科技有限公司 Method and equipment for detecting key points in image
CN111753793B (en) * 2020-06-30 2022-11-22 重庆紫光华山智安科技有限公司 Model training method and device, face screening method and electronic equipment
CN114187177A (en) * 2021-11-30 2022-03-15 北京字节跳动网络技术有限公司 Method, device and equipment for generating special effect video and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107609519A (en) * 2017-09-15 2018-01-19 维沃移动通信有限公司 The localization method and device of a kind of human face characteristic point
CN108197652A (en) * 2018-01-02 2018-06-22 百度在线网络技术(北京)有限公司 For generating the method and apparatus of information
CN108229646A (en) * 2017-08-08 2018-06-29 北京市商汤科技开发有限公司 neural network model compression method, device, storage medium and electronic equipment
CN108229651A (en) * 2017-11-28 2018-06-29 北京市商汤科技开发有限公司 Neural network model moving method and system, electronic equipment, program and medium
CN108446653A (en) * 2018-03-27 2018-08-24 百度在线网络技术(北京)有限公司 Method and apparatus for handling face-image
CN108460365A (en) * 2018-03-27 2018-08-28 百度在线网络技术(北京)有限公司 Identity identifying method and device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105404896B (en) * 2015-11-03 2019-04-19 北京旷视科技有限公司 Labeled data processing method and labeled data processing system
CN105426826A (en) * 2015-11-09 2016-03-23 张静 Tag noise correction based crowd-sourced tagging data quality improvement method
CN106295533B (en) * 2016-08-01 2019-07-02 厦门美图之家科技有限公司 A kind of optimization method, device and the camera terminal of self-timer image
CN106709917B (en) * 2017-01-03 2020-09-11 青岛海信医疗设备股份有限公司 Neural network model training method, device and system
CN108073914B (en) * 2018-01-10 2022-02-18 成都品果科技有限公司 Animal face key point marking method

Also Published As

Publication number Publication date
CN109214343A (en) 2019-01-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee after: Tiktok vision (Beijing) Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee before: BEIJING BYTEDANCE NETWORK TECHNOLOGY Co.,Ltd.

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee after: Douyin Vision Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee before: Tiktok vision (Beijing) Co.,Ltd.

CP01 Change in the name or title of a patent holder