CN113569594A - Method and device for labeling key points of a human face


Info

Publication number: CN113569594A
Authority: CN (China)
Prior art keywords: face, image, position information, target
Legal status: Pending
Application number: CN202010350817.7A
Other languages: Chinese (zh)
Inventors: 顾阳, 王晋玮, 李源, 杨德尧, 左钟融, 张册
Current Assignee: Momenta Suzhou Technology Co Ltd
Original Assignee: Momenta Suzhou Technology Co Ltd
Application filed by Momenta Suzhou Technology Co Ltd
Priority to CN202010350817.7A
Publication of CN113569594A

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

Embodiments of the invention disclose a method and a device for labeling key points of a human face. The method includes: acquiring face images captured by a plurality of image acquisition devices in the same acquisition period; detecting, from each face image, image position information corresponding to each face key point of a target face; and determining the labeling position information of each face key point of the target face in each face image based on the detected image position information and a preset key point labeling rule. The preset key point labeling rule includes: a labeling rule based on image position information corresponding to face key points in a target face image, and/or a labeling rule based on image position information, obtained by traversing each face image, corresponding to face key points meeting a specified position condition, where the target face image is: a face image, among the face images, that meets a specified screening rule. The method improves the accuracy of the position recognition result of face key points.

Description

Method and device for labeling key points of a human face
Technical Field
The invention relates to the technical field of image recognition, in particular to a method and a device for labeling key points of a human face.
Background
Face recognition technology is widely applied in fields such as security, identity verification, and person tracking. Related face recognition techniques generally first detect the image position information of the face key points of each facial part from a captured image containing a face, and then perform subsequent tasks based on that image position information, for example verifying a person's identity or detecting fatigue-driving behavior.
As can be seen from the above process, accurately identifying the image position information of the face key points of each facial part in the image is important. In related face recognition techniques, when recognizing the key points of the eye region, the image position information of the eye corner points can be determined based on the eyelid line. In some scenarios this process is prone to position recognition errors. For example, when a person forcefully closes the eyes, wrinkles tend to appear around the eyes, and the eyebrow tail is then easily identified as an eye corner point, causing position recognition errors for the face key points.
Disclosure of Invention
The invention provides a method and a device for labeling key points of a human face, aiming to improve the accuracy of the position recognition result of face key points. The specific technical scheme is as follows:
in a first aspect, an embodiment of the present invention provides a method for labeling key points of a human face, where the method includes:
obtaining face images acquired by a plurality of image acquisition devices in the same acquisition period, wherein the plurality of image acquisition devices photograph a target face from different angles;
detecting image position information corresponding to each face key point in the target face from each face image;
determining labeling position information of each face key point of the target face in each face image based on the image position information corresponding to the face key points of the target face in each face image and a preset key point labeling rule, wherein the preset key point labeling rule comprises: a labeling rule based on image position information corresponding to face key points in a target face image, and/or a labeling rule based on image position information, obtained by traversing each face image, corresponding to face key points meeting a specified position condition, wherein the target face image is: a face image, among the face images, that meets a specified screening rule.
Optionally, the step of detecting image position information corresponding to each face key point in the target face from each face image includes:
determining a corresponding thermodynamic diagram of each face image based on the face images;
aiming at each face image, determining image position information corresponding to each face key point in the target face and semantic information corresponding to the image position information from the thermodynamic diagram corresponding to the face image by using a preset clustering algorithm;
the step of determining labeling position information of each face key point of the target face in each face image based on image position information corresponding to the face key point in the target face in each face image and a preset key point labeling rule comprises the following steps:
for each face image, determining image position information corresponding to face key points corresponding to target semantic information from the face key points in the face image based on semantic information corresponding to each face key point in the target face in the face image, wherein the target semantic information is semantic information corresponding to image position information corresponding to at least two face key points;
aiming at each target semantic information corresponding to each face image, sequentially taking the image position information corresponding to the face key point corresponding to the target semantic information as the target image position information corresponding to the face key point corresponding to the target semantic information; determining a reprojection error corresponding to the face image based on target image position information corresponding to a face key point corresponding to each target semantic information in the face image, image position information corresponding to face key points corresponding to other semantic information, and a preset three-dimensional face model, wherein the other semantic information is as follows: semantic information other than the target semantic information;
and determining the labeling position information of each face characteristic point in each face image based on each reprojection error corresponding to the face image and the image position information corresponding to a group of face key points corresponding to each reprojection error corresponding to the face image.
Optionally, the preset key point annotation rule includes: a rule for labeling based on image position information corresponding to the face key points which meet the specified position conditions and are obtained by traversing in each face image;
the step of determining, for each face image, based on the reprojection errors corresponding to the face image and the image location information corresponding to a group of face key points corresponding to the reprojection errors corresponding to the face image, the annotation location information of each face feature point in the face image includes:
and determining image position information corresponding to a group of human face key points corresponding to the reprojection error with the minimum numerical value in the reprojection errors corresponding to each human face image as the labeling position information of each human face characteristic point in the human face image.
Optionally, the preset key point annotation rule includes: the method comprises the steps of carrying out annotation rules based on image position information corresponding to face key points in a target face image and carrying out annotation rules based on image position information corresponding to the face key points which are obtained by traversing in each face image and meet specified position conditions;
the step of determining, for each face image, based on the reprojection errors corresponding to the face image and the image location information corresponding to a group of face key points corresponding to the reprojection errors corresponding to the face image, the annotation location information of each face feature point in the face image includes:
aiming at each face image, determining image position information corresponding to a group of face key points corresponding to the re-projection error with the minimum numerical value in the re-projection errors corresponding to the face image as the middle position information of each face characteristic point in the face image;
displaying each face image containing the middle position information of each face key point so that a user can select at least two frames of target face images with accurate face key point detection results from the displayed face images containing the middle position information of each face key point;
determining at least two frames of target face images selected by a user based on the operation of the user;
determining three-dimensional position information corresponding to each face key point based on the middle position information of each face key point in each target face image and the equipment information of the image acquisition equipment corresponding to each target face image;
and determining the labeling position information of each face key point in each face image based on the three-dimensional position information corresponding to each face key point, the equipment information of the image acquisition equipment corresponding to each face image and a preset projection formula.
Optionally, the target semantic information includes a left-eye corner point and/or a right-eye corner point.
Optionally, the preset key point annotation rule includes: a rule for labeling based on image position information corresponding to a face key point in a target face image;
the step of determining labeling position information of each face key point of the target face in each face image based on image position information corresponding to the face key point in the target face in each face image and a preset key point labeling rule comprises the following steps:
displaying each face image containing image position information corresponding to each face key point in the target face, so that a user can select at least two frames of target face images with accurate face key point detection results from the displayed face images containing the image position information corresponding to each face key point in the target face;
determining at least two frames of target face images selected by a user based on the operation of the user;
determining three-dimensional position information corresponding to each face key point based on image position information corresponding to each face key point in each target face image and equipment information of image acquisition equipment corresponding to each target face image;
and determining the labeling position information of each face key point in each face image based on the three-dimensional position information corresponding to each face key point, the equipment information of the image acquisition equipment corresponding to each face image and a preset projection formula.
In a second aspect, an embodiment of the present invention provides a device for labeling key points of a human face, where the device includes:
an obtaining module, configured to obtain face images acquired by a plurality of image acquisition devices in the same acquisition period, wherein the plurality of image acquisition devices photograph the target face from different angles;
the detection module is configured to detect image position information corresponding to each face key point in the target face from each face image;
a determining module, configured to determine labeling position information of each face key point of the target face in each face image based on the image position information corresponding to the face key points of the target face in each face image and a preset key point labeling rule, where the preset key point labeling rule includes: a labeling rule based on image position information corresponding to face key points in a target face image, and/or a labeling rule based on image position information, obtained by traversing each face image, corresponding to face key points meeting a specified position condition, where the target face image is: a face image, among the face images, that meets a specified screening rule.
Optionally, the detection module includes:
the first determining unit is configured to determine a corresponding thermodynamic diagram based on each face image;
the second determining unit is configured to determine image position information corresponding to each face key point in the target face and semantic information corresponding to the image position information from the thermodynamic diagram corresponding to each face image by using a preset clustering algorithm for each face image;
the determining module includes:
a third determining unit, configured to determine, for each face image, image location information corresponding to face key points corresponding to target semantic information from the face key points in the face image based on semantic information corresponding to each face key point in the target face in the face image, where the target semantic information is semantic information corresponding to image location information corresponding to at least two face key points;
a fourth determining unit, configured to, for each target semantic information corresponding to each face image, sequentially use image position information corresponding to a face key point corresponding to the target semantic information as target image position information corresponding to the face key point corresponding to the target semantic information; determining a reprojection error corresponding to the face image based on target image position information corresponding to a face key point corresponding to each target semantic information in the face image, image position information corresponding to face key points corresponding to other semantic information, and a preset three-dimensional face model, wherein the other semantic information is as follows: semantic information other than the target semantic information;
and the fifth determining unit is configured to determine, for each face image, the annotation position information of each face feature point in the face image based on each reprojection error corresponding to the face image and the image position information corresponding to a group of face key points corresponding to each reprojection error corresponding to the face image.
Optionally, the preset key point annotation rule includes: a rule for labeling based on image position information corresponding to the face key points which meet the specified position conditions and are obtained by traversing in each face image;
the fifth determining unit is specifically configured to determine, for each face image, image location information corresponding to a group of face key points corresponding to a smallest-valued reprojection error among the reprojection errors corresponding to the face image, as labeled location information of each face feature point in the face image.
Optionally, the preset key point annotation rule includes: the method comprises the steps of carrying out annotation rules based on image position information corresponding to face key points in a target face image and carrying out annotation rules based on image position information corresponding to the face key points which are obtained by traversing in each face image and meet specified position conditions;
the fifth determining unit is specifically configured to determine, for each face image, image position information corresponding to a group of face key points corresponding to a re-projection error with a smallest numerical value among re-projection errors corresponding to the face image, as middle position information of each face feature point in the face image;
displaying each face image containing the middle position information of each face key point so that a user can select at least two frames of target face images with accurate face key point detection results from the displayed face images containing the middle position information of each face key point;
determining at least two frames of target face images selected by a user based on the operation of the user;
determining three-dimensional position information corresponding to each face key point based on the middle position information of each face key point in each target face image and the equipment information of the image acquisition equipment corresponding to each target face image;
and determining the labeling position information of each face key point in each face image based on the three-dimensional position information corresponding to each face key point, the equipment information of the image acquisition equipment corresponding to each face image and a preset projection formula.
Optionally, the target semantic information includes a left-eye corner point and/or a right-eye corner point.
Optionally, the preset key point annotation rule includes: a rule for labeling based on image position information corresponding to a face key point in a target face image;
the determining module is specifically configured to display each face image containing image position information corresponding to each face key point in the target face, so that a user can select at least two frames of target face images with accurate face key point detection results from the displayed face images containing the image position information corresponding to each face key point in the target face;
determining at least two frames of target face images selected by a user based on the operation of the user;
determining three-dimensional position information corresponding to each face key point based on image position information corresponding to each face key point in each target face image and equipment information of image acquisition equipment corresponding to each target face image;
and determining the labeling position information of each face key point in each face image based on the three-dimensional position information corresponding to each face key point, the equipment information of the image acquisition equipment corresponding to each face image and a preset projection formula.
As can be seen from the above, the method and device for labeling key points of a human face provided in the embodiments of the present invention acquire face images captured by a plurality of image acquisition devices in the same acquisition period, the plurality of image acquisition devices photographing a target face from different angles; detect, from each face image, image position information corresponding to each face key point of the target face; and determine the labeling position information of each face key point of the target face in each face image based on the detected image position information and a preset key point labeling rule, where the preset key point labeling rule includes: a labeling rule based on image position information corresponding to face key points in a target face image, and/or a labeling rule based on image position information, obtained by traversing each face image, corresponding to face key points meeting a specified position condition, the target face image being a face image, among the face images, that meets a specified screening rule.
By applying the embodiments of the invention, under the rule that labels based on image position information corresponding to face key points in a target face image, a target face image meeting the specified screening rule, i.e. one with accurate position detection results, is determined from the face images, and the image position information corresponding to the face key points in that target face image is used to label the position information of the face key points in every face image; and/or, under the rule that labels based on image position information, obtained by traversing each face image, corresponding to face key points meeting the specified position condition, labeling uses the image position information corresponding to the accurately detected face key points in each face image. Either way, labeled position information of accurately positioned face feature points is obtained, improving the accuracy of the position recognition result of face key points. Of course, practicing any one product or method of the invention does not necessarily require achieving all of the advantages described above at the same time.
The innovation points of the embodiment of the invention comprise:
1. Under the rule that labels based on image position information corresponding to face key points in a target face image, a target face image meeting the specified screening rule, i.e. one with accurate position detection results, is determined from the face images, and the image position information corresponding to the face key points in that target face image is used to label the position information of the face key points in each face image; and/or, under the rule that labels based on image position information, obtained by traversing each face image, corresponding to face key points meeting the specified position condition, the image position information corresponding to the accurately detected face key points in each face image is used for labeling. This yields labeled position information of accurately positioned face feature points and improves the accuracy of the position recognition result of face key points.
2. First, a thermodynamic diagram corresponding to each face image is determined, and the image position information and semantic information of each face key point contained in the face image are determined from it. Then, for target semantic information corresponding to at least two pieces of image position information, the candidate image positions are enumerated and used in turn as the target image position information of the corresponding face key point, and a reprojection error for the face image is computed for each case in combination with a preset three-dimensional face model. Since the relationships between the three-dimensional positions of the face feature points in the preset three-dimensional face model conform to real face characteristics, a smaller reprojection error indicates more accurate image position information for the detected face feature points. On one hand, the image position information of the group of face key points corresponding to the smallest reprojection error can be determined directly as the labeling position information of the face feature points in the face image. On the other hand, this can be combined with a person's manual inspection of the detected key point positions across the multiple frames, so that the face key points with the most accurate image position information are used for subsequent position labeling, further improving the accuracy of the labeling position information of the face key points in each face image.
3. Considering that the position detection of the face feature points is more accurate in some of the multi-frame face images and less accurate in others, the target face images with more accurate position detection are identified manually, and the labeling position information of each face key point in each face image is determined based on the image position information corresponding to the face feature points in those target face images, improving the accuracy of the position recognition result of face key points.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is to be understood that the drawings in the following description are merely examples of some embodiments of the invention; a person skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic flow chart of a method for labeling key points of a human face according to an embodiment of the present invention;
fig. 2 is another schematic flow chart of a method for labeling key points of a human face according to an embodiment of the present invention;
FIG. 3 is an exemplary diagram of a thermodynamic diagram corresponding to a face image;
fig. 4 is another schematic flow chart of a method for labeling key points of a human face according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a device for labeling key points of a human face according to an embodiment of the present invention.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It is to be understood that the described embodiments are merely a few embodiments of the invention, and not all embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention, are within the scope of the present invention.
It is to be noted that the terms "comprises" and "comprising" and any variations thereof in the embodiments and drawings of the present invention are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
The invention provides a method and a device for labeling key points of a human face, which are used for improving the accuracy of a position identification result of the key points of the human face. The following provides a detailed description of embodiments of the invention.
Fig. 1 is a schematic flow chart of a method for labeling key points of a human face according to an embodiment of the present invention. The method may comprise the steps of:
s101: and acquiring the face images acquired by a plurality of image acquisition devices in the same acquisition period.
The plurality of image acquisition devices shoot the target face from different angles.
The method for labeling face key points provided by the embodiments of the invention can be applied to any type of electronic device with computing capability; the electronic device can be a server or a terminal. The electronic device can be connected to a plurality of image acquisition devices to obtain the images they capture. In one implementation, the plurality of image acquisition devices may be arranged inside a vehicle cabin and capture the face of a person in the vehicle from different angles. Alternatively, the plurality of image acquisition devices may be located in any indoor or outdoor scene. The plurality of image acquisition devices can photograph the person's face from all directions. In one case, the image capture regions of every two adjacently positioned image acquisition devices may overlap.
In one case, the target face may be in a state of forcefully opening and closing the eyes; accordingly, the electronic device can obtain the face images captured by the plurality of image acquisition devices while the target face is in that state.
S102: and detecting image position information corresponding to each face key point in the target face from each face image.
In this step, the electronic device may detect each face image by using a preset face key point detection algorithm, and detect image position information corresponding to each face key point in a target face in the face image.
In one implementation, the preset face key point detection algorithm may include, but is not limited to: a deep learning-based key point detection model, the ASM (Active Shape Model) algorithm, and the CPR (Cascaded Pose Regression) algorithm. The deep learning-based key point detection model may be a neural network model trained on sample images annotated with each face key point; its training process follows the neural network training procedures of the related art and is not repeated here. The embodiments of the invention do not limit the specific type of the preset face key point detection algorithm; any algorithm that can detect image position information corresponding to each face key point of a target face in a face image can be applied.
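As a minimal, non-authoritative sketch of this step, the snippet below wraps such a detector behind one function. The `model` object, its `predict` method, and its `keypoint_names` attribute are hypothetical stand-ins for whichever preset detection algorithm is chosen; the patent does not define this interface.

```python
import numpy as np

def detect_keypoints(model, face_image: np.ndarray) -> dict:
    """Run a preset face key point detector on a single face image.

    `model` is a hypothetical stand-in for any detector named above
    (deep-learning model, ASM, CPR). It is assumed to return an (N, 2)
    array of (x, y) image positions, one row per face key point.
    """
    positions = model.predict(face_image)   # assumed: (N, 2) pixel coordinates
    labels = model.keypoint_names           # assumed: N semantic labels
    # Pair each detected position with its semantic information,
    # e.g. "left_outer_eye_corner" -> (x, y).
    return {name: tuple(xy) for name, xy in zip(labels, positions)}
```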
The electronic equipment can obtain the image position information corresponding to each face key point in the target face in the face image and can also obtain the semantic information corresponding to each face key point.
In another implementation manner, the electronic device may first process each face image by using a HeatMap algorithm to obtain a thermodynamic diagram corresponding to each face image, and then determine image position information corresponding to each face key point in the target face and semantic information corresponding to the image position information from the thermodynamic diagram corresponding to each face image by using a preset clustering algorithm.
The semantic information corresponding to a face key point is: information describing the attribute of that face key point in the target face image. For example, semantic information includes, but is not limited to: the left-face outer eye corner point, the left-face inner eye corner point, the right-face outer eye corner point, the right-face inner eye corner point, and the like.
In the thermodynamic diagram corresponding to each face image, the pixel value of each pixel point represents the brightness of the pixel point: the larger the pixel value, the greater the brightness. The pixel value can also represent the likelihood that the pixel point is a target point, i.e. a key point: the larger the pixel value, the more likely the pixel point is the target point.
In one case, each face image may correspond to a plurality of thermodynamic diagrams, where each thermodynamic diagram corresponds to one piece of semantic information describing the attribute of the corresponding face key point in the target face image. For example, the thermodynamic diagrams corresponding to a face image may include one for the left-face outer eye corner point, one for the left-face inner eye corner point, one for the right-face outer eye corner point, one for the right-face inner eye corner point, and so on. Correspondingly, in the thermodynamic diagram corresponding to the left-face outer eye corner point, if the pixel value at pixel point (x0, y0) is the largest, the left outer eye corner point is most likely located at (x0, y0).
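The patent does not name the preset clustering algorithm. The sketch below is one assumed realization over a single per-semantic heat map: above-threshold pixels are grouped greedily, strongest first, and each group yields one candidate position. The threshold and minimum peak separation are illustrative parameters. A forcefully closed eye can produce two peaks in the outer-eye-corner map (the true corner and the brow tail), which is exactly the multi-candidate case handled in the later steps.

```python
import numpy as np

def heatmap_peaks(heatmap, threshold=0.5, min_separation=5):
    """Greedily cluster above-threshold heat-map pixels into candidate
    key point positions, returning a list of (x, y) peaks."""
    ys, xs = np.where(heatmap > threshold)
    order = np.argsort(-heatmap[ys, xs])     # strongest responses first
    peaks = []
    for i in order:
        x, y = int(xs[i]), int(ys[i])
        # Start a new cluster only if this pixel is far from every
        # already-accepted peak; otherwise it belongs to an existing one.
        if all((x - px) ** 2 + (y - py) ** 2 >= min_separation ** 2
               for px, py in peaks):
            peaks.append((x, y))
    return peaks
```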
S103: and determining the labeling position information of each face key point of the target face in each face image based on the image position information corresponding to the face key point in the target face in each face image and a preset key point labeling rule.
Wherein, presetting the key point marking rule comprises: the method comprises the following steps of labeling rules based on image position information corresponding to face key points in a target face image and/or labeling rules based on image position information corresponding to the face key points which are obtained by traversing in each face image and meet specified position conditions, wherein the target face image is as follows: and the face images meet the specified screening rules in the face images.
In this step, a preset key point labeling rule is stored in advance, locally on the electronic device or in a connected storage device. After the electronic device obtains the image position information corresponding to each face key point of the target face in each face image, it can, based on the preset key point labeling rule and that image position information, determine from the face images the face image meeting the specified screening rule, i.e. the face image with accurate position detection results, as the target face image, and determine the labeling position information of each face key point in each face image based on the image position information corresponding to each face key point in the target face image; and/or it can traverse each face image for the image position information corresponding to the face key points meeting the specified position condition, i.e. the image position information of the face key points with accurate position detection results, and determine from it the labeling position information of each face key point in each face image. In this way, labeling position information of accurately positioned face key points in the face images is obtained.
Subsequently, in an implementation manner, after determining the annotation position information of each face key point in each face image, the electronic device may perform a subsequent process based on the annotation position information of each face key point in each face image, for example: and performing a face recognition process, or performing a fatigue driving behavior detection process, or performing a personnel identity verification process, or sending the labeled position information of each face key point in each face image to other electronic equipment, so that the other electronic equipment executes the corresponding preset process.
By applying the embodiments of the invention, under the rule that labels based on image position information corresponding to face key points in a target face image, a target face image meeting the specified screening rule, i.e. one with accurate position detection results, is determined from the face images, and the image position information corresponding to the face key points in that target face image is used to label the position information of the face key points in every face image; and/or, under the rule that labels based on image position information, obtained by traversing each face image, corresponding to face key points meeting the specified position condition, labeling uses the image position information corresponding to the accurately detected face key points in each face image. Either way, labeled position information of accurately positioned face feature points is obtained, improving the accuracy of the position recognition result of face key points.
In another embodiment of the present invention, as shown in fig. 2, the method may include the steps of:
s201: and acquiring the face images acquired by a plurality of image acquisition devices in the same acquisition period.
The plurality of image acquisition devices shoot the target face from different angles.
S202: based on each face image, its corresponding thermodynamic diagram is determined.
S203: and aiming at each face image, determining image position information corresponding to each face key point in the target face and semantic information corresponding to the image position information from the thermodynamic diagram corresponding to the face image by using a preset clustering algorithm.
S204: and aiming at each face image, determining image position information corresponding to each face key point corresponding to the target semantic information from the face key points in the face image based on the semantic information corresponding to each face key point in the target face in the face image.
The target semantic information is semantic information corresponding to image position information corresponding to at least two face key points;
s205: aiming at each target semantic information corresponding to each face image, sequentially taking the image position information corresponding to the face key point corresponding to the target semantic information as the target image position information corresponding to the face key point corresponding to the target semantic information; and determining a reprojection error corresponding to the face image based on the target image position information corresponding to the face key point corresponding to each target semantic information in the face image, the image position information corresponding to the face key points corresponding to other semantic information and a preset three-dimensional face model.
Wherein, the other semantic information is: semantic information other than the target semantic information;
s206: and determining the labeling position information of each face characteristic point in each face image based on each reprojection error corresponding to the face image and the image position information corresponding to a group of face key points corresponding to each reprojection error corresponding to the face image.
In the embodiments of the present invention, it is considered that directly using the preset face key point detection algorithm to detect the face key points in a face image may, in some cases, yield inaccurate position information for the detected face feature points; for example, when the target face is in a state of forcefully opening and closing the eyes, errors are likely in the detected image position information of the eye corner points. To ensure the accuracy of the detected positions, the electronic device may first process each face image with a HeatMap algorithm to obtain the thermodynamic diagram corresponding to each face image, then use the preset clustering algorithm to determine, from each thermodynamic diagram, the image position information corresponding to each face key point of the target face and the semantic information corresponding to that image position information, and finally determine the labeling position information of each face key point in each face image based on the image position information and semantic information so determined.
In one aspect, the process of determining the labeled position information of each face key point in each face image based on the image position information corresponding to each face key point in the target face determined from the thermodynamic diagram and the semantic information corresponding to the image position information may be: for each face image, determining semantic information corresponding to at least two image position information from the semantic information as target semantic information based on semantic information corresponding to each face key point in the target face in the face image; and then, determining image position information corresponding to the face key points corresponding to the target semantic information from the face key points in the face image.
For each target semantic information corresponding to each face image, the image position information corresponding to the face key point of that target semantic information is used in turn as the target image position information of that face key point; and the reprojection error corresponding to the face image is determined based on the target image position information corresponding to the face key point of each target semantic information in the face image, the image position information corresponding to the face key points of the other semantic information, and a preset three-dimensional face model. The preset three-dimensional face model includes the three-dimensional position information of the spatial points corresponding to the face key points of each target semantic information and of the other semantic information, and can be determined based on a 3D morphable model (3DMM).
Specifically, the process of determining the reprojection error corresponding to the face image may be: determine, from the preset three-dimensional face model, the three-dimensional position information of the spatial point corresponding to each target semantic information and of the spatial points corresponding to the other semantic information, as the model three-dimensional position information; based on a preset position conversion relation, convert the spatial points identified by the model three-dimensional position information from the spatial rectangular coordinate system of the preset three-dimensional face model into the device coordinate system of the image acquisition device corresponding to the face image; project, using the preset projection formula of that image acquisition device, the spatial point corresponding to each target semantic information and the spatial points corresponding to the other semantic information into the face image, obtaining the projection position information of the projection point of the spatial point corresponding to the face key point of each target semantic information and of each other semantic information; then compute the distance between the projection position information and the target image position information for each target semantic information, and the distance between the projection position information and the image position information for each other semantic information, and take the sum of the computed distances as the reprojection error corresponding to the face image.
The image acquisition device corresponding to a face image is: the image acquisition device that captured that face image; the preset projection formula is determined by that image acquisition device. The preset position conversion relation is: the conversion relation between the spatial rectangular coordinate system in which the preset three-dimensional face model is located and the device coordinate system of the image acquisition device corresponding to the face image.
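A minimal sketch of this reprojection-error computation, assuming the preset position conversion relation is given as a rotation `R` and translation `t` and the preset projection formula as a pinhole intrinsic matrix `K`; these representations, and all names, are illustrative rather than fixed by the patent:

```python
import numpy as np

def reprojection_error(model_pts_3d, image_pts_2d, R, t, K):
    """Sum of pixel distances between projected model points and the
    detected (or candidate) image positions of the same key points.

    model_pts_3d: (N, 3) spatial points from the preset 3D face model.
    image_pts_2d: (N, 2) image position information in this face image.
    R, t: model coordinate system -> device coordinate system (assumed form).
    K:    3x3 intrinsics standing in for the preset projection formula.
    """
    cam = model_pts_3d @ R.T + t            # into the device coordinate system
    proj = cam @ K.T                        # apply the projection formula
    proj = proj[:, :2] / proj[:, 2:3]       # perspective divide -> pixel coords
    return float(np.linalg.norm(proj - image_pts_2d, axis=1).sum())
```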
In another case, the target semantic information may be set in advance according to actual conditions. In one implementation, the target face is in a state of forcefully opening and closing the eyes. In that state many wrinkles tend to appear around the eyes, and these wrinkles can cause the detected position of an eye corner point to be wrong; for example, the eyebrow tail point above the eye is detected as the outer eye corner point. To ensure the accuracy of the position information of the outer eye corner points among the determined face key points, the target semantic information can include the left outer eye corner point and/or the right outer eye corner point. As another example, the eyebrow head point above the eye may be detected as the inner eye corner point; to ensure the accuracy of the position information of the inner eye corner points, the target semantic information can include the left inner eye corner point and/or the right inner eye corner point.
Correspondingly, the process of determining the labeling position information of each face key point in each face image, based on the image position information of each face key point of the target face determined from the thermodynamic diagram and its corresponding semantic information, may be: directly determine, based on the semantic information corresponding to each face key point of the target face in the face image, the image position information corresponding to the face key points of the target semantic information from the face key points in the face image, for example determining from the face image the face key points whose semantic information characterizes them as eye corner points, together with their corresponding image position information.
Further, aiming at each target semantic information corresponding to each face image, sequentially taking the image position information corresponding to the face key point corresponding to the target semantic information as the target image position information corresponding to the face key point corresponding to the target semantic information; and determining a reprojection error corresponding to the face image based on the target image position information corresponding to the face key point corresponding to each target semantic information in the face image, the image position information corresponding to the face key points corresponding to other semantic information and a preset three-dimensional face model.
For example, when the target face is in a state of forcefully opening and closing the eyes, two pieces of image position information corresponding to the target semantic information "left outer eye corner point" and/or two pieces corresponding to "right outer eye corner point" may appear in the thermodynamic diagram corresponding to the face image, as shown in fig. 3. Correspondingly, in order to determine the accurately positioned image position information of the left outer eye corner point from the 2 candidates, the electronic device can take the 2 pieces of image position information corresponding to the left outer eye corner point in turn as its target image position information; and/or, in order to determine the accurately positioned image position information of the right outer eye corner point from its 2 candidates, take the 2 pieces of image position information corresponding to the right outer eye corner point in turn as its target image position information; and determine the reprojection error corresponding to the face image based on the target image position information of the left outer eye corner point, the target image position information of the right outer eye corner point, the image position information corresponding to the face key points of the other semantic information, and the preset three-dimensional face model.
It can be understood that the number of reprojection errors corresponding to a face image is related to the number of target semantic information in the face image and the number of pieces of image position information corresponding to each target semantic information. For example, if the number of target semantic information in the face image is 2 and the number of pieces of image position information corresponding to each target semantic information is 3, the number of reprojection errors corresponding to the face image is 2 × 3 = 6.
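A sketch of this traversal, consistent with the counting rule above (each candidate of each target semantic is tried in turn). The patent does not spell out how the remaining target semantics are held while one is varied; here each is varied independently against a base assignment, and `error_fn` is assumed to be a reprojection-error routine such as the earlier sketch.

```python
def traverse_candidates(candidates, base_assignment, error_fn):
    """Try each candidate position of each target semantic in turn and
    return the (error, semantic, position) triple with the smallest
    reprojection error.

    candidates:      {semantic: [xy, ...]} for semantics with >= 2 detections
    base_assignment: {semantic: xy} current position of every key point
    """
    errors = []
    for name, cand_list in candidates.items():
        for xy in cand_list:
            trial = dict(base_assignment)
            trial[name] = xy                      # substitute one candidate
            errors.append((error_fn(trial), name, xy))
    # 2 target semantics x 3 candidates each -> 2 * 3 = 6 evaluated errors.
    return min(errors, key=lambda e: e[0])
```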
Subsequently, after the electronic device determines the reprojection error corresponding to each face image, for each face image, based on each reprojection error corresponding to the face image and the image position information corresponding to a group of face key points corresponding to each reprojection error corresponding to the face image, the annotation position information of each face feature point in the face image is determined.
In another embodiment of the present invention, considering that the relationship between the three-dimensional position information corresponding to each human face feature point in the preset three-dimensional human face model conforms to the actual human face feature, the smaller the reprojection error corresponding to the human face image is, the higher the accuracy of the image position information representing the human face feature point corresponding to each semantic information detected from the human face image is. Correspondingly, the preset key point labeling rule may include: a rule for labeling based on image position information corresponding to the face key points which meet the specified position conditions and are obtained by traversing in each face image;
the S206 may include the following steps 011:
011: determining the image position information corresponding to the group of face key points corresponding to the smallest-valued reprojection error among the reprojection errors corresponding to the face image, as the labeling position information of each face feature point in the face image.
In the embodiments of the present invention, the face key points meeting the specified position condition may refer to: among all face key points in the face image, the group of face key points whose image position information yields the smallest of the multiple reprojection errors computed for the face image.
Considering that the relationships between the three-dimensional position information of the face feature points in the preset three-dimensional face model conform to real face characteristics, the smaller the determined reprojection error corresponding to a face image, the more accurate the image position information of the face feature points detected from the image for each semantic information; at the same time, the detection accuracy of the image position information of the face feature points differs between face images. In order to determine labeling position information of the face key points with higher accuracy in each face image, in another embodiment of the present invention, the preset key point labeling rule includes: a labeling rule based on image position information corresponding to face key points in a target face image, and a labeling rule based on image position information, obtained by traversing each face image, corresponding to face key points meeting the specified position condition;
the step S206 may include the following steps 021-025:
021: for each face image, determining the image position information corresponding to the group of face key points associated with the smallest reprojection error among the reprojection errors corresponding to the face image as the intermediate position information of the face key points in that face image.
022: displaying each face image together with the intermediate position information of its face key points, so that a user can select, from the displayed face images, at least two frames of target face images in which the face key point detection results are accurate.
023: determining the at least two frames of target face images selected by the user based on the user's operation.
024: determining the three-dimensional position information corresponding to each face key point based on the intermediate position information of the face key point in each target face image and the device information of the image acquisition device corresponding to each target face image (a triangulation sketch follows these steps).
025: determining the labeling position information of each face key point in each face image based on the three-dimensional position information corresponding to the face key point, the device information of the image acquisition device corresponding to each face image, and a preset projection formula.
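The embodiment does not spell out how step 024 recovers three-dimensional positions from the intermediate positions; one standard choice is linear (DLT) triangulation across the selected target face images. A sketch under that assumption, where each device's 3x4 projection matrix combines its intrinsic parameters and pose:

```python
import numpy as np

def triangulate_keypoint(image_points, projection_matrices):
    """Step 024 sketch: triangulate one face key point from its
    intermediate 2D position in each selected target face image.

    image_points:        list of (u, v) pixel positions, one per target image
    projection_matrices: list of 3x4 arrays P = K @ [R | t], one per device
    Returns the three-dimensional position as a length-3 array.
    """
    rows = []
    for (u, v), P in zip(image_points, projection_matrices):
        # Each view contributes two linear constraints on the 3D point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # Homogeneous least squares: the solution is the right singular
    # vector of A associated with its smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```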
In the embodiment of the invention, for each face image, the electronic device selects the smallest of the reprojection errors corresponding to the face image and takes the image position information of the associated group of face key points as the intermediate position information of the face key points in that image. It then displays each face image with this intermediate position information so that the user can select at least two frames of target face images in which the detection results are accurate, and it determines those target face images from the user's selection operation.
For each piece of semantic information, the electronic device determines the three-dimensional position information of the corresponding face key point from the key point's intermediate position information in each target face image and the device information of the corresponding image acquisition devices. Once the three-dimensional position information of every face key point has been determined, the spatial point it represents is projected into each face image using the device information of the corresponding image acquisition device and the preset projection formula, which yields the labeling position information of each face key point in each face image.
The device information of the image acquisition device corresponding to a face image may include the device pose information and the device intrinsic parameter information of the device at the time it acquired the face image. The intrinsic parameters include, but are not limited to: the physical length of a pixel along the horizontal and vertical axes of the imaging plane, the focal length, the position of the principal point (the intersection of the device's optical axis with the image plane), and the scale factor. The device pose information is the position and orientation of the device when the face image was acquired.
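Under the usual pinhole model, the "preset projection formula" of step 025 is a rigid transform into the device frame followed by multiplication with the intrinsic matrix; a minimal sketch under that assumption, with all names illustrative:

```python
import numpy as np

def project_point(X_world, R, t, K):
    """Step 025 sketch: project a triangulated face key point into one
    face image using the device pose and intrinsic parameters.

    X_world: length-3 array, 3D position of the key point
    R, t:    device pose, world-to-camera rotation (3x3) and translation (3,)
    K:       3x3 intrinsic matrix built from the focal length, pixel
             sizes and principal point described above
    """
    X_cam = R @ X_world + t   # into the device coordinate frame
    uvw = K @ X_cam           # apply the pinhole projection
    return uvw[:2] / uvw[2]   # labeling position (u, v) in pixels
```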
In the embodiment of the invention, the face key points whose image position information is more accurate are determined both by screening the image position information detected in each face image and by having a person visually inspect the key point positions detected across the multiple frames; carrying out the subsequent position labeling on this basis further improves the accuracy of the labeled position information of the face key points in each face image.
Since, among the multi-frame face images, position detection of the face key points is more accurate in some frames and less accurate in others, the frames in which the detection is more accurate can be identified manually as target face images, and the labeling position information of the face key points in every face image can then be determined from the image position information in those target face images, improving the accuracy of the position recognition results. Correspondingly, in another embodiment of the present invention, the preset key point labeling rule includes: a rule for labeling based on the image position information corresponding to the face key points in the target face image; as shown in fig. 4, the method may include the following steps:
S401: acquiring the face images acquired by a plurality of image acquisition devices in the same acquisition period.
The plurality of image acquisition devices shoot the target face from different angles.
S402: detecting, from each face image, the image position information corresponding to each face key point in the target face.
S403: displaying each face image together with the image position information corresponding to each face key point in the target face, so that the user can select, from the displayed face images, at least two frames of target face images in which the face key point detection results are accurate.
S404: determining the at least two frames of target face images selected by the user based on the user's operation.
S405: determining the three-dimensional position information corresponding to each face key point based on the image position information corresponding to the face key point in each target face image and the device information of the image acquisition device corresponding to each target face image.
S406: determining the labeling position information of each face key point in each face image based on the three-dimensional position information corresponding to the face key point, the device information of the image acquisition device corresponding to each face image, and the preset projection formula.
In the embodiment of the invention, after detecting the image position information corresponding to each face key point in the target face from each face image, the electronic device directly displays each face image with that image position information; the user then selects, from the displayed images, at least two frames of target face images in which the detection results are accurate and performs the selection operation. Based on this operation, the electronic device determines the at least two frames of target face images, namely the face images the user judges to contain the more accurate image position information for the face key points.
Subsequently, for each piece of semantic information, the electronic device determines the three-dimensional position information of the corresponding face key point from the key point's image position information in each target face image and the device information of the corresponding image acquisition devices. Once the three-dimensional position information of every face key point has been determined, the spatial point it represents is projected into each face image using the device information of the corresponding image acquisition device and the preset projection formula, which yields the labeling position information of each face key point in each face image. An end-to-end sketch of S405 and S406 follows.
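The sketch below condenses S405 and S406 for a single face key point into one self-contained routine (triangulate from the user-selected target images, then reproject into every face image); it repeats the triangulation and projection steps sketched earlier, and all inputs and names are placeholders:

```python
import numpy as np

def label_keypoint_in_all_images(points_2d, target_Ps, all_K, all_R, all_t):
    """S405-S406 sketch for one face key point.

    points_2d: (u, v) detections of the key point in each target face image
    target_Ps: 3x4 projection matrices of the target images' devices
    all_K/all_R/all_t: intrinsics and poses of every device, target or not
    Returns the 3D position and its labeling position in every face image.
    """
    # S405: linear triangulation from the selected target images.
    rows = []
    for (u, v), P in zip(points_2d, target_Ps):
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    _, _, vt = np.linalg.svd(np.stack(rows))
    X = vt[-1][:3] / vt[-1][3]
    # S406: reproject the spatial point into every face image,
    # not only the user-selected target ones.
    labels = []
    for K, R, t in zip(all_K, all_R, all_t):
        uvw = K @ (R @ X + t)
        labels.append(uvw[:2] / uvw[2])
    return X, labels
```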
Corresponding to the above method embodiment, an embodiment of the present invention provides a device for labeling key points of a human face, and as shown in fig. 5, the device includes:
an obtaining module 510 configured to obtain face images acquired by a plurality of image acquisition devices in the same acquisition period, wherein the plurality of image acquisition devices shoot a target face from different angles;
a detection module 520 configured to detect image position information corresponding to each face key point in the target face from each face image;
a determining module 530, configured to determine the labeling position information of each face key point of the target face in each face image based on the image position information corresponding to the face key points in the target face in each face image and a preset key point labeling rule, where the preset key point labeling rule includes: a rule for labeling based on the image position information corresponding to the face key points in a target face image, and/or a rule for labeling based on the image position information, obtained by traversing each face image, corresponding to the face key points that satisfy a specified position condition, the target face image being a face image, among the face images, that meets a specified screening rule.
By applying the embodiment of the invention, under the rule of labeling based on the image position information corresponding to the face key points in the target face image, a target face image that meets the specified screening rule, i.e. whose position detection results are accurate, is determined from the face images, and the position information of the face key points in every face image is then labeled using the image position information from that target face image; and/or, under the rule of labeling based on the image position information, obtained by traversing each face image, corresponding to the face key points that satisfy the specified position condition, the labeling uses the image position information of the face key points whose detection results in each face image are accurate. Either way, labeled position information with accurate positions is obtained, improving the accuracy of the position recognition results of the face key points.
In another embodiment of the present invention, the detecting module 520 includes:
a first determining unit (not shown in the figures), configured to determine a corresponding heatmap based on each face image;
a second determining unit (not shown in the figures), configured to determine, for each face image, the image position information corresponding to each face key point in the target face, and the semantic information corresponding to that image position information, from the heatmap corresponding to the face image by using a preset clustering algorithm;
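The "preset clustering algorithm" is left open by the embodiment; below is a minimal sketch that thresholds a heatmap, groups the responses with DBSCAN, and takes each cluster's peak as a key point position. scikit-learn, the threshold, and the DBSCAN parameters are assumptions, and the assignment of semantic information to clusters is omitted:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def keypoints_from_heatmap(heatmap, threshold=0.5):
    """Second-determining-unit sketch: extract key point image
    positions from one face image's heatmap by clustering.

    heatmap: (H, W) array of detector responses, assumed in [0, 1]
    Returns a list of (x, y) positions, one per response cluster.
    """
    ys, xs = np.nonzero(heatmap > threshold)
    if len(xs) == 0:
        return []
    pts = np.stack([xs, ys], axis=1)
    labels = DBSCAN(eps=3, min_samples=2).fit_predict(pts)
    keypoints = []
    for lbl in set(labels) - {-1}:  # -1 marks DBSCAN noise
        members = pts[labels == lbl]
        scores = heatmap[members[:, 1], members[:, 0]]
        keypoints.append(tuple(members[np.argmax(scores)]))  # cluster peak
    return keypoints
```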
the determining module 530 includes:
a third determining unit (not shown in the figures), configured to determine, for each face image, the image position information corresponding to the face key points associated with target semantic information from among the face key points in the face image, based on the semantic information corresponding to each face key point of the target face in that image, where the target semantic information is semantic information to which the image position information of at least two face key points corresponds;
a fourth determining unit (not shown in the figures), configured to, for each piece of target semantic information corresponding to each face image, take in turn each candidate image position for the face key point of that target semantic information as the target image position information, and to determine a reprojection error corresponding to the face image based on the target image position information, the image position information of the face key points corresponding to the other semantic information (i.e. the semantic information other than the target semantic information), and the preset three-dimensional face model (a sketch of this computation follows this list);
a fifth determining unit (not shown in the figures), configured to determine, for each face image, the labeling position information of the face key points in the face image based on the reprojection errors corresponding to the face image and the image position information of the group of face key points associated with each such reprojection error.
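The reprojection error used by the fourth determining unit could be realized by fitting the preset three-dimensional face model to a candidate set of 2D key points and measuring how far the model points reproject from the detections. The sketch below stands in OpenCV's PnP solver for the unspecified fitting step; that choice, and all names, are assumptions. The candidate yielding the smallest such error is the one the fifth determining unit retains.

```python
import cv2
import numpy as np

def reprojection_error(points_2d, model_points_3d, K):
    """Fourth-determining-unit sketch: one candidate's reprojection error.

    points_2d:       (N, 2) image positions, one per semantic label, with
                     the current target-semantic candidate substituted in
    model_points_3d: (N, 3) corresponding points of the preset 3D face model
    K:               3x3 intrinsic matrix of the device
    """
    K = np.asarray(K, dtype=np.float64)
    ok, rvec, tvec = cv2.solvePnP(model_points_3d.astype(np.float32),
                                  points_2d.astype(np.float32), K, None)
    if not ok:
        return np.inf
    projected, _ = cv2.projectPoints(model_points_3d.astype(np.float32),
                                     rvec, tvec, K, None)
    residuals = projected.reshape(-1, 2) - points_2d
    return float(np.linalg.norm(residuals, axis=1).mean())
```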
In another embodiment of the present invention, the preset key point labeling rule includes: a rule for labeling based on the image position information, obtained by traversing each face image, corresponding to the face key points that satisfy the specified position condition;
the fifth determining unit is specifically configured to determine, for each face image, the image position information corresponding to the group of face key points associated with the smallest reprojection error among the reprojection errors corresponding to the face image as the labeling position information of the face key points in that face image.
In another embodiment of the present invention, the preset key point labeling rule includes: a rule for labeling based on the image position information corresponding to the face key points in the target face image, together with a rule for labeling based on the image position information, obtained by traversing each face image, corresponding to the face key points that satisfy the specified position condition;
the fifth determining unit is specifically configured to determine, for each face image, the image position information corresponding to the group of face key points associated with the smallest reprojection error among the reprojection errors corresponding to the face image as the intermediate position information of the face key points in that face image;
display each face image together with the intermediate position information of its face key points, so that a user can select, from the displayed face images, at least two frames of target face images in which the face key point detection results are accurate;
determine the at least two frames of target face images selected by the user based on the user's operation;
determine the three-dimensional position information corresponding to each face key point based on the intermediate position information of the face key point in each target face image and the device information of the image acquisition device corresponding to each target face image;
and determine the labeling position information of each face key point in each face image based on the three-dimensional position information corresponding to the face key point, the device information of the image acquisition device corresponding to each face image, and the preset projection formula.
In another embodiment of the present invention, the target semantic information includes a left eye corner point and/or a right eye corner point.
In another embodiment of the present invention, the preset key point labeling rule includes: a rule for labeling based on the image position information corresponding to the face key points in the target face image;
the determining module 530 is specifically configured to display each face image together with the image position information corresponding to each face key point in the target face, so that the user can select, from the displayed face images, at least two frames of target face images in which the face key point detection results are accurate;
determine the at least two frames of target face images selected by the user based on the user's operation;
determine the three-dimensional position information corresponding to each face key point based on the image position information corresponding to the face key point in each target face image and the device information of the image acquisition device corresponding to each target face image;
and determine the labeling position information of each face key point in each face image based on the three-dimensional position information corresponding to the face key point, the device information of the image acquisition device corresponding to each face image, and the preset projection formula.
The device embodiments correspond to the method embodiments and achieve the same technical effects; for details, refer to the description of the method embodiments, which is not repeated here. Those of ordinary skill in the art will understand that the figures are merely schematic representations of one embodiment, and that the blocks or flows shown in them are not necessarily required to practice the present invention.
Those of ordinary skill in the art will also understand that the modules of the devices in the embodiments may be distributed across the devices as described, or may, with corresponding changes, be located in one or more devices different from those of the embodiments. The modules of the above embodiments may be combined into one module, or further split into multiple sub-modules.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for labeling key points of a human face is characterized by comprising the following steps:
acquiring face images acquired by a plurality of image acquisition devices in the same acquisition period, wherein the plurality of image acquisition devices shoot a target face from different angles;
detecting, from each face image, image position information corresponding to each face key point in the target face;
determining labeling position information of each face key point of the target face in each face image based on the image position information corresponding to the face key points in the target face in each face image and a preset key point labeling rule, wherein the preset key point labeling rule comprises: a rule for labeling based on the image position information corresponding to the face key points in a target face image, and/or a rule for labeling based on the image position information, obtained by traversing each face image, corresponding to the face key points that satisfy a specified position condition, the target face image being a face image, among the face images, that meets a specified screening rule.
2. The method according to claim 1, wherein the step of detecting image position information corresponding to each face key point in the target face from each face image comprises:
determining a heatmap corresponding to each face image based on the face image;
for each face image, determining the image position information corresponding to each face key point in the target face, and the semantic information corresponding to that image position information, from the heatmap corresponding to the face image by using a preset clustering algorithm;
the step of determining labeling position information of each face key point of the target face in each face image based on image position information corresponding to the face key point in the target face in each face image and a preset key point labeling rule comprises the following steps:
for each face image, determining the image position information corresponding to the face key points associated with target semantic information from among the face key points in the face image, based on the semantic information corresponding to each face key point of the target face in that image, wherein the target semantic information is semantic information to which the image position information of at least two face key points corresponds;
for each piece of target semantic information corresponding to each face image, taking in turn each candidate image position for the face key point of that target semantic information as the target image position information, and determining a reprojection error corresponding to the face image based on the target image position information, the image position information of the face key points corresponding to the other semantic information (i.e. the semantic information other than the target semantic information), and a preset three-dimensional face model;
and determining the labeling position information of the face key points in each face image based on the reprojection errors corresponding to the face image and the image position information of the group of face key points associated with each such reprojection error.
3. The method of claim 2, wherein the preset key point labeling rule comprises: a rule for labeling based on the image position information, obtained by traversing each face image, corresponding to the face key points that satisfy the specified position condition;
the step of determining, for each face image, the labeling position information of the face key points in the face image based on the reprojection errors corresponding to the face image and the image position information of the associated groups of face key points comprises:
for each face image, determining the image position information corresponding to the group of face key points associated with the smallest reprojection error among the reprojection errors corresponding to the face image as the labeling position information of the face key points in that face image.
4. The method of claim 2, wherein the preset key point labeling rule comprises: a rule for labeling based on the image position information corresponding to the face key points in the target face image, together with a rule for labeling based on the image position information, obtained by traversing each face image, corresponding to the face key points that satisfy the specified position condition;
the step of determining, for each face image, the labeling position information of the face key points in the face image based on the reprojection errors corresponding to the face image and the image position information of the associated groups of face key points comprises:
for each face image, determining the image position information corresponding to the group of face key points associated with the smallest reprojection error among the reprojection errors corresponding to the face image as the intermediate position information of the face key points in that face image;
displaying each face image together with the intermediate position information of its face key points, so that a user can select, from the displayed face images, at least two frames of target face images in which the face key point detection results are accurate;
determining the at least two frames of target face images selected by the user based on the user's operation;
determining the three-dimensional position information corresponding to each face key point based on the intermediate position information of the face key point in each target face image and the device information of the image acquisition device corresponding to each target face image;
and determining the labeling position information of each face key point in each face image based on the three-dimensional position information corresponding to the face key point, the device information of the image acquisition device corresponding to each face image, and the preset projection formula.
5. The method according to any one of claims 2 to 4, wherein the target semantic information includes a left eye corner point and/or a right eye corner point.
6. The method of claim 1, wherein the preset key point labeling rule comprises: a rule for labeling based on the image position information corresponding to the face key points in the target face image;
the step of determining the labeling position information of each face key point of the target face in each face image based on the image position information corresponding to the face key points in the target face in each face image and the preset key point labeling rule comprises:
displaying each face image together with the image position information corresponding to each face key point in the target face, so that a user can select, from the displayed face images, at least two frames of target face images in which the face key point detection results are accurate;
determining the at least two frames of target face images selected by the user based on the user's operation;
determining the three-dimensional position information corresponding to each face key point based on the image position information corresponding to the face key point in each target face image and the device information of the image acquisition device corresponding to each target face image;
and determining the labeling position information of each face key point in each face image based on the three-dimensional position information corresponding to the face key point, the device information of the image acquisition device corresponding to each face image, and a preset projection formula.
7. A labeling device for key points of a human face is characterized by comprising:
an obtaining module, configured to obtain face images acquired by a plurality of image acquisition devices in the same acquisition period, wherein the plurality of image acquisition devices shoot a target face from different angles;
a detection module, configured to detect, from each face image, image position information corresponding to each face key point in the target face;
a determining module, configured to determine labeling position information of each face key point of the target face in each face image based on the image position information corresponding to the face key points in the target face in each face image and a preset key point labeling rule, wherein the preset key point labeling rule comprises: a rule for labeling based on the image position information corresponding to the face key points in a target face image, and/or a rule for labeling based on the image position information, obtained by traversing each face image, corresponding to the face key points that satisfy a specified position condition, the target face image being a face image, among the face images, that meets a specified screening rule.
8. The apparatus of claim 7, wherein the detection module comprises:
the first determining unit is configured to determine a corresponding heatmap based on each face image;
the second determining unit is configured to determine, for each face image, the image position information corresponding to each face key point in the target face, and the semantic information corresponding to that image position information, from the heatmap corresponding to the face image by using a preset clustering algorithm;
the determining module includes:
a third determining unit, configured to determine, for each face image, the image position information corresponding to the face key points associated with target semantic information from among the face key points in the face image, based on the semantic information corresponding to each face key point of the target face in that image, wherein the target semantic information is semantic information to which the image position information of at least two face key points corresponds;
a fourth determining unit, configured to, for each piece of target semantic information corresponding to each face image, take in turn each candidate image position for the face key point of that target semantic information as the target image position information, and to determine a reprojection error corresponding to the face image based on the target image position information, the image position information of the face key points corresponding to the other semantic information (i.e. the semantic information other than the target semantic information), and a preset three-dimensional face model;
and a fifth determining unit, configured to determine, for each face image, the labeling position information of the face key points in the face image based on the reprojection errors corresponding to the face image and the image position information of the group of face key points associated with each such reprojection error.
9. The apparatus of claim 8, wherein the preset key point labeling rule comprises: a rule for labeling based on the image position information, obtained by traversing each face image, corresponding to the face key points that satisfy the specified position condition;
the fifth determining unit is specifically configured to determine, for each face image, the image position information corresponding to the group of face key points associated with the smallest reprojection error among the reprojection errors corresponding to the face image as the labeling position information of the face key points in that face image.
10. The apparatus of claim 8, wherein the preset key point labeling rule comprises: a rule for labeling based on the image position information corresponding to the face key points in the target face image, together with a rule for labeling based on the image position information, obtained by traversing each face image, corresponding to the face key points that satisfy the specified position condition;
the fifth determining unit is specifically configured to determine, for each face image, the image position information corresponding to the group of face key points associated with the smallest reprojection error among the reprojection errors corresponding to the face image as the intermediate position information of the face key points in that face image;
display each face image together with the intermediate position information of its face key points, so that a user can select, from the displayed face images, at least two frames of target face images in which the face key point detection results are accurate;
determine the at least two frames of target face images selected by the user based on the user's operation;
determine the three-dimensional position information corresponding to each face key point based on the intermediate position information of the face key point in each target face image and the device information of the image acquisition device corresponding to each target face image;
and determine the labeling position information of each face key point in each face image based on the three-dimensional position information corresponding to the face key point, the device information of the image acquisition device corresponding to each face image, and the preset projection formula.
CN202010350817.7A 2020-04-28 2020-04-28 Method and device for labeling key points of human face Pending CN113569594A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010350817.7A CN113569594A (en) 2020-04-28 2020-04-28 Method and device for labeling key points of human face

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010350817.7A CN113569594A (en) 2020-04-28 2020-04-28 Method and device for labeling key points of human face

Publications (1)

Publication Number Publication Date
CN113569594A true CN113569594A (en) 2021-10-29

Family

ID=78158139

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010350817.7A Pending CN113569594A (en) 2020-04-28 2020-04-28 Method and device for labeling key points of human face

Country Status (1)

Country Link
CN (1) CN113569594A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114677734A (en) * 2022-03-25 2022-06-28 马上消费金融股份有限公司 Key point labeling method and device
CN114677734B (en) * 2022-03-25 2024-02-02 马上消费金融股份有限公司 Key point marking method and device
CN117173765A (en) * 2023-09-06 2023-12-05 广东工业大学 Large-scale mask face data set labeling method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20211126

Address after: 215100 floor 23, Tiancheng Times Business Plaza, No. 58, qinglonggang Road, high speed rail new town, Xiangcheng District, Suzhou, Jiangsu Province

Applicant after: MOMENTA (SUZHOU) TECHNOLOGY Co.,Ltd.

Address before: Room 601-a32, Tiancheng information building, No. 88, South Tiancheng Road, high speed rail new town, Xiangcheng District, Suzhou City, Jiangsu Province

Applicant before: MOMENTA (SUZHOU) TECHNOLOGY Co.,Ltd.