WO2020119458A1 - Face key point detection method, apparatus, computer device, and storage medium - Google Patents

Face key point detection method, apparatus, computer device, and storage medium

Info

Publication number
WO2020119458A1
WO2020119458A1 · PCT/CN2019/121237 · CN2019121237W
Authority
WO
WIPO (PCT)
Prior art keywords
image
detected
face
key
frame
Prior art date
Application number
PCT/CN2019/121237
Other languages
English (en)
French (fr)
Inventor
曹煊
曹玮剑
葛彦昊
汪铖杰
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Priority to KR1020217011194A priority Critical patent/KR102592270B1/ko
Priority to EP19897431.3A priority patent/EP3839807A4/en
Priority to JP2021516563A priority patent/JP2022502751A/ja
Publication of WO2020119458A1 publication Critical patent/WO2020119458A1/zh
Priority to US17/184,368 priority patent/US11915514B2/en

Classifications

    • G06V 40/165 — Human faces: Detection; Localisation; Normalisation using facial parts and geometric relationships
    • G06F 18/2413 — Classification techniques relating to the classification model, based on distances to training or reference patterns
    • G06N 3/045 — Neural networks: Combinations of networks
    • G06N 3/08 — Neural networks: Learning methods
    • G06V 10/764 — Image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06V 10/82 — Image or video recognition or understanding using neural networks
    • G06V 40/161 — Human faces: Detection; Localisation; Normalisation
    • G06V 40/171 — Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G06V 40/172 — Classification, e.g. identification
    • G06N 3/084 — Learning methods: Backpropagation, e.g. using gradient descent

Definitions

  • This application relates to the field of artificial intelligence, and in particular to a method, device, computer equipment, and storage medium for detecting facial key points.
  • Face key point detection technology plays a vital role in applications such as face recognition, face registration, and facial beauty.
  • Related face key point detection methods detect face key points based on the global features of the face.
  • Detection of face key points refers to locating the position coordinates of key points in a face image.
  • In related face key point detection, as shown in Fig. 1, the entire face picture is used as input, and the position coordinates of all face key points are output simultaneously through a neural network or a mathematical model.
  • Because the related key point detection method uses the entire face image as the detection object to perform key point detection, it suffers from low detection efficiency.
  • a face key point detection method, device, computer equipment, and storage medium that can improve detection efficiency are provided, which can solve the aforementioned problem of low efficiency in detecting face key points.
  • a face key point detection method includes:
  • the face image to be detected is a face image of a frame to be detected
  • a face key point detection device comprising:
  • the overall image acquisition module is used to acquire the facial image to be detected, and the facial image to be detected is the facial image of the frame to be detected;
  • a partial image determination module configured to determine partial images including key points in the facial image to be detected according to the facial image to be detected
  • the local candidate point determination module is used to determine candidate points of key points corresponding to each partial image based on each partial image;
  • the overall key point determination module is used to jointly constrain candidate points of each key point and determine key points of each face.
  • a computer device includes a memory and a processor.
  • the memory stores a computer program.
  • the processor, when executing the computer program, implements the following steps:
  • the face image to be detected is a face image of a frame to be detected
  • A computer-readable storage medium stores a computer program that, when executed by a processor, implements the following steps:
  • the face image to be detected is a face image of a frame to be detected
  • The above face key point detection method, device, computer equipment, and storage medium acquire the facial image to be detected, where the facial image to be detected is the facial image of the frame to be detected; determine, according to the facial image to be detected, the partial images that include each key point; determine, based on each partial image, the candidate points of the key points corresponding to that partial image; and jointly constrain the candidate points of each key point to determine each face key point. Since partial images containing each key point are taken from the entire facial image to be detected, and the candidate points of the corresponding key points are determined separately within each partial image, the amount of calculation can be reduced and the efficiency of determining key point candidate points can be improved. Thus, the detection efficiency of each face key point can be improved.
  • the face key point detection method is applied to makeup applications, due to the improved detection efficiency, the time consumption of key point detection can be reduced, the stuck phenomenon at runtime can be reduced, and a smoother makeup effect can be provided.
  • FIG. 1 is a schematic diagram of a face key point detection method in a related technical manner
  • FIG. 2 is an application environment diagram of a face key point detection method in an embodiment
  • FIG. 3 is a schematic flowchart of a method for detecting key points of a face in an embodiment
  • FIG. 4 is a schematic diagram of a neural network structure of a key point detection method in a specific embodiment
  • FIG. 5 is a schematic diagram of another neural network structure of a face key point detection method in a specific embodiment
  • FIG. 6 is a schematic diagram of a face key point detection method in a specific embodiment
  • FIG. 7 is a schematic flow chart of a method for detecting key points of a face in a specific embodiment
  • FIG. 8 is a diagram showing an example of applying makeup by using related technical methods
  • FIG. 9 is an example diagram of applying makeup accurately by the face key point detection method of an embodiment
  • FIG. 10 is a structural block diagram of a face key point detection device in an embodiment.
  • FIG. 2 is an application environment diagram of a face key point detection method in an embodiment.
  • the face key point detection method can be applied to computer equipment.
  • the computer device may be a terminal or a server.
  • the terminal may be a desktop device or a mobile terminal, for example, a mobile phone, a tablet computer, a desktop computer, and so on.
  • the server may be an independent physical server, a physical server cluster, or a virtual server.
  • the computer device includes a processor, a memory, and a network interface connected through a system bus.
  • the memory includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium of the computer device stores an operating system and a computer program.
  • The computer program in the non-volatile storage medium, when executed by the processor, can enable the processor to implement the steps of the face key point detection method.
  • A computer program may also be stored in the internal memory.
  • This computer program, when executed by the processor, may likewise cause the processor to execute the steps of the facial key point detection method.
  • FIG. 2 is only a block diagram of part of the structure related to the solution of the present application, and does not constitute a limitation on the computer equipment to which the solution of the present application is applied.
  • The specific computer equipment may include more or fewer components than shown in the figure, or combine some components, or have a different component arrangement.
  • a face key point detection method is provided. This method can be run on the computer device in FIG. 2.
  • the face key point detection method includes the following steps:
  • S302 Acquire a face image to be detected, and the face image to be detected is a face image of a frame to be detected.
  • the face image to be detected may be an independent face image, or may be a frame image in a continuous multiple frame face image in a dynamic scene.
  • the face image to be detected may be an image including face information.
  • the face may refer to a human face.
  • the face can also refer to the face of an animal, such as the face of animals such as cats, dogs, lions, tigers, and polar bears.
  • This method can be applied to dynamic scenes, which are scenes that include no less than two frames of images.
  • To detect the key points of a face in such a scene, it is necessary to detect no less than two frames of face images.
  • The key points of the face may be detected for each frame of the face image.
  • S304 Determine, according to the facial image to be detected, a partial image that includes each key point in the facial image to be detected.
  • the key point may refer to a point on a facial organ in a face image, for example, a point on the corner of the eye, a midpoint of the eyelid, a tip of the nose, a corner of the mouth, or an outline.
  • each key point of the face can be defined in advance.
  • the face type can be used to indicate whether the face is a human face or an animal face.
  • the face type may include at least one of a human face type, a cat face type, and a dog face type.
  • each key point of the face is defined in advance, and a partial image can be obtained according to each defined key point, that is, one partial image corresponds to one key point.
  • Defining each key point of the face in advance may include: marking the key points according to their positions in the face image; for example, the key points can be marked as the 1st key point, the 2nd key point, ..., the Nth key point, where N is the total number of face key points. The 1st key point can be the key point of the corner of the eye, the 2nd key point can be the key point of the middle of the eyelid, ...
  • and the 52nd key point can be the key point of the corner of the mouth, and so on.
  • N may be 86, where the eyes may include 22 key points, the eyebrows may include 16 key points, the nose may include 11 key points, the mouth may include 18 key points, and the face contour may include 19 key points.
  • each key point can also be classified according to organs.
  • the key points can be divided into eye key points, eyebrow key points, nose key points, mouth key points, contour key points, and other types of key points.
  • the first key point to the 22nd key point can be the key points of the eyes; the 23rd key point to the 38th key point can be the key points of the eyebrows.
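  • As an illustration, the 86-point layout and organ grouping described above can be encoded as index ranges. The eye (1st–22nd) and eyebrow (23rd–38th) ranges follow the text; the exact boundaries of the nose, mouth, and contour ranges below are assumptions chosen only to match the stated counts.

```python
# Hypothetical index layout for the 86 predefined face key points,
# matching the stated counts: eyes 22, eyebrows 16, nose 11,
# mouth 18, contour 19. Labels are 1-based, as in the text.
KEYPOINT_GROUPS = {
    "eyes":     range(1, 23),   # 1st-22nd key points (per the text)
    "eyebrows": range(23, 39),  # 23rd-38th key points (per the text)
    "nose":     range(39, 50),  # 11 points (illustrative boundary)
    "mouth":    range(50, 68),  # 18 points (illustrative boundary)
    "contour":  range(68, 87),  # 19 points (illustrative boundary)
}

def group_of(index):
    """Return the organ group a 1-based key-point index belongs to."""
    for name, rng in KEYPOINT_GROUPS.items():
        if index in rng:
            return name
    raise ValueError(f"unknown key point index: {index}")

total = sum(len(r) for r in KEYPOINT_GROUPS.values())
```

With this layout the 52nd key point falls in the mouth group, consistent with the corner-of-mouth example above.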
  • the size of the partial image may be smaller than one-tenth, one-twentieth, etc. of the face image to be detected. In short, the partial image is much smaller than the face image to be detected.
  • Multiple key points of the face may be defined in advance according to preset rules, and for these multiple key points, a partial image corresponding to them may be obtained; that is, one partial image corresponds to multiple key points.
  • Pre-defining multiple key points of the face may include: classifying and defining face key points according to organs, where the types of key points may include at least one of eye key points, eyebrow key points, nose key points, mouth key points, and contour key points.
  • multiple key points of the same type may be the first key point to the 22nd key point of the eye, and may also be the 23rd key point to the 38th key point of the eyebrow.
  • the size of the partial image may be smaller than one-half, one-fifth, etc. of the face image to be detected. In short, the partial image is smaller than the face image to be detected.
  • Partial images can be extracted based on the definition of the key points. For example, for each key point, extraction is performed on the facial image to be detected to obtain the partial image corresponding to that key point.
  • The partial image corresponding to a key point is a partial image including that key point.
  • For example, for the first key point, extraction is performed on the facial image to be detected to obtain the partial image including the first key point.
  • For each type of key point, extraction is performed on the facial image to be detected to obtain a partial image corresponding to the key points of that type.
  • The partial image corresponding to a type of key point is a partial image including all key points of that type. For example, for the eye key points, extraction is performed on the face image to be detected to obtain the partial image including the 1st through 22nd key points.
  • The implementation in which one partial image corresponds to one key point can extract smaller partial images than the implementation in which one partial image corresponds to multiple key points, and can therefore achieve higher detection efficiency.
  • In the implementation in which one partial image corresponds to multiple key points, the number of extracted partial images is smaller, which can reduce the amount of calculation of the computer device and thus identify the face key points faster.
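  • The per-key-point extraction step can be sketched as a simple crop around each roughly predicted key-point position. The patch size and the border-clamping policy below are illustrative assumptions; the patent does not fix either.

```python
import numpy as np

def extract_patches(face_image, predicted_points, patch_size=16):
    """Crop one small partial image around each predicted key-point
    position. Crops near the border are clamped so every patch has
    the full patch_size x patch_size shape."""
    h, w = face_image.shape[:2]
    half = patch_size // 2
    patches = []
    for (x, y) in predicted_points:
        x0 = int(min(max(x - half, 0), w - patch_size))
        y0 = int(min(max(y - half, 0), h - patch_size))
        patches.append(face_image[y0:y0 + patch_size, x0:x0 + patch_size])
    return patches

# A 16x16 patch from a 128x128 face image covers only 1/64 of its
# area, consistent with "much smaller than the face image".
face = np.zeros((128, 128), dtype=np.uint8)
patches = extract_patches(face, [(10, 10), (120, 64)], patch_size=16)
```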
  • S306: Determine, based on each partial image, candidate points of the key points corresponding to each partial image.
  • The candidate points of the key points corresponding to a partial image can be determined according to the texture characteristics of the partial image. That is, key point detection may be performed independently on each partial image to determine the candidate points of the key points corresponding to that partial image.
  • the candidate point refers to a point that may be a key point corresponding to the partial image.
  • a trained neural network model may be used to determine candidate points for key points corresponding to the local image.
  • For each key point, the corresponding neural network model can be used to map its partial image and obtain the candidate points of that key point.
  • The neural network corresponding to each key point may be a neural network model pre-trained on partial images. Taking one partial image per key point as an example, the number of neural network models is equal to the number of predefined face key points, and each key point can correspond to one partial-image-based neural network model, so that multiple partial images can be input into multiple neural network models simultaneously for parallel processing to speed up detection.
  • the input of the neural network model based on the local image is a local image, and the output may be a heat map of the local image.
  • Heat Map refers to an image characterized by the probability of point distribution, indicating the level of energy.
  • The magnitude of a pixel value in the image indicates the probability that the pixel is a key point.
  • the heat map can express the probability of each pixel in the local image as a key point.
  • the pixel point whose probability value meets the preset condition is the candidate point.
  • the preset condition may be that the probability value is greater than the preset probability value, for example, the preset probability value may be any value in the range of 0 to 1 such as 0, 0.1, 0.5, and so on. In this way, the method of determining the candidate points of the key points corresponding to the partial image through the neural network can further improve the accuracy of face key point detection.
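  • The heat-map-to-candidate step above amounts to keeping every pixel whose probability exceeds the preset value. A minimal sketch, using 0.1 (one of the example thresholds in the text):

```python
import numpy as np

def candidate_points(heatmap, threshold=0.1):
    """Return (x, y, probability) for every pixel whose probability
    of being the key point exceeds the preset threshold."""
    ys, xs = np.nonzero(heatmap > threshold)
    return [(int(x), int(y), float(heatmap[y, x])) for x, y in zip(xs, ys)]

hm = np.zeros((8, 8))
hm[3, 4] = 0.9   # likely key-point location, kept as a candidate
hm[5, 5] = 0.05  # below the threshold, discarded
cands = candidate_points(hm, threshold=0.1)
```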
  • S308 Jointly constrain candidate points of each key point to determine key points of each face.
  • the candidate points of each key point correspond to each partial image. Therefore, when determining the face key points of the entire face image to be detected, it is necessary to jointly constrain the candidate points of the key points corresponding to the partial images to determine the key points corresponding to the partial images.
  • the set of key points corresponding to each partial image of the image to be detected is the face key points of the entire face image to be detected.
  • Performing joint constraints on the candidate points of each key point to determine each face key point may include: jointly constraining the candidate points in each partial image according to the joint constraint conditions to determine the key point in each partial image, which together are obtained as the face key points of the entire face image to be detected.
  • the conditions of the joint constraint indicate the conditions that the key points corresponding to the partial images should meet when they are combined.
  • the joint constraint condition may be a condition based on facial features that should be met when the key points corresponding to the partial images are combined.
  • satisfying the condition based on facial features may include that the key point of the eye is above the key point of the mouth, the key point of the nose is between the key point of the eye and the key point of the mouth, and so on.
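  • A joint-constraint condition of the kind just described (eyes above the mouth, nose between eyes and mouth) can be expressed as a simple predicate over a combination of candidate points. This is only a toy check in image coordinates (y increasing downward), not the constraint model the patent actually solves:

```python
def satisfies_facial_constraints(eye, nose, mouth):
    """Check a simple facial-feature constraint on a candidate
    combination: the eye lies above the mouth, and the nose lies
    vertically between the eye and the mouth (y grows downward)."""
    eye_above_mouth = eye[1] < mouth[1]
    nose_between = eye[1] < nose[1] < mouth[1]
    return eye_above_mouth and nose_between

# A plausible combination passes; an upside-down one is rejected.
ok = satisfies_facial_constraints(eye=(50, 40), nose=(52, 60), mouth=(51, 80))
bad = satisfies_facial_constraints(eye=(50, 90), nose=(52, 60), mouth=(51, 40))
```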
  • Performing joint constraints on the candidate points of each key point to determine each face key point may include: using a trained neural network model to jointly constrain the candidate points of each key point to determine each face key point.
  • the input of the neural network model may be a heat map of a partial image, and the output may be key points of each face of the face image to be detected. In this way, determining the key points of the face through the neural network model can further improve the detection efficiency of the key points of the face.
  • In the implementation in which each partial image corresponds to multiple key points, a rotation operation is required, so a linear model cannot be used for the joint constraints.
  • When the candidate points of each key point are jointly constrained based on a linear model, only a single solve is required, whereas a nonlinear model requires multiple iterative solves; the linear model therefore requires less calculation and achieves a faster detection speed.
  • The pixels with the highest probability of being key points in each partial image are not simply taken as the face key points; joint constraints are required. For example, when a key point is affected by interference factors such as occlusion or dark light, the probability value of the corresponding pixel being a key point is low, but under joint constraints the influence of such interference factors can be corrected, so that accurate face key points are still output.
  • In this embodiment, the facial image to be detected is acquired; based on the facial image to be detected, the partial images that include each key point are determined; based on each partial image, the candidate points of the key points corresponding to that partial image are determined; and the candidate points of each key point are jointly constrained to determine each face key point. Since partial images containing each key point are taken from the overall face image to be detected, and the candidate points of the corresponding key points are determined separately within each partial image, the amount of calculation can be reduced and the efficiency of determining key point candidate points can be improved. Thus, the detection efficiency of each face key point can be improved.
  • the face key point detection method based on this embodiment can also improve the accuracy of key point detection.
  • the face key point detection method is applied to makeup applications, due to the improved detection efficiency, the time consumption of key point detection can be reduced, the stuck phenomenon at runtime can be reduced, and a smoother makeup effect can be provided.
  • the dynamic scene may be a dynamic scene in applications such as sharing video applications, shooting video applications, and beautifying video applications.
  • Makeup may include, for example, drawing eye shadow, applying lipstick, face slimming, and so on.
  • CG refers to Computer Graphics.
  • Since the face key point detection method of this embodiment has the beneficial effect of high detection accuracy, it can avoid inaccurate key point detection, and the resulting messy makeup, caused by special circumstances such as side faces, occlusion, and dark light during makeup. Thus, the stability of makeup is improved.
  • Dynamic scenes have high real-time requirements; that is, the key points of the face in each video frame need to be detected in real time, which demands higher detection efficiency. The face key point detection method of this embodiment can therefore be better applied to dynamic scenes: while ensuring the fluency of dynamic scenes, it is suitable for application environments where the execution terminal is a smart terminal.
  • the candidate points of each key point are jointly constrained to determine the key points of each face, including: joint constraints of the candidate points of each key point based on a linear model to determine the key points of each face.
  • each partial image corresponds to a key point.
  • The candidate points of each key point can be jointly constrained based on the linear model to determine each face key point in the face image to be detected.
  • A traditional nonlinear model often has solution deviation, while the joint constraint based on the linear model can obtain more accurate face key points. Since the linear model is used to jointly constrain the candidate points of each key point, only a single solve is required, whereas a nonlinear model requires multiple iterative solves; the linear model thus requires less calculation and can guarantee convergence to the global optimum.
  • the linear model has the effect of ensuring convergence to the global optimal and less calculation. Therefore, based on the face key point detection method of this embodiment, the efficiency of determining face key points can be further improved, and the accuracy of face key point detection can be improved.
  • the face key point detection method is applied to makeup applications, due to the improved detection efficiency and accuracy, the time consumption of key point detection can be further reduced, the stuck phenomenon at runtime can be reduced, and a smoother makeup effect can be provided .
  • Because the linear model can ensure that the solution reaches the global optimum, it can avoid solution deviation in side-face poses and improve the accuracy of makeup in side-face poses.
  • the linear model is a dynamic linear point distribution model.
  • the dynamic linear point distribution model is a dynamic point distribution model based on linear constraints.
  • joint constraints are performed on candidate points of each key point to determine key points of each face, including: obtaining the constraint parameters of each local image based on a dynamic linear point distribution model; according to each constraint parameter, Jointly constrain the candidate points of the key points of each partial image to obtain the face key points of each partial image.
  • Point distribution model refers to the statistical model of the distribution of key points in a certain category of objects, which can reflect the shape characteristics of the category of objects.
  • the point distribution model is a statistical model of key point distribution on the face.
  • the dynamic linear point distribution model can dynamically update the constraint parameters (PDM parameters) in real time, making the detection results more accurate. Therefore, the face key point detection method based on this embodiment can ensure convergence to the global optimal, less calculation, and constraint parameters can be dynamically updated in real time. Therefore, based on the face key point detection method of this embodiment, the efficiency of determining face key points can be further improved, and the accuracy of face key point detection can be further improved.
  • the input of the dynamic linear point distribution model is a heat map of each partial image, and the output is a key point of each face.
  • The optimization objective function of the joint constraint solution process in the dynamic linear point distribution model can be:

        B* = argmax_B  Σ_k H_k[x_k, y_k] − γ · Σ_i (B_i² / Λ_i),   s.t.  [x, y] = M_PDM · B

  • where H_k[x, y] is the probability that the k-th key point is located at coordinates [x, y];
  • γ is the regular constraint strength, and can be set to any value between 0.1 and 5.0 according to experience;
  • B is the PDM parameter vector;
  • M_PDM is a composite construction matrix composed of the point distribution model basis vectors, i.e. the PCA basis vectors (PCA, Principal Component Analysis, refers to removing redundant components from the key point data in the training data to obtain the principal component vectors);
  • Λ is a vector composed of the eigenvalues corresponding to the principal component vectors, and s.t. denotes the constraint condition.
  • PCA can convert multiple indicators of the original data into a smaller number of comprehensive indicators through dimensionality reduction.
  • This smaller number of comprehensive indicators can reflect most of the information in the original data, and can therefore be regarded as the principal components of the original data.
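  • The dimensionality-reduction idea can be illustrated with a toy PCA: 2-D points scattered near a line are summarized by one principal component that captures nearly all of the variance. This is a generic numpy sketch, not the patent's training procedure.

```python
import numpy as np

# Toy shape data: 50 samples lying near the direction (1, 2),
# perturbed by small noise.
rng = np.random.default_rng(0)
t = rng.uniform(0, 1, size=(50, 1))
shapes = np.hstack([t, 2.0 * t]) + rng.normal(0, 0.01, size=(50, 2))

# Eigen-decomposition of the covariance matrix gives the principal
# component vectors (basis) and their eigenvalues (variances).
mean = shapes.mean(axis=0)
centered = shapes - mean
cov = centered.T @ centered / (len(shapes) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Fraction of the total variance explained by the first component.
explained = eigvals[0] / eigvals.sum()
```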
  • By contrast, the traditional constraint is: [x, y] = s · R · (Φ · B) + T, where [x, y] represents the three-dimensional space coordinates of each face key point; s represents a scaling factor, which can be a floating-point number; Φ represents a matrix composed of PCA basis vectors; B represents the PDM parameters; and T represents a translation factor.
  • R represents the rotation factor, which can be expressed as the product of the rotation matrices about the three coordinate axes: R = R_X(θ) · R_Y(φ) · R_Z(ψ),
  • where θ, φ, ψ respectively represent the rotation angles around the X, Y, and Z axes in three-dimensional space. Due to the existence of the nonlinear factor R in the traditional constraint, solving for the parameter B in this formula requires a complex algorithm, such as gradient descent, which is time-consuming and cannot be guaranteed to reach the global optimal solution.
  • the linear model may be a linear point distribution model.
  • the efficiency of determining the key points of each face can be further improved, and the accuracy of detecting face key points can be improved.
  • This has the beneficial effect of improving the efficiency of determining the key points of each face.
  • Determining, based on each partial image, the candidate points of the key points corresponding to that partial image includes: when the frame to be detected is a non-preset frame, acquiring, from the preceding frames of the frame to be detected, the partial images that include each key point; and determining, based on the corresponding partial images in the preceding frames and in the frame to be detected, the candidate points of the key points corresponding to each partial image in the facial image to be detected.
  • The preset frame may be a predetermined frame; for example, it may be a non-key frame or a first frame.
  • The key frame may refer to a video frame including key information, or may be a video frame obtained every predetermined number of frames or every predetermined time interval.
  • Correspondingly, non-preset frames are key frames and non-first frames.
  • the first frame may refer to the first frame used to detect key points of the face in a dynamic scene.
  • the non-first frame is the frame after the first frame used to detect key points of the face in a dynamic scene.
  • The preceding frame is any frame before the frame to be detected; it may include one frame before the frame to be detected, or multiple frames before the frame to be detected.
  • the preceding frame may be at least one frame continuous with the frame to be detected. For example, the preceding frame may be the previous frame of the frame to be detected.
  • When the preset frame is the first frame, determining the candidate points of the key points corresponding to each partial image includes: when the frame to be detected is a non-preset frame, that is, a non-first frame, obtaining the partial images of each key point in the preceding frame of the frame to be detected; and determining, based on the corresponding partial images in the preceding frame and in the frame to be detected, the candidate points of the key points corresponding to each partial image in the facial image to be detected.
  • the corresponding partial images in the preceding frame and the frame to be detected are the partial images in the preceding frame and the partial image in the frame to be detected corresponding to the same key point.
  • the candidate points of the key points corresponding to each partial image are determined separately, including: when the preset frame is a non-key frame and the frame to be detected is a non-preset frame and a non-first frame, acquiring the partial images of each key point in the preceding frame of the frame to be detected; and determining, based on the corresponding partial images in the preceding frame and the frame to be detected, the candidate points of the key points corresponding to the partial images of the facial image to be detected.
  • when the frame to be detected is a non-preset frame and is not the first frame, this means that the frame to be detected is a key frame, and that this key frame is not the first frame in the dynamic scene.
  • the corresponding partial images in the preceding frame and the frame to be detected are the partial images in the preceding frame and the partial image in the frame to be detected corresponding to the same key point.
  • the candidate points of the key points corresponding to the facial image to be detected are determined by combining the partial images corresponding to the frame to be detected and its preceding frame; that is, the candidate points of the key points corresponding to the partial images of the frame to be detected are determined jointly from both frames.
  • the corresponding partial images in the preceding frame and the frame to be detected are the partial images in the preceding frame and the partial image in the frame to be detected corresponding to the same key point. Understandably, the number of partial images of the frame to be detected is greater than one. The number of partial images of the frame to be detected may be the same as the number of face key points in the frame to be detected.
  • the partial image of the preceding frame can be determined by predicting the key point positions in a pre-detection manner and extracting partial images according to the predicted positions; alternatively, the partial images can be extracted based on the face key point detection results of the preceding frame.
  • the detection result of the face key points of the previous frame is the detection result obtained by detecting the previous frame using the face key point detection method provided in the embodiment of the present application.
  • the candidate points of the key point corresponding to the partial image of the frame to be detected are determined in combination with the frame to be detected and its predecessor frame. It can ensure that the key points in the previous frame and the frame to be detected have consistency. Thus, the stability of key points in the dynamic scene can be improved.
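The text requires the corresponding partial images of the preceding frame and the frame to be detected to be used together, but it does not fix the fusion rule. As a hedged illustration only, one simple possibility is a weighted average of the two per-patch heat maps; the weight used here is an assumed value, not something specified in the document:

```python
import numpy as np

def combined_heatmap(hm_prev, hm_cur, prev_weight=0.5):
    """Fuse the heat maps predicted from the corresponding partial images
    of the preceding frame and the frame to be detected. A weighted
    average is only one possible fusion rule -- the text requires the two
    patches to be used jointly but does not specify how."""
    return prev_weight * hm_prev + (1.0 - prev_weight) * hm_cur

# Illustrative 8x8 heat maps (the size used elsewhere in this document)
hm_prev = np.full((8, 8), 0.2)
hm_cur = np.full((8, 8), 0.6)
fused = combined_heatmap(hm_prev, hm_cur)
print(round(fused[0, 0], 6))  # 0.4
```

Fusing with the preceding frame in this way makes the candidate probabilities change smoothly across frames, which is one way to obtain the temporal stability described above.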
  • when the face key point detection method is applied to makeup applications, the stability of face key point detection on continuous video frames can be improved, mitigating the problem of makeup jitter.
  • the method further includes: performing partial image extraction based on each face key point, and determining the partial images, each including one of the key points, of the preceding frame of the next facial image frame.
  • through partial image extraction, the partial images that each include one of the key points can be determined for the preceding frame of the frame following the current frame to be detected. That is, after the face key points of the current frame to be detected are obtained using the face key point detection method provided in the embodiments of this application, partial image extraction is performed according to the obtained face key points; once the partial image of each key point is obtained, the next frame to be detected can be processed, and the obtained partial images including each key point can be used as the preceding-frame partial images of the next frame to be detected.
  • when the frame to be detected is the N-th frame (N being a natural number), the partial images including the key points in the previous m frames can be obtained, where m is a natural number less than N.
  • the candidate points of the key points corresponding to the respective partial images are determined based on the respective partial images, including: when the frame to be detected is a preset frame, determining, based on the partial images of the frame to be detected, the candidate points of the key points corresponding to each partial image in the facial image to be detected.
  • the candidate points of the key points corresponding to the partial images in the face image to be detected can be determined respectively according to the partial images of the frame to be detected.
  • the preset frame may be a non-key frame or the first frame. In this way, when the frame to be detected is a preset frame, for example when it is the first frame and thus has no preceding frame, the candidate points of the key points corresponding to the partial images in the facial image to be detected can be determined based on the partial images of the frame to be detected alone.
  • the candidate points of the key points corresponding to each partial image are determined respectively, including: when the preset frame is the first frame and the frame to be detected is the preset frame, determining, based on each partial image of the frame to be detected, the candidate points of the key points corresponding to each partial image in the facial image to be detected. Since the frame to be detected is the first frame, it has no preceding frame, and the computer device can determine the candidate points of the key points corresponding to the partial images in the facial image to be detected according to the partial images of the frame to be detected alone.
  • the candidate points of the key points corresponding to the partial images are determined respectively, including: when the preset frame is a non-key frame and the frame to be detected is the preset frame, determining, based on each partial image of the frame to be detected, candidate points of the key points corresponding to each partial image in the facial image to be detected.
  • the non-key frame may be a video frame that does not include key information; therefore, the processing of non-key frames can be simplified, and the candidate points of the key points corresponding to each partial image can be obtained based only on the partial images of the non-key frame itself, reducing the computation load of the computer device.
  • the trained neural network model is used to determine the candidate points of the key points corresponding to each partial image; each pixel in the partial image is assigned a probability of being a key point by means of a heat map, and the candidate points are determined from these probabilities.
  • the input of each neural network model is a 16x16 partial image.
  • the output is an 8x8 heat map; the model includes two convolutional layers and a fully connected layer.
  • the convolution kernel size of the convolutional layers is 5x5; there is no padding or pooling in the convolutional layers, and the convolutional layers use ReLU (Rectified Linear Unit) as the activation function.
  • the magnitude of a pixel value in the heat map indicates the probability that the pixel is a key point; a larger pixel value indicates a higher probability that the point is a key point.
  • the parameters of the neural network are initialized with a Gaussian distribution with a variance of 0.01 and a mean of 0.
  • the training uses the SGD (Stochastic Gradient Descent) algorithm to solve for the parameters of the neural network; at each iteration, the back-propagated error is the Euclidean distance between the predicted heat map and the labeled heat map.
  • the candidate points of each key point are determined using the same neural network structure, but the networks are trained independently of each other; therefore, the parameters of the neural network corresponding to each face key point are different.
  • the training data can be the public 300W data set and a data set annotated by Youtu Lab.
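The architecture described above (16x16 patch in, two unpadded 5x5 convolutions with ReLU, one fully connected layer, 8x8 heat map out, Gaussian initialization with mean 0 and variance 0.01) can be sketched as a single-channel NumPy forward pass. The single channel and the untrained random weights are simplifying assumptions for illustration; the document does not specify channel counts:

```python
import numpy as np

def conv2d_valid(x, k):
    """'Valid' 2-D convolution (no padding), as in the described conv layers."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def relu(v):
    return np.maximum(v, 0.0)

rng = np.random.default_rng(0)
# Gaussian initialization: mean 0, variance 0.01 (i.e. standard deviation 0.1)
k1 = rng.normal(0.0, 0.1, (5, 5))
k2 = rng.normal(0.0, 0.1, (5, 5))
W_fc = rng.normal(0.0, 0.1, (64, 64))   # fully connected layer: 8*8 -> 8*8

patch = rng.random((16, 16))            # a 16x16 partial image
h = relu(conv2d_valid(patch, k1))       # 16x16 -> 12x12
h = relu(conv2d_valid(h, k2))           # 12x12 -> 8x8
heatmap = (W_fc @ h.flatten()).reshape(8, 8)  # 8x8 heat map
print(heatmap.shape)  # (8, 8)
```

Note that the stated sizes are self-consistent: two 5x5 valid convolutions shrink a 16x16 patch to exactly 8x8, matching the output heat map size. Training (SGD with a Euclidean loss between predicted and labeled heat maps, one network per key point) is not shown here.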
  • when the frame to be detected is a preset frame, based on the partial images of the frame to be detected, the candidate points of the key points corresponding to the partial images in the facial image to be detected are determined respectively.
  • when the frame to be detected is a non-preset frame, the partial images including the key points in the preceding frame of the frame to be detected are obtained; based on the corresponding partial images in the preceding frame and the frame to be detected, the candidate points of the key points corresponding to each partial image in the facial image to be detected are determined.
  • the upper face image in FIG. 6 is a face image of a preset frame
  • the lower face image is a face image of a non-preset frame.
  • for the preset frame, the candidate points of each key point in the facial image to be detected are determined based on a partial image of the frame to be detected alone.
  • for the non-preset frame, the candidate points of each key point in the facial image to be detected are determined from a partial image of the frame to be detected combined with the corresponding partial image of the preceding frame, that is, from two partial images in total.
  • the facial image to be detected may be a human face image.
  • the face key points may be human face key points.
  • Facial landmarks refer to a set of points with facial features in the face image, such as the corners of the eyes, the midpoints of the eyelids, the tip of the nose, the corners of the mouth, and points on the facial outline.
  • a method of separately determining candidate points of key points corresponding to each partial image includes: for each partial image, determining a heat map of the partial image; the heat map includes the corresponding partial image The probability of each pixel being a key point; according to the heat map, the candidate points of the key point corresponding to the partial image are determined.
  • the probability of each pixel point as a key point in a partial image can be expressed in the form of a heat map.
  • the candidate point may be a pixel point whose probability as a key point is greater than a preset probability value.
  • the preset probability value can be any value within the range of 0 to 1, such as 0 or 0.1. That is, the candidate points of the key points corresponding to the partial image can be determined according to the heat map. If there are interference factors such as occlusion or dim light at a key point, the heat value in the corresponding heat map is low, but these interference factors are then filtered out when the joint constraint is performed.
  • since the heat map contains the probability information of the key point distribution, if there are interference factors such as occlusion or dim light at a certain key point, the heat value in the corresponding heat map is low. Therefore, the detection accuracy of key points can be improved in special situations such as side faces, occlusion, and dim light.
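A minimal sketch of turning a heat map into candidate points by the "preset probability value" thresholding described above; the threshold 0.1 is just one of the example values mentioned in the text:

```python
import numpy as np

def candidate_points(heatmap, threshold=0.1):
    """Return pixels whose key-point probability exceeds the preset
    probability value, as (row, col, probability) tuples."""
    rows, cols = np.where(heatmap > threshold)
    return [(int(r), int(c), float(heatmap[r, c])) for r, c in zip(rows, cols)]

hm = np.zeros((8, 8))
hm[3, 4] = 0.9   # strong response: likely key point location
hm[2, 2] = 0.05  # weak response, e.g. an occluded or dimly lit region
print(candidate_points(hm, threshold=0.1))  # [(3, 4, 0.9)]
```

Weak responses survive only as low heat values, so they contribute little when the candidate points are later jointly constrained, which matches the robustness argument above.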
  • when the face key point detection method is applied to makeup applications, since the method can improve the detection accuracy of key points in special situations such as side faces, occlusion, and dim light, it can improve the accuracy of applying makeup under those circumstances and mitigate the problem of makeup misalignment.
  • acquiring the facial image to be detected includes: acquiring an initial facial image; preprocessing the initial facial image to obtain the facial image to be detected.
  • the initial face image may be a face image collected by a camera and not processed.
  • the preprocessing may include operations such as rotation or/and scaling, so that the key points of the pupils of both eyes are on the same horizontal line, or/and the horizontal distance between the key points of the pupils of both eyes is a preset value.
  • Rotation refers to rotating the face image until the key points of the pupils of both eyes lie on the same horizontal line.
  • Scaling refers to the process of enlarging or reducing the face image.
  • the pre-processing may also include acquiring a texture feature image of the initial face image, or/and performing face area positioning on the initial face image. In this way, the efficiency of face key point detection can be further improved.
  • the preprocessing includes: rotating the initial facial image so that the key points of the pupils of both eyes are on the same horizontal line; or/and scaling the initial facial image so that the horizontal distance between the key points of the pupils of both eyes is a preset value.
  • the preset value is a predetermined horizontal distance between the key points of the pupils of both eyes; for example, it can be 160 pixels. In this way, the efficiency of face key point detection can be further improved.
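The rotation angle and scale factor implied by the pupil constraints can be computed directly from the two pupil positions. The 160-pixel target distance is the example value given above; the pupil coordinates below are hypothetical:

```python
import math

def alignment_transform(left_pupil, right_pupil, target_dist=160.0):
    """Compute the rotation angle (degrees) and scale factor that put the
    pupils on one horizontal line at a preset horizontal distance.
    Coordinates are (x, y) pixel positions."""
    dx = right_pupil[0] - left_pupil[0]
    dy = right_pupil[1] - left_pupil[1]
    angle = math.degrees(math.atan2(dy, dx))  # rotate by -angle to level the eyes
    dist = math.hypot(dx, dy)
    scale = target_dist / dist
    return angle, scale

# Hypothetical pupils already level, 80 px apart -> no rotation, scale x2
angle, scale = alignment_transform((100.0, 120.0), (180.0, 120.0))
print(angle, scale)  # 0.0 2.0
```

The resulting angle and scale would typically be fed into an affine warp of the initial facial image (for instance with an image library's rotation/scaling routine) to produce the facial image to be detected.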
  • determining the partial images of the facial image to be detected that each include one of the key points includes: initializing the face key points of the facial image to be detected; and determining, based on the initialization result, the partial images in the facial image to be detected that each include one of the key points.
  • the method of initializing the face key points of the facial image to be detected may include: when the facial image to be detected has no preceding-frame facial image, for example when it is the first frame in a dynamic scene or an independent facial image, the initialization result of the key points can be determined according to the key point coordinates of an average face model.
  • the initialization result can be the key point coordinates of the average face model.
  • the key point coordinates of the average face model may be the average position coordinates of key points of each face obtained after analyzing a large number of face models.
  • when the facial image to be detected has a preceding-frame facial image, the initialization result of the key points can be determined according to the key point coordinates of the facial image of the preceding frame.
  • the initialization result may be the key point coordinates of the face image of the previous frame.
  • the face key points to be detected may be initialized in other ways, for example, the face key points to be detected may be initialized in a pre-detection manner.
  • the partial images in the facial image to be detected that each include one of the key points may be obtained by taking the initialization result as the center position of the partial image of the corresponding key point, and performing image extraction according to a preset size to obtain the partial image containing that key point.
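A sketch of the preset-size crop centered on an initialized key point. The 16x16 size matches the network-input size mentioned elsewhere in this document; clamping the crop to the image border is an assumption, since the text does not say how edge cases are handled:

```python
import numpy as np

def extract_patch(image, center, size=16):
    """Crop a size x size partial image centered on an (initialized)
    key point, clamping the crop window to the image border.
    center is an (x, y) position; image is indexed [y, x]."""
    h, w = image.shape[:2]
    cx, cy = int(round(center[0])), int(round(center[1]))
    x0 = min(max(cx - size // 2, 0), w - size)
    y0 = min(max(cy - size // 2, 0), h - size)
    return image[y0:y0 + size, x0:x0 + size]

img = np.arange(64 * 64).reshape(64, 64)   # stand-in facial image
patch = extract_patch(img, (10.0, 5.0))    # key point near the border
print(patch.shape)  # (16, 16)
```

One such patch would be extracted per initialized key point, giving the per-keypoint partial images that the candidate-point networks consume.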
  • a face key point detection method, which is applied to a dynamic scene, includes:
  • acquiring an initial facial image; rotating and scaling the initial facial image so that the key points of the pupils of both eyes are on the same horizontal line and the horizontal distance between them is the preset value, to obtain the facial image to be detected, the facial image to be detected being the facial image of the frame to be detected;
  • the facial image to be detected may be a human face image;
  • initializing the face key points of the facial image to be detected, and determining, based on the initialization result, the partial images in the facial image to be detected that each include one of the key points;
  • when the frame to be detected is a non-preset frame, acquiring the partial images of each key point in the preceding frame of the frame to be detected, and determining, based on the corresponding partial images of the preceding frame and the frame to be detected, the heat map corresponding to each partial image in the facial image to be detected;
  • when the frame to be detected is a preset frame, determining, based on each partial image of the frame to be detected, the heat map corresponding to each partial image in the facial image to be detected;
  • jointly constraining the heat maps to determine each face key point, where the face key points may be human face key points;
  • performing partial image extraction based on each face key point, to determine the partial images, each including one of the key points, of the preceding frame for the next frame.
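The dynamic-scene flow above can be outlined in code. All helper functions here are placeholder stubs (assumptions) standing in for the steps detailed elsewhere in this document; only the control flow is meant to be illustrative:

```python
import numpy as np

def preprocess(frame):
    """Stub: rotate/scale so the pupils are level and a preset distance apart."""
    return frame

def extract_patches(frame, points, size=16):
    """Stub: one preset-size partial image per key point."""
    return [np.zeros((size, size)) for _ in points]

def heatmaps(patches, prev_patches=None):
    """Stub: per-patch heat maps, optionally fused with preceding-frame patches."""
    return [np.zeros((8, 8)) for _ in patches]

def jointly_constrain(hms, points):
    """Stub: joint constraint over all candidate points; keeps the inputs here."""
    return points

def detect_keypoints(frame, init_points, prev_patches=None):
    face = preprocess(frame)
    patches = extract_patches(face, init_points)
    hms = heatmaps(patches, prev_patches)
    keypoints = jointly_constrain(hms, init_points)
    # Patches around the final key points become the "preceding frame"
    # partial images for the next frame to be detected.
    next_prev_patches = extract_patches(face, keypoints)
    return keypoints, next_prev_patches

frame = np.zeros((128, 128))
init = [(40.0, 50.0), (88.0, 50.0)]   # hypothetical initialized key points
kps, prev = detect_keypoints(frame, init)
print(len(kps), len(prev))  # 2 2
```

The returned `prev` list is what the non-preset-frame branch would consume on the next call, closing the per-frame loop described in the steps above.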
  • the detection efficiency, accuracy, and stability of face key points can be improved.
  • FIG. 8 shows that when there is dim light on one side of the human face, the key points of the eyebrow area and eye area on that side detected by the prior-art method are disordered; therefore, when the "make-up operation" of "wearing glasses" is performed for the subject, the position of the glasses is obviously misplaced.
  • FIG. 9 shows that the key points of the eyebrow area and eye area on the dimly lit side of the face detected by the embodiment of this application are accurate; therefore, when the "make-up operation" of "wearing glasses" is performed for the subject, the glasses are placed accurately.
  • although the steps in the flowcharts of FIGS. 3 and 7 are displayed in order according to the arrows, the steps are not necessarily executed in the order indicated by the arrows. Unless clearly stated herein, the execution of these steps is not strictly limited in order, and these steps can be executed in other orders. Moreover, at least some of the steps in FIGS. 3 and 7 may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily executed at the same time, but may be executed at different times; their execution order is not necessarily sequential, and they may be executed in turn or alternately with at least a part of other steps, or of the sub-steps or stages of other steps.
  • a face key point detection device including:
  • the overall image acquisition module 802 is used to acquire the facial image to be detected, and the facial image to be detected is the facial image of the frame to be detected;
  • the partial image determination module 804 is configured to determine, according to the face image to be detected, partial images including key points in the face image to be detected;
  • the local candidate point determination module 806 is used to determine candidate points of key points corresponding to each partial image based on each partial image;
  • the overall key point determination module 808 is used to jointly constrain the candidate points of each key point and determine the key points of each face.
  • the face key point detection device acquires a facial image to be detected; based on the facial image to be detected, it determines the partial images in the facial image to be detected that each include one of the key points; based on each partial image, it determines the candidate points of the key point corresponding to that partial image; and it jointly constrains the candidate points of the key points to determine each face key point. Since the candidate points of the corresponding key points are determined separately in the partial images, each of which includes one key point of the overall facial image to be detected, the amount of computation can be reduced and the efficiency of determining key point candidate points can be improved. Thus, the detection efficiency of each face key point can be improved.
  • the overall key point determination module is used to jointly constrain the candidate points of each key point based on the linear model to determine the key points of each face.
  • the linear model is a dynamic linear point distribution model.
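A point distribution model constrains a shape to a linear subspace around a mean shape. The sketch below uses a random orthonormal basis purely for illustration (a real model would be learned from training shapes), and is only one way to realize the "dynamic linear point distribution model" named above; it is not the patent's exact formulation:

```python
import numpy as np

def joint_constrain(candidates, mean_shape, basis):
    """Project candidate key points onto a linear point-distribution model
    (mean shape + orthonormal linear basis), so the constrained key points
    jointly form a plausible face shape."""
    x = np.asarray(candidates).flatten() - mean_shape
    coeffs = basis.T @ x                     # least-squares shape coefficients
    constrained = mean_shape + basis @ coeffs
    return constrained.reshape(-1, 2)

n_points = 5                                 # illustrative key-point count
rng = np.random.default_rng(1)
mean_shape = rng.random(n_points * 2)
# Random orthonormal basis of 4 shape modes (illustrative data, not trained)
basis, _ = np.linalg.qr(rng.random((n_points * 2, 4)))
cands = rng.random((n_points, 2))            # best candidate per key point
shape = joint_constrain(cands, mean_shape, basis)
print(shape.shape)  # (5, 2)
```

Because outlier candidates (for example from an occluded patch) are projected back onto the learned shape subspace, the joint constraint pulls them toward positions consistent with the other key points.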
  • the local candidate point determination module is used to obtain partial images of each key point in the previous frame of the frame to be detected when the frame to be detected is a non-preset frame; based on the previous sequence The frame and the corresponding partial image in the frame to be detected respectively determine candidate points of key points corresponding to each partial image in the face image to be detected.
  • it also includes a preceding-frame partial image determination module, which is used, after the overall key point determination module jointly constrains the candidate points of each key point and determines each face key point, to perform partial image extraction based on each face key point and determine the partial images, each including one of the key points, of the preceding frame for the next frame.
  • the local candidate point determination module is used to determine, based on the partial images of the frame to be detected, corresponding to the partial images of the facial image to be detected when the frame to be detected is a preset frame Candidates for key points.
  • the facial image to be detected may be a human face image.
  • the face key points may be human face key points.
  • the local candidate point determination module is used to determine the heat map of the partial image for each partial image; the heat map includes the probability of each pixel in the corresponding partial image as a key point;
  • according to the heat map, candidate points of the key point corresponding to the partial image are determined.
  • the overall image acquisition module is used to acquire an initial facial image; and pre-process the initial facial image to obtain the facial image to be detected.
  • the overall image acquisition module is also used to straighten the initial facial image so that the key points of the pupils of the two eyes are on the same horizontal line; or/and, the initial facial image is scaled so that both eyes The horizontal distance of the key point of the pupil is a preset value.
  • a local candidate point determination module is used to initialize face key points of the face image to be detected; based on the initialization result, it is determined that the face image to be detected includes partial images of each key point.
  • a computer device is provided.
  • the computer device may be a server.
  • the computer device includes a processor, a memory, and a network interface connected through a system bus.
  • the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system and computer programs.
  • the internal memory provides an environment for the operating system and computer programs in the non-volatile storage medium.
  • the network interface of the computer device is used to communicate with external terminals through a network connection.
  • a computer device is provided, and the computer device may be a terminal.
  • the computer equipment includes a processor, a memory, a network interface, a display screen, and an input device connected through a system bus.
  • the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system and computer programs.
  • the internal memory provides an environment for the operating system and computer programs in the non-volatile storage medium.
  • the network interface of the computer device is used to communicate with external terminals through a network connection. When the computer program is executed by the processor, a method for detecting facial key points is realized.
  • the display screen of the computer device may be a liquid crystal display screen or an electronic ink display screen
  • the input device of the computer device may be a touch layer covering the display screen, may be a button, trackball, or touchpad provided on the computer device housing, or may be an external keyboard, touchpad, or mouse.
  • a computer device is provided, and the computer device may be a server or a terminal.
  • the computer device includes a memory and a processor.
  • the memory stores a computer program.
  • the processor executes the computer program, the steps of the face key point detection method described above are implemented.
  • the computer device includes a memory and a processor.
  • the memory stores a computer program.
  • the processor executes the computer program, the following steps are implemented:
  • the face image to be detected is a face image of a frame to be detected
  • determining, according to the facial image to be detected, the partial images in the facial image to be detected that each include one of the key points
  • the candidate points of each key point are jointly constrained to determine the key points of each face, including:
  • based on a linear model, the candidate points of each key point are jointly constrained to determine each face key point.
  • the linear model is a dynamic linear point distribution model.
  • the determining, based on each partial image, candidate points of the key points corresponding to each partial image respectively includes:
  • when the frame to be detected is a non-preset frame, obtaining the partial images that each include one of the key points in the preceding frame of the frame to be detected;
  • based on the corresponding partial images in the preceding frame and the frame to be detected, respectively determining candidate points of the key points corresponding to the respective partial images in the facial image to be detected.
  • the method further includes:
  • Partial image extraction is performed based on each key point of the face, and it is determined that the partial image of each key point is included in the previous frame of the face image of the next frame, respectively.
  • the determining, based on each partial image, candidate points of the key points corresponding to each partial image respectively includes:
  • when the frame to be detected is a preset frame, based on each partial image of the frame to be detected, respectively determining candidate points of the key points corresponding to each partial image in the facial image to be detected.
  • the facial image to be detected may be a human face image.
  • the face key points may be human face key points.
  • the method of determining, based on each partial image, candidate points of the key points corresponding to each partial image includes:
  • for each partial image, determining the heat map of the partial image; the heat map includes the probability of each pixel in the corresponding partial image being a key point;
  • according to the heat map, determining candidate points of the key point corresponding to the partial image.
  • the acquiring the facial image to be detected includes:
  • the preprocessing includes:
  • rotating the initial facial image so that the key points of the pupils of both eyes are on the same horizontal line; or/and scaling the initial facial image so that the horizontal distance between the key points of the pupils of both eyes is a preset value.
  • the determining, according to the facial image to be detected, the partial images in the facial image to be detected that each include one of the key points includes:
  • initializing the face key points of the facial image to be detected; and determining, based on the initialization result, the partial images in the facial image to be detected that each include one of the key points.
  • a computer-readable storage medium is provided on which a computer program is stored, and when the computer program is executed by a processor, the steps of the face key point detection method described above are implemented.
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the following steps are realized:
  • the face image to be detected is a face image of a frame to be detected
  • determining, according to the facial image to be detected, the partial images in the facial image to be detected that each include one of the key points
  • the candidate points of each key point are jointly constrained to determine the key points of each face, including:
  • based on a linear model, the candidate points of each key point are jointly constrained to determine each face key point.
  • the linear model is a dynamic linear point distribution model.
  • the determining, based on each partial image, candidate points of the key points corresponding to each partial image respectively includes:
  • when the frame to be detected is a non-preset frame, obtaining the partial images that each include one of the key points in the preceding frame of the frame to be detected;
  • based on the corresponding partial images in the preceding frame and the frame to be detected, respectively determining candidate points of the key points corresponding to the respective partial images in the facial image to be detected.
  • the method further includes:
  • Partial image extraction is performed based on each key point of the face, and it is determined that the partial image of each key point is included in the previous frame of the face image of the next frame, respectively.
  • the determining, based on each partial image, candidate points of the key points corresponding to each partial image respectively includes:
  • when the frame to be detected is a preset frame, based on each partial image of the frame to be detected, respectively determining candidate points of the key points corresponding to each partial image in the facial image to be detected.
  • the facial image to be detected may be a human face image.
  • the face key points may be human face key points.
  • the method of determining, based on each partial image, candidate points of the key points corresponding to each partial image includes:
  • for each partial image, determining the heat map of the partial image; the heat map includes the probability of each pixel in the corresponding partial image being a key point;
  • according to the heat map, determining candidate points of the key point corresponding to the partial image.
  • the acquiring the facial image to be detected includes:
  • the preprocessing includes:
  • rotating the initial facial image so that the key points of the pupils of both eyes are on the same horizontal line; or/and scaling the initial facial image so that the horizontal distance between the key points of the pupils of both eyes is a preset value.
  • the determining, according to the facial image to be detected, the partial images in the facial image to be detected that each include one of the key points includes:
  • initializing the face key points of the facial image to be detected; and determining, based on the initialization result, the partial images in the facial image to be detected that each include one of the key points.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous chain (Synchlink) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), etc.

Abstract

This application relates to a face key point detection method, apparatus, computer device, and storage medium. A facial image to be detected is acquired, the facial image to be detected being the facial image of a frame to be detected; according to the facial image to be detected, the partial images in the facial image to be detected that each include one of the key points are determined; based on each partial image, the candidate points of the key point corresponding to each partial image are determined; and the candidate points of the key points are jointly constrained to determine the face key points. Since the candidate points of the corresponding key points are determined separately in the partial images, each of which includes one key point of the overall facial image to be detected, the amount of computation can be reduced and the efficiency of determining key point candidate points can be improved. Thus, the detection efficiency of the face key points is improved.

Description

Face key point detection method, apparatus, computer device, and storage medium
This application claims priority to Chinese Patent Application No. 201811503905.5, entitled "Face key point detection method, apparatus, computer device, and storage medium" and filed on December 10, 2018, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence, and in particular to a face key point detection method, apparatus, computer device, and storage medium.
Background
Face key point detection technology plays a crucial role in applications such as face recognition, face registration, and facial makeup. Related face key point detection approaches may perform detection based on global facial features. For example, facial landmark detection refers to locating the position coordinates of key points in a face image. As shown in FIG. 1, such detection takes the entire face picture as input and simultaneously outputs the position coordinates of all face key points through a neural network or mathematical model.
Related key point detection methods take the entire facial image as the detection object when detecting key points, and therefore suffer from low detection efficiency.
Summary
Based on this, a face key point detection method, apparatus, computer device, and storage medium capable of improving detection efficiency are provided, which can solve the above problem of low efficiency in detecting face key points.
A face key point detection method, the method including:
acquiring a facial image to be detected, the facial image to be detected being a facial image of a frame to be detected;
determining, according to the facial image to be detected, partial images in the facial image to be detected that each include one of the key points;
determining, based on each partial image, candidate points of the key point corresponding to each partial image;
jointly constraining the candidate points of the key points to determine the face key points.
A face key point detection apparatus, the apparatus including:
an overall image acquisition module, configured to acquire a facial image to be detected, the facial image to be detected being a facial image of a frame to be detected;
a partial image determination module, configured to determine, according to the facial image to be detected, partial images in the facial image to be detected that each include one of the key points;
a local candidate point determination module, configured to determine, based on each partial image, candidate points of the key point corresponding to each partial image;
an overall key point determination module, configured to jointly constrain the candidate points of the key points to determine the face key points.
A computer device, including a memory and a processor, the memory storing a computer program, and the processor implementing the following steps when executing the computer program:
acquiring a facial image to be detected, the facial image to be detected being a facial image of a frame to be detected;
determining, according to the facial image to be detected, partial images in the facial image to be detected that each include one of the key points;
determining, based on each partial image, candidate points of the key point corresponding to each partial image;
jointly constraining the candidate points of the key points to determine the face key points.
A computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, implementing the following steps:
acquiring a facial image to be detected, the facial image to be detected being a facial image of a frame to be detected;
determining, according to the facial image to be detected, partial images in the facial image to be detected that each include one of the key points;
determining, based on each partial image, candidate points of the key point corresponding to each partial image;
jointly constraining the candidate points of the key points to determine the face key points.
In the above face key point detection method, apparatus, computer device, and storage medium, a facial image to be detected is acquired, the facial image to be detected being the facial image of a frame to be detected; according to the facial image to be detected, the partial images in the facial image to be detected that each include one of the key points are determined; based on each partial image, the candidate points of the key point corresponding to each partial image are determined; and the candidate points of the key points are jointly constrained to determine the face key points. Since the candidate points of the corresponding key points are determined separately in the partial images, each of which includes one key point of the overall facial image to be detected, the amount of computation can be reduced and the efficiency of determining key point candidate points can be improved, so that the detection efficiency of the face key points is improved. When the face key point detection method is applied to makeup applications, the improved detection efficiency reduces the time consumed by key point detection, reduces stuttering at runtime, and provides a smoother makeup effect.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of a facial keypoint detection method in the related art;
FIG. 2 is a diagram of an application environment of a facial keypoint detection method in an embodiment;
FIG. 3 is a schematic flowchart of a facial keypoint detection method in an embodiment;
FIG. 4 is a schematic diagram of a neural network structure of a facial keypoint detection method in a specific embodiment;
FIG. 5 is a schematic diagram of another neural network structure of a facial keypoint detection method in a specific embodiment;
FIG. 6 is a schematic diagram of a facial keypoint detection method in a specific embodiment;
FIG. 7 is a schematic flowchart of a facial keypoint detection method in a specific embodiment;
FIG. 8 is an example of misplaced makeup obtained by a related-art method;
FIG. 9 is an example of accurate makeup obtained by a facial keypoint detection method of an embodiment; and
FIG. 10 is a structural block diagram of a facial keypoint detection apparatus in an embodiment.
Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, this application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only intended to explain this application and are not intended to limit it.
FIG. 2 is a diagram of the application environment of a facial keypoint detection method in an embodiment. The facial keypoint detection method can be applied to a computer device, which may be a terminal or a server. The terminal may be a desktop device or a mobile terminal, for example, a mobile phone, a tablet computer, or a desktop computer. The server may be an independent physical server, a physical server cluster, or a virtual server. The computer device includes a processor, a memory, and a network interface connected through a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and a computer program that, when executed by the processor, causes the processor to implement the steps of the facial keypoint detection method. The internal memory may also store a computer program that, when executed by the processor, causes the processor to perform the steps of the facial keypoint detection method.
Those skilled in the art will understand that the structure shown in FIG. 2 is only a block diagram of part of the structure related to the solution of this application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown, combine certain components, or have a different component arrangement.
As shown in FIG. 3, in an embodiment, a facial keypoint detection method is provided. The method may run on the computer device in FIG. 2 and includes the following steps:
S302: Acquire a facial image to be detected, the facial image to be detected being the facial image of a frame to be detected.
The facial image to be detected may be a single independent facial image, or one frame of consecutive facial image frames in a dynamic scene. It may be any image that includes facial information. The face may be a human face, or the face of an animal such as a cat, dog, lion, tiger, or polar bear.
It can be understood that this method may be applied to a dynamic scene, i.e., a scene containing no fewer than two frames. When detecting facial keypoints, no fewer than two facial image frames in the scene need to be detected; for example, facial keypoints may be detected for the facial image of every frame.
S304: Determine, according to the facial image to be detected, partial images in the facial image to be detected that each include a keypoint.
A keypoint may be a point on a facial organ in the facial image, for example, an eye corner, the midpoint of an eyelid, the nose tip, a mouth corner, or a point on the facial contour. When the face type is determined, e.g., when the face is a human face, the keypoints of the face may be defined in advance. The face type indicates whether the face is a human face or an animal face; in one possible implementation, the face type may include at least one of a human face type, a cat face type, and a dog face type.
In one embodiment, every keypoint of the face is defined in advance, and one partial image is acquired for each defined keypoint, i.e., one partial image corresponds to one keypoint. Defining each keypoint of the face in advance may include: labeling the keypoints according to their positions in the facial image, e.g., as the 1st keypoint, the 2nd keypoint, ..., the Nth keypoint, where N is the total number of facial keypoints; the 1st keypoint may be an eye-corner keypoint, the 2nd keypoint may be an eyelid-midpoint keypoint, ..., and the 52nd keypoint may be a mouth-corner keypoint, and so on.
Optionally, N may be 86, where the eyes may include 22 keypoints, the eyebrows 16 keypoints, the nose 11 keypoints, the mouth 18 keypoints, and the facial contour 19 keypoints.
Optionally, the keypoints may also be classified by organ, e.g., into eye keypoints, eyebrow keypoints, nose keypoints, mouth keypoints, contour keypoints, and other types, where the 1st to 22nd keypoints may be the eye keypoints and the 23rd to 38th keypoints the eyebrow keypoints. The size of a partial image may be less than one tenth or one twentieth of the facial image to be detected; in short, the partial image is much smaller than the facial image to be detected.
It can be understood that, in other embodiments, multiple keypoints of the face may also be defined in advance according to a preset rule, and one partial image corresponding to those multiple keypoints is acquired, i.e., one partial image corresponds to multiple keypoints. In one possible implementation, defining multiple keypoints of the face in advance may include classifying and defining the facial keypoints by organ, the keypoint types including at least one of eye keypoints, eyebrow keypoints, nose keypoints, mouth keypoints, and contour keypoints. For example, multiple keypoints of one type may be the 1st to 22nd keypoints of the eyes, or the 23rd to 38th keypoints of the eyebrows. The size of such a partial image may be less than one half or one fifth of the facial image to be detected; in short, the partial image is smaller than the facial image to be detected.
Partial images may be extracted according to how the keypoints are defined. For example, for each keypoint, the partial image corresponding to that keypoint is extracted from the facial image to be detected; the partial image corresponding to a keypoint is a partial image that includes the keypoint, e.g., for the 1st keypoint, the partial image including the 1st keypoint is extracted. Alternatively, according to the preset rule, for multiple keypoints of the same type, the partial image corresponding to keypoints of that type is extracted; the partial image corresponding to a keypoint type includes all keypoints of that type, e.g., for the eye keypoints, the partial image including the 1st to 22nd keypoints is extracted.
It should be noted that the implementation in which one partial image corresponds to one keypoint can extract smaller partial images than the implementation in which one partial image corresponds to multiple keypoints, and therefore achieves higher detection efficiency. Conversely, the implementation in which one partial image corresponds to multiple keypoints extracts fewer partial images, which reduces the computational load of the computer device and thus determines the facial keypoints faster.
S306: Determine, based on each partial image, candidate points of the keypoint corresponding to that partial image.
For each partial image, the candidate points of the keypoint corresponding to the partial image may be determined according to the texture features of the partial image. That is, keypoint detection may be performed on each partial image independently to determine the candidate points of its corresponding keypoint. A candidate point is a point that may be the keypoint corresponding to that partial image.
In some embodiments, a trained neural network model may be used to determine the candidate points of the keypoint corresponding to a partial image. For example, a corresponding neural network model may map each keypoint to obtain its candidate points. The neural network corresponding to each keypoint may be a pre-trained neural network model based on partial images. Taking one partial image per keypoint as an example, the number of neural network models equals the number of predefined facial keypoints, and each keypoint may correspond to one partial-image-based neural network model, so that multiple partial images can be fed into multiple neural network models simultaneously and processed in parallel, speeding up processing. The input of such a model is a partial image, and the output may be a heat map of the partial image. A heat map is an image in which a probability distribution over points represents energy levels: the value of a pixel represents the probability that the pixel is the keypoint. The heat map can express, for each pixel in the partial image, the probability of that pixel being the keypoint. Pixels whose probability satisfies a preset condition are candidate points; the preset condition may be that the probability exceeds a preset probability value, which may be any value in the interval from 0 to 1, such as 0, 0.1, or 0.5. Determining the candidate points of the keypoint through a neural network in this way can further improve the accuracy of facial keypoint detection.
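The thresholding of a heat map into candidate points described above can be sketched as follows. This is a minimal sketch: the 8x8 heat-map shape and the 0.1 threshold are just the illustrative values mentioned in the text, and the function name is hypothetical.

```python
import numpy as np

def candidate_points(heatmap, prob_threshold=0.1):
    """Return (x, y, prob) tuples for pixels whose keypoint probability
    exceeds the threshold, sorted by descending probability.

    heatmap: 2-D array of per-pixel keypoint probabilities, e.g. the
    8x8 heat map a per-keypoint network outputs for one partial image.
    """
    ys, xs = np.where(heatmap > prob_threshold)
    probs = heatmap[ys, xs]
    order = np.argsort(-probs)  # most probable candidates first
    return [(int(x), int(y), float(p))
            for x, y, p in zip(xs[order], ys[order], probs[order])]
```

Note that the highest-probability pixel is deliberately not taken as the final keypoint on its own; the candidates of all partial images are still jointly constrained afterwards.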
S308: Jointly constrain the candidate points of the keypoints to determine the facial keypoints.
Because the candidate points of each keypoint correspond to their respective partial images, when determining the facial keypoints of the whole facial image to be detected, the candidate points of the keypoints corresponding to the partial images need to be jointly constrained to determine the keypoint of each partial image. The set of keypoints corresponding to the partial images of the image to be detected is then the set of facial keypoints of the whole facial image to be detected. Because the facial keypoints of the whole image are obtained from the keypoint of each partial image, and the partial images are processed in parallel, the time consumed by the face detection process is reduced.
In one possible implementation, jointly constraining the candidate points of the keypoints to determine the facial keypoints may include: jointly constraining the candidate points in the partial images according to a joint constraint condition, determining the keypoint in each partial image, and thus obtaining the facial keypoints of the whole facial image to be detected. The joint constraint condition indicates the conditions that the keypoints of the partial images, taken together, should satisfy. For example, when the face type of the face in the facial image to be detected is a human face type, the joint constraint condition may require that the combined keypoints satisfy conditions based on facial features, e.g., the eye keypoints lie above the mouth keypoints, and the nose keypoints lie between the eye keypoints and the mouth keypoints.
In some embodiments, jointly constraining the candidate points to determine the facial keypoints may include: using a trained neural network model to jointly constrain the candidate points of the keypoints and determine the facial keypoints. The input of this neural network model may be the heat maps of the partial images, and the output may be the facial keypoints of the facial image to be detected. Determining facial keypoints through a neural network model in this way can further improve detection efficiency.
It should be noted that when each partial image corresponds to multiple keypoints, a rotation operation is required during joint constraint, so a linear model cannot be used for the joint constraint. When each partial image corresponds to one keypoint, no rotation operation is needed, so a linear model can be used. When candidate points are jointly constrained based on a linear model, only a single solve is required; compared with a nonlinear model, which requires iterative solving, the computation is smaller and detection is faster.
It should also be noted that in this embodiment the pixel with the highest keypoint probability in each partial image is not simply taken as the facial keypoint; joint constraint is required. For example, when a keypoint is affected by interference such as occlusion or dim light, the keypoint probability of the corresponding pixel is low, but the joint constraint screens out such interference and outputs the point as a facial keypoint.
With the facial keypoint detection method of this embodiment, a facial image to be detected is acquired; partial images that each include a keypoint are determined from it; candidate points of the corresponding keypoint are determined from each partial image; and the candidate points are jointly constrained to determine the facial keypoints. Because candidate points are determined separately from the partial images that each contain a keypoint, the amount of computation is reduced and candidate points are determined more efficiently, improving the detection efficiency of the facial keypoints. Moreover, because partial images better capture local detail features, the method of this embodiment can also improve the accuracy of keypoint detection. When the method is applied to makeup applications, the improved detection efficiency shortens keypoint detection time, reduces runtime stuttering, and provides a smoother makeup effect.
Dynamic scenes may appear in applications such as video sharing, video shooting, and video beautification. In these dynamic scenes, makeup may be applied to a human face, for example, eye shadow, lipstick, or face slimming. Applying makeup to a face first requires precise localization of the facial features, i.e., facial keypoint detection. After the keypoints of each organ are detected, CG (Computer Graphics) rendering can be used to apply makeup to the face image. Because the facial keypoint detection method of this embodiment has the beneficial effect of high detection accuracy, it can avoid inaccurate keypoint detection and misplaced makeup caused by special situations such as profile views, occlusion, or dim light during makeup application, thereby improving the stability of makeup application.
It should also be noted that dynamic scenes have high real-time requirements: the facial keypoints of the face image in every video frame need to be detected in real time, so the requirement on detection efficiency is even higher. The method of this embodiment is well suited to dynamic scenes: while guaranteeing the smoothness of the dynamic scene, it fits application environments in which the executing terminal is a smart terminal.
In one embodiment, jointly constraining the candidate points of the keypoints to determine the facial keypoints includes: jointly constraining the candidate points of the keypoints based on a linear model to determine the facial keypoints.
It can be understood that in the facial keypoint detection method of this embodiment, each partial image corresponds to one keypoint, so no rotation operation is needed during joint constraint; therefore, the candidate points can be jointly constrained based on a linear model to determine the facial keypoints of the facial image to be detected. In particular, for profile poses, traditional nonlinear models often suffer from solution bias, whereas a joint constraint based on a linear model can obtain more accurate facial keypoints. A joint constraint of the candidate points based on a linear model requires only a single solve; compared with a nonlinear model requiring iterative solving, it needs less computation and is guaranteed to converge to the global optimum.
Because a linear model, compared with a nonlinear model, guarantees convergence to the global optimum with less computation, the method of this embodiment further improves the efficiency of determining facial keypoints while also improving detection accuracy. When the method is applied to makeup applications, the improved efficiency and accuracy further shorten keypoint detection time, reduce runtime stuttering, and provide a smoother makeup effect. In addition, because the linear model guarantees a globally optimal solution, solution bias in profile poses is avoided, improving makeup accuracy for profile faces.
In addition, the linear model may be a dynamic linear point distribution model, i.e., a dynamic point distribution model based on linear constraints. In one possible implementation, jointly constraining the candidate points of the keypoints to determine the facial keypoints includes: obtaining a constraint parameter for each partial image based on the dynamic linear point distribution model; and jointly constraining the candidate points of the keypoint of each partial image according to each constraint parameter, to obtain the facial keypoint of each partial image.
A Point Distribution Model (PDM) is a statistical model of the distribution of keypoints of a class of objects, which reflects the shape characteristics of that class. In this embodiment, the point distribution model is a statistical model of the distribution of facial keypoints. A dynamic linear point distribution model can dynamically update the constraint parameters (PDM parameters) in real time, making the detection result more accurate. Therefore, the method of this embodiment guarantees convergence to the global optimum, needs less computation, and allows the constraint parameters to be updated dynamically in real time, further improving both the efficiency and the accuracy of facial keypoint detection.
In one specific embodiment, the input of the dynamic linear point distribution model is the heat map of each partial image, and the output is the facial keypoints. The objective function of the joint-constraint solve in the dynamic linear point distribution model may be:

$$\max_{B}\ \sum_{k=1}^{N} H_k[x_k, y_k] \;-\; \lambda\, B^{\top}\Lambda^{-1}B, \qquad \text{s.t.}\ [X, Y] = M_{PDM}\cdot B$$

where H_k[x, y] is the probability that the k-th keypoint is located at coordinate [x, y]; λ is the regularization strength, which may be set empirically to any value between 0.1 and 5.0; B is the PDM parameter; M_PDM is a composite matrix constructed from the basis vectors of the point distribution model; Λ is the vector of eigenvalues corresponding to the PCA (Principal Component Analysis, which removes redundant components from the keypoint training data to obtain the principal component vectors) principal components; and s.t. denotes the constraint.
PCA reduces dimensionality by converting the many indicators of the original data into a smaller number of composite indicators; because these capture most of the information in the original data, they can be regarded as the principal components of the original data.
In addition, in this embodiment, the constraint is:

$$[x, y] = s\cdot(\bar{S} + \Phi\cdot B) + T$$

where [x, y] denotes the spatial coordinates of the facial keypoints; s is a scale factor, which may be a floating-point number; \bar{S} is the mean shape vector; Φ is the matrix of PCA basis vectors; B is the PDM parameter; and T is the translation factor. The multiplications in the above formula are matrix multiplications; because matrix multiplication is associative, the constraint can be further rewritten as [X, Y] = M_PDM · B, where M_PDM is a composite matrix whose first column is the mean shape vector

$$[\bar{x}_1, \bar{y}_1, \bar{x}_2, \bar{y}_2, \ldots, \bar{x}_N, \bar{y}_N]^{\top},$$

whose second column is the vector [1, 0, 1, 0, ..., 1, 0], whose third column is the vector [0, 1, 0, 1, ..., 0, 1], and whose fourth through last columns are the PCA basis vectors. Solving for the parameter B gives B = (M_PDM)^{-1} · [X, Y], where (M_PDM)^{-1} denotes the pseudo-inverse of M_PDM.
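The single linear solve described above, B = (M_PDM)^{-1} · [X, Y] via the pseudo-inverse, can be sketched as follows. The column layout of M_PDM follows the construction given in the text (mean shape, two translation columns, PCA bases); the function name and the array shapes are assumptions for illustration.

```python
import numpy as np

def fit_pdm(points, mean_shape, pca_basis):
    """Solve [X, Y] = M_PDM . B linearly via the pseudo-inverse.

    points:     (N, 2) keypoint coordinates selected from the heat maps
    mean_shape: (N, 2) mean face shape (flattened into column 1 of M_PDM)
    pca_basis:  (2N, K) PCA basis vectors (columns 4.. of M_PDM)
    Returns the PDM parameter B and the jointly constrained keypoints.
    """
    n = points.shape[0]
    xy = points.reshape(-1)                          # [x1, y1, x2, y2, ...]
    col_mean = mean_shape.reshape(-1, 1)             # column 1: mean shape
    col_tx = np.tile([1.0, 0.0], n).reshape(-1, 1)   # column 2: x translation
    col_ty = np.tile([0.0, 1.0], n).reshape(-1, 1)   # column 3: y translation
    m_pdm = np.hstack([col_mean, col_tx, col_ty, pca_basis])
    b = np.linalg.pinv(m_pdm) @ xy                   # single solve, no iteration
    recon = (m_pdm @ b).reshape(-1, 2)               # constrained keypoints
    return b, recon
```

Because the solve is a single pseudo-inverse multiplication rather than an iterative optimization, it reaches the globally optimal least-squares solution in one step, which is the efficiency argument made above for the linear model.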
By contrast, the traditional constraint is:

$$[x, y] = s\cdot R\cdot(\bar{S} + \Phi\cdot B) + T$$

where R denotes the rotation factor, which may be expressed as the product of rotations about the three coordinate axes:

$$R = R_X(\varphi)\, R_Y(\theta)\, R_Z(\psi)$$

where φ, θ, and ψ denote the angles of rotation about the X, Y, and Z axes of the three-dimensional coordinate system, respectively. Because the nonlinear factor R is present in the traditional constraint, solving for the parameter B in that formula requires a complex algorithm such as gradient descent, which is time-consuming and cannot guarantee a globally optimal solution.
In the dynamic linear point distribution model of this implementation, no rotation operation is performed during joint constraint and no nonlinear factor appears in the constraint, so facial keypoint detection is highly efficient.
It can be understood that in other embodiments the linear model may simply be a linear point distribution model, which likewise achieves the beneficial effect of further improving the efficiency of determining facial keypoints while improving detection accuracy.
In one embodiment, determining, based on each partial image, the candidate points of the keypoint corresponding to that partial image includes: when the frame to be detected is not a preset frame, acquiring the partial images that each include a keypoint in a preceding frame of the frame to be detected; and determining the candidate points of the keypoint corresponding to each partial image of the facial image to be detected based on the corresponding partial images of the preceding frame and of the frame to be detected.
The preset frame may be a frame set in advance, such as a non-key frame or the first frame. A key frame may be a video frame containing key information, or a video frame taken every preset number of frames or at a preset time interval. A non-preset frame is then a key frame other than the first frame. The first frame may be the first frame used for detecting facial keypoints in the dynamic scene; a non-first frame is a frame after the first frame used for detecting facial keypoints in the dynamic scene. The preceding frame is any frame before the frame to be detected, and may include one frame before it or multiple frames before it. The preceding frame may also be at least one frame contiguous with the frame to be detected, e.g., the frame immediately before it.
In one possible implementation, when the preset frame is the first frame and the frame to be detected is not the preset frame, i.e., not the first frame, the partial images that each include a keypoint in the preceding frame of the frame to be detected are acquired, and the candidate points of the keypoint corresponding to each partial image of the facial image to be detected are determined based on the corresponding partial images of the preceding frame and of the frame to be detected. The corresponding partial images of the preceding frame and of the frame to be detected are the partial images, in the two frames, that correspond to the same keypoint.
In another possible implementation, the preset frame is a non-key frame; when the frame to be detected is not the preset frame and is not the first frame, the keypoint-containing partial images of the preceding frame are acquired, and the candidate points are determined based on the corresponding partial images of the preceding frame and of the frame to be detected. Here, the frame to be detected being neither the preset frame nor the first frame means that it is a key frame and that this key frame is not the first frame of the dynamic scene.
In this embodiment, when the frame to be detected is not a preset frame, the candidate points of the keypoint corresponding to each partial image of the frame to be detected are determined by combining the corresponding partial images of the frame to be detected and of its preceding frame. It can be understood that the number of partial images of the frame to be detected is greater than one and may equal the number of facial keypoints in that frame.
The partial images of the preceding frame may be determined by predicting the keypoint positions through pre-detection and extracting partial images at the predicted positions, or by extracting partial images based on the facial keypoint detection result of the preceding frame, where that result is obtained by applying the facial keypoint detection method provided in the embodiments of this application to the preceding frame.
With the facial keypoint detection method of this embodiment, when the frame to be detected is not a preset frame, the candidate points of its partial images are determined by combining the frame to be detected and its preceding frame. This ensures keypoint consistency between the preceding frame and the frame to be detected, and thus improves keypoint stability in dynamic scenes. When the method is applied to makeup applications, it improves the stability of facial keypoint detection across consecutive video frames and alleviates the problem of makeup jitter.
In one embodiment, after jointly constraining the candidate points of the keypoints to determine the facial keypoints, the method further includes: extracting partial images based on the facial keypoints, to determine the partial images that each include a keypoint in the preceding frame of the next facial image frame.
By extracting partial images based on the facial keypoints of the frame to be detected, the keypoint-containing partial images of the preceding frame of the next facial image frame (i.e., the current frame to be detected) can be determined. That is, after the facial keypoints of the current frame to be detected are obtained with the facial keypoint detection method provided in the embodiments of this application, partial images that include the keypoints are extracted from those keypoints; since the next frame to be detected is processed subsequently, these partial images can serve as the preceding-frame partial images of the next frame to be detected. For example, if the current frame to be detected is the Nth frame, N being a natural number, partial images are extracted from the facial keypoints of the Nth frame to obtain the partial images that each include a keypoint in the frame preceding the (N+1)th facial image frame. In the same way, the keypoint-containing partial images of the preceding m frames can be obtained, m being a natural number less than N.
This provides a way of determining the keypoint-containing partial images of the preceding frame that is more accurate than pre-detection, and can therefore further improve the accuracy of keypoint detection.
In one embodiment, determining, based on each partial image, the candidate points of the keypoint corresponding to that partial image includes: when the frame to be detected is a preset frame, determining the candidate points of the keypoint corresponding to each partial image of the facial image to be detected based on the partial images of the frame to be detected.
That is, when the frame to be detected is a preset frame, which may be a non-key frame or the first frame, the candidate points can be determined from the partial images of the frame to be detected alone. In this way, when the frame to be detected is a preset frame, e.g., when it is the first frame and has no preceding frame, the candidate points of the keypoint corresponding to each partial image of the facial image to be detected are determined based on the partial images of the frame to be detected itself.
In one embodiment, the preset frame is the first frame: when the frame to be detected is the preset frame, because it is the first frame and has no preceding frame, the computer device determines the candidate points of the keypoint corresponding to each partial image of the facial image to be detected from the partial images of the frame to be detected.
In one embodiment, the preset frame is a non-key frame: a non-key frame may be a video frame that contains no key information, so the processing of non-key frames can be simplified by obtaining the candidate points based only on the partial images of the non-key frame itself, reducing the computational load of the computer device.
In one specific embodiment, as shown in FIG. 4 and FIG. 5, a trained neural network model is used to determine the candidate points of the keypoint corresponding to a partial image, and the probability of each pixel in the partial image being the keypoint is expressed in the form of a heat map, from which the candidate points are determined. For example, each neural network model takes a 16x16 partial image as input and outputs an 8x8 heat map; in between are two convolutional layers and one fully connected layer. The convolution kernels are all 5x5, the convolutional layers use no padding or pooling, and all convolutional layers use ReLU (the Rectified Linear Unit) as the activation function. The pixel values in the heat map represent the probability of each pixel being the keypoint: the larger the pixel value, the higher the probability. During training, the network parameters are initialized with a Gaussian distribution with variance 0.01 and mean 0; the SGD (Stochastic Gradient Descent) algorithm is used to solve for the network parameters, and the back-propagated error in each training iteration is the Euclidean distance between the predicted heat map and the annotated heat map. The candidate points of each keypoint are determined with the same network structure, but the networks are trained independently of each other, so the parameters of the networks corresponding to the facial keypoints differ. It should be noted that the training data may be the public 300W dataset and a dataset annotated by YouTu Lab.
In this specific embodiment, as shown in FIG. 4 and FIG. 6, when the frame to be detected is a preset frame, the candidate points of the keypoint corresponding to each partial image of the facial image to be detected are determined from the partial images of the frame to be detected. As shown in FIG. 5 and FIG. 6, when the frame to be detected is not a preset frame, the keypoint-containing partial images of the preceding frame are acquired, and the candidate points are determined from the corresponding partial images of the previous frame and of the frame to be detected. In FIG. 6, the upper facial image is that of a preset frame and the lower facial image that of a non-preset frame. That is, for a preset frame, the candidate points of each keypoint in the facial image to be detected are determined from one partial image of that frame; for a non-preset frame, they are determined jointly from one partial image of the frame to be detected and one partial image of the previous frame, two partial images in total.
In one embodiment, the facial image to be detected is a human face image, and the facial keypoints are human facial keypoints.
Facial landmarks are the set of points with facial-feature semantics in a human face image, for example, eye corners, eyelid midpoints, the nose tip, mouth corners, and contour points.
The facial keypoint detection method of this embodiment can improve both the efficiency and the accuracy of determining human facial keypoints.
In one embodiment, determining, based on each partial image, the candidate points of the keypoint corresponding to that partial image includes: for each partial image, determining a heat map of the partial image, the heat map including the probability of each pixel in the corresponding partial image being the keypoint; and determining the candidate points of the keypoint corresponding to the partial image according to the heat map.
The probability of each pixel in the partial image being the keypoint can be expressed in the form of a heat map. A candidate point may be a pixel whose keypoint probability exceeds a preset probability value, which may be any value in the interval from 0 to 1, such as 0 or 0.1. That is, the candidate points of the keypoint corresponding to the partial image can be determined from the heat map. If a keypoint is affected by interference such as occlusion or dim light, the heat values in the corresponding heat map are low, but the joint constraint screens out such interference.
Because the heat map contains the probability information of the keypoint distribution, and interference such as occlusion or dim light only lowers the heat values in the corresponding heat map, keypoint detection accuracy can be improved in special situations such as profile views, occlusion, and dim light. When the facial keypoint detection method is applied to makeup applications, this improved detection accuracy in such special situations improves the accuracy of makeup application and alleviates the problem of misplaced makeup.
In one embodiment, acquiring the facial image to be detected includes: acquiring an initial facial image; and preprocessing the initial facial image to obtain the facial image to be detected.
The initial facial image may be an unprocessed facial image captured by a camera. The preprocessing may include operations such as rotation and/or scaling, so that the pupil keypoints of both eyes lie on the same horizontal line, and/or the horizontal distance between the pupil keypoints of both eyes equals a preset value. Rotation here means rotating the facial image until the two pupil keypoints lie on the same horizontal line; scaling means enlarging or shrinking the facial image. The preprocessing may also include operations such as obtaining a texture feature image of the initial facial image and/or locating the face region in the initial facial image. This can further improve the efficiency of facial keypoint detection.
In one embodiment, the preprocessing includes: rotating the initial facial image so that the pupil keypoints of both eyes lie on the same horizontal line; and/or scaling the initial facial image so that the horizontal distance between the pupil keypoints of both eyes equals a preset value, i.e., a predetermined horizontal distance between the two pupil keypoints, for example, 160 pixels. This can further improve the efficiency of facial keypoint detection.
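The rotation-and-scaling normalisation described above can be sketched as computing, from the two pupil keypoints, the angle that levels the eyes and the factor that brings their distance to the preset value (160 px in the example). This is a minimal sketch: the function name is hypothetical, and applying the actual image warp with these parameters is omitted.

```python
import numpy as np

def align_params(left_pupil, right_pupil, target_dist=160.0):
    """Compute the rotation angle and scale factor used to normalise a face.

    Rotating the image by -angle puts both pupil keypoints on one
    horizontal line; scaling by the returned factor makes their
    distance equal target_dist pixels. Returns (angle_radians, scale).
    """
    dx = right_pupil[0] - left_pupil[0]
    dy = right_pupil[1] - left_pupil[1]
    angle = np.arctan2(dy, dx)   # current tilt of the inter-pupil line
    dist = np.hypot(dx, dy)      # becomes the horizontal distance after rotation
    scale = target_dist / dist
    return angle, scale
```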
In one embodiment, determining, according to the facial image to be detected, the partial images in the facial image to be detected that each include a keypoint includes: initializing the facial keypoints of the facial image to be detected; and determining, based on the initialization result, the partial images that each include a keypoint.
In this embodiment, initializing the facial keypoints of the facial image to be detected may include: when no preceding facial image frame exists for the facial image to be detected, e.g., it is the first frame of a dynamic scene or an independent facial image, the initialization result of the initialized keypoints may be determined from the keypoint coordinates of a mean face model; the initialization result may then be the keypoint coordinates of the mean face model, i.e., the average position coordinates of the facial keypoints obtained by analyzing a large number of face models. When a preceding facial image frame exists, the initialization result may be determined from the keypoint coordinates of the facial image of the preceding frame; the initialization result may then be the keypoint coordinates of the previous facial image frame.
It can be understood that in other embodiments the facial keypoints of the facial image to be detected may also be initialized in other ways, for example, through pre-detection.
Determining the keypoint-containing partial images based on the initialization result may be done by taking the initialization result as the center position of the partial image of the corresponding keypoint and extracting an image of a preset size, obtaining the partial image that contains the keypoint.
This provides an efficient implementation for determining the partial images that each include a keypoint in the facial image to be detected, further improving the efficiency of facial keypoint detection.
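Extracting a fixed-size partial image centred on an initialised keypoint, as described above, can be sketched as follows. The 16x16 size matches the network input mentioned earlier, while the zero-padding at image borders and the function name are assumptions of this sketch.

```python
import numpy as np

def extract_patch(image, center, size=16):
    """Crop a size-by-size partial image centred on an initialised keypoint.

    image:  2-D (grayscale) or 3-D (H, W, C) array
    center: (x, y) initialised keypoint position
    Regions falling outside the image are zero-padded.
    """
    h, w = image.shape[:2]
    half = size // 2
    x, y = int(round(center[0])), int(round(center[1]))
    patch = np.zeros((size, size) + image.shape[2:], dtype=image.dtype)
    x0, x1 = max(0, x - half), min(w, x + half)   # valid source columns
    y0, y1 = max(0, y - half), min(h, y + half)   # valid source rows
    patch[(y0 - (y - half)):(y1 - (y - half)),
          (x0 - (x - half)):(x1 - (x - half))] = image[y0:y1, x0:x1]
    return patch
```

Running one such crop per defined keypoint yields the set of partial images that are then fed, in parallel, to the per-keypoint networks.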
As shown in FIG. 7, in a specific embodiment, the facial keypoint detection method is applied to a dynamic scene and includes:
acquiring an initial facial image; rotating and scaling the initial facial image so that the pupil keypoints of both eyes lie on the same horizontal line and the horizontal distance between them equals the preset value, to obtain the facial image to be detected, the facial image to be detected being the facial image of the frame to be detected and being a human face image;
initializing the facial keypoints of the facial image to be detected; determining, based on the initialization result, the partial images in the facial image to be detected that each include a keypoint, the facial image to be detected being the facial image of the frame to be detected;
when the frame to be detected is not a preset frame, acquiring the keypoint-containing partial images of the preceding frame of the frame to be detected, and determining the heat map corresponding to each partial image of the facial image to be detected based on the corresponding partial images of the preceding frame and of the frame to be detected;
when the frame to be detected is a preset frame, determining the heat map corresponding to each partial image of the facial image to be detected based on the partial images of the frame to be detected;
jointly constraining the heat maps based on the dynamic linear point distribution model to determine the facial keypoints, the facial keypoints being human facial keypoints; and
extracting partial images based on the facial keypoints, to determine the keypoint-containing partial images of the preceding frame of the next facial image frame.
The facial keypoint detection method of this embodiment can improve the efficiency, accuracy, and stability of facial keypoint detection.
To illustrate the beneficial effects of the above facial keypoint detection method more clearly, refer to FIG. 8 and FIG. 9. FIG. 8 shows that, when one side of a human face is in dim light, the keypoints of the eyebrow region and eye region on that side detected by a prior-art method are disordered, so when a "wearing glasses" makeup operation is applied to the subject, the position of the glasses is clearly misplaced. FIG. 9 shows that, with the implementation of this application, the keypoints of the eyebrow region and eye region of the dimly lit side are detected accurately, so when the "wearing glasses" makeup operation is applied, the glasses are positioned accurately.
It should be understood that although the steps in the flowcharts of FIG. 3 and FIG. 7 are displayed sequentially in the order indicated by the arrows, these steps are not necessarily executed in that order. Unless explicitly stated herein, there is no strict ordering restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in FIG. 3 and FIG. 7 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments; their execution order is not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In an embodiment, as shown in FIG. 10, a facial keypoint detection apparatus is provided, including:
an overall image acquisition module 802, configured to acquire a facial image to be detected, the facial image to be detected being the facial image of a frame to be detected;
a partial image determination module 804, configured to determine, according to the facial image to be detected, partial images in the facial image to be detected that each include a keypoint;
a partial candidate point determination module 806, configured to determine, based on each partial image, candidate points of the keypoint corresponding to that partial image; and
an overall keypoint determination module 808, configured to jointly constrain the candidate points of the keypoints to determine the facial keypoints.
The facial keypoint detection apparatus acquires a facial image to be detected; determines, according to it, partial images that each include a keypoint; determines, based on each partial image, candidate points of the corresponding keypoint; and jointly constrains the candidate points to determine the facial keypoints. Because candidate points are determined separately from the partial images of the overall facial image that each contain a keypoint, the amount of computation is reduced and candidate points are determined more efficiently, so the detection efficiency of the facial keypoints is improved.
In one embodiment, the overall keypoint determination module is configured to jointly constrain the candidate points of the keypoints based on a linear model to determine the facial keypoints.
In one embodiment, the linear model is a dynamic linear point distribution model.
In one embodiment, the partial candidate point determination module is configured to: when the frame to be detected is not a preset frame, acquire the partial images that each include a keypoint in the preceding frame of the frame to be detected; and determine the candidate points of the keypoint corresponding to each partial image of the facial image to be detected based on the corresponding partial images of the preceding frame and of the frame to be detected.
In one embodiment, the apparatus further includes a preceding partial image determination module, configured to: after the overall keypoint determination module jointly constrains the candidate points of the keypoints to determine the facial keypoints, extract partial images based on the facial keypoints to determine the partial images that each include a keypoint in the preceding frame of the next facial image frame.
In one embodiment, the partial candidate point determination module is configured to: when the frame to be detected is a preset frame, determine the candidate points of the keypoint corresponding to each partial image of the facial image to be detected based on the partial images of the frame to be detected.
In one embodiment, the facial image to be detected is a human face image, and the facial keypoints are human facial keypoints.
In one embodiment, the partial candidate point determination module is configured to: for each partial image, determine a heat map of the partial image, the heat map including the probability of each pixel in the corresponding partial image being the keypoint;
and determine, according to the heat map, the candidate points of the keypoint corresponding to the partial image.
In one embodiment, the overall image acquisition module is configured to acquire an initial facial image, and preprocess the initial facial image to obtain the facial image to be detected.
In one embodiment, the overall image acquisition module is further configured to rotate the initial facial image so that the pupil keypoints of both eyes lie on the same horizontal line, and/or scale the initial facial image so that the horizontal distance between the pupil keypoints of both eyes equals a preset value.
In one embodiment, the partial candidate point determination module is configured to initialize the facial keypoints of the facial image to be detected, and determine, based on the initialization result, the partial images in the facial image to be detected that each include a keypoint.
In an embodiment, a computer device is provided, which may be a server. The computer device includes a processor, a memory, and a network interface connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program; the internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The network interface of the computer device communicates with an external terminal through a network connection. The computer program, when executed by the processor, implements a facial keypoint detection method.
In an embodiment, a computer device is provided, which may be a terminal. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program; the internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The network interface of the computer device communicates with an external terminal through a network connection. The computer program, when executed by the processor, implements a facial keypoint detection method. The display screen of the computer device may be a liquid crystal display or an electronic ink display; the input device may be a touch layer covering the display screen, a button, trackball, or touchpad provided on the housing of the computer device, or an external keyboard, touchpad, mouse, or the like.
In an implementation, a computer device is provided, which may be a server or a terminal. The computer device includes a memory and a processor, the memory storing a computer program that, when executed by the processor, implements the steps of the above facial keypoint detection method.
In one embodiment, the computer device includes a memory and a processor, the memory storing a computer program that, when executed by the processor, implements the following steps:
acquiring a facial image to be detected, the facial image to be detected being the facial image of a frame to be detected;
determining, according to the facial image to be detected, partial images in the facial image to be detected that each include a keypoint;
determining, based on each partial image, candidate points of the keypoint corresponding to that partial image; and
jointly constraining the candidate points of the keypoints to determine the facial keypoints.
In one embodiment, jointly constraining the candidate points of the keypoints to determine the facial keypoints includes:
jointly constraining the candidate points of the keypoints based on a linear model to determine the facial keypoints.
In one embodiment, the linear model is a dynamic linear point distribution model.
In one embodiment, determining, based on each partial image, the candidate points of the keypoint corresponding to that partial image includes:
when the frame to be detected is not a preset frame, acquiring the partial images that each include a keypoint in the preceding frame of the frame to be detected; and
determining the candidate points of the keypoint corresponding to each partial image of the facial image to be detected based on the corresponding partial images of the preceding frame and of the frame to be detected.
In one embodiment, after jointly constraining the candidate points of the keypoints to determine the facial keypoints, the steps further include:
extracting partial images based on the facial keypoints, to determine the partial images that each include a keypoint in the preceding frame of the next facial image frame.
In one embodiment, determining, based on each partial image, the candidate points of the keypoint corresponding to that partial image includes:
when the frame to be detected is a preset frame, determining the candidate points of the keypoint corresponding to each partial image of the facial image to be detected based on the partial images of the frame to be detected.
In one embodiment, the facial image to be detected is a human face image, and the facial keypoints are human facial keypoints.
In one embodiment, determining, based on each partial image, the candidate points of the keypoint corresponding to that partial image includes:
for each partial image, determining a heat map of the partial image, the heat map including the probability of each pixel in the corresponding partial image being the keypoint; and
determining, according to the heat map, the candidate points of the keypoint corresponding to the partial image.
In one embodiment, acquiring the facial image to be detected includes:
acquiring an initial facial image; and
preprocessing the initial facial image to obtain the facial image to be detected.
In one embodiment, the preprocessing includes:
rotating the initial facial image so that the pupil keypoints of both eyes lie on the same horizontal line;
and/or
scaling the initial facial image so that the horizontal distance between the pupil keypoints of both eyes equals a preset value.
In one embodiment, determining, according to the facial image to be detected, the partial images in the facial image to be detected that each include a keypoint includes:
initializing the facial keypoints of the facial image to be detected; and
determining, based on the initialization result, the partial images in the facial image to be detected that each include a keypoint.
In one implementation, a computer-readable storage medium is provided, storing a computer program that, when executed by a processor, implements the steps of the above facial keypoint detection method.
In one embodiment, a computer-readable storage medium is provided, storing a computer program that, when executed by a processor, implements the following steps:
acquiring a facial image to be detected, the facial image to be detected being the facial image of a frame to be detected;
determining, according to the facial image to be detected, partial images in the facial image to be detected that each include a keypoint;
determining, based on each partial image, candidate points of the keypoint corresponding to that partial image; and
jointly constraining the candidate points of the keypoints to determine the facial keypoints.
In one embodiment, jointly constraining the candidate points of the keypoints to determine the facial keypoints includes:
jointly constraining the candidate points of the keypoints based on a linear model to determine the facial keypoints.
In one embodiment, the linear model is a dynamic linear point distribution model.
In one embodiment, determining, based on each partial image, the candidate points of the keypoint corresponding to that partial image includes:
when the frame to be detected is not a preset frame, acquiring the partial images that each include a keypoint in the preceding frame of the frame to be detected; and
determining the candidate points of the keypoint corresponding to each partial image of the facial image to be detected based on the corresponding partial images of the preceding frame and of the frame to be detected.
In one embodiment, after jointly constraining the candidate points of the keypoints to determine the facial keypoints, the steps further include:
extracting partial images based on the facial keypoints, to determine the partial images that each include a keypoint in the preceding frame of the next facial image frame.
In one embodiment, determining, based on each partial image, the candidate points of the keypoint corresponding to that partial image includes:
when the frame to be detected is a preset frame, determining the candidate points of the keypoint corresponding to each partial image of the facial image to be detected based on the partial images of the frame to be detected.
In one embodiment, the facial image to be detected is a human face image, and the facial keypoints are human facial keypoints.
In one embodiment, determining, based on each partial image, the candidate points of the keypoint corresponding to that partial image includes:
for each partial image, determining a heat map of the partial image, the heat map including the probability of each pixel in the corresponding partial image being the keypoint; and
determining, according to the heat map, the candidate points of the keypoint corresponding to the partial image.
In one embodiment, acquiring the facial image to be detected includes:
acquiring an initial facial image; and
preprocessing the initial facial image to obtain the facial image to be detected.
In one embodiment, the preprocessing includes:
rotating the initial facial image so that the pupil keypoints of both eyes lie on the same horizontal line;
and/or
scaling the initial facial image so that the horizontal distance between the pupil keypoints of both eyes equals a preset value.
In one embodiment, determining, according to the facial image to be detected, the partial images in the facial image to be detected that each include a keypoint includes:
initializing the facial keypoints of the facial image to be detected; and
determining, based on the initialization result, the partial images in the facial image to be detected that each include a keypoint.
A person of ordinary skill in the art can understand that all or part of the processes of the methods in the above embodiments may be implemented by a computer program instructing related hardware. The computer program may be stored in a non-volatile computer-readable storage medium, and when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), Rambus direct RAM (RDRAM), and direct Rambus dynamic RAM (DRDRAM), etc.
The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in the combination of these technical features, the combination shall be considered within the scope recorded in this specification.
The above embodiments only express several implementations of this application, and their descriptions are relatively specific and detailed, but they shall not therefore be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art may make several variations and improvements without departing from the concept of this application, all of which fall within the protection scope of this application. Therefore, the protection scope of this application patent shall be subject to the appended claims.

Claims (22)

  1. A facial keypoint detection method, applied to a computer device, the method comprising:
    acquiring a facial image to be detected, the facial image to be detected being the facial image of a frame to be detected;
    determining, according to the facial image to be detected, partial images in the facial image to be detected that each include a keypoint;
    determining, based on each partial image, candidate points of the keypoint corresponding to that partial image; and
    jointly constraining the candidate points of the keypoints to determine the facial keypoints.
  2. The method according to claim 1, wherein jointly constraining the candidate points of the keypoints to determine the facial keypoints comprises:
    jointly constraining the candidate points of the keypoints based on a linear model to determine the facial keypoints.
  3. The method according to claim 1, wherein determining, based on each partial image, the candidate points of the keypoint corresponding to that partial image comprises:
    when the frame to be detected is not a preset frame, acquiring partial images that each include a keypoint in a preceding frame of the frame to be detected; and
    determining the candidate points of the keypoint corresponding to each partial image of the facial image to be detected based on the corresponding partial images of the preceding frame and of the frame to be detected.
  4. The method according to claim 1, further comprising, after jointly constraining the candidate points of the keypoints to determine the facial keypoints:
    extracting partial images based on the facial keypoints, to determine partial images that each include a keypoint in the preceding frame of a next facial image frame.
  5. The method according to claim 1, wherein determining, based on each partial image, the candidate points of the keypoint corresponding to that partial image comprises:
    when the frame to be detected is a preset frame, determining the candidate points of the keypoint corresponding to each partial image of the facial image to be detected based on the partial images of the frame to be detected.
  6. The method according to claim 1, wherein the facial image to be detected is a human face image, and the facial keypoints are human facial keypoints.
  7. The method according to claim 1, wherein determining, based on each partial image, the candidate points of the keypoint corresponding to that partial image comprises:
    for each partial image, determining a heat map of the partial image, the heat map comprising the probability of each pixel in the corresponding partial image being the keypoint; and
    determining, according to the heat map, the candidate points of the keypoint corresponding to the partial image.
  8. The method according to claim 1, wherein acquiring the facial image to be detected comprises:
    acquiring an initial facial image; and
    preprocessing the initial facial image to obtain the facial image to be detected.
  9. The method according to claim 8, wherein the preprocessing comprises:
    rotating the initial facial image so that pupil keypoints of both eyes lie on the same horizontal line;
    and/or
    scaling the initial facial image so that the horizontal distance between the pupil keypoints of both eyes equals a preset value.
  10. The method according to claim 9, wherein determining, according to the facial image to be detected, the partial images in the facial image to be detected that each include a keypoint comprises:
    initializing facial keypoints of the facial image to be detected; and
    determining, based on the initialization result, the partial images in the facial image to be detected that each include a keypoint.
  11. A facial keypoint detection apparatus, the apparatus comprising:
    an overall image acquisition module, configured to acquire a facial image to be detected, the facial image to be detected being the facial image of a frame to be detected;
    a partial image determination module, configured to determine, according to the facial image to be detected, partial images in the facial image to be detected that each include a keypoint;
    a partial candidate point determination module, configured to determine, based on each partial image, candidate points of the keypoint corresponding to that partial image; and
    an overall keypoint determination module, configured to jointly constrain the candidate points of the keypoints to determine the facial keypoints.
  12. The apparatus according to claim 11, wherein the overall keypoint determination module is configured to jointly constrain the candidate points of the keypoints based on a linear model to determine the facial keypoints.
  13. The apparatus according to claim 11, wherein the partial candidate point determination module is configured to: when the frame to be detected is not a preset frame, acquire partial images that each include a keypoint in a preceding frame of the frame to be detected; and determine the candidate points of the keypoint corresponding to each partial image of the facial image to be detected based on the corresponding partial images of the preceding frame and of the frame to be detected.
  14. The apparatus according to claim 11, further comprising:
    a preceding partial image determination module, configured to, after the overall keypoint determination module jointly constrains the candidate points of the keypoints to determine the facial keypoints, extract partial images based on the facial keypoints to determine partial images that each include a keypoint in the preceding frame of a next facial image frame.
  15. The apparatus according to claim 11, wherein the partial candidate point determination module is configured to: when the frame to be detected is a preset frame, determine the candidate points of the keypoint corresponding to each partial image of the facial image to be detected based on the partial images of the frame to be detected.
  16. The apparatus according to claim 11, wherein the facial image to be detected is a human face image, and the facial keypoints are human facial keypoints.
  17. The apparatus according to claim 11, wherein the partial candidate point determination module is configured to: for each partial image, determine a heat map of the partial image, the heat map comprising the probability of each pixel in the corresponding partial image being the keypoint; and determine, according to the heat map, the candidate points of the keypoint corresponding to the partial image.
  18. The apparatus according to claim 11, wherein the overall image acquisition module is configured to acquire an initial facial image, and preprocess the initial facial image to obtain the facial image to be detected.
  19. The apparatus according to claim 18, wherein the overall image acquisition module is further configured to rotate the initial facial image so that pupil keypoints of both eyes lie on the same horizontal line; and/or scale the initial facial image so that the horizontal distance between the pupil keypoints of both eyes equals a preset value.
  20. The apparatus according to claim 19, wherein the partial candidate point determination module is configured to initialize facial keypoints of the facial image to be detected, and determine, based on the initialization result, the partial images in the facial image to be detected that each include a keypoint.
  21. A computer device, comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1-10.
  22. A computer-readable storage medium, storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1-10.
PCT/CN2019/121237 2018-12-10 2019-11-27 Facial keypoint detection method and apparatus, computer device and storage medium WO2020119458A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
KR1020217011194A KR102592270B1 (ko) 2018-12-10 2019-11-27 얼굴 랜드마크 검출 방법과 장치, 컴퓨터 장치, 및 저장 매체
EP19897431.3A EP3839807A4 (en) 2018-12-10 2019-11-27 METHOD AND DEVICE FOR FACE MARKING DETECTION, COMPUTER DEVICE AND STORAGE MEDIUM
JP2021516563A JP2022502751A (ja) 2018-12-10 2019-11-27 顔キーポイント検出方法、装置、コンピュータ機器及びコンピュータプログラム
US17/184,368 US11915514B2 (en) 2018-12-10 2021-02-24 Method and apparatus for detecting facial key points, computer device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811503905.5A CN109657583B (zh) 2018-12-10 2018-12-10 脸部关键点检测方法、装置、计算机设备和存储介质
CN201811503905.5 2018-12-10

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/184,368 Continuation US11915514B2 (en) 2018-12-10 2021-02-24 Method and apparatus for detecting facial key points, computer device, and storage medium

Publications (1)

Publication Number Publication Date
WO2020119458A1 true WO2020119458A1 (zh) 2020-06-18

Family

ID=66113105

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/121237 WO2020119458A1 (zh) 2018-12-10 2019-11-27 脸部关键点检测方法、装置、计算机设备和存储介质

Country Status (6)

Country Link
US (1) US11915514B2 (zh)
EP (1) EP3839807A4 (zh)
JP (1) JP2022502751A (zh)
KR (1) KR102592270B1 (zh)
CN (1) CN109657583B (zh)
WO (1) WO2020119458A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113761994A (zh) * 2020-08-07 2021-12-07 北京沃东天骏信息技术有限公司 处理图像的方法、装置、设备和计算机可读介质
CN114463534A (zh) * 2021-12-28 2022-05-10 佳都科技集团股份有限公司 一种目标关键点检测方法、装置、设备及存储介质
US11574500B2 (en) 2020-09-08 2023-02-07 Samsung Electronics Co., Ltd. Real-time facial landmark detection
CN115802160A (zh) * 2023-02-03 2023-03-14 北京润谊医疗管理顾问有限公司 一种眼底图像的智能拍摄方法及系统

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109657583B (zh) 2018-12-10 2021-10-22 腾讯科技(深圳)有限公司 脸部关键点检测方法、装置、计算机设备和存储介质
CN112016371B (zh) * 2019-05-31 2022-01-14 广州市百果园信息技术有限公司 人脸关键点检测方法、装置、设备及存储介质
EP3971820A4 (en) * 2019-09-30 2022-08-10 Beijing Sensetime Technology Development Co., Ltd. IMAGE PROCESSING METHOD, DEVICE AND ELECTRONIC DEVICE
CN110826534B (zh) * 2019-11-30 2022-04-05 杭州小影创新科技股份有限公司 一种基于局部主成分分析的人脸关键点检测方法及系统
CN111598051B (zh) * 2020-06-16 2023-11-14 腾讯科技(深圳)有限公司 一种脸部验证方法、装置、设备及可读存储介质
CN112257582A (zh) * 2020-10-21 2021-01-22 北京字跳网络技术有限公司 脚部姿态确定方法、装置、设备和计算机可读介质
CN112967235A (zh) * 2021-02-19 2021-06-15 联影智能医疗科技(北京)有限公司 图像检测方法、装置、计算机设备和存储介质
CN113344890B (zh) * 2021-06-18 2024-04-12 北京百度网讯科技有限公司 医学图像识别方法、识别模型训练方法及装置
CN113449657B (zh) * 2021-07-05 2022-08-30 中山大学 一种基于人脸关键点的深度伪造人脸视频检测方法、系统及介质
KR102529209B1 (ko) 2021-09-16 2023-05-09 주식회사 이엔터 얼굴 인식을 통한 특수효과 연출 시스템 및 그 방법

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106845377A (zh) * 2017-01-10 2017-06-13 北京小米移动软件有限公司 人脸关键点定位方法及装置
CN106991388A (zh) * 2017-03-27 2017-07-28 中国科学院自动化研究所 关键点定位方法
CN107945219A (zh) * 2017-11-23 2018-04-20 翔创科技(北京)有限公司 脸部图像对齐方法、计算机程序、存储介质及电子设备
CN109492531A (zh) * 2018-10-10 2019-03-19 深圳前海达闼云端智能科技有限公司 人脸图像关键点提取方法、装置、存储介质及电子设备
CN109657583A (zh) * 2018-12-10 2019-04-19 腾讯科技(深圳)有限公司 脸部关键点检测方法、装置、计算机设备和存储介质

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002007096A1 (fr) * 2000-07-17 2002-01-24 Mitsubishi Denki Kabushiki Kaisha Dispositif de recherche d'un point caracteristique sur un visage
JP4532419B2 (ja) * 2006-02-22 2010-08-25 富士フイルム株式会社 特徴点検出方法および装置並びにプログラム
JP2012014557A (ja) * 2010-07-02 2012-01-19 Fujitsu Ltd 特徴点判定装置、特徴点判定方法および特徴点判定プログラム
CN102194131B (zh) * 2011-06-01 2013-04-10 华南理工大学 基于五官几何比例特征的快速人脸识别方法
CN102799868B (zh) * 2012-07-10 2014-09-10 吉林禹硕动漫游戏科技股份有限公司 人脸面部关键表情识别方法
US9152847B2 (en) * 2012-11-27 2015-10-06 Adobe Systems Incorporated Facial landmark localization by exemplar-based graph matching
US8948517B2 (en) * 2013-03-01 2015-02-03 Adobe Systems Incorporated Landmark localization via visual search
JP6202938B2 (ja) * 2013-08-22 2017-09-27 キヤノン株式会社 画像認識装置および画像認識方法
WO2016026135A1 (en) * 2014-08-22 2016-02-25 Microsoft Technology Licensing, Llc Face alignment with shape regression
CN106295476B (zh) * 2015-05-29 2019-05-17 腾讯科技(深圳)有限公司 人脸关键点定位方法和装置
CN105205462A (zh) * 2015-09-18 2015-12-30 北京百度网讯科技有限公司 一种拍照提示方法及装置
WO2017100929A1 (en) * 2015-12-15 2017-06-22 Applied Recognition Inc. Systems and methods for authentication using digital signature with biometrics
CN106295567B (zh) * 2016-08-10 2019-04-12 腾讯科技(深圳)有限公司 一种关键点的定位方法及终端
CN108280388A (zh) * 2017-01-06 2018-07-13 富士通株式会社 训练面部检测模型的方法和装置以及面部检测方法和装置
US11068741B2 (en) * 2017-12-28 2021-07-20 Qualcomm Incorporated Multi-resolution feature description for object recognition
CN110731076A (zh) * 2018-07-31 2020-01-24 深圳市大疆创新科技有限公司 一种拍摄处理方法、设备及存储介质

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106845377A (zh) * 2017-01-10 2017-06-13 北京小米移动软件有限公司 人脸关键点定位方法及装置
CN106991388A (zh) * 2017-03-27 2017-07-28 中国科学院自动化研究所 关键点定位方法
CN107945219A (zh) * 2017-11-23 2018-04-20 翔创科技(北京)有限公司 脸部图像对齐方法、计算机程序、存储介质及电子设备
CN109492531A (zh) * 2018-10-10 2019-03-19 深圳前海达闼云端智能科技有限公司 人脸图像关键点提取方法、装置、存储介质及电子设备
CN109657583A (zh) * 2018-12-10 2019-04-19 腾讯科技(深圳)有限公司 脸部关键点检测方法、装置、计算机设备和存储介质

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113761994A (zh) * 2020-08-07 2021-12-07 北京沃东天骏信息技术有限公司 处理图像的方法、装置、设备和计算机可读介质
US11574500B2 (en) 2020-09-08 2023-02-07 Samsung Electronics Co., Ltd. Real-time facial landmark detection
CN114463534A (zh) * 2021-12-28 2022-05-10 佳都科技集团股份有限公司 一种目标关键点检测方法、装置、设备及存储介质
CN115802160A (zh) * 2023-02-03 2023-03-14 北京润谊医疗管理顾问有限公司 一种眼底图像的智能拍摄方法及系统
CN115802160B (zh) * 2023-02-03 2023-04-11 北京润谊医疗管理顾问有限公司 一种眼底图像的智能拍摄方法及系统

Also Published As

Publication number Publication date
KR102592270B1 (ko) 2023-10-19
KR20210060554A (ko) 2021-05-26
CN109657583B (zh) 2021-10-22
CN109657583A (zh) 2019-04-19
US20210182537A1 (en) 2021-06-17
US11915514B2 (en) 2024-02-27
EP3839807A1 (en) 2021-06-23
EP3839807A4 (en) 2021-11-10
JP2022502751A (ja) 2022-01-11

Similar Documents

Publication Publication Date Title
WO2020119458A1 (zh) 脸部关键点检测方法、装置、计算机设备和存储介质
US10679046B1 (en) Machine learning systems and methods of estimating body shape from images
WO2021068323A1 (zh) 多任务面部动作识别模型训练方法、多任务面部动作识别方法、装置、计算机设备和存储介质
CN111354079A (zh) 三维人脸重建网络训练及虚拟人脸形象生成方法和装置
WO2022078041A1 (zh) 遮挡检测模型的训练方法及人脸图像的美化处理方法
CN112580416A (zh) 基于深暹罗网络和贝叶斯优化的视频跟踪
WO2022001236A1 (zh) 三维模型生成方法、装置、计算机设备及存储介质
CN111598998A (zh) 三维虚拟模型重建方法、装置、计算机设备和存储介质
CN109584327B (zh) 人脸老化模拟方法、装置以及设备
CN108764143B (zh) 图像处理方法、装置、计算机设备和存储介质
Kittler et al. 3D morphable face models and their applications
JP2023545200A (ja) パラメータ推定モデルの訓練方法、パラメータ推定モデルの訓練装置、デバイスおよび記憶媒体
US20130314437A1 (en) Image processing apparatus, image processing method, and computer program
CN110287836B (zh) 图像分类方法、装置、计算机设备和存储介质
US10977767B2 (en) Propagation of spot healing edits from one image to multiple images
JP2024501986A (ja) 3次元顔再構築の方法、3次元顔再構築の装置、デバイスおよび記憶媒体
EP4322056A1 (en) Model training method and apparatus
CN113570684A (zh) 图像处理方法、装置、计算机设备和存储介质
CN113469092B (zh) 字符识别模型生成方法、装置、计算机设备和存储介质
WO2022033513A1 (zh) 目标分割方法、装置、计算机可读存储介质及计算机设备
CN109544516B (zh) 图像检测方法及装置
Rehman et al. Face detection and tracking using hybrid margin-based ROI techniques
WO2022179603A1 (zh) 一种增强现实方法及其相关设备
WO2021127916A1 (zh) 脸部情感识别方法、智能装置和计算机可读存储介质
Hu et al. Face reenactment via generative landmark guidance

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19897431

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021516563

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2019897431

Country of ref document: EP

Effective date: 20210318

ENP Entry into the national phase

Ref document number: 20217011194

Country of ref document: KR

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE