CN114463815A - Facial expression capturing method based on face key points - Google Patents

Facial expression capturing method based on face key points

Info

Publication number
CN114463815A
CN114463815A
Authority
CN
China
Prior art keywords
expression
key points
face
facial
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210102416.9A
Other languages
Chinese (zh)
Inventor
杨帆
郝强
潘鑫淼
胡建国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Zhenshi Intelligent Technology Co Ltd
Original Assignee
Nanjing Zhenshi Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Zhenshi Intelligent Technology Co Ltd filed Critical Nanjing Zhenshi Intelligent Technology Co Ltd
Priority to CN202210102416.9A priority Critical patent/CN114463815A/en
Publication of CN114463815A publication Critical patent/CN114463815A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects

Abstract

The invention discloses a facial expression capturing method based on face key points, comprising the following steps: 1) bind expressions to the selected 3DMM model using a facial expression binding algorithm; 2) generate random 3D faces, normalize the recorded face key points to obtain normalized key point sets, and form a training data set from the expression coefficients and the corresponding normalized key point sets; 3) build a multilayer perceptron model and iteratively optimize it by gradient descent to obtain an expression recognition model; 4) detect key points on the face image to be recognized, screen out the relevant key points, and normalize them to obtain normalized key points; 5) input the normalized face key points into the expression recognition model and predict the current degree of change of the facial expression. With this method, fine facial expressions can be analyzed using only a visible-light camera and a machine learning model; the cost is low, the real-time performance is good, and large-scale application is facilitated.

Description

Facial expression capturing method based on face key points
Technical Field
The invention relates to the technical field of facial expression capture, in particular to a facial expression capture method based on face key points.
Background
Existing facial expression capture methods require placing sensors on the face or using a structured-light camera; the equipment is expensive, and wearing it is uncomfortable and interferes with facial movement. Computer-vision-based methods, constrained by the difficulty of data acquisition, can recognize only a few coarse expressions such as happiness and sadness, and struggle with fine expressions such as frowning or raising the corners of the mouth. A facial expression capture method based on face key points is therefore needed, so that fine facial expressions can be analyzed in real time with only a visible-light camera and a machine learning model.
Disclosure of Invention
The invention provides a facial expression capturing method based on face key points, aiming to solve the problems described in the background above.
To this end, an embodiment of the present invention provides a facial expression capturing method based on face key points, comprising the following steps:
1) binding expressions to the selected 3DMM model using a facial expression binding algorithm;
2) generating random 3D faces, normalizing the recorded face key points to obtain normalized key point sets, and forming a training data set from the expression coefficients and the corresponding normalized key point sets;
3) building a multilayer perceptron model and iteratively optimizing it by gradient descent to obtain an expression recognition model;
4) detecting key points on the face image to be recognized, screening out the relevant key points, and normalizing them to obtain normalized key points;
5) inputting the normalized face key points into the expression recognition model and predicting the current degree of change of the facial expression.
Further, step 1) includes selecting a 3DMM model and binding the required expressions to it using a facial expression binding algorithm, where the degree of change of each expression is represented by a value from 0 to 1.
Further, the 3DMM model may be a BFM model or an LSFM model.
Further, step 2) comprises:
randomly adjusting the identity coefficients, expression coefficients, and rotation angle to generate random 3D faces, with multiple random groups of expression coefficients corresponding to multiple 3D faces;
projecting the 3D faces onto a two-dimensional plane, recording the face key point coordinates on the plane, and normalizing all key points to obtain normalized key point sets; and
forming a training data set from the groups of expression coefficients and the corresponding normalized key point sets.
Further, step 3) comprises:
building a multilayer perceptron model whose input is the horizontal and vertical coordinates of the key points and whose output is the predicted expression; and
computing the L1 loss between the predicted and true expressions of each batch of training data and iteratively optimizing the multilayer perceptron by gradient descent to obtain the expression recognition model.
Further, step 4) comprises:
detecting key points on the face image to be recognized with a face key point detection tool, screening out the key points corresponding to the training data, and normalizing the screened key point coordinates to obtain normalized key points.
Further, step 5) comprises:
inputting the normalized face key points into the expression recognition model and predicting the current degree of change of the facial expression.
Furthermore, the expressions include raising the eyebrows, furrowing the brow, closing the eyes, and opening the mouth.
The embodiment of the present invention further provides a terminal device, which includes a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor, when executing the computer program, implements the steps of any one of the above facial expression capturing methods based on face key points.
The embodiment of the present invention further provides a computer-readable storage medium, which includes a stored computer program, wherein the computer program, when run, controls a device on which the computer-readable storage medium is located to execute the steps of any one of the above facial expression capturing methods based on face key points.
Beneficial effects: a facial expression capturing method based on face key points is provided, with which fine facial expressions can be analyzed using only a visible-light camera and a machine learning model. The method is low-cost, runs in real time, and is suitable for large-scale deployment.
Drawings
FIG. 1 is a schematic flow chart of a facial expression capturing method based on face key points according to the present invention;
FIG. 2 is a schematic diagram of the coordinates of the 42 key points on the two-dimensional projection of the face in the facial expression capturing method based on face key points provided by the present invention;
fig. 3 is a schematic structural diagram of a preferred embodiment of a terminal device provided in the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a flowchart illustrating a facial expression capturing method based on face key points according to a preferred embodiment of the present invention. The facial expression capturing method based on the face key points comprises the following steps:
s1, binding blenshapes (expression) for the 3DMM model: the Face 3DMM (3D portable Models) is a parameterized Face Model, and can generate a specific 3D Face shape by adjusting identity parameters, expression parameters, and rotation angles, and common Face 3DMM Models include a BFM (base Face Model), an LSFM (Large Scale Face Model), and the like. Selecting a proper 3DMM model, using a facial expression binding algorithm-Example-based facial profiling, binding M (M is a positive integer) expression blenshapes required by the 3DMM model, wherein the expression can be decoupled fine expressions such as eyebrow lifting, eyebrow wrinkling, eye closing, mouth opening and the like, each blenshape is a value from 0 to 1 and represents the change degree of the expression, and the 3DMM model after the blenshapes are calibrated can control the fine expression of the 3D face by adjusting the blenshapes coefficient;
S2, generating training data: randomly adjust the identity coefficients, blendshape coefficients, and rotation angle to generate random 3D faces. Randomly sample N groups (N a positive integer greater than 100) of blendshape coefficients $(B_0, B_1, \ldots, B_{N-1})$, corresponding to N 3D faces $(\mathrm{Face}_0, \mathrm{Face}_1, \ldots, \mathrm{Face}_{N-1})$, where $B_n = (b_n^0, b_n^1, \ldots, b_n^{M-1})$ is the nth group of random expression coefficients (n an integer from 0 to N-1) and $b_n^m$ is the mth expression coefficient component ($b_n^m \in [0, 1]$, m an integer from 0 to M-1). Project the 3D faces $(\mathrm{Face}_0, \mathrm{Face}_1, \ldots, \mathrm{Face}_{N-1})$ onto a two-dimensional plane and record the coordinates of the 42 key points near the eyebrows (10), eyes (12), nose (9), lips (8), and chin (3) of each projected face, as shown in fig. 2. The set of projected key points corresponding to the nth 3D face $\mathrm{Face}_n$ is $lmks_n = \{lmk_n^k = (x_n^k, y_n^k)\}$, where k is an integer from 0 to 41 and $x_n^k$, $y_n^k$ are the abscissa and ordinate of the kth key point. Normalize the key points: within the key point set $lmks_n$, denote the abscissa of the leftmost key point by $left_n$, the abscissa of the rightmost key point by $right_n$, the ordinate of the uppermost key point by $top_n$, and the ordinate of the lowermost key point by $bottom_n$. The normalized nth key point set is $\widehat{lmks}_n = \{\widehat{lmk}_n^k = (\hat{x}_n^k, \hat{y}_n^k)\}$, where the kth key point coordinates are

$$\hat{x}_n^k = \frac{x_n^k - left_n}{right_n - left_n}, \qquad \hat{y}_n^k = \frac{y_n^k - top_n}{bottom_n - top_n}.$$

The N groups of blendshape coefficients and the corresponding normalized key point sets form the training data set $\{(B_n, \widehat{lmks}_n) \mid n = 0, 1, \ldots, N-1\}$.
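A hedged sketch of this data-generation step follows. The stand-in keypoint model, the toy projection, and the coefficient ranges are illustrative assumptions; a real pipeline would project the 42 designated keypoint vertices of the rigged 3DMM from step S1.

```python
import numpy as np

N, M, K = 1000, 16, 42                 # samples, blendshapes, keypoints (illustrative)
rng = np.random.default_rng(1)

# Stand-in 3D keypoint model: mean positions plus one random delta per blendshape.
mean_kp = rng.standard_normal((K, 3))
exp_deltas = rng.standard_normal((M, K, 3)) * 0.1

def project(kp3d, yaw):
    """Toy orthographic projection after a rotation about the y-axis."""
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return (kp3d @ R.T)[:, :2]         # drop the depth coordinate

def normalize(lmks):
    """Min-max normalization using the left/right/top/bottom extremes, as in S2."""
    left, right = lmks[:, 0].min(), lmks[:, 0].max()
    top, bottom = lmks[:, 1].min(), lmks[:, 1].max()
    out = np.empty_like(lmks)
    out[:, 0] = (lmks[:, 0] - left) / (right - left)
    out[:, 1] = (lmks[:, 1] - top) / (bottom - top)
    return out

B = rng.uniform(0.0, 1.0, size=(N, M))           # N random groups of coefficients
X = np.empty((N, K, 2))
for n in range(N):
    kp3d = mean_kp + np.tensordot(B[n], exp_deltas, axes=1)
    X[n] = normalize(project(kp3d, yaw=rng.uniform(-0.5, 0.5)))
# Training pairs: inputs X[n].ravel() (84 values), targets B[n] (M values).
```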
S3, training the expression recognition model: build a multilayer perceptron with four fully connected layers (the stated neuron counts are 84, 256, and M). The model takes the horizontal and vertical coordinates of the 42 key points (84 values) as input and outputs the M predicted expression blendshapes. Compute the L1 loss between the predicted and true blendshapes for each batch of training data and optimize the model iteratively by gradient descent; the trained model can then accurately predict the expression blendshapes from the input key points.
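The training loop might look like the following PyTorch sketch. The description names four fully connected layers but lists only the sizes 84, 256, and M, so the remaining hidden widths are assumed here to be 256; the final sigmoid (keeping predictions in [0, 1]) is likewise an assumption.

```python
import torch
import torch.nn as nn

M, N = 16, 1000                          # blendshape and sample counts (illustrative)

model = nn.Sequential(
    nn.Linear(84, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),      # assumed hidden width
    nn.Linear(256, 256), nn.ReLU(),      # assumed hidden width
    nn.Linear(256, M), nn.Sigmoid(),     # sigmoid keeps outputs in [0, 1] (assumption)
)

# Stand-ins for the S2 training pairs: 84 normalized coordinates -> M coefficients.
x = torch.rand(N, 84)
y = torch.rand(N, M)

opt = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.L1Loss()                    # L1 loss, as in the description

for epoch in range(100):
    perm = torch.randperm(N)
    for i in range(0, N, 64):            # mini-batch gradient descent
        idx = perm[i:i + 64]
        loss = loss_fn(model(x[idx]), y[idx])
        opt.zero_grad(); loss.backward(); opt.step()
```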
S4, detecting face key points: use a face key point detection tool (e.g. Dlib or face-alignment) to detect key points on the face image to be recognized and screen out the 42 key points $(lmk_0, lmk_1, \ldots, lmk_{41})$ corresponding to the training data, where $lmk_k = (x_k, y_k)$ is the kth key point and $x_k$, $y_k$ are its abscissa and ordinate. Normalize the key point coordinates: among the 42 key points, denote the abscissa of the leftmost key point by $left$, the abscissa of the rightmost by $right$, the ordinate of the uppermost by $top$, and the ordinate of the lowermost by $bottom$. The 42 normalized key points are $\widehat{lmks} = \{\widehat{lmk}_k = (\hat{x}_k, \hat{y}_k)\}$, where the kth key point coordinates are

$$\hat{x}_k = \frac{x_k - left}{right - left}, \qquad \hat{y}_k = \frac{y_k - top}{bottom - top}.$$
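As a sketch of this detection-and-normalization step, the snippet below uses the face-alignment package (one of the tools named above) to detect the 68 standard landmarks and keep 42 of them. The particular index subset is an illustrative assumption; in practice it must match exactly the keypoints used in the training data.

```python
import numpy as np
import face_alignment

# Older face-alignment releases spell the enum LandmarksType._2D instead of TWO_D.
fa = face_alignment.FaceAlignment(face_alignment.LandmarksType.TWO_D, device="cpu")

def detect_normalized_keypoints(image):
    """Detect 68 landmarks, keep 42 (assumed subset), and min-max normalize them."""
    lmks68 = fa.get_landmarks(image)[0]              # (68, 2), first detected face
    keep = (list(range(17, 27))                      # eyebrows: 10 points
            + list(range(36, 48))                    # eyes: 12 points
            + list(range(27, 36))                    # nose: 9 points
            + [48, 51, 54, 57, 60, 62, 64, 66]       # lips: 8 points
            + [7, 8, 9])                             # chin: 3 points
    lmks = lmks68[keep].astype(np.float64)
    left, right = lmks[:, 0].min(), lmks[:, 0].max()
    top, bottom = lmks[:, 1].min(), lmks[:, 1].max()
    lmks[:, 0] = (lmks[:, 0] - left) / (right - left)
    lmks[:, 1] = (lmks[:, 1] - top) / (bottom - top)
    return lmks                                      # (42, 2) normalized coordinates
```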
S5, capturing the facial expression: input the normalized face key points $\widehat{lmks}$ into the trained model and predict the degree of change of each of the M facial expressions at that moment.
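Putting S4 and S5 together, an inference call might look like the sketch below; `detect_normalized_keypoints` and `model` are assumed to come from the two preceding sketches, and the image path is illustrative.

```python
import cv2
import numpy as np
import torch

image = np.ascontiguousarray(cv2.imread("face.jpg")[:, :, ::-1])  # BGR -> RGB
lmks = detect_normalized_keypoints(image)                         # (42, 2)
with torch.no_grad():
    coeffs = model(torch.from_numpy(lmks.reshape(1, -1)).float())[0]
for m, v in enumerate(coeffs):
    print(f"expression {m}: degree of change {float(v):.2f}")
```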
Referring to fig. 3, fig. 3 is a schematic structural diagram of a terminal device according to a preferred embodiment of the present invention. The terminal device comprises a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, wherein the processor executes the computer program to implement the steps of the facial expression capture method based on the face key points according to any one of the embodiments.
Preferably, the computer program may be divided into one or more modules/units (e.g., computer program 1, computer program 2, …) that are stored in the memory and executed by the processor to implement the invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program in the terminal device.
The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like; a general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The processor is the control center of the terminal device and connects the various parts of the terminal device through various interfaces and lines.
The memory mainly includes a program storage area and a data storage area, where the program storage area may store an operating system, an application program required for at least one function, and the like, and the data storage area may store related data and the like. In addition, the memory may be a high-speed random access memory or a non-volatile memory, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash Card, or other non-volatile solid-state memory devices.
It should be noted that the terminal device may include, but is not limited to, a processor and a memory, and those skilled in the art will understand that the structural diagram of fig. 3 is only an example of the terminal device and does not constitute a limitation of the terminal device, and may include more or less components than those shown, or combine some components, or different components.
The embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium includes a stored computer program, where when the computer program runs, a device on which the computer-readable storage medium is located is controlled to execute the steps of the facial expression capture method based on the facial keypoints according to any one of the above embodiments.
The embodiment of the invention provides a facial expression capturing method based on human face key points, which can realize the analysis of the fine expressions of the human face only by combining a visible light camera with a machine learning model.
It should be noted that the above-described system embodiments are merely illustrative, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiment of the system provided by the present invention, the connection relationship between the modules indicates that there is a communication connection between them, and may be specifically implemented as one or more communication buses or signal lines. One of ordinary skill in the art can understand and implement it without inventive effort.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims (10)

1. A facial expression capturing method based on face key points, characterized by comprising the following steps:
1) binding expressions to the selected 3DMM model using a facial expression binding algorithm;
2) generating random 3D faces, normalizing the recorded face key points to obtain normalized key point sets, and forming a training data set from the expression coefficients and the corresponding normalized key point sets;
3) building a multilayer perceptron model and iteratively optimizing it by gradient descent to obtain an expression recognition model;
4) detecting key points on the face image to be recognized, screening out the relevant key points, and normalizing them to obtain normalized key points;
5) inputting the normalized face key points into the expression recognition model and predicting the current degree of change of the facial expression.
2. The facial expression capturing method based on face key points according to claim 1, wherein step 1) further comprises selecting a 3DMM model and binding the required expressions to the 3DMM model using a facial expression binding algorithm, each expression adopting a value from 0 to 1 to represent its degree of change.
3. The facial expression capturing method based on face key points according to claim 1, wherein the 3DMM model comprises a BFM model or an LSFM model.
4. The facial expression capturing method based on face key points according to claim 1, wherein step 2) further comprises:
randomly adjusting the identity coefficients, expression coefficients, and rotation angle to generate random 3D faces, with multiple random groups of expression coefficients corresponding to multiple 3D faces;
projecting the 3D faces onto a two-dimensional plane, recording the face key point coordinates on the plane, and normalizing all key points to obtain normalized key point sets; and
forming a training data set from the groups of expression coefficients and the corresponding normalized key point sets.
5. The facial expression capturing method based on face key points according to claim 1, wherein step 3) further comprises:
building a multilayer perceptron model whose input is the horizontal and vertical coordinates of the key points and whose output is the predicted expression; and
computing the L1 loss between the predicted and true expressions of each batch of training data and iteratively optimizing the multilayer perceptron by gradient descent to obtain the expression recognition model.
6. The facial expression capturing method based on face key points according to claim 5, wherein step 4) further comprises:
detecting key points on the face image to be recognized with a face key point detection tool, screening out the key points corresponding to the training data, and normalizing the screened key point coordinates to obtain normalized key points.
7. The facial expression capturing method based on face key points according to claim 6, wherein step 5) further comprises:
inputting the normalized face key points into the expression recognition model and predicting the current degree of change of the facial expression.
8. The facial expression capturing method based on face key points according to claim 1, wherein the expressions comprise raising the eyebrows, furrowing the brow, closing the eyes, and opening the mouth.
9. A terminal device comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor, when executing the computer program, implements the steps of the facial expression capturing method based on face key points according to any one of claims 1 to 8.
10. A computer-readable storage medium, comprising a stored computer program, wherein the computer program, when run, controls a device on which the computer-readable storage medium is located to perform the steps of the facial expression capturing method based on face key points according to any one of claims 1 to 8.
CN202210102416.9A 2022-01-27 2022-01-27 Facial expression capturing method based on face key points Pending CN114463815A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210102416.9A CN114463815A (en) 2022-01-27 2022-01-27 Facial expression capturing method based on face key points

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210102416.9A CN114463815A (en) 2022-01-27 2022-01-27 Facial expression capturing method based on face key points

Publications (1)

Publication Number Publication Date
CN114463815A (en) 2022-05-10

Family

ID=81410905

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210102416.9A Pending CN114463815A (en) 2022-01-27 2022-01-27 Facial expression capturing method based on face key points

Country Status (1)

Country Link
CN (1) CN114463815A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114627218A (en) * 2022-05-16 2022-06-14 成都市谛视无限科技有限公司 Human face fine expression capturing method and device based on virtual engine
CN116612512A (en) * 2023-02-02 2023-08-18 北京甲板智慧科技有限公司 Facial expression image processing method and device based on monocular RGB camera

Similar Documents

Publication Publication Date Title
CN109325437B (en) Image processing method, device and system
Maung Real-time hand tracking and gesture recognition system using neural networks
AU2012227166B2 (en) Face feature vector construction
CN111160269A (en) Face key point detection method and device
WO2020103700A1 (en) Image recognition method based on micro facial expressions, apparatus and related device
CN114463815A (en) Facial expression capturing method based on face key points
CN106687989A (en) Method and system of facial expression recognition using linear relationships within landmark subsets
CN111967392A (en) Face recognition neural network training method, system, equipment and storage medium
CN109145766A (en) Model training method, device, recognition methods, electronic equipment and storage medium
CN106778474A (en) 3D human body recognition methods and equipment
CN110598638A (en) Model training method, face gender prediction method, device and storage medium
CN106651915A (en) Target tracking method of multi-scale expression based on convolutional neural network
CN110222780A (en) Object detecting method, device, equipment and storage medium
CN110956082A (en) Face key point detection method and detection system based on deep learning
CN115050064A (en) Face living body detection method, device, equipment and medium
CN111680544B (en) Face recognition method, device, system, equipment and medium
WO2021042544A1 (en) Facial verification method and apparatus based on mesh removal model, and computer device and storage medium
CN110163095B (en) Loop detection method, loop detection device and terminal equipment
CN111382791A (en) Deep learning task processing method, image recognition task processing method and device
Forczmański et al. Comparative analysis of simple facial features extractors
Kim et al. Convolutional neural network architectures for gaze estimation on mobile devices
CN114360031B (en) Head pose estimation method, computer device, and storage medium
CN113362249B (en) Text image synthesis method, text image synthesis device, computer equipment and storage medium
CN114782592A (en) Cartoon animation generation method, device and equipment based on image and storage medium
CN107194980A (en) Faceform's construction method, device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination