CN109377544B - Human face three-dimensional image generation method and device and readable medium

Info

Publication number: CN109377544B (granted); earlier publication: CN109377544A
Application number: CN201811459413.0A
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: face, three-dimensional image, target, model
Legal status: Active
Inventors: Chen Yajing (陈雅静), Lin Xiangkai (林祥凯), Song Yibing (宋奕兵), Ling Yonggen (凌永根), Bao Linchao (暴林超), Liu Wei (刘威)
Original and current assignee: Tencent Technology (Shenzhen) Co., Ltd.
Application filed by Tencent Technology (Shenzhen) Co., Ltd.; priority to CN201811459413.0A

Classifications

    • G06T 15/00: 3D [Three Dimensional] image rendering (G06T: image data processing or generation, in general)
    • G06V 40/161: Human faces, e.g. facial parts, sketches or expressions; detection, localisation, normalisation (G06V 40/16, under G06V 40/10: human or animal bodies)
    • G06V 40/172: Human faces; classification, e.g. identification

Abstract

The invention discloses a method, a device and a readable medium for generating a human face three-dimensional image, belonging to the technical field of image processing. In the method and the device, an adjustable training model recognizes facial feature parameters, photographing environment features and photographing parameter information from a target face two-dimensional image; a target face three-dimensional model is reconstructed from the facial feature parameters and the three-dimensional base models in a standard face template library, and the photographing environment features and photographing parameter information are simulated to render the target face three-dimensional model into an intermediate face two-dimensional image. When it is determined that the target face two-dimensional image and the intermediate face two-dimensional image do not satisfy the consistency condition, the training model is adjusted and the step of obtaining the intermediate face two-dimensional image from the target face two-dimensional image is repeated with the adjusted model; when the consistency condition is satisfied, the target face three-dimensional image is obtained based on the newly reconstructed target face three-dimensional model, which improves the fidelity of the target face three-dimensional image.

Description

Human face three-dimensional image generation method and device and readable medium
Technical Field
The invention relates to the technical field of image processing, in particular to a method and a device for generating a human face three-dimensional image and a readable medium.
Background
The human face is the most expressive part of the human body and is both individual and diverse. With the development of various kinds of software, a two-dimensional face image alone is often insufficient to represent and convey certain information, while a three-dimensional face image can convey, stereoscopically and vividly, information that a two-dimensional face image cannot.
In the prior art, when a three-dimensional face image is generated, a neural network directly fits the parameters of a real three-dimensional face image by learning the features of an input two-dimensional face image, so as to obtain the three-dimensional face image.
Therefore, how to improve the fidelity of the generated three-dimensional face image is one of the primary problems to be considered.
Disclosure of Invention
The embodiment of the invention provides a method and a device for generating a human face three-dimensional image and a readable medium, which are used for improving the fidelity of the generated human face three-dimensional image.
In a first aspect, an embodiment of the present invention provides a method for generating a three-dimensional image of a human face, including:
according to a target face two-dimensional image, obtaining an intermediate face two-dimensional image by using an adjustable training model, wherein the intermediate face two-dimensional image is obtained by reconstructing a target face three-dimensional model from facial feature parameters recognized from the target face two-dimensional image and the three-dimensional base models in a standard face template library, and by rendering the target face three-dimensional model while simulating the photographing environment features and photographing parameter information recognized from the target face two-dimensional image;
determining whether the target face two-dimensional image and the intermediate face two-dimensional image satisfy a set consistency condition;
when it is determined that the consistency condition is not satisfied, adjusting the training model and returning, with the adjusted training model, to the step of obtaining the intermediate face two-dimensional image from the target face two-dimensional image;
and when it is determined that the consistency condition is satisfied, obtaining a target face three-dimensional image based on the newly reconstructed target face three-dimensional model.
In a second aspect, an embodiment of the present invention provides a device for generating a three-dimensional image of a human face, including:
the system comprises an obtaining unit, a processing unit and a processing unit, wherein the obtaining unit is used for obtaining an intermediate human face two-dimensional image by utilizing an adjustable training model according to a target human face two-dimensional image, the intermediate human face two-dimensional image is obtained by reconstructing a target human face three-dimensional model according to facial feature parameters recognized from the target human face two-dimensional image and a three-dimensional base model in a standard human face template library, and simulating photographing environment features and photographing parameter information recognized from the target human face two-dimensional image to render the target human face three-dimensional model;
the determining unit is used for determining whether the target face two-dimensional image and the middle face two-dimensional image meet set consistency conditions or not;
an adjusting unit, configured to adjust the training model and use the adjusted training model to return to the step of obtaining the intermediate two-dimensional face image according to the target two-dimensional face image again when the determining unit determines that the consistency condition is not satisfied;
and the generating unit is used for obtaining a target human face three-dimensional image based on the newly reconstructed target human face three-dimensional model when the determining unit determines that the consistency condition is met.
In a third aspect, an embodiment of the present invention provides a computer-readable medium, in which computer-executable instructions are stored, where the computer-executable instructions are used to execute the method for generating a three-dimensional image of a human face provided in the present application.
In a fourth aspect, an embodiment of the present invention provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to execute the method for generating the three-dimensional image of the human face provided by the application.
The invention has the beneficial effects that:
according to the method, the device and the readable medium for generating the human face three-dimensional image, after the target human face two-dimensional image is obtained, the target human face two-dimensional image is input into an adjustable training model, the training model can identify face characteristic parameters, photographing environment characteristics and photographing parameter information from the target human face two-dimensional image, then the target human face three-dimensional model is reconstructed based on the identified face characteristic parameters and a three-dimensional base model in a standard human face template library, then the identified photographing environment characteristics and the photographing parameter information are simulated to render the target human face three-dimensional model to obtain an intermediate human face two-dimensional image, then whether the target human face two-dimensional image and the intermediate human face two-dimensional image meet set consistency conditions is determined, and when the consistency conditions are determined not met, the training model is adjusted, and the step of obtaining the intermediate human face two-dimensional image according to the target human face two-dimensional image is returned again by using the adjusted training model; and when the consistency condition is determined to be met, obtaining a target human face three-dimensional image based on the newly reconstructed target human face three-dimensional model. By adopting the method, the adjustable training model is dynamically adjusted, so that the facial feature parameters obtained based on the method are closer to the target face in the target face two-dimensional image, the recognized photographing environment features and photographing parameter information are closer to the photographing environment parameters and the photographing parameter information when the target face two-dimensional image is photographed, the target face three-dimensional model obtained based on the method is closer to the target face, and the fidelity of the target three-dimensional image obtained based on the method is higher.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:
fig. 1 is a schematic view of an application scene of a human face three-dimensional image generation method provided by an embodiment of the invention;
fig. 2 is a schematic flow chart of a method for generating a three-dimensional image of a human face according to an embodiment of the present invention;
fig. 3 is one of the logic architecture diagrams of the method for generating a three-dimensional image of a human face according to the embodiment of the present invention;
fig. 4a is a schematic flowchart of a process for recognizing a shape feature parameter from a two-dimensional image of a target face by using an adjustable training model according to an embodiment of the present invention;
fig. 4b is a schematic diagram of feature points in a two-dimensional image of a target face according to an embodiment of the present invention;
fig. 5a is a schematic flow chart illustrating a process of recognizing an expression parameter from a two-dimensional image of a target face by using an adjustable training model according to an embodiment of the present invention;
FIG. 5b is a diagram of a basic expression base model of a standard human face according to an embodiment of the present invention;
fig. 6a is a schematic flow chart illustrating a process of recognizing a texture parameter from a two-dimensional image of a target face by using an adjustable training model according to an embodiment of the present invention;
FIG. 6b is a schematic diagram of three-dimensional texture base models of a human face at -70, -50, -30, -15, 0, 15, 30, 50, and 70 degrees about the Y-axis according to an embodiment of the present invention;
fig. 7 is a schematic flow chart of reconstructing a three-dimensional model of a target face according to an embodiment of the present invention;
fig. 8 is a schematic flow chart illustrating a process of determining whether the identity feature information respectively represented by the target two-dimensional face image and the intermediate two-dimensional face image is consistent;
fig. 9 is a schematic diagram of an effect of a three-dimensional model of a target face reconstructed based on the two-dimensional image of the target face in fig. 4a according to an embodiment of the present invention;
fig. 10 is a second logic architecture diagram of the method for generating a three-dimensional image of a human face according to the embodiment of the present invention;
fig. 11 is a schematic structural diagram of a human face three-dimensional image generation device according to an embodiment of the present invention;
fig. 12 is a schematic structural diagram of a computing device for implementing a method for generating a three-dimensional image of a human face according to an embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a method and a device for generating a human face three-dimensional image and a readable medium, which are used for improving the fidelity of the generated human face three-dimensional image.
The preferred embodiments of the present invention will be described below with reference to the accompanying drawings of the specification, it being understood that the preferred embodiments described herein are merely for illustrating and explaining the present invention, and are not intended to limit the present invention, and that the embodiments and features of the embodiments in the present invention may be combined with each other without conflict.
To facilitate understanding of the present invention, the technical terms involved are explained below:
1. VGG Face: a deep face recognition model proposed by the Visual Geometry Group (VGG) at the University of Oxford, whose core is a convolutional neural network. In the present invention, VGG Face can be used to extract the facial feature parameters, photographing environment features, photographing parameter information and the like of a face from a two-dimensional face image.
2. Convolutional Neural Network (CNN): a neural network for two-dimensional input recognition problems, consisting of one or more convolutional layers and pooling layers. It is characterized by weight sharing, which reduces the number of parameters, and by high invariance to translation, scaling, tilting and other forms of deformation.
3. Spherical harmonic illumination, a real-time rendering technique by which high quality rendering and shading effects can be produced.
4. Terminal device: an electronic device on which various applications can be installed and which can display objects provided by the installed applications; it may be mobile or fixed. Examples include a mobile phone, a tablet computer, various wearable devices, a vehicle-mounted device, a personal digital assistant (PDA), a point-of-sale (POS) terminal, a monitoring device in a subway station, or any other electronic device capable of implementing the above functions.
In the face three-dimensional image generation method adopted in the prior art, the face three-dimensional image is estimated from the real data of a face two-dimensional image. This can result in the face in the output three-dimensional image not looking like the same person as the face in the two-dimensional image, so the generated face three-dimensional image is not realistic enough.
In order to solve the problem of the low fidelity of images obtained by the face three-dimensional image generation methods adopted in the prior art, an embodiment of the present invention provides a solution. Referring to the application scene schematic diagram shown in fig. 1, a client capable of invoking a shooting function is installed on a user equipment 11. A user 10 shoots a face two-dimensional image of the user 10 through the client installed in the user equipment 11, and the face two-dimensional image is then sent to a server 12. After receiving the face two-dimensional image of the user 10, the server 12 carries out the face three-dimensional image generation method provided by the present invention; at this point the face two-dimensional image of the user 10 is the target face two-dimensional image. The implementation process is roughly as follows. The server 12 uses an adjustable training model to recognize, from the two-dimensional face image of the user 10, the facial feature parameters of the user 10 as well as the photographing environment features and photographing parameter information at the time the image was shot. The server 12 then reconstructs the three-dimensional face model of the user 10 based on the recognized facial feature parameters and the three-dimensional base models in a standard face template library, and renders the reconstructed model by simulating the recognized photographing environment features and photographing parameter information, obtaining an intermediate face two-dimensional image. The server 12 then determines whether the intermediate face two-dimensional image and the face two-dimensional image of the user 10 satisfy the set consistency condition. If they do not, the similarity between the intermediate face two-dimensional image and the face two-dimensional image of the user 10 is low, so the adjustable training model is adjusted, the step of obtaining the intermediate face two-dimensional image from the face two-dimensional image of the user 10 is re-executed with the adjusted model, and the consistency check is performed again; this repeats until the condition is satisfied. When the consistency condition is satisfied, the similarity between the intermediate face two-dimensional image and the face two-dimensional image of the user 10 is very high, that is, the face three-dimensional model reconstructed with the current training model is sufficiently good, so the three-dimensional face image of the user 10 is obtained based on the newly reconstructed three-dimensional face model, and subsequent operations can be executed with it. With this method, the realism of the obtained face three-dimensional model is greatly improved, and the fidelity of the face three-dimensional image obtained from it is further improved.
The user equipment 11 and the server 12 are communicatively connected through a network, which may be a local area network, a wide area network, or the like. The user device 11 may be a portable device (e.g., a mobile phone, a tablet, a notebook computer, etc.) or a Personal Computer (PC), the server 12 may be any device capable of providing internet services, and the client in the user device 11 may be a client capable of invoking a photographing function, a WeChat client, a QQ client, a game client, etc.
It should be noted that the method for generating a three-dimensional image of a human face provided by the present invention may also be applied to a client, and when the requirements on the processing capability and the memory of the user equipment 11 are not particularly high, the user equipment 11 may implement the method for generating a three-dimensional image of a human face provided by the present invention, and the specific implementation process is similar to that of the server 12, and will not be described in detail herein.
An application scenario of the face three-dimensional image generation method provided by the embodiment of the present invention is a client that can use face three-dimensional images, such as virtual reality, for example using a face three-dimensional image as the image of a character in a game. When a user opens a game client, the client prompts the user to create an in-game character using the user's own face. The user invokes the photographing function through the game client, shoots a face two-dimensional image, and the client sends this image to the game server. After obtaining the user's face two-dimensional image, the game server uses an adjustable training model to recognize the facial feature parameters, photographing environment features and photographing parameter information of the user's face from the image, reconstructs a face three-dimensional model from the recognized facial feature parameters and the three-dimensional base models in a standard face template library, then simulates the recognized photographing environment features and photographing parameter information to render the reconstructed model into an intermediate face two-dimensional image, and determines whether the intermediate face two-dimensional image and the user's face two-dimensional image satisfy the consistency condition. If the condition is not satisfied, the restoration degree of the facial feature parameters, photographing environment features and photographing parameter information obtained by the current training model is not high enough, so the training model is adjusted and the intermediate face two-dimensional image is obtained again based on the adjusted model, until the condition is satisfied; the game server then obtains the three-dimensional image of the user's face based on the newly reconstructed face three-dimensional model and uses it as the character's image in the game. The user can then play the game with this character image. With this method, the character image finally shown to the user has higher fidelity to the user's face, which improves user satisfaction.
Because the reconstructed face three-dimensional model is not necessarily highly similar to the user's face, the two-dimensional image rendered from the reconstructed model is called an intermediate face two-dimensional image. If the similarity is extremely high, the intermediate face two-dimensional image is very close to the user's face two-dimensional image, and when the consistency condition is satisfied the intermediate face two-dimensional image can be regarded as equivalent to the user's face two-dimensional image.
The following describes a method for generating a three-dimensional image of a human face according to an exemplary embodiment of the present invention with reference to fig. 2 to 12 in conjunction with an application scenario shown in fig. 1. It should be noted that the above application scenarios are merely illustrated for the convenience of understanding the spirit and principles of the present invention, and the embodiments of the present invention are not limited in this respect. Rather, embodiments of the present invention may be applied to any scenario where applicable.
As shown in fig. 2, a schematic flow chart of a method for generating a three-dimensional image of a human face according to an embodiment of the present invention, described by taking the server implementing the method as an example, the flow may include the following steps:
and S21, according to the target face two-dimensional image, recognizing face characteristic parameters, photographing environment characteristics and photographing parameter information from the target face two-dimensional image by using an adjustable training model.
Specifically, referring to the logic framework diagram shown in fig. 3, the two-dimensional image of the target face is input into an adjustable training model, and then based on the model, the facial feature parameters of the target face can be recognized from the two-dimensional image of the target face, and the photographing environment feature and the photographing parameter information when the two-dimensional image of the target face is photographed are recognized.
In specific implementation, the adjustable training model in the present invention may be, but is not limited to, a VGG Face encoder, a FaceNet network structure, and the like. When a VGG Face encoder is used, the adjustable VGG Face encoder can learn the facial feature parameters, photographing environment features and photographing parameter information of the target face from the target face two-dimensional image.
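As an illustration only (the patent discloses no code), the following Python sketch shows what an adjustable training model of this kind could look like: a small convolutional backbone standing in for a VGG-Face-style encoder, with one regression head per recognized quantity. All layer sizes, coefficient dimensions and names here are assumptions, not the patent's network.

```python
# A minimal sketch of an encoder that regresses the coefficient vectors
# described in step S21 from a face image. Dimensions are illustrative.
import torch
import torch.nn as nn

class FaceCoefficientEncoder(nn.Module):
    def __init__(self, n_shape=80, n_exp=46, n_tex=80, n_pose=6, n_light=27):
        super().__init__()
        # Stand-in for a VGG-Face-style convolutional backbone.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # One linear head per recognized quantity: shape, expression,
        # texture, photographing parameters (pose), illumination.
        self.heads = nn.ModuleDict({
            "shape": nn.Linear(128, n_shape),
            "expression": nn.Linear(128, n_exp),
            "texture": nn.Linear(128, n_tex),
            "pose": nn.Linear(128, n_pose),
            "light": nn.Linear(128, n_light),
        })

    def forward(self, image):  # image: (B, 3, H, W)
        feat = self.backbone(image)
        return {k: head(feat) for k, head in self.heads.items()}
```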
And S22, reconstructing a target human face three-dimensional model according to the recognized facial feature parameters and the three-dimensional base model in the standard human face template library.
Because a user's facial features can uniquely identify the user and thus the user's identity, and facial features generally comprise face shape, expression, texture and the like, obtaining the shape, expression and texture of the target face from the target face two-dimensional image is sufficient to determine the face and identify the user. The shape features of the target face generally cover the eyebrows, eyes, nose, mouth, cheeks and other parts; expressions are various, but every expression can be obtained from a combination of basic expressions, and texture can likewise be obtained from a combination of textures at different angles. Based on the above description, in order to better reconstruct the three-dimensional model of the target face, the facial feature parameters recognized in the present invention may include, but are not limited to, shape feature parameters, expression parameters, texture parameters, and the like.
It should be noted that, when the standard face template library includes a shape template library of a standard face, the three-dimensional base model in the standard face template library is a three-dimensional shape base model; when the standard face template library comprises a basic expression template library of a standard face, the three-dimensional base model in the standard face template library is a basic expression base model; and when the standard face template library comprises a texture template library of the standard face, the three-dimensional base model in the standard face template library is the three-dimensional texture base model.
The process of obtaining the shape feature parameters, expression parameters, and texture parameters is described next:
preferably, the shape feature parameters can be recognized from the two-dimensional image of the target face by using an adjustable training model according to the process shown in fig. 4 a:
s41, determining the weight coefficient of each three-dimensional shape base model capable of forming the shape of the target human face by using the adjustable training model.
And S42, determining the weight coefficient of each three-dimensional shape substrate model as the shape characteristic parameter.
In steps S41 to S42, a shape template library of a standard face may be generated in advance based on a large number of faces, where the shape template library includes three-dimensional shape base templates of the standard face, and each three-dimensional shape base model is a complete three-dimensional face shape model.
After the server acquires the target face two-dimensional image, it may first identify the feature points representing each feature part of the face, as shown in fig. 4b, and then perform three-dimensional reconstruction: 3D fitting is performed on each feature point to obtain its corresponding three-dimensional feature point, which is equivalent to adding a depth value to the two-dimensional image. Each resulting three-dimensional feature point can be represented as (x, y, z), where x is the abscissa value of the pixel point corresponding to the three-dimensional feature point, y is its ordinate value, and z is the depth value of the three-dimensional feature point; x and y are the same as the x and y values of the feature point obtained from the two-dimensional image. After the three-dimensional feature points are obtained, the shape they form can be matched against each three-dimensional shape base model to determine the weight coefficient of each three-dimensional shape base model capable of forming the shape of the target face. This can be understood as solving for the coefficients in the linear equation f(x) = a_1*x_1 + a_2*x_2 + ... + a_i*x_i + ... + a_n*x_n, where f(x) is the shape formed by the three-dimensional feature points of the target face, x_i is the i-th three-dimensional shape base model, and a_i is the weight coefficient of the i-th three-dimensional shape base model. Based on the above process, the weight coefficient of each three-dimensional shape base model capable of forming the shape of the target face in the target face two-dimensional image can be determined, and these weight coefficients are the shape feature parameters.
Specifically, the weight coefficient of each three-dimensional shape base model may be determined as the ratio between the maximum value of the three-dimensional feature points obtained from the target face two-dimensional image and the maximum value of the feature points in that three-dimensional shape base model, or as the ratio between the average value of the three-dimensional feature points obtained from the target face two-dimensional image and the average value of the three-dimensional feature points of that base model; similar ratios likewise yield the weight coefficient of each three-dimensional shape base model.
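For illustration, a hedged sketch of determining the weight coefficients a_i of f(x) = a_1*x_1 + ... + a_n*x_n in steps S41 to S42: stack the flattened three-dimensional shape base models as columns and solve for the coefficients by least squares. The least-squares formulation and the variable names are assumptions, not the patent's prescribed method (the patent also allows the ratio heuristics described above).

```python
import numpy as np

def shape_weights(target_points, base_models):
    """target_points: (3N,) flattened 3D feature points of the target face.
    base_models: list of (3N,) flattened base-model feature points."""
    X = np.stack(base_models, axis=1)                  # (3N, n) basis matrix
    a, *_ = np.linalg.lstsq(X, target_points, rcond=None)
    return a                                           # one weight per base model
```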
Preferably, the adjustable training model can be used to identify expression parameters from the two-dimensional image of the target face according to the process shown in fig. 5 a:
s51, determining the weight coefficient of each basic expression base model capable of forming the expression shown in the two-dimensional image of the target face by using the adjustable training model.
And S52, determining the weight coefficient of each basic expression base model as an expression parameter.
In steps S51 to S52, a group of basic expression template libraries of a standard face may also be generated in advance from all the expressions a face can express, the basic expression template library containing basic expression base models, as shown in fig. 5b. In the schematic diagram of fig. 5b, the first model is the standard face three-dimensional model, that is, the basic expression base model without expression, in which every feature part is in its natural state; in each of the remaining basic expression base models, only one feature part changes. For example, in fig. 5b the expression of the 8th basic expression base model in the first row is an open mouth, and the expression of the 11th is a raised left mouth corner. The determined three-dimensional feature points are then matched against each basic expression base model in fig. 5b, from which the weight coefficient of each basic expression base model capable of forming the expression shown in the target face two-dimensional image can be determined; the determined weight coefficients are the expression parameters. Specifically, the three-dimensional feature points obtained from the target face two-dimensional image may form a matrix C, and the three-dimensional feature points of each basic expression base model a matrix Mi. The difference matrix between C and Mi is determined, its element-wise absolute value is taken and recorded as Di, and the sum of the elements of each Di is computed. The smaller this sum, the closer the expression shown in that basic expression base model is to the expression shown in the target face two-dimensional image; the larger the sum, the greater the difference between them. Accordingly, a larger weight coefficient may be set for the basic expression base model with the smallest element sum and a smaller weight coefficient for the one with the largest element sum, thereby obtaining the weight coefficients of all the basic expression base models capable of forming the expression shown in the target face two-dimensional image. Of course, the weight coefficients of the basic expression base models can also be determined by a method similar to that shown in fig. 4b.
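A sketch of the Di = |C - Mi| heuristic just described, assuming the weights are normalized so that smaller element sums receive larger weights; the softmax-style normalization is an assumption, since the patent only specifies that closer base expressions get larger coefficients.

```python
import numpy as np

def expression_weights(C, base_expressions):
    """C: (N, 3) 3D feature points from the target image.
    base_expressions: list of (N, 3) matrices Mi, one per base model."""
    sums = np.array([np.abs(C - Mi).sum() for Mi in base_expressions])  # element sums of Di
    scores = -sums                       # smaller difference -> larger score
    w = np.exp(scores - scores.max())    # softmax-style normalization (assumed)
    return w / w.sum()                   # weights favor the closest expressions
```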
Alternatively, the texture parameters may be recognized from the two-dimensional image of the target face by using an adjustable training model according to the process shown in fig. 6 a:
and S61, determining the weight coefficient of each three-dimensional texture base model capable of forming the texture of the target face in the two-dimensional image of the target face by using the adjustable training model.
And S62, determining the weight coefficient of each three-dimensional texture base model as a texture parameter.
In steps S61 to S62, a texture template library of a standard face may also be generated in advance from a large number of face textures, the texture template library containing three-dimensional texture base models of the standard face at various angles. For example, fig. 6b shows three-dimensional texture base models of a certain face at -70, -50, -30, -15, 0, 15, 30, 50, and 70 degrees about the Y-axis. The weight coefficient of each three-dimensional texture base model capable of forming the texture of the target face in the target face two-dimensional image can be determined from these base models; for the specifics, refer to the descriptions of fig. 4a and fig. 5a, which will not be repeated here.
Having introduced the three facial feature parameter obtaining processes, the target face three-dimensional model can be reconstructed based on the recognized facial features of the target face and the three-dimensional base models in the standard face template library. Since the facial feature parameters include shape feature parameters, expression parameters and texture parameters, the standard face template library correspondingly includes a shape template library, a basic expression template library and a texture template library of the standard face, and the target face three-dimensional model is reconstructed from the shape feature parameters, expression parameters and texture parameters together with their corresponding template libraries.
Specifically, the reconstructing of the target face three-dimensional model according to the process shown in fig. 7 may include the following steps:
and S71, determining the three-dimensional shape model of the target human face based on the three-dimensional shape base models in the shape template library and the determined weight coefficients of the three-dimensional shape base models.
In practical applications, the three-dimensional shape base models, the basic expression base models and the three-dimensional texture base models are all essentially matrices. Therefore, when step S71 is implemented, the matrix of each three-dimensional shape base model can be weighted by the weight coefficient determined in fig. 4a and summed; the weighted sum is the target face three-dimensional shape model.
And S72, determining the expression offset item of the target face based on each basic expression base model in the basic expression template library and the determined weight coefficient of each basic expression base model.
Specifically, the difference matrix between the matrix of each basic expression base model and that of the standard face three-dimensional model can be determined; each difference matrix is then weighted by the corresponding weight coefficient determined in fig. 5a (of the basic expression base models capable of forming the expression shown in the target face two-dimensional image) and the results are summed. The weighted sum is the expression offset item of the target face.
And S73, generating a target face three-dimensional model with the expression in the target face two-dimensional image based on the target face three-dimensional shape model and the expression offset item.
In this step, the expression offset item generated in step S72 is fused into the target face three-dimensional shape model generated in step S71, that is: and summing the matrix corresponding to the expression offset item generated in the step S72 and the weighted summation result generated in the step S71, wherein the summation result is the target face three-dimensional model with the expression in the target face two-dimensional image.
And S74, determining the three-dimensional texture model of the target face based on the three-dimensional texture base models in the texture template base and the determined weight coefficients of the three-dimensional texture base models.
Specifically, the matrix of each three-dimensional texture base model and the weight coefficient of each three-dimensional texture base model determined in fig. 6a may be subjected to weighted summation to obtain the three-dimensional texture model of the target human face.
And S75, fusing the target face three-dimensional model with the expression in the target face two-dimensional image and the target face three-dimensional texture model, and reconstructing the target face three-dimensional model.
In this step, the three-dimensional texture model of the target face determined in step S74 is fused to the target face three-dimensional model having the expression in the target face two-dimensional image, and the target face three-dimensional model is obtained.
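A compact sketch of steps S71 to S75 follows, assuming each base model is stored as a vertex matrix and that the neutral (expressionless) model E0 is available for forming the expression differences; the function and variable names are illustrative, not the patent's.

```python
import numpy as np

def reconstruct_face(shape_bases, a, expr_bases, b, E0, tex_bases, c):
    # S71: weighted sum of 3D shape base models -> target face shape model.
    shape = sum(ai * Si for ai, Si in zip(a, shape_bases))
    # S72: expression offset = weighted sum of (base expression - neutral E0).
    offset = sum(bj * (Ej - E0) for bj, Ej in zip(b, expr_bases))
    # S73: fuse the shape model and the expression offset item.
    geometry = shape + offset
    # S74: weighted sum of 3D texture base models -> target face texture.
    texture = sum(ck * Tk for ck, Tk in zip(c, tex_bases))
    # S75: the reconstructed target face model is the geometry with its texture.
    return geometry, texture
```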
And S23, simulating the recognized photographing environment characteristics and photographing parameter information, and rendering the target face three-dimensional model to obtain a middle face two-dimensional image.
In this step, the recognized photographing environment features and photographing parameter information can be simulated using the orthographic projection technique and the spherical harmonic illumination model, creating an environment consistent with the photographing environment features and photographing parameters of the shot target face two-dimensional image (for example, determining the light source and the lighting position); the target face three-dimensional model is then photographed in this simulated environment, so that an intermediate face two-dimensional image can be rendered. Note that a Phong reflection model may be used instead of the spherical harmonic illumination model.
Specifically, the photographing environment characteristic in the present invention may be light information, and the photographing parameter information may be, but is not limited to, alignment parameter information, and the like.
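For illustration, a minimal per-vertex shading sketch under the spherical harmonic illumination model mentioned in step S23, using the standard first nine SH basis functions; combining them with the recognized lighting coefficients in exactly this way is an assumption about the renderer, not the patent's disclosed implementation.

```python
import numpy as np

def sh_basis(n):
    """n: (V, 3) unit vertex normals -> (V, 9) SH basis values."""
    x, y, z = n[:, 0], n[:, 1], n[:, 2]
    return np.stack([
        np.full_like(x, 0.2820948),                      # Y00
        0.4886025 * y, 0.4886025 * z, 0.4886025 * x,     # Y1-1, Y10, Y11
        1.0925484 * x * y, 1.0925484 * y * z,            # Y2-2, Y2-1
        0.3153916 * (3 * z * z - 1),                     # Y20
        1.0925484 * x * z, 0.5462742 * (x * x - y * y),  # Y21, Y22
    ], axis=1)

def shade(normals, albedo, light_coeffs):
    """albedo: (V, 3) vertex colors; light_coeffs: (9, 3) recognized lighting."""
    irradiance = sh_basis(normals) @ light_coeffs        # (V, 3) per-vertex light
    return albedo * irradiance                           # shaded vertex colors
```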
S24, determining whether the target face two-dimensional image and the intermediate face two-dimensional image satisfy the set consistency condition; if not, executing step S25; if yes, executing step S26.
In this step, after an intermediate face two-dimensional image is rendered in step S23, in order to ensure that the similarity between the target face two-dimensional image and the rendered intermediate face two-dimensional image is sufficient, the present invention proposes to determine whether the two images satisfy a set consistency condition. Only when the condition is satisfied is the restoration degree of the facial feature parameters, photographing environment features and photographing parameter information recognized by the currently set training model in step S21 high enough; the target face three-dimensional model reconstructed from those facial feature parameters then closely resembles the target face, and the target face three-dimensional image can be obtained from it for output and display.
Preferably, the consistency condition provided by the present invention includes: the identity feature information respectively represented by the target face two-dimensional image and the intermediate face two-dimensional image is consistent.
By setting the identity feature information consistency condition, it can be determined whether the face in the intermediate face two-dimensional image and the face in the target face two-dimensional image are the same person. If they are, the target face three-dimensional model used to generate the intermediate face two-dimensional image is a good model, the features of the target face three-dimensional image obtained from it are more like the features of the face in the target face two-dimensional image, and the realism of the target face three-dimensional image is improved.
Specifically, it may be determined whether the identity feature information respectively represented by the target two-dimensional image of the face and the intermediate two-dimensional image of the face is consistent according to the method shown in fig. 8:
and S81, respectively determining the geographic positions of the user corresponding to the target face two-dimensional image and the user corresponding to the middle face two-dimensional image on the spherical surface of the simulated earth by using the trained identity recognition model.
Specifically, the target face two-dimensional image and the intermediate face two-dimensional image may be input into a trained identity recognition model, and based on the model, the geographic positions of the user corresponding to the target face two-dimensional image and the user corresponding to the intermediate face two-dimensional image on the spherical surface of the simulated earth may be determined.
It should be noted that the identity recognition model in the present invention determines whether the identity feature information of the input target face two-dimensional image and intermediate face two-dimensional image is consistent by constraining the feature vector of the last feature layer. The identity recognition model in the present invention can be, but is not limited to, a fixed VGG Face classifier, a FaceNet network model, and the like. Because the identity recognition model enforces consistency of identity feature information, the finally obtained training model can learn face features related to the identity of the target face, such as the height of the nose, the depth of the eye sockets and the thickness of the lips, so the target face three-dimensional model reconstructed on this basis is more similar to the target face.
Specifically, the training process of the fixed VGG Face classifier is approximately as follows: and training the VGG Face classifier by using the data set to obtain a fixed VGG Face classifier, so that the classifier can determine the geographic position of the user corresponding to the two-dimensional Face image input to the classifier on the spherical surface of the simulated earth. The data set includes a two-dimensional image of the faces of a large number of users and the geographic location of each user. Then, the VGG Face classifier can simulate a geographical position for each user in the data set on the sphere according to the geographical position of each user, and the simulated geographical positions of different users on the sphere are different.
Based on the training process, after the Face two-dimensional image is input in the fixed VGG Face classifier, the geographic position of the user corresponding to the input Face two-dimensional image on the spherical surface can be calculated.
S82, judging whether the geographic position on the spherical surface of the simulated earth of the user corresponding to the target face two-dimensional image and that of the user corresponding to the intermediate face two-dimensional image satisfy the identity matching condition; if yes, executing step S83; if not, executing step S84.
And S83, determining that the identity feature information represented by the target face two-dimensional image is consistent with that represented by the intermediate face two-dimensional image.
S84, determining that the identity feature information represented by the target face two-dimensional image is inconsistent with that represented by the intermediate face two-dimensional image.
In steps S82 to S84, the identity matching condition in the present invention may be that the difference between the geographic positions is within an allowable range, or the like. If it is determined, based on the VGG Face classifier, that the difference between the geographic position on the sphere of the user corresponding to the target face two-dimensional image and that of the user corresponding to the intermediate face two-dimensional image is within the allowable range, the identity feature information represented by the two images is consistent, that is, the users in the two images are the same person. If the difference is not within the allowable range, the users represented in the two images are not the same person and the identity difference is large, which further indicates that the restoration degree of the facial feature parameters, photographing environment features and photographing parameter information recognized by the training model in step S21 is not sufficient, and the training model in step S21 needs to be adjusted.
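A sketch of the identity matching check of steps S81 to S84, treating the classifier's position on the simulated sphere as a unit vector and the geographic position difference as the great-circle angle between the two positions; this interpretation and the threshold value are assumptions made for illustration.

```python
import numpy as np

def identity_consistent(pos_target, pos_intermediate, max_angle_rad=0.3):
    """pos_*: unit vectors returned by the fixed classifier for each image."""
    cos_sim = np.clip(np.dot(pos_target, pos_intermediate), -1.0, 1.0)
    angle = np.arccos(cos_sim)        # great-circle distance on the sphere
    return angle <= max_angle_rad     # within the allowable range -> same person
```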
Preferably, the consistency condition provided by the present invention further comprises: the Euclidean distance between the feature points in the target face two-dimensional image and the corresponding feature points in the intermediate face two-dimensional image satisfies the feature point matching condition, and/or the pixel value difference between the target face two-dimensional image and the intermediate face two-dimensional image satisfies the image matching condition.
Specifically, the feature point matching condition in the present invention may be, but is not limited to, not greater than a preset euclidean distance threshold or the like, and the image matching condition in the present invention may be, but is not limited to, not greater than a pixel difference threshold or the like.
Specifically, at least 68 feature points can be identified from the target face two-dimensional image and the intermediate face two-dimensional image respectively, distributed over the eyebrows, eyes, nose, mouth and cheeks. For example, feature points are extracted from both images according to the feature point schematic diagram shown in fig. 4b. Then, for each feature point, it is determined whether the Euclidean distance between its position in the target face two-dimensional image and its position in the intermediate face two-dimensional image is not greater than the preset Euclidean distance threshold. If so, the feature point matching condition is satisfied, that is, the two images are well aligned, which also ensures that the expression in the intermediate face two-dimensional image is highly similar to the expression in the target face two-dimensional image. If the distance is greater than the preset Euclidean distance threshold, the feature point matching condition is not satisfied, and the training model is adjusted so that the intermediate face two-dimensional image obtained from the adjusted model is aligned with the target face two-dimensional image.
Specifically, the method can also determine the pixel value difference between the target face two-dimensional image and the intermediate face two-dimensional image and then determine whether the difference is not greater than the pixel difference threshold. If it is not, the image matching condition is satisfied, indicating that the photographing environment features and the texture information (skin, color and the like) of the target face two-dimensional image recognized by the training model have a high degree of restoration.
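A sketch of the two conditions just described: the per-feature-point Euclidean distance check and the mean pixel difference check. The threshold values here are illustrative assumptions; the patent does not specify them.

```python
import numpy as np

def consistency_ok(lm_target, lm_inter, img_target, img_inter,
                   dist_thresh=2.0, pixel_thresh=0.05):
    """lm_*: (68, 2) feature point coordinates; img_*: (H, W, 3) in [0, 1]."""
    dists = np.linalg.norm(lm_target - lm_inter, axis=1)   # per-point distance
    landmarks_ok = bool(np.all(dists <= dist_thresh))      # feature point matching
    pixels_ok = float(np.abs(img_target - img_inter).mean()) <= pixel_thresh
    return landmarks_ok and pixels_ok                      # image matching condition
```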
And S25, adjusting the training model and continuing to execute the step S21.
Specifically, when the training model in step S21 is adjusted, the convolution kernels and biases in the training model may be adjusted based on the identity recognition result of the identity recognition model, the feature point matching result and/or the image matching result, and steps S21 to S24 are then re-executed based on the adjusted training model until the determination result in step S24 is yes, that is, until the target face two-dimensional image and the intermediate face two-dimensional image satisfy the set consistency condition.
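A hedged sketch of this adjustment loop: combine the identity, feature point and pixel discrepancies into a loss and update the encoder's convolution kernels and biases by gradient descent. The loss weighting, the optimizer choice and the helper functions render_fn and losses_fn are assumptions; the patent only states that the model is adjusted until the consistency condition holds.

```python
import torch

def adjust_until_consistent(encoder, render_fn, losses_fn, target_image,
                            max_iters=200, lr=1e-4):
    opt = torch.optim.Adam(encoder.parameters(), lr=lr)
    for _ in range(max_iters):
        coeffs = encoder(target_image)                 # S21: recognize parameters
        intermediate = render_fn(coeffs)               # S22-S23: reconstruct, render
        id_loss, lm_loss, px_loss, consistent = losses_fn(target_image, intermediate)
        if consistent:                                 # S24 yes -> S26
            break
        loss = id_loss + lm_loss + px_loss             # S24 no -> S25: adjust
        opt.zero_grad()
        loss.backward()
        opt.step()
    return coeffs
```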
And S26, obtaining a target human face three-dimensional image based on the newly reconstructed target human face three-dimensional model.
If the determination result in step S24 is yes, the target face three-dimensional image is obtained based on the newly reconstructed target face three-dimensional model, and the fidelity of the target face three-dimensional image obtained from this model is higher. An effect schematic diagram of the target face two-dimensional image is shown in fig. 4a, and fig. 9 shows an effect schematic diagram of the target face three-dimensional model reconstructed from the target face two-dimensional image of fig. 4a; it can be seen that, with the method provided by the present invention, the similarity between the target face three-dimensional model and the target face is higher.
Referring to fig. 10, the logic architecture diagram of the method for generating a three-dimensional image of a human face according to the present invention is implemented by taking the adjustable training model to be a VGG Face encoder, the identity recognition model to be a fixed VGG Face classifier, and the photographing environment feature to be illumination information. The implementation process is roughly as follows: the face image numbered 1 in fig. 10 is the target face two-dimensional image, which is input into the adjustable VGG Face encoder; the VGG Face encoder learns the shape feature parameters, expression parameters, texture parameters, photographing parameter information and illumination information from the target face two-dimensional image; a target face three-dimensional model is reconstructed from the facial feature parameters; and the reconstructed model is rendered by simulating the photographing parameter information and the illumination information, yielding the intermediate face two-dimensional image, namely the figure numbered 2 in fig. 10. The target face two-dimensional image and the intermediate face two-dimensional image are then input into the fixed VGG Face classifier, which determines whether the identity feature information represented by the two images is consistent; in addition, it is determined whether the feature points in the two images satisfy the feature point matching condition and whether the pixel values satisfy the image matching condition. If any of these conditions is not satisfied, the VGG Face encoder is adjusted and the process is repeated; when all conditions are satisfied, the target face three-dimensional image is obtained based on the newly reconstructed target face three-dimensional model.
The invention provides a method for generating a three-dimensional image of a human face. After a target face two-dimensional image is obtained, it is input into an adjustable training model, and the training model recognizes facial feature parameters, photographing environment features and photographing parameter information from the target face two-dimensional image. A target face three-dimensional model is then reconstructed based on the recognized facial feature parameters and the three-dimensional base models in a standard face template library, and the recognized photographing environment features and photographing parameter information are simulated to render the target face three-dimensional model, yielding an intermediate face two-dimensional image. It is then determined whether the target face two-dimensional image and the intermediate face two-dimensional image satisfy a set consistency condition. When the consistency condition is not met, the training model is adjusted and the step of obtaining the intermediate face two-dimensional image from the target face two-dimensional image is repeated with the adjusted training model; when the consistency condition is met, a target face three-dimensional image is obtained based on the newly reconstructed target face three-dimensional model. Because the adjustable training model is dynamically adjusted in this way, the recognized facial feature parameters come closer and closer to the target face in the target face two-dimensional image, and the recognized photographing environment features and photographing parameter information come closer to the actual environment and parameters in effect when the target face two-dimensional image was photographed; the reconstructed target face three-dimensional model is therefore closer to the target face, and the fidelity of the resulting target face three-dimensional image is higher.
Based on the same inventive concept, an embodiment of the present invention further provides a device for generating a three-dimensional image of a human face. Since the principle by which the device solves the problem is similar to that of the above method for generating a three-dimensional image of a human face, the implementation of the device may refer to the implementation of the method, and repeated descriptions are omitted.
As shown in fig. 11, a schematic structural diagram of a human face three-dimensional image generation apparatus provided in an embodiment of the present invention includes:
an obtaining unit 111, configured to obtain an intermediate human face two-dimensional image by using an adjustable training model according to a target human face two-dimensional image, where the intermediate human face two-dimensional image is obtained by reconstructing a target human face three-dimensional model according to facial feature parameters recognized from the target human face two-dimensional image and a three-dimensional basis model in a standard human face template library, and by rendering the target human face three-dimensional model by simulating photographing environment features and photographing parameter information recognized from the target human face two-dimensional image;
a determining unit 112, configured to determine whether the target face two-dimensional image and the intermediate face two-dimensional image satisfy a set consistency condition;
an adjusting unit 113, configured to, when the determining unit determines that the consistency condition is not met, adjust the training model and use the adjusted training model to return to the step of obtaining the intermediate two-dimensional face image according to the target two-dimensional face image;
a generating unit 114, configured to obtain a target face three-dimensional image based on the newly reconstructed target face three-dimensional model when the determining unit determines that the consistency condition is satisfied.
Preferably, the consistency condition includes:
and the identity characteristic information represented by the target face two-dimensional image and the middle face two-dimensional image is consistent.
Optionally, the consistency condition further comprises:
and the Euclidean distance between the feature points in the target face two-dimensional image and the corresponding feature points in the intermediate face two-dimensional image meets the feature point matching condition, and/or the pixel value difference between the target face two-dimensional image and the intermediate face two-dimensional image meets the image matching condition.
Preferably, the facial feature parameters include shape feature parameters; the standard human face template library comprises a shape template library of a standard human face, and the three-dimensional base model comprises a three-dimensional shape base model; then
The obtaining unit 111 is specifically configured to determine, by using an adjustable training model, a weight coefficient of each three-dimensional shape base model that can form the shape of the target face; and to determine the weight coefficient of each three-dimensional shape base model as the shape feature parameter.
Optionally, the facial feature parameters further include expression parameters; the standard face template library also comprises a basic expression template library of a standard face, and the three-dimensional base model also comprises a basic expression base model;
the obtaining unit 111 is further configured to determine, by using an adjustable training model, a weight coefficient of each basic expression base model capable of constituting the expression shown in the target face two-dimensional image; and to determine the weight coefficient of each basic expression base model as an expression parameter.
Optionally, the facial feature parameters further include texture parameters; the standard face template library also comprises a texture template library of a standard face, and the three-dimensional base model also comprises a three-dimensional texture base model;
the obtaining unit 111 is further configured to determine, by using an adjustable training model, a weight coefficient of each three-dimensional texture base model capable of forming the texture of the target face in the target face two-dimensional image; and to determine the weight coefficient of each three-dimensional texture base model as a texture parameter.
Further, the obtaining unit 111 is specifically configured to: determine a target face three-dimensional shape model based on each three-dimensional shape base model in the shape template library and the determined weight coefficient of each three-dimensional shape base model; determine an expression offset term of the target face based on each basic expression base model in the basic expression template library and the determined weight coefficient of each basic expression base model; generate a target face three-dimensional model bearing the expression in the target face two-dimensional image based on the target face three-dimensional shape model and the expression offset term; determine a target face three-dimensional texture model based on each three-dimensional texture base model in the texture template library and the determined weight coefficient of each three-dimensional texture base model; and fuse the target face three-dimensional model bearing the expression in the target face two-dimensional image with the target face three-dimensional texture model to reconstruct the target face three-dimensional model.
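This fusion follows the familiar morphable-model pattern of weighted linear combinations. A compact sketch under standard 3DMM conventions (a mean shape plus weighted shape bases, an expression offset added on top, and an independently blended texture), which is an assumption rather than the patent's exact parameterization:

```python
import numpy as np

def reconstruct_face(mean_shape, shape_bases, shape_w,
                     expr_bases, expr_w,
                     mean_texture, texture_bases, texture_w):
    """Sketch of the fusion described above (standard 3DMM form).
    Shapes of the assumed inputs:
      mean_shape    (V, 3)       shape_bases   (Ks, V, 3)  shape_w   (Ks,)
      expr_bases    (Ke, V, 3)   expr_w        (Ke,)
      mean_texture  (V, 3)       texture_bases (Kt, V, 3)  texture_w (Kt,)
    Returns per-vertex positions and colors of the target face model."""
    # Target face three-dimensional shape model
    shape = mean_shape + np.tensordot(shape_w, shape_bases, axes=1)
    # Expression offset term added onto the neutral shape
    geometry = shape + np.tensordot(expr_w, expr_bases, axes=1)
    # Target face three-dimensional texture model, fused with the geometry
    texture = mean_texture + np.tensordot(texture_w, texture_bases, axes=1)
    return geometry, texture
```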
Optionally, the determining unit 112 is specifically configured to determine whether the identity feature information represented by the target face two-dimensional image and by the intermediate face two-dimensional image is consistent as follows: using the trained identity recognition model, the geographic positions on the spherical surface of the simulated earth are determined for the user corresponding to the target face two-dimensional image and the user corresponding to the intermediate face two-dimensional image, respectively; if the two geographic positions on the spherical surface of the simulated earth meet the identity matching condition, the identity feature information represented by the target face two-dimensional image and the intermediate face two-dimensional image is determined to be consistent; and if the identity matching condition is not met, the identity feature information represented by the two images is determined to be inconsistent.
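The "geographic position on the spherical surface of the simulated earth" can be read as an identity embedding normalized onto a unit hypersphere, with the identity matching condition bounding the angular distance between the two embeddings. A sketch under that reading, in which `embed` and the threshold are both hypothetical:

```python
import numpy as np

IDENTITY_ANGLE_THRESHOLD = 0.3  # hypothetical bound, in radians

def identity_consistent(embed, target_img, intermediate_img) -> bool:
    """`embed` stands in for the trained identity recognition model and
    returns one feature vector per image.  Normalizing each vector onto
    the unit sphere gives its "geographic position on the simulated
    earth"; the identity matching condition is taken to hold when the
    geodesic (angular) distance between the two positions is small."""
    u = np.asarray(embed(target_img), dtype=np.float64)
    v = np.asarray(embed(intermediate_img), dtype=np.float64)
    u /= np.linalg.norm(u)
    v /= np.linalg.norm(v)
    angle = np.arccos(np.clip(np.dot(u, v), -1.0, 1.0))
    return bool(angle <= IDENTITY_ANGLE_THRESHOLD)
```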
For convenience of description, the above parts are described separately as modules (or units) divided by function. Of course, when practicing the invention, the functionality of the various modules (or units) may be implemented in one and the same piece of software or hardware, or distributed across multiple pieces.
Having described the method, apparatus and readable medium for generating a three-dimensional image of a human face according to an exemplary embodiment of the present invention, a computing apparatus according to another exemplary embodiment of the present invention will be described.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or program product. Thus, various aspects of the invention may take the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may all generally be referred to herein as a "circuit," "module," or "system."
In some possible embodiments, a computing device according to the present invention may comprise at least one processing unit, and at least one memory unit. Wherein the storage unit stores program code which, when executed by the processing unit, causes the processing unit to perform the steps in the three-dimensional image generation method for a human face according to various exemplary embodiments of the present invention described above in this specification. For example, the processing unit may execute a face three-dimensional image generation flow in steps S21 to S26 shown in fig. 2.
The computing device 110 according to this embodiment of the invention is described below with reference to fig. 12. The computing device 110 shown in fig. 12 is only an example and should not limit the functionality or scope of use of embodiments of the present invention.
As shown in fig. 12, the computing apparatus 110 is in the form of a general purpose computing device. Components of computing device 110 may include, but are not limited to: the at least one processing unit 111, the at least one memory unit 112, and a bus 113 connecting various system components (including the memory unit 112 and the processing unit 111).
Bus 113 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, a processor, or a local bus using any of a variety of bus architectures.
The storage unit 112 may include readable media in the form of volatile memory, such as Random Access Memory (RAM) 1121 and/or cache memory 1122, and may further include Read Only Memory (ROM) 1123.
Storage unit 112 may also include a program/utility 1125 having a set (at least one) of program modules 1124, such program modules 1124 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
The computing apparatus 110 may also communicate with one or more external devices 114 (e.g., keyboard, pointing device, etc.), with one or more devices that enable a user to interact with the computing apparatus 110, and/or with any devices (e.g., router, modem, etc.) that enable the computing apparatus 110 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 115. Also, the computing device 110 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) through the network adapter 116. As shown, the network adapter 116 communicates with the other modules of the computing device 110 over the bus 113. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the computing device 110, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
In some possible embodiments, the aspects of the method for generating a three-dimensional image of a human face provided by the present invention may also be implemented in the form of a program product, which includes program code for causing a computer device to execute the steps in the method for generating a three-dimensional image of a human face according to various exemplary embodiments of the present invention described above in this specification when the program product runs on the computer device, for example, the computer device may execute the flow of generating a three-dimensional image of a human face in steps S21 to S26 shown in fig. 2.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The program product for the human face three-dimensional image generation method of the embodiment of the invention can adopt a portable compact disk read only memory (CD-ROM) and comprises program codes, and can run on a computing device. However, the program product of the present invention is not limited in this regard and, in the present document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java or C++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In situations involving remote computing devices, the remote computing devices may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to external computing devices (e.g., through the internet using an internet service provider).
It should be noted that although several units or sub-units of the apparatus are mentioned in the above detailed description, such division is merely exemplary and not mandatory. Indeed, according to embodiments of the invention, the features and functions of two or more of the units described above may be embodied in one unit. Conversely, the features and functions of one unit described above may be further divided and embodied by a plurality of units.
Moreover, while the operations of the method of the invention are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all changes and modifications that fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A human face three-dimensional image generation method is characterized by comprising the following steps:
according to the target face two-dimensional image, recognizing face characteristic parameters, photographing environment characteristics and photographing parameter information from the target face two-dimensional image by using an adjustable training model;
reconstructing a target human face three-dimensional model according to the recognized facial feature parameters and the three-dimensional base model in the standard human face template library;
simulating the recognized photographing environment characteristics and photographing parameter information, and rendering the target human face three-dimensional model to obtain an intermediate human face two-dimensional image;
determining whether the target face two-dimensional image and the intermediate face two-dimensional image meet a set consistency condition;
when the consistency condition is determined not to be met, adjusting the training model and returning to the step of obtaining the intermediate face two-dimensional image according to the target face two-dimensional image by using the adjusted training model;
when the consistency condition is determined to be met, obtaining a target human face three-dimensional image based on the newly reconstructed target human face three-dimensional model;
when the consistency condition includes that the identity feature information represented by the target face two-dimensional image and the intermediate face two-dimensional image is consistent, the determining whether the target face two-dimensional image and the intermediate face two-dimensional image meet the set consistency condition includes:
respectively determining the geographic positions of the user corresponding to the target face two-dimensional image and the user corresponding to the intermediate face two-dimensional image on the spherical surface of the simulated earth by using the trained identity recognition model;
if the geographic position of the user corresponding to the target face two-dimensional image on the spherical surface of the simulated earth and the geographic position of the user corresponding to the intermediate face two-dimensional image on the spherical surface of the simulated earth meet the identity matching condition, determining that the identity feature information represented by the target face two-dimensional image and the intermediate face two-dimensional image is consistent;
and if the identity matching condition is determined not to be satisfied, determining that the identity feature information represented by the target face two-dimensional image is inconsistent with the identity feature information represented by the intermediate face two-dimensional image.
2. The method of claim 1, wherein the consistency condition further comprises:
and the Euclidean distance between the feature points in the target face two-dimensional image and the corresponding feature points in the intermediate face two-dimensional image meets the feature point matching condition, and/or the pixel value difference between the target face two-dimensional image and the intermediate face two-dimensional image meets the image matching condition.
3. The method of claim 1 or 2, wherein the facial feature parameters include shape feature parameters; the standard human face template library comprises a shape template library of a standard human face, and the three-dimensional base model comprises a three-dimensional shape base model; then
Recognizing facial feature parameters from the two-dimensional image of the target human face by using an adjustable training model, wherein the recognizing comprises the following steps:
determining weight coefficients of each three-dimensional shape base model capable of forming the shape of the target face by using an adjustable training model;
and determining the weight coefficient of each three-dimensional shape base model as the shape characteristic parameter.
4. The method of claim 3, wherein the facial feature parameters further include expression parameters; the standard face template library also comprises a basic expression template library of a standard face, and the three-dimensional base model also comprises a basic expression base model;
the recognizing facial feature parameters from the two-dimensional image of the target human face by using the adjustable training model further comprises:
determining the weight coefficient of each basic expression base model capable of forming the expression shown in the target face two-dimensional image by using an adjustable training model;
and determining the weight coefficient of each basic expression base model as an expression parameter.
5. The method of claim 4, wherein the facial feature parameters further comprise texture parameters; the standard face template library also comprises a texture template library of a standard face, and the three-dimensional base model also comprises a three-dimensional texture base model;
the recognizing facial feature parameters from the two-dimensional image of the target human face by using the adjustable training model further comprises:
determining weight coefficients of each three-dimensional texture base model capable of forming textures of the target face in the two-dimensional image of the target face by using an adjustable training model;
and determining the weight coefficient of each three-dimensional texture base model as a texture parameter.
6. The method as claimed in claim 5, wherein reconstructing the three-dimensional model of the target face from the facial feature parameters identified from the two-dimensional image of the target face and the three-dimensional basis models in the library of standard face templates comprises:
determining a target human face three-dimensional shape model based on each three-dimensional shape base model in the shape template library and the determined weight coefficient of each three-dimensional shape base model;
determining an expression offset term of the target face based on each basic expression base model in the basic expression template library and the determined weight coefficient of each basic expression base model;
generating a target face three-dimensional model with the expression in the target face two-dimensional image based on the target face three-dimensional shape model and the expression offset term;
determining a three-dimensional texture model of the target face based on each three-dimensional texture base model in the texture template library and the determined weight coefficient of each three-dimensional texture base model;
and fusing the target face three-dimensional model with the expression in the target face two-dimensional image and the target face three-dimensional texture model to reconstruct the target face three-dimensional model.
7. A device for generating a three-dimensional image of a human face, comprising:
the system comprises an obtaining unit, a processing unit and a processing unit, wherein the obtaining unit is used for obtaining an intermediate human face two-dimensional image by utilizing an adjustable training model according to a target human face two-dimensional image, the intermediate human face two-dimensional image is obtained by reconstructing a target human face three-dimensional model according to facial feature parameters recognized from the target human face two-dimensional image and a three-dimensional base model in a standard human face template library, and simulating photographing environment features and photographing parameter information recognized from the target human face two-dimensional image to render the target human face three-dimensional model;
the determining unit is used for determining whether the target face two-dimensional image and the intermediate face two-dimensional image meet a set consistency condition;
the adjusting unit is used for adjusting the training model when the determining unit determines that the consistency condition is not met, so that the obtaining unit obtains the intermediate face two-dimensional image again according to the target face two-dimensional image by using the adjusted training model;
the generating unit is used for obtaining a target human face three-dimensional image based on the latest reconstructed target human face three-dimensional model when the determining unit determines that the consistency condition is met;
wherein, when the consistency condition includes that the identity feature information represented by the target face two-dimensional image and the intermediate face two-dimensional image is consistent, the determining unit is specifically configured to:
respectively determining the geographic positions of the user corresponding to the target face two-dimensional image and the user corresponding to the intermediate face two-dimensional image on the spherical surface of the simulated earth by using the trained identity recognition model;
if the geographic position of the user corresponding to the target face two-dimensional image on the spherical surface of the simulated earth and the geographic position of the user corresponding to the intermediate face two-dimensional image on the spherical surface of the simulated earth meet the identity matching condition, determining that the identity feature information represented by the target face two-dimensional image and the intermediate face two-dimensional image is consistent;
and if the identity matching condition is determined not to be satisfied, determining that the identity feature information represented by the target face two-dimensional image is inconsistent with the identity feature information represented by the intermediate face two-dimensional image.
8. The apparatus of claim 7, wherein the consistency condition further comprises:
and the Euclidean distance between the feature points in the target face two-dimensional image and the corresponding feature points in the intermediate face two-dimensional image meets the feature point matching condition, and/or the pixel value difference between the target face two-dimensional image and the intermediate face two-dimensional image meets the image matching condition.
9. A computer-readable medium having stored thereon computer-executable instructions for performing the method of any one of claims 1 to 6.
10. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein the content of the first and second substances,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 6.
CN201811459413.0A 2018-11-30 2018-11-30 Human face three-dimensional image generation method and device and readable medium Active CN109377544B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811459413.0A CN109377544B (en) 2018-11-30 2018-11-30 Human face three-dimensional image generation method and device and readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811459413.0A CN109377544B (en) 2018-11-30 2018-11-30 Human face three-dimensional image generation method and device and readable medium

Publications (2)

Publication Number Publication Date
CN109377544A CN109377544A (en) 2019-02-22
CN109377544B true CN109377544B (en) 2022-12-23

Family

ID=65376343

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811459413.0A Active CN109377544B (en) 2018-11-30 2018-11-30 Human face three-dimensional image generation method and device and readable medium

Country Status (1)

Country Link
CN (1) CN109377544B (en)

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110163953B (en) * 2019-03-11 2023-08-25 腾讯科技(深圳)有限公司 Three-dimensional face reconstruction method and device, storage medium and electronic device
CN109902767B (en) * 2019-04-11 2021-03-23 网易(杭州)网络有限公司 Model training method, image processing device, model training apparatus, image processing apparatus, and computer-readable medium
CN111881708A (en) * 2019-05-03 2020-11-03 爱唯秀股份有限公司 Face recognition system
CN110428491B (en) * 2019-06-24 2021-05-04 北京大学 Three-dimensional face reconstruction method, device, equipment and medium based on single-frame image
CN110298319B (en) * 2019-07-01 2021-10-08 北京字节跳动网络技术有限公司 Image synthesis method and device
CN110310224B (en) * 2019-07-04 2023-05-30 北京字节跳动网络技术有限公司 Light effect rendering method and device
CN110675475B (en) * 2019-08-19 2024-02-20 腾讯科技(深圳)有限公司 Face model generation method, device, equipment and storage medium
CN110941332A (en) * 2019-11-06 2020-03-31 北京百度网讯科技有限公司 Expression driving method and device, electronic equipment and storage medium
CN111028330B (en) 2019-11-15 2023-04-07 腾讯科技(深圳)有限公司 Three-dimensional expression base generation method, device, equipment and storage medium
CN111080626A (en) * 2019-12-19 2020-04-28 联想(北京)有限公司 Detection method and electronic equipment
CN113129425A (en) * 2019-12-31 2021-07-16 Tcl集团股份有限公司 Face image three-dimensional reconstruction method, storage medium and terminal device
CN110807451B (en) * 2020-01-08 2020-06-02 腾讯科技(深圳)有限公司 Face key point detection method, device, equipment and storage medium
CN111243106B (en) * 2020-01-21 2021-05-25 杭州微洱网络科技有限公司 Method for correcting three-dimensional human body model based on 2D human body image
US11748943B2 (en) * 2020-03-31 2023-09-05 Sony Group Corporation Cleaning dataset for neural network training
CN111695471B (en) * 2020-06-02 2023-06-27 北京百度网讯科技有限公司 Avatar generation method, apparatus, device and storage medium
CN111729314A (en) * 2020-06-30 2020-10-02 网易(杭州)网络有限公司 Virtual character face pinching processing method and device and readable storage medium
CN112150615A (en) * 2020-09-24 2020-12-29 四川川大智胜软件股份有限公司 Face image generation method and device based on three-dimensional face model and storage medium
CN112052834B (en) * 2020-09-29 2022-04-08 支付宝(杭州)信息技术有限公司 Face recognition method, device and equipment based on privacy protection
CN112884881B (en) * 2021-01-21 2022-09-27 魔珐(上海)信息科技有限公司 Three-dimensional face model reconstruction method and device, electronic equipment and storage medium
CN113177466A (en) * 2021-04-27 2021-07-27 北京百度网讯科技有限公司 Identity recognition method and device based on face image, electronic equipment and medium
CN113240802B (en) * 2021-06-23 2022-11-15 中移(杭州)信息技术有限公司 Three-dimensional reconstruction whole-house virtual dimension installing method, device, equipment and storage medium
CN113838176B (en) * 2021-09-16 2023-09-15 网易(杭州)网络有限公司 Model training method, three-dimensional face image generation method and three-dimensional face image generation equipment
CN113963425B (en) * 2021-12-22 2022-03-25 北京的卢深视科技有限公司 Testing method and device of human face living body detection system and storage medium
CN116188640B (en) * 2022-12-09 2023-09-08 北京百度网讯科技有限公司 Three-dimensional virtual image generation method, device, equipment and medium
CN116912639B (en) * 2023-09-13 2024-02-09 腾讯科技(深圳)有限公司 Training method and device of image generation model, storage medium and electronic equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130215113A1 (en) * 2012-02-21 2013-08-22 Mixamo, Inc. Systems and methods for animating the faces of 3d characters using images of human faces

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102592309A (en) * 2011-12-26 2012-07-18 北京工业大学 Modeling method of nonlinear three-dimensional face
CN103593870A (en) * 2013-11-12 2014-02-19 杭州摩图科技有限公司 Picture processing device and method based on human faces
CN107067429A (en) * 2017-03-17 2017-08-18 徐迪 Video editing system and method that face three-dimensional reconstruction and face based on deep learning are replaced
CN108510437A (en) * 2018-04-04 2018-09-07 科大讯飞股份有限公司 A kind of virtual image generation method, device, equipment and readable storage medium storing program for executing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Notes on "A Morphable Model For The Synthesis Of 3D Faces"; likewind1993; CSDN, https://blog.csdn.net/likewind1993/article/details/79177566; 2018-01-28; main text, page 1 paragraph 2 to page 3 last paragraph *

Also Published As

Publication number Publication date
CN109377544A (en) 2019-02-22

Similar Documents

Publication Publication Date Title
CN109377544B (en) Human face three-dimensional image generation method and device and readable medium
CN108961369B (en) Method and device for generating 3D animation
CN110163054B (en) Method and device for generating human face three-dimensional image
CN111260764B (en) Method, device and storage medium for making animation
US20220084163A1 (en) Target image generation method and apparatus, server, and storage medium
KR20230003059A (en) Template-based generation of 3D object meshes from 2D images
CN113272870A (en) System and method for realistic real-time portrait animation
US8531464B1 (en) Simulating skin deformation relative to a muscle
CN115345980B (en) Generation method and device of personalized texture map
CN113658309B (en) Three-dimensional reconstruction method, device, equipment and storage medium
CN115049799B (en) Method and device for generating 3D model and virtual image
CN110490959B (en) Three-dimensional image processing method and device, virtual image generating method and electronic equipment
US11514638B2 (en) 3D asset generation from 2D images
US20220358675A1 (en) Method for training model, method for processing video, device and storage medium
US20220156987A1 (en) Adaptive convolutions in neural networks
JP2023029984A (en) Method, device, electronic apparatus, and readable storage medium for generating virtual image
CN115239861A (en) Face data enhancement method and device, computer equipment and storage medium
CN110458924B (en) Three-dimensional face model establishing method and device and electronic equipment
Marques et al. Deep spherical harmonics light probe estimator for mixed reality games
CN114202615A (en) Facial expression reconstruction method, device, equipment and storage medium
CN111754622B (en) Face three-dimensional image generation method and related equipment
CN110827341A (en) Picture depth estimation method and device and storage medium
CN116342782A (en) Method and apparatus for generating avatar rendering model
CN115775300A (en) Reconstruction method of human body model, training method and device of human body reconstruction model
CN113223128B (en) Method and apparatus for generating image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant