CN106097373A - Smiling face synthesis method based on a fractional sparse component analysis model - Google Patents

Smiling face synthesis method based on a fractional sparse component analysis model

Info

Publication number
CN106097373A
CN106097373A (application CN201610473441.2A)
Authority
CN
China
Prior art keywords
face
sigma
model
component analysis
projection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610473441.2A
Other languages
Chinese (zh)
Other versions
CN106097373B (en)
Inventor
王存刚 (Wang Cungang)
王斌 (Wang Bin)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Liaocheng University
Original Assignee
Liaocheng University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Liaocheng University filed Critical Liaocheng University
Priority to CN201610473441.2A priority Critical patent/CN106097373B/en
Publication of CN106097373A publication Critical patent/CN106097373A/en
Application granted granted Critical
Publication of CN106097373B publication Critical patent/CN106097373B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00: Animation
    • G06T13/20: 3D [Three Dimensional] animation
    • G06T13/40: 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20081: Training; Learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/30: Subject of image; Context of image processing
    • G06T2207/30196: Human being; Person
    • G06T2207/30201: Face

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present invention relates to a smiling face synthesis method based on a fractional sparse component analysis model. First, a fractional sparse component analysis model for face representation is derived. Then, reconstruction and projection rules are given based on this model. Next, the projection rule is used to obtain projection coefficients, and the reconstruction rule is used to rebuild the input face. The projection and reconstruction process is then repeated on the rebuilt face. Finally, the sequence of reconstructed face images is output as the smile synthesis process of the input face. The notable effects of the invention are that the synthesized faces are natural and smooth and the synthesized smiles are realistic.

Description

Smiling face synthesis method based on fractional sparse component analysis model
Technical Field
The invention relates to a smiling face synthesis method based on a fractional sparse component analysis model, which can be widely applied in fields such as film production, virtual communities, game entertainment and animation synthesis. It belongs to the fields of computer vision, pattern recognition and human-computer interaction.
Background
As a medium for conveying human emotional and mental states, the human face plays a very important role in information transfer and expression in social communication. In recent years, the reconstruction and synthesis of realistic facial expressions using computer technology has become a research hotspot in computer vision, human-computer interaction and computer graphics, with wide application in industries such as advertising, animation, film and games. Research on the automatic synthesis of facial expressions by computer dates back to the 1970s, and many corresponding algorithms have appeared since. The main approaches to facial expression synthesis include: expression synthesis based on blending samples; direct expression transfer; sketch-based expression synthesis; and expression synthesis based on machine learning. Among these, machine learning methods are the most widely used and have produced the most research results. Machine-learning-based facial expression synthesis generally analyzes a given set of training samples, extracts the effective facial expression variations, and then uses them to reconstruct and synthesize facial expressions.
A survey of the existing literature shows that machine-learning-based face synthesis methods fall mainly into the following categories. The first is based on Canonical Correlation Analysis (CCA), for example the paper "Real-time data driven deformation using kernel canonical correlation analysis" published by Wei-Wen Feng and Byung-Uck Kim in ACM Transactions on Graphics, 2008. This method divides the face model into components according to the distribution of feature points, uses canonical correlation analysis to learn the mapping between feature points and components from training samples, and computes the facial expression using a Poisson-based deformation technique. Its drawback is that the expression control points must be determined in advance and cannot be changed during subsequent computation. The second category synthesizes facial expressions based on Independent Component Analysis (ICA), for example "Unsupervised learning for speech motion editing" published by Yong Cao and Petros Faloutsos in the Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, 2003. This method uses independent component analysis to extract facial expression parameters and decomposes the initial face data into an expression part and a speech part; the facial expression can then be edited by manipulating each component. The last category is based on Multidimensional Morphable Models (MMM), for example "Transferable videorealistic speech animation" published by Yao-Jen Chang and Tony Ezzat in the Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, 2005, which transfers the facial expressions of other characters onto the current model and thereby completes expression synthesis. In these methods, the face model is overly complex, the computational cost is too high, and the learned face parts are not flexible enough and lack robustness.
Disclosure of Invention
The object of the invention is to provide a smiling face synthesis method based on a fractional sparse component analysis model that overcomes the shortcomings of existing methods, and to synthesize realistic face images when only a small number of labeled training samples are available. On the basis of well-aligned, structurally consistent face images, the invention designs a generative model, the fractional sparse component analysis model, performs face representation based on this model, and constructs projection and reconstruction rules to complete smile synthesis. Owing to the typical structure of the human face, the partitioned approach works well for face representation: it can capture semantic parts corresponding to the training images and represent an image as a combination of those parts. Moreover, the bases learned by sparse component analysis are closer to the samples, which makes the face representation more robust. The fractional sparse component analysis model combines the advantages of partitioned methods and sparse component analysis.
In order to achieve the above object, the invention is realized by the following technical solution. A smiling face synthesis method based on a fractional sparse component analysis model comprises the following steps: first, derive a fractional sparse component analysis model for face representation; then, give reconstruction and projection rules based on this model; next, use the projection rule to obtain projection coefficients and the reconstruction rule to reconstruct the input face; then, repeat the projection and reconstruction process on the reconstructed face; finally, output the sequence of reconstructed face images as the smile synthesis process of the input face.
A smiling face synthesis method based on a fractional sparse component analysis model comprises the following specific steps:
step one, learning and constructing a generative model for representing human faces;
the generative model finds a common spatial segmentation for a given face picture and learns a sparse component analysis model for each part of the segmentation;
Let the probability distribution of the generative model be $P(x, Z, R)$, computed as $P(x, Z, R) = P(x \mid R, Z)\,P(R)\,P(Z)$. First, a continuity-inducing prior of the generative model is derived, and it is combined with a multinomial prior into $P(R)$ using a PoE (Product of Experts), namely:

$$P(R) = \prod_d P(r_{d\cdot}) = \prod_d \frac{1}{Z_d} P_1(r_{d\cdot})\,P_2(r_{d\cdot}) = \frac{1}{Z_0} \prod_{d,k} \alpha_{dk}^{r_{dk}} \prod_{d,k} \exp\left\{ -r_{dk}\left(1 - \frac{4}{|D_d|} \sum_{d' \in D_d} r_{d'k}\right) \right\} \qquad (1)$$

where $Z_0$ is a normalization constant; $r_{d\cdot} = (r_{d1}, \ldots, r_{dK})^T$ is the variable recording the part selection of face-image pixel $d$, with $r_{d\cdot} \sim \mathrm{Mult}(n=1, \alpha_{d\cdot})$; $D_d$ is the set of variables adjacent to $r_{dk}$; $P_1(r_{d\cdot})$ is the multinomial prior and $P_2(r_{d\cdot})$ is the continuity-inducing prior.
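To make the prior concrete, the unnormalized log of equation (1) can be evaluated directly. The following is an illustrative Python sketch, not the patent's implementation (the embodiment below is programmed in Visual C++); the array shapes and the `neighbors` structure encoding the adjacency sets $D_d$ are assumptions:

```python
import numpy as np

def log_prior_R(R, alpha, neighbors):
    """Unnormalized log P(R) from equation (1): a multinomial expert times a
    continuity-inducing expert, combined by a Product of Experts.

    R         : (D, K) binary array, one-hot rows; R[d, k] = r_dk
    alpha     : (D, K) multinomial parameters alpha_dk (assumed positive)
    neighbors : list of length D; neighbors[d] = indices of D_d, pixels adjacent to d
    """
    log_p = np.sum(R * np.log(alpha))            # multinomial expert: sum_dk r_dk log alpha_dk
    for d in range(len(neighbors)):              # continuity-inducing expert
        frac = 4.0 / len(neighbors[d]) * R[neighbors[d], :].sum(axis=0)
        log_p -= np.sum(R[d, :] * (1.0 - frac))  # -r_dk (1 - (4/|D_d|) sum_{d' in D_d} r_d'k)
    return log_p                                 # equals log P(R) + log Z_0
```

Both experts act per pixel: the multinomial expert scores the chosen part, while the continuity expert rewards agreement with the part choices of neighboring pixels.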
Then each part of the face is modeled as a sparse component analysis model, and a sparse component analysis model is learned for each face part, yielding $P(x \mid R, Z)$ and $P(Z)$. Specifically, for a face pixel $x_d$, one part, denoted $k$, is first selected from the $K$ face parts, and the pixel is generated from the $k$-th sparse component analysis model accordingly. This gives the following model:

$$x_d = \sum_{k=1}^{K} r_{dk} \sum_{m=1}^{M} z_{km} w_{dkm} + \mu_d + \epsilon_d$$

$$r_{d\cdot} \sim \mathrm{Mult}(n=1, \alpha_{d\cdot})$$

$$z_{km} \sim \mathrm{Lap}(u=0, b=1)$$

where $x_d$ is a face pixel, $\{w_{dkm}\}$ is an overcomplete basis for the $k$-th part, $M$ is the number of bases, $\mu_d$ is a mean, $\epsilon_d$ is random noise, and $\alpha_{d\cdot} = (\alpha_{d1}, \ldots, \alpha_{dK})^T$ is the parameter of the multinomial distribution. From the above model, one can obtain:
$$P(x \mid R, Z) = \prod_d P(x_d \mid r_{d\cdot}, Z) = \prod_d N\!\left(x_d;\ \sum_k r_{dk} \sum_m z_{km} w_{dkm} + \mu_d,\ \sigma_d^2\right) \qquad (2)$$
$$P(Z) = \prod_{k,m} \frac{1}{2b} \exp\left(-\frac{|z_{km} - u|}{b}\right) \qquad (3)$$
Combining $P(R)$ (equation (1)), $P(Z)$ (equation (3)) and $P(x \mid R, Z)$ (equation (2)), the generative model $P(x, Z, R)$ of the face representation is derived:
$$P(x, Z, R) = P(x \mid R, Z)\,P(R)\,P(Z) = \prod_d N\!\left(x_d;\ \sum_k r_{dk} \sum_m z_{km} w_{dkm} + \mu_d,\ \sigma_d^2\right) \cdot \frac{1}{Z_0} \prod_{d,k} \alpha_{dk}^{r_{dk}} \prod_{d,k} \exp\left\{ -r_{dk}\left(1 - \frac{4}{|D_d|} \sum_{d' \in D_d} r_{d'k}\right) \right\} \cdot \prod_{k,m} \frac{1}{2b} \exp\left(-\frac{|z_{km} - u|}{b}\right) \qquad (4)$$
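Read generatively, equation (4) says a face is produced by choosing a part per pixel, drawing Laplace coefficients, and adding Gaussian noise. A minimal ancestral-sampling sketch in Python follows; it draws the part assignments from the multinomial factor alone (ignoring the continuity coupling of equation (1) for simplicity), and all shapes are assumptions:

```python
import numpy as np

def sample_face(W, mu, alpha, sigma, rng=None):
    """Ancestral sampling from the generative model of equation (4).

    W     : (D, K, M) overcomplete bases w_dkm
    mu    : (D,) pixel means mu_d
    alpha : (D, K) multinomial parameters alpha_d.
    sigma : (D,) pixel noise standard deviations sigma_d
    """
    rng = rng or np.random.default_rng()
    D, K, M = W.shape
    # r_d. ~ Mult(n=1, alpha_d.): each pixel selects exactly one part
    parts = np.array([rng.choice(K, p=alpha[d] / alpha[d].sum()) for d in range(D)])
    # z_km ~ Lap(u=0, b=1): Laplace-distributed combination coefficients
    Z = rng.laplace(loc=0.0, scale=1.0, size=(K, M))
    # x_d = sum_k r_dk sum_m z_km w_dkm + mu_d + eps_d, eps_d ~ N(0, sigma_d^2)
    mean = np.array([W[d, parts[d], :] @ Z[parts[d], :] for d in range(D)]) + mu
    return mean + rng.normal(0.0, sigma), parts, Z
```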
step two, calculating a projection coefficient for synthesizing the smile of the human face;
Let the learned fractional sparse component analysis model be $\theta$. The projection coefficients of a sample $X^c$ on the model $\theta$ are the means of the combination coefficients $z_{km}^c$.

The mean of each coefficient can be estimated by:

$$E_{P(z_{km}^c \mid X^c,\theta)}\big[z_{km}^c\big] \approx \frac{1}{J}\sum_{i=1}^{J} z_{km}^{c,i} \qquad (5)$$

where $z_{km}^{c,i}$ is a sample drawn from $P(z_{km}^c \mid X^c, \theta)$;
is a hidden variable zkmThe invention adopts Monte Carlo EM algorithm to estimate the posterior distribution of the hidden variables; the Monte Carlo method first uses the Gibbs sampling method to sample the samples of the hidden variables from the posterior, rdkThe Gibbs sample distribution of (d) can be expressed as:
$$P(r_{dk} \mid x, R_{-dk}, Z) = \frac{P(x, Z \mid R)\,P(r_{dk} \mid R_{-dk})\,P(R_{-dk})}{P(x, Z \mid R_{-dk})\,P(R_{-dk})} \propto P(x \mid Z, R)\,P(r_{dk} \mid R_{-dk}) \qquad (6)$$
where $R_{-dk}$ denotes the set of variables obtained by removing $r_{dk}$ from $R$. Substituting equations (1) and (2) into the above expression yields:
$$P(r_{dk} \mid x, R_{-dk}, Z) \propto N\!\left(x_d;\ \sum_k r_{dk}\sum_m z_{km} w_{dkm} + \mu_d,\ \sigma_d^2\right) \cdot \exp\left\{ r_{dk}\left(\sum_{d' \in D_d} r_{d'k} - 1\right)\right\} \alpha_{dk}^{r_{dk}} \qquad (7)$$
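A single Gibbs update for the part-selection row $r_{d\cdot}$ follows directly from equation (7): score each candidate part by the Gaussian likelihood of pixel $x_d$ combined with the continuity and multinomial factors, then resample. A hedged Python sketch, with the array shapes and the `neighbors` structure as assumptions:

```python
import numpy as np

def gibbs_update_r(d, x, R, Z, W, mu, sigma, alpha, neighbors, rng=None):
    """One Gibbs step for r_d. using equation (7): for each candidate part k,
    combine the Gaussian likelihood of pixel x_d with the continuity and
    multinomial factors, then resample the one-hot row R[d]."""
    rng = rng or np.random.default_rng()
    K = R.shape[1]
    nb_counts = R[neighbors[d], :].sum(axis=0)   # sum_{d' in D_d} r_d'k, per part
    log_p = np.empty(K)
    for k in range(K):
        mean_k = W[d, k, :] @ Z[k, :] + mu[d]    # predicted x_d if part k is chosen
        log_lik = -0.5 * ((x[d] - mean_k) / sigma[d]) ** 2
        log_p[k] = log_lik + (nb_counts[k] - 1.0) + np.log(alpha[d, k])
    p = np.exp(log_p - log_p.max())              # stabilize before normalizing
    R[d] = np.eye(K)[rng.choice(K, p=p / p.sum())]
    return R
```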
The sampling distribution of $z_{km}$ can be derived similarly:
$$P(z_{km} \mid x, R, Z_{-km}) \propto \begin{cases} N\big(z_{km};\ \mu_{z_{km}}^{+},\ \sigma_{z_{km}}^{2+}\big) & z_{km} \ge 0 \quad (P_+) \\ N\big(z_{km};\ \mu_{z_{km}}^{-},\ \sigma_{z_{km}}^{2-}\big) & z_{km} \le 0 \quad (P_-) \end{cases} \qquad (8)$$
where

$$\sigma_{z_{km}}^{2+} = \sigma_{z_{km}}^{2-} = 1 \Big/ \left[\sum_d \big(r_{dk} w_{dkm} / \sigma_d\big)^2\right].$$
The resulting sampling distribution is discontinuous and can be sampled as follows:

(1) sample from the distribution $P_+$ in equation (8) and output the non-negative samples ($z_{km} \ge 0$);

(2) sample from the distribution $P_-$ in equation (8) and output the non-positive samples ($z_{km} \le 0$);

(3) combine the two sets of samples as the output. The $z_{km}^{c,i}$ are the samples drawn from the distribution in this way. To synthesize a given image $X^c$ with the model $\theta$, the face picture is first projected onto the model $\theta$ using equation (5), and the projection coefficients $E_{P(z_{km}^c \mid X^c,\theta)}[z_{km}^c]$ are calculated.
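The three-step scheme above amounts to drawing from two one-sided (truncated) Gaussians and pooling the draws, after which equation (5) is just their average. A sketch under the assumption that the one-sided means `mu_pos`, `mu_neg` and the shared variance `var` have already been computed from the current state; `scipy.stats.truncnorm` is used for the truncated draws:

```python
import numpy as np
from scipy.stats import truncnorm

def sample_z_km(mu_pos, mu_neg, var, n, rng=None):
    """Draw samples of z_km from the discontinuous distribution of equation (8):
    pool non-negative draws from P+ with non-positive draws from P-."""
    rng = rng or np.random.default_rng()
    sd = np.sqrt(var)
    # step (1): z >= 0 branch, N(mu+, var) truncated to [0, inf)
    pos = truncnorm.rvs((0.0 - mu_pos) / sd, np.inf, loc=mu_pos, scale=sd,
                        size=n, random_state=rng)
    # step (2): z <= 0 branch, N(mu-, var) truncated to (-inf, 0]
    neg = truncnorm.rvs(-np.inf, (0.0 - mu_neg) / sd, loc=mu_neg, scale=sd,
                        size=n, random_state=rng)
    # step (3): combine the two sets of samples as the output
    return np.concatenate([pos, neg])

def projection_coefficient(samples):
    """Equation (5): the projection coefficient is the Monte Carlo mean (1/J) sum_i z^{c,i}."""
    return samples.mean()
```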
Step three, giving a reconstruction rule of the smile synthesis of the human face;
After the projection coefficients have been calculated, the reconstruction of $X^c$ is the mean of the conditional distribution shown in equation (2):

$$\hat{x}_d^{\,c} = \sum_k \alpha_{dk} \sum_m E_{P(z_{km}^c \mid X^c,\theta)}\big[z_{km}^c\big]\, w_{dkm} + \mu_d \qquad (9)$$
The face is reconstructed on the model $\theta$ through equation (9);
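In array form, the reconstruction rule (9) is a weighted recombination of the learned bases. A minimal sketch, with the shapes as assumptions:

```python
import numpy as np

def reconstruct_face(alpha, E_z, W, mu):
    """Equation (9): x_hat_d = sum_k alpha_dk sum_m E[z^c_km] w_dkm + mu_d.

    alpha : (D, K) part weights alpha_dk
    E_z   : (K, M) projection coefficients from equation (5)
    W     : (D, K, M) learned bases w_dkm
    mu    : (D,) pixel means mu_d
    """
    return np.einsum('dk,km,dkm->d', alpha, E_z, W) + mu
```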
And step four, repeating step two and step three until all the intermediate faces are output, finally synthesizing the smile of the face.
The invention provides a smiling face synthesis method based on a fractional sparse component analysis model, comprising the following specific steps:

(1) Construct and learn a generative model for face representation and modeling. The generative model finds a common spatial segmentation for the given face pictures and learns a sparse component analysis model for each part of the segmentation, completing the face modeling. First, the continuity-inducing prior of the generative model is derived and combined with the multinomial prior into the total prior $P(R)$ of the model via a PoE (Product of Experts); then each part of the face is modeled as a sparse component analysis model, and a sparse component analysis model is learned for each face part, yielding $P(x \mid R, Z)$ and $P(Z)$.
(2) Calculate the projection coefficients for face smile synthesis on the learned fractional sparse component analysis model;

(3) after the projection coefficients have been calculated, reconstruct the face on the face model through the reconstruction rule of face smile synthesis;

(4) repeat the projection of step (2) and the reconstruction of step (3) on the reconstructed face;

(5) finally, output the sequence of reconstructed face images as the smile synthesis process of the input face.
In summary, the invention is a smiling face synthesis method based on a fractional sparse component analysis model. First, a generative model is constructed and learned for face representation: it finds a common spatial segmentation for the given face pictures and learns a sparse component analysis model for each part of the segmentation, completing the face modeling. Next, the projection coefficients of an input sample (a face image) on the learned model are computed; the projection coefficients are the means of a set of combination coefficients estimated by the Monte Carlo EM algorithm. Then the face is reconstructed on the fractional sparse component analysis model according to the reconstruction rule of face smile synthesis, and the projection and reconstruction process is repeated on the reconstructed face. Finally, the sequence of reconstructed face images is output as the smile synthesis process of the input face.
Compared with the prior art, the notable effects of the invention are as follows. Because a partitioned sparse component analysis model is used to model the face, the shapes of the learned face parts are very flexible (a benefit of the partitioned representation) and each part is robust (a benefit of modeling each part with a sparse component analysis model). Notably, the method allows input pictures that are not in the training set, i.e. pictures that are completely new to the trained model; in this case the input face can be stably reconstructed, and gradually acquires a smile, only if the learned model is flexible enough.
Drawings
FIG. 1 is a flowchart illustrating a method for synthesizing smiles of human faces according to the present invention.
FIG. 2 is a comparison of face parts learned by different methods. FIG. 2(1): face parts learned by MCFA; FIG. 2(2): parts learned by SSPCA; FIG. 2(3): parts learned by the present invention.
FIG. 3 shows the smile of the face synthesized by the present invention.
FIG. 4 is a comparison of smile synthesis on an example database between the present invention and classical methods.

FIG. 5 is a further comparison of smile synthesis on an example database between the present invention and classical methods.
Detailed Description
The technical solution of the present invention is explained in more detail below with reference to the accompanying drawings and specific embodiments. The following examples are carried out on the premise of the technical solution of the invention and give detailed embodiments and processes, but the scope of the invention is not limited to the following examples.
Example 1: a smiling face synthesis method based on a fractional sparse component analysis model is specifically realized by the following steps:
Step one, learning and constructing a generative model for representing human faces.
The generative model finds a common spatial segmentation for the given face pictures and learns a sparse component analysis model for each part of this segmentation. Let the probability distribution of the generative model be $P(x, Z, R)$, computed as $P(x, Z, R) = P(x \mid R, Z)\,P(R)\,P(Z)$. First, the continuity-inducing prior of the generative model is derived, and it is combined with the multinomial prior into $P(R)$ using a PoE (Product of Experts), namely:
$$P(R) = \prod_d P(r_{d\cdot}) = \prod_d \frac{1}{Z_d} P_1(r_{d\cdot})\,P_2(r_{d\cdot}) = \frac{1}{Z_0} \prod_{d,k} \alpha_{dk}^{r_{dk}} \prod_{d,k} \exp\left\{ -r_{dk}\left(1 - \frac{4}{|D_d|} \sum_{d' \in D_d} r_{d'k}\right) \right\} \qquad (1)$$

where $Z_0$ is a normalization constant; $r_{d\cdot} = (r_{d1}, \ldots, r_{dK})^T$ is the variable recording the part selection of face-image pixel $d$, with $r_{d\cdot} \sim \mathrm{Mult}(n=1, \alpha_{d\cdot})$; $D_d$ is the set of variables adjacent to $r_{dk}$; $P_1(r_{d\cdot})$ is the multinomial prior and $P_2(r_{d\cdot})$ is the continuity-inducing prior.
Then each part of the face is modeled as a sparse component analysis model, and a sparse component analysis model is learned for each face part to find $P(x \mid R, Z)$ and $P(Z)$. Specifically, for a face pixel $x_d$, one part, denoted $k$, is first selected from the $K$ face parts, and the pixel is generated from the $k$-th sparse component analysis model accordingly. This gives the model $x_d = \sum_{k=1}^{K} r_{dk} \sum_{m=1}^{M} z_{km} w_{dkm} + \mu_d + \epsilon_d$, with $r_{d\cdot} \sim \mathrm{Mult}(n=1, \alpha_{d\cdot})$ and $z_{km} \sim \mathrm{Lap}(u=0, b=1)$, where $x_d$ is a face pixel, $\{w_{dkm}\}$ is an overcomplete basis for the $k$-th part, $M$ is the number of bases, $\mu_d$ is a mean, $\epsilon_d$ is random noise, and $\alpha_{d\cdot} = (\alpha_{d1}, \ldots, \alpha_{dK})^T$ is the parameter of the multinomial distribution. From the above model, one can obtain:
$$P(x \mid R, Z) = \prod_d P(x_d \mid r_{d\cdot}, Z) = \prod_d N\!\left(x_d;\ \sum_k r_{dk} \sum_m z_{km} w_{dkm} + \mu_d,\ \sigma_d^2\right) \qquad (2)$$
$$P(Z) = \prod_{k,m} \frac{1}{2b} \exp\left(-\frac{|z_{km} - u|}{b}\right) \qquad (3)$$
Combining $P(R)$ (equation (1)), $P(Z)$ (equation (3)) and $P(x \mid R, Z)$ (equation (2)), the generative model $P(x, Z, R)$ of the face representation in the present invention is derived:
$$P(x, Z, R) = P(x \mid R, Z)\,P(R)\,P(Z) = \prod_d N\!\left(x_d;\ \sum_k r_{dk} \sum_m z_{km} w_{dkm} + \mu_d,\ \sigma_d^2\right) \cdot \frac{1}{Z_0} \prod_{d,k} \alpha_{dk}^{r_{dk}} \prod_{d,k} \exp\left\{ -r_{dk}\left(1 - \frac{4}{|D_d|} \sum_{d' \in D_d} r_{d'k}\right) \right\} \cdot \prod_{k,m} \frac{1}{2b} \exp\left(-\frac{|z_{km} - u|}{b}\right) \qquad (4)$$
and step two, calculating a projection coefficient for synthesizing the smile of the human face.
Let the learned fractional sparse component analysis model be $\theta$. The projection coefficients of a sample $X^c$ on the model $\theta$ are the means of the combination coefficients $z_{km}^c$. The mean of each coefficient can be estimated by:
$$E_{P(z_{km}^c \mid X^c,\theta)}\big[z_{km}^c\big] \approx \frac{1}{J}\sum_{i=1}^{J} z_{km}^{c,i} \qquad (5)$$
where $z_{km}^{c,i}$ is a sample drawn from $P(z_{km}^c \mid X^c, \theta)$. The computation of $P(z_{km}^c \mid X^c, \theta)$ is as follows.
$P(z_{km}^c \mid X^c, \theta)$ is the posterior distribution of the hidden variable $z_{km}$; the present invention estimates it with the Monte Carlo EM algorithm. The Monte Carlo method first uses Gibbs sampling to draw samples of the hidden variables from the posterior; the Gibbs sampling distribution of $r_{dk}$ can be expressed as:
$$P(r_{dk} \mid x, R_{-dk}, Z) = \frac{P(x, Z \mid R)\,P(r_{dk} \mid R_{-dk})\,P(R_{-dk})}{P(x, Z \mid R_{-dk})\,P(R_{-dk})} \propto P(x \mid Z, R)\,P(r_{dk} \mid R_{-dk}) \qquad (6)$$
where $R_{-dk}$ denotes the set of variables obtained by removing $r_{dk}$ from $R$. Substituting equations (1) and (2) into the above expression yields:
$$P(r_{dk} \mid x, R_{-dk}, Z) \propto N\!\left(x_d;\ \sum_k r_{dk}\sum_m z_{km} w_{dkm} + \mu_d,\ \sigma_d^2\right) \cdot \exp\left\{ r_{dk}\left(\sum_{d' \in D_d} r_{d'k} - 1\right)\right\} \alpha_{dk}^{r_{dk}} \qquad (7)$$
The sampling distribution of $z_{km}$ can be derived similarly:
$$P(z_{km} \mid x, R, Z_{-km}) \propto \begin{cases} N\big(z_{km};\ \mu_{z_{km}}^{+},\ \sigma_{z_{km}}^{2+}\big) & z_{km} \ge 0 \quad (P_+) \\ N\big(z_{km};\ \mu_{z_{km}}^{-},\ \sigma_{z_{km}}^{2-}\big) & z_{km} \le 0 \quad (P_-) \end{cases} \qquad (8)$$
where

$$\sigma_{z_{km}}^{2+} = \sigma_{z_{km}}^{2-} = 1 \Big/ \left[\sum_d \big(r_{dk} w_{dkm} / \sigma_d\big)^2\right].$$
the resulting sampling profile is discontinuous and can be effectively sampled from the profile by: (1) distribution P from equation (8)+Sampling and outputting non-negative samples (z)kmNot less than 0); (2) sampling from the distribution P-in equation (8), and outputting a negative sample (z)kmLess than or equal to 0); (3) the two sets of samples are combined as output.I.e. the samples sampled from the distribution in this way. Synthesizing a given image X using a model thetacFirst, the human face picture is projected onto the model theta by using the formula (5), and the projection coefficient is calculated
And step three, giving a reconstruction rule of the smile synthesis of the human face.
After calculating the projection coefficients, XcIs the mean of the conditional distribution shown in equation (2):
x ^ d c = Σ k α d k Σ m E P ( z k m c | X c , θ ) [ z k m c ] w d k m + u d - - - ( 9 )
and (4) reconstructing the human face on the model theta through the formula (9).
And step four, repeating step two and step three until all the intermediate faces are output, finally synthesizing the smile of the face.
A specific application of the invention is to learn a face's smile from training samples and then transfer the learned expression to a newly input face. This is useful in animation and synthesis, for example to automatically synthesize a smile video from a given face. The training samples are obtained from facial expression videos.
Example 2: a smiling face synthesis method based on a fractional sparse component analysis model, implemented by the following specific steps (programmed in the Visual C++ language):
1. Learn the prior of the face parts to realize the fractional representation of the face.
This embodiment uses the CBCL face database to learn the face-part prior and transfers it into the smile synthesis experiments on the facial expression database. The number of face parts is set to K = 6 and the number of bases to M = 40. M works well over a large range (empirically M ∈ [20, 240]) and reasonable face-part priors can be learned. The bases learned for the face parts are shown in Fig. 2; for comparison, Fig. 2 also lists the bases learned by two other methods, Multiple Cause Factor Analysis (MCFA) and Structured Sparse Principal Component Analysis (SSPCA). In Fig. 2, panel (1) shows the face parts learned by MCFA, panel (2) those learned by SSPCA, and panel (3) those learned by the method of the present invention. As can be seen from Fig. 2, the "eyes" and "nose" learned by MCFA are reasonable but not contiguous. SSPCA learns contiguous and convex parts, but these parts divide the "mouth" into at least two distinct pieces. The parts learned by the invention are reasonable; see the "neck" in the bottom row.
2. Learn the generative model, i.e. the fractional sparse component analysis model.
The training samples are obtained from facial expression videos. This embodiment selects roughly 8 intermediate states (each state corresponding to one face image) between the neutral expression and the smile for each of 60 persons. For each state s (s = 1, ..., 8), a model $\theta_s$ is learned from 50 different face images, with the number of bases set to M = 40 (a smaller M was found to lose face details, while a larger M increases the computational complexity).
3. Given a face $x^0$ with a neutral expression, the smile is synthesized as follows (a code sketch of this loop is given after the list):

(1) project the face $x^0$ onto the first model $\theta_1$ and obtain the projection coefficients by equation (5) (projection);

(2) reconstruct a face on the model $\theta_1$ by equation (9), denoted $\hat{x}^1$ (reconstruction);

(3) take $\hat{x}^1$ as the input picture and repeat the projection step (1) and the reconstruction step (2) on the model $\theta_2$, and so on for the remaining models;

(4) output $\hat{x}^1, \ldots, \hat{x}^8$ as the result for the input face $x^0$.
The smiling faces synthesized by the method of the invention are shown in Fig. 3; from left to right, the synthesized face smiles gradually. Note that none of the input faces in this embodiment are in the training set, i.e. the input images are new to the learned model, and the input face can be stably reconstructed, gradually acquiring a smile, only if the learned model is flexible enough. As can be seen from Fig. 3, the faces synthesized by the method are natural and smooth. Figs. 4 and 5 compare the face smiles synthesized by the method of the invention with those of MCFA, SSPCA and the structured sparse latent space method (LSSS). As shown in Figs. 4 and 5, the faces synthesized by MCFA may not be smooth (see Fig. 5), and the results of SSPCA and LSSS are somewhat blurry. The results of the inventive method benefit from two aspects: the shapes of the face parts are flexible, and the sparse component analysis of each part is robust.

Claims (2)

1. A smiling face synthesis method based on a fractional sparse component analysis model, characterized in that: first, a fractional sparse component analysis model for face representation is derived; then, reconstruction and projection rules are given based on the model; next, projection coefficients are obtained using the projection rule, and the input face is reconstructed using the reconstruction rule; then, the projection and reconstruction process is repeated on the reconstructed face; and finally, the multiple reconstructed face images are output as the smile synthesis process of the input face.
2. The smiling face synthesis method according to claim 1, characterized by comprising the following specific steps:
step one, learning and constructing a generative model for representing human faces;
the generative model finds a common spatial segmentation for a given face picture and learns a sparse component analysis model for each part of the segmentation;
let the probability distribution of the generative model be $P(x, Z, R)$, computed as $P(x, Z, R) = P(x \mid R, Z)\,P(R)\,P(Z)$; first, a continuity-inducing prior of the generative model is derived, and it is combined with a multinomial prior into $P(R)$ using a PoE (Product of Experts), namely:

$$P(R) = \prod_d P(r_{d\cdot}) = \prod_d \frac{1}{Z_d} P_1(r_{d\cdot})\,P_2(r_{d\cdot}) = \frac{1}{Z_0} \prod_{d,k} \alpha_{dk}^{r_{dk}} \prod_{d,k} \exp\left\{ -r_{dk}\left(1 - \frac{4}{|D_d|} \sum_{d' \in D_d} r_{d'k}\right) \right\} \qquad (1)$$

where $Z_0$ is a normalization constant; $r_{d\cdot} = (r_{d1}, \ldots, r_{dK})^T$ is the variable recording the part selection of face-image pixel $d$, with $r_{d\cdot} \sim \mathrm{Mult}(n=1, \alpha_{d\cdot})$; $D_d$ is the set of variables adjacent to $r_{dk}$; $P_1(r_{d\cdot})$ is the multinomial prior and $P_2(r_{d\cdot})$ is the continuity-inducing prior;
then, each part of the face is modeled as a sparse component analysis model, a sparse component analysis model is learned for each face part, and $P(x \mid R, Z)$ and $P(Z)$ are solved; specifically, for a face pixel $x_d$, one part, denoted $k$, is first selected from the $K$ face parts, and a pixel is generated from the $k$-th sparse component analysis model accordingly; the following model is obtained:

$$x_d = \sum_{k=1}^{K} r_{dk} \sum_{m=1}^{M} z_{km} w_{dkm} + \mu_d + \epsilon_d;$$

$$r_{d\cdot} \sim \mathrm{Mult}(n=1, \alpha_{d\cdot});$$

$$z_{km} \sim \mathrm{Lap}(u=0, b=1);$$

where $x_d$ is a face pixel, $\{w_{dkm}\}$ is an overcomplete basis for the $k$-th part, $M$ is the number of bases, $\mu_d$ is a mean, $\epsilon_d$ is random noise, and $\alpha_{d\cdot} = (\alpha_{d1}, \ldots, \alpha_{dK})^T$ is the parameter of the multinomial distribution; from the above model, one can obtain:
$$P(x \mid R, Z) = \prod_d P(x_d \mid r_{d\cdot}, Z) = \prod_d N\!\left(x_d;\ \sum_k r_{dk} \sum_m z_{km} w_{dkm} + \mu_d,\ \sigma_d^2\right) \qquad (2)$$
$$P(Z) = \prod_{k,m} \frac{1}{2b} \exp\left(-\frac{|z_{km} - u|}{b}\right) \qquad (3)$$
combining $P(R)$ (equation (1)), $P(Z)$ (equation (3)) and $P(x \mid R, Z)$ (equation (2)), the generative model $P(x, Z, R)$ of the face representation is derived:
$$P(x, Z, R) = P(x \mid R, Z)\,P(R)\,P(Z) = \prod_d N\!\left(x_d;\ \sum_k r_{dk} \sum_m z_{km} w_{dkm} + \mu_d,\ \sigma_d^2\right) \cdot \frac{1}{Z_0} \prod_{d,k} \alpha_{dk}^{r_{dk}} \prod_{d,k} \exp\left\{ -r_{dk}\left(1 - \frac{4}{|D_d|} \sum_{d' \in D_d} r_{d'k}\right) \right\} \cdot \prod_{k,m} \frac{1}{2b} \exp\left(-\frac{|z_{km} - u|}{b}\right) \qquad (4)$$

step two, calculating the projection coefficients for synthesizing the smile of the human face;
let the learned fractional sparse component analysis model be $\theta$; the projection coefficients of a sample $X^c$ on the model $\theta$ are the means of the combination coefficients $z_{km}^c$;

the mean of each coefficient can be estimated by:

$$E_{P(z_{km}^c \mid X^c,\theta)}\big[z_{km}^c\big] \approx \frac{1}{J}\sum_{i=1}^{J} z_{km}^{c,i} \qquad (5)$$

where $z_{km}^{c,i}$ is a sample drawn from $P(z_{km}^c \mid X^c, \theta)$;
$P(z_{km}^c \mid X^c, \theta)$ is the posterior distribution of the hidden variable $z_{km}$; the invention adopts the Monte Carlo EM algorithm to estimate the posterior distribution of the hidden variables; the Monte Carlo method first uses Gibbs sampling to draw samples of the hidden variables from the posterior, and the Gibbs sampling distribution of $r_{dk}$ can be expressed as:
$$P(r_{dk} \mid x, R_{-dk}, Z) = \frac{P(x, Z \mid R)\,P(r_{dk} \mid R_{-dk})\,P(R_{-dk})}{P(x, Z \mid R_{-dk})\,P(R_{-dk})} \propto P(x \mid Z, R)\,P(r_{dk} \mid R_{-dk}) \qquad (6)$$
where $R_{-dk}$ denotes the set of variables obtained by removing $r_{dk}$ from $R$; substituting equations (1) and (2) into the above expression yields:
$$P(r_{dk} \mid x, R_{-dk}, Z) \propto N\!\left(x_d;\ \sum_k r_{dk}\sum_m z_{km} w_{dkm} + \mu_d,\ \sigma_d^2\right) \cdot \exp\left\{ r_{dk}\left(\sum_{d' \in D_d} r_{d'k} - 1\right)\right\} \alpha_{dk}^{r_{dk}} \qquad (7)$$
the sampling distribution of $z_{km}$ can be derived similarly:
$$P(z_{km} \mid x, R, Z_{-km}) \propto \begin{cases} N\big(z_{km};\ \mu_{z_{km}}^{+},\ \sigma_{z_{km}}^{2+}\big) & z_{km} \ge 0 \quad (P_+) \\ N\big(z_{km};\ \mu_{z_{km}}^{-},\ \sigma_{z_{km}}^{2-}\big) & z_{km} \le 0 \quad (P_-) \end{cases} \qquad (8)$$
where $\sigma_{z_{km}}^{2+} = \sigma_{z_{km}}^{2-} = 1 \big/ \big[\sum_d (r_{dk} w_{dkm} / \sigma_d)^2\big];$
the resulting sampling distribution is discontinuous and can be sampled as follows:

(1) sample from the distribution $P_+$ in equation (8) and output the non-negative samples ($z_{km} \ge 0$);

(2) sample from the distribution $P_-$ in equation (8) and output the non-positive samples ($z_{km} \le 0$);

(3) combine the two sets of samples as the output; the $z_{km}^{c,i}$ are the samples drawn from the distribution in this way; to synthesize a given image $X^c$ with the model $\theta$, the face picture is first projected onto the model $\theta$ using equation (5), and the projection coefficients $E_{P(z_{km}^c \mid X^c,\theta)}[z_{km}^c]$ are calculated;
Step three, giving a reconstruction rule of the smile synthesis of the human face;
after the projection coefficients have been calculated, the reconstruction of $X^c$ is the mean of the conditional distribution shown in equation (2):

$$\hat{x}_d^{\,c} = \sum_k \alpha_{dk} \sum_m E_{P(z_{km}^c \mid X^c,\theta)}\big[z_{km}^c\big]\, w_{dkm} + \mu_d \qquad (9)$$
the face is reconstructed on the model $\theta$ through equation (9);
and step four, repeating step two and step three until all the intermediate faces are output, finally synthesizing the smile of the face.
CN201610473441.2A 2016-06-24 2016-06-24 Smiling face synthesis method based on a fractional sparse component analysis model Expired - Fee Related CN106097373B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610473441.2A CN106097373B (en) 2016-06-24 2016-06-24 Smiling face synthesis method based on a fractional sparse component analysis model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610473441.2A CN106097373B (en) 2016-06-24 2016-06-24 Smiling face synthesis method based on a fractional sparse component analysis model

Publications (2)

Publication Number Publication Date
CN106097373A true CN106097373A (en) 2016-11-09
CN106097373B CN106097373B (en) 2018-11-02

Family

ID=57253507

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610473441.2A Expired - Fee Related CN106097373B (en) 2016-06-24 2016-06-24 Smiling face synthesis method based on a fractional sparse component analysis model

Country Status (1)

Country Link
CN (1) CN106097373B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6950104B1 (en) * 2000-08-30 2005-09-27 Microsoft Corporation Methods and systems for animating facial features, and methods and systems for expression transformation
CN1920880A (en) * 2006-09-14 2007-02-28 浙江大学 Video flow based people face expression fantasy method
CN103198508A (en) * 2013-04-07 2013-07-10 河北工业大学 Human face expression animation generation method
CN104299250A (en) * 2014-10-15 2015-01-21 南京航空航天大学 Front face image synthesis method and system based on prior model
CN105426836A (en) * 2015-11-17 2016-03-23 上海师范大学 Single-sample face recognition method based on segmented model and sparse component analysis

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112927328A (en) * 2020-12-28 2021-06-08 北京百度网讯科技有限公司 Expression migration method and device, electronic equipment and storage medium
CN112927328B (en) * 2020-12-28 2023-09-01 北京百度网讯科技有限公司 Expression migration method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN106097373B (en) 2018-11-02


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20181102