CN102254336B - Method and device for synthesizing face video - Google Patents


Info

Publication number
CN102254336B
CN102254336B (application CN201110197873A)
Authority
CN
China
Prior art keywords
face
expression
image
facial image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 201110197873
Other languages
Chinese (zh)
Other versions
CN102254336A (en)
Inventor
刘烨斌
李凯
王好谦
徐枫
戴琼海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN201110197873A (CN102254336B)
Publication of CN102254336A
Application granted
Publication of CN102254336B
Active legal-status
Anticipated expiration legal-status

Abstract

The invention discloses a method and a device for synthesizing face video. The method comprises the following steps: establishing a target-person expression database comprising multiple frames of face images; preprocessing the multiple frames of face images to extract the face pose and the facial expression in each frame; performing a joint similarity retrieval between the face pose and expression in a user-defined face image sequence and the face pose and expression in the multiple frames of face images, to obtain a retrieved image sequence matching the user-defined face image sequence; and applying a warping transformation and smoothing to the retrieved image sequence. With the method and device disclosed by the embodiments of the invention, a lifelike image sequence of the target person can be synthesized conveniently for any user-defined expression, with a high degree of automation.

Description

Method and device for synthesizing face video
Technical field
The present invention relates to the field of computer graphics, and in particular to a method and device for synthesizing face video.
Background technology
In game production, film production, and virtual reality, facial expression simulation techniques have developed rapidly and many expression synthesis methods have been proposed, yet synthesizing the expressions of a real person still cannot meet practical demands. On the one hand, expression images produced by sample-based image deformation techniques lack realism and cannot satisfy the realism requirements of long expression sequences. On the other hand, facial expression capture technology can transfer an actor's expression to another character; this technology is relatively mature and has been applied in many film special effects (such as the 3D film "Avatar"), but it is complicated to implement, and the actor must wear a helmet, which is very inconvenient. Moreover, the actor often has to perform a given expression repeatedly until the quality requirements are met, which is very burdensome for the actor.
To synthesize a lifelike expression of a real person, the main difficulty lies in synthesizing the geometry and texture characteristics of the face: when a person makes an expression, the facial contour changes with the underlying muscles, while ambient lighting simultaneously produces alternating bright and dark regions. The Facial Action Coding System (FACS) has been widely used in expression analysis and synthesis; it divides the face into a series of Action Units (AUs) and specifies the muscle movements corresponding to some basic expressions, so that different facial expressions can be produced by varying the combinations of these action units. Although this approach can yield fairly realistic facial expressions, it cannot be automated; that is, given one person's expression, it cannot map that expression onto another person while minimizing human intervention.
Summary of the invention
The purpose of the present invention is to solve at least one of the above technical deficiencies. To this end, the present invention provides a face video synthesis method and device with the following advantages: for any user-defined expression, a realistic image sequence of the target person can be synthesized conveniently; the number of times an actor must perform is reduced; and the degree of automation is high.
According to one aspect of the present invention, a face video synthesis method is provided, comprising the steps of: establishing a target-person expression database containing multiple frames of face images; preprocessing the multiple frames of face images to extract the face pose and the facial expression in each frame; performing a joint similarity retrieval between the face pose and expression in a user-defined face image sequence and the face pose and expression in the multiple frames of face images, to obtain a retrieved image sequence matching the user-defined face image sequence; and applying a warping transformation and smoothing to the retrieved image sequence.
According to the face video synthesis method of the embodiment of the invention, a realistic image sequence of the target person can be synthesized conveniently for any user-defined expression.
According to one embodiment of the present invention, the step of establishing the target-person expression database comprises: collecting a plurality of basic expression image sequences of the target person from a plurality of viewing angles and a plurality of poses, wherein each of the basic expression image sequences consists of a neutral expression and the process of changing from the neutral expression.
According to the face video synthesis method of the embodiment of the invention, the target-person expression database can contain a large number of images.
According to one embodiment of the present invention, the step of preprocessing the multiple frames of face images comprises: extracting the face pose in each frame and describing it with yaw, roll, and pitch angles; and aligning the face in each frame to a frontal standard pose and performing feature landmark detection, describing the facial expression with the detected feature landmarks.
According to the face video synthesis method of the embodiment of the invention, the face pose and expression can be described accurately.
According to one embodiment of the present invention, the feature landmark detection is performed based on an Active Shape Model (ASM).
According to one embodiment of the present invention, the step of performing the joint similarity retrieval between the pose and expression of the face in the user-defined face image sequence and the pose and expression of the face in the multiple frames of face images comprises: computing the pose similarity between the face in the user-defined face image sequence and the face in the multiple frames of face images according to

D_pose = L(|Y_i − Y_j|) + L(|R_i − R_j|) + L(|P_i − P_j|),

where Y denotes the yaw angle, R the roll angle, and P the pitch angle, I_i denotes any frame in the user-defined face image sequence, I_j denotes any frame in the multiple frames of face images, and L(d) is a sigmoid function defined as

L(d) = 1 / (1 + e^(−γ(d − T)/σ)),

where γ = ln 99, and T and σ are the statistical mean and standard deviation of the variable d, respectively; registering the face pose in the user-defined face image sequence to the frontal pose according to the yaw, roll, and pitch angles, and then computing the expression similarity between the face in the user-defined face image sequence and the face in the multiple frames of face images according to

D_expression(i, j) = Σ_{k=1}^{J} w_k · L(|A_{i,k} − A_{j,k}|),

where A_{i,k} is the k-th feature landmark of image I_i after alignment, A_{j,k} is the k-th feature landmark of image I_j after alignment, J is the number of feature landmarks, w_k is the weight of the k-th landmark, L(d) is the sigmoid function defined above, and the computed landmark distances are normalized to [0, 1]; obtaining the joint similarity according to

D(i, j) = D_pose(i, j) + λ · D_expression(i, j),

where the parameter λ adjusts the relative weight of the expression similarity against the pose similarity; and obtaining, according to the joint similarity, a plurality of candidate images in the multiple frames of face images for each image I_i in the user-defined face image sequence.
According to the face video synthesis method of the embodiment of the invention, a suitable joint similarity between face pose and expression can be obtained, yielding a set of qualified candidate images.
According to one embodiment of the present invention, for the plurality of candidate images obtained by the joint similarity retrieval, the joint similarities between the candidate images are further computed, and Dijkstra's shortest-path algorithm is used to obtain the retrieved image sequence, so as to guarantee temporal continuity and spatial consistency.
According to the face video synthesis method of the embodiment of the invention, the accuracy and smoothness of the retrieved image sequence can be guaranteed.
According to another aspect of the present invention, a face video synthesis device is provided, comprising: a target-person expression database for storing multiple frames of face images; a preprocessing module for extracting the face pose and the facial expression in each frame of the face images in the target-person expression database; a joint similarity retrieval module for performing the joint similarity retrieval between the face pose and expression in a user-defined face image sequence and the face pose and expression in the multiple frames of face images, to obtain a retrieved image sequence matching the user-defined face image sequence; and a post-processing module for applying a warping transformation and smoothing to the retrieved image sequence.
According to the face video synthesis device of the embodiment of the invention, a realistic image sequence of the target person can be synthesized conveniently for any user-defined expression.
According to one embodiment of the present invention, the multiple frames of face images comprise a plurality of basic expression image sequences of the target person collected from a plurality of viewing angles and a plurality of poses, wherein each of the basic expression image sequences consists of a neutral expression and the process of changing from the neutral expression.
According to the face video synthesis device of the embodiment of the invention, the target-person expression database can contain a large number of images.
According to one embodiment of the present invention, the preprocessing module is further configured to: extract the face pose in each frame of the multiple frames of face images and describe it with yaw, roll, and pitch angles; and align the face in each frame to the frontal standard pose and describe the facial expression with the feature landmarks.
According to the face video synthesis device of the embodiment of the invention, the face pose and expression can be described accurately.
According to one embodiment of the present invention, the preprocessing module performs the feature landmark detection based on an Active Shape Model.
According to one embodiment of the present invention, the joint similarity retrieval module is further configured to: compute the pose similarity between the face in the user-defined face image sequence and the face in the multiple frames of face images according to

D_pose = L(|Y_i − Y_j|) + L(|R_i − R_j|) + L(|P_i − P_j|),

where Y denotes the yaw angle, R the roll angle, and P the pitch angle, I_i denotes any frame in the user-defined face image sequence, I_j denotes any frame in the multiple frames of face images, and L(d) is a sigmoid function defined as

L(d) = 1 / (1 + e^(−γ(d − T)/σ)),

where γ = ln 99, and T and σ are the statistical mean and standard deviation of the variable d, respectively; register the face pose in the user-defined face image sequence to the frontal pose according to the yaw, roll, and pitch angles, and then compute the expression similarity between the face in the user-defined face image sequence and the face in the multiple frames of face images according to

D_expression(i, j) = Σ_{k=1}^{J} w_k · L(|A_{i,k} − A_{j,k}|),

where A_{i,k} is the k-th feature landmark of image I_i after alignment, A_{j,k} is the k-th feature landmark of image I_j after alignment, J is the number of feature landmarks, w_k is the weight of the k-th landmark, L(d) is the sigmoid function defined above, and the computed landmark distances are normalized to [0, 1]; obtain the joint similarity according to

D(i, j) = D_pose(i, j) + λ · D_expression(i, j),

where the parameter λ adjusts the relative weight of the expression similarity against the pose similarity; and obtain, according to the joint similarity, a plurality of candidate images in the multiple frames of face images for each image I_i in the user-defined face image sequence.
According to the face video synthesis device of the embodiment of the invention, a suitable joint similarity between face pose and expression can be obtained, yielding a set of qualified candidate images.
According to one embodiment of the present invention, for the plurality of candidate images, the joint similarity retrieval module further uses Dijkstra's shortest-path algorithm, according to the joint similarities between the candidate images, to obtain the retrieved image sequence, so as to guarantee temporal continuity and spatial consistency.
According to the face video synthesis device of the embodiment of the invention, the accuracy and smoothness of the retrieved image sequence can be guaranteed.
Additional aspects and advantages of the present invention will be set forth in part in the following description, will in part become apparent from the description, or will be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic diagram of a face video synthesis method according to an embodiment of the present invention;
Fig. 2 is a flowchart of a face video synthesis method according to an embodiment of the present invention;
Fig. 3 is a flowchart of a joint similarity retrieval method according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of extracting facial feature landmarks based on an Active Shape Model (ASM) according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of obtaining the retrieved image sequence according to an embodiment of the present invention; and
Fig. 6 is a block diagram of a face video synthesis device according to an embodiment of the present invention.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which identical or similar reference numerals throughout denote identical or similar elements or elements with identical or similar functions. The embodiments described below with reference to the drawings are exemplary, intended only to explain the present invention, and are not to be construed as limiting the present invention.
It should also be noted that the terms "first", "second", and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature qualified by "first", "second", or "third" may explicitly or implicitly include one or more such features. Further, in the description of the present invention, unless otherwise stated, "a plurality of" means two or more.
Specific embodiments of the present invention are described below with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of a face video synthesis method according to an embodiment of the present invention. As shown in Fig. 1, the target-person expression database contains multiple frames of face images. According to the face pose and expression in the user-defined face image sequence, the face poses and expressions of the multiple frames of face images in the target-person expression database are searched to retrieve a matching image sequence, which is then warped and smoothed.
Fig. 2 is a flowchart of a face video synthesis method according to an embodiment of the present invention. As shown in Fig. 2, the face video synthesis method comprises the following steps.
Step S201: establish the target-person expression database, which contains multiple frames of face images. Specifically, the target person performs a plurality of basic expression image sequences under a multi-view acquisition system, repeating each basic expression several times while changing the head pose, so that expressions are collected from as many viewing angles and poses as possible. Each basic expression image sequence consists of a neutral expression and the process of changing from the neutral expression. The basic expressions include happiness, anger, sadness, disgust, fear, surprise, and so on.
Step S202: preprocess the multiple frames of face images to extract the face pose and the facial expression in each frame. This specifically comprises: extracting the face pose in each frame and describing it with yaw, roll, and pitch angles; and aligning the face in each frame to the frontal standard pose, performing feature landmark detection, and describing the facial expression with the detected landmarks.
Step S203: perform the joint similarity retrieval between the face pose and expression in the user-defined face image sequence and the face pose and expression in the multiple frames of face images, to obtain a retrieved image sequence matching the user-defined face image sequence. The specific steps of the joint similarity retrieval are described with reference to Fig. 3. Fig. 3 is a flowchart of the joint similarity retrieval method according to an embodiment of the present invention. As shown in Fig. 3, the joint similarity retrieval comprises the following steps.
Step S2031: compute the pose similarity between the face in the user-defined face image sequence and the face in the multiple frames of face images according to

D_pose = L(|Y_i − Y_j|) + L(|R_i − R_j|) + L(|P_i − P_j|),

where Y denotes the yaw angle, R the roll angle, and P the pitch angle, I_i denotes any frame in the user-defined face image sequence, I_j denotes any frame in the multiple frames of face images, and L(d) is a sigmoid function defined as

L(d) = 1 / (1 + e^(−γ(d − T)/σ)),

where γ = ln 99, and T and σ are the statistical mean and standard deviation of the variable d, respectively.
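As a concrete illustration, the pose distance of step S2031 can be sketched in Python as below. The function names, the use of degrees, and the per-angle (T, σ) statistics are illustrative assumptions, not taken from the patent:

```python
import math

GAMMA = math.log(99)  # gamma = ln 99, so L(T + sigma) ≈ 0.99 and L(T - sigma) ≈ 0.01


def sigmoid_l(d, t, sigma):
    """Sigmoid L(d) mapping an absolute difference d into (0, 1).

    t and sigma are the statistical mean and standard deviation of d.
    """
    return 1.0 / (1.0 + math.exp(-GAMMA * (d - t) / sigma))


def pose_distance(pose_i, pose_j, stats):
    """D_pose = L(|Y_i - Y_j|) + L(|R_i - R_j|) + L(|P_i - P_j|).

    pose_i, pose_j: (yaw, roll, pitch) triples, e.g. in degrees.
    stats: one (T, sigma) pair per angle.
    """
    return sum(
        sigmoid_l(abs(a - b), t, s)
        for a, b, (t, s) in zip(pose_i, pose_j, stats)
    )
```

With identical poses every term is L(0), which is close to 0 whenever the mean difference T is a few σ above zero, so nearly identical head poses yield a small distance.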
Step S2032: register the face pose in the user-defined face image sequence to the frontal pose according to the yaw, roll, and pitch angles, and then compute the expression similarity between the face in the user-defined face image sequence and the face in the multiple frames of face images according to

D_expression(i, j) = Σ_{k=1}^{J} w_k · L(|A_{i,k} − A_{j,k}|),

where A_{i,k} is the k-th feature landmark of image I_i after alignment, A_{j,k} is the k-th feature landmark of image I_j after alignment, J is the number of feature landmarks, w_k is the weight of the k-th landmark, and L(d) is the sigmoid function defined above. The computed landmark distances are normalized to [0, 1]. The extraction of the expression feature landmarks is described with reference to Fig. 4.
Fig. 4 is a schematic diagram of extracting facial feature landmarks based on an Active Shape Model (ASM) according to an embodiment of the present invention. As shown in Fig. 4, the face in each frame of the target-person expression database is aligned to the frontal standard pose, and ASM feature landmark detection is then performed; the detected landmarks define the facial expression.
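The expression distance of step S2032 can be sketched similarly. Treating each aligned landmark as an (x, y) pair and using Euclidean distances between corresponding landmarks is an assumption for illustration; the sigmoid parameters (T, σ) would again be estimated from data:

```python
import math

GAMMA = math.log(99)


def sigmoid_l(d, t, sigma):
    """Sigmoid L(d) as defined in step S2031."""
    return 1.0 / (1.0 + math.exp(-GAMMA * (d - t) / sigma))


def expression_distance(landmarks_i, landmarks_j, weights, t, sigma):
    """D_expression(i, j) = sum over k of w_k * L(|A_ik - A_jk|).

    landmarks_*: aligned landmark positions, one (x, y) pair per feature
    point, with coordinates already normalized so distances fall in [0, 1].
    weights: one w_k per landmark.
    """
    assert len(landmarks_i) == len(landmarks_j) == len(weights)
    total = 0.0
    for (xi, yi), (xj, yj), w in zip(landmarks_i, landmarks_j, weights):
        d = math.hypot(xi - xj, yi - yj)  # distance between the k-th landmarks
        total += w * sigmoid_l(d, t, sigma)
    return total
```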
Step S2033: obtain the joint similarity of pose and expression according to D(i, j) = D_pose(i, j) + λ · D_expression(i, j), where the parameter λ adjusts the relative weight of the expression similarity against the pose similarity.
Step S2034: according to the joint similarity obtained in step S2033, retrieve, for each image in the user-defined face image sequence, the K nearest frames in the target-person expression database, i.e., the K frames with the smallest joint distance; thus the user image at each moment has K candidate images.
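A minimal sketch of the candidate selection in step S2034, assuming the pose and expression distances of the earlier steps have already been computed for one user frame against every database frame; the function names are illustrative:

```python
def joint_distance(d_pose, d_expr, lam):
    """D(i, j) = D_pose(i, j) + lambda * D_expression(i, j)."""
    return d_pose + lam * d_expr


def top_k_candidates(pose_dists, expr_dists, k, lam):
    """Return indices of the K database frames with the smallest joint distance.

    pose_dists[j], expr_dists[j]: distances between the current user frame
    and database frame j, computed as in steps S2031 and S2032.
    """
    scored = [
        (joint_distance(dp, de, lam), j)
        for j, (dp, de) in enumerate(zip(pose_dists, expr_dists))
    ]
    scored.sort()  # smallest joint distance first
    return [j for _, j in scored[:k]]
```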
Step S2035: for the K candidate images obtained in step S2034, further compute the joint similarities between the candidate images and use Dijkstra's shortest-path algorithm to obtain the retrieved image sequence, guaranteeing temporal continuity and spatial consistency. The joint similarity between candidate images is computed with the same method as in the steps above. Specifically, Dijkstra's shortest-path algorithm retrieves a shortest path from moment t = 1 to moment t = m, where m is the length of the user-defined face image sequence. Fig. 5 is a schematic diagram of obtaining the retrieved image sequence according to an embodiment of the present invention. Before running Dijkstra's algorithm, a directed graph must be built, as shown in Fig. 5: its nodes are the retrieved candidate face images of the target person, directed edges are allowed only between candidate images at adjacent moments, and the length (or cost) of an edge is defined as

L(C_{t,i}, C_{t+1,j}) = D(C_{t,i}, U_t) + D(C_{t+1,j}, U_{t+1}) + μ · D(C_{t,i}, C_{t+1,j}),

where C_{t,i} and C_{t+1,j} denote the i-th candidate image at moment t and the j-th candidate image at moment t+1 respectively, and U_t and U_{t+1} denote the user-defined face images at moments t and t+1. This cost favors selecting highly ranked candidates (as close as possible to the images in the user-defined face image sequence) while avoiding abrupt changes between candidate images at adjacent moments; μ is a parameter balancing the accuracy and smoothness of the retrieved face image sequence.
Using the shortest-path algorithm, the shortest path between nodes C_{1,i} and C_{m,j} is obtained. From the permutations of first and last nodes there are K·K such paths in total; the minimum-cost path among these K² shortest paths is the required retrieved image sequence.
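Because directed edges exist only between candidates at adjacent moments, the graph of Fig. 5 is layered, and the Dijkstra search described above reduces to a forward dynamic program over the K candidates per moment. The data layout below is an illustrative assumption:

```python
def shortest_candidate_path(user_dist, cand_dist, mu):
    """Minimize the summed edge costs
        L(C_ti, C_t1j) = D(C_ti, U_t) + D(C_t1j, U_t1) + mu * D(C_ti, C_t1j).

    user_dist[t][i]: distance D between candidate i at moment t and the
                     user frame at moment t.
    cand_dist[t][i][j]: distance D between candidate i at moment t and
                        candidate j at moment t+1.
    Returns the index of the chosen candidate at each moment.
    """
    m = len(user_dist)       # sequence length
    k = len(user_dist[0])    # candidates per moment
    cost = [[0.0] * k for _ in range(m)]
    back = [[0] * k for _ in range(m)]
    for t in range(1, m):
        for j in range(k):
            # best predecessor i for candidate j at moment t
            cost[t][j], back[t][j] = min(
                (cost[t - 1][i]
                 + user_dist[t - 1][i] + user_dist[t][j]
                 + mu * cand_dist[t - 1][i][j], i)
                for i in range(k)
            )
    # pick the cheapest end node, then walk the backpointers
    j = min(range(k), key=lambda idx: cost[m - 1][idx])
    path = [j]
    for t in range(m - 1, 0, -1):
        j = back[t][j]
        path.append(j)
    path.reverse()
    return path
```

With μ = 0 the path simply chains the per-moment best candidates; a larger μ penalizes switching between dissimilar candidates, trading accuracy for smoothness as described above.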
Step S204: apply the warping transformation and smoothing to the retrieved image sequence. Although the retrieved image sequence obtained in step S203 preserves temporal continuity and spatial consistency to some extent, its facial expressions still differ somewhat from those of the user-defined images; a warping transformation is therefore needed so that the expressions in the retrieved image sequence better match the expressions in the user-defined face image sequence. Meanwhile, the retrieved image sequence may exhibit video jitter, so further smoothing is needed. The smoothing proceeds as follows.
For the user image at moment t, the extracted feature landmarks are used to divide the face into parts such as eyebrows, eyes, nose, and mouth, and the distances between the landmarks of these regions in U_t and C_t are computed separately. If a region's distance exceeds a constant δ, the target person's action in that region is considered to deviate noticeably from the user's action. For example, if the mouth action is unsatisfactory, the candidate image in the target-person expression database that best matches the user's mouth is retrieved again, and the best-matching mouth is substituted for the original mouth of C_t. A face inpainting algorithm based on the Poisson equation can be used, so that the substitution remains consistent with the texture of the surrounding region without producing an obviously unrealistic result.
The inpainted faces may exhibit jitter in the time domain, so the distances between adjacent inpainted images are computed; if jitter is detected, an optical-flow optimization is applied to achieve a smooth transition, replacing some of the images before and after the jittery moments.
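The per-region check above can be sketched as follows. The grouping of landmark indices into facial parts and the threshold handling are illustrative assumptions; a real ASM layout defines its own index sets:

```python
import math

# Illustrative grouping of landmark indices into facial parts; an actual
# ASM landmark layout would define these index sets.
REGIONS = {"eyebrows": [0, 1], "eyes": [2, 3], "nose": [4], "mouth": [5, 6]}


def region_distance(landmarks_a, landmarks_b, indices):
    """Mean Euclidean distance between corresponding landmarks of one region."""
    return sum(
        math.hypot(landmarks_a[k][0] - landmarks_b[k][0],
                   landmarks_a[k][1] - landmarks_b[k][1])
        for k in indices
    ) / len(indices)


def regions_needing_replacement(user_landmarks, retrieved_landmarks, delta):
    """Return the facial parts whose distance exceeds the threshold delta,
    i.e. the regions of the retrieved frame that should be patched in from
    a better-matching candidate (e.g. via Poisson-based inpainting)."""
    return [
        name
        for name, idx in REGIONS.items()
        if region_distance(user_landmarks, retrieved_landmarks, idx) > delta
    ]
```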
Fig. 6 is a block diagram of a face video synthesis device according to an embodiment of the present invention. As shown in Fig. 6, the face video synthesis device 10 comprises a target-person expression database 110, a preprocessing module 120, a joint similarity retrieval module 130, and a post-processing module 140.
Specifically, the target-person expression database 110 stores multiple frames of face images. The preprocessing module 120 extracts the face pose and the facial expression in each frame of the face images in the target-person expression database 110. The joint similarity retrieval module 130 performs the joint similarity retrieval between the face pose and expression in a user-defined face image sequence and the face pose and expression in the multiple frames of face images, to obtain a retrieved image sequence matching the user-defined face image sequence. The post-processing module 140 applies a warping transformation and smoothing to the retrieved image sequence.
In one embodiment of the present invention, the multiple frames of face images comprise a plurality of basic expression image sequences of the target person collected from a plurality of viewing angles and a plurality of poses, wherein each of the basic expression image sequences consists of a neutral expression and the process of changing from the neutral expression. The feature landmarks are detected based on the Active Shape Model.
In one embodiment of the present invention, the preprocessing module 120 is further configured to: extract the face pose in each frame of the multiple frames of face images and describe it with yaw, roll, and pitch angles; and align the face in each frame to the frontal standard pose and describe the facial expression with the feature landmarks.
In one embodiment of the invention, the joint similarity retrieval module 130 is further configured to: compute the similarity between the face pose in the user-defined face image sequence and the face pose in the multiple frames of face images according to

D_Pose = L(|Y_i − Y_j|) + L(|R_i − R_j|) + L(|P_i − P_j|),

where Y denotes the yaw angle, R the roll angle, and P the pitch angle, I_i denotes any frame in the user-defined face image sequence, I_j denotes any frame in the multiple frames of face images, and L(d) is a sigmoid function defined as

L(d) = 1 / (1 + e^(γ(d − T)/σ)),

where γ = ln 99, and T and σ are the statistical mean and standard deviation of the variable d, respectively; register the face pose in the user-defined face image sequence to a frontal pose according to the yaw, roll, and pitch angles, and then compute the similarity between the facial expression in the user-defined face image sequence and the facial expression in the multiple frames of face images according to

D_Expression = L( Σ_{k=1..J} w_k · |A_{i,k} − A_{j,k}| ),

where A_{i,k} is the k-th feature description point of the aligned image I_i, A_{j,k} is the k-th feature description point of the aligned image I_j, J is the number of feature points, w_k is the weight of the k-th feature point, and L(d) is the sigmoid function defined above; normalize the computed feature point distances to [0, 1]; obtain the joint similarity according to D(i, j) = D_Pose(i, j) + λ·D_Expression(i, j), where the parameter λ adjusts the relative weight of the expression similarity and the pose similarity; and obtain, according to the joint similarity, a plurality of candidate images in the multiple frames of face images for each image I_i in the user-defined face image sequence. For the plurality of candidate images, the joint similarity retrieval module 130 further applies the Dijkstra shortest-path algorithm, based on the joint similarities among the candidate images, to obtain the retrieved image sequence while ensuring temporal continuity and spatial consistency.
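As a concrete illustration of the retrieval just described, the following Python sketch implements the sigmoid mapping, the pose and expression similarities, the joint similarity, and a Dijkstra pass over per-frame candidate lists. All function names are illustrative, and the edge cost is one plausible way to turn the joint similarity into a non-negative path cost; the patent itself does not fix these implementation details:

```python
import heapq
import math

GAMMA = math.log(99.0)  # gamma = ln 99, as in the patent

def sigmoid_l(d, t, sigma):
    # L(d) = 1 / (1 + exp(gamma * (d - t) / sigma)).
    # Maps a distance d to a similarity in (0, 1): roughly 0.99 at
    # d = t - sigma, 0.5 at d = t, and 0.01 at d = t + sigma.
    return 1.0 / (1.0 + math.exp(GAMMA * (d - t) / sigma))

def pose_similarity(pose_i, pose_j, stats):
    # pose_* are (yaw, roll, pitch) triples; stats[name] holds the
    # (mean, std) of that angle's absolute difference over the database.
    return sum(
        sigmoid_l(abs(a - b), *stats[name])
        for name, a, b in zip(("yaw", "roll", "pitch"), pose_i, pose_j)
    )

def expression_similarity(points_i, points_j, weights, stats):
    # points_* are aligned feature points (normalized to [0, 1]);
    # the weighted sum of point distances goes through the same sigmoid.
    d = sum(w * math.dist(p, q) for w, p, q in zip(weights, points_i, points_j))
    return sigmoid_l(d, *stats)

def joint_similarity(pose_sim, expr_sim, lam):
    # D(i, j) = D_Pose(i, j) + lambda * D_Expression(i, j)
    return pose_sim + lam * expr_sim

def best_candidate_path(layers, edge_cost):
    # layers[t] lists the candidate images retrieved for frame t;
    # edge_cost(u, v) is a non-negative cost between consecutive candidates
    # (e.g. a constant minus the joint similarity). Dijkstra over the layered
    # graph selects one candidate per frame with minimal total cost.
    source, sink = (-1, 0), (len(layers), 0)
    dist, prev = {source: 0.0}, {}
    pq = [(0.0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, math.inf):
            continue
        if u == sink:
            break
        t = u[0] + 1
        if t == len(layers):
            neighbors = [(sink, 0.0)]
        else:
            neighbors = [
                ((t, k), 0.0 if u == source else edge_cost(layers[u[0]][u[1]], cand))
                for k, cand in enumerate(layers[t])
            ]
        for v, c in neighbors:
            nd = d + c
            if nd < dist.get(v, math.inf):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    # Walk back from the sink to recover one candidate per frame.
    path, u = [], prev[sink]
    while u != source:
        path.append(layers[u[0]][u[1]])
        u = prev[u]
    return path[::-1]
```

With γ = ln 99 the sigmoid returns 0.5 at the mean distance and about 0.99 one standard deviation below it, so smaller distances consistently map to larger similarities before the pose and expression terms are combined.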
The face video synthesis method and device according to embodiments of the invention have the following advantages: for any user-defined expression, a photorealistic image sequence of the target person can be synthesized conveniently; the number of times the actor must perform is reduced; and the degree of automation is high.
In the description of this specification, reference to the terms "an embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a particular feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will appreciate that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principles and spirit of the present invention; the scope of the present invention is defined by the claims and their equivalents.

Claims (8)

1. A method for synthesizing a face video, characterized by comprising the steps of:
establishing a target person expression database, the target person expression database comprising multiple frames of face images;
extracting the face pose in each frame of the multiple frames of face images, the face pose being described by a yaw angle, a roll angle, and a pitch angle;
aligning the face in each frame of the multiple frames of face images to a frontal standard pose, detecting feature description points, and describing the facial expression with the feature description points;
computing the similarity between the face pose in a user-defined face image sequence and the face pose in the multiple frames of face images according to

D_Pose = L(|Y_i − Y_j|) + L(|R_i − R_j|) + L(|P_i − P_j|),

where Y denotes the yaw angle, R the roll angle, and P the pitch angle, I_i denotes any frame in the user-defined face image sequence, I_j denotes any frame in the multiple frames of face images, and L(d) is a sigmoid function defined as

L(d) = 1 / (1 + e^(γ(d − T)/σ)),

where γ = ln 99, and T and σ are the statistical mean and standard deviation of the variable d, respectively;
registering the face pose in the user-defined face image sequence to a frontal pose according to the yaw, roll, and pitch angles, and then computing the similarity between the facial expression in the user-defined face image sequence and the facial expression in the multiple frames of face images according to

D_Expression = L( Σ_{k=1..J} w_k · |A_{i,k} − A_{j,k}| ),

where A_{i,k} is the k-th feature description point of the aligned image I_i, A_{j,k} is the k-th feature description point of the aligned image I_j, J is the number of feature points, w_k is the weight of the k-th feature point, and L(d) is the sigmoid function defined as L(d) = 1 / (1 + e^(γ(d − T)/σ));
normalizing the computed feature point distances to [0, 1];
obtaining a joint similarity according to D(i, j) = D_Pose(i, j) + λ·D_Expression(i, j), where the parameter λ adjusts the relative weight of the expression similarity and the pose similarity;
obtaining, according to the joint similarity, a plurality of candidate images in the multiple frames of face images for each image I_i in the user-defined face image sequence, thereby obtaining a retrieved image sequence matching the user-defined face image sequence; and
performing warping transformation and smoothing on the retrieved image sequence.
2. The method for synthesizing a face video according to claim 1, characterized in that the step of establishing the target person expression database comprises:
collecting a plurality of basic expression image sequences of the target person from a plurality of viewing angles and a plurality of pose positions, wherein each basic expression image sequence consists of a neutral expression and the change process from the neutral expression to a basic expression.
3. The method for synthesizing a face video according to claim 1, characterized in that the feature description points are extracted based on an active shape model.
4. The method for synthesizing a face video according to claim 1, characterized in that, for the plurality of candidate images obtained by the joint similarity retrieval, the joint similarities among the plurality of candidate images are further computed, and the Dijkstra shortest-path algorithm is used to obtain the retrieved image sequence while ensuring temporal continuity and spatial consistency.
5. A device for synthesizing a face video, characterized by comprising:
a target person expression database for storing multiple frames of face images;
a preprocessing module for extracting the face pose in each frame of the multiple frames of face images, the face pose being described by a yaw angle, a roll angle, and a pitch angle, and for aligning the face in each frame to a frontal standard pose, detecting feature description points, and describing the facial expression with the feature description points;
a joint similarity retrieval module for performing a joint similarity retrieval between the face pose and facial expression in a user-defined face image sequence and the face pose and facial expression in the multiple frames of face images, to obtain a retrieved image sequence matching the user-defined face image sequence, wherein the joint similarity retrieval comprises: computing the similarity between the face pose in the user-defined face image sequence and the face pose in the multiple frames of face images according to

D_Pose = L(|Y_i − Y_j|) + L(|R_i − R_j|) + L(|P_i − P_j|),

where Y denotes the yaw angle, R the roll angle, and P the pitch angle, I_i denotes any frame in the user-defined face image sequence, I_j denotes any frame in the multiple frames of face images, and L(d) is a sigmoid function defined as

L(d) = 1 / (1 + e^(γ(d − T)/σ)),

where γ = ln 99, and T and σ are the statistical mean and standard deviation of the variable d, respectively; registering the face pose in the user-defined face image sequence to a frontal pose according to the yaw, roll, and pitch angles, and then computing the similarity between the facial expression in the user-defined face image sequence and the facial expression in the multiple frames of face images according to

D_Expression = L( Σ_{k=1..J} w_k · |A_{i,k} − A_{j,k}| ),

where A_{i,k} is the k-th feature description point of the aligned image I_i, A_{j,k} is the k-th feature description point of the aligned image I_j, J is the number of feature points, w_k is the weight of the k-th feature point, and L(d) is the sigmoid function defined above; normalizing the computed feature point distances to [0, 1]; obtaining a joint similarity according to D(i, j) = D_Pose(i, j) + λ·D_Expression(i, j), where the parameter λ adjusts the relative weight of the expression similarity and the pose similarity; and obtaining, according to the joint similarity, a plurality of candidate images in the multiple frames of face images for each image I_i in the user-defined face image sequence; and
a post-processing module for performing warping transformation and smoothing on the retrieved image sequence.
6. The device for synthesizing a face video according to claim 5, characterized in that the multiple frames of face images comprise a plurality of basic expression image sequences of the target person collected from a plurality of viewing angles and a plurality of pose positions, wherein each basic expression image sequence consists of a neutral expression and the change process from the neutral expression to a basic expression.
7. The device for synthesizing a face video according to claim 5, characterized in that the preprocessing module detects the feature description points based on an active shape model.
8. The device for synthesizing a face video according to claim 5, characterized in that, for the plurality of candidate images, the joint similarity retrieval module further uses the Dijkstra shortest-path algorithm, based on the joint similarities among the plurality of candidate images, to obtain the retrieved image sequence while ensuring temporal continuity and spatial consistency.
CN 201110197873 2011-07-14 2011-07-14 Method and device for synthesizing face video Active CN102254336B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110197873 CN102254336B (en) 2011-07-14 2011-07-14 Method and device for synthesizing face video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201110197873 CN102254336B (en) 2011-07-14 2011-07-14 Method and device for synthesizing face video

Publications (2)

Publication Number Publication Date
CN102254336A CN102254336A (en) 2011-11-23
CN102254336B true CN102254336B (en) 2013-01-16

Family

ID=44981577

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110197873 Active CN102254336B (en) 2011-07-14 2011-07-14 Method and device for synthesizing face video

Country Status (1)

Country Link
CN (1) CN102254336B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11595617B2 (en) 2012-04-09 2023-02-28 Intel Corporation Communication using interactive avatars

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014194439A1 (en) * 2013-06-04 2014-12-11 Intel Corporation Avatar-based video encoding
CN103647922A (en) * 2013-12-20 2014-03-19 百度在线网络技术(北京)有限公司 Virtual video call method and terminals
WO2016101131A1 (en) 2014-12-23 2016-06-30 Intel Corporation Augmented facial animation
CN104679832A (en) * 2015-02-05 2015-06-03 四川长虹电器股份有限公司 System and method for searching single or multi-body combined picture based on face recognition
CN105184249B (en) * 2015-08-28 2017-07-18 百度在线网络技术(北京)有限公司 Method and apparatus for face image processing
WO2017101094A1 (en) 2015-12-18 2017-06-22 Intel Corporation Avatar animation system
CN106303233B (en) * 2016-08-08 2019-03-15 西安电子科技大学 A kind of video method for secret protection based on expression fusion
CN109670386A (en) * 2017-10-16 2019-04-23 深圳泰首智能技术有限公司 Face identification method and terminal
CN109784123A (en) * 2017-11-10 2019-05-21 浙江思考者科技有限公司 The analysis and judgment method of real's expression shape change
CN108510435A (en) * 2018-03-28 2018-09-07 北京市商汤科技开发有限公司 Image processing method and device, electronic equipment and storage medium
CN109886091B (en) * 2019-01-08 2021-06-01 东南大学 Three-dimensional facial expression recognition method based on weighted local rotation mode
CN109993102B (en) * 2019-03-28 2021-09-17 北京达佳互联信息技术有限公司 Similar face retrieval method, device and storage medium
CN110675433A (en) * 2019-10-31 2020-01-10 北京达佳互联信息技术有限公司 Video processing method and device, electronic equipment and storage medium
CN111274447A (en) * 2020-01-13 2020-06-12 深圳壹账通智能科技有限公司 Target expression generation method, device, medium and electronic equipment based on video
CN111652121B (en) * 2020-06-01 2023-11-03 腾讯科技(深圳)有限公司 Training method of expression migration model, and method and device for expression migration
CN111833413B (en) * 2020-07-22 2022-08-26 平安科技(深圳)有限公司 Image processing method, image processing device, electronic equipment and computer readable storage medium
CN113269872A (en) * 2021-06-01 2021-08-17 广东工业大学 Synthetic video generation method based on three-dimensional face reconstruction and video key frame optimization
CN113436302B (en) * 2021-06-08 2024-02-13 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) Face animation synthesis method and system
CN114429611B (en) * 2022-04-06 2022-07-08 北京达佳互联信息技术有限公司 Video synthesis method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1920886A (en) * 2006-09-14 2007-02-28 浙江大学 Video flow based three-dimensional dynamic human face expression model construction method
CN1920880A (en) * 2006-09-14 2007-02-28 浙江大学 Video flow based people face expression fantasy method
CN101179665A (en) * 2007-11-02 2008-05-14 腾讯科技(深圳)有限公司 Method and device for transmitting face synthesized video


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Ira Kemelmacher-Shlizerman, Aditya Sankar, Eli Shechtman, and Steven M. Seitz. Being John Malkovich. European Conference on Computer Vision, 2010, 1-13. *
Qingshan Zhang, et al. Geometry-driven photorealistic facial expression synthesis. IEEE Transactions on Visualization and Computer Graphics, 2006, Vol. 12, No. 1, 48-60. *
Lei Linhua. Virtual face synthesis based on video sequences. China Master's Theses Full-text Database, Information Science and Technology, 2008, No. 1, full text. *


Also Published As

Publication number Publication date
CN102254336A (en) 2011-11-23

Similar Documents

Publication Publication Date Title
CN102254336B (en) Method and device for synthesizing face video
CN111652828B (en) Face image generation method, device, equipment and medium
Ge et al. 3d convolutional neural networks for efficient and robust hand pose estimation from single depth images
Liu et al. Fusedream: Training-free text-to-image generation with improved clip+ gan space optimization
Zhou et al. Photorealistic facial expression synthesis by the conditional difference adversarial autoencoder
Toshev et al. Deeppose: Human pose estimation via deep neural networks
CN106600626B (en) Three-dimensional human motion capture method and system
Lei et al. Whole-body humanoid robot imitation with pose similarity evaluation
KR101711736B1 (en) Feature extraction method for motion recognition in image and motion recognition method using skeleton information
CN102567716B (en) Face synthetic system and implementation method
CN104008564A (en) Human face expression cloning method
KR101563297B1 (en) Method and apparatus for recognizing action in video
US10147023B1 (en) Markerless face tracking with synthetic priors
CN102509333A (en) Action-capture-data-driving-based two-dimensional cartoon expression animation production method
Uddin et al. Human Activity Recognition via 3-D joint angle features and Hidden Markov models
CN115083015B (en) 3D human body posture estimation data labeling mode and corresponding model construction method
Lin et al. Facial expression recognition with data augmentation and compact feature learning
Lu et al. Multi-task learning for single image depth estimation and segmentation based on unsupervised network
Seddik et al. Unsupervised facial expressions recognition and avatar reconstruction from Kinect
Camurri et al. Visual Gesture Recognition: from motion tracking to expressive gesture
Neverova Deep learning for human motion analysis
CN106778576B (en) Motion recognition method based on SEHM characteristic diagram sequence
Zhang et al. Styleavatar3d: Leveraging image-text diffusion models for high-fidelity 3d avatar generation
CN103116901A (en) Motion characteristic based human motion interpolation calculation method
Tan Evolution of art form of video animation design under the background of computer graphics system development

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant