CN108268845A - A dynamics transfer system for synthesizing face video sequences using a generative adversarial network - Google Patents

A dynamics transfer system for synthesizing face video sequences using a generative adversarial network

Info

Publication number
CN108268845A
CN108268845A (application CN201810045782.9A)
Authority
CN
China
Prior art keywords
dynamic
face
training
frame
appearance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201810045782.9A
Other languages
Chinese (zh)
Inventor
夏春秋 (Xia Chunqiu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Vision Technology Co Ltd
Original Assignee
Shenzhen Vision Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Vision Technology Co Ltd filed Critical Shenzhen Vision Technology Co Ltd
Priority to CN201810045782.9A priority Critical patent/CN108268845A/en
Publication of CN108268845A publication Critical patent/CN108268845A/en
Withdrawn legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation

Abstract

The present invention proposes a dynamics transfer system that synthesizes face video sequences using a generative adversarial network. Its main components are: underlying framework definition, network setup, and network solving. The process is: define a target framework comprising a generator and discriminators; use a pre-trained recurrent network to generate appearance-compressed dynamic features, merge them with static features through a dynamic channel encoder, and feed the result to the generator; then design a min-max objective over all the variables and, under constraints, split its solution into three steps to obtain the optimal network solution. The invention substitutes a target image into a face video sequence while reproducing the dynamics of the original video; the generative adversarial network is driven toward an optimal equilibrium, and facial detail and dynamics are well preserved while the video frames are replaced.

Description

A dynamics transfer system for synthesizing face video sequences using a generative adversarial network
Technical field
The present invention relates to the field of computer vision, and more particularly to a dynamics transfer system for synthesizing face video sequences using a generative adversarial network.
Background art
In computer vision, replacing a face in a video sequence — especially so that the replacement face follows the variations of the original sequence — is a highly challenging problem. With the spread of smartphones and the rising resolution of their cameras, people can photograph, take selfies and record video anywhere and at any time, and share the photos or videos on the internet. These high-resolution images and videos supply abundant material for face-replacement applications. Moreover, steady progress in face recognition and detection, occlusion detection and localization, machine learning and pattern recognition provides ample technical support for research on automatic face replacement in images and video. Face replacement and synthesis in video sequences therefore has significant theoretical and practical value in entertainment, virtual reality, privacy protection, video chat and other areas. In the entertainment economy, face-replacement applications occupy leading positions on the download charts of the major mobile application markets; they amuse users, generate large economic benefits, and push the industries and companies behind them to keep opening the related technology to individuals in more accessible and more realistic forms in order to capture market share. Face-image replacement also holds an important place in virtual reality: showing the same face in different scenes, or simulating different people in the same scene — international video conferencing, disaster-escape drills, panoramic imaging, previews of tourist attractions — can all benefit from it. Finally, face replacement has important research significance for privacy protection, which receives ever more attention: when large volumes of public information are collected, the faces of key persons may need to be removed or replaced, and uninvolved bystanders in public-security or criminal cases must be protected; such privacy problems all call for face-replacement technology. Current industrial and academic approaches, however, still fall short: when an image is transferred into a video, rich facial expression and detail are often not retained, and noise may be introduced that distorts the image.
To address this, the present invention proposes a dynamics transfer system that synthesizes face video sequences with a generative adversarial network. It first defines a target framework comprising a generator and discriminators; a pre-trained recurrent network generates appearance-compressed dynamic features, which a dynamic channel encoder merges with static features before they are fed to the generator. A min-max objective over all the variables is then designed and, under constraints, solved in three steps to obtain the optimal network solution. The system substitutes a target image into a face video sequence while reproducing the dynamics of the original video; the adversarial game is driven toward an optimal equilibrium, and facial detail and dynamics are well preserved while the video frames are replaced.
Summary of the invention
To solve the problem of face replacement in video sequences, the purpose of the present invention is to provide a dynamics transfer system for synthesizing face video sequences using a generative adversarial network. It first defines a target framework comprising a generator and discriminators, uses a pre-trained recurrent network to generate appearance-compressed dynamic features, and merges them with static features through a dynamic channel encoder before input to the generator; it then designs a min-max objective over all the variables and, under constraints, splits its solution into three steps to obtain the optimal network solution. The invention substitutes a target image into a face video sequence while reproducing the dynamics of the original video; the generative adversarial network is driven toward an optimal equilibrium, and facial detail and dynamics are well preserved while the video frames are replaced.
To solve the above problems, the present invention provides a dynamics transfer system for synthesizing face video sequences using a generative adversarial network, whose main components are:
(1) underlying framework definition;
(2) network setup;
(3) network solving.
The underlying framework definition specifies an adapted generative adversarial network model, as follows:
1) Given a data set x ~ p_x, define the generator network G, whose role is to receive a random input variable z ~ p_z, imitate the data set x, and transform the distribution of z into an imitation data set G(z);
2) define the discriminator D, whose role is to judge whether the imitation data set G(z) has the same distribution as the true given data set x;
3) define the rules of the game: if the data generated by G successfully fools D, G wins; if D successfully identifies the data imitated by G, D wins;
4) define the framework objective: train the networks G and D simultaneously so that both attain their best performance; after the contest reaches equilibrium, the data set generated by G has the distribution closest to x. The specific min-max process is:
min_G max_D V(D, G) = E_{x~p_x}[log D(x)] + E_{z~p_z}[log(1 − D(G(z)))]    (1)
where p_x and p_z are the distributions of the variables x and z respectively.
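The min-max objective above can be sanity-checked numerically. The following sketch (illustrative only, not part of the patent) evaluates the GAN value function V(D, G) = E_{x~p_x}[log D(x)] + E_{x~p_g}[log(1 − D(x))] on a small discrete support and checks that the closed-form optimal discriminator D*(x) = p_x(x)/(p_x(x) + p_g(x)) is never beaten by random discriminators; the names `value_fn`, `p_g`, `d_star` are assumptions of this sketch.

```python
import numpy as np

# p_x: real data distribution, p_g: the generator's imitation distribution.
# For a fixed generator, the discriminator maximizing the value function is
# D*(x) = p_x(x) / (p_x(x) + p_g(x)); at equilibrium (p_g = p_x) this is
# D* = 1/2 everywhere, matching the framework target described above.

def value_fn(D, p_x, p_g):
    """GAN value function on a discrete support (expectations become sums)."""
    return np.sum(p_x * np.log(D)) + np.sum(p_g * np.log(1.0 - D))

p_x = np.array([0.1, 0.2, 0.4, 0.3])
p_g = np.array([0.25, 0.25, 0.25, 0.25])

d_star = p_x / (p_x + p_g)          # closed-form optimal discriminator
v_star = value_fn(d_star, p_x, p_g)

# No randomly drawn discriminator achieves a higher value than d_star.
rng = np.random.default_rng(0)
for _ in range(1000):
    d = rng.uniform(0.01, 0.99, size=4)
    assert value_fn(d, p_x, p_g) <= v_star + 1e-12
```

Since p_g differs from p_x here, v_star stays above the equilibrium value −log 4 that is reached only when the generator's distribution matches the data.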
The network setup replaces the dynamic face images in the original video sequence with a static target face image; it comprises the appearance compression feature encoder A, the dynamic channel encoder F, the generator G, and the discriminator group D_s, D_d.
The appearance compression feature encoder uses a pre-trained recurrent neural network and takes the difference between the dynamic faces in the original video and the static face of the first frame of the same video, yielding the appearance-compressed dynamic features, which are the input features of the system. Specifically: first, given an original dynamic video sequence of length T, Y = [y_0, y_1, …, y_T], take the face image y_0 of its first frame as the starting point and replicate it T times, producing a static sequence Y^(st) = [y_0, y_0, …, y_0]. Then apply the pre-trained recurrent neural network to the dynamic and static sequences respectively, generating the corresponding hidden spatio-temporal features H and H^(st). Finally take their difference to obtain the appearance-compressed dynamic features:
ΔH = H − H^(st)    (2)
where the time span remains T.
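The encoder A can be sketched as follows; this is an illustrative stand-in, not the patent's implementation — the "pre-trained" recurrent network is replaced by a fixed random tanh cell, and the names `rnn_features`, `W_x`, `W_h` are assumptions.

```python
import numpy as np

# Sketch of the appearance compression encoder A: a recurrent cell encodes
# the dynamic sequence Y and the static sequence Y_st (the first frame
# replicated T times); the hidden-state difference gives the
# appearance-compressed dynamic features dH = H - H_st.

rng = np.random.default_rng(1)
T, D_in, D_h = 8, 16, 32                  # frames, frame-feature dim, hidden dim
W_x = rng.normal(scale=0.1, size=(D_in, D_h))
W_h = rng.normal(scale=0.1, size=(D_h, D_h))

def rnn_features(seq):
    """Run a tanh RNN over seq of shape (T, D_in); return hidden states (T, D_h)."""
    h = np.zeros(D_h)
    out = []
    for x in seq:
        h = np.tanh(x @ W_x + h @ W_h)
        out.append(h)
    return np.stack(out)

Y = rng.normal(size=(T, D_in))            # dynamic face sequence (as features)
Y_st = np.repeat(Y[:1], T, axis=0)        # first frame replicated T times

dH = rnn_features(Y) - rnn_features(Y_st) # appearance-compressed dynamics

assert dH.shape == (T, D_h)
assert np.allclose(dH[0], 0.0)            # both sequences share the first frame
```

The final assertion reflects the construction: at t = 0 the two sequences are identical, so the difference feature vanishes and only genuine motion contributes at later frames.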
The dynamic channel encoder linearly combines the appearance-compressed dynamic features ΔH with the static spatial features frame by frame over t, and the merged feature combination is fed to the generator G, where t ∈ {0, …, T}.
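The per-frame merging performed by F can be sketched as below; the combination weights `W_f` and the function name `channel_encode` are illustrative assumptions, not the patent's parameters.

```python
import numpy as np

# Sketch of the dynamic channel encoder F: for each frame t it linearly
# combines the appearance-compressed dynamic feature dH[t] with the static
# spatial feature s of the target image; the merged vector is what the
# generator G would consume.

rng = np.random.default_rng(2)
T, D_h, D_s, D_out = 8, 32, 32, 64
dH = rng.normal(size=(T, D_h))            # per-frame dynamic features
s = rng.normal(size=(D_s,))               # static feature of the target image
W_f = rng.normal(scale=0.1, size=(D_h + D_s, D_out))

def channel_encode(dH, s):
    """Broadcast the static feature over frames, concatenate, and project."""
    s_rep = np.repeat(s[None, :], dH.shape[0], axis=0)
    merged = np.concatenate([dH, s_rep], axis=1)   # (T, D_h + D_s)
    return merged @ W_f                            # linear combination

Z = channel_encode(dH, s)
assert Z.shape == (T, D_out)
```

Because the combination is linear, scaling the dynamic features scales their contribution to the merged code proportionally, which is what "linear combining frame by frame" implies.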
The generator applies a symmetric, front-to-back connected convolutional neural network to learn and extract image features, so that the image's own features can be preserved during training.
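The symmetric (mirrored encoder-decoder) structure can be illustrated without real convolutions; in this sketch average pooling and nearest-neighbour upsampling stand in for the strided convolution layers, and the skip mixing is an assumption made to show how the image's own features survive the bottleneck.

```python
import numpy as np

# Minimal sketch of a symmetric encoder-decoder: every 2x downsampling
# step in the encoder has a matching 2x upsampling step in the decoder,
# so the output keeps the spatial shape of the input image.

def encode(img, levels=2):
    feats = [img]
    for _ in range(levels):               # 2x average-pool per level
        x = feats[-1]
        feats.append(x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2).mean(axis=(1, 3)))
    return feats

def decode(feats):
    x = feats[-1]
    for skip in reversed(feats[:-1]):     # mirrored 2x upsampling
        x = np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)
        x = 0.5 * (x + skip)              # symmetric skip keeps image detail
    return x

img = np.arange(64.0).reshape(8, 8)
out = decode(encode(img))
assert out.shape == img.shape             # spatial shape preserved
```

A constant image passes through unchanged, which is the degenerate case of "keeping the image's own features" during the encode-decode round trip.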
The discriminator group judges the authenticity of the currently output dynamic video sequence with a static discriminator D_s and a dynamic discriminator D_d. The static discriminator D_s checks the fidelity of the currently generated frame content, i.e. its deviation from the original target image; the dynamic discriminator D_d checks whether the current sequence is actually dynamic, i.e. whether the facial expression and appearance are in a state of change: if the sequence is genuinely dynamic it outputs the label Z^(d), otherwise the opposite label.
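The division of labour between the two discriminators can be sketched with toy stand-ins; `d_static` and `d_dynamic` below are illustrative scoring functions assumed for this sketch, not the patent's networks.

```python
import numpy as np

# Toy split of the discriminator group: D_s scores each generated frame
# against the target appearance, while D_d checks whether the sequence
# actually varies over time (expression/appearance in motion).

def d_static(frame, target):
    """Per-frame fidelity score in (0, 1]: 1.0 means no deviation."""
    return float(np.exp(-np.mean((frame - target) ** 2)))

def d_dynamic(seq, eps=1e-6):
    """Return the label Z_d (True) if consecutive frames differ, else False."""
    motion = np.mean(np.abs(np.diff(seq, axis=0)))
    return motion > eps

rng = np.random.default_rng(3)
target = rng.normal(size=(8, 8))
moving = rng.normal(size=(5, 8, 8))            # frames in a state of change
frozen = np.repeat(target[None], 5, axis=0)    # a "fake dynamic" sequence

assert d_dynamic(moving) and not d_dynamic(frozen)
assert d_static(target, target) == 1.0
```

The frozen sequence is exactly the failure mode D_d is designed to flag: every frame is faithful to the target, yet nothing moves.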
The network solving comprises the objective function and the optimization process.
The objective function: every parameter that varies during training requires dynamic training, including the dynamic channel encoder F, the generator G, and the discriminators D_s, D_d. Specifically, the training goal is that the images produced by F and G come as close as possible to the original video, so their error must be minimized, while the error of the discriminators D_s, D_d must be maximized; the mathematical expression of the objective function is therefore:
min_{F,G} max_{D_s,D_d} Σ_{t=0}^{T} ( E[log D_s(y_t)] + E[log(1 − D_s(ŷ_t))] ) + E[log D_d(Y)] + E[log(1 − D_d(Ŷ))] + λ_1 Σ_{t=0}^{T} ||y_t − ŷ_t||_1 + λ_2 ||ΔH − ΔĤ||_1    (3)
where ŷ_t denotes the generated frame, Ŷ the generated sequence, ΔĤ its appearance-compressed dynamic features, and T the time span.
The optimization process solves for the optimal solution of mathematical expression (3) in three steps, as follows:
1) maximize the loss terms of the discriminators D_s and D_d;
2) minimize the adversarial loss to train the generator; at the same time, minimize the L1-norm reconstruction loss of the still images to improve the reconstruction quality of each still frame;
3) to keep the dynamics continuous over longer time spans, impose an L1-norm constraint on the reconstruction of the appearance-compressed dynamic features.
Through the three steps above, the optimal solution of mathematical expression (3) can be obtained.
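The three steps above can be sketched as loss computations on toy data; the discriminator scores and the feature extractor here are stand-ins (a simple frame difference plays the role of ΔH), so this illustrates the structure of the solution, not the patent's exact losses.

```python
import numpy as np

# Sketch of the three-step solution: given a generated sequence Y_hat,
# compute (1) the discriminator loss terms to maximize, (2) the
# adversarial plus per-frame L1 reconstruction losses to minimize, and
# (3) the L1 constraint on the dynamic-feature reconstruction.

rng = np.random.default_rng(4)
T = 6
Y = rng.normal(size=(T, 8, 8))               # original frames
Y_hat = Y + 0.1 * rng.normal(size=Y.shape)   # imperfect generation

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

d_real = sigmoid(rng.normal(size=T))         # D_s scores on real frames
d_fake = sigmoid(rng.normal(size=T))         # D_s scores on generated frames

# Step 1: discriminator loss (to be maximized over D_s, D_d).
loss_d = np.mean(np.log(d_real) + np.log(1.0 - d_fake))

# Step 2: adversarial loss plus per-frame L1 reconstruction loss.
loss_adv = np.mean(np.log(1.0 - d_fake))
loss_l1 = np.mean(np.abs(Y - Y_hat))

# Step 3: L1 constraint on the dynamic-feature reconstruction.
dH = np.diff(Y, axis=0)                      # crude stand-in for Delta H
dH_hat = np.diff(Y_hat, axis=0)
loss_dyn = np.mean(np.abs(dH - dH_hat))

assert np.isfinite(loss_d)
assert loss_l1 >= 0.0 and loss_dyn >= 0.0
```

In a real training loop step 1 would ascend on loss_d while steps 2 and 3 descend on the generator-side terms, alternating until the adversarial game reaches its equilibrium.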
Description of the drawings
Fig. 1 is a frame diagram of the dynamics transfer system for synthesizing face video sequences using a generative adversarial network of the present invention.
Fig. 2 is a system network setup diagram of the dynamics transfer system for synthesizing face video sequences using a generative adversarial network of the present invention.
Fig. 3 is an example schematic diagram of the dynamics transfer system for synthesizing face video sequences using a generative adversarial network of the present invention.
Detailed description of the embodiments
It should be noted that, where no conflict arises, the features of the embodiments of this application may be combined with one another; the present invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 is a frame diagram of the dynamics transfer system for synthesizing face video sequences using a generative adversarial network of the present invention. The system mainly comprises the underlying framework definition, the network setup, and the network solving.
The underlying framework definition specifies an adapted generative adversarial network model, as follows:
1) Given a data set x ~ p_x, define the generator network G, whose role is to receive a random input variable z ~ p_z, imitate the data set x, and transform the distribution of z into an imitation data set G(z);
2) define the discriminator D, whose role is to judge whether the imitation data set G(z) has the same distribution as the true given data set x;
3) define the rules of the game: if the data generated by G successfully fools D, G wins; if D successfully identifies the data imitated by G, D wins;
4) define the framework objective: train the networks G and D simultaneously so that both attain their best performance; after the contest reaches equilibrium, the data set generated by G has the distribution closest to x. The specific min-max process is:
min_G max_D V(D, G) = E_{x~p_x}[log D(x)] + E_{z~p_z}[log(1 − D(G(z)))]    (1)
where p_x and p_z are the distributions of the variables x and z respectively.
Fig. 2 is a system network setup diagram of the dynamics transfer system for synthesizing face video sequences using a generative adversarial network of the present invention. It mainly comprises the appearance compression feature encoder A, the dynamic channel encoder F, the generator G, and the discriminator group D_s, D_d.
The appearance compression feature encoder uses a pre-trained recurrent neural network and takes the difference between the dynamic faces in the original video and the static face of the first frame of the same video, yielding the appearance-compressed dynamic features, which are the input features of the system. Specifically: first, given an original dynamic video sequence of length T, Y = [y_0, y_1, …, y_T], take the face image y_0 of its first frame as the starting point and replicate it T times, producing a static sequence Y^(st) = [y_0, y_0, …, y_0]; then apply the pre-trained recurrent neural network to the dynamic and static sequences respectively, generating the corresponding hidden spatio-temporal features H and H^(st); finally take their difference to obtain the appearance-compressed dynamic features:
ΔH = H − H^(st)    (2)
where the time span remains T.
The dynamic channel encoder linearly combines the appearance-compressed dynamic features ΔH with the static spatial features frame by frame over t, and the merged feature combination is fed to the generator G, where t ∈ {0, …, T}.
The generator applies a symmetric, front-to-back connected convolutional neural network to learn and extract image features, so that the image's own features can be preserved during training.
The discriminator group judges the authenticity of the currently output dynamic video sequence with a static discriminator D_s and a dynamic discriminator D_d. The static discriminator D_s checks the fidelity of the currently generated frame content, i.e. its deviation from the original target image; the dynamic discriminator D_d checks whether the current sequence is actually dynamic, i.e. whether the facial expression and appearance are in a state of change: if the sequence is genuinely dynamic it outputs the label Z^(d), otherwise the opposite label.
The network solving comprises the objective function and the optimization process.
The objective function: every parameter that varies during training requires dynamic training, including the dynamic channel encoder F, the generator G, and the discriminators D_s, D_d. Specifically, the training goal is that the images produced by F and G come as close as possible to the original video, so their error must be minimized, while the error of the discriminators D_s, D_d must be maximized; the mathematical expression of the objective function is therefore:
min_{F,G} max_{D_s,D_d} Σ_{t=0}^{T} ( E[log D_s(y_t)] + E[log(1 − D_s(ŷ_t))] ) + E[log D_d(Y)] + E[log(1 − D_d(Ŷ))] + λ_1 Σ_{t=0}^{T} ||y_t − ŷ_t||_1 + λ_2 ||ΔH − ΔĤ||_1    (3)
where ŷ_t denotes the generated frame, Ŷ the generated sequence, ΔĤ its appearance-compressed dynamic features, and T the time span.
The optimization process solves for the optimal solution of mathematical expression (3) in three steps, as follows:
1) maximize the loss terms of the discriminators D_s and D_d;
2) minimize the adversarial loss to train the generator; at the same time, minimize the L1-norm reconstruction loss of the still images to improve the reconstruction quality of each still frame;
3) to keep the dynamics continuous over longer time spans, impose an L1-norm constraint on the reconstruction of the appearance-compressed dynamic features.
Through the three steps above, the optimal solution of mathematical expression (3) can be obtained.
Fig. 3 is an example schematic diagram of the dynamics transfer system for synthesizing face video sequences using a generative adversarial network of the present invention. As shown in the figure, for the two expressions "smile" and "surprise", the replaced face in the video reproduces the expressions of the original video, retains rich detail, and introduces no distortion severe enough to cause visual discomfort.
For those skilled in the art, the present invention is not limited to the details of the above embodiments, and it can be realized in other specific forms without departing from the spirit or scope of the invention. Moreover, those skilled in the art may make various modifications and variations to the invention without departing from its spirit and scope, and such improvements and modifications should also be regarded as falling within the protection scope of the invention. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the invention.

Claims (10)

1. A dynamics transfer system for synthesizing face video sequences using a generative adversarial network, characterized in that it mainly comprises an underlying framework definition (1); a network setup (2); and a network solving (3).
2. The underlying framework definition (1) according to claim 1, characterized in that an adapted generative adversarial network model is defined, as follows:
1) given a data set x ~ p_x, define the generator network G: its role is to receive a random input variable z ~ p_z, imitate the data set x, and transform the distribution of z into an imitation data set G(z);
2) define the discriminator D: its role is to judge whether the imitation data set G(z) has the same distribution as the true given data set x;
3) define the rules of the game: if the data generated by G successfully fools D, G wins; if D successfully identifies the data imitated by G, D wins;
4) define the framework objective: train the networks G and D simultaneously so that both attain their best performance; after the contest reaches equilibrium, the data set generated by G has the distribution closest to x, the specific min-max process being:
min_G max_D V(D, G) = E_{x~p_x}[log D(x)] + E_{z~p_z}[log(1 − D(G(z)))]    (1)
where p_x and p_z are the distributions of the variables x and z respectively.
3. The network setup (2) according to claim 1, characterized in that the dynamic face images in the original video sequence are replaced with a static target face image, and that it comprises the appearance compression feature encoder A, the dynamic channel encoder F, the generator G, and the discriminator group D_s, D_d.
4. The appearance compression feature encoder according to claim 3, characterized in that a pre-trained recurrent neural network is used, and the difference between the dynamic faces in the original video and the static face of the first frame of the same video yields the appearance-compressed dynamic features, which are the input features of the system; specifically: first, given an original dynamic video sequence of length T, Y = [y_0, y_1, …, y_T], take the face image y_0 of its first frame as the starting point and replicate it T times, producing a static sequence Y^(st) = [y_0, y_0, …, y_0]; then apply the pre-trained recurrent neural network to the dynamic and static sequences respectively to generate the corresponding hidden spatio-temporal features H and H^(st); finally take their difference to obtain the appearance-compressed dynamic features:
ΔH = H − H^(st)    (2)
where the time span remains T.
5. The dynamic channel encoder according to claim 3, characterized in that the appearance-compressed dynamic features ΔH are linearly combined with the static spatial features frame by frame over t, and the merged feature combination is fed to the generator G, where t ∈ {0, …, T}.
6. The generator according to claim 3, characterized in that a symmetric, front-to-back connected convolutional neural network is used to learn and extract image features, so that the image's own features can be preserved during training.
7. The discriminator group according to claim 3, characterized in that, for the currently output dynamic video sequence, a static discriminator D_s and a dynamic discriminator D_d are designed to judge the authenticity of the generated sequence, wherein the static discriminator D_s checks the fidelity of the currently generated frame content, i.e. its deviation from the original target image, and the dynamic discriminator D_d checks whether the current sequence is actually dynamic, i.e. whether the facial expression and appearance are in a state of change: if the sequence is genuinely dynamic it outputs the label Z^(d), otherwise the opposite label.
8. The network solving (3) according to claim 1, characterized in that it comprises the objective function and the optimization process.
9. The objective function according to claim 8, characterized in that every parameter that varies during training requires dynamic training, including the dynamic channel encoder F, the generator G, and the discriminators D_s, D_d; specifically, the training goal is that the images produced by F and G come as close as possible to the original video, so their error must be minimized while the error of the discriminators D_s, D_d is maximized, whereby the mathematical expression of the objective function is:
min_{F,G} max_{D_s,D_d} Σ_{t=0}^{T} ( E[log D_s(y_t)] + E[log(1 − D_s(ŷ_t))] ) + E[log D_d(Y)] + E[log(1 − D_d(Ŷ))] + λ_1 Σ_{t=0}^{T} ||y_t − ŷ_t||_1 + λ_2 ||ΔH − ΔĤ||_1    (3)
where ŷ_t denotes the generated frame, Ŷ the generated sequence, ΔĤ its appearance-compressed dynamic features, and T the time span.
10. The optimization process according to claim 8, characterized in that the optimal solution of mathematical expression (3) is solved in three steps, as follows:
1) maximize the loss terms of the discriminators D_s and D_d;
2) minimize the adversarial loss to train the generator; at the same time, minimize the L1-norm reconstruction loss of the still images to improve the reconstruction quality of each still frame;
3) to keep the dynamics continuous over longer time spans, impose an L1-norm constraint on the reconstruction of the appearance-compressed dynamic features.
Through the three steps above, the optimal solution of mathematical expression (3) can be obtained.
CN201810045782.9A 2018-01-17 2018-01-17 A dynamics transfer system for synthesizing face video sequences using a generative adversarial network Withdrawn CN108268845A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810045782.9A CN108268845A (en) 2018-01-17 2018-01-17 A dynamics transfer system for synthesizing face video sequences using a generative adversarial network


Publications (1)

Publication Number Publication Date
CN108268845A true CN108268845A (en) 2018-07-10

Family

ID=62775913

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810045782.9A Withdrawn CN108268845A (en) 2018-01-17 2018-01-17 A dynamics transfer system for synthesizing face video sequences using a generative adversarial network

Country Status (1)

Country Link
CN (1) CN108268845A (en)



Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107577985A (en) * 2017-07-18 2018-01-12 南京邮电大学 The implementation method of the face head portrait cartooning of confrontation network is generated based on circulation
CN107451619A (en) * 2017-08-11 2017-12-08 深圳市唯特视科技有限公司 A kind of small target detecting method that confrontation network is generated based on perception

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WISSAM J. BADDAR, GEONMO GU, et al.: "Dynamics Transfer GAN: Generating Video by Transferring Arbitrary Temporal Dynamics from a Source Video to a Single Target Image", arXiv *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109410179A (en) * 2018-09-28 2019-03-01 合肥工业大学 A kind of image abnormity detection method based on generation confrontation network
CN109410179B (en) * 2018-09-28 2021-07-23 合肥工业大学 Image anomaly detection method based on generation countermeasure network
CN109658369A (en) * 2018-11-22 2019-04-19 中国科学院计算技术研究所 Video intelligent generation method and device
CN111401101A (en) * 2018-12-29 2020-07-10 上海智臻智能网络科技股份有限公司 Video generation system based on portrait
CN110210386A (en) * 2019-05-31 2019-09-06 北京市商汤科技开发有限公司 For acting the video generation method migrated and neural network training method and device
US11443559B2 (en) 2019-08-29 2022-09-13 PXL Vision AG Facial liveness detection with a mobile device
US11669607B2 (en) 2019-08-29 2023-06-06 PXL Vision AG ID verification with a mobile device
CN110647659B (en) * 2019-09-27 2023-09-15 上海依图网络科技有限公司 Image pickup system and video processing method
CN110647659A (en) * 2019-09-27 2020-01-03 上海依图网络科技有限公司 Imaging system and video processing method
CN110826593A (en) * 2019-09-29 2020-02-21 腾讯科技(深圳)有限公司 Training method for fusion image processing model, image processing method, image processing device and storage medium
US11526712B2 (en) 2019-09-29 2022-12-13 Tencent Technology (Shenzhen) Company Limited Training method and apparatus for image fusion processing model, device, and storage medium
CN110647864A (en) * 2019-09-30 2020-01-03 上海依图网络科技有限公司 Single multi-graph feature recognition method, equipment and medium based on generation countermeasure network
CN113034698A (en) * 2019-12-24 2021-06-25 辉达公司 Generating panoramas using one or more neural networks
CN111242837B (en) * 2020-01-03 2023-05-12 杭州电子科技大学 Face anonymity privacy protection method based on generation countermeasure network
CN111242837A (en) * 2020-01-03 2020-06-05 杭州电子科技大学 Face anonymous privacy protection method based on generation of countermeasure network
CN111243066B (en) * 2020-01-09 2022-03-22 浙江大学 Facial expression migration method based on self-supervision learning and confrontation generation mechanism
CN111243066A (en) * 2020-01-09 2020-06-05 浙江大学 Facial expression migration method based on self-supervision learning and confrontation generation mechanism
WO2022205416A1 (en) * 2021-04-02 2022-10-06 深圳先进技术研究院 Generative adversarial network-based facial expression generation method
CN117726729A (en) * 2024-01-30 2024-03-19 北京烽火万家科技有限公司 Name card manufacturing method, system, medium and equipment based on virtual digital person technology

Similar Documents

Publication Publication Date Title
CN108268845A (en) A dynamics transfer system for synthesizing face video sequences using a generative adversarial network
Dolhansky et al. The deepfake detection challenge (dfdc) dataset
Chen et al. Vgan-based image representation learning for privacy-preserving facial expression recognition
Duarte et al. WAV2PIX: Speech-conditioned Face Generation using Generative Adversarial Networks.
Zhang Deepfake generation and detection, a survey
CN109815928A (en) A face image synthesis method and apparatus based on adversarial learning
Yadav et al. Deepfake: A survey on facial forgery technique using generative adversarial network
Burton et al. Mental representations of familiar faces
Whittaker et al. “All around me are synthetic faces”: the mad world of AI-generated media
DE112013001461T5 (en) Modify the look of a participant during a videoconference
JP2021516831A (en) Biological detection method, device and storage medium
CN114937115A (en) Image processing method, face replacement model processing method and device and electronic equipment
CN113870133A (en) Multimedia display and matching method, device, equipment and medium
WO2023154135A1 (en) Systems and methods for facial attribute manipulation
Hajarolasvadi et al. Generative adversarial networks in human emotion synthesis: A review
Weerawardana et al. Deepfakes detection methods: a literature survey
Si et al. Speech2video: Cross-modal distillation for speech to video generation
CN115100707A (en) Model training method, video information generation method, device and storage medium
Arora et al. A review of techniques to detect the GAN-generated fake images
CN117041664A (en) Digital human video generation method and device, electronic equipment and storage medium
CN116449958A (en) Virtual office system based on meta universe
WO2023124697A1 (en) Image enhancement method, apparatus, storage medium, and electronic device
Parkin et al. Creating artificial modalities to solve rgb liveness
Chen et al. Hierarchical cross-modal talking face generationwith dynamic pixel-wise loss
Saif et al. Deepfake videos: synthesis and detection techniques–a survey

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20180710