CN112884636A - Style migration method for automatically generating stylized video - Google Patents
Style migration method for automatically generating stylized video
- Publication number
- CN112884636A (application number CN202110117964.4A)
- Authority
- CN
- China
- Prior art keywords
- encoder
- migration
- style
- video
- lightweight
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/04—Context-preserving transformations, e.g. by using an importance map
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a style migration method for automatically generating stylized videos. The method constructs a style migration model comprising a highly compressed self-encoder based on knowledge distillation and a feature migration module based on semantic alignment. The self-encoder is divided into an encoder and a decoder: the encoder encodes original video content frames and style images into feature maps; the feature migration module fuses the content features and style features produced by the encoder under semantic alignment, yielding semantically aligned fusion migration features; and the decoder decodes these features into stylized video frames. The method ensures the stability of the migrated video, can stylize any video in any style, performs style migration at very high speed, and therefore has high practicability.
Description
Technical Field
The invention belongs to the field of computer application, and particularly relates to a style migration method for automatically generating a stylized video.
Background
With the development and popularization of the internet and the mobile internet, more and more short video platforms have risen, and people's artistic expectations for video have grown with them. Creation by professional artists or professional editing technicians is both inconvenient and costly. Automatically generating videos in arbitrary artistic styles by computer technology has therefore attracted wide attention and favor.
Given a content image and a target style image, style migration aims to produce a stylized image that combines the structure of the content image with the texture of the style image. Style migration based on a single image has already been studied extensively, and much attention has now turned to video style migration because of its very broad application prospects (including artistic conversion of short videos); clearly, style migration of video is more practical and more challenging than style migration of single images.
Compared with traditional image style migration, video style migration is more difficult because stylization quality, stability, and computational efficiency must be considered simultaneously. Currently available video style migration methods can be roughly classified into two categories according to whether optical flow is used.
The first category uses optical flow: a temporal consistency loss, supervised by optical flow constraints, enforces stability between adjacent frames. An early optimization-based method with optical flow constraints obtains stable migrated video, but stylizing each frame of the video takes nearly three minutes, and this extremely slow migration speed is unacceptable. Video style migration methods based on feed-forward networks were proposed later, but because optical flow constraints are still used in both the training and testing stages, they cannot achieve real-time performance. To solve this problem, some methods use optical flow only in the training phase and avoid it in the testing phase; this improves speed, but the final migration is far less stable than that of methods that also use optical flow at test time.
The second category does not use optical flow. LST, for example, implements feature affine transformations and can thereby produce stable stylized video. Subsequent work proposed combining an Avatar-Net-based decoration module with a compound normalization method to ensure video stability. However, existing methods that do not use optical flow all use the original VGG network to encode content and style features, and the VGG network is very bulky: a very large memory space is required to store the VGG model, which greatly limits their application on small terminal devices.
Disclosure of Invention
The purpose of the invention is as follows: the invention provides a style migration method for automatically generating stylized videos that realizes real-time, stable, arbitrary video style migration.
The technical scheme is as follows: the invention provides a style migration method for automatically generating a stylized video, which specifically comprises the following steps:
(1) constructing a video style migration network model, wherein the model comprises a high-compression self-encoder module based on knowledge distillation and a feature migration module based on semantic alignment; the self-encoder module comprises a lightweight encoder and a lightweight decoder;
(2) encoder encoding of content video frames and style images: knowledge distillation is performed on the lightweight encoder with the VGG network as teacher, so that the encoder learns the encoding capability of the teacher VGG encoder while remaining sufficiently lightweight, and original video content frames and style images are encoded into feature maps;
(3) the feature migration module based on semantic alignment: the content features and style features obtained from the encoder are fused to obtain semantically aligned fusion migration features;
(4) knowledge distillation of the lightweight decoder based on the VGG network: the decoder learns the decoding capability of the teacher VGG decoder while remaining sufficiently lightweight; it decodes the fused migration features into stylized video frames, which are finally synthesized into a video (a minimal sketch of the whole pipeline follows).
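As a minimal sketch of how steps (1)-(4) fit together at inference time (the PyTorch framing and the module names `encoder`, `ftm`, and `decoder` are illustrative assumptions, not part of the claimed method):

```python
import torch

def stylize_frames(frames, style_image, encoder, ftm, decoder):
    """Apply steps (2)-(4) of the method to a sequence of content frames.

    frames: iterable of (1, 3, H, W) tensors; style_image: a (1, 3, H, W) tensor.
    encoder, ftm, decoder: the trained lightweight modules from step (1).
    """
    with torch.no_grad():
        f_s = encoder(style_image)           # step (2): encode the style image once
        stylized = []
        for frame in frames:
            f_c = encoder(frame)             # step (2): encode the content frame
            f_cs = ftm(f_c, f_s)             # step (3): semantically aligned fusion
            stylized.append(decoder(f_cs))   # step (4): decode to a stylized frame
    return stylized
```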
Further, step (2) requires optimizing the following loss function:

$$\mathcal{L}_{enc} = \lVert I' - I \rVert_2 + \lambda \sum_k \lVert \hat{E}_k(I) - E_k(I) \rVert_2 + \gamma \sum_k \lVert E_k(I') - E_k(I) \rVert_2$$

wherein I is the original image, E is the encoder of the VGG network, Ê is the lightweight encoder, I' is the reconstructed image, E_k(I) is the k-th layer output feature map of the original VGG encoder, Ê_k(I) is the k-th layer output feature map of the lightweight encoder, and λ and γ are both hyper-parameters.
Further, step (3) is implemented as follows:

The feature map of the content image output by the encoder is F_c ∈ ℝ^{C×(W×H)} and that of the style image is F_s ∈ ℝ^{C×(W×H)}, where C is the number of channels of the feature map and W and H are its width and height. The feature migration module based on semantic alignment aims to find a transformation that yields semantically aligned feature migration for the content maps of different video frames. Supposing the transformation can be parameterized as a projection matrix P ∈ ℝ^{C×C}, the objective function to optimize is:

$$\min_P \sum_{i,j} A_{ij} \left\lVert P^\top F_c^{(i)} - F_s^{(j)} \right\rVert_2^2$$

wherein F_c^{(i)} denotes selecting the feature vector at the i-th position of F_c, and A_{ij} is the k-nearest-neighbor affinity matrix between F_c^{(i)} and F_s^{(j)};

Solving for P gives:

$$P = (F_c U F_c^\top)^{-1} F_c A F_s^\top$$

wherein A is the affinity matrix defined above and U is the diagonal degree matrix with U_{ii} = Σ_j A_{ij}. Since A can be decomposed as A = T T^⊤, the projection matrix P is formalized as P = g(F_c) f(F_s)^⊤, a linear transformation with g(X) = M X T, M = (F_c U F_c^⊤)^{-1}, and f(X) = X T; the f(·) process is fit with three convolutional layers, and the g(·) process uses a fully-connected layer fit.
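As a NumPy illustration of the closed-form solution under the definitions above (the use of a pseudo-inverse for numerical stability is an added assumption):

```python
import numpy as np

def closed_form_projection(Fc, Fs, A):
    """Solve min_P sum_ij A_ij ||P^T Fc_i - Fs_j||^2 in closed form.

    Fc, Fs: (C, N) feature maps flattened over the N = W*H spatial positions.
    A: (N, N) k-nearest-neighbor affinity matrix between content and style features.
    """
    U = np.diag(A.sum(axis=1))             # diagonal degree matrix of A
    M = np.linalg.pinv(Fc @ U @ Fc.T)      # (Fc U Fc^T)^{-1}, pinv for stability
    P = M @ Fc @ A @ Fs.T                  # projection matrix P in R^{C x C}
    return P.T @ Fc                        # semantically aligned content features
```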
Further, step (4) requires optimizing the following loss function:

$$\mathcal{L}_{dec} = \lVert \hat{D}(E_k(I)) - I \rVert_2 + \lambda \lVert I' - I \rVert_2, \qquad I' = \hat{D}(\hat{E}_k(I))$$

wherein I is the original image, E_k(I) is the k-th layer output feature map of the original VGG encoder, Ê_k(I) is the k-th layer output feature map of the lightweight encoder, I' is the reconstructed image decoded by the lightweight decoder D̂, and λ is a hyper-parameter. The aim of this distillation is that D̂ retains the information of the original E while being able to reconstruct images from the output feature information of Ê.
Beneficial effects: compared with the prior art, the invention has the following advantages: 1. high stylization quality of the video frames is achieved while stability and temporal consistency between adjacent frames are maintained, so the stability of the migrated video is ensured; 2. the stylization is richly diverse, and any video can be stylized in any style; 3. video style migration runs in real time, i.e., the style migration process is very fast, and the whole model is kept lightweight, so the method has high practicability.
Drawings
FIG. 1 is a flow chart of the invention;
FIG. 2 is a block diagram of a high compression self-encoder of the present invention based on knowledge distillation;
FIG. 3 is a schematic diagram of a video style migration network structure constructed by the present invention;
FIG. 4 is an exemplary diagram of a video style migration effect according to the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
The invention provides a style migration method for automatically generating a stylized video. The video style migration process mainly comprises three stages: in the first stage, an encoder encodes the content video frames and the style image; in the second stage, feature style migration and fusion are performed on the encoded content and style features; in the third stage, a decoder decodes the migrated and fused features to obtain stylized video frames, which are finally synthesized into a video. The sizes of the encoder and decoder largely determine whether the model is lightweight, and the design of the feature migration part directly determines whether the migrated stylized video is stable, whether migration can run in real time, and whether arbitrary styles can be migrated. As shown in fig. 1, the method specifically comprises the following steps:
step 1: and constructing a video style migration network model, as shown in FIG. 3, which comprises a high-compression self-encoder module based on knowledge distillation and a feature migration module based on semantic alignment.
The self-encoder is divided into an encoder and a decoder. The encoder encodes original video content frames and style images into feature maps; the feature migration module fuses the content features and style features produced by the encoder under semantic alignment, yielding semantically aligned fusion migration features; finally, the stylized video frames are obtained through the decoder.
The feature style migration module (FTM) based on semantic alignment ensures stability between adjacent frames during video style migration. The size of the video style migration model is only 2.67 MB, and video style migration can run at up to 166.67 fps.
Step 2: encoder encoding of content video frames and style images: knowledge distillation is performed on the lightweight encoder with the VGG network as teacher, so that the encoder learns the encoding capability of the teacher VGG encoder while remaining sufficiently lightweight, and original video content frames and style images are encoded into feature maps.
As shown in fig. 2, the network structure of the lightweight encoder and decoder is as follows: the lightweight encoder network consists of four symmetric groups of downsampling convolutional layers (mirrored by upsampling layers in the decoder), with max-pooling layers and ReLU activation functions, and feature-encodes the input video frames and arbitrary style images. The VGG network is a network structure widely used in style migration; the lightweight encoder network is a student network obtained by knowledge distillation from the VGG teacher network, so it realizes the image encoding process with very few parameters. As shown in the network architecture of fig. 2, the encoder network Ê should learn the encoding capability of the teacher VGG encoder E while being sufficiently lightweight. The loss function to optimize is:

$$\mathcal{L}_{enc} = \lVert I' - I \rVert_2 + \lambda \sum_k \lVert \hat{E}_k(I) - E_k(I) \rVert_2 + \gamma \sum_k \lVert E_k(I') - E_k(I) \rVert_2$$

wherein I is the original image, I' is the image reconstructed by the decoder, E_k(I) is the k-th layer output feature map of the original VGG encoder, Ê_k(I) is the k-th layer output feature map of the lightweight encoder, and λ and γ are both hyper-parameters.
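A PyTorch sketch of this distillation loss, matching the reconstruction above (the interfaces — both encoders returning per-layer feature lists of matching shapes — and the exact placement of the γ-weighted perceptual term are assumptions):

```python
import torch.nn.functional as F

def encoder_distill_loss(I, vgg_enc, light_enc, light_dec, lam=1.0, gamma=1.0):
    """Loss for distilling the lightweight encoder Ê from the VGG teacher E.

    vgg_enc(I) and light_enc(I) are assumed to return lists of per-layer
    feature maps E_k(I) and Ê_k(I); light_dec reconstructs I' from Ê(I).
    """
    E_k = vgg_enc(I)                           # teacher feature maps E_k(I)
    Ehat_k = light_enc(I)                      # student feature maps Ê_k(I)
    I_rec = light_dec(Ehat_k[-1])              # reconstructed image I'

    rec = F.mse_loss(I_rec, I)                 # reconstruction term ||I' - I||
    distill = sum(F.mse_loss(s, t)             # feature terms ||Ê_k(I) - E_k(I)||
                  for s, t in zip(Ehat_k, E_k))
    percep = sum(F.mse_loss(r, t)              # perceptual terms ||E_k(I') - E_k(I)||
                 for r, t in zip(vgg_enc(I_rec), E_k))
    return rec + lam * distill + gamma * percep
```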
Step 3: knowledge distillation is performed on the lightweight decoder with the VGG network as teacher, so that the decoder learns the decoding capability of the teacher VGG decoder while being sufficiently lightweight.
As shown in the network architecture of fig. 2, the lightweight decoder network D̂ performs feature decoding of the migrated features. Knowledge distillation is performed with the VGG network as the teacher network, and the decoder network must learn the decoding capability of the teacher VGG decoder while being sufficiently lightweight. The loss function to optimize is:

$$\mathcal{L}_{dec} = \lVert \hat{D}(E_k(I)) - I \rVert_2 + \lambda \lVert I' - I \rVert_2, \qquad I' = \hat{D}(\hat{E}_k(I))$$

wherein I' is the image reconstructed by the lightweight decoder D̂ and λ is a hyper-parameter. The aim of this distillation is that D̂ retains the information of the original E while being able to reconstruct images from the output feature information of Ê.
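Analogously, a sketch of the decoder distillation loss (same assumed interfaces as the encoder sketch above; the equal treatment of the two reconstruction paths is an assumption):

```python
import torch.nn.functional as F

def decoder_distill_loss(I, vgg_enc, light_enc, light_dec, lam=1.0):
    """Loss for distilling the lightweight decoder D̂: it should decode the
    teacher features E(I), retaining the information of E, and reconstruct
    I' from the lightweight features Ê(I)."""
    from_teacher = light_dec(vgg_enc(I)[-1])    # D̂(E_k(I))
    I_rec = light_dec(light_enc(I)[-1])         # I' = D̂(Ê_k(I))
    return F.mse_loss(from_teacher, I) + lam * F.mse_loss(I_rec, I)
```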
Step 4: the feature migration module based on semantic alignment fuses the content features and style features obtained from the encoder to obtain semantically aligned fusion migration features.
The feature migration module based on semantic alignment is the key to realizing real-time, stable video style migration: it must complete style feature migration efficiently while performing feature semantic alignment. To achieve this, the idea of manifold alignment is adopted. Assume the feature map of the content image output by the encoder is F_c ∈ ℝ^{C×(W×H)} and the output of the style image obtained by the lightweight encoding network is F_s ∈ ℝ^{C×(W×H)}, where C is the number of channels of the feature map and W and H are its width and height. The FTM module is designed to output the semantically aligned, migrated feature F_cs and pass it to the decoder to obtain the migrated result map. In fact, the goal of the designed FTM module is to find a transformation that yields semantically aligned feature migration for the content maps of different video frames. Supposing the transformation can be parameterized as a projection matrix P ∈ ℝ^{C×C}, the objective function to optimize is:

$$\min_P \sum_{i,j} A_{ij} \left\lVert P^\top F_c^{(i)} - F_s^{(j)} \right\rVert_2^2$$

wherein F_c^{(i)} denotes selecting the feature vector at the i-th position of F_c, and A_{ij} is the k-nearest-neighbor affinity matrix between F_c^{(i)} and F_s^{(j)}. The objective function therefore makes the transformed content features similar to their k-nearest-neighbor features in the style feature space. During video style migration there may be moving objects and lighting changes that would otherwise cause jitter after migration; with the above affinity-preserving transformation, stable consistency is kept between two adjacent frames, thereby generating a stable video style migration result.

Solving the above equation amounts to computing the closed-form solution for P, which is obtained by taking the derivative with respect to P and setting it to 0:

$$P = (F_c U F_c^\top)^{-1} F_c A F_s^\top$$

wherein A is the affinity matrix defined above and U is the diagonal degree matrix with U_{ii} = Σ_j A_{ij}. Since A can be decomposed as A = T T^⊤, the projection matrix P may be formalized as P = g(F_c) f(F_s)^⊤; in this linear transformation, g(X) = M X T with M = (F_c U F_c^⊤)^{-1}, and f(X) = X T. Even though P can be solved in closed form, the matrix inversion is very time-consuming, so an FTM network module is designed to fit the solution process: the f(·) process is fit with three convolutional layers, and the g(·) process uses a fully-connected layer fit, as in the sketch below.
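A PyTorch sketch of the FTM network module (the channel count, kernel sizes, and the normalization by the number of spatial positions are illustrative assumptions):

```python
import torch
import torch.nn as nn

class FTM(nn.Module):
    """Feature transfer module: f(.) is fit with three convolutional layers
    and g(.) with a fully-connected layer, as described above."""

    def __init__(self, c=64):
        super().__init__()
        self.f = nn.Sequential(                   # three conv layers fit f(.)
            nn.Conv2d(c, c, 3, padding=1), nn.ReLU(),
            nn.Conv2d(c, c, 3, padding=1), nn.ReLU(),
            nn.Conv2d(c, c, 3, padding=1))
        self.g = nn.Linear(c, c)                  # a fully-connected layer fits g(.)

    def forward(self, Fc, Fs):
        b, c, h, w = Fc.shape
        gc = self.g(Fc.flatten(2).transpose(1, 2)).transpose(1, 2)  # g(Fc): (b, c, h*w)
        fs = self.f(Fs).flatten(2)                                  # f(Fs): (b, c, h*w)
        P = torch.bmm(gc, fs.transpose(1, 2)) / (h * w)             # P = g(Fc) f(Fs)^T
        Fcs = torch.bmm(P.transpose(1, 2), Fc.flatten(2))           # migrate content features
        return Fcs.view(b, c, h, w)               # semantically aligned feature F_cs
```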
The content images used to train the self-encoder are first preprocessed: each image is uniformly resized to 256 × 256 pixels. Each content image is input into both the student self-encoder network and the teacher self-encoder network, each comprising an encoding part and a decoding part. The encoding part encodes the image; the decoding part reconstructs the input image from the feature codes produced by the encoder. Through the feature perception loss and the reconstruction loss, as shown in fig. 2, the knowledge-distillation-based training ensures that the distilled lightweight self-encoder network retains both multi-level feature extraction and feature-based image reconstruction capabilities. The content image and the style image are then fed into the style migration network with the semantic alignment feature migration module added, as shown in fig. 3, and the intermediate feature migration module is trained (with the distilled lightweight self-encoder network fixed) using the designed content loss Lc and style loss Ls.
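A sketch of this two-phase procedure (the Adam optimizer is an assumption, and `content_loss`/`style_loss` stand in for the designed Lc and Ls as hypothetical callables):

```python
import torch
from torchvision import transforms

# Content images are uniformly resized to 256 x 256 pixels before training.
preprocess = transforms.Compose([transforms.Resize((256, 256)),
                                 transforms.ToTensor()])

def train_ftm(ftm, light_enc, light_dec, loader, content_loss, style_loss,
              steps=10000, lr=1e-4):
    """Phase two: freeze the distilled self-encoder and train only the FTM
    with the content loss Lc and style loss Ls."""
    for p in list(light_enc.parameters()) + list(light_dec.parameters()):
        p.requires_grad_(False)                   # fix the distilled networks
    opt = torch.optim.Adam(ftm.parameters(), lr=lr)
    for step, (content, style) in zip(range(steps), loader):
        f_cs = ftm(light_enc(content)[-1], light_enc(style)[-1])
        out = light_dec(f_cs)
        loss = content_loss(out, content) + style_loss(out, style)  # Lc + Ls
        opt.zero_grad()
        loss.backward()
        opt.step()
```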
In the testing stage, the video frames and the selected style image are directly input into the trained lightweight style migration model, which automatically and efficiently outputs stylized results; a stable stylized video is finally synthesized in real time. Fig. 4 shows style migration results sampled every 10 frames from one video; semantically aligned style migration is performed for both foreground and background, producing stable video frame results.
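A test-time sketch that reads a video, stylizes every frame, and synthesizes the output video (`model` is a hypothetical wrapper around the trained encoder, FTM, and decoder; the OpenCV I/O is an assumption):

```python
import cv2
import torch

@torch.no_grad()
def stylize_video_file(src_path, style_image, model, dst_path="stylized.mp4"):
    """Stylize a video file frame by frame and write the synthesized result."""
    cap = cv2.VideoCapture(src_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    out = cv2.VideoWriter(dst_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    ok, frame = cap.read()
    while ok:
        x = torch.from_numpy(frame).permute(2, 0, 1).float().unsqueeze(0) / 255
        y = model(x, style_image).squeeze(0).clamp(0, 1)      # stylized frame
        out.write((y.permute(1, 2, 0).numpy() * 255).astype("uint8"))
        ok, frame = cap.read()
    cap.release()
    out.release()
```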
Claims (4)
1. A style migration method for automatically generating stylized video is characterized by comprising the following steps:
(1) constructing a video style migration network model, wherein the model comprises a high-compression self-encoder module based on knowledge distillation and a feature migration module based on semantic alignment; the self-encoder module comprises a lightweight encoder and a lightweight decoder;
(2) encoder encoding of content video frames and style images: knowledge distillation is performed on the lightweight encoder with the VGG network as teacher, so that the encoder learns the encoding capability of the teacher VGG encoder while remaining sufficiently lightweight, and original video content frames and style images are encoded into feature maps;
(3) the feature migration module based on semantic alignment: the content features and style features obtained from the encoder are fused to obtain semantically aligned fusion migration features;
(4) knowledge distillation of the lightweight decoder based on the VGG network: the decoder learns the decoding capability of the teacher VGG decoder while remaining sufficiently lightweight; it decodes the fused migration features into stylized video frames, which are finally synthesized into a video.
2. The style migration method for automatically generating stylized video according to claim 1, wherein step (2) is implemented by optimizing the following loss function:

$$\mathcal{L}_{enc} = \lVert I' - I \rVert_2 + \lambda \sum_k \lVert \hat{E}_k(I) - E_k(I) \rVert_2 + \gamma \sum_k \lVert E_k(I') - E_k(I) \rVert_2$$

wherein I is the original image, E is the encoder of the VGG network, Ê is the lightweight encoder, I' is the reconstructed image, E_k(I) is the k-th layer output feature map of the original VGG encoder, Ê_k(I) is the k-th layer output feature map of the lightweight encoder, and λ and γ are both hyper-parameters.
3. The style migration method for automatically generating stylized video according to claim 1, wherein step (3) is implemented as follows:

the feature map of the content image output by the encoder is F_c ∈ ℝ^{C×(W×H)} and that of the style image is F_s ∈ ℝ^{C×(W×H)}, where C is the number of channels of the feature map and W and H are its width and height; the feature migration module based on semantic alignment aims to find a transformation that yields semantically aligned feature migration for the content maps of different video frames; supposing the transformation can be parameterized as a projection matrix P ∈ ℝ^{C×C}, the objective function to optimize is:

$$\min_P \sum_{i,j} A_{ij} \left\lVert P^\top F_c^{(i)} - F_s^{(j)} \right\rVert_2^2$$

wherein F_c^{(i)} denotes selecting the feature vector at the i-th position of F_c, and A_{ij} is the k-nearest-neighbor affinity matrix between F_c^{(i)} and F_s^{(j)};

solving for P gives:

$$P = (F_c U F_c^\top)^{-1} F_c A F_s^\top$$

wherein A is the affinity matrix defined above and U is the diagonal degree matrix with U_{ii} = Σ_j A_{ij}; since A can be decomposed as A = T T^⊤, the projection matrix P is formalized as P = g(F_c) f(F_s)^⊤, a linear transformation with g(X) = M X T, M = (F_c U F_c^⊤)^{-1}, and f(X) = X T; the f(·) process is fit with three convolutional layers, and the g(·) process uses a fully-connected layer fit.
4. The style migration method for automatically generating stylized video according to claim 1, wherein step (4) is implemented by optimizing the following loss function:

$$\mathcal{L}_{dec} = \lVert \hat{D}(E_k(I)) - I \rVert_2 + \lambda \lVert I' - I \rVert_2, \qquad I' = \hat{D}(\hat{E}_k(I))$$

wherein I is the original image, E_k(I) is the k-th layer output feature map of the original VGG encoder, Ê_k(I) is the k-th layer output feature map of the lightweight encoder, I' is the reconstructed image decoded by the lightweight decoder D̂, and λ is a hyper-parameter; the aim of this distillation is that D̂ retains the information of the original E while being able to reconstruct images from the output feature information of Ê.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110117964.4A CN112884636B (en) | 2021-01-28 | 2021-01-28 | Style migration method for automatically generating stylized video |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110117964.4A CN112884636B (en) | 2021-01-28 | 2021-01-28 | Style migration method for automatically generating stylized video |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112884636A true CN112884636A (en) | 2021-06-01 |
CN112884636B CN112884636B (en) | 2023-09-26 |
Family
ID=76052976
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110117964.4A Active CN112884636B (en) | 2021-01-28 | 2021-01-28 | Style migration method for automatically generating stylized video |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112884636B (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190236814A1 (en) * | 2016-10-21 | 2019-08-01 | Google Llc | Stylizing input images |
CN110706151A (en) * | 2018-09-13 | 2020-01-17 | 南京大学 | Video-oriented non-uniform style migration method |
US20200151938A1 (en) * | 2018-11-08 | 2020-05-14 | Adobe Inc. | Generating stylized-stroke images from source images utilizing style-transfer-neural networks with non-photorealistic-rendering |
US20200167658A1 (en) * | 2018-11-24 | 2020-05-28 | Jessica Du | System of Portable Real Time Neurofeedback Training |
CN110175951A (en) * | 2019-05-16 | 2019-08-27 | 西安电子科技大学 | Video Style Transfer method based on time domain consistency constraint |
CN110310221A (en) * | 2019-06-14 | 2019-10-08 | 大连理工大学 | A kind of multiple domain image Style Transfer method based on generation confrontation network |
CN111325681A (en) * | 2020-01-20 | 2020-06-23 | 南京邮电大学 | Image style migration method combining meta-learning mechanism and feature fusion |
CN111932445A (en) * | 2020-07-27 | 2020-11-13 | 广州市百果园信息技术有限公司 | Compression method for style migration network and style migration method, device and system |
Non-Patent Citations (1)
Title |
---|
- Bao Yuxuan; Lu Tianliang; Du Yanhui: "A Survey of Deepfake Video Detection Technology", Computer Science, no. 09, pages 289-298
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113989102A (en) * | 2021-10-19 | 2022-01-28 | 复旦大学 | Rapid style migration method with high shape-preserving property |
CN114885174A (en) * | 2022-02-23 | 2022-08-09 | 中国科学院自动化研究所 | Video processing method and device and electronic equipment |
CN114331827A (en) * | 2022-03-07 | 2022-04-12 | 深圳市其域创新科技有限公司 | Style migration method, device, equipment and storage medium |
CN118283201A (en) * | 2024-06-03 | 2024-07-02 | 上海蜜度科技股份有限公司 | Video synthesis method, system, storage medium and electronic equipment |
CN118283201B (en) * | 2024-06-03 | 2024-10-15 | 上海蜜度科技股份有限公司 | Video synthesis method, system, storage medium and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN112884636B (en) | 2023-09-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112884636A (en) | Style migration method for automatically generating stylized video | |
CN113762322B (en) | Video classification method, device and equipment based on multi-modal representation and storage medium | |
CN110930342B (en) | Depth map super-resolution reconstruction network construction method based on color map guidance | |
CN111862294B (en) | Hand-painted 3D building automatic coloring network device and method based on ArcGAN network | |
CN111242844B (en) | Image processing method, device, server and storage medium | |
CN113934890B (en) | Method and system for automatically generating scene video by characters | |
WO2023151529A1 (en) | Facial image processing method and related device | |
CN113344188A (en) | Lightweight neural network model based on channel attention module | |
CN111626968B (en) | Pixel enhancement design method based on global information and local information | |
CN112381716A (en) | Image enhancement method based on generation type countermeasure network | |
CN115829876A (en) | Real degraded image blind restoration method based on cross attention mechanism | |
CN118230081B (en) | Image processing method, apparatus, electronic device, computer readable storage medium, and computer program product | |
CN114841859A (en) | Single-image super-resolution reconstruction method based on lightweight neural network and Transformer | |
CN112837212B (en) | Image arbitrary style migration method based on manifold alignment | |
CN117994447A (en) | Auxiliary generation method and system for 3D image of vehicle model design oriented to sheet | |
CN118229632A (en) | Display screen defect detection method, model training method, device, equipment and medium | |
CN112257464A (en) | Machine translation decoding acceleration method based on small intelligent mobile device | |
CN116977631A (en) | Streetscape semantic segmentation method based on DeepLabV3+ | |
CN116311455A (en) | Expression recognition method based on improved Mobile-former | |
CN114118415B (en) | Deep learning method of lightweight bottleneck attention mechanism | |
CN113706572B (en) | End-to-end panoramic image segmentation method based on query vector | |
CN110245677A (en) | A kind of image descriptor dimension reduction method based on convolution self-encoding encoder | |
CN114513684B (en) | Method for constructing video image quality enhancement model, video image quality enhancement method and device | |
CN117036729A (en) | Lightweight semantic image translation method based on feature pyramid | |
CN118887592A (en) | Modal missing RGBT tracking method and system based on missing perception prompt |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |