CN111429342B - Photo style migration method based on style corpus constraint - Google Patents
- Publication number
- CN111429342B (application CN202010239903.0A)
- Authority
- CN
- China
- Prior art keywords
- style
- network
- photo
- student
- teacher
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/04—Context-preserving transformations, e.g. by using an importance map
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention relates to a photo style migration method based on style corpus constraint, which comprises the following steps: acquiring the data set required to train a student network; selecting a teacher network and a student network and obtaining the photos they generate; constructing a style corpus; designing a multi-level adversarial distillation strategy based on the style corpus constraint; and training and optimizing the student network to perform photo style migration and obtain stylized photos. The method effectively alleviates problems such as distortion and unrealistic appearance in stylized images, which are caused by the mutual interference of the style information and content information of a single photo, and markedly improves the efficiency of photo style migration.
Description
Technical Field
The invention relates to the field of style migration in image processing, and in particular to a method for representing and migrating the style information of a single image in photo style migration.
Background
Style migration is a central research topic of non-photorealistic rendering in computer graphics: the rendering styles of different artistic forms are modeled algorithmically, enhancing how visual information is expressed in an image. Research on the artistic stylization of images enriches the theory of computer graphics and image processing and deepens and broadens the application domains of imagery. Photo style migration and art style migration are the two main tasks of style migration; compared with art style migration, photo style migration must not only transfer the style information of an art photo onto a content photo, but also require the stylized image to look like a photo taken by a camera.
Existing photo style migration methods mainly model the style information of a single artistic photo with statistical constructs such as the Gram matrix [1] and the covariance matrix [2][4], and perform style rendering through a Gram-matrix-based loss function and complex feature transformations. Because style information and content information are entangled within a single image, style information cannot be modeled clearly and accurately by a mathematical formula alone. As a result, content and style information interfere with each other during migration, and the stylized image suffers from structural distortion, inconsistent style within the same semantic region, and blurring, which does not meet the application requirements of photo style migration. To counter the image-quality degradation caused by the inability to model style information accurately, conventional methods must introduce complex color-space constraints [1], additional post-processing [2][3], or complex feature-transformation operations [4], which makes photo style migration slow and severely restricts practical application. There is therefore a need for photo style migration methods with better migration effects and higher efficiency.
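As background for the statistical modeling the passage criticizes, the Gram matrix of [1] can be sketched in a few lines (an illustrative example, not part of the patented method): each entry is the normalized inner product of two feature channels, so it records which features co-activate while discarding their spatial layout.

```python
def gram_matrix(features):
    """features: list of C channel vectors, each flattened to length H*W."""
    c = len(features)
    n = len(features[0])
    gram = [[0.0] * c for _ in range(c)]
    for a in range(c):
        for b in range(c):
            # Inner product of channel a and channel b, normalized by H*W.
            gram[a][b] = sum(x * y for x, y in zip(features[a], features[b])) / n
    return gram

# Two channels over four spatial positions.
feats = [[1.0, 2.0, 3.0, 4.0],
         [0.0, 1.0, 0.0, 1.0]]
g = gram_matrix(feats)
```

Because the spatial index is summed out, two images with the same channel co-activation statistics get the same Gram matrix, which is exactly why content and style become hard to separate with this statistic alone.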
References:
1. F. Luan, S. Paris, E. Shechtman, and K. Bala, "Deep photo style transfer," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 4990-4998.
2. Y. Li, M.-Y. Liu, X. Li, M.-H. Yang, and J. Kautz, "A closed-form solution to photorealistic image stylization," in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 453-468.
3. X. Li, S. Liu, J. Kautz, and M.-H. Yang, "Learning linear transformations for fast image and video style transfer," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 3809-3817.
4. J. Yoo, Y. Uh, S. Chun, B. Kang, and J.-W. Ha, "Photorealistic style transfer via wavelet transforms," in International Conference on Computer Vision (ICCV), 2019.
Disclosure of Invention
Aiming at the problem that prior methods cannot effectively disentangle the style information and content information of a single image, the invention provides a photo style migration method based on a style corpus constraint and an adversarial distillation learning strategy, which mainly comprises the following steps:
step S1: acquiring a data set required for training a student network;
step S2: selecting a teacher network and a student network to obtain generated photos of the teacher network and the student network;
step S3: constructing a style corpus;
step S4: designing a multi-level adversarial distillation strategy based on style corpus constraints;
step S5: training and optimizing a student network to carry out photo style migration;
step S6: acquiring a stylized photo;
Compared with current methods that model the style information of a single image with statistical constructs, the proposed method constrains the style information of a single image with a style corpus, which effectively overcomes the difficulty of modeling style information accurately when it is entangled with content information. Exploiting the property that photos within one style package share the same style while photos in different style packages do not, adversarial learning imposes a consistency constraint on the style migration result, alleviating the distortion and unrealistic artifacts caused by the mutual interference of style and content information. Finally, the invention uses a knowledge distillation strategy in which a neural network directly learns the complex feature-transformation operations of photo style migration, thereby improving migration efficiency.
Drawings
FIG. 1 is a schematic flow chart of the present invention;
FIG. 2 is a schematic diagram of a frame of the present invention;
FIG. 3 is an effect diagram of the present invention;
table 1 is a table of the test time statistics of the present invention.
Detailed Description
Referring to fig. 1 and 2, which are a flowchart and a framework diagram of the photo style migration method based on style corpus constraint, the method mainly comprises the following steps: acquiring the data set required to train a student network; selecting a teacher network and a student network and obtaining the photos they generate; constructing a style corpus; designing a multi-level adversarial distillation strategy based on the style corpus constraint; and training and optimizing the student network to perform photo style migration and obtain stylized photos. The specific implementation details of each step are as follows:
step S1: the data set required for training the student network is acquired in the following specific way:
step S11: the COCO dataset was downloaded as a content dataset, the number of images in the dataset being noted N.
Step S12: and downloading the art photos disclosed by the WikiArt website as a style data set, wherein the number of photos in the data set is recorded as M.
Step S2: select a teacher network and a student network and obtain the photos they generate, as follows:
Step S21: select the wavelet-corrected end-to-end style migration network WCT² as the teacher network; its network weight parameters are fixed and denoted T.
Step S22: select the artistic style migration network AdaIN as the student network, introduce skip connections between the pooling layers of the encoder and the corresponding deconvolution layers of the decoder, initialize the network randomly, and denote it S.
Step S23: normalize, crop, and batch the content data set and the style data set. Select any image from the content data set, denoted c_i (i = 1, 2, ..., N), and any image from the style data set, denoted r_j (j = 1, 2, ..., M). Input c_i and r_j into the teacher network T to obtain the generated photo t_{i,j}, and input c_i and r_j into the student network S to obtain the generated photo s_{i,j}.
Step S3: construct the style corpus, as follows:
Step S31: use the teacher network T to render the style of photo r_j onto every image in the content data set, and record the resulting set of generated photos as the style package B_j = {t_{i,j} | i = 1, 2, ..., N}. All photos in style package B_j differ in content but share the same style.
Step S32: following step S31, obtain the style packages corresponding to all photos in the style data set and define the style corpus as Ω = {B_j | j = 1, 2, ..., M}. Photos in different style packages differ in style information but are generated from the same content images.
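A minimal sketch of the corpus construction in steps S31-S32, under stand-in names (`teacher_render` is a placeholder for the fixed teacher network T, and string labels replace actual photos, so only the corpus structure is shown):

```python
def teacher_render(content, style):
    # Placeholder for the teacher network: T(c_i, r_j) -> t_{i,j}.
    return f"t({content},{style})"

def build_style_corpus(content_set, style_set):
    """Style corpus: one style package B_j per style photo r_j,
    each package holding r_j's style rendered onto every content image."""
    corpus = {}
    for style in style_set:
        # Style package B_j: same style across all N content images.
        corpus[style] = [teacher_render(c, style) for c in content_set]
    return corpus

corpus = build_style_corpus(["c1", "c2", "c3"], ["r1", "r2"])
```

With N content images and M style photos, the corpus holds M packages of N photos each, which is what makes the within-package (same style) and cross-package (same content) comparisons of step S4 possible.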
Step S4: design the multi-level adversarial distillation strategy based on the style corpus constraint, as follows:
Step S41: design the loss function L_pix = ||s_{i,j} - t_{i,j}||_1 so that the photo s_{i,j} generated by the student network is as close as possible, in pixel space, to the photo t_{i,j} generated by the teacher network.
Step S42: design the loss function L_feat = Σ_k λ_k ||Φ_k(s_{i,j}) - Φ_k(t_{i,j})||_1 so that the student-generated photo s_{i,j} and the teacher-generated photo t_{i,j} are as close as possible in feature space, where Φ_k(·) denotes the feature map of an image in the loss network VGG-16, the index k ranges over layers 3, 8, 15, and 22 of the VGG-16 network, and λ_k denotes the weight coefficient of the corresponding layer, taken as 1, 1, 0.5, and 0.5 respectively.
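The two distillation losses of steps S41-S42 can be sketched on flat vectors (illustrative only; the real inputs are image tensors and VGG-16 feature maps, and the norm choice here is an assumption):

```python
def l1(a, b):
    """L1 norm ||a - b||_1 of the difference of two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def pixel_loss(s, t):
    # Step S41: L_pix = ||s_ij - t_ij||_1 in pixel space.
    return l1(s, t)

def feature_loss(phi_s, phi_t, weights):
    # Step S42: weighted sum of distances between feature maps Phi_k
    # of the chosen loss-network layers (lambda_k = 1, 1, 0.5, 0.5).
    return sum(w * l1(fs, ft) for w, fs, ft in zip(weights, phi_s, phi_t))

student_pix = [2, 5, 9]
teacher_pix = [0, 5, 10]
lp = pixel_loss(student_pix, teacher_pix)       # |2-0| + |5-5| + |9-10|

phi_s = [[1, 2], [3, 4]]   # stand-in feature maps at two layers
phi_t = [[1, 1], [3, 3]]
lf = feature_loss(phi_s, phi_t, [1.0, 0.5])
```

The pixel term ties the student to the teacher point-by-point, while the layer-weighted feature term matches them at increasing levels of abstraction.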
Step S43: design the adversarial loss function L_cd = E_{(c_i, r_j, t_{i,j})}[log D_cd(c_i, r_j, t_{i,j})] + E_{(c_i, r_j, s_{i,j})}[log(1 - D_cd(c_i, r_j, s_{i,j}))] so that the student-generated photo s_{i,j} and the teacher-generated photo t_{i,j} are as close as possible in overall distribution, where D_cd is a condition discriminator consisting entirely of convolutional layers, C denotes the set of content images, R denotes the set of style images, Ω_S denotes the set of student-network outputs, Ω_T denotes the set of teacher-network outputs, and E[·] denotes the expectation of the bracketed value.
The content image c_i, the style image r_j, and the student-generated photo s_{i,j} are concatenated together as a False sample, while c_i, r_j, and the teacher-generated photo t_{i,j} are concatenated together as a True sample; through adversarial training, the student-generated photo s_{i,j} is driven as close as possible to the teacher-generated photo t_{i,j}.
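Step S43's objective behaves like a standard conditional-GAN discriminator loss. The sketch below is an assumption about the exact form (the patent's formula is not fully legible here): probabilities stand in for D_cd's outputs on the concatenated triples, and the two expectation terms are simple averages.

```python
import math

def cd_loss(real_scores, fake_scores):
    """L_cd = E[log D(True triple)] + E[log(1 - D(False triple))].
    real_scores: D_cd outputs on teacher triples (c_i, r_j, t_ij);
    fake_scores: D_cd outputs on student triples (c_i, r_j, s_ij)."""
    real_term = sum(math.log(p) for p in real_scores) / len(real_scores)
    fake_term = sum(math.log(1.0 - p) for p in fake_scores) / len(fake_scores)
    return real_term + fake_term

# A discriminator that cannot tell student from teacher outputs 0.5
# everywhere -- the convergence target used later in step S55.
confused = cd_loss([0.5, 0.5], [0.5, 0.5])
sharp = cd_loss([0.99, 0.99], [0.01, 0.01])  # confident discriminator
```

A confident discriminator scores higher than a confused one, so driving D_cd back toward 0.5 on both sets means the student's distribution has become indistinguishable from the teacher's.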
Step S44: design the style-consistency loss function L_sd = E[log D_sd(t_{i+1,j}, t_{i,j})] + E[log(1 - D_sd(s_{i,j}, t_{i,j}))] + E[log(1 - D_sd(t_{i,j}, t_{i,j+1}))] so that the student-generated photo s_{i,j} and the teacher-generated photo t_{i,j} are as close as possible in style, where D_sd denotes a style discriminator consisting entirely of convolutional layers and E[·] denotes the expectation of the bracketed value. Here (t_{i+1,j}, t_{i,j}) is a pair of teacher-generated photos with different content but the same style, (s_{i,j}, t_{i,j}) is a pair of photos with the same content and style generated by the student network and the teacher network respectively, and (t_{i,j}, t_{i,j+1}) is a pair of teacher-generated photos with the same content but different styles.
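The pairing scheme of step S44 can be made concrete with a small sketch (string labels stand in for generated photos; indices follow the patent's t_{i,j} convention of content i, style j):

```python
def make_style_pairs(t, s, i, j):
    """t[i][j]: teacher photo for content i, style j; s likewise for the student.
    Returns the pairs fed to the style discriminator D_sd."""
    positives = [(t[i + 1][j], t[i][j])]    # same style, different content
    negatives = [(s[i][j], t[i][j]),        # student vs. teacher, same content/style
                 (t[i][j], t[i][j + 1])]    # same content, different style
    return positives, negatives

t = [["t00", "t01"], ["t10", "t11"]]
s = [["s00", "s01"], ["s10", "s11"]]
pos, neg = make_style_pairs(t, s, 0, 0)
```

The style corpus is what supplies these pairs: positives come from within one style package B_j, and the cross-style negatives come from two different packages, so D_sd is forced to judge style alone rather than content.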
Step S5: train and optimize the student network to perform photo style migration, as follows:
Step S51: combine the loss functions with different weights to obtain the overall optimization objective L_total = λ_1 L_pix + λ_2 L_feat + λ_3 L_cd + λ_4 L_sd, and train the whole network; the parameters of the teacher network T remain fixed throughout training.
Step S52: lock the style discriminator D_sd and the student network S, update the parameters of the condition discriminator D_cd twice, then lock D_cd.
Step S53: unlock the style discriminator D_sd, update its parameters twice, then lock D_sd.
Step S54: unlock the student network S, update its parameters once, then lock S.
Step S55: repeat steps S52, S53, and S54 until the loss functions of the condition discriminator D_cd and the style discriminator D_sd converge near 0.5, then stop training and save the trained student network S, the condition discriminator D_cd, and the style discriminator D_sd.
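The alternating lock/unlock schedule of steps S52-S55 amounts to a fixed update ratio per cycle: two steps for D_cd, two for D_sd, one for S. A sketch with counters standing in for gradient updates:

```python
def train_cycles(n_cycles):
    """Count parameter updates per module over n cycles of S52-S54."""
    steps = {"D_cd": 0, "D_sd": 0, "S": 0}
    for _ in range(n_cycles):
        steps["D_cd"] += 2   # S52: D_sd and S locked, D_cd updated twice
        steps["D_sd"] += 2   # S53: D_sd updated twice, then locked
        steps["S"] += 1      # S54: student updated once, then locked
    return steps

counts = train_cycles(5)
```

Giving each discriminator two updates per student update keeps both discriminators near their equilibrium, which is why loss convergence around 0.5 is a usable stopping criterion.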
Step S6: obtain the stylized photo: select any content image and any style photo, input them into the student network S obtained in step S5, and obtain the stylized photo.
Compared with current methods that model the style information of a single image with statistical constructs, the proposed method constrains the style information of a single image with a style corpus, which effectively overcomes the difficulty of modeling style information accurately when it is entangled with content information. Exploiting the property that photos within one style package share the same style while photos in different style packages do not, adversarial learning imposes a consistency constraint on the style migration result, alleviating the distortion and unrealistic artifacts caused by the mutual interference of style and content information. Finally, the invention uses a knowledge distillation strategy in which a neural network directly learns the complex feature-transformation operations of photo style migration, thereby improving migration efficiency. The effect and efficiency of the photo style migration method based on the style corpus constraint are shown in fig. 3 and table 1: it effectively mitigates the image distortion and unrealistic appearance seen in prior photo style migration, and is 13 to 50 times faster than method [4].
Claims (1)
1. A photo style migration method based on style corpus constraint is characterized by comprising the following steps:
step S1: the data set required for training the student network is acquired in the following specific way:
step S11: downloading COCO data sets as content data sets, wherein the number of images in the data sets is recorded as N;
step S12: downloading art photos disclosed by a Wikiart website as a style data set, wherein the number of photos in the data set is recorded as M;
step S2: selecting a teacher network and a student network and obtaining the photos they generate, in the following specific way:
step S21: selecting the wavelet-corrected end-to-end style migration network WCT² as the teacher network, fixing its network weight parameters, and denoting it T;
step S22: selecting the artistic style migration network AdaIN as the student network, introducing skip connections between the pooling layers of the encoder and the corresponding deconvolution layers of the decoder, initializing the network randomly, and denoting it S;
step S23: normalizing, cropping, and batching the content data set and the style data set; selecting any image from the content data set, denoted c_i (i = 1, 2, ..., N), and any image from the style data set, denoted r_j (j = 1, 2, ..., M); inputting c_i and r_j into the teacher network T to obtain the generated photo t_{i,j}, and inputting c_i and r_j into the student network S to obtain the generated photo s_{i,j};
step S3: constructing the style corpus, in the following specific way:
step S31: using the teacher network T to render the style of photo r_j onto every image in the content data set, and recording the resulting set of generated photos as the style package B_j = {t_{i,j} | i = 1, 2, ..., N}, where all photos in style package B_j differ in content but share the same style;
step S32: following step S31, obtaining the style packages corresponding to all photos in the style data set and defining the style corpus as Ω = {B_j | j = 1, 2, ..., M}, where photos in different style packages differ in style information but are generated from the same content images;
step S4: designing the multi-level adversarial distillation strategy based on the style corpus constraint, in the following specific way:
step S41: designing the loss function L_pix = ||s_{i,j} - t_{i,j}||_1 so that the photo s_{i,j} generated by the student network is as close as possible, in pixel space, to the photo t_{i,j} generated by the teacher network;
step S42: designing the loss function L_feat = Σ_k λ_k ||Φ_k(s_{i,j}) - Φ_k(t_{i,j})||_1 so that the student-generated photo s_{i,j} and the teacher-generated photo t_{i,j} are as close as possible in feature space, where Φ_k(·) denotes the feature map of an image in the loss network VGG-16, the index k ranges over layers 3, 8, 15, and 22 of the VGG-16 network, and λ_k denotes the weight coefficient of the corresponding layer, taken as 1, 1, 0.5, and 0.5 respectively;
step S43: designing the adversarial loss function L_cd = E_{(c_i, r_j, t_{i,j})}[log D_cd(c_i, r_j, t_{i,j})] + E_{(c_i, r_j, s_{i,j})}[log(1 - D_cd(c_i, r_j, s_{i,j}))] so that the student-generated photo s_{i,j} and the teacher-generated photo t_{i,j} are as close as possible in overall distribution, where D_cd is a condition discriminator consisting entirely of convolutional layers, C denotes the set of content images, R denotes the set of style images, Ω_S denotes the set of student-network outputs, Ω_T denotes the set of teacher-network outputs, and E[·] denotes the expectation of the bracketed value; the content image c_i, the style image r_j, and the student-generated photo s_{i,j} are concatenated together as a False sample, while c_i, r_j, and the teacher-generated photo t_{i,j} are concatenated together as a True sample, and through adversarial training the student-generated photo s_{i,j} is driven as close as possible to the teacher-generated photo t_{i,j};
step S44: designing the style-consistency loss function L_sd = E[log D_sd(t_{i+1,j}, t_{i,j})] + E[log(1 - D_sd(s_{i,j}, t_{i,j}))] + E[log(1 - D_sd(t_{i,j}, t_{i,j+1}))] so that the student-generated photo s_{i,j} and the teacher-generated photo t_{i,j} are as close as possible in style, where D_sd denotes a style discriminator consisting entirely of convolutional layers and E[·] denotes the expectation of the bracketed value; (t_{i+1,j}, t_{i,j}) is a pair of teacher-generated photos with different content but the same style, (s_{i,j}, t_{i,j}) is a pair of photos with the same content and style generated by the student network and the teacher network respectively, and (t_{i,j}, t_{i,j+1}) is a pair of teacher-generated photos with the same content but different styles;
step S5: training and optimizing the student network to perform photo style migration, in the following specific way:
step S51: combining the loss functions with different weights to obtain the overall optimization objective L_total = λ_1 L_pix + λ_2 L_feat + λ_3 L_cd + λ_4 L_sd and training the whole network, the parameters of the teacher network T remaining fixed throughout training;
step S52: locking the style discriminator D_sd and the student network S, updating the parameters of the condition discriminator D_cd twice, then locking D_cd;
step S53: unlocking the style discriminator D_sd, updating its parameters twice, then locking D_sd;
step S54: unlocking the student network S, updating its parameters once, then locking S;
step S55: repeating steps S52, S53, and S54 until the loss functions of the condition discriminator D_cd and the style discriminator D_sd converge near 0.5, then stopping training and saving the trained student network S, the condition discriminator D_cd, and the style discriminator D_sd;
step S6: obtaining the stylized photo: selecting any content image and any style photo, inputting them into the student network S obtained in step S5, and obtaining the stylized photo.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010239903.0A CN111429342B (en) | 2020-03-31 | 2020-03-31 | Photo style migration method based on style corpus constraint |
Publications (2)
Publication Number | Publication Date
---|---
CN111429342A (en) | 2020-07-17
CN111429342B (en) | 2024-01-05
Family
ID=71550668
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113344771B (en) * | 2021-05-20 | 2023-07-25 | 武汉大学 | Multifunctional image style migration method based on deep learning |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109523460A (en) * | 2018-10-29 | 2019-03-26 | 北京达佳互联信息技术有限公司 | Moving method, moving apparatus and the computer readable storage medium of image style |
CN110175951A (en) * | 2019-05-16 | 2019-08-27 | 西安电子科技大学 | Video Style Transfer method based on time domain consistency constraint |
CN110458750A (en) * | 2019-05-31 | 2019-11-15 | 北京理工大学 | A kind of unsupervised image Style Transfer method based on paired-associate learning |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10565757B2 (en) * | 2017-06-09 | 2020-02-18 | Adobe Inc. | Multimodal style-transfer network for applying style features from multi-resolution style exemplars to input images |
US10872399B2 (en) * | 2018-02-02 | 2020-12-22 | Nvidia Corporation | Photorealistic image stylization using a neural network model |
Non-Patent Citations (1)
Title |
---|
Design and Analysis of an Image Style Transfer Algorithm Based on VGG-19; Zhang Yue; Liu Caiyun; Xiong Jie; Information Technology and Informatization (Issue 01); full text *
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |