CN111429342B - Photo style migration method based on style corpus constraint - Google Patents

Photo style migration method based on style corpus constraint

Info

Publication number
CN111429342B
CN111429342B
Authority
CN
China
Prior art keywords
style
network
photo
student
teacher
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010239903.0A
Other languages
Chinese (zh)
Other versions
CN111429342A (en)
Inventor
乔应旭
刘红敏
霍占强
杨红果
王静
付威廉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan University of Technology
Original Assignee
Henan University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan University of Technology
Priority to CN202010239903.0A
Publication of CN111429342A
Application granted
Publication of CN111429342B
Legal status: Active
Anticipated expiration

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/04 Context-preserving transformations, e.g. by using an importance map
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a photo style migration method based on style corpus constraint, which comprises the following steps: acquire the dataset required for training the student network; select a teacher network and a student network and obtain the photos they generate; construct a style corpus; design a multi-level adversarial distillation strategy based on the style corpus constraint; train and optimize the student network to carry out photo style migration; and acquire the stylized photo. The method can effectively alleviate problems such as distortion and unrealistic appearance in the stylized image, which are caused by mutual interference between the style information and the content information of a single photo, and significantly improves photo style migration efficiency.

Description

Photo style migration method based on style corpus constraint
Technical Field
The invention relates to the field of style migration in image processing, and in particular to a method for representing and migrating the style information of a single image in photo style migration.
Background
Style migration is a central research topic of non-photorealistic rendering in computer graphics: the rendering styles of different artistic forms are modeled algorithmically, enhancing the way visual information is expressed in an image. Research on the artistic stylization of images enriches the theory of computer graphics and image processing, and deepens and broadens the application domains of images. Photo style migration and artistic style migration are the two main tasks of style migration; compared with artistic style migration, photo style migration must not only migrate the style information of an art photo onto a content photo, but also requires the stylized image to look like a photo taken by a camera.
Existing photo style migration methods mainly model the style information of a single artistic photo with statistics such as the Gram matrix [1] and the covariance matrix [2][4], and perform style rendering with Gram-matrix-based loss functions and complex feature transformations. Because style information and content information are entangled within a single image, style information cannot be modeled clearly and accurately by a mathematical formula alone. As a result, content and style interfere with each other during migration, and the stylized image suffers from structural distortion, inconsistent style within the same semantic region, and blurring, falling short of the practical requirements of photo style migration. To counter the quality degradation caused by inaccurate style modeling, existing methods introduce complex color-space constraints [1], extra post-processing [2][3], and costly feature transformation operations [4], which make photo style migration slow and severely restrict its practical application. There is therefore a need for photo style migration methods with better migration effects and higher efficiency.
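For context, the Gram-matrix statistic referenced in [1] can be computed in a few lines. The following PyTorch sketch is illustrative only (it is not part of the claimed method) and shows why this statistic entangles style with content: it summarizes the channel correlations of an entire feature map, with no mechanism to separate the two kinds of information.

```python
import torch

def gram_matrix(feat: torch.Tensor) -> torch.Tensor:
    """Gram matrix of a feature map, the classic single-image style statistic.

    feat: (B, C, H, W) activations from a pretrained CNN such as VGG.
    Returns: (B, C, C) channel-correlation matrices, normalized by C*H*W.
    """
    b, c, h, w = feat.shape
    f = feat.reshape(b, c, h * w)            # flatten the spatial dimensions
    gram = torch.bmm(f, f.transpose(1, 2))   # channel-by-channel correlations
    return gram / (c * h * w)                # normalize by feature size

# A Gram-based style loss then compares these statistics between the stylized
# output and the style photo at selected layers, e.g.:
# style_loss = torch.nn.functional.mse_loss(gram_matrix(f_out), gram_matrix(f_style))
```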
References:
1. F. Luan, S. Paris, E. Shechtman, and K. Bala, "Deep photo style transfer," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 4990-4998.
2. Y. Li, M.-Y. Liu, X. Li, M.-H. Yang, and J. Kautz, "A closed-form solution to photorealistic image stylization," in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 453-468.
3. X. Li, S. Liu, J. Kautz, and M.-H. Yang, "Learning linear transformations for fast image and video style transfer," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 3809-3817.
4. J. Yoo, Y. Uh, S. Chun, B. Kang, and J.-W. Ha, "Photorealistic style transfer via wavelet transforms," in International Conference on Computer Vision (ICCV), 2019.
Disclosure of Invention
To address the inability of existing methods to disentangle the style information and content information of a single image, the invention provides a photo style migration method based on a style corpus constraint and an adversarial distillation learning strategy, which mainly comprises the following steps:
step S1: acquiring a data set required for training a student network;
step S2: selecting a teacher network and a student network to obtain generated photos of the teacher network and the student network;
step S3: constructing a style corpus;
step S4: designing a multi-level adversarial distillation strategy based on style corpus constraints;
step S5: training and optimizing a student network to carry out photo style migration;
step S6: acquiring a stylized photo;
compared with the current method for modeling the style information of the single image by using the statistical method, the method provided by the invention adopts the style corpus to restrict the style information of the single image, and can effectively overcome the problem that the style information and the content information of the single image are difficult to accurately model due to intertwining. Based on the characteristics of the same photo style in the same style package and different photo styles in different style packages, the consistency constraint is carried out on the style migration effect through countermeasure learning, and the problems of image distortion, unrealism and the like caused by mutual interference of style information and content information are relieved. Finally, the invention uses the strategy of knowledge distillation to directly learn the complex feature conversion operation in the photo style migration by using the neural network, thereby improving the photo style migration efficiency.
Drawings
FIG. 1 is a schematic flow chart of the present invention;
FIG. 2 is a schematic diagram of a frame of the present invention;
FIG. 3 is an effect diagram of the present invention;
table 1 is a table of the test time statistics of the present invention.
Detailed Description
Referring to fig. 1 and 2, the flowchart and framework diagram of the photo style migration method based on style corpus constraint, the method mainly comprises the following steps: acquire the dataset required for training the student network; select a teacher network and a student network and obtain the photos they generate; construct a style corpus; design a multi-level adversarial distillation strategy based on the style corpus constraint; train and optimize the student network to carry out photo style migration; and acquire the stylized photo. The implementation details of each step are as follows:
step S1: the data set required for training the student network is acquired in the following specific way:
step S11: the COCO dataset was downloaded as a content dataset, the number of images in the dataset being noted N.
Step S12: and downloading the art photos disclosed by the WikiArt website as a style data set, wherein the number of photos in the data set is recorded as M.
Step S2: selecting a teacher network and a student network to obtain generated photos of the teacher network and the student network, wherein the specific mode is as follows:
step S21: selecting end-to-end style migration network WCT based on wavelet transform correction 2 As a teacher network, the fixed network weight parameter is denoted T.
Step S22: an artistic style migration network AdaIN is selected as a student network, jump connection is introduced between a pooling layer of an encoder and a corresponding deconvolution layer in a decoder, and the network is randomly initialized and marked as S.
Step S23: carrying out standardization, clipping and packaging treatment on the content data set and the style data set; optionally one image from the content data set, denoted c i (i=1, 2, …, N), optionally one image from the style dataset, denoted r j (j=1, 2, …, M), c i And r j Inputting teacher network T to obtain generated photo T i,j C, adding i And r j Inputting student network S to obtain generated photo S i,j
Step S3: the style corpus is constructed by the following specific modes:
step S31: utilizing teacher network T to make style photo r j Is rendered to all images in the content data set, and the resulting generated photo collection is recorded as a style packageStyle bag B j All of the photos in (a) are different but the style is the same.
Step S32: according to step S31, obtaining style packages corresponding to all photos in the style data set, and defining a style corpusIs thatThe style information of the photos in different style packages is different, but the content images are the same.
Step S4: the multi-level anti-distillation strategy based on style corpus constraint is designed in the following specific mode:
step S41: design loss function L pix =||s i,j -t i,j || 1 So that the student network generates a photo s i,j Generating a photo t with a teacher network i,j As close as possible in pixel space.
Step S42: design loss functionSo that the student network generates a photo s i,j Generating a photo t with a teacher network i,j In the feature space as close as possible, wherein +.>The characteristic diagram corresponding to the image in the loss network VGG19 is shown, and the parameter k is shown as 3 rd and 8 th of VGG16 network,
15. 22 layers. Lambda (lambda) k The weight coefficients corresponding to different layers are represented, and 1,1,0.5,0.5 is respectively taken.
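A sketch of the losses of steps S41-S42. The patent fixes the layers and weights but not the distance in feature space, so the L1 norm below is an assumption. Conveniently, indices 3, 8, 15 and 22 of torchvision's VGG-16 feature stack are the ReLU layers relu1_2, relu2_2, relu3_3 and relu4_3.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

class PerceptualLoss(torch.nn.Module):
    """L_pix + L_feat sketch: VGG-16 feature layers 3, 8, 15, 22 with
    weights 1, 1, 0.5, 0.5, matching steps S41-S42."""
    def __init__(self):
        super().__init__()
        self.vgg = vgg16(weights="IMAGENET1K_V1").features.eval()
        for p in self.vgg.parameters():
            p.requires_grad_(False)
        self.layers = {3: 1.0, 8: 1.0, 15: 0.5, 22: 0.5}  # layer index -> lambda_k

    def features(self, x):
        feats = {}
        for idx, layer in enumerate(self.vgg):
            x = layer(x)
            if idx in self.layers:
                feats[idx] = x
            if idx >= max(self.layers):   # no need to run past layer 22
                break
        return feats

    def forward(self, s_ij, t_ij):
        l_pix = F.l1_loss(s_ij, t_ij)     # L_pix = ||s_ij - t_ij||_1
        fs, ft = self.features(s_ij), self.features(t_ij)
        l_feat = sum(w * F.l1_loss(fs[k], ft[k]) for k, w in self.layers.items())
        return l_pix, l_feat
```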
Step S43: design loss functionSo that the student network generates a photo s i,j Generating a photo t with a teacher network i,j As close as possible in overall distribution, where D cd Is a discriminator consisting entirely of convolutional layers, the parameter C representing the set of content images, R representing the set of style images, Ω s Representing the result set of the student network output omega T Representing the teacher's network output result set. Symbol E [. Cndot.]Indicating a pair of brackets []The function value in the inner is expected.
Content image c i Style image r j Generating photos s with student network i,j Spliced together as FalseContent image c i Style image r j Generating a photo t with a teacher network i,j Spliced together to be used as True, and the student network generates photos s through countermeasure training i,j Generating a photo t with a teacher network i,j As close as possible.
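An illustrative conditional discriminator and its loss for step S43. The patent only states that D_cd is built entirely from convolutional layers and that (c_i, r_j, t_{i,j}) is True while (c_i, r_j, s_{i,j}) is False; the channel-wise concatenation of equally sized inputs and the PatchGAN-style layer sizes below are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CondDiscriminator(nn.Module):
    """Fully convolutional conditional discriminator D_cd (layer sizes assumed).
    Input: content, style, and generated photo concatenated on channels (3*3=9),
    all assumed resized to the same spatial resolution."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(9, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 1, 4, padding=1),   # real/fake patch logits
        )

    def forward(self, c, r, g):
        return self.net(torch.cat([c, r, g], dim=1))

def d_cd_loss(d_cd, c, r, t_ij, s_ij):
    """Discriminator loss for step S43: (c, r, t_ij) is True, (c, r, s_ij) is False."""
    real = d_cd(c, r, t_ij)
    fake = d_cd(c, r, s_ij.detach())   # do not backprop into the student here
    return (F.binary_cross_entropy_with_logits(real, torch.ones_like(real))
            + F.binary_cross_entropy_with_logits(fake, torch.zeros_like(fake)))
```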
Step S44: design loss function
So that the student network generates a photo s i,j Generating a photo t with a teacher network i,j As close as possible in style, where D sd Representing a style discriminator consisting entirely of convolutional layers, symbol E [. Cndot.]Indicating a pair of brackets []The function value in (t) is expected i+1,j ,t i,j ) Representing two pictures of different content and same style generated by teacher network,(s) i,j ,t i,j ) Is two photos with the same content and style generated by a student network and a teacher network, (t) i,j ,t i,j+1 ) Representing two photos of the same content but in different styles generated by the teacher network.
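An illustrative loss for the style discriminator of step S44. D_sd is assumed to score a pair of photos concatenated on channels (6 input channels, e.g. a convolutional network shaped like the D_cd sketch above) and to answer whether the two photos share a style; the pairing-by-concatenation scheme is an assumption.

```python
import torch
import torch.nn.functional as F

def d_sd_loss(d_sd, t_next_j, t_ij, s_ij, t_i_next):
    """Style-discriminator loss sketch for step S44."""
    real = d_sd(torch.cat([t_next_j, t_ij], dim=1))         # same style, different content
    fake_s = d_sd(torch.cat([s_ij.detach(), t_ij], dim=1))  # student vs. teacher pair
    fake_x = d_sd(torch.cat([t_ij, t_i_next], dim=1))       # same content, different styles
    return (F.binary_cross_entropy_with_logits(real, torch.ones_like(real))
            + F.binary_cross_entropy_with_logits(fake_s, torch.zeros_like(fake_s))
            + F.binary_cross_entropy_with_logits(fake_x, torch.zeros_like(fake_x)))
```

When the student itself is updated (step S54), the pair (s_{i,j}, t_{i,j}) is scored against a "real" label instead, pushing the student's output toward the teacher's style within the style package.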
Step S5: the training and optimizing student network carries out photo style migration in the following specific modes:
step S51: combining multiple loss functions according to different weights to obtain an overall optimization function
The whole network is trained, and the T parameters of the teacher network are always fixed in the training process.
Step S52: locking style discriminator D sd And student network S, update condition discriminator D cd Parameter twice post-lock condition discriminator D cd
Step S53: unlocking style discriminator D sd Locking style discriminator D after updating parameters twice sd
Step S54: and unlocking the student network S, and locking the student network S after updating the parameters once.
Step S55: repeating steps S52, S53 and S54 until the condition discriminator D cd And style discriminator D sd Stopping training when the loss function converges near 0.5, and storing the training student network S and the condition discriminator D cd And style discriminator D sd
Step S6: and (3) acquiring the stylized photo, namely selecting any one content image and style photo to input the student network S obtained in the step S5, and obtaining the stylized photo.
Compared with current methods that model the style information of a single image statistically, the method provided by the invention adopts a style corpus to constrain the style information of a single image, effectively overcoming the difficulty of accurately modeling style information that is entangled with content information. Exploiting the fact that photos within the same style package share a style while photos in different style packages do not, adversarial learning imposes a consistency constraint on the migration result, alleviating problems such as image distortion and unrealistic appearance caused by mutual interference between style and content information. Finally, the invention applies a knowledge distillation strategy so that a neural network directly learns the complex feature transformation operations of photo style migration, thereby improving migration efficiency. The effects and efficiency of the photo style migration method based on style corpus constraint are shown in fig. 3 and table 1: it effectively alleviates the image distortion and unrealistic appearance seen in existing photo style migration, and improves speed by a factor of 13-50 compared with method [4].

Claims (1)

1. A photo style migration method based on style corpus constraint is characterized by comprising the following steps:
step S1: the data set required for training the student network is acquired in the following specific way:
step S11: downloading the COCO dataset as the content dataset, the number of images in the dataset being denoted N;
step S12: downloading the art photos published on the WikiArt website as the style dataset, the number of photos in the dataset being denoted M;
step S2: selecting a teacher network and a student network to obtain generated photos of the teacher network and the student network, in the following specific way:
step S21: selecting the end-to-end style migration network WCT² based on wavelet-transform correction as the teacher network, fixing its network weight parameters, and denoting it T;
step S22: selecting the artistic style migration network AdaIN as the student network, introducing skip connections between the pooling layers of the encoder and the corresponding deconvolution layers of the decoder, randomly initializing the network, and denoting it S;
step S23: normalizing, cropping, and batching the content dataset and the style dataset; taking any image from the content dataset, denoted c_i (i = 1, 2, …, N), and any image from the style dataset, denoted r_j (j = 1, 2, …, M); inputting c_i and r_j into the teacher network T to obtain the generated photo t_{i,j}, and inputting c_i and r_j into the student network S to obtain the generated photo s_{i,j};
step S3: constructing the style corpus in the following specific way:
step S31: using the teacher network T to render the style of the style photo r_j onto every image in the content dataset, and recording the resulting set of generated photos as the style package B_j = {t_{i,j} | i = 1, 2, …, N}, the photos in style package B_j all differing in content but sharing the same style;
step S32: following step S31, obtaining the style package corresponding to every photo in the style dataset, and defining the style corpus as B = {B_j | j = 1, 2, …, M}, photos in different style packages carrying different style information while the underlying content images are the same;
step S4: designing the multi-level adversarial distillation strategy based on the style corpus constraint in the following specific way:
step S41: designing the pixel loss L_pix = ||s_{i,j} − t_{i,j}||_1 so that the student-generated photo s_{i,j} and the teacher-generated photo t_{i,j} are as close as possible in pixel space;
step S42: designing the feature loss L_feat = Σ_k λ_k ||φ_k(s_{i,j}) − φ_k(t_{i,j})|| so that s_{i,j} and t_{i,j} are as close as possible in feature space, where φ_k(·) denotes the feature map of an image at layer k of the loss network VGG-16, k ranges over layers 3, 8, 15 and 22 of the VGG-16 network, and λ_k is the weight coefficient of each layer, taking the values 1, 1, 0.5 and 0.5 respectively;
step S43: designing the conditional adversarial loss L_cd = E[log D_cd(c, r, t)] + E[log(1 − D_cd(c, r, s))], with (c, r, t) drawn from C × R × Ω_T and (c, r, s) drawn from C × R × Ω_S, so that the student-generated photo s_{i,j} and the teacher-generated photo t_{i,j} are as close as possible in overall distribution, where D_cd is a discriminator composed entirely of convolutional layers, C denotes the set of content images, R the set of style images, Ω_S the set of student-network outputs, Ω_T the set of teacher-network outputs, and E[·] denotes the expectation of the function value inside the brackets; the content image c_i, the style image r_j, and the student-generated photo s_{i,j} are concatenated together as a False sample, the content image c_i, the style image r_j, and the teacher-generated photo t_{i,j} are concatenated together as a True sample, and through adversarial training the student-generated photo s_{i,j} is driven as close as possible to the teacher-generated photo t_{i,j};
step S44: designing the style adversarial loss L_sd = E[log D_sd(t_{i+1,j}, t_{i,j})] + E[log(1 − D_sd(s_{i,j}, t_{i,j}))] + E[log(1 − D_sd(t_{i,j}, t_{i,j+1}))] so that the student-generated photo s_{i,j} and the teacher-generated photo t_{i,j} are as close as possible in style, where D_sd is a style discriminator composed entirely of convolutional layers and E[·] denotes the expectation of the function value inside the brackets; (t_{i+1,j}, t_{i,j}) is a pair of teacher-generated photos with different content but the same style, (s_{i,j}, t_{i,j}) is a pair of photos with the same content and style generated by the student and teacher networks respectively, and (t_{i,j}, t_{i,j+1}) is a pair of teacher-generated photos with the same content but different styles;
step S5: training and optimizing the student network to carry out photo style migration in the following specific way:
step S51: combining the loss functions with different weights into the overall optimization objective L_total = α·L_pix + β·L_feat + γ·L_cd + δ·L_sd and training the whole network, the parameters of the teacher network T remaining fixed throughout training;
step S52: locking the style discriminator D_sd and the student network S, updating the parameters of the conditional discriminator D_cd twice, then locking D_cd;
step S53: unlocking the style discriminator D_sd, updating its parameters twice, then locking D_sd;
step S54: unlocking the student network S, updating its parameters once, then locking S;
step S55: repeating steps S52, S53 and S54 until the loss functions of the conditional discriminator D_cd and the style discriminator D_sd converge near 0.5, then stopping training and saving the trained student network S, the conditional discriminator D_cd, and the style discriminator D_sd;
Step S6: and (3) acquiring the stylized photo, namely selecting any one content image and style photo to input the student network S obtained in the step S5, and obtaining the stylized photo.
CN202010239903.0A 2020-03-31 2020-03-31 Photo style migration method based on style corpus constraint Active CN111429342B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010239903.0A CN111429342B (en) 2020-03-31 2020-03-31 Photo style migration method based on style corpus constraint

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010239903.0A CN111429342B (en) 2020-03-31 2020-03-31 Photo style migration method based on style corpus constraint

Publications (2)

Publication Number Publication Date
CN111429342A (en) 2020-07-17
CN111429342B (en) 2024-01-05

Family

ID=71550668

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010239903.0A Active CN111429342B (en) 2020-03-31 2020-03-31 Photo style migration method based on style corpus constraint

Country Status (1)

Country Link
CN (1) CN111429342B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113344771B (en) * 2021-05-20 2023-07-25 武汉大学 Multifunctional image style migration method based on deep learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109523460A (en) * 2018-10-29 2019-03-26 北京达佳互联信息技术有限公司 Image style transfer method, transfer apparatus and computer-readable storage medium
CN110175951A (en) * 2019-05-16 2019-08-27 西安电子科技大学 Video Style Transfer method based on time domain consistency constraint
CN110458750A (en) * 2019-05-31 2019-11-15 北京理工大学 Unsupervised image style transfer method based on dual learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10565757B2 (en) * 2017-06-09 2020-02-18 Adobe Inc. Multimodal style-transfer network for applying style features from multi-resolution style exemplars to input images
US10872399B2 (en) * 2018-02-02 2020-12-22 Nvidia Corporation Photorealistic image stylization using a neural network model

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109523460A (en) * 2018-10-29 2019-03-26 北京达佳互联信息技术有限公司 Image style transfer method, transfer apparatus and computer-readable storage medium
CN110175951A (en) * 2019-05-16 2019-08-27 西安电子科技大学 Video Style Transfer method based on time domain consistency constraint
CN110458750A (en) * 2019-05-31 2019-11-15 北京理工大学 Unsupervised image style transfer method based on dual learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Design and Analysis of an Image Style Transfer Algorithm Based on VGG-19; Zhang Yue; Liu Caiyun; Xiong Jie; Information Technology and Informatization (Issue 01); full text *

Also Published As

Publication number Publication date
CN111429342A (en) 2020-07-17

Similar Documents

Publication Publication Date Title
Hao et al. Low-light image enhancement with semi-decoupled decomposition
US11328523B2 (en) Image composites using a generative neural network
US11200638B2 (en) Image style transform methods and apparatuses, devices and storage media
CN111199531B (en) Interactive data expansion method based on Poisson image fusion and image stylization
CN105374007B (en) Merge the pencil drawing generation method and device of skeleton stroke and textural characteristics
CN109544662B (en) Method and system for coloring cartoon style draft based on SRUnet
CN107977414A (en) Image Style Transfer method and its system based on deep learning
US20220237834A1 (en) View Synthesis Robust to Unconstrained Image Data
CN111986075B (en) Style migration method for target edge clarification
CN111223062A (en) Image deblurring method based on generation countermeasure network
CN111739082A (en) Stereo vision unsupervised depth estimation method based on convolutional neural network
CN110533579B (en) Video style conversion method based on self-coding structure and gradient order preservation
Chen et al. Geoaug: Data augmentation for few-shot nerf with geometry constraints
CN107240085A (en) A kind of image interfusion method and system based on convolutional neural networks model
CN109447897B (en) Real scene image synthesis method and system
Liu et al. Painting completion with generative translation models
CN111429342B (en) Photo style migration method based on style corpus constraint
Zhong et al. Deep attentional guided image filtering
Li et al. Flexicurve: Flexible piecewise curves estimation for photo retouching
CN117495662A (en) Cartoon image style migration method and system based on Stable diffration
CN116934936A (en) Three-dimensional scene style migration method, device, equipment and storage medium
Zhang et al. A fast solution for Chinese calligraphy relief modeling from 2D handwriting image
CN113869503B (en) Data processing method and storage medium based on depth matrix decomposition completion
CN110866866A (en) Image color-matching processing method and device, electronic device and storage medium
CN113487475B (en) Interactive image editing method, system, readable storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant