CN114418872A - A real image aesthetic enhancement method based on mGANprior - Google Patents

A real image aesthetic enhancement method based on mGANprior

Info

Publication number
CN114418872A
Authority
CN
China
Prior art keywords
image
inv
aesthetic
enhanced
real image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111627418.1A
Other languages
Chinese (zh)
Other versions
CN114418872B (en)
Inventor
张桦
苟若芸
张灵均
吴以凡
许艳萍
叶挺聪
包尔权
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University
Priority to CN202111627418.1A
Publication of CN114418872A
Application granted
Publication of CN114418872B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/77 Retouching; Inpainting; Scratch removal
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a real image aesthetic enhancement method based on mGANprior. For a real image whose aesthetics are to be enhanced, a PGGAN pre-trained generative model of the corresponding type is selected and the type of aesthetic effect to be enhanced is determined; semantic segmentation is performed on the real image using the cascade segmentation module method; an inversely mapped image is obtained using the mGANprior method; according to the aesthetic style to be enhanced, the corresponding degradation transformation is applied to the real image and the inversely mapped image, the loss is calculated, and the final latent vectors and the aesthetically enhanced image I_enh are obtained through gradient descent optimization. The method achieves aesthetic enhancement of real images: it retains the original information of the image to the greatest extent while applying a controllable aesthetic style modification to the image. The invention proposes a degradation-transformation loss function based on aesthetic factors, which can generate an aesthetically pleasing blur effect on a real image.

Description

A real image aesthetic enhancement method based on mGANprior

Technical Field

The invention relates to the inverse mapping (inversion) of generative adversarial networks, semantic segmentation of images, and image aesthetic enhancement. Specifically, a real image is inversely mapped into the latent space of a generative adversarial network, and aesthetic degradation transformations are constructed for two aesthetic styles and used in the loss calculation, thereby enhancing the aesthetic quality of the real image.

Background

Research on generative adversarial networks (GANs) has made major breakthroughs in recent years, and researchers have developed many high-quality GAN-derived generative models such as PGGAN, StyleGAN, and BigGAN. PGGAN was the first generative model to propose a progressive training method for producing high-quality generated images; through progressive training it can ultimately produce high-quality 1024×1024 images. PGGAN provides pre-trained generative models for multiple scene types, such as churches, towers, bridges, and bedrooms.

A GAN model maps the latent space to the image space; GAN inverse mapping is the reverse process, establishing a mapping from the image space to the latent space. GAN inverse mapping aims to map a real image back into the latent space of a pre-trained GAN model, after which the generative model can reconstruct the image from the latent code obtained by the inverse mapping. mGANprior is a GAN inverse mapping method that can reconstruct real images with high quality, and it has been shown to perform well on generative models for a variety of indoor and outdoor scenes. In addition, by applying different degradation transformations to the images before computing the loss between the generated image and the real image, mGANprior supports image processing tasks on real images such as colorization, super-resolution reconstruction, and image restoration.

Semantic segmentation is a deep learning technique that associates a label or category with every pixel of an image. It is used to identify the sets of pixels that make up distinguishable categories. For example, a landscape photo of a bridge requires identifying the bridge, river, grass, trees, sky, and so on. The Cascade Segmentation Module method is a general semantic segmentation method proposed in the paper "Scene Parsing through ADE20K Dataset". It parses a scene in a cascaded manner into stuff, objects, and object parts, and can be applied to semantic segmentation in different scenes.

Given this background, if degradation transformations are designed for the 14 classes of aesthetic styles, aesthetic enhancement of real images based on GAN inverse mapping becomes possible. The precondition is that the degradation transformation is known and differentiable. For example, the degradation transformations corresponding to image colorization, super-resolution reconstruction, and image restoration are grayscale conversion, downsampling, and image cropping respectively, all of which are known and differentiable. For aesthetic factors, however, current research offers no differentiable transformation. The present invention therefore designs degradation transformations for aesthetic styles in order to achieve GAN-based aesthetic enhancement of real images.
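The three known degradations named above can be illustrated with a minimal NumPy sketch. This is an illustration only, not code from the patent or from mGANprior: the function names are hypothetical, downsampling is assumed to be block averaging, and "image cropping" is modelled here as multiplication by a binary keep-mask. Each operation is linear in the pixel values, which is why gradients can pass through it.

```python
import numpy as np

def to_grayscale(img):
    """Colorization degradation: replace each pixel by its channel mean (3 channels kept)."""
    gray = img.mean(axis=2, keepdims=True)
    return np.repeat(gray, 3, axis=2)

def downsample(img, factor):
    """Super-resolution degradation: average non-overlapping factor x factor blocks."""
    h, w, c = img.shape
    img = img[: h - h % factor, : w - w % factor]  # drop rows/cols that do not fill a block
    blocks = img.reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3))

def crop_region(img, keep_mask):
    """Restoration degradation (assumed form): zero out pixels where keep_mask == 0."""
    return img * keep_mask[..., None]

# Small usage example on a 4x4 RGB ramp image.
img = np.arange(4 * 4 * 3, dtype=float).reshape(4, 4, 3)
small = downsample(img, 2)
```

In an inversion setting, the same degradation would be applied to both the real image and the generated image before the loss is computed, so that only the information surviving the degradation is matched.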

Summary of the Invention

To address the above problem, the present invention proposes a real image aesthetic enhancement method based on mGANprior. Different degradation transformations are designed for two blur-type aesthetic styles, shallow depth of field and motion blur, to enhance the blur aesthetics of real images. The technical solution of the invention comprises the following steps:

Step 1: Select a real image I whose aesthetics are to be enhanced, select the PGGAN pre-trained generative model of the corresponding type, and determine whether shallow depth of field enhancement or motion blur enhancement is to be applied to the image.

Step 2: Use the Cascade Segmentation Module method to perform semantic segmentation on the real image I and extract the subject pixels. Depending on the aesthetic effect to be enhanced (shallow depth of field or motion blur), obtain the corresponding binary matrix m.

Step 3: Use the mGANprior method to obtain the inversely mapped image I_inv.

Step 4: According to the aesthetic style to be enhanced, apply the corresponding degradation transformation to the real image I and the inversely mapped image I_inv and calculate the loss, then optimize the latent vectors z_i (i = 1, ..., n) by gradient descent. Feed the optimized z_i back in as input, obtain a new inversely mapped image I_inv through the mGANprior method of step 3, and calculate the loss again. Training stops once the loss shows no downward trend for ten consecutive iterations, yielding the final latent vectors z_i and the aesthetically enhanced image I_enh. The loss function combines the mean squared error (MSE) loss with the perceptual feature reconstruction loss.

Further, the specific method of step 3 is as follows:

Using the mGANprior method, the generator network of the generative model selected in step 1 is split in two at a specified layer: that layer and all layers before it form the front network G1, and all layers after it form the rear network G2. The specified layer is chosen according to need; the quality of the inverse mapping is positively correlated with the depth of the layer. A number n of latent vectors z_i (i = 1, ..., n) are randomly generated as the input of the front network G1, which produces n feature maps. The feature maps are combined according to adaptive channel importance and fed into the rear network G2 to obtain the generated image I_inv.
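The feature composition step can be sketched as follows. The sketch assumes the form used in the published mGANprior formulation, where each of the n intermediate feature maps is scaled per channel by an importance vector and the results are summed before entering G2. The function name, array layout, and the toy inputs are assumptions for illustration; the patent text itself only names the latent vectors z_i as optimized quantities, so how the importance vectors are obtained is left open here.

```python
import numpy as np

def compose_features(features, alphas):
    """Combine n feature maps with per-channel importance weights.

    features: array of shape (n, C, H, W), the outputs of G1 for n latent codes.
    alphas:   array of shape (n, C), one importance weight per code and channel.
    Returns an array of shape (C, H, W) to be fed into G2.
    """
    return np.einsum("nchw,nc->chw", features, alphas)

# Toy example: 3 codes, 2 channels, 4x4 spatial resolution.
features = np.ones((3, 2, 4, 4))
alphas = np.array([[1.0, 0.0],
                   [0.5, 0.0],
                   [0.0, 2.0]])
combined = compose_features(features, alphas)
```

Because the weights act per channel rather than per pixel, each latent code can contribute the channels it reconstructs best without changing the spatial size expected by G2.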

Further, the specific method of step 4 is as follows:

(1) Apply an image degradation transformation to the real image I and the inversely mapped image I_inv to obtain X and X_inv.

For shallow depth of field enhancement, apply the following shallow depth of field degradation transformation. According to m, downsample the background pixels of the real image I and the subject pixels of the inversely mapped image I_inv. The degradation transformation is:

X = I*m + down(I*(1-m))    (1)

X_inv = down(I_inv*m) + I_inv*(1-m)    (2)

For motion blur enhancement, apply the following motion blur degradation transformation. According to m, downsample the subject pixels of the real image I. The degradation transformation is:

X = down(I*m) + I*(1-m)    (3)

X_inv = I_inv    (4)
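Equations (1) to (4) can be written out as a short NumPy sketch. The patent does not define down(·) beyond "downsampling"; the version below assumes block averaging followed by nearest-neighbour upsampling back to the original size, so that the result can be added to the full-resolution masked image. The function names and the `factor` parameter are hypothetical.

```python
import numpy as np

def down(img, factor=4):
    """Assumed form of down(.): average factor x factor blocks, then upsample back."""
    h, w, c = img.shape
    if h % factor or w % factor:
        raise ValueError("image size must be divisible by factor")
    pooled = img.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))
    return pooled.repeat(factor, axis=0).repeat(factor, axis=1)

def degrade_shallow_dof(I, I_inv, m, factor=4):
    """Equations (1) and (2); m is an (H, W) binary matrix."""
    m3 = m[..., None]                                   # broadcast over channels
    X = I * m3 + down(I * (1 - m3), factor)             # (1)
    X_inv = down(I_inv * m3, factor) + I_inv * (1 - m3) # (2)
    return X, X_inv

def degrade_motion_blur(I, I_inv, m, factor=4):
    """Equations (3) and (4); m is an (H, W) binary matrix."""
    m3 = m[..., None]
    X = down(I * m3, factor) + I * (1 - m3)             # (3)
    X_inv = I_inv                                       # (4)
    return X, X_inv

# Toy example: constant image, mask covering the top half.
I = np.ones((8, 8, 3))
I_inv = np.ones((8, 8, 3))
m = np.zeros((8, 8)); m[:4] = 1
X_dof, X_inv_dof = degrade_shallow_dof(I, I_inv, m)
X_mb, X_inv_mb = degrade_motion_blur(I, I_inv, m)
```

Note that in the motion blur case only the real image is degraded, so the two transformations are deliberately asymmetric between X and X_inv.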

(2) Compute the MSE and perceptual feature loss between X and X_inv:

Figure BDA0003440313440000031

where

Figure BDA0003440313440000032

is the MSE loss between X and X_inv, φ(·) is the perceptual feature extractor, φ(X) and φ(X_inv) are the perceptual features of X and X_inv respectively, and ||φ(X), φ(X_inv)||_1 is the L1 distance between φ(X) and φ(X_inv).
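The loss equation itself survives only as an image reference, so the following is a sketch of one plausible reading rather than the patent's exact formula: the MSE between X and X_inv plus an L1 distance between their perceptual features. The `weight` parameter, the use of a mean rather than a sum in the L1 term, and the `toy_features` stand-in for φ (simple finite differences instead of a pretrained network) are all assumptions made to keep the example self-contained.

```python
import numpy as np

def toy_features(x):
    """Hypothetical stand-in for the perceptual extractor phi: image gradients."""
    dy = np.diff(x, axis=0).ravel()
    dx = np.diff(x, axis=1).ravel()
    return np.concatenate([dy, dx])

def reconstruction_loss(X, X_inv, phi=toy_features, weight=1.0):
    """MSE term plus (assumed) weighted mean-L1 distance between features."""
    mse = np.mean((X - X_inv) ** 2)
    perceptual = np.mean(np.abs(phi(X) - phi(X_inv)))
    return mse + weight * perceptual

# Two constant images differ in pixel values but have identical (zero) gradients.
X = np.zeros((4, 4, 3))
X_inv = np.ones((4, 4, 3))
loss_same = reconstruction_loss(X, X)
loss_diff = reconstruction_loss(X, X_inv)
```

In practice φ would be a fixed pretrained feature extractor, and the loss would be evaluated in a framework with automatic differentiation so that gradients reach the latent vectors.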

(3) Optimize z_i (i = 1, ..., n) by gradient descent.

(4) Iterative training. Using the mGANprior method of step 3, feed the optimized z_i (i = 1, ..., n) into the front network G1 again to obtain n feature maps; combine the n feature maps according to the adaptive channel importance principle and feed them into G2 to obtain a new inversely mapped image I_inv.

(5) Repeat steps (1) to (4): apply the degradation transformation, calculate the loss, optimize z_i (i = 1, ..., n) by gradient descent, and train iteratively. Training stops once the loss shows no downward trend for ten consecutive iterations, yielding the blur-enhanced image I_enh. The shallow depth of field degradation transformation yields an image with enhanced shallow depth of field aesthetics; the motion blur degradation transformation yields an image with enhanced motion blur aesthetics.
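The stopping rule in (5) can be sketched as a small patience counter. The patent does not define "no downward trend" precisely; this sketch assumes it means that the loss has not fallen below its best value so far for ten consecutive iterations. The class name, the `min_delta` tolerance, and the helper `stop_index` are hypothetical.

```python
class PlateauStopper:
    """Signals a stop after `patience` consecutive iterations without a new best loss."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.stale = 0  # consecutive iterations without improvement

    def update(self, loss):
        """Record one loss value; return True when training should stop."""
        if loss < self.best - self.min_delta:
            self.best = loss
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience

def stop_index(losses, patience=10):
    """Return the index at which the stopper fires, or None if it never does."""
    stopper = PlateauStopper(patience=patience)
    for i, loss in enumerate(losses):
        if stopper.update(loss):
            return i
    return None
```

With `patience=3`, the sequence 1.0, 0.9, 0.9, 0.95, 0.9, 0.8 stops at index 4, because the third, fourth, and fifth values fail to beat the best loss of 0.9.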

The beneficial effects of the present invention are as follows:

1. Using mGANprior, a GAN inverse mapping method, the invention achieves aesthetic enhancement of real images. It preserves the original information of the image to the greatest extent while applying a controllable aesthetic style modification to the image.

2. Blur effects among the aesthetic styles, such as shallow depth of field and motion blur, are the two effects that GANs find hardest to produce, because a GAN model tends to eliminate blur as far as possible during training. Based on aesthetic factors, this patent proposes a degradation-transformation loss function that can generate aesthetically pleasing blur effects on real images.

Description of Drawings

Fig. 1 is a flow chart of the implementation of the method of the present invention.

Detailed Description

Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawing.

As shown in Fig. 1, the present invention proposes a real image aesthetic enhancement method based on mGANprior. A real image is input and a suitable PGGAN generative model is selected. Semantic segmentation is performed on the image to extract the subject to be enhanced, and a binary matrix m is generated from the image and the extracted semantics, in which subject pixels have value 1 and background pixels have value 0. mGANprior is used to obtain the inversely mapped image; the real image and the inversely mapped image are degraded and the loss is calculated; parameters are optimized by gradient descent with iterative training; and the aesthetically enhanced image is finally obtained. The detailed steps are as follows.

Step 1: Select a real image I whose aesthetics are to be enhanced, and judge manually whether the image belongs to one of the classes covered by the PGGAN pre-trained generative models. For example, for an image containing a bridge, select the PGGAN bridge model. If the image does not belong to any class covered by the PGGAN pre-trained generative models, any PGGAN generative model may be chosen. Then determine whether shallow depth of field enhancement or motion blur enhancement is to be applied to the image.

Step 2: Use the cascade segmentation module to perform semantic segmentation on the image and extract the subject pixels, and from this obtain the binary matrix m. For shallow depth of field enhancement, set the entries of m for the region that should remain sharp to 1 and all other entries to 0. For motion blur enhancement, set the entries of m for the region to be blurred to 1 and all other entries to 0.
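Building m from a segmentation result can be sketched as below. The sketch assumes the segmentation module returns an integer label map with one class id per pixel; the function name and the example class ids are hypothetical. The caller chooses which labels are set to 1: the region to keep sharp for shallow depth of field, or the region to blur for motion blur.

```python
import numpy as np

def build_mask(label_map, selected_labels):
    """Return a binary matrix m: 1 where the pixel's class is selected, else 0.

    label_map:       (H, W) integer array of per-pixel class ids (assumed format).
    selected_labels: iterable of class ids to mark with 1.
    """
    return np.isin(label_map, list(selected_labels)).astype(np.float32)

# Toy 2x2 label map with hypothetical class ids: 0 = sky, 1 = bridge, 2 = river.
label_map = np.array([[0, 1],
                      [2, 1]])
m = build_mask(label_map, {1})
```

The resulting matrix has the same height and width as the image and can be broadcast across the colour channels when it is multiplied with I or I_inv.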

Here, depth of field refers to the range in front of and behind the focus point within which objects are rendered sharply. Elements within this range are imaged clearly, while elements outside it become progressively more blurred. Shallow depth of field means that only part of the image is in focus. Motion blur is the visible streaking left by fast-moving objects in a still image or in a sequence of frames such as a film or animation.

Step 3: Obtain the inversely mapped image I_inv through the mGANprior model.

(1) Split the PGGAN generator network at layer 8: layers 1 through 8 form the front network G1, and the subsequent layers form the rear network G2;

(2) Generate 30 latent vectors z_i (i = 1, ..., 30);

(3) Feed z_i (i = 1, ..., 30) into G1 to obtain 30 feature maps;

(4) Combine the 30 feature maps according to the adaptive channel importance principle and feed them into G2 to obtain the inversely mapped image I_inv.

Step 4: According to the aesthetic effect to be enhanced, apply the corresponding degradation transformations to the real image I and the inversely mapped image I_inv to obtain X and X_inv, calculate the loss between X and X_inv as the training loss, and optimize z_i (i = 1, ..., 30) by gradient descent.

(1) Apply an image degradation transformation to the real image I and the inversely mapped image I_inv to obtain X and X_inv.

For shallow depth of field enhancement, apply the following shallow depth of field degradation transformation. According to m, downsample the background pixels of the real image I and the subject pixels of the inversely mapped image I_inv. The degradation transformation is:

X = I*m + down(I*(1-m))    (1)

X_inv = down(I_inv*m) + I_inv*(1-m)    (2)

For motion blur enhancement, apply the following motion blur degradation transformation. According to m, downsample the subject pixels of the real image I. The degradation transformation is:

X = down(I*m) + I*(1-m)    (3)

X_inv = I_inv    (4)

(2) Compute the MSE and perceptual feature loss between X and X_inv:

Figure BDA0003440313440000051

where

Figure BDA0003440313440000052

is the MSE loss between X and X_inv, φ(·) is the perceptual feature extractor, φ(X) and φ(X_inv) are the perceptual features of X and X_inv respectively, and ||φ(X), φ(X_inv)||_1 is the L1 distance between φ(X) and φ(X_inv).

(3) Optimize z_i (i = 1, ..., 30) by gradient descent.

(4) Iterative training. Using the mGANprior method of step 3, feed the optimized z_i (i = 1, ..., 30) into the front network G1 again to obtain 30 feature maps; combine the 30 feature maps according to the adaptive channel importance principle and feed them into G2 to obtain a new inversely mapped image I_inv.

(5) Repeat steps (1) to (4): apply the degradation transformation, calculate the loss, optimize z_i (i = 1, ..., 30) by gradient descent, and train iteratively. Training stops once the loss shows no downward trend for ten consecutive iterations, yielding the blur-enhanced image I_enh. The shallow depth of field degradation transformation yields an image with enhanced shallow depth of field aesthetics; the motion blur degradation transformation yields an image with enhanced motion blur aesthetics.

Claims (3)

1. A real image aesthetic enhancement method based on mGANprior, characterized by comprising the following steps:
step 1: selecting a real image I whose aesthetics are to be enhanced, selecting a PGGAN pre-trained generative model of the corresponding type, and determining whether shallow depth of field aesthetic enhancement or motion blur aesthetic enhancement is to be performed on the image;
step 2: performing semantic segmentation on the real image I using the cascade segmentation module method, and extracting the subject pixels; obtaining the corresponding binary matrix m according to the aesthetic effect to be enhanced;
step 3: obtaining an inversely mapped image I_inv using the mGANprior method;
step 4: according to the aesthetic style to be enhanced, applying the corresponding degradation transformation to the real image I and the inversely mapped image I_inv and calculating the loss; then optimizing the latent vectors z_i (i = 1, ..., n) by gradient descent; feeding the optimized z_i again as input to the mGANprior method of step 3 to obtain a new inversely mapped image I_inv and calculating the loss; stopping training once the loss shows no downward trend for ten consecutive iterations, and obtaining the final latent vectors z_i and the aesthetically enhanced image I_enh; wherein the loss function combines mean squared error (MSE) loss with perceptual feature reconstruction loss.
2. The real image aesthetic enhancement method based on mGANprior according to claim 1, wherein the specific method of step 3 is as follows:
using the mGANprior method, splitting the generator network of the generative model selected in step 1 in two at a specified layer, wherein that layer and all preceding layers form the front network G1 and all layers after it form the rear network G2, the specified layer being chosen according to need, and the quality of the inverse mapping being positively correlated with the layer depth; randomly generating n latent vectors z_i (i = 1, ..., n) as the input of the front network G1 and obtaining n feature maps through G1; combining the obtained feature maps based on adaptive channel importance and feeding them into the rear network G2 to obtain the generated image I_inv.
3. The real image aesthetic enhancement method based on mGANprior according to claim 1, wherein the specific method of step 4 is as follows:
(1) applying an image degradation transformation to the real image I and the inversely mapped image I_inv to obtain X and X_inv;
for shallow depth of field aesthetic enhancement, performing the following shallow depth of field degradation transformation: according to m, downsampling the background pixels of the real image I and the subject pixels of the inversely mapped image I_inv; the degradation transformation is as follows:
X = I*m + down(I*(1-m))    (1)
X_inv = down(I_inv*m) + I_inv*(1-m)    (2)
for motion blur aesthetic enhancement, performing the following motion blur degradation transformation: according to m, downsampling the subject pixels of the real image I; the degradation transformation is as follows:
X = down(I*m) + I*(1-m)    (3)
X_inv = I_inv    (4)
(2) calculating the MSE and perceptual feature loss between X and X_inv:
Figure FDA0003440313430000021
wherein
Figure FDA0003440313430000022
is the MSE loss between X and X_inv, φ(·) is the perceptual feature extractor, φ(X) and φ(X_inv) are the perceptual features of X and X_inv respectively, and ||φ(X), φ(X_inv)||_1 is the L1 distance between φ(X) and φ(X_inv);
(3) optimizing z_i (i = 1, ..., n) by gradient descent;
(4) performing iterative training: using the mGANprior method of step 3, feeding the optimized z_i (i = 1, ..., n) into the front network G1 again to obtain n feature maps; combining the n feature maps according to the adaptive channel importance principle and feeding them into G2 to obtain a new inversely mapped image I_inv;
(5) repeating steps (1) to (4) to apply the degradation transformation and calculate the loss, optimizing z_i (i = 1, ..., n) by gradient descent, and training iteratively; stopping training once the loss shows no downward trend for ten consecutive iterations, and obtaining the blur-enhanced image I_enh; wherein the shallow depth of field degradation transformation yields an image with enhanced shallow depth of field aesthetics, and the motion blur degradation transformation yields an image with enhanced motion blur aesthetics.
CN202111627418.1A 2021-12-28 2021-12-28 A real image aesthetic enhancement method based on mGANprior Active CN114418872B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111627418.1A CN114418872B (en) 2021-12-28 2021-12-28 A real image aesthetic enhancement method based on mGANprior

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111627418.1A CN114418872B (en) 2021-12-28 2021-12-28 A real image aesthetic enhancement method based on mGANprior

Publications (2)

Publication Number Publication Date
CN114418872A (en) 2022-04-29
CN114418872B (en) 2025-04-11

Family

ID=81270002

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111627418.1A Active CN114418872B (en) 2021-12-28 2021-12-28 A real image aesthetic enhancement method based on mGANprior

Country Status (1)

Country Link
CN (1) CN114418872B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107705242A (en) * 2017-07-20 2018-02-16 广东工业大学 A kind of image stylization moving method of combination deep learning and depth perception
CN111310582A (en) * 2020-01-19 2020-06-19 北京航空航天大学 Turbulence degradation image semantic segmentation method based on boundary perception and counterstudy
CN112581360A (en) * 2020-12-30 2021-03-30 杭州电子科技大学 Multi-style image aesthetic quality enhancement method based on structural constraint

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117649338A (en) * 2024-01-29 2024-03-05 中山大学 Method for generating countermeasures against network inverse mapping for face image editing
CN117649338B (en) * 2024-01-29 2024-05-24 中山大学 A Generative Adversarial Network Inverse Mapping Method for Face Image Editing

Also Published As

Publication number Publication date
CN114418872B (en) 2025-04-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant