CN111325661A - MSGAN: seasonal image style conversion model and method - Google Patents

MSGAN: seasonal image style conversion model and method

Info

Publication number
CN111325661A
Authority
CN
China
Prior art keywords
image
generator
seasonal
discriminator
true
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010106255.1A
Other languages
Chinese (zh)
Other versions
CN111325661B (en)
Inventor
张福泉
王传胜
林强
王冰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhonggong Huike Beijing Intelligent Technology Co ltd
Original Assignee
Jinggong Digital Performance Fuzhou Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinggong Digital Performance Fuzhou Technology Co ltd filed Critical Jinggong Digital Performance Fuzhou Technology Co ltd
Priority to CN202010106255.1A priority Critical patent/CN111325661B/en
Publication of CN111325661A publication Critical patent/CN111325661A/en
Application granted granted Critical
Publication of CN111325661B publication Critical patent/CN111325661B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T3/04
    • G06F18/2415 Classification techniques relating to the classification model based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N3/045 Combinations of networks
    • G06N3/047 Probabilistic or stochastic networks
    • G06N3/08 Learning methods

Abstract

The invention relates to a seasonal style conversion model for images, named MSGAN, and a corresponding method. The model comprises a generator G, a generator F, a first true-false discriminator D_G, a second true-false discriminator D_F and a season discriminator D_S. The goal of the generators is to convert the input image into other specified seasonal styles. The role of the true-false discriminators is to distinguish whether an image is a composite image. The season discriminator classifies each synthesized and real image by season. Each kind of discriminator provides guidance to the generators. To give the network the correct optimization direction, MSGAN uses the style loss, the structural similarity loss and the color loss to improve the generative capability of the generators. Moreover, MSGAN is the first to use the saliency information of the image to guide the image style conversion task, so that the result of the style conversion better matches what human eyes perceive as real.

Description

MSGAN: seasonal image style conversion model and method
Technical Field
The invention relates to the technical field of image processing, in particular to a seasonal image style conversion model, named MSGAN, and a corresponding conversion method.
Background
In recent years, cartoons, animations and 3D movies have become popular art forms; their advantage is that characters and scenes can be designed with computer software and used to act out scripts. A cartoon production needs designers not only to design the characters but also to draw the scenes that appear in the work. Because of the complexity and diversity of storylines, the seasonal style of the same scene often has to change along with the plot. Many conventional movies and television shows, not only cartoon works, also require such effects. However, implementing the season change task manually with computer software is time consuming for designers and demands substantial software skills. It is therefore an important task to design a dedicated algorithm that automatically converts the seasonal appearance of movie or cartoon scenes, which would reduce the production cost and time of many film works and let workers concentrate on other related tasks. Furthermore, such an algorithm could be embedded into software such as Photoshop to improve its capabilities.
Style migration, the technique of transferring a picture into another style, has become a hot research topic in recent years. Conventional approaches design a specific image filtering algorithm to migrate the original image into a single fixed style. However, many complex style conversion tasks, such as seasonal conversion, require converting the original image into several different styles, which a fixed image filtering algorithm cannot accomplish. Recently, image style migration methods based on deep learning have become mainstream; they can transfer the style of an input image into a given style. In particular, generative adversarial networks achieve a good image style migration effect.
Although generative adversarial networks have achieved significant success, current methods cannot adequately accomplish seasonal style migration tasks, for three main reasons. First, unlike other style transfer tasks, seasonal changes sometimes require adding or removing certain elements of the original image, such as adding snow and removing leaves for a winter scene, which is a difficult point in style migration. Second, the main feature of seasonal variation is color transformation, whereas most image style migration methods focus on adding texture to the original image and neglect color change. Third, during seasonal style conversion different contents are affected by seasonal changes to different degrees; for example, leaves appear green in spring and yellow in autumn while the color of a trunk changes little, but traditional image style migration algorithms have difficulty distinguishing such image content.
Disclosure of Invention
In view of the above, the present invention is directed to a seasonal image style conversion model named MSGAN and a corresponding method, which can be trained on unpaired images of different seasons and produce a better seasonal style conversion effect.
The invention is realized by adopting the following scheme: a seasonal style conversion model for images, named MSGAN, comprises a generator G, a generator F, a first true-false discriminator D_G, a second true-false discriminator D_F and a season discriminator D_S.
The input of the generator G comprises an input image and a condition vector carrying the input seasonal style information, and the generator G converts the input image into a seasonal style image determined by the condition vector. The first true-false discriminator D_G discriminates whether the image converted by the generator G is a composite image and feeds the result back to the generator G to provide guidance for it. The season discriminator D_S performs seasonal classification on each synthesized or real image and feeds the result back to the generator G to provide guidance for it.
the generator F converts the image generated by the generator G into a composite image similar to the original input image, and the second true-false discriminator DFIt is discriminated whether or not the converted image of the generator F is a composite image.
Further, the loss function of the generator G during training is:
L(G, D_G, D_S) = L_cGAN(G, D_G, X, Y) + α·L_cyc(G) + β·L_color + γ·L_ssim + δ·L_style(G, D_S)
where X and Y represent the pictures input to the model during training, L_cGAN(G, D_G, X, Y) denotes the adversarial loss function of the generator G and the first discriminator D_G, L_cyc(G) denotes the cycle consistency loss function of the generator G, L_color denotes the hue loss function, L_ssim denotes the similarity loss function, L_style(G, D_S) denotes the seasonal style loss function, and α, β, γ and δ are the proportional weights of the network loss values. Each loss function is as follows:
L_cGAN(G, D_G, X, Y) = E_{(x,y)~p_data(x,y)}[log D_G(x, y)] + E_{x~p_data(x)}[log(1 − D_G(x, G(x|c)))]
where G(x|c) represents the image generated by the generator G from the input image x and the condition c, E_{(x,y)~p_data(x,y)}[·] represents the mathematical expectation when (x, y) obeys the real data distribution, E_{x~p_data(x)}[·] represents the mathematical expectation when x obeys the real data distribution, D_G(x, y) is the result output when the first true-false discriminator discriminates a real image, and D_G(x, G(x|c)) is the result output when the first true-false discriminator D_G discriminates an image synthesized by the generator G;
L_cyc(G) = E_{x~p_data(x)}[‖F(G(x|c)) − x‖_1]
where F(G(x|c)) represents the result of the generator F converting the output of the generator G under the condition c;
L_color = E_{x~p_data(x)}[‖G(x|c)_w − y_w‖_1]
where G(x|c)_w denotes the hue of the output of the generator G within a window w when the condition c is received, and y_w denotes the hue of the real image within the same window;
L_ssim = (1/N) Σ_p (1 − SSIM(p))
where N is the number of pixels p in the window and SSIM(·) is the structural similarity index;
L_style(G, D_S) = E_{y~p_data(y)}[log D_S(y)] + E_{x~p_data(x)}[log(1 − D_S(G(x|c)))]
where E_{y~p_data(y)}[·] represents the mathematical expectation when y obeys the real data distribution, D_S(G(x|c)) represents the judgment of the season discriminator D_S on the output of the generator G under the condition c, and D_S(y) represents the discrimination result of the season discriminator D_S on the real image.
Further, the loss function of the generator F during training is:
L(F, D_F) = L_cGAN(F, D_F, X, Y) + α·L_cyc(F) + β·L_color + γ·L_ssim
where X and Y represent the pictures input to the model during training, L_cGAN(F, D_F, X, Y) denotes the adversarial loss function of the generator F and the second discriminator D_F, L_cyc(F) denotes the cycle consistency loss function of the generator F, L_color denotes the hue loss function, L_ssim denotes the similarity loss function, and α, β and γ are the proportional weights of the network loss values. Each loss function is as follows:
L_cGAN(F, D_F, X, Y) = E_{(x,y)~p_data(x,y)}[log D_F(x, y)] + E_{x~p_data(x)}[log(1 − D_F(x, G(x|c)))]
where G(x|c) represents the image generated by the generator G from the input image x and the condition c, E_{(x,y)~p_data(x,y)}[·] represents the mathematical expectation over real data pairs (x, y), E_{x~p_data(x)}[·] represents the mathematical expectation when x obeys the real data distribution, D_F(x, y) denotes the authentication of a real data pair by the second true-false discriminator D_F, and D_F(x, G(x|c)) denotes its authentication of synthesized data;
L_cyc(F) = E_{y~p_data(y)}[‖G(F(y)) − y‖_1]
where G(F(y)) represents the output of the generator G when its input is the output of the generator F;
L_color = E_{x~p_data(x)}[‖G(x|c)_w − y_w‖_1]
where G(x|c)_w represents the hue of the data synthesized by the generator G when receiving the condition c, and y_w represents the hue of the real image;
L_ssim = (1/N) Σ_p (1 − SSIM(p))
where N is the number of pixels p in the window and SSIM(·) is the structural similarity index.
Further, in order to improve the visual effect of the network output image, the saliency information of the image is used as a reference to guide the optimization of the network; specifically, the image's saliency information is used to set the proportional weights of the network loss values.
Setting the proportional weights of the network loss values with the saliency information specifically comprises the following steps:
step S1: carrying out multi-scale superpixel segmentation and saliency segmentation on an input original image;
step S2: judging whether each region segmented by the multi-scale superpixels lies in the saliency region; if so, set the weights of the network loss values to a group of preset values, otherwise set each weight to half of the corresponding weight used for regions inside the saliency region.
Preferably, in the present invention, if the current region is in the salient region, the values of α, β, γ and δ are 10, 4, 2 and 1 respectively; if the current region is in a non-salient region, the values of α, β, γ and δ are 5, 2, 1 and 0.5 respectively.
Further, the generator G and the generator F have the structure of a symmetric convolutional neural network with 7 layers of residual-block connections, consisting in order from input to output of: an auto-encoder, residual blocks, and an auto-decoder.
Further, the season discriminator D_S is structured as a classical AlexNet network plus a softmax classifier.
Further, before a picture is input to the generator G, the input image is converted into a gray-scale map so that it carries no seasonal features.
The invention also provides an image seasonal style conversion method based on the MSGAN model, which specifically comprises the following steps:
step S1: constructing the MSGAN model and training the MSGAN model;
step S2: after the training is finished, use the trained generator G as the conversion model;
step S3: preprocessing the picture to be converted, and inputting the preprocessed picture together with the set seasonal condition into the conversion model to obtain the converted picture for the corresponding season.
Further, in step S3 the preprocessing specifically comprises: performing gray-level processing on the picture to be converted so that it has no seasonal characteristics.
Compared with the prior art, the invention has the following beneficial effects:
1. The invention provides a new GAN-based model, MSGAN, which can seasonally convert input images. The MSGAN model can be trained on unpaired images of different seasons, which makes it very convenient to use.
2. In order to improve the effect of generating images, the invention proposes a new loss suitable for seasonal style conversion: and tone loss, which can guide the optimization direction of the network according to the visual characteristics of the color, so that the tone of the output result is more similar to that of a given reference image.
3. The method uses the saliency information of the image to guide the season style conversion task so as to ensure that different image contents can have different optimization weights in the SMGAN, thereby improving the effect of the season style conversion and shortening the training time, and leading the result output by the network to be more in line with the human visual effect experimental result.
Drawings
Fig. 1 is a schematic structural diagram of an MSGAN model according to an embodiment of the present invention.
Fig. 2 is a schematic structural diagram of a generator G and a generator F according to an embodiment of the present invention.
Fig. 3 is a pseudo code of a weight setting algorithm according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of the overall data flow according to the embodiment of the present invention, wherein (a) is the data flow for converting an input original image into an image of another seasonal style, and (b) is the data flow for converting the synthesized image back into an image of the original style.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
As shown in FIG. 1, the present embodiment provides a seasonal style conversion model for images, named MSGAN, comprising a generator G, a generator F, a first true-false discriminator D_G, a second true-false discriminator D_F and a season discriminator D_S.
The input of the generator G comprises an input image and a condition vector carrying the input seasonal style information, and the generator G converts the input image into a seasonal style image determined by the condition vector. The first true-false discriminator D_G distinguishes whether the image converted by the generator G is a synthetic image and feeds the result back to the generator G to provide guidance for it. The season discriminator D_S performs seasonal classification on each synthesized or real image and feeds the result back to the generator G to provide guidance for it.
the generator F converts the image generated by the generator G into a composite image similar to the original input image, and the second true-false discriminator DFIt is discriminated whether or not the converted image of the generator F is a composite image.
In this embodiment, the loss function of the generator G during training is:
L(G, D_G, D_S) = L_cGAN(G, D_G, X, Y) + α·L_cyc(G) + β·L_color + γ·L_ssim + δ·L_style(G, D_S)
where X and Y represent the pictures input to the model during training, L_cGAN(G, D_G, X, Y) denotes the adversarial loss function of the generator G and the first discriminator D_G, L_cyc(G) denotes the cycle consistency loss function of the generator G, L_color denotes the hue loss function, L_ssim denotes the similarity loss function, L_style(G, D_S) denotes the seasonal style loss function, and α, β, γ and δ are the proportional weights of the network loss values. Each loss function is as follows:
L_cGAN(G, D_G, X, Y) = E_{(x,y)~p_data(x,y)}[log D_G(x, y)] + E_{x~p_data(x)}[log(1 − D_G(x, G(x|c)))]
where G(x|c) represents the image generated by the generator G from the input image x and the condition c, E_{(x,y)~p_data(x,y)}[·] represents the mathematical expectation when (x, y) obeys the real data distribution, E_{x~p_data(x)}[·] represents the mathematical expectation when x obeys the real data distribution, D_G(x, y) is the result output when the first true-false discriminator discriminates a real image, and D_G(x, G(x|c)) is the result output when the first true-false discriminator D_G discriminates an image synthesized by the generator G.
L_cyc(G) = E_{x~p_data(x)}[‖F(G(x|c)) − x‖_1]
where F(G(x|c)) represents the result of the generator F converting the output of the generator G under the condition c;
L_color = E_{x~p_data(x)}[‖G(x|c)_w − y_w‖_1]
where G(x|c)_w represents the hue value of the result output by the generator G under the condition c, and y_w represents the hue value of the reference image;
L_ssim = (1/N) Σ_p (1 − SSIM(p))
where N is the number of pixels p in the window and SSIM(·) is the structural similarity index;
L_style(G, D_S) = E_{y~p_data(y)}[log D_S(y)] + E_{x~p_data(x)}[log(1 − D_S(G(x|c)))]
where E_{y~p_data(y)}[·] represents the mathematical expectation when y obeys the real data distribution, D_S(G(x|c)) represents the judgment of the season discriminator D_S on the output of the generator G under the condition c, and D_S(y) represents the discrimination result of the season discriminator D_S on the real image.
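To make the assembly of these five terms concrete, a minimal PyTorch-style sketch of the combined generator-G objective follows. It is a sketch under assumptions, not the patent's implementation: the callable names (G, F_gen, d_g, d_s), the caller-supplied helpers hue_fn and ssim_fn standing in for L_color and L_ssim, a sigmoid-output true-false discriminator, and the treatment of the condition c as a class-index tensor are all hypothetical.

```python
import torch
import torch.nn.functional as nnf

def generator_g_loss(G, F_gen, d_g, d_s, x, y, c, weights, hue_fn, ssim_fn):
    """Sketch of L(G, D_G, D_S) = L_cGAN + a*L_cyc + b*L_color + g*L_ssim + d*L_style.
    weights = (alpha, beta, gamma, delta), the saliency-dependent proportional weights."""
    alpha, beta, gamma, delta = weights
    fake = G(x, c)  # G(x|c): image synthesized under season condition c

    # Adversarial term: G tries to make D_G score the synthesized pair as real;
    # BCE against a target of 1 corresponds to the -log D_G(x, G(x|c)) term
    # (d_g is assumed to output probabilities in [0, 1]).
    pred_fake = d_g(x, fake)
    adv = nnf.binary_cross_entropy(pred_fake, torch.ones_like(pred_fake))

    # Cycle consistency L_cyc: F should map G's output back to the input x.
    cyc = torch.mean(torch.abs(F_gen(fake) - x))

    # Hue loss L_color: windowed hue difference against the reference image y.
    color = torch.mean(torch.abs(hue_fn(fake) - hue_fn(y)))

    # Structural similarity loss L_ssim: mean of (1 - SSIM) over pixels.
    ssim_term = torch.mean(1.0 - ssim_fn(fake, x))

    # Seasonal style loss: the season discriminator should classify fake as
    # season c (d_s is assumed to return raw class scores here).
    style = nnf.cross_entropy(d_s(fake), c)

    return adv + alpha * cyc + beta * color + gamma * ssim_term + delta * style
```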
In this embodiment, the loss function of the generator F during training is:
L(F, D_F) = L_cGAN(F, D_F, X, Y) + α·L_cyc(F) + β·L_color + γ·L_ssim
where X and Y represent the pictures input to the model during training, L_cGAN(F, D_F, X, Y) denotes the adversarial loss function of the generator F and the second discriminator D_F, L_cyc(F) denotes the cycle consistency loss function of the generator F, L_color denotes the hue loss function, L_ssim denotes the similarity loss function, and α, β and γ are the proportional weights of the network loss values. Each loss function is as follows:
L_cGAN(F, D_F, X, Y) = E_{(x,y)~p_data(x,y)}[log D_F(x, y)] + E_{x~p_data(x)}[log(1 − D_F(x, G(x|c)))]
where G(x|c) represents the image generated by the generator G from the input image x and the condition c, E_{(x,y)~p_data(x,y)}[·] represents the mathematical expectation over real data pairs (x, y), E_{x~p_data(x)}[·] represents the mathematical expectation when x obeys the real data distribution, D_F(x, y) denotes the authentication of a real data pair by the second true-false discriminator D_F, and D_F(x, G(x|c)) denotes its authentication of synthesized data;
L_cyc(F) = E_{y~p_data(y)}[‖G(F(y)) − y‖_1]
where G(F(y)) represents the output of the generator G when its input is the output of the generator F;
L_color = E_{x~p_data(x)}[‖G(x|c)_w − y_w‖_1]
where G(x|c)_w represents the hue of the data synthesized by the generator G when receiving the condition c, and y_w represents the hue of the real image;
L_ssim = (1/N) Σ_p (1 − SSIM(p))
where N is the number of pixels p in the window and SSIM(·) is the structural similarity index.
Preferably, the similarity measure plays an important role in object matching. When converting the input image to another seasonal style, we strive to preserve the similarity of the target's feature structure. To ensure consistency of content between the input and output images, this embodiment uses a structural similarity loss. Each pixel p of the input image X and of the composite image G(X|c) is filtered using a 13 × 13 window. The SSIM index is described as:
SSIM(x, y) = ((2·μ_x·μ_y + c_1)(2·σ_xy + c_2)) / ((μ_x² + μ_y² + c_1)(σ_x² + σ_y² + c_2))
Here, μ_x is the mean of x, μ_y is the mean of y, σ_x is the standard deviation of x, σ_y is the standard deviation of y, σ_xy is the covariance of x and y, c_1 = 0.01² and c_2 = 0.03². The present embodiment calculates the loss between the input image X and the synthesized image G(X|c) as:
L_ssim = (1/N) Σ_p (1 − SSIM(p))
where N is the number of pixels p in windows x and y.
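As a concrete illustration, a minimal NumPy sketch of this windowed SSIM loss follows; using a uniform (mean) filter for the 13 × 13 local statistics and assuming images scaled to [0, 1] are choices made here, not details stated in the patent.

```python
import numpy as np
from scipy.ndimage import uniform_filter

C1, C2 = 0.01 ** 2, 0.03 ** 2  # c_1, c_2 for images scaled to [0, 1]

def ssim_map(x, y, win=13):
    """Per-pixel SSIM between two grayscale images using a win x win window."""
    mu_x, mu_y = uniform_filter(x, win), uniform_filter(y, win)
    var_x = uniform_filter(x * x, win) - mu_x ** 2      # sigma_x^2
    var_y = uniform_filter(y * y, win) - mu_y ** 2      # sigma_y^2
    cov_xy = uniform_filter(x * y, win) - mu_x * mu_y   # sigma_xy
    num = (2 * mu_x * mu_y + C1) * (2 * cov_xy + C2)
    den = (mu_x ** 2 + mu_y ** 2 + C1) * (var_x + var_y + C2)
    return num / den

def ssim_loss(x, y, win=13):
    """L_ssim = (1/N) * sum over pixels p of (1 - SSIM(p))."""
    return float(np.mean(1.0 - ssim_map(x, y, win)))
```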
Preferably, since color features of different seasons need to be learned, this embodiment uses a sliding-window average filter in the color loss so that the color of the generated image is closer to the real situation. Similar to the SSIM loss, this embodiment selects a 13 × 13 sliding window to compute the hue difference between the synthesized image G(x|c) and the real image y. Hue is a color attribute related to wavelength; it reflects the human perception of different colors. The formula for calculating the hue of an image from an RGB image is as follows:
H = θ, if B ≤ G;  H = 360° − θ, if B > G
wherein the calculation formula of theta is as follows:
θ = arccos( ((R − G) + (R − B)) / (2 · sqrt((R − G)² + (R − B)(G − B))) )
thus, the hue loss function is described as:
L_color = E_{x~p_data(x)}[‖G(x|c)_w − y_w‖_1]
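A small NumPy sketch of this RGB-to-hue conversion and the windowed hue comparison is given below; the epsilon guard and the clipping of the arccos argument are numerical safeguards added here, not part of the patent, and reading L_color as a mean absolute difference of window-averaged hues is an assumption.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def rgb_to_hue(img):
    """Hue H in degrees from an RGB image in [0, 1]: H = theta if B <= G, else 360 - theta."""
    R, G, B = img[..., 0], img[..., 1], img[..., 2]
    num = 0.5 * ((R - G) + (R - B))
    den = np.sqrt((R - G) ** 2 + (R - B) * (G - B)) + 1e-8  # avoid division by zero
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    return np.where(B <= G, theta, 360.0 - theta)

def hue_loss(fake_rgb, real_rgb, win=13):
    """One reading of L_color: mean |hue difference| of 13 x 13 window-averaged hues."""
    h_fake = uniform_filter(rgb_to_hue(fake_rgb), win)
    h_real = uniform_filter(rgb_to_hue(real_rgb), win)
    return float(np.mean(np.abs(h_fake - h_real)))
```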
the overall data flow of the architecture proposed by the present embodiment is shown in fig. 4. Wherein (a): a data flow for converting the input original image into an image of other seasonal style; (b) the method comprises the following steps The synthesized image is converted into an original style image.
In this embodiment, unlike the conventional cGAN, the saliency information of the image is used as a reference to guide the optimization of the network in order to improve the visual effect of the network output image. Specifically, the image's saliency information is used to set the proportional weights of the network loss values, which gives the optimization a more definite direction and lets the network focus during the optimization process. The biggest characteristic of the seasonal style conversion task is that it is impossible to convert all regions of an image with identical operators, because seasonal changes affect some scene content only slightly while affecting other regions strongly. Therefore, this embodiment uses the saliency information of the image to guide the optimization direction of the network, making the output more realistic and reliable.
Setting the proportional weights of the network loss values with the saliency information specifically comprises the following steps:
step S1: carrying out multi-scale superpixel segmentation and saliency segmentation on an input original image;
step S2: judging whether each region segmented by the multi-scale superpixels lies in the saliency region; if so, set the weights of the network loss values to a group of preset values, otherwise set each weight to half of the corresponding weight used for regions inside the saliency region.
In this embodiment, a classical superpixel segmentation algorithm, SLIC, is used to perform multi-scale superpixel segmentation on the original image. Given an H × W image pre-segmented into K superpixels of the same size, the size of each superpixel is (H × W)/K and the distance between adjacent seeds can be approximated as S = sqrt((H × W)/K). The seed point is then re-selected within its n × n neighborhood (generally n = 3), as follows: compute the gradient values of all pixels in the neighborhood and move the seed point to the position with the smallest gradient in the neighborhood; the purpose is to prevent the cluster center from landing on an image edge. Then, by computing a distance metric, a class label (i.e., a cluster center) is assigned to each pixel in the neighborhood around each seed point. The distance metric consists of a color distance d_c and a spatial distance d_s:
d_c = sqrt((l_j − l_i)² + (a_j − a_i)² + (b_j − b_i)²)
where l_i, a_i and b_i represent the color values of pixel i in the LAB color space.
d_s = sqrt((x_j − x_i)² + (y_j − y_i)²)
D = sqrt((d_c/N_c)² + (d_s/N_s)²)
In the formula, N_s represents the maximum spatial distance within a class, defined as N_s = S = sqrt((H × W)/K), and applied to each cluster. The maximum color distance N_c varies from image to image and from cluster to cluster, so it is replaced by a fixed constant m (in the range [1, 40], generally 10). The resulting distance measure D′ is as follows:
D′ = sqrt((d_c/m)² + (d_s/S)²)
because each pixel point can be searched by a plurality of seed points, each pixel point has a distance with the surrounding seed points, and the seed point corresponding to the minimum value is taken as the clustering center of the pixel point.
The training data and the test images used in this embodiment are all of size 600 × 400. The input image is segmented by the SLIC algorithm into 300 small regions. At the same time, the image is saliency-segmented using the algorithm proposed by Guanghai Liu et al. For each of the 300 small regions produced by the SLIC algorithm, the weights are set to 10, 4, 2 and 1 respectively if the region is in the salient region, and to 5, 2, 1 and 0.5 respectively if it is in a non-salient region. The pseudo code of the weight setting algorithm is shown in fig. 3.
Preferably, in this embodiment, if the current region is in the salient region, the values of α, β, γ and δ are 10, 4, 2 and 1 respectively; if the current region is in a non-salient region, the values of α, β, γ and δ are 5, 2, 1 and 0.5 respectively.
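A Python rendering of this weight-setting rule (the pseudo code of fig. 3) might look as follows; the patent does not spell out how a superpixel is judged to lie in the salient region, so the majority-vote threshold used here is an assumption.

```python
import numpy as np

SALIENT = (10.0, 4.0, 2.0, 1.0)     # (alpha, beta, gamma, delta) in salient regions
NON_SALIENT = (5.0, 2.0, 1.0, 0.5)  # halved weights outside the salient region

def region_weights(superpixels, saliency_mask, threshold=0.5):
    """Map each SLIC superpixel label to its loss weights, based on whether the
    majority of its pixels fall inside the binary saliency mask."""
    weights = {}
    for label in np.unique(superpixels):
        inside = saliency_mask[superpixels == label].mean() >= threshold
        weights[label] = SALIENT if inside else NON_SALIENT
    return weights
```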
As shown in fig. 2, in this embodiment the generator G and the generator F have the structure of a symmetric convolutional neural network with 7 layers of residual blocks, consisting in order from input to output of: an auto-encoder, residual blocks, and an auto-decoder.
The generator is a symmetric CNN with a 9-ResNet connection. A residual block preserves features such as the size and shape of the previous network layer and applies them directly to the next layer. This structure effectively reduces the computation of the network and prevents the vanishing-gradient problem during training. The structure of the decoder is symmetric to that of the encoder, and it recovers from the feature map an image consistent with the size of the input image. Assume that this embodiment uses an n × 1 vector as the condition vector c to carry the input seasonal style information. The condition vector c is expanded into a condition map m, which is attached to the input as a bias term. To avoid the influence of vector size imbalance, the condition map m has the same size as the input image. The input to MSGAN can be expressed as:
x′=x+m。
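One possible reading of this conditioning step is sketched below; encoding the condition vector as one scalar code per image and broadcasting it over all channels is an assumption made here for illustration.

```python
import torch

def add_condition(x, c):
    """Build a condition map m of the same spatial size as the input x from the
    condition code c and attach it as a bias term: x' = x + m."""
    b, ch, h, w = x.shape                        # batch of images
    m = c.view(b, 1, 1, 1).expand(b, ch, h, w)   # tile the code over the image grid
    return x + m

# Example (hypothetical season codes): two 3-channel 400 x 600 images.
x = torch.randn(2, 3, 400, 600)
c = torch.tensor([0.25, 0.50])
x_prime = add_condition(x, c)
```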
in this embodiment, the discriminator includes two types, one is a conventional binary discriminator for determining whether the image is a composite image, such as a first true-false discriminator and a second true-false discriminator in this embodiment. The other adopts a classic AlexNet classifier to judge the seasonal style of the image, namely a seasonal discriminator (also called style discriminator) in the embodiment. And a softmax activation function is adopted at the last layer of the season discriminator, so that the probability of the corresponding category can be output, and guidance is provided for the generator G, and the generator G can generate a more real simulation image. The j-class prediction for a given sample x and weight w is as follows:
P(y = j | x) = exp(x^T w_j) / Σ_{n=1}^{K} exp(x^T w_n)
the input image is converted into images of four different seasons, so K is 4. p (n) is the probability distribution of the nth class ground truth; q (n) is the probability distribution of the nth class prediction output P (y ═ n | x); h (p, q) is the cross entropy between p and q and can be expressed as follows:
H(p, q) = −Σ_{n=1}^{K} p(n) log q(n)
therefore, the present embodiment will style the penalty function LstyleThe definition is as follows:
Figure BDA0002388530400000143
using the condition label c in the AlexNet structured softmax classifier, n classes of seasonal style images G (x | c) are generated, and although a condition vector c is given in the generator, a seasonal discriminator DSIt is helpful for the true-false discriminator to distinguish the simulated image from the true image.
In the present embodiment, before a picture is input to the generator G, the input image is converted into a gray-scale map so that it carries no seasonal features. Since the input image has its own seasonal style, it is difficult for the discriminator to judge the image accurately and objectively when the input image is converted into an image of its own seasonal style. Therefore, the input image needs to be initialized to some degree so that it no longer has seasonal characteristics. The simplest method is to convert the input RGB image into a gray-scale map, with the following conversion formula:
Gray=R*0.299+G*0.587+B*0.114;
where R, G and B denote the red, green and blue channels of the image, respectively.
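For example, a one-line NumPy version of this initialization step might look like this (channel-last RGB arrays are assumed):

```python
import numpy as np

def to_gray(img):
    """Remove seasonal color cues: Gray = 0.299*R + 0.587*G + 0.114*B."""
    return img[..., 0] * 0.299 + img[..., 1] * 0.587 + img[..., 2] * 0.114
```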
In summary, in this embodiment the goal of the generators is to convert the input image into other specified seasonal styles. The role of the true-false discriminators is to distinguish whether an image is a composite image. The season discriminator classifies each synthesized and real image by season. Each kind of discriminator provides guidance to the generator. To give the network the correct optimization direction, MSGAN uses the style loss, the structural similarity loss and the color loss to improve the generative capability of the generators. Moreover, MSGAN is the first to use the saliency information of the image to guide the image style conversion task, so that the result of the style conversion better matches what human eyes perceive as real.
The embodiment also provides an image seasonal style conversion method based on the MSGAN model, which specifically comprises the following steps:
step S1: constructing the MSGAN model and training the MSGAN model;
step S2: after the training is finished, use the trained generator G as the conversion model;
step S3: preprocessing the picture to be converted, and inputting the preprocessed picture together with the set seasonal condition into the conversion model to obtain the converted picture for the corresponding season.
In this embodiment, in step S3 the preprocessing specifically comprises: performing gray-level processing on the picture to be converted so that it has no seasonal characteristics.
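Putting steps S1 to S3 together, inference with a trained model could look like the following sketch; generator_g stands for the trained generator G, and to_gray / add_condition refer to the illustrative helpers sketched earlier in this description (the scalar season encoding is hypothetical).

```python
import numpy as np
import torch

def convert_season(generator_g, image_rgb, season_code):
    """Step S3: preprocess the picture (gray-level processing) and feed it,
    together with the set seasonal condition, into the conversion model."""
    gray = to_gray(image_rgb)                                     # strip seasonal features
    x = torch.from_numpy(gray).float().unsqueeze(0).unsqueeze(0)  # 1 x 1 x H x W tensor
    c = torch.tensor([season_code], dtype=torch.float32)          # set seasonal condition
    with torch.no_grad():
        out = generator_g(add_condition(x, c))                    # converted picture
    return out.squeeze(0)
```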
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is directed to preferred embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow. However, any simple modification, equivalent change and modification of the above embodiments according to the technical essence of the present invention are within the protection scope of the technical solution of the present invention.

Claims (10)

1. A seasonal style conversion model for images, named MSGAN, comprising a generator G, a generator F, a first true-false discriminator D_G, a second true-false discriminator D_F and a season discriminator D_S, wherein:
the input of the generator G comprises an input image and a condition vector carrying the input seasonal style information, and the generator G converts the input image into a seasonal style image determined by the condition vector; the first true-false discriminator D_G distinguishes whether the image converted by the generator G is a synthetic image and feeds the result back to the generator G to provide guidance for it; the season discriminator D_S performs seasonal classification on each synthesized or real image and feeds the result back to the generator G to provide guidance for it;
the generator F converts the image generated by the generator G into a composite image similar to the original input image, and the second true-false discriminator DFIt is discriminated whether or not the converted image of the generator F is a composite image.
2. The seasonal style conversion model for images named MSGAN as claimed in claim 1, wherein the loss function of the generator G during training is:
L(G, D_G, D_S) = L_cGAN(G, D_G, X, Y) + α·L_cyc(G) + β·L_color + γ·L_ssim + δ·L_style(G, D_S)
where X and Y represent the pictures input to the model during training, L_cGAN(G, D_G, X, Y) denotes the adversarial loss function of the generator G and the first discriminator D_G, L_cyc(G) denotes the cycle consistency loss function of the generator G, L_color denotes the hue loss function, L_ssim denotes the similarity loss function, L_style(G, D_S) denotes the seasonal style loss function, and α, β, γ and δ are the proportional weights of the network loss values. Each loss function is as follows:
L_cGAN(G, D_G, X, Y) = E_{(x,y)~p_data(x,y)}[log D_G(x, y)] + E_{x~p_data(x)}[log(1 − D_G(x, G(x|c)))]
where G(x|c) represents the image generated by the generator G from the input image x and the condition c, E_{(x,y)~p_data(x,y)}[·] represents the mathematical expectation when (x, y) obeys the real data distribution, E_{x~p_data(x)}[·] represents the mathematical expectation when x obeys the real data distribution, D_G(x, y) is the result output when the first true-false discriminator discriminates a real image, and D_G(x, G(x|c)) is the result output when the first true-false discriminator D_G discriminates an image synthesized by the generator G;
L_cyc(G) = E_{x~p_data(x)}[‖F(G(x|c)) − x‖_1]
where F(G(x|c)) represents the result of the generator F converting the output of the generator G under the condition c;
L_color = E_{x~p_data(x)}[‖G(x|c)_w − y_w‖_1]
where G(x|c)_w represents the hue value of the result output by the generator G under the condition c, and y_w represents the hue value of the reference image;
L_ssim = (1/N) Σ_p (1 − SSIM(p))
where N is the number of pixels p in the window and SSIM(·) is the structural similarity index;
L_style(G, D_S) = E_{y~p_data(y)}[log D_S(y)] + E_{x~p_data(x)}[log(1 − D_S(G(x|c)))]
where E_{y~p_data(y)}[·] represents the mathematical expectation when y obeys the real data distribution, D_S(G(x|c)) represents the judgment of the season discriminator D_S on the output of the generator G under the condition c, and D_S(y) represents the discrimination result of the season discriminator D_S on the real image.
3. The seasonal style conversion model for images named MSGAN as claimed in claim 1, wherein the loss function of the generator F during training is:
L(F, D_F) = L_cGAN(F, D_F, X, Y) + α·L_cyc(F) + β·L_color + γ·L_ssim
where X and Y represent the pictures input to the model during training, L_cGAN(F, D_F, X, Y) denotes the adversarial loss function of the generator F and the second discriminator D_F, L_cyc(F) denotes the cycle consistency loss function of the generator F, L_color denotes the hue loss function, L_ssim denotes the similarity loss function, and α, β and γ are the proportional weights of the network loss values. Each loss function is as follows:
L_cGAN(F, D_F, X, Y) = E_{(x,y)~p_data(x,y)}[log D_F(x, y)] + E_{x~p_data(x)}[log(1 − D_F(x, G(x|c)))]
where G(x|c) represents the image generated by the generator G from the input image x and the condition c, E_{(x,y)~p_data(x,y)}[·] represents the mathematical expectation over real data pairs (x, y), E_{x~p_data(x)}[·] represents the mathematical expectation when x obeys the real data distribution, D_F(x, y) denotes the authentication of a real data pair by the second true-false discriminator D_F, and D_F(x, G(x|c)) denotes its authentication of synthesized data;
L_cyc(F) = E_{y~p_data(y)}[‖G(F(y)) − y‖_1]
where G(F(y)) represents the output of the generator G when its input is the output of the generator F;
L_color = E_{x~p_data(x)}[‖G(x|c)_w − y_w‖_1]
where G(x|c)_w represents the hue of the data synthesized by the generator G when receiving the condition c, and y_w represents the hue of the real image;
L_ssim = (1/N) Σ_p (1 − SSIM(p))
where N is the number of pixels p in the window and SSIM(·) is the structural similarity index.
4. The seasonal style conversion model for images named MSGAN as claimed in claim 1, wherein, in order to improve the visual effect of the network output image, the saliency information of the image is used as a reference to guide the optimization of the network; specifically, the image's saliency information is used to set the proportional weights of the network loss values.
5. The seasonal style conversion model for images named MSGAN as claimed in claim 4, wherein setting the proportional weights of the network loss values with the saliency information comprises the following steps:
step S1: carrying out multi-scale superpixel segmentation and saliency segmentation on an input original image;
step S2: judging whether each region segmented by the multi-scale superpixels lies in the saliency region; if so, set the weights of the network loss values to a group of preset values, otherwise set each weight to half of the corresponding weight used for regions inside the saliency region.
6. The seasonal style conversion model for images named MSGAN as claimed in claim 1, wherein the generator G and the generator F have the structure of a symmetric convolutional neural network with 7 layers of residual blocks, consisting in order from input to output of: an auto-encoding structure, a residual block structure and a self-decoding structure.
7. The seasonal style conversion model for images named MSGAN as claimed in claim 1, wherein the structure of the season discriminator D_S is a classical AlexNet network structure.
8. The seasonal style conversion model for images named MSGAN as claimed in claim 1, wherein, before a picture is input to the generator G, the input image is converted into a gray-scale map carrying no seasonal features.
9. An image seasonal style conversion method based on the seasonal style conversion model of the image named MSGAN of any one of claims 1 to 8, comprising the steps of:
step S1: constructing the MSGAN model and training the MSGAN model;
step S2: after the training is finished, use the trained generator G as the conversion model;
step S3: preprocessing the picture to be converted, and inputting the preprocessed picture together with the set seasonal condition into the conversion model to obtain the converted picture for the corresponding season.
10. The image seasonal style conversion method based on the seasonal style conversion model for images named MSGAN as claimed in claim 9, wherein in step S3 the preprocessing specifically comprises: performing gray-level processing on the picture to be converted so that the picture has no seasonal characteristics.
CN202010106255.1A 2020-02-21 2020-02-21 Seasonal style conversion model and method for image named MSGAN Active CN111325661B (en)

Priority Applications (1)

Application Number: CN202010106255.1A · Priority Date: 2020-02-21 · Filing Date: 2020-02-21 · Publication: CN111325661B (en) · Title: Seasonal style conversion model and method for image named MSGAN

Applications Claiming Priority (1)

Application Number: CN202010106255.1A · Priority Date: 2020-02-21 · Filing Date: 2020-02-21 · Publication: CN111325661B (en) · Title: Seasonal style conversion model and method for image named MSGAN

Publications (2)

Publication Number Publication Date
CN111325661A true CN111325661A (en) 2020-06-23
CN111325661B CN111325661B (en) 2024-04-09

Family

ID=71163467

Family Applications (1)

Application Number: CN202010106255.1A (Active, granted as CN111325661B (en)) · Priority Date: 2020-02-21 · Filing Date: 2020-02-21 · Title: Seasonal style conversion model and method for image named MSGAN

Country Status (1)

Country Link
CN (1) CN111325661B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106845471A (en) * 2017-02-20 2017-06-13 深圳市唯特视科技有限公司 A kind of vision significance Forecasting Methodology based on generation confrontation network
CN108171320A (en) * 2017-12-06 2018-06-15 西安工业大学 A kind of image area switching network and conversion method based on production confrontation network
CN108334904A (en) * 2018-02-07 2018-07-27 深圳市唯特视科技有限公司 A kind of multiple domain image conversion techniques based on unified generation confrontation network
CN108805188A (en) * 2018-05-29 2018-11-13 徐州工程学院 A kind of feature based recalibration generates the image classification method of confrontation network
CN109408776A (en) * 2018-10-09 2019-03-01 西华大学 A kind of calligraphy font automatic generating calculation based on production confrontation network
CN109815893A (en) * 2019-01-23 2019-05-28 中山大学 The normalized method in colorized face images illumination domain of confrontation network is generated based on circulation
CN110322396A (en) * 2019-06-19 2019-10-11 怀光智能科技(武汉)有限公司 A kind of pathological section color method for normalizing and system
CN110570363A (en) * 2019-08-05 2019-12-13 浙江工业大学 Image defogging method based on Cycle-GAN with pyramid pooling and multi-scale discriminator
CN110570346A (en) * 2019-08-19 2019-12-13 西安理工大学 Method for performing style migration on calligraphy based on cyclic generation countermeasure network

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112258381A (en) * 2020-09-29 2021-01-22 北京达佳互联信息技术有限公司 Model training method, image processing method, device, equipment and storage medium
CN112258381B (en) * 2020-09-29 2024-02-09 北京达佳互联信息技术有限公司 Model training method, image processing method, device, equipment and storage medium
CN112561864A (en) * 2020-12-04 2021-03-26 深圳格瑞健康管理有限公司 Method, system and storage medium for training caries image classification model
CN112561864B (en) * 2020-12-04 2024-03-29 深圳格瑞健康科技有限公司 Training method, system and storage medium for caries image classification model
CN113066114A (en) * 2021-03-10 2021-07-02 北京工业大学 Cartoon style migration method based on Retinex model
CN113538216A (en) * 2021-06-16 2021-10-22 电子科技大学 Image style migration method based on attribute decomposition

Also Published As

Publication number Publication date
CN111325661B (en) 2024-04-09


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 1502, 15 / F, building 17, Innovation Park Phase II, No. 7, wulongjiang middle Avenue, high tech Zone, Fuzhou City, Fujian Province

Applicant after: Jinggong Huichuang (Fuzhou) Technology Co.,Ltd.

Address before: No.053, area D, a second floor, No.99 Gujing, mabao village, Shangjie Town, Minhou County, Fuzhou City, Fujian Province, 350108

Applicant before: Jinggong digital performance (Fuzhou) Technology Co.,Ltd.

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240424

Address after: 3004, 30th Floor, Building 7, No. 26 Chengtong Street, Shijingshan District, Beijing, 100043

Patentee after: Zhonggong Huike (Beijing) Intelligent Technology Co.,Ltd.

Country or region after: China

Address before: Room 1502, 15 / F, building 17, Innovation Park Phase II, No. 7, wulongjiang middle Avenue, high tech Zone, Fuzhou City, Fujian Province

Patentee before: Jinggong Huichuang (Fuzhou) Technology Co.,Ltd.

Country or region before: China