CN108805803B - Portrait style migration method based on semantic segmentation and deep convolution neural network - Google Patents

Info

Publication number
CN108805803B
Authority
CN
China
Prior art keywords
image
portrait
style
content
semantic
Prior art date
Legal status
Active
Application number
CN201810606345.XA
Other languages
Chinese (zh)
Other versions
CN108805803A (en)
Inventor
赵辉煌
郑金华
孙雅琪
Current Assignee
Hengyang Normal University
Original Assignee
Hengyang Normal University
Priority date
Filing date
Publication date
Application filed by Hengyang Normal University
Priority to CN201810606345.XA
Publication of CN108805803A
Application granted
Publication of CN108805803B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/04 Context-preserving transformations, e.g. by using an importance map
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/194 Segmentation; Edge detection involving foreground-background segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20076 Probabilistic image processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G06T2207/30201 Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a portrait style migration method based on semantic segmentation and a deep convolutional neural network. First, a portrait to be converted and a target style portrait are selected, and both images are semantically segmented into a portrait region and a background region; the facial features are then further segmented out of the portrait region. A portrait style migration loss function is defined, the deep convolutional neural network VGG-19 is adopted as the base model for extracting high-level style features of images, content constraint layers and style constraint layers are defined inside the VGG-19 model, and a new model structure is established. The segmented semantic images and the original images are input into the new VGG-19 model, the high-level style features and content features of the images are extracted, and, using the portrait style migration loss function and the gradient descent method, the loss function is minimized over repeated iterations to finally generate the style migration result image.

Description

Portrait style migration method based on semantic segmentation and deep convolution neural network
Technical Field
The invention relates to the field of deep learning, and in particular to a portrait style migration method based on semantic segmentation and a deep convolutional neural network.
Background
With the rapid development of science and technology in the field of deep learning research, the process of fusing the semantic content of a picture with different styles using a CNN has come to be called neural style transfer (Neural Style Transfer). The CVPR oral paper "Image Style Transfer Using Convolutional Neural Networks" by Gatys et al. demonstrated the surprising ability of convolutional neural networks (CNNs) in image style migration: by separating and recombining picture content and style, a CNN can create works of artistic charm. Since then, neural style migration has attracted great interest in academic research and industrial applications; transferring the artistic style of artworks onto everyday photos has become a computer vision task that receives great attention in both academia and industry, and has produced many remarkable applications in portrait style migration. Meanwhile, the Torr Vision Group at Oxford University presented a model at ICCV 2015 (Conditional Random Fields as Recurrent Neural Networks); after training, the CRF-as-RNN model can segment target contents in an image.
The existing style migration methods mainly have the following problem: image style migration is highly random, so in many cases the effect is not ideal. In particular, in portrait style migration errors sometimes occur, for example the eye features of the style image are migrated onto the mouth, or the image background features are migrated onto the portrait, making the migration effect very unsatisfactory.
Disclosure of Invention
The invention provides a portrait style migration method based on semantic segmentation and a deep convolutional neural network, aiming at realizing targeted style migration of portraits and improving the portrait style migration effect.
To achieve this technical purpose, the technical scheme of the invention is as follows:
a portrait style migration method based on semantic segmentation and a deep convolutional neural network comprises the following steps:
step 1, selecting a content portrait image that needs style transfer and a style portrait image serving as the style source, and performing semantic segmentation on the content image and the style image respectively to segment the portrait region and the background region, i.e., forming the semantic images of the content image and the style image;
step 2, adopting the deep convolutional neural network VGG-19 as the original model for high-level image feature extraction, with relu5_1 as the content constraint feature extraction layer and relu3_1 and relu4_1 as the style constraint feature extraction layers;
step 3, establishing new feature maps for the content constraint feature extraction layer and the style constraint feature extraction layer respectively;
step 4, randomly generating a Gaussian noise image as the new initialization image;
step 5, adjusting the size of the new initialization image according to the size of the content portrait image;
step 6, inputting the style portrait image, the semantic image of the content image and the semantic image of the style image into the convolutional neural network VGG-19, and then calculating the style constraint layer loss function of the content portrait semantic image and the style portrait semantic image on the style constraint layers relu3_1 and relu4_1 using a Markov random field;
step 7, inputting the new initialization image into the convolutional neural network VGG-19, and calculating the content constraint loss function of the finally generated style image at the content constraint layer relu5_1 using a Markov random field model;
step 8, combining the results of step 6 and step 7 to obtain a total loss function, and generating the portrait style migration result by applying an optimization algorithm based on the gradient descent method to the different layers; that is, the gradient of the style migration portrait is computed iteratively by gradient descent, and the total loss function is decreased along the negative gradient direction, so that the style migration portrait generated at each iteration is as similar as possible to the original content portrait and the style portrait;
step 9, repeating steps 6-8 for 100 iterations and steps 5-8 for 3 iterations, and outputting the final portrait style migration image (the overall control flow is sketched below).
In the method, in step 1, semantic segmentation is first performed on the content image and the style image to segment the semantic images of the portrait region and the background region; the portrait region is then further semantically segmented into 5 regions (face, nose, eyes, mouth and body) as 5 semantic images, so that 6 semantic images (background, face, nose, eyes, mouth and body) are finally obtained.
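As an illustration of this six-way split, the following sketch turns a per-pixel label map (such as one produced by a segmentation model; the integer label codes here are hypothetical) into the six binary semantic images:

```python
import numpy as np

# Hypothetical label codes for the six regions of step 1.
LABELS = {"background": 0, "face": 1, "nose": 2, "eyes": 3, "mouth": 4, "body": 5}

def semantic_images(label_map: np.ndarray) -> dict:
    """label_map: (H, W) integer array -> six binary masks m^k, k = 1..6."""
    return {name: (label_map == code).astype(np.float32)
            for name, code in LABELS.items()}
```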
In the method, in step 3, the new feature map of the content constraint feature extraction layer is

$$\tilde{f}_c^{\,l} = f_c^{\,l} \oplus \beta_c\, m_c^{\,k}, \qquad k = 1,2,3,4,5,6,$$

where $l$ denotes the content constraint feature extraction layer in VGG-19, i.e. relu5_1, $f_c^{\,l}$ is the feature map generated by the content portrait image at the content constraint layer of the VGG-19 network model, $\beta_c$ is the semantic content portrait weight adjustment parameter with value range [0,200], $m_c^{\,k}$ denotes the semantic images of the content portrait, and $\oplus$ denotes attaching the weighted semantic images to the feature map. The new feature map of the style constraint feature extraction layer is

$$\tilde{f}_s^{\,l} = f_s^{\,l} \oplus \beta_s\, m_s^{\,k}, \qquad k = 1,2,3,4,5,6,$$

where $l$ denotes the style constraint feature extraction layers in VGG-19, i.e. relu3_1 and relu4_1, $f_s^{\,l}$ is the feature map generated by the style portrait image at the style constraint layer of the VGG-19 network model, $\beta_s$ is the semantic style portrait weight adjustment parameter with value range [0,200], and $m_s^{\,k}$ denotes the semantic images of the style portrait.
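A minimal sketch of this feature-map update, assuming (as in the reconstruction above) that the β-weighted semantic images are attached to the VGG-19 activations as extra channels:

```python
import torch
import torch.nn.functional as F

def augment_features(feat: torch.Tensor, masks: list, beta: float) -> torch.Tensor:
    """feat: (1, C, H, W) VGG-19 activations; masks: six (h, w) tensors m^k.
    Returns the 'new feature map' with the weighted masks as extra channels."""
    H, W = feat.shape[-2:]
    sem = torch.stack([
        F.interpolate(m[None, None], size=(H, W), mode="nearest")[0, 0]
        for m in masks
    ])                                                  # (6, H, W), layer resolution
    return torch.cat([feat, beta * sem[None]], dim=1)   # (1, C + 6, H, W)
```

The same helper serves both branches, with βc for the content constraint layer and βs for the style constraint layers.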
In the method, in step 5, the size of the new initialization image is set to

$$(\hat{w}, \hat{h}) = \left( \frac{w_c}{2^{\,L-1}},\; \frac{h_c}{2^{\,L-1}} \right),$$

where $w_c$, $h_c$ are respectively the length and width of the content portrait image and L is a parameter adjusting the image size, taking the values 3, 2 and 1 in the successive iterations.
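Under this reading of the size rule (the halving schedule is the assumption stated above), the three working resolutions can be computed as:

```python
def level_size(w_c: int, h_c: int, L: int) -> tuple:
    """Assumed schedule: L = 3, 2, 1 -> quarter, half, full resolution."""
    return w_c // 2 ** (L - 1), h_c // 2 ** (L - 1)

# For a 600 x 800 content portrait:
# [level_size(600, 800, L) for L in (3, 2, 1)] == [(150, 200), (300, 400), (600, 800)]
```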
In the method, in step 6, the style constraint layer loss function is:

$$E_s\big(\Phi(x), \Phi(x_s), m_c, m_s\big) = \sum_{i=1}^{p_1} \left\| \Psi_i^{*}(\Phi(x)) - \Psi_{NN(i)}^{*}(\Phi(x_s)) \right\|^2,$$

where $\Psi_i^{*}(\Phi(x)) = \Psi_i\big(\Phi(x) \oplus \beta_c m_c\big)$ and $\Psi_j^{*}(\Phi(x_s)) = \Psi_j\big(\Phi(x_s) \oplus \beta_s m_s\big)$ are the local patches of the semantically augmented feature maps. Here $\Phi(x)$ is a feature map, $i$ denotes the $i$-th and $j$ the $j$-th local block; $\Phi(x)$ and $m_c$ are divided into local blocks of size $r \times r$, i.e. local patches, the sets of local patches being written $\Psi(\Phi(x))$ and $\Psi(m_c)$; the division of $\Phi(x)$ generates $p_1$ local patches and the division of $m_c$ generates $p_2$ local patches. $x_c \in \mathbb{R}^{w_c \times h_c \times 3}$ is the content portrait image and $x_s \in \mathbb{R}^{w_s \times h_s \times 3}$ the style portrait image, where $\mathbb{R}$ denotes the set of real numbers, $w_c, h_c$ are respectively the length and width of the content portrait image and $w_s, h_s$ those of the style portrait image; $m_c$ denotes the semantic image of the content portrait and $m_s$ that of the style portrait. $\Psi_i^{*}(\Phi(x))$ denotes the $i$-th local patch, and $\Psi_{NN(i)}^{*}(\Phi(x_s))$ denotes the local patch of $\Psi^{*}(\Phi(x_s))$ that best matches $\Psi_i^{*}(\Phi(x))$; $k$ denotes the number of semantic images;
wherein the local patch selection rule is defined as

$$NN(i) = \underset{j=1,\dots,p_2}{\arg\max} \; \frac{\Psi_i^{*}(\Phi(x)) \cdot \Psi_j^{*}(\Phi(x_s))}{\left|\Psi_i^{*}(\Phi(x))\right| \cdot \left|\Psi_j^{*}(\Phi(x_s))\right|}.$$
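A sketch of this style term at one constraint layer, under the reconstruction above: patches are taken from the semantically augmented feature maps, each patch of the generated image is matched to its nearest style patch by normalized cross-correlation, and the squared distance to the match is penalized. The helper names are illustrative, not library APIs.

```python
import torch
import torch.nn.functional as F

def patches(feat: torch.Tensor, r: int = 3) -> torch.Tensor:
    """(1, C, H, W) -> (p, C*r*r): stride-1 r x r local patches."""
    return F.unfold(feat, kernel_size=r).squeeze(0).t()

def style_layer_loss(feat_x: torch.Tensor, feat_s: torch.Tensor, r: int = 3):
    """feat_x, feat_s: augmented feature maps of x and x_s at one style layer."""
    px, ps = patches(feat_x, r), patches(feat_s, r)     # (p1, D), (p2, D)
    with torch.no_grad():                               # NN(i): match indices only
        # dense (p1, p2) similarity matrix; a chunked match would be used at scale
        nn = (F.normalize(px, dim=1) @ F.normalize(ps, dim=1).t()).argmax(dim=1)
    return ((px - ps[nn]) ** 2).sum()                   # E_s at this layer
```

Freezing the matches inside `torch.no_grad()` treats NN(i) as a constant during back-propagation, which is the usual practice for MRF-style losses.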
In the method, in step 7, the content constraint loss function is

$$E_c\big(\Phi(x), \Phi(x_c)\big) = \left\| \Phi(x) - \Phi(x_c) \right\|^2.$$
In the method, in step 8, the total loss function is

$$E(x) = \alpha_1 E_s\big(\Phi(x), \Phi(x_s), m_c, m_s\big) + \alpha_2 E_c\big(\Phi(x), \Phi(x_c)\big),$$

where $\alpha_1$ and $\alpha_2$ are adjustment parameters controlling how strongly the original content image and the style image appear in the generated image, with $\alpha_1 \in [0,1]$ and $\alpha_2 \in [0,200]$.
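The content and total losses are direct to express; a minimal sketch (the parameter values are those of the embodiment below):

```python
import torch

def content_loss(feat_x: torch.Tensor, feat_c: torch.Tensor) -> torch.Tensor:
    """E_c = ||Phi(x) - Phi(x_c)||^2 at the content constraint layer relu5_1."""
    return ((feat_x - feat_c) ** 2).sum()

def total_loss(e_s: torch.Tensor, e_c: torch.Tensor,
               alpha1: float = 0.001, alpha2: float = 20.0) -> torch.Tensor:
    """E(x) = alpha1 * E_s + alpha2 * E_c; the embodiment uses these values."""
    return alpha1 * e_s + alpha2 * e_c
```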
In the method, in step 8, the optimization algorithm based on the gradient descent method comprises the following steps:
(1) Initialization: set the iteration parameters i = 0 and j = m, define the matrix H and initialize it as a diagonal matrix with all elements 1, and set the allowable error $\varepsilon = 10^{-5}$; compute the initial gradient $g_1 = \nabla f(x_0)$, where $x_0$ is the Gaussian noise image randomly generated in step 4;
(2) If $i \ge itr$ or $\|\nabla f(x_{i+1})\| \le 10^{-5}$, output the i-th iteration result $x_{i+1}$ and end the optimization algorithm; otherwise go to step (3); itr is the maximum number of iterations;
(3) Define $p_i$ as the negative gradient direction of the i-th iteration: $p_i = -g_i$;
(4) Update the result of the i-th iteration: $x_{i+1} = x_i + p_i$;
(5) Define $s_i$ as the difference between the previous result $x_i$ and the result of this iteration, i.e. $s_i = x_{i+1} - x_i$; define $y_i$ as the difference between the gradient $\nabla f(x_i)$ of the previous result and the gradient $\nabla f(x_{i+1})$ of this iteration's result, i.e. $y_i = \nabla f(x_{i+1}) - \nabla f(x_i)$; and define $\rho_i = 1/(y_i^T s_i)$, where T denotes the matrix transpose;
(6) Update $H$: $H_{i+1} = (I - \rho_i s_i y_i^T)\, H_i\, (I - \rho_i y_i s_i^T) + \rho_i s_i s_i^T$;
(7) Define the variable q as the gradient of $x_i$: $q = \nabla f(x_i)$;
(8) Iterate from j = 1: take $a_j = \rho_{i-j}\, s_{i-j}^T\, q$ and update q as $q = q - a_j\, y_{i-j}$, until j = m, where m is the preset iteration number;
(9) Update $g_i$: $g_i = H_i q$;
(10) Iterate from j = 1: take $b = \rho_{i-j}\, y_{i-j}^T\, g_i$ and update $g_i$ as $g_i = g_i + s_{i-j}(a_j - b)$, until j = m;
(11) Update the iteration step: i = i + 1, and jump to step (2).
In the method, the optimization algorithm based on the gradient descent method further comprises a step, executed after step (5), of retaining only the results of the latest m iterations: if i > m, then $s_{i-m}, s_{i-m-1}, \dots, s_1$ and $y_{i-m}, y_{i-m-1}, \dots, y_1$ are deleted (see the sketch below).
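Steps (5)-(10) have the shape of a limited-memory BFGS two-loop recursion: the m most recent pairs $(s_i, y_i)$ reconstruct the action of the inverse Hessian on the current gradient without storing a full matrix. A sketch under that reading (NumPy; `s_hist` and `y_hist` are hypothetical names for the retained m pairs, oldest first):

```python
import numpy as np

def two_loop_direction(grad: np.ndarray, s_hist: list, y_hist: list) -> np.ndarray:
    """Search direction from the two-loop recursion of steps (7)-(10),
    with H_0 = I as in the initialization of step (1)."""
    rho = [1.0 / float(y @ s) for s, y in zip(s_hist, y_hist)]
    q = grad.copy()
    alphas = []
    for s, y, r in reversed(list(zip(s_hist, y_hist, rho))):  # first loop, newest first
        a = r * float(s @ q)
        alphas.append(a)
        q -= a * y
    g = q                                                     # apply H_0 = I
    for (s, y, r), a in zip(zip(s_hist, y_hist, rho), reversed(alphas)):
        b = r * float(y @ g)                                  # second loop, oldest first
        g += s * (a - b)
    return -g                                                 # p_i = -g_i
```

Deleting the pairs older than m, as in the memory-saving step above, is exactly what keeps this recursion limited-memory.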
The method establishes an image content model and an image style model based on high-level semantic representations in a convolutional neural network, and then optimizes an initial image (such as a random noise image) so that, within the same convolutional neural network, it has a content representation similar to that of the content portrait image and a style representation similar to that of the style portrait image, thereby generating an image that fuses the content of the content portrait image with the style of the style portrait image and realizing the style transfer function.
Compared with other style transfer algorithms, the differences and advantages of the invention are:
(1) The invention subdivides the feature map generated from the original portrait more finely, establishes a loss function over extracted sub-blocks (patches) of the feature map, and minimizes this loss function by the gradient descent method. The generated portrait therefore has better detail features and a more satisfactory effect, which is an essential difference from traditional methods.
(2) The method performs semantic segmentation on the original style portrait and content portrait to obtain several semantic images, converts these semantic portraits into feature maps, and attaches them to selected layers of the VGG network model, providing more features for the image style migration method to select from.
(3) The invention defines a new loss function that increases the constraint of the semantic images on the output result. This avoids certain errors in style transfer (such as eye features of the style portrait being transferred to the mouth, or image background features being transferred to the portrait) and improves the portrait style migration effect.
In conclusion, the invention achieves the technical effect of style transfer for any style portrait image that can be semantically segmented.
Drawings
FIG. 1 is a system flow diagram of the present invention;
FIG. 2 is a model architecture diagram of the present invention;
FIG. 3 is a content portrait image employed by embodiments of the present invention;
FIG. 4 is a stylistic portrait image employed by embodiments of the present invention;
FIG. 5 is the style migration result of the portrait style migration method of the present invention;
FIG. 6 is the style migration result of a conventional portrait style migration method.
Detailed Description
Referring to FIG. 1 and FIG. 2, which are respectively the system flowchart and the model architecture diagram of the present invention: this embodiment selects an artistic image as the style portrait $x_s$, shown in FIG. 4, and an image as the content portrait $x_c$, shown in FIG. 3, where $w_c, h_c$ are respectively the length and width of the content portrait image and $w_s, h_s$ are respectively the length and width of the style portrait image. Semantic segmentation is then performed on the style portrait and the content portrait using a semantics-based image segmentation algorithm:
Step 1, the CRF-as-RNN model developed by Oxford University is selected as the semantic segmentation model for the image portrait region, and semantic segmentation is performed on the content image and the style image respectively to segment the portrait region and the background region.
Step 2, the OpenFace face region segmentation algorithm is adopted to calibrate the face, nose, eyes, mouth and body regions of the portrait region; semantic segmentation then splits out the 5 regions of face, nose, eyes, mouth and body as 5 semantic images, so that 6 semantic images (background, face, nose, eyes, mouth and body) are finally obtained: the semantic images of the content portrait $m_c^{\,k}$ and the semantic images of the style portrait $m_s^{\,k}$, k = 1,2,3,4,5,6.
FIG. 3 is the target content image $x_c$ and FIG. 4 is the target portrait style image $x_s$; the goal is to generate the style migration result shown in FIG. 5.
Step 3, the deep convolutional neural network VGG-19, which achieved excellent results in the 2014 ImageNet image classification competition, is selected as the model for extracting high-level style features of images.
Step 4, the content constraint layer is set: with the target content image $x_c$ shown in FIG. 3 and the target style image $x_s$ shown in FIG. 4, relu5_1 is selected as the content constraint layer and relu3_1 and relu4_1 are selected as the style constraint layers; L is set to 3, 2 and 1, i.e., three levels of iteration are adopted, with the maximum number of iterations per level itr = 100;
Step 5, at the VGG-19 network content constraint layer relu5_1, the semantic images of the content portrait $m_c^{\,k}$ and the content portrait $x_c$ are read, and the feature maps in the VGG-19 network content constraint layer are updated:

$$\tilde{f}_c = f_c \oplus \beta_c\, m_c^{\,k}, \qquad k = 1,2,\dots,6,$$

where $\tilde{f}_c$ is the feature map of the new VGG-19 network at the content constraint layer and $f_c$ is the feature map generated by the content portrait $x_c$ at the content constraint layer; $\beta_c = 20$ is taken.
Step 6, a new input-output model is established at the content layer relu5_1, the gradient of the network model at the relu5_1 layer is recalculated, and the output of the network model at the relu5_1 layer is updated to obtain the new output at relu5_1.
Step 7, the style constraint layers are set: the target style image $x_s$ is input into the convolutional neural network VGG-19, and the feature maps of the style image are computed at the style constraint layers relu3_1 and relu4_1.
Step 8, at the VGG-19 network style constraint layers relu3_1 and relu4_1, the semantic images of the style portrait $m_s^{\,k}$ and the style portrait $x_s$ are read, and the feature maps in the VGG-19 network style constraint layers are updated:

$$\tilde{f}_s = f_s \oplus \beta_s\, m_s^{\,k}, \qquad k = 1,2,\dots,6,$$

where $\tilde{f}_s$ is the feature map of the new VGG-19 network at the style constraint layer and $f_s$ is the feature map generated by the style portrait $x_s$ at the style constraint layer; $\beta_s = 20$.
Step 9, new input-output models are established at the style layers relu3_1 and relu4_1, the gradients of the network model at the relu3_1 and relu4_1 layers are recalculated, and the outputs of the network model at relu3_1 and relu4_1 are updated to obtain the new outputs at these layers.
Step 10, a Gaussian noise image is randomly generated as the new initialization image $x_0$.
Step 11, using the result of the previous iteration, the image size is reset if necessary to

$$(\hat{w}, \hat{h}) = \left( \frac{w_c}{2^{\,L-1}},\; \frac{h_c}{2^{\,L-1}} \right).$$
Step 12, the target content portrait $x_c$ and its semantic image $m_c$ are input into the convolutional neural network VGG-19, and, using a Markov random field (MRF) model at the content constraint layer, the feature maps output by the network model are recorded as $\Phi(x_c)$, $m_c$.
Step 13, the target style image $x_s$ and its semantic image $m_s$ are input into the convolutional neural network VGG-19, and, using a Markov random field (MRF) model at the style constraint layers, the feature maps output by the network model are recorded as $\Phi(x_s)$, $m_s$.
Step 14, $\Phi(x_s)$ and $m_s$ are traversed with stride 1, and $\Phi(x_s)$, $m_s$ and $m_c$ are divided into p small blocks (local patches) of size 3 × 3.
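For reference, a stride-1 division of a feature map of spatial size $H \times W$ into $3 \times 3$ patches yields (assuming no padding)

$$p = (H - 2)(W - 2)$$

patches; for example, a $50 \times 50$ feature map gives $p = 48 \times 48 = 2304$ local patches.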
Step 15, loss functions on style constraint layers relu3_ l and relu4_1,
Figure BDA0001694432720000091
Figure BDA0001694432720000092
βcsthe method is used for adjusting the weight of semantic images, wherein p1 and p2 represent that phi (x) is segmented to generate p1 local patches and m iscThe segmentation generates p2 local patches,
step 16, Ψi(Φ (x)) represents a local patch, and
Figure BDA0001694432720000093
and
Figure BDA0001694432720000098
respectively represents phi (x)s) Or
Figure BDA0001694432720000094
Meso-and Ψi(Φ (x)) and
Figure BDA0001694432720000095
the best matching patch, k, represents the number of semantic images.
Step 17, the local patch selection rule is defined as,
Figure BDA0001694432720000096
Figure BDA0001694432720000097
Step 18, the loss function on the content constraint layer relu5_1 is calculated: the new image x is input into the convolutional neural network VGG-19, and, using a Markov random field (MRF) model at the content constraint layer, the loss function of the generated portrait x at the content constraint layer relu5_1 is obtained as

$$E_c\big(\Phi(x), \Phi(x_c)\big) = \left\| \Phi(x) - \Phi(x_c) \right\|^2.$$
Step 19, the total loss function is established:

$$E(x) = \alpha_1 E_s\big(\Phi(x), \Phi(x_s), m_c, m_s\big) + \alpha_2 E_c\big(\Phi(x), \Phi(x_c)\big),$$

with $\alpha_1 = 0.001$ and $\alpha_2 = 20$.
Step 20, the minimization of the optimization function E(x) is then solved by the gradient descent method to generate the output image x. The optimization algorithm based on the gradient descent method comprises the following steps:
(1) Initialization: set the iteration parameters i = 0 and j = m, define the matrix H and initialize it as a diagonal matrix with all elements 1, and set the allowable error $\varepsilon = 10^{-5}$; compute the initial gradient $g_1 = \nabla f(x_0)$, where $x_0$ is the Gaussian noise image randomly generated in step 10; the preset iteration number is m = 6 and itr = 100;
(2) If $i \ge itr$ or $\|\nabla f(x_{i+1})\| \le 10^{-5}$, output the i-th iteration result $x_{i+1}$ and end the optimization algorithm; otherwise go to step (3); itr is the maximum number of iterations;
(3) Define $p_i$ as the negative gradient direction of the i-th iteration: $p_i = -g_i$;
(4) Update the result of the i-th iteration: $x_{i+1} = x_i + p_i$;
(5) Define $s_i$ as the difference between the previous result $x_i$ and the result of this iteration, i.e. $s_i = x_{i+1} - x_i$; define $y_i$ as the difference between the gradient $\nabla f(x_i)$ of the previous result and the gradient $\nabla f(x_{i+1})$ of this iteration's result, i.e. $y_i = \nabla f(x_{i+1}) - \nabla f(x_i)$; and define $\rho_i = 1/(y_i^T s_i)$, where T denotes the matrix transpose;
(6) Update $H$: $H_{i+1} = (I - \rho_i s_i y_i^T)\, H_i\, (I - \rho_i y_i s_i^T) + \rho_i s_i s_i^T$;
(7) Define the variable q as the gradient of $x_i$: $q = \nabla f(x_i)$;
(8) Iterate from j = 1: take $a_j = \rho_{i-j}\, s_{i-j}^T\, q$ and update q as $q = q - a_j\, y_{i-j}$, until j = m;
(9) Update $g_i$: $g_i = H_i q$;
(10) Iterate from j = 1: take $b = \rho_{i-j}\, y_{i-j}^T\, g_i$ and update $g_i$ as $g_i = g_i + s_{i-j}(a_j - b)$, until j = m;
(11) Update the iteration step: i = i + 1, and jump to step (2).
Meanwhile, to save memory overhead, after step (5) only the results of the latest m iterations are retained: if i > m, then $s_{i-m}, s_{i-m-1}, \dots, s_1$ and $y_{i-m}, y_{i-m-1}, \dots, y_1$ are deleted, which saves memory at run time.
Step 21, steps 12-20 are repeated, and after 100 iterations a new generated image is produced.
Step 22, steps 11-21 are repeated, and after 3 iterations the final style migration result image is output.
The generated style transfer effect image is shown in FIG. 5.
The experimental results show that the method can effectively realize the style transfer function for images.

Claims (8)

1. A portrait style migration method based on semantic segmentation and a deep convolutional neural network, characterized by comprising the following steps:
step 1, selecting a content portrait image that needs style transfer and a style portrait image serving as the style source, and performing semantic segmentation on the content image and the style image respectively to segment the portrait region and the background region, i.e., forming the semantic images of the content image and the style image;
step 2, adopting the deep convolutional neural network VGG-19 as the original model for high-level image feature extraction, with relu5_1 as the content constraint feature extraction layer and relu3_1 and relu4_1 as the style constraint feature extraction layers;
step 3, establishing new feature maps for the content constraint feature extraction layer and the style constraint feature extraction layer respectively;
step 4, randomly generating a Gaussian noise image as the new initialization image;
step 5, adjusting the size of the new initialization image according to the size of the content portrait image;
step 6, inputting the style portrait image, the semantic image of the content image and the semantic image of the style image into the convolutional neural network VGG-19, and then calculating the style constraint layer loss function of the content portrait semantic image and the style portrait semantic image on the style constraint layers relu3_1 and relu4_1 using a Markov random field;
step 7, inputting the new initialization image into the convolutional neural network VGG-19, and calculating the content constraint loss function of the finally generated style image at the content constraint layer relu5_1 using a Markov random field model;
step 8, combining the results of step 6 and step 7 to obtain a total loss function, and generating the portrait style migration result by applying an optimization algorithm based on the gradient descent method to the different layers; that is, the gradient of the style migration portrait is computed iteratively by gradient descent, and the total loss function is decreased along the negative gradient direction, so that the style migration portrait generated at each iteration is as similar as possible to the original content portrait and the style portrait;
step 9, repeating steps 6-8 for 100 iterations and steps 5-8 for 3 iterations, and outputting the final portrait style migration image;
in step 6, the style constraint layer loss function is:

$$E_s\big(\Phi(x), \Phi(x_s), m_c, m_s\big) = \sum_{i=1}^{p_1} \left\| \Psi_i^{*}(\Phi(x)) - \Psi_{NN(i)}^{*}(\Phi(x_s)) \right\|^2,$$

where $\Psi_i^{*}(\Phi(x)) = \Psi_i\big(\Phi(x) \oplus \beta_c m_c\big)$ and $\Psi_j^{*}(\Phi(x_s)) = \Psi_j\big(\Phi(x_s) \oplus \beta_s m_s\big)$ are the local patches of the semantically augmented feature maps; $\Phi(x)$ is a feature map, $i$ denotes the $i$-th and $j$ the $j$-th local block; $\Phi(x)$ and $m_c$ are divided into local blocks of size $r \times r$, i.e. local patches, the sets of local patches being written $\Psi(\Phi(x))$ and $\Psi(m_c)$; the division of $\Phi(x)$ generates $p_1$ local patches and the division of $m_c$ generates $p_2$ local patches; $x_c \in \mathbb{R}^{w_c \times h_c \times 3}$ is the content portrait image and $x_s \in \mathbb{R}^{w_s \times h_s \times 3}$ the style portrait image, where $\mathbb{R}$ denotes the set of real numbers, $w_c, h_c$ are respectively the length and width of the content portrait image and $w_s, h_s$ those of the style portrait image; $m_c$ denotes the semantic image of the content portrait, $m_s$ the semantic image of the style portrait, $\beta_c$ the semantic content portrait weight adjustment parameter and $\beta_s$ the semantic style portrait weight adjustment parameter; $\Psi_i^{*}(\Phi(x))$ denotes the $i$-th local patch, and $\Psi_{NN(i)}^{*}(\Phi(x_s))$ denotes the local patch of $\Psi^{*}(\Phi(x_s))$ that best matches $\Psi_i^{*}(\Phi(x))$; $k$ denotes the number of semantic images;
wherein the local patch selection rule is defined as

$$NN(i) = \underset{j=1,\dots,p_2}{\arg\max} \; \frac{\Psi_i^{*}(\Phi(x)) \cdot \Psi_j^{*}(\Phi(x_s))}{\left|\Psi_i^{*}(\Phi(x))\right| \cdot \left|\Psi_j^{*}(\Phi(x_s))\right|}.$$
2. The method as claimed in claim 1, wherein in step 1, semantic segmentation is first performed on the content image and the style image to segment the semantic images of the portrait region and the background region; the portrait region is then further semantically segmented into 5 regions (face, nose, eyes, mouth and body) as 5 semantic images, so that 6 semantic images (background, face, nose, eyes, mouth and body) are finally obtained.
3. The method according to claim 2, wherein in step 3, the new feature map of the content constraint feature extraction layer is

$$\tilde{f}_c^{\,l} = f_c^{\,l} \oplus \beta_c\, m_c^{\,k}, \qquad k = 1,2,3,4,5,6,$$

where $l$ denotes the content constraint feature extraction layer in VGG-19, i.e. relu5_1, $f_c^{\,l}$ is the feature map generated by the content portrait image at the content constraint layer of the VGG-19 network model, $\beta_c$ is the semantic content portrait weight adjustment parameter with value range [0,200], and $m_c^{\,k}$ denotes the semantic images of the content portrait; the new feature map of the style constraint feature extraction layer is

$$\tilde{f}_s^{\,l} = f_s^{\,l} \oplus \beta_s\, m_s^{\,k}, \qquad k = 1,2,3,4,5,6,$$

where $l$ denotes the style constraint feature extraction layers in VGG-19, i.e. relu3_1 and relu4_1, $f_s^{\,l}$ is the feature map generated by the style portrait image at the style constraint layer of the VGG-19 network model, $\beta_s$ is the semantic style portrait weight adjustment parameter with value range [0,200], and $m_s^{\,k}$ denotes the semantic images of the style portrait.
4. The method of claim 1, wherein in step 5, the size of the new initialization image is set to

$$(\hat{w}, \hat{h}) = \left( \frac{w_c}{2^{\,L-1}},\; \frac{h_c}{2^{\,L-1}} \right),$$

where $w_c$, $h_c$ are respectively the length and width of the content portrait image and L is a parameter adjusting the image size, taking the values 3, 2 and 1 in the successive iterations.
5. The method of claim 4, wherein in step 7, the content constraint loss function is

$$E_c\big(\Phi(x), \Phi(x_c)\big) = \left\| \Phi(x) - \Phi(x_c) \right\|^2.$$
6. The method of claim 5, wherein in step 8, the total loss function is

$$E(x) = \alpha_1 E_s\big(\Phi(x), \Phi(x_s), m_c, m_s\big) + \alpha_2 E_c\big(\Phi(x), \Phi(x_c)\big),$$

where $\alpha_1$ and $\alpha_2$ are adjustment parameters controlling how strongly the original content image and the style image appear in the generated image, with $\alpha_1 \in [0,1]$ and $\alpha_2 \in [0,200]$.
7. The method according to claim 1, wherein in step 8, the optimization algorithm based on the gradient descent method comprises the following steps:
(1) Initialization: set the iteration parameters i = 0 and j = m, define the matrix H and initialize it as a diagonal matrix with all elements 1, and set the allowable error $\varepsilon = 10^{-5}$; compute the initial gradient $g_1 = \nabla f(x_0)$, where $x_0$ is the Gaussian noise image randomly generated in step 4;
(2) If $i \ge itr$ or $\|\nabla f(x_{i+1})\| \le 10^{-5}$, output the i-th iteration result $x_{i+1}$ and end the optimization algorithm; otherwise go to step (3); itr is the maximum number of iterations;
(3) Define $p_i$ as the negative gradient direction of the i-th iteration: $p_i = -g_i$;
(4) Update the result of the i-th iteration: $x_{i+1} = x_i + p_i$;
(5) Define $s_i$ as the difference between the previous result $x_i$ and the result of this iteration, i.e. $s_i = x_{i+1} - x_i$; define $y_i$ as the difference between the gradient $\nabla f(x_i)$ of the previous result and the gradient $\nabla f(x_{i+1})$ of this iteration's result, i.e. $y_i = \nabla f(x_{i+1}) - \nabla f(x_i)$; and define $\rho_i = 1/(y_i^T s_i)$, where T denotes the matrix transpose;
(6) Update $H$: $H_{i+1} = (I - \rho_i s_i y_i^T)\, H_i\, (I - \rho_i y_i s_i^T) + \rho_i s_i s_i^T$;
(7) Define the variable q as the gradient of $x_i$: $q = \nabla f(x_i)$;
(8) Iterate from j = 1: take $a_j = \rho_{i-j}\, s_{i-j}^T\, q$ and update q as $q = q - a_j\, y_{i-j}$, until j = m, where m is the preset iteration number;
(9) Update $g_i$: $g_i = H_i q$;
(10) Iterate from j = 1: take $b = \rho_{i-j}\, y_{i-j}^T\, g_i$ and update $g_i$ as $g_i = g_i + s_{i-j}(a_j - b)$, until j = m;
(11) Update the iteration step: i = i + 1, and jump to step (2).
8. The method according to claim 7, wherein the optimization algorithm based on the gradient descent method further comprises a step, executed after step (5), of retaining only the results of the latest m iterations: if i > m, then $s_{i-m}, s_{i-m-1}, \dots, s_1$ and $y_{i-m}, y_{i-m-1}, \dots, y_1$ are deleted.
CN201810606345.XA 2018-06-13 2018-06-13 Portrait style migration method based on semantic segmentation and deep convolution neural network Active CN108805803B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810606345.XA CN108805803B (en) 2018-06-13 2018-06-13 Portrait style migration method based on semantic segmentation and deep convolution neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810606345.XA CN108805803B (en) 2018-06-13 2018-06-13 Portrait style migration method based on semantic segmentation and deep convolution neural network

Publications (2)

Publication Number Publication Date
CN108805803A CN108805803A (en) 2018-11-13
CN108805803B true CN108805803B (en) 2020-03-13

Family

ID=64085760

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810606345.XA Active CN108805803B (en) 2018-06-13 2018-06-13 Portrait style migration method based on semantic segmentation and deep convolution neural network

Country Status (1)

Country Link
CN (1) CN108805803B (en)

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109829353B (en) * 2018-11-21 2023-04-18 东南大学 Face image stylizing method based on space constraint
CN109583362B (en) * 2018-11-26 2021-11-30 厦门美图之家科技有限公司 Image cartoon method and device
CN109712068A (en) * 2018-12-21 2019-05-03 云南大学 Image Style Transfer and analogy method for cucurbit pyrography
CN109961442B (en) * 2019-03-25 2022-11-18 腾讯科技(深圳)有限公司 Training method and device of neural network model and electronic equipment
CN111815756A (en) * 2019-04-12 2020-10-23 北京京东尚科信息技术有限公司 Image generation method and device, computer readable medium and electronic equipment
CN110084741B * 2019-04-26 2024-06-14 Hengyang Normal University Image style migration method based on saliency detection and deep convolutional neural network
JP7394147B2 (en) * 2019-04-29 2023-12-07 センスタイム グループ リミテッド Image generation method and device, electronic equipment, and storage medium
CN110378838B (en) * 2019-06-25 2023-04-18 达闼机器人股份有限公司 Variable-view-angle image generation method and device, storage medium and electronic equipment
CN112561779B (en) * 2019-09-26 2023-09-29 北京字节跳动网络技术有限公司 Image stylization processing method, device, equipment and storage medium
CN111127309B (en) * 2019-12-12 2023-08-11 杭州格像科技有限公司 Portrait style migration model training method, portrait style migration method and device
CN114930798A (en) * 2019-12-30 2022-08-19 苏州臻迪智能科技有限公司 Shooting object switching method and device, and image processing method and device
CN111223039A (en) * 2020-01-08 2020-06-02 广东博智林机器人有限公司 Image style conversion method and device, electronic equipment and storage medium
CN111242841B (en) * 2020-01-15 2023-04-18 杭州电子科技大学 Image background style migration method based on semantic segmentation and deep learning
CN111340720B (en) * 2020-02-14 2023-05-19 云南大学 Color matching woodcut style conversion algorithm based on semantic segmentation
CN111382782B (en) * 2020-02-23 2024-04-26 华为技术有限公司 Method and device for training classifier
CN111325664B (en) * 2020-02-27 2023-08-29 Oppo广东移动通信有限公司 Style migration method and device, storage medium and electronic equipment
CN113496238A (en) * 2020-03-20 2021-10-12 北京京东叁佰陆拾度电子商务有限公司 Model training method, point cloud data stylization method, device, equipment and medium
CN111402407B (en) * 2020-03-23 2023-05-02 杭州相芯科技有限公司 High-precision portrait model rapid generation method based on single RGBD image
CN111340745B (en) * 2020-03-27 2021-01-05 成都安易迅科技有限公司 Image generation method and device, storage medium and electronic equipment
CN111986302A (en) * 2020-07-23 2020-11-24 北京石油化工学院 Image style migration method and device based on deep learning
CN111986075B (en) * 2020-08-12 2022-08-09 兰州交通大学 Style migration method for target edge clarification
CN111986076A (en) * 2020-08-21 2020-11-24 深圳市慧鲤科技有限公司 Image processing method and device, interactive display device and electronic equipment
CN112288621B (en) * 2020-09-21 2022-09-16 山东师范大学 Image style migration method and system based on neural network
CN112529771B (en) * 2020-12-07 2024-05-31 陕西师范大学 Portrait style migration method
CN112541856B (en) * 2020-12-07 2022-05-03 重庆邮电大学 Medical image style migration method combining Markov field and Graham matrix characteristics
CN113160033B (en) * 2020-12-28 2023-04-28 武汉纺织大学 Clothing style migration system and method
CN112950454B (en) * 2021-01-25 2023-01-24 西安电子科技大学 Image style migration method based on multi-scale semantic matching
US20220237838A1 (en) * 2021-01-27 2022-07-28 Nvidia Corporation Image synthesis using one or more neural networks
CN114493994B (en) * 2022-01-13 2024-04-16 南京市测绘勘察研究院股份有限公司 Ancient painting style migration method for three-dimensional scene

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106847294A (en) * 2017-01-17 2017-06-13 百度在线网络技术(北京)有限公司 Audio-frequency processing method and device based on artificial intelligence
CN106952224A (en) * 2017-03-30 2017-07-14 电子科技大学 A kind of image style transfer method based on convolutional neural networks
CN107767328A (en) * 2017-10-13 2018-03-06 上海交通大学 The moving method and system of any style and content based on the generation of a small amount of sample

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250931A (en) * 2016-08-03 2016-12-21 武汉大学 A kind of high-definition picture scene classification method based on random convolutional neural networks
US9922432B1 (en) * 2016-09-02 2018-03-20 Artomatix Ltd. Systems and methods for providing convolutional neural network based image synthesis using stable and controllable parametric models, a multiscale synthesis framework and novel network architectures

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106847294A (en) * 2017-01-17 2017-06-13 百度在线网络技术(北京)有限公司 Audio-frequency processing method and device based on artificial intelligence
CN106952224A (en) * 2017-03-30 2017-07-14 电子科技大学 A kind of image style transfer method based on convolutional neural networks
CN107767328A (en) * 2017-10-13 2018-03-06 上海交通大学 The moving method and system of any style and content based on the generation of a small amount of sample

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Style Transfer Via Texture Synthesis; Michael Elad et al.; IEEE Transactions on Image Processing; 2017-03-08; Vol. 26, No. 5; pp. 2338-2351 *
Design and Implementation of an Image Color Style Transfer System for Mobile Phone Applications; Cai Xingquan et al.; Information & Communications; 2016-06-30 (No. 6); pp. 139-140 *

Also Published As

Publication number Publication date
CN108805803A (en) 2018-11-13

Similar Documents

Publication Publication Date Title
CN108805803B (en) Portrait style migration method based on semantic segmentation and deep convolution neural network
Yue et al. Dual adversarial network: Toward real-world noise removal and noise generation
Yang et al. High-resolution image inpainting using multi-scale neural patch synthesis
CN110969250B (en) Neural network training method and device
CN109903236B (en) Face image restoration method and device based on VAE-GAN and similar block search
US20160283842A1 (en) Neural network and method of neural network training
CN110084741B (en) Image style migration method based on saliency detection and deep convolutional neural network
CN108647723B (en) Image classification method based on deep learning network
CN110706214B (en) Three-dimensional U-Net brain tumor segmentation method fusing condition randomness and residual error
CN112183501B (en) Depth counterfeit image detection method and device
WO2017214507A1 (en) Neural network and method of neural network training
CN111986075B (en) Style migration method for target edge clarification
Zhang et al. Bionic face sketch generator
CN103942571B (en) Graphic image sorting method based on genetic programming algorithm
CA3137297C (en) Adaptive convolutions in neural networks
CN108734677B (en) Blind deblurring method and system based on deep learning
CN111127309B (en) Portrait style migration model training method, portrait style migration method and device
Xu et al. Styleswap: Style-based generator empowers robust face swapping
CN112101364B (en) Semantic segmentation method based on parameter importance increment learning
WO2021042857A1 (en) Processing method and processing apparatus for image segmentation model
CN109920021A (en) A kind of human face sketch synthetic method based on regularization width learning network
CN112884648A (en) Method and system for multi-class blurred image super-resolution reconstruction
CN116863194A (en) Foot ulcer image classification method, system, equipment and medium
WO2016172889A1 (en) Image segmentation method and device
JP6935868B2 (en) Image recognition device, image recognition method, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant