CN113888400B - Image style migration method and device - Google Patents


Info

Publication number
CN113888400B
Authority
CN
China
Prior art keywords
style
image
demodulated
feature
hidden space
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111302183.9A
Other languages
Chinese (zh)
Other versions
CN113888400A (en
Inventor
李祎
谢鑫
付海燕
王波
郭艳卿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University of Technology
Original Assignee
Dalian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University of Technology
Priority to CN202111302183.9A
Publication of CN113888400A
Application granted
Publication of CN113888400B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/04 Context-preserving transformations, e.g. by using an importance map
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Processing (AREA)

Abstract

The invention provides an image style migration method and device. The method comprises the following steps: inputting a content image $I_c$ and a style image $I_s$ into a pre-trained encoder network E for feature extraction, then fusing the content feature C and the style feature S and projecting the result into a hidden space Z; inputting the information of the style feature S into the first-layer convolution of a decoder network D to obtain the demodulated first-layer decoder weight $A'_1$; obtaining a separation matrix W based on the FastICA algorithm, the separation matrix W minimizing the correlation among the vectors in the matrix ${A'_1}^{T} A'_1$; computing a demodulated semantic direction set from the separation matrix W and the matrix ${A'_1}^{T} A'_1$; and editing a hidden space vector in the hidden space Z along an acquired semantic direction, and finally obtaining the style-migrated image with the decoder network D. The method does not need to be trained on a large-scale style dataset or to learn any parameters, and can be applied to most style migration models.

Description

Image style migration method and device
Technical Field
The invention relates to application of artificial intelligence in the fields of computer vision and image style migration, in particular to an image style migration method and device.
Background
Early image style migration techniques covered only a narrow range of styles: one algorithm often handled only one type of image texture, and the migration results were unsatisfactory. With the rise of artificial intelligence and deep learning in recent years, the field of image style migration has produced many excellent results, and the generated stylized images are increasingly realistic. The essence of style migration is to transfer the style of a painting onto a photograph while preserving the original content of the photograph. To produce images in multiple styles, models often require a large style dataset. Besides requiring large-scale datasets, most current style migration models adopt one of two approaches, iterative optimization or a feed-forward network, to improve the quality of the stylized images:
Iterative optimization method: style migration is achieved by running optimization iterations directly on a white-noise image, so the optimization target is the image itself. Many algorithms compute the maximum mean discrepancy (MMD) during the iterations to measure the difference between the style image and the content image. The two images are "aligned" so as to reduce the losses and errors caused by the image iterations.
Feed-forward network method: the optimization target is a neural network model. The model is updated by gradient descent, and style migration is then performed by a feed-forward pass through the network.
Both methods have advantages and disadvantages. The method based on iterative optimization produces high-quality synthesized images, is easy to control, and is easy to tune, but it takes a long time to compute and has poor real-time performance. The method based on a feed-forward network computes quickly, can be used for fast video stylization, and is currently the mainstream technology in industrial application software, but it needs a large amount of training data if the image generation quality is to be improved further.
Disclosure of Invention
To address the technical problems of long computation time and the need for a large amount of training data, an image style migration method and device are provided. The invention learns different style semantics from the hidden space of a pre-trained style migration model, modifies the relevant encoded information in the hidden space along different semantic directions, and decodes it to obtain images in a variety of styles.
The invention adopts the following technical means:
An image style migration method, comprising the steps of:
acquiring a content image $I_c$ and a style image $I_s$;
inputting the content image $I_c$ and the style image $I_s$ into a pre-trained encoder network E for feature extraction, so as to obtain a content feature C and a style feature S;
fusing the content feature C and the style feature S through a mathematical operation or a convolutional network, and projecting the fused image features into a hidden space Z;
inputting the information of the style feature S, namely the style tensor obtained by encoding the style image with the encoder, into the first-layer convolution of the decoder network D, and adjusting the weight $A_1$ of the first-layer convolution of the decoder network based on the style feature S, thereby obtaining the demodulated first-layer decoder weight $A'_1$;
obtaining a separation matrix W based on the FastICA algorithm, wherein the separation matrix W minimizes the correlation among the vectors in the matrix ${A'_1}^{T} A'_1$;
computing a demodulated semantic direction set based on the separation matrix W and the matrix ${A'_1}^{T} A'_1$;
and editing a hidden space vector in the hidden space Z based on an acquired semantic direction, and finally obtaining the style-migrated image with the decoder network D.
Further, fusing the content feature C and the style feature S includes obtaining the fusion result according to the following calculation:
$$\mathrm{AdaIN}(C, S) = \sigma(S)\,\frac{C - \mu(C)}{\sigma(C)} + \mu(S)$$
wherein $\mathrm{AdaIN}(C, S)$ is the fusion result of the content feature C and the style feature S, $\sigma(S)$ is the standard deviation of the style feature S, $\mu(S)$ is the mean of the style feature S, $\sigma(C)$ is the standard deviation of the content feature C, and $\mu(C)$ is the mean of the content feature C.
Further, adjusting the weight $A_1$ of the first-layer convolution of the decoder network based on the style feature S to obtain the demodulated first-layer weight $A'_1$ includes obtaining the demodulated first-layer decoder weight $A'_1$ in the following manner:
wherein $A_1$ is the weight of the first-layer convolution of the decoder network, $A'_1$ is the demodulated first-layer decoder weight, S is the style feature, and $\epsilon$ is a constant term.
Further, computing the demodulated semantic direction set based on the separation matrix W and the matrix ${A'_1}^{T} A'_1$ includes obtaining the demodulated semantic direction set according to the following calculation:
$$N = \{n_1, n_2, \ldots, n_k\} = W {A'_1}^{T} A'_1$$
where N is the demodulated semantic direction set, $n_i$ is the i-th semantic direction, $i = 1, \ldots, k$, W is the separation matrix, $A'_1$ is the demodulated first-layer decoder weight, and ${A'_1}^{T}$ is its transpose.
Further, editing the hidden space vector in the hidden space Z based on the acquired semantic direction includes obtaining the edited hidden space vector according to the following calculation:
$$z' = z + \alpha n_i$$
wherein $z'$ is the edited hidden space vector, z is the hidden space vector, $\alpha$ is the preset degree of style change, and $n_i$ is the i-th semantic direction.
The invention also provides an image style migration device for implementing the image style migration method described above, comprising:
an acquisition unit configured to acquire a content image $I_c$ and a style image $I_s$;
an encoding unit configured to input the content image $I_c$ and the style image $I_s$ into a pre-trained encoder network E for feature extraction, so as to obtain a content feature C and a style feature S;
a fusion unit configured to fuse the content feature C and the style feature S through a mathematical operation or a convolutional network and to project the fused image features into the hidden space Z;
a weight adjustment unit configured to input the information of the style feature S into the first-layer convolution of the decoder network D and to adjust the weight $A_1$ of the first-layer convolution of the decoder network based on the style feature S, so as to obtain the demodulated first-layer decoder weight $A'_1$;
a separation matrix acquisition unit configured to obtain a separation matrix W based on the FastICA algorithm, where the separation matrix W minimizes the correlation among the vectors in the matrix ${A'_1}^{T} A'_1$;
a calculation unit configured to compute a demodulated semantic direction set based on the separation matrix W and the matrix ${A'_1}^{T} A'_1$;
and a decoding unit configured to edit a hidden space vector in the hidden space Z based on an acquired semantic direction and finally to obtain the style-migrated image with the decoder network D.
Further, fusing the content feature C and the style feature S includes obtaining the fusion result according to the following calculation:
$$\mathrm{AdaIN}(C, S) = \sigma(S)\,\frac{C - \mu(C)}{\sigma(C)} + \mu(S)$$
wherein $\mathrm{AdaIN}(C, S)$ is the fusion result of the content feature C and the style feature S, $\sigma(S)$ is the standard deviation of the style feature S, $\mu(S)$ is the mean of the style feature S, $\sigma(C)$ is the standard deviation of the content feature C, and $\mu(C)$ is the mean of the content feature C.
Further, the weight adjustment unit obtains the demodulated first-layer decoder weight $A'_1$ in the following manner:
wherein $A_1$ is the weight of the first-layer convolution of the decoder network, $A'_1$ is the demodulated first-layer decoder weight, S is the style feature, and $\epsilon$ is a constant term.
Further, the calculation unit obtains the demodulated semantic direction set according to the following calculation:
$$N = \{n_1, n_2, \ldots, n_k\} = W {A'_1}^{T} A'_1$$
where N is the demodulated semantic direction set, $n_i$ is the i-th semantic direction, $i = 1, \ldots, k$, W is the separation matrix, $A'_1$ is the demodulated first-layer decoder weight, and ${A'_1}^{T}$ is its transpose.
Further, the decoding unit obtains the edited hidden space vector in the hidden space Z according to the following calculation:
$$z' = z + \alpha n_i$$
wherein $z'$ is the edited hidden space vector, z is the hidden space vector, $\alpha$ is the degree of change of the hidden space vector, i.e., the degree of style change, and $n_i$ is the i-th semantic direction.
Compared with the prior art, the invention has the following advantages:
1. The invention greatly reduces the complexity of the model. It does not need to learn any parameters or use large datasets, and it can learn a large number of styles from the hidden space of the pre-trained model using only simple mathematics.
2. The invention can efficiently generate various types of images and edit the target attributes of the images. The algorithm is simple and easy to use, can be embedded into different style migration models, and has strong universality and flexibility.
3. Compared with traditional methods, the method saves time and avoids wasting equipment resources.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings in the following description show some embodiments of the present invention, and a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of a style migration basic framework.
FIG. 2 is a flow chart of an image style migration method according to the present invention.
FIG. 3 shows the results of Example 1 of the present invention, tested on the AdaIN, Linear, MST, and SANet models.
FIG. 4 shows the results of Example 2 of the present invention, tested on the AdaIN, Linear, MST, and SANet models.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some embodiments of the present invention, not all of them. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention.
Image style migration is a popular research direction in the field of computer vision. With the rise of deep learning, the field has made breakthrough progress. Style migration transfers the style of a painting onto a real image while keeping the content of the real image unchanged. The input is a content image and a style image, and the output is a stylized result. The entire stylization process is divided into three parts, as shown in FIG. 1.
Part 1: feature encoding. A content image and a style image are input, and the system uses an encoder network to extract the relevant features of the two images, namely the content feature and the style feature.
Part 2: feature fusion. The system needs to integrate the style feature of the oil painting into the content feature of the real image, that is, to combine the two features into one hidden space, in preparation for the subsequent decoder to generate a new picture.
Part 3: feature decoding. The decoder obtains the hidden codes from the hidden space and converts them, through a neural network, into an artistic image with the same style as the style picture.
Currently, conventional style migration models must be trained on large artwork datasets with advanced convolutional neural network architectures in order to produce realistic artistic images, a process that is time-consuming and labor-intensive. To solve these problems, the present invention provides a style migration method that focuses on the hidden space used in the third part: by learning and exploring the rich latent information in the hidden space, the system can learn a large number of artistic styles from the hidden space of a pre-trained model. Compared with traditional methods, the method has strong universality and flexibility, and can be embedded into most style migration models, such as AdaIN, Linear, MST, and SANet, without relearning the dataset.
As shown in FIG. 2, the invention provides an image style migration method. It is an unsupervised decoupling method and is mainly applied to the feature decoding part of style migration. The first-layer weight of the decoder is decomposed, a large number of artistic semantics are learned from the demodulated weight, and the corresponding attributes of the image are edited along the directions of the artistic semantics, so that artistic works in a variety of styles are generated. The method mainly comprises the following steps:
S1, acquiring a content image $I_c$ and a style image $I_s$.
S2, inputting the content image $I_c$ and the style image $I_s$ into a pre-trained encoder network E for feature extraction, so as to obtain a content feature C and a style feature S.
S3, fusing the content feature C and the style feature S through a mathematical operation or a convolutional network, and projecting the fused image features into the hidden space Z.
Specifically, this embodiment preferably uses adaptive instance normalization: the mean and standard deviation of each of the two features are computed, and the content feature C and the style feature S are fused according to the following calculation:
$$\mathrm{AdaIN}(C, S) = \sigma(S)\,\frac{C - \mu(C)}{\sigma(C)} + \mu(S)$$
wherein $\mathrm{AdaIN}(C, S)$ is the fusion result of the content feature C and the style feature S, $\sigma(S)$ is the standard deviation of the style feature S, $\mu(S)$ is the mean of the style feature S, $\sigma(C)$ is the standard deviation of the content feature C, and $\mu(C)$ is the mean of the content feature C.
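The adaptive instance normalization fusion described above can be sketched in NumPy as follows. This is a minimal illustration rather than the patented implementation: the function name `adain`, the (channels, height, width) layout, the per-channel statistics, and the small `eps` added to the denominator for numerical safety are all assumptions made for the example.

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Fuse content and style features of shape (channels, height, width).

    Each content channel is normalized by its own mean and standard
    deviation, then rescaled and shifted by the style channel statistics.
    """
    mu_c = content.mean(axis=(1, 2), keepdims=True)
    sigma_c = content.std(axis=(1, 2), keepdims=True)
    mu_s = style.mean(axis=(1, 2), keepdims=True)
    sigma_s = style.std(axis=(1, 2), keepdims=True)
    # eps is an assumption of this sketch; it guards against division by zero.
    return sigma_s * (content - mu_c) / (sigma_c + eps) + mu_s

# Toy features standing in for encoder outputs C and S (random, for illustration).
rng = np.random.default_rng(0)
C = rng.standard_normal((4, 8, 8))
S = 3.0 * rng.standard_normal((4, 6, 6)) + 2.0
fused = adain(C, S)
```

After fusion, each channel of `fused` carries the mean and (up to `eps`) the standard deviation of the corresponding style channel, while its spatial layout still comes from the content feature.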
S4, inputting the information of the style feature S into the first-layer convolution of the decoder network D, where the information of the style feature S refers to the style tensor obtained by encoding the style image with the encoder, and adjusting the weight $A_1$ of the first-layer convolution of the decoder network based on this information, thereby obtaining the demodulated first-layer decoder weight $A'_1$. This step mainly serves to adapt the pre-trained model. Specifically, the demodulated first-layer decoder weight $A'_1$ is calculated as follows:
wherein $A_1$ is the weight of the first-layer convolution of the decoder network, $A'_1$ is the demodulated first-layer decoder weight, S is the style feature, and $\epsilon$ is a constant term whose function is to keep the denominator of the formula from being 0.
The information of the style feature S is added back into the first-layer convolution of the decoder network to achieve weight demodulation, so that the adjusted weights contain more style information.
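The exact demodulation formula is not reproduced in this text, so the following is only a hedged sketch of one plausible form. It assumes a StyleGAN2-style modulation and demodulation: each input channel of the weight is scaled by a per-channel style value, and each output row is then divided by its L2 norm, with $\epsilon$ keeping the denominator away from zero. The function name `demodulate`, the 2-D weight layout, and the per-channel style scale `s` are hypothetical and are not taken from the patent.

```python
import numpy as np

def demodulate(A1, s, eps=1e-8):
    """Assumed StyleGAN2-style weight demodulation (not confirmed by the patent).

    A1: first-layer weight, shape (out_channels, in_channels)
    s:  hypothetical per-input-channel style scale derived from S, shape (in_channels,)
    """
    modulated = A1 * s[None, :]  # inject style information into the weight
    norm = np.sqrt((modulated ** 2).sum(axis=1, keepdims=True) + eps)
    return modulated / norm      # eps keeps the denominator nonzero

# Random stand-ins for the decoder weight and the style scale.
rng = np.random.default_rng(0)
A1 = rng.standard_normal((8, 16))
s = np.abs(rng.standard_normal(16)) + 0.5
A1_demod = demodulate(A1, s)
```

Under this assumption every row of `A1_demod` has approximately unit norm, so the style scale changes the direction of each weight row without changing its overall magnitude.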
S5, obtaining a separation matrix W based on the FastICA algorithm, wherein the separation matrix W minimizes the correlation among the vectors in the matrix ${A'_1}^{T} A'_1$.
S6, computing a demodulated semantic direction set based on the separation matrix W and the matrix ${A'_1}^{T} A'_1$.
Because the artistic style of an image is extremely complex, most semantic attributes are coupled: changing one attribute is highly likely to change another artistic attribute as well. To decouple the artistic semantics effectively, the correlation among the vectors in the matrix ${A'_1}^{T} A'_1$ must be reduced as far as possible. In this embodiment, the FastICA algorithm is preferably used to find a separation matrix W that is multiplied with the matrix ${A'_1}^{T} A'_1$, i.e., $N = W {A'_1}^{T} A'_1$. Correlation minimization separates the individual artistic semantics as far as possible, so that artistic styles are learned from the hidden space.
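Steps S5 and S6 can be illustrated with a small self-contained FastICA written in NumPy. This is a sketch under stated assumptions, not the inventors' implementation: the whitening step, the tanh nonlinearity, the symmetric decorrelation, the number of directions `k`, and the random stand-in for $A'_1$ are all choices made for the example. The matrix ${A'_1}^{T} A'_1$ is treated as the observed data, with one mixed signal per row.

```python
import numpy as np

def sym_decorrelate(B):
    """Return (B B^T)^(-1/2) B, which makes the rows of B orthonormal."""
    s, u = np.linalg.eigh(B @ B.T)
    return (u / np.sqrt(s)) @ u.T @ B

def fastica_separation(X, k, n_iter=200, tol=1e-8, seed=0):
    """Estimate a k x d separation matrix W for data X (d signals x n samples)."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=1, keepdims=True)           # center each signal
    cov = Xc @ Xc.T / Xc.shape[1]
    eigval, eigvec = np.linalg.eigh(cov)
    top = np.argsort(eigval)[::-1][:k]               # keep the k largest components
    K = (eigvec[:, top] / np.sqrt(eigval[top])).T    # whitening matrix, k x d
    Z = K @ Xc                                       # whitened data, k x n
    B = sym_decorrelate(rng.standard_normal((k, k)))
    for _ in range(n_iter):
        G = np.tanh(B @ Z)
        # Fixed-point update: E[g(y) z^T] - E[g'(y)] * B, applied row by row.
        B_new = G @ Z.T / Z.shape[1] - (1.0 - G ** 2).mean(axis=1, keepdims=True) * B
        B_new = sym_decorrelate(B_new)
        done = np.max(np.abs(np.abs(np.sum(B_new * B, axis=1)) - 1.0)) < tol
        B = B_new
        if done:
            break
    return B @ K

rng = np.random.default_rng(1)
A1_demod = rng.standard_normal((64, 16))   # random stand-in for A'_1
M = A1_demod.T @ A1_demod                  # the matrix A'_1^T A'_1, 16 x 16
W = fastica_separation(M, k=4)
N = W @ M                                  # semantic direction set, one n_i per row
```

Each row of `N` is one candidate semantic direction $n_i$ in the 16-dimensional space of this toy example. Because whitening is followed by an orthogonal rotation, the separated signals `W @ (M - row means)` are mutually uncorrelated, which is the property the separation matrix is meant to provide.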
S7, editing a hidden space vector in the hidden space Z based on an acquired semantic direction, and finally obtaining the style-migrated image with the decoder network D.
Specifically, the hidden space vector is modified to $z' = z + \alpha n_i$, and a new artistic image $I = D(z')$ is finally generated.
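The editing step can be sketched as below. The decoder here is a hypothetical single-layer stand-in (`toy_decoder`) used only so that the example runs; a real decoder network D would take its place. Normalizing the direction to unit length is also an assumption of this sketch, made so that `alpha` has a consistent scale.

```python
import numpy as np

def edit_latent(z, n_i, alpha):
    """Shift hidden space vector z by alpha along semantic direction n_i."""
    return z + alpha * n_i

def toy_decoder(z, D_weight):
    """Hypothetical stand-in for the decoder D: one linear layer with tanh."""
    return np.tanh(D_weight @ z)

rng = np.random.default_rng(0)
z = rng.standard_normal(16)               # hidden space vector
n_i = rng.standard_normal(16)             # stand-in for a learned direction
n_i /= np.linalg.norm(n_i)                # unit length (assumption)
D_weight = rng.standard_normal((32, 16))

alpha = 2.0
z_pos = edit_latent(z, n_i, alpha)        # z' = z + alpha * n_i
z_neg = edit_latent(z, n_i, -alpha)       # z' = z - alpha * n_i
image_pos = toy_decoder(z_pos, D_weight)  # I = D(z')
image_neg = toy_decoder(z_neg, D_weight)
```

A larger `alpha` moves the vector further along the direction and therefore produces a stronger change in the decoded output; the two signs give edits in opposite directions.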
The application effect of the present invention will be further described below by way of specific application examples.
As shown in FIG. 3, the method of the present invention is tested on the AdaIN, Linear, MST, and SANet models. In each set of images, the first column shows the source content image, and the bottom right-hand corner shows the style image; the second column shows the decoded source output artistic image; the third and fourth columns show images with diversified styles obtained by modifying the hidden space vector along a semantic direction in the positive and negative directions, i.e., $z' = z + \alpha n_i$ (third column) and $z' = z - \alpha n_i$ (fourth column), which edits the related attributes of the picture.
The present invention is applicable to a variety of aspects:
1. Entertainment applications
People today rely more and more on internet social networking and place higher demands on its specific applications. The algorithm can be applied to all kinds of software, from drawing software on computers to drawing software on mobile phones. People can easily beautify or modify their own pictures and share them on social platforms. As people's demand for beauty grows, artistic beautification has also gradually emerged. After enjoying the pictures they have taken, users also want to render their favorite pictures in various styles, such as cool-tone style, nostalgic style, photo-by-print style, sketch style, and oil painting style, as shown in FIG. 3.
2. Auxiliary creation tool
With the development of mobile internet technology, intelligent products emerge one after another. In the era of images, colorful and varied picture content is sought after by users, who are eager to beautify and edit the pictures they have just taken and to use them for sharing and communication, label indication, and map rendering. Beautifying photos is becoming a hobby. The algorithm can serve as a user-assisted creation tool; in particular, it helps painters conveniently create artistic works in specific styles, as shown in FIG. 4, and it can be applied to creating computer vision diagrams, fashion design, and the like.
3. Meeting functional requirements
An image style migration function often requires a server with at least one GPU running the Linux operating system. Server-side functions require a server with a network connection and often a large amount of memory, and data persistence requires large-capacity server hard disks. This makes many excellent style migration methods impractical to use. The algorithm greatly reduces the complexity of the model, does not need a large dataset for training, does not need to learn any parameters, and can run on multiple platforms. The algorithm therefore meets both the hardware requirements and the requirements of users.
As shown in FIG. 4, the diversity of the algorithm is verified on four models: AdaIN, Linear, MST, and SANet. The first column in the figure shows the source content image, and the lower right-hand corner shows the style image; the second column shows the decoded source output artistic image; the other columns show the diversified artistic images generated after modifying the hidden space vector along the artistic semantic directions learned from the hidden space.
The application also provides an image style migration device corresponding to the image style migration method of the application, comprising:
an acquisition unit configured to acquire a content image $I_c$ and a style image $I_s$;
an encoding unit configured to input the content image $I_c$ and the style image $I_s$ into a pre-trained encoder network E for feature extraction, so as to obtain a content feature C and a style feature S;
a fusion unit configured to fuse the content feature C and the style feature S through a mathematical operation or a convolutional network and to project the fused image features into the hidden space Z;
a weight adjustment unit configured to input the information of the style feature S into the first-layer convolution of the decoder network D and to adjust the weight $A_1$ of the first-layer convolution of the decoder network based on the style feature S, so as to obtain the demodulated first-layer decoder weight $A'_1$;
a separation matrix acquisition unit configured to obtain a separation matrix W based on the FastICA algorithm, where the separation matrix W minimizes the correlation among the vectors in the matrix ${A'_1}^{T} A'_1$;
a calculation unit configured to compute a demodulated semantic direction set based on the separation matrix W and the matrix ${A'_1}^{T} A'_1$;
and a decoding unit configured to edit a hidden space vector in the hidden space Z based on an acquired semantic direction and finally to obtain the style-migrated image with the decoder network D.
Further, the weight adjustment unit obtains the demodulated first-layer decoder weight $A'_1$ in the following manner:
wherein $A_1$ is the weight of the first-layer convolution of the decoder network, $A'_1$ is the demodulated first-layer decoder weight, S is the style feature, and $\epsilon$ is a constant term.
Further, the calculation unit obtains the demodulated semantic direction set according to the following calculation:
$$N = \{n_1, n_2, \ldots, n_k\} = W {A'_1}^{T} A'_1$$
where N is the demodulated semantic direction set, $n_i$ is the i-th semantic direction, $i = 1, \ldots, k$, W is the separation matrix, $A'_1$ is the demodulated first-layer decoder weight, and ${A'_1}^{T}$ is its transpose.
Further, the decoding unit obtains the edited hidden space vector in the hidden space Z according to the following calculation:
$$z' = z + \alpha n_i$$
wherein $z'$ is the edited hidden space vector, z is the hidden space vector, $n_i$ is the i-th semantic direction, and $\alpha$ is the degree of style change. The value of $\alpha$ is defined manually: to make the image change more pronounced, $\alpha$ is set larger; otherwise, it is set smaller. A change in $\alpha$ changes the hidden space vector, which in turn changes the style of the final image, so $\alpha$ can also be defined as the degree of change of the hidden space vector.
Since the device embodiments of the present invention correspond to the method embodiments above, they are described relatively briefly; for the parts they share, refer to the description of the above embodiments, which is not repeated here.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
In the foregoing embodiments of the present invention, the descriptions of the embodiments are emphasized, and for a portion of this disclosure that is not described in detail in this embodiment, reference is made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technology may be implemented in other manners. The above-described embodiments of the apparatus are merely exemplary, and the division of the units, for example, may be a logic function division, and may be implemented in another manner, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some interfaces, units or modules, or may be in electrical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, or in whole or in part, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium capable of storing program code.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims (4)

1. An image style migration method, characterized by comprising the following steps:
acquiring a content image I_c and a style image I_s;
inputting the content image I_c and the style image I_s into a pre-trained encoder network E for feature extraction, so as to obtain a content feature C and a style feature S;
fusing the content feature C and the style feature S through a mathematical operation or a convolutional network, and projecting the fused image features to a hidden space Z, wherein fusing the content feature C and the style feature S comprises obtaining a fusion result according to the following calculation:
$$\mathrm{AdaIN}(C, S) = \sigma(S)\,\frac{C - \mu(C)}{\sigma(C)} + \mu(S)$$
wherein AdaIN(C, S) is the fusion result of the content feature C and the style feature S, σ(S) is the standard deviation of the style feature S, μ(S) is the mean of the style feature S, σ(C) is the standard deviation of the content feature C, and μ(C) is the mean of the content feature C;
inputting the style tensor, obtained by the encoder's encoding of the style feature S, into the first-layer convolution of the decoder network D, and adjusting the weight A_1 corresponding to the first-layer convolution of the decoder network based on the style feature S to obtain the demodulated first-layer weight A_1', wherein the demodulated decoder first-layer weight A_1' is obtained as follows:
wherein A_1 is the weight corresponding to the first-layer convolution of the decoder network, A_1' is the demodulated decoder first-layer weight, S is the style feature, and ε is a constant term;
obtaining a separation matrix W based on the FastICA algorithm, wherein the separation matrix W minimizes the correlation between the vectors in the matrix A_1'^T A_1';
calculating the demodulated semantic direction set based on the separation matrix W and the matrix A_1'^T A_1', including obtaining the demodulated semantic direction set according to the following calculation:
$$N = \{n_1, n_2, \ldots, n_k\} = W\,{A'_1}^{T} A'_1$$
wherein N is the demodulated semantic direction set, n_i is the i-th semantic direction, i = 1, …, k, W is the separation matrix, A_1' is the demodulated decoder first-layer weight, and A_1'^T is the transpose of the demodulated decoder first-layer weight;
and editing the hidden space vector in the hidden space Z based on the acquired semantic direction, and finally acquiring the image after style migration by combining with the decoder network D.
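The fusion step of claim 1 can be illustrated with a minimal NumPy sketch of AdaIN. It assumes features laid out as (channels, height, width) with the mean and standard deviation taken per channel over the spatial axes; the function name `adain`, that layout, and the small `eps` added to σ(C) for numerical stability are assumptions made for illustration and are not stated in the claim.

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Fuse content feature C with style feature S via AdaIN.

    content, style: arrays of shape (channels, height, width) (assumed layout).
    eps: hypothetical stabiliser so that sigma(C) is never exactly zero.
    """
    # Per-channel statistics over the spatial axes.
    mu_c = content.mean(axis=(1, 2), keepdims=True)
    sigma_c = content.std(axis=(1, 2), keepdims=True) + eps
    mu_s = style.mean(axis=(1, 2), keepdims=True)
    sigma_s = style.std(axis=(1, 2), keepdims=True)
    # AdaIN(C, S) = sigma(S) * (C - mu(C)) / sigma(C) + mu(S)
    return sigma_s * (content - mu_c) / sigma_c + mu_s

# Toy features standing in for the encoder outputs C and S.
rng = np.random.default_rng(0)
C = rng.normal(0.0, 1.0, size=(4, 8, 8))
S = rng.normal(3.0, 2.0, size=(4, 8, 8))
fused = adain(C, S)
```

After fusion, each channel of `fused` carries the channel-wise mean and (up to `eps`) the standard deviation of `S`, while the spatial pattern still comes from `C`.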
2. The image style migration method of claim 1, wherein editing the hidden space vector in the hidden space Z based on the acquired semantic direction comprises obtaining the edited hidden space vector according to the following calculation:
$$z' = z + \alpha n_i$$
wherein z' is the edited hidden space vector, z is the hidden space vector, α is the preset degree of style change, and n_i is the i-th semantic direction.
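The edit in claim 2 is a single shift of the hidden space vector along one semantic direction. The sketch below shows it in NumPy; the function name `edit_latent`, the toy vector, and the example direction are hypothetical and only serve to show how α scales the shift.

```python
import numpy as np

def edit_latent(z, direction, alpha):
    """Return z' = z + alpha * n_i for one semantic direction n_i."""
    z = np.asarray(z, dtype=float)
    n_i = np.asarray(direction, dtype=float)
    if z.shape != n_i.shape:
        raise ValueError("z and the semantic direction must have the same shape")
    return z + alpha * n_i

z = np.array([0.5, -1.0, 2.0])     # hypothetical hidden space vector
n_1 = np.array([1.0, 0.0, 0.0])    # hypothetical semantic direction
# Sweeping alpha moves z along n_1; alpha = 0 leaves z unchanged.
edits = {alpha: edit_latent(z, n_1, alpha) for alpha in (-2.0, 0.0, 2.0)}
```

Each edited vector would then be passed through the decoder network D to produce an image; the sign and magnitude of α control the direction and strength of the style change.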
3. An image style migration apparatus for implementing the image style migration method according to claim 1, comprising:
an acquisition unit configured to acquire a content image I_c and a style image I_s;
an encoding unit configured to input the content image I_c and the style image I_s into a pre-trained encoder network E for feature extraction, thereby obtaining a content feature C and a style feature S, wherein the content feature C and the style feature S are fused and the fusion result is obtained according to the following calculation:
$$\mathrm{AdaIN}(C, S) = \sigma(S)\,\frac{C - \mu(C)}{\sigma(C)} + \mu(S)$$
wherein AdaIN(C, S) is the fusion result of the content feature C and the style feature S, σ(S) is the standard deviation of the style feature S, μ(S) is the mean of the style feature S, σ(C) is the standard deviation of the content feature C, and μ(C) is the mean of the content feature C;
a fusion unit configured to fuse the content feature C and the style feature S through a mathematical operation or a convolutional network and to project the fused image features to the hidden space Z;
a weight adjustment unit configured to input information of the style feature S into the first-layer convolution of the decoder network D and to adjust the weight A_1 corresponding to the first-layer convolution of the decoder network based on the style feature S, thereby obtaining a demodulated first-layer weight A_1', wherein the demodulated first-layer weight A_1' is obtained as follows:
wherein A_1 is the weight corresponding to the first-layer convolution of the decoder network, A_1' is the demodulated decoder first-layer weight, S is the style feature, and ε is a constant term;
a separation matrix acquisition unit configured to acquire a separation matrix W based on the FastICA algorithm, wherein the separation matrix W minimizes the correlation between the vectors in the matrix A_1'^T A_1';
a calculating unit configured to calculate the demodulated semantic direction set based on the separation matrix W and the matrix A_1'^T A_1', including obtaining the demodulated semantic direction set according to the following calculation:
$$N = \{n_1, n_2, \ldots, n_k\} = W\,{A'_1}^{T} A'_1$$
wherein N is the demodulated semantic direction set, n_i is the i-th semantic direction, i = 1, …, k, W is the separation matrix, A_1' is the demodulated decoder first-layer weight, and A_1'^T is the transpose of the demodulated decoder first-layer weight;
and a decoding unit configured to edit the hidden space vector in the hidden space Z based on the acquired semantic direction and to finally obtain the style-migrated image in combination with the decoder network D.
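The weight adjustment unit of claim 3 can be sketched as follows. The exact expression for A_1' is not shown in this text, so the sketch assumes the modulation and demodulation scheme used in StyleGAN2: each input channel of the weight is scaled by a style value, and each output filter is then rescaled to unit norm, with ε guarding against division by zero. The function name `demodulate`, the weight layout (out_channels, in_channels, kernel_h, kernel_w), and the per-input-channel style vector are assumptions, not details confirmed by the claims.

```python
import numpy as np

def demodulate(weight, style, eps=1e-8):
    """Style-modulate a convolution weight, then demodulate it.

    weight: (out_ch, in_ch, kh, kw) first-layer weight A_1 (assumed layout).
    style:  (in_ch,) per-input-channel scales derived from S (assumption).
    eps:    the constant term that keeps the denominator non-zero.
    """
    # Modulation: scale every input channel by its style value.
    modulated = weight * style[None, :, None, None]
    # Demodulation: normalise each output filter to (approximately) unit L2 norm.
    norm = np.sqrt((modulated ** 2).sum(axis=(1, 2, 3), keepdims=True) + eps)
    return modulated / norm

rng = np.random.default_rng(0)
A1 = rng.normal(size=(16, 8, 3, 3))     # toy first-layer weight
s = rng.uniform(0.5, 1.5, size=8)       # toy style scales
A1_demod = demodulate(A1, s)
```

Under this assumption the style changes the relative contribution of the input channels, while the normalisation keeps the scale of every output filter fixed regardless of the style.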
4. The image style migration apparatus according to claim 3, wherein the decoding unit obtains the edited hidden space vector in the hidden space Z according to the following calculation:
$$z' = z + \alpha n_i$$
wherein z' is the edited hidden space vector, z is the hidden space vector, α is the degree of change of the hidden space vector, that is, the degree of style change, and n_i is the i-th semantic direction.
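The semantic directions n_i used in the edit above come from the separation matrix W and the matrix A_1'^T A_1' recited in claims 1 and 3. The following is a minimal sketch of that computation with a hand-written symmetric FastICA (tanh nonlinearity) in NumPy. Treating A_1' as a 2-D matrix of shape (rows, latent_dim), centring and whitening the rows before the ICA iteration, and the names `sym_decorrelate` and `fastica_separation` are all assumptions made for illustration.

```python
import numpy as np

def sym_decorrelate(B):
    """Symmetric decorrelation: B <- (B B^T)^(-1/2) B, giving orthonormal rows."""
    s, u = np.linalg.eigh(B @ B.T)
    s = np.clip(s, np.finfo(float).tiny, None)
    return (u / np.sqrt(s)) @ u.T @ B

def fastica_separation(X, k, n_iter=200, tol=1e-6, seed=0):
    """Estimate a (k, n_rows) separation matrix W for the rows of X."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=1, keepdims=True)          # centre each row
    cov = Xc @ Xc.T / Xc.shape[1]
    eigval, eigvec = np.linalg.eigh(cov)
    top = np.argsort(eigval)[::-1][:k]              # k largest eigenvalues
    K = (eigvec[:, top] / np.sqrt(eigval[top])).T   # whitening matrix, (k, n_rows)
    Z = K @ Xc                                      # whitened data, (k, n_cols)
    B = sym_decorrelate(rng.normal(size=(k, k)))
    for _ in range(n_iter):
        G = np.tanh(B @ Z)
        # Fixed-point update: E[z g(b^T z)] - E[g'(b^T z)] b for every row b.
        B_new = G @ Z.T / Z.shape[1] - (1.0 - G ** 2).mean(axis=1)[:, None] * B
        B_new = sym_decorrelate(B_new)
        # Converged when every row is parallel to its previous value.
        delta = np.max(np.abs(np.abs(np.einsum("ij,ij->i", B_new, B)) - 1.0))
        B = B_new
        if delta < tol:
            break
    return B @ K

rng = np.random.default_rng(1)
A1_demod = rng.normal(size=(128, 64))   # hypothetical flattened A_1', (rows, latent_dim)
M = A1_demod.T @ A1_demod               # A_1'^T A_1', shape (64, 64)
k = 5
W = fastica_separation(M, k)            # separation matrix, shape (5, 64)
N = W @ M                               # row i is the semantic direction n_i
```

Because the iteration works on whitened data with orthonormal rows, the separated components are uncorrelated by construction, which matches the stated role of W. A library routine such as `sklearn.decomposition.FastICA` could be used instead of the hand-written loop.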
CN202111302183.9A 2021-11-04 2021-11-04 Image style migration method and device Active CN113888400B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111302183.9A CN113888400B (en) 2021-11-04 2021-11-04 Image style migration method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111302183.9A CN113888400B (en) 2021-11-04 2021-11-04 Image style migration method and device

Publications (2)

Publication Number Publication Date
CN113888400A (en) 2022-01-04
CN113888400B (en) 2024-04-26

Family

ID=79017059

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111302183.9A Active CN113888400B (en) 2021-11-04 2021-11-04 Image style migration method and device

Country Status (1)

Country Link
CN (1) CN113888400B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106651766A (en) * 2016-12-30 2017-05-10 深圳市唯特视科技有限公司 Image style migration method based on deep convolutional neural network
JP2018132855A (en) * 2017-02-14 2018-08-23 国立大学法人電気通信大学 Image style conversion apparatus, image style conversion method and image style conversion program
CN111325681A (en) * 2020-01-20 2020-06-23 南京邮电大学 Image style migration method combining meta-learning mechanism and feature fusion
CN112837212A (en) * 2021-01-28 2021-05-25 南京大学 Image arbitrary style migration method based on manifold alignment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
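吉燕妮 (Ji Yanni). Research on the improvement and implementation of style transfer algorithms based on deep learning. Full text. *
柳东静 (Liu Dongjing). Research on image style transfer technology based on semantic matching and style sampling. Full text. *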

Also Published As

Publication number Publication date
CN113888400A (en) 2022-01-04

Similar Documents

Publication Publication Date Title
Esser et al. Structure and content-guided video synthesis with diffusion models
CN111465965B (en) System and method for real-time complex character animation and interactivity
CN113269872A (en) Synthetic video generation method based on three-dimensional face reconstruction and video key frame optimization
CN112837210B (en) Multi-shape variable-style face cartoon automatic generation method based on feature map segmentation
Clarke et al. Automatic generation of 3D caricatures based on artistic deformation styles
Huang et al. Diffstyler: Controllable dual diffusion for text-driven image stylization
CN113362422B (en) Shadow robust makeup transfer system and method based on decoupling representation
CN117496072B (en) Three-dimensional digital person generation and interaction method and system
Ye et al. 3D-CariGAN: an end-to-end solution to 3D caricature generation from normal face photos
Huang et al. Real-world automatic makeup via identity preservation makeup net
Guo et al. Sparsectrl: Adding sparse controls to text-to-video diffusion models
CN117522697A (en) Face image generation method, face image generation system and model training method
Tan et al. Semantic probability distribution modeling for diverse semantic image synthesis
CN113888400B (en) Image style migration method and device
CN117095071A (en) Picture or video generation method, system and storage medium based on main body model
CN112837212A (en) Image arbitrary style migration method based on manifold alignment
CN110097615B (en) Stylized and de-stylized artistic word editing method and system
Huo et al. CAST: Learning both geometric and texture style transfers for effective caricature generation
Togo et al. Text-guided style transfer-based image manipulation using multimodal generative models
Shen et al. Overview of Cartoon Face Generation
Bagwari et al. An edge filter based approach of neural style transfer to the image stylization
Xing et al. Stylized Image Generation based on Music-image Synesthesia Emotional Style Transfer using CNN Network.
Wang et al. Uncouple generative adversarial networks for transferring stylized portraits to realistic faces
Chen et al. The Transfer of Film Style Based on Meta-Learning
Bucciero et al. Portrait2Bust: DualStyleGAN-based portrait image stylization based on bust sculpture images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant