CN109033095B - Target transformation method based on attention mechanism - Google Patents
- Publication number
- CN109033095B (application CN201810866277.0A)
- Authority
- CN
- China
- Prior art keywords
- attention
- layer
- image
- model
- calculating
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/10—Image acquisition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Multimedia (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention relates to a target transformation method based on an attention mechanism, which comprises the following steps for training a neural network model: step 1, initializing the parameters of the neural network model with random numbers; step 2, inputting an image x belonging to the category X into the generator G of the model, entering the encoding stage, and computing the first-layer feature map f_1 from x through a convolution layer. The trained neural network model is then used to perform target transformation on images. By introducing an attention mechanism into the model, the model can identify the target object to be converted in a target change task and thus distinguish the target from the background. Meanwhile, the consistency between the backgrounds of the original image and the converted image is ensured by constructing an attention consistency loss function and a background consistency loss function.
Description
Technical Field
The invention relates to image translation, and in particular to a target transformation method based on an attention mechanism.
Background
Target transformation (object transformation) is a special task within image translation whose purpose is to transform a specific type of object in an image into another type of object. Image translation aims to convert an original image into an image of a target style by learning the mapping relationship between two classes of images, and has recently been applied in many areas such as image super-resolution reconstruction and artistic style transfer. Researchers have proposed many effective conversion methods under supervised conditions. However, because acquiring paired data incurs large labor and time costs, conversion under unsupervised conditions has become a research hotspot in image translation. Visual Attribute Transfer (VAT) is a representative approach based on convolutional neural networks (CNNs); it uses features at different levels of the model to match the most likely corresponding features in another image. In addition, methods using a Generative Adversarial Network (GAN) achieve more significant effects than CNN-based methods. Isola P. et al. explored the potential of GANs in image translation tasks. Subsequently, the cycle-consistency loss was proposed by Zhu J.Y. et al. to solve the problem of unsupervised image translation; it assumes that the mapping learned in an image translation task is a bidirectional mapping, thereby enhancing the model's image translation effect in an unsupervised environment.
The traditional technology has the following technical problems:
Most current image translation methods do not take into account the difference between the conversion target and the background region. In a target change task, most models have difficulty effectively distinguishing the conversion target from the background and cannot ensure consistency between the backgrounds of the original and converted images. As a result, the model introduces blurring, discoloration and similar artifacts into the image background during conversion, reducing the quality of the converted image.
Disclosure of Invention
In view of the above, it is necessary to provide a target transformation method based on an attention mechanism which, by introducing an attention mechanism into the model, enables the model to identify the target object to be converted in a target change task and thereby distinguish the target from the background. Meanwhile, the consistency between the backgrounds of the original image and the converted image is ensured by constructing an attention consistency loss function and a background consistency loss function.
An attention-based target transformation method, comprising:
training a neural network model:
step 1, initializing parameters of a neural network model by using random numbers;
step 2, inputting an image x belonging to the category X into the generator G of the model, entering the encoding stage, and computing the first-layer feature map f_1 from x through a convolution layer;
step 3, f_1 then traverses two branch networks: (a) one convolution layer yields the second-layer feature map f̃_2 without attention-mask processing; (b) two convolution layers followed by one deconvolution layer yield the attention mask M_2 corresponding to f̃_2; M_2 and f̃_2 are multiplied element by element, and the product is added element by element to f̃_2 to obtain the processed second-layer feature map f_2;
step 4, f_2 yields the next-layer feature map f_3 in the manner of step 3; then f_3 passes through 6 residual convolution layers with convolution kernel size 3 × 3 and stride 1 to obtain finer features;
step 5, entering the decoding stage, with deconvolution layers as the decoder; f_3 traverses two branch networks: (a) one deconvolution layer yields the feature map f̃_4 without attention-mask processing; (b) two deconvolution layers followed by one convolution layer yield the attention mask M_4 corresponding to f̃_4; M_4 and f̃_4 are multiplied element by element, and the product is added element by element to f̃_4 to obtain the processed feature map f_5;
step 6, entering the output stage, f_5 traverses two branch networks: (a) one deconvolution layer yields the converted image y′; (b) two deconvolution layers and one convolution layer yield the attention mask M_G(x) corresponding to y′;
step 7, inputting y′ into the other generator F and, after the same operations as steps 2–6, obtaining x′ and the corresponding attention mask M_F(G(x));
Step 8, inputting x and x' into a discriminator D x Middle, discriminator D x The probability that the input image belongs to the category X is returned; likewise, y and y' are input to the discriminator D Y Obtaining the probability that Y and Y' belong to the category Y; the value of the opposition loss function is thus calculated:
step 9, calculating the value of the cycle-consistency loss function from x, x′, y, y′:
L_cyc(G, F) = ||x′ − x||_1 + ||y′ − y||_1 #(3)
step 10, using M_G(x) to separate the background from the conversion target in x and y′, and calculating the background change loss:
L_bg(x, G) = γ * ||B(x, M_G(x)) − B(y′, M_G(x))||_1 #(4)
B(x, M_G(x)) = H(x, 1 − M_G(x)) #(5)
γ is set to 0.000075 to 0.0075; the H(K, L) function denotes the element-wise product of K and L; likewise, M_F(G(x)) together with y and x may be used to calculate the background change loss L_bg(y, F);
step 11, calculating the attention change loss with M_G(x) and M_F(G(x)):
L_att(x, G, F) = α * ||M_G(x) − M_F(G(x))||_1 + β * (M_G(x) + M_F(G(x))) #(6)
α is set to 0.000003 to 0.00015, and β is set to 0.0000005 to 0.00005;
step 12, adjusting the model parameters according to the errors obtained in steps 8–11 by a back-propagation algorithm with a learning rate of 0.00002 to 0.002;
step 13, taking y as the input image and calculating the error through the operations of steps 2–11, except that y passes first through generator F and then through generator G; adjusting the model parameters according to the method of step 12;
step 14, repeating steps 2–13 until the model parameters converge;
finally, carrying out target transformation on images by using the trained neural network model.
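Step 12's parameter adjustment can be sketched as a single gradient step. This is a toy scalar illustration of back-propagation with the stated learning rate, not the patent's actual optimizer; `gradient_step` is a hypothetical helper name, and the real model updates all network weights (with Adam in one embodiment):

```python
import numpy as np

def gradient_step(params, grads, lr=0.0002):
    """One back-propagation update: move each parameter against its
    gradient, scaled by the learning rate (0.00002 to 0.002 in the text)."""
    return params - lr * grads

# Toy parameters and toy gradients of the total loss with respect to them.
params = np.array([0.5, -0.3])
grads = np.array([2.0, -1.0])
updated = gradient_step(params, grads)
# 0.5 - 0.0002*2.0 = 0.4996 ; -0.3 - 0.0002*(-1.0) = -0.2998
```

Repeating such updates over steps 8–11's errors until the parameters converge corresponds to steps 12–14.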
The above target transformation method based on the attention mechanism introduces an attention mechanism into the model so that the model can identify the target object to be converted in the target change task, thereby distinguishing the target from the background. Meanwhile, the consistency between the backgrounds of the original image and the converted image is ensured by constructing an attention consistency loss function and a background consistency loss function.
In another embodiment, α is set to 0.000015.
In another embodiment, β is set to 0.000005.
In another embodiment, γ is set to 0.00075.
In another embodiment, the back propagation algorithm is optimized by Adam.
In another embodiment, the learning rate of the back propagation algorithm is 0.0002.
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of any of the methods when the program is executed.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of any of the methods.
A processor for running a program, wherein the program when running performs any of the methods.
Drawings
Fig. 1 is an overall schematic diagram of a model structure of an attention-based target transformation method according to an embodiment of the present application.
Fig. 2 shows three different DAU structures in the attention-based target transformation method provided by an embodiment of the present application. (DAU_decode and DAU_final are structurally identical, differing only in the depth of the output attention mask.)
FIG. 3 is a comparison of experimental results of an attention-based objective transformation method with the CycleGAN and VAT methods on ImageNet datasets, provided by an embodiment of the present application.
FIG. 4 is a comparison of the experimental results of the attention-based target transformation method with the CycleGAN and VAT methods on the CelebA data set, provided by an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
A specific application scenario of the present invention is described below:
The invention studies how to enable a model to distinguish the target from the background while learning to map an image set X containing one type of object to an image set Y containing another type of object. The figure below shows the architecture of the model. Our model comprises 4 modules: generator G, generator F, discriminator D_X, and discriminator D_Y. G is used to learn the mapping function G: X → Y; the generator F learns the inverse mapping function F: Y → X. D_X is used to distinguish the original image x from the converted image F(y); correspondingly, D_Y distinguishes the original image y from the converted image G(x). We build a Deep Attention Unit (DAU) into both generator G and generator F to extract the critical regions.
(1) Deep attention unit:
An attention mask M is extracted by constructing a Deep Attention Unit (DAU), giving the model the ability to distinguish the target from the background. The structure of the generator after adding the deep attention unit is shown in the lower part of FIG. 1.
In the encoding stage (Encode Stage), as shown in the lower half of FIG. 1, given the feature map f_{n−1} of the (n−1)-th layer of an input image x (n ∈ {2, 3}), a convolution layer is used as the encoder to obtain the next-layer feature map f̃_n of x.
As shown in FIG. 2(a), DAU_encode first encodes the input through two convolution layers and then applies one up-sampling deconvolution layer with the sigmoid function (y = 1/(1 + e^(−x))) as its activation function, obtaining a mask M_n whose size is consistent with the feature map f̃_n.
In the decoding stage and the output stage, as shown in FIG. 2(b), deep attention units are likewise used, denoted DAU_decode and DAU_final; their processing is the converse of DAU_encode's, with the roles of the convolution and deconvolution layers exchanged.
The sigmoid function takes values in [0, 1], so the attention mask M_n can be seen as a weight distribution over f̃_n, enhancing the expression of meaningful features and suppressing meaningless information. We take the element-wise product of M_n and f̃_n, denoted H(·,·). Furthermore, following the residual network and the residual attention network, we add a shortcut connection to suppress the vanishing-gradient problem.
Through this operation, the n-th-layer feature map f_n is finally obtained:
f_n = H(M_n, f̃_n) + f̃_n
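The mask-and-shortcut operation of the DAU can be sketched in NumPy as follows. This is a toy illustration, not the patent's actual network: the convolution and deconvolution branches are replaced by fixed arrays, and `sigmoid` and `dau_merge` are hypothetical helper names:

```python
import numpy as np

def sigmoid(z):
    # Squash mask logits into [0, 1] so they act as per-element weights.
    return 1.0 / (1.0 + np.exp(-z))

def dau_merge(feat, mask_logits):
    """Combine an unmasked feature map with its attention mask.

    feat        : feature map f~_n from the plain conv/deconv branch
    mask_logits : raw output of the mask branch (before the sigmoid)
    Returns the element-wise product plus a residual shortcut,
    i.e. H(M_n, f~_n) + f~_n as described in steps 3 and 5.
    """
    mask = sigmoid(mask_logits)   # M_n in [0, 1]
    return mask * feat + feat     # element-wise product + shortcut

# Toy 2x2 single-channel feature map and mask logits.
feat = np.array([[1.0, 2.0], [3.0, 4.0]])
logits = np.array([[10.0, -10.0], [0.0, 0.0]])  # sigmoid ~1, ~0, 0.5, 0.5

out = dau_merge(feat, logits)
# Where the mask is near 1 the feature is roughly doubled; where it is
# near 0 the shortcut passes the feature through almost unchanged.
```

The shortcut term is why the mask can only amplify or preserve features, never zero them out entirely, which mirrors the residual-attention design referenced in the text.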
(2) Cycle-consistency loss function:
CycleGAN uses a cycle-consistency loss function to improve the image-translation effect. This idea is referred to as dual learning (Dual Learning) in the field of machine translation: for each image x in the data set X, the conversion cycle should map x back to the original image, x′ = F(y′) = F(G(x)) ≈ x, and correspondingly G(F(y)) ≈ y. Since the model here also has a dual-learning structure, we likewise use the cycle-consistency loss function to improve the conversion effect of the model:
L_cyc(G, F) = ||F(G(x)) − x||_1 + ||G(F(y)) − y||_1 #(6)
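A toy NumPy illustration of this loss follows. The generators are not modelled; the reconstructions F(G(x)) and G(F(y)) are passed in directly, and `cycle_consistency_loss` is a hypothetical helper name:

```python
import numpy as np

def cycle_consistency_loss(x, x_rec, y, y_rec):
    """L_cyc(G, F) = ||F(G(x)) - x||_1 + ||G(F(y)) - y||_1.

    x_rec = F(G(x)) and y_rec = G(F(y)) are the reconstructed images;
    the L1 norm is taken element-wise and summed.
    """
    return np.abs(x_rec - x).sum() + np.abs(y_rec - y).sum()

# Toy images: perfect reconstruction gives zero loss; a constant 0.5
# offset on the 2x2 reconstruction of x gives 4 * 0.5 = 2.0.
x = np.ones((2, 2))
y = np.zeros((2, 2))
loss_perfect = cycle_consistency_loss(x, x, y, y)
loss_off = cycle_consistency_loss(x, x + 0.5, y, y)
```

The loss is zero exactly when both conversion cycles return the original images, which is the bidirectional-mapping assumption stated above.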
(3) Attention consistency loss function:
Considering that the spatial position of the target in the image should remain unchanged during the conversion F(G(x)), an Attention Consistency Loss is constructed here to constrain the model:
L_att(x, G, F) = α * ||M_G(x) − M_F(G(x))||_1 + β * (M_G(x) + M_F(G(x))) #(7)
M_G(x) and M_F(G(x)) denote the masks output by the last layer of the model while generating G(x) and F(G(x)), respectively; the value of each element represents the probability that the corresponding element of the original image belongs to the conversion target. The second term is a regularization term that prevents over-fitting of the model. α and β are the weights of the two terms.
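A NumPy sketch of equation (7) on toy masks, under the assumption that the regularization term β * (M_G(x) + M_F(G(x))) is reduced by summing over all mask elements (the text does not spell out the reduction); the α and β defaults follow the embodiment values, and `attention_consistency_loss` is a hypothetical helper name:

```python
import numpy as np

def attention_consistency_loss(m_g, m_f, alpha=0.000015, beta=0.000005):
    """alpha * ||M_G(x) - M_F(G(x))||_1 + beta * sum of both masks.

    The first term penalises the target region moving between the two
    generator passes; the second discourages the masks from lighting up
    everywhere (regularization against over-fitting).
    """
    return alpha * np.abs(m_g - m_f).sum() + beta * (m_g.sum() + m_f.sum())

# Toy masks: one extra pixel lights up in the second pass.
m_g = np.array([[1.0, 0.0], [0.0, 0.0]])
m_f = np.array([[1.0, 0.0], [0.0, 1.0]])
loss = attention_consistency_loss(m_g, m_f)
# |difference| sums to 1, masks sum to 3: 0.000015*1 + 0.000005*3
```

The tiny weights mean this term only nudges the masks; the adversarial and cycle losses still dominate training.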
(4) Background consistency loss function:
Once the DAU obtains the attention mask corresponding to the feature map, the model can distinguish the target from the background. A Background Consistency Loss is therefore constructed here:
L_bg(x, G) = γ * ||B(x, M_G(x)) − B(G(x), M_G(x))||_1 #(8)
B(x, M_G(x)) = H(x, 1 − M_G(x)) #(9)
γ is a hyperparameter. B(x, M_G(x)) is a background function: the value of each element of 1 − M_G(x) represents the probability that the corresponding element of the original image belongs to the background. Taking the element-wise product of x and 1 − M_G(x) yields the background of x; B(G(x), M_G(x)) is obtained in the same way.
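Equations (8) and (9) can be sketched in NumPy on toy images. This is an illustration only; `background` and `background_consistency_loss` are hypothetical helper names, and γ defaults to the embodiment value 0.00075:

```python
import numpy as np

def background(img, mask):
    """B(img, M) = H(img, 1 - M): keep the pixels the mask assigns to background."""
    return img * (1.0 - mask)

def background_consistency_loss(x, gx, mask, gamma=0.00075):
    """L_bg = gamma * ||B(x, M) - B(G(x), M)||_1."""
    return gamma * np.abs(background(x, mask) - background(gx, mask)).sum()

x = np.array([[0.2, 0.8], [0.4, 0.6]])
gx = np.array([[0.2, 0.1], [0.4, 0.6]])    # generator changed only pixel (0,1)
mask = np.array([[0.0, 1.0], [0.0, 0.0]])  # ...which the mask marks as target

# The changed pixel is masked out of the background, so the loss is zero;
# with an all-zero mask the same change is charged to the background.
loss_masked = background_consistency_loss(x, gx, mask)
loss_unmasked = background_consistency_loss(x, gx, np.zeros((2, 2)))
```

This shows the intended behaviour: edits inside the attended target region are free, while edits outside it are penalised.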
(5) Adversarial loss function:
The realism of the generated images can be enhanced by an adversarial loss. For the mapping function G: X → Y and its discriminator D_Y, it is expressed as:
L_GAN(G, D_Y, X, Y) = E_y[log D_Y(y)] + E_x[log(1 − D_Y(G(x)))]
G attempts to make the generated image G(x) indistinguishable from the images of data set Y, while D_Y aims to distinguish G(x) from y as far as possible. The goal of G is to minimize this objective function, whereas D tries to maximize it.
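A toy NumPy evaluation of this adversarial value (discriminator outputs are supplied directly as probability arrays; `adversarial_loss` is a hypothetical helper name):

```python
import numpy as np

def adversarial_loss(d_real, d_fake):
    """Standard GAN value for one generator/discriminator pair.

    d_real : D_Y(y),     discriminator scores on real target-class images
    d_fake : D_Y(G(x)),  scores on translated images
    Returns E[log D_Y(y)] + E[log(1 - D_Y(G(x)))].
    D_Y wants to maximise this; G wants to minimise it.
    """
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# A confident, correct discriminator drives the value towards 0;
# a fooled one (scores near 0.5 everywhere) gives about 2 * log(0.5).
confident = adversarial_loss(np.array([0.99, 0.99]), np.array([0.01, 0.01]))
fooled = adversarial_loss(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
```

The gap between the two values is exactly what gradient updates to G and D_Y push in opposite directions.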
(6) The complete objective function:
This translates into a min–max optimization problem: the complete objective is the sum of the adversarial, cycle-consistency, attention-consistency, and background-consistency losses,
L(G, F, D_X, D_Y) = L_GAN(G, D_Y, X, Y) + L_GAN(F, D_X, Y, X) + L_cyc(G, F) + L_att(x, G, F) + L_bg(x, G) + L_bg(y, F)
and the model solves G*, F* = arg min_{G,F} max_{D_X,D_Y} L(G, F, D_X, D_Y).
the invention has the advantages that the model can effectively identify the target object in the image, neglect irrelevant background and further improve the final visual nominal effect, and the model obtains the best effect on a plurality of comparison experiments with other current most methods.
This work first constructs a Deep Attention Unit (DAU) module based on the attention mechanism. The purpose of this module is to identify the target object in the image, thereby guiding the model to eliminate background interference and further improving the conversion effect.
The experiments were validated on two data sets, ImageNet and CelebA. ImageNet is a large-scale image data set designed for machine-vision research. We extracted 995 apple images, 1019 orange images, 1067 horse images, and 1334 zebra images from ImageNet for training the model.
Fig. 3 shows the comparative results on the ImageNet data set and Fig. 4 shows the comparative results on the CelebA data set. Clearly, CycleGAN and VAT strongly affect the background of the original image: for example, in the second column of Fig. 3(a)(b), the leaves fade from green to gray. In Fig. 4, the conversion by VAT fails completely: the face in the converted image is severely deformed and the expected conversion features do not appear; for example, in the no-glasses → glasses conversion of Fig. 4(b), VAT does not produce an image with glasses on the face. In contrast, the DAU-GAN method not only completes the conversion task successfully but also effectively preserves the background of the original image; for example, in the horse → zebra conversion of Fig. 3(c), the zebra images generated by DAU-GAN preserve the background and show more natural stripes.
Table 1. Mean background change value for each type of converted image.
To demonstrate the effectiveness of our method more precisely, we quantitatively measured the mean change of the converted-image background over the test set; Table 1 shows the results. For every conversion, the background change value of the DAU-GAN converted images is the smallest, which strongly demonstrates that our model preserves the background during target changes.
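The exact formula behind Table 1's metric is not given in the text; a plausible sketch, assuming the metric is the mean absolute per-pixel change weighted by the background probability 1 − M, is shown below (`mean_background_change` is a hypothetical name):

```python
import numpy as np

def mean_background_change(x, y_prime, mask):
    """Mean absolute per-pixel change restricted to the background.

    Assumption (not stated in the text): weight each pixel's change by
    its background probability 1 - M and average over that weight.
    """
    bg = 1.0 - mask
    return (np.abs(y_prime - x) * bg).sum() / bg.sum()

x = np.array([[0.0, 1.0], [0.0, 0.0]])
y_prime = np.array([[0.1, 0.0], [0.1, 0.1]])
mask = np.array([[0.0, 1.0], [0.0, 0.0]])  # pixel (0,1) is the target

# Only the three background pixels (each changed by 0.1) count;
# the large change at the target pixel is excluded by the mask.
change = mean_background_change(x, y_prime, mask)
```

Under this reading, a smaller value means the converter left the background more intact, matching the comparison the table reports.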
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention. Therefore, the protection scope of the present patent should be subject to the appended claims.
Claims (9)
1. An attention-based target transformation method, comprising:
training a neural network model:
step 1, initializing parameters of a neural network model by using random numbers;
step 2, inputting an image x belonging to the category X into the generator G of the model, entering the encoding stage, and computing the first-layer feature map f_1 from x through a convolution layer;
step 3, f_1 then traverses two branch networks: (a) one convolution layer yields the second-layer feature map f̃_2 without attention-mask processing; (b) two convolution layers followed by one deconvolution layer yield the attention mask M_2 corresponding to f̃_2; M_2 and f̃_2 are multiplied element by element, and the product is added element by element to f̃_2 to obtain the processed second-layer feature map f_2;
step 4, f_2 yields the next-layer feature map f_3 in the manner of step 3; then f_3 passes through 6 residual convolution layers with convolution kernel size 3 × 3 and stride 1 to obtain finer features;
step 5, entering the decoding stage, with deconvolution layers as the decoder; f_3 traverses two branch networks: (a) one deconvolution layer yields the feature map f̃_4 without attention-mask processing; (b) two deconvolution layers followed by one convolution layer yield the attention mask M_4 corresponding to f̃_4; M_4 and f̃_4 are multiplied element by element, and the product is added element by element to f̃_4 to obtain the processed feature map f_5;
step 6, entering the output stage, f_5 traverses two branch networks: (a) one deconvolution layer yields the converted image y′; (b) two deconvolution layers and one convolution layer yield the attention mask M_G(x) corresponding to y′;
step 7, inputting y′ into the other generator F and, after the same operations as steps 2–6, obtaining x′ and the corresponding attention mask M_F(G(x));
step 8, inputting x and x′ into the discriminator D_X, which returns the probability that the input image belongs to the category X; likewise, inputting y and y′ into the discriminator D_Y to obtain the probability that y and y′ belong to the category Y; the values of the adversarial loss functions are thus calculated:
L_GAN(G, D_Y, X, Y) = E_y[log D_Y(y)] + E_x[log(1 − D_Y(G(x)))] #(1)
L_GAN(F, D_X, Y, X) = E_x[log D_X(x)] + E_y[log(1 − D_X(F(y)))] #(2)
step 9, calculating the value of the cycle-consistency loss function from x, x′, y, y′:
L_cyc(G, F) = ||x′ − x||_1 + ||y′ − y||_1 #(3)
step 10, using M_G(x) to separate the background from the conversion target in x and y′, and calculating the background change loss:
L_bg(x, G) = γ * ||B(x, M_G(x)) − B(y′, M_G(x))||_1 #(4)
B(x, M_G(x)) = H(x, 1 − M_G(x)) #(5)
γ is set to 0.000075 to 0.0075; the H(K, L) function denotes the element-wise product of K and L; likewise, M_F(G(x)) together with y and x is used to calculate the background change loss L_bg(y, F);
step 11, calculating the attention change loss with M_G(x) and M_F(G(x)):
L_att(x, G, F) = α * ||M_G(x) − M_F(G(x))||_1 + β * (M_G(x) + M_F(G(x))) #(6)
α is set to 0.000003 to 0.00015, and β is set to 0.0000005 to 0.00005;
step 12, adjusting the model parameters according to the errors obtained in steps 8–11 by a back-propagation algorithm with a learning rate of 0.00002 to 0.002;
step 13, taking y as the input image and calculating the error through the operations of steps 2–11, except that y passes first through generator F and then through generator G; adjusting the model parameters according to the method of step 12;
step 14, repeating steps 2–13 until the model parameters converge;
and carrying out target transformation on images by using the trained neural network model.
2. The attention-based target transformation method of claim 1, wherein α is set to 0.000015.
3. The attention-based target transformation method of claim 1, wherein β is set to 0.000005.
4. The attention-based target transformation method of claim 1, wherein γ is set to 0.00075.
5. The attention-based target transformation method of claim 1, wherein the back-propagation algorithm is optimized with Adam.
6. The attention-based target transformation method of claim 1, wherein the learning rate of the back-propagation algorithm is 0.0002.
7. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 6 are implemented when the program is executed by the processor.
8. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
9. A processor, characterized in that the processor is configured to run a program, wherein the program when running performs the method of any of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810866277.0A CN109033095B (en) | 2018-08-01 | 2018-08-01 | Target transformation method based on attention mechanism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810866277.0A CN109033095B (en) | 2018-08-01 | 2018-08-01 | Target transformation method based on attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109033095A CN109033095A (en) | 2018-12-18 |
CN109033095B true CN109033095B (en) | 2022-10-18 |
Family
ID=64647612
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810866277.0A Active CN109033095B (en) | 2018-08-01 | 2018-08-01 | Target transformation method based on attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109033095B (en) |
Families Citing this family (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109712068A (en) * | 2018-12-21 | 2019-05-03 | 云南大学 | Image Style Transfer and analogy method for cucurbit pyrography |
CN109784197B (en) * | 2018-12-21 | 2022-06-07 | 西北工业大学 | Pedestrian re-identification method based on dilated convolution and an attention mechanism |
CN109829537B (en) * | 2019-01-30 | 2023-10-24 | 华侨大学 | Style transfer method and device for children's garments based on a deep-learning GAN network |
CN111325318B (en) * | 2019-02-01 | 2023-11-24 | 北京地平线机器人技术研发有限公司 | Neural network training method, neural network training device and electronic equipment |
CN109902602B (en) * | 2019-02-16 | 2021-04-30 | 北京工业大学 | Method for identifying foreign matter material of airport runway based on antagonistic neural network data enhancement |
CN110033410B (en) * | 2019-03-28 | 2020-08-04 | 华中科技大学 | Image reconstruction model training method, image super-resolution reconstruction method and device |
CN110084794B (en) * | 2019-04-22 | 2020-12-22 | 华南理工大学 | Skin cancer image identification method based on attention convolution neural network |
CN110634101B (en) * | 2019-09-06 | 2023-01-31 | 温州大学 | Unsupervised image-to-image conversion method based on random reconstruction |
CN110766638A (en) * | 2019-10-31 | 2020-02-07 | 北京影谱科技股份有限公司 | Method and device for converting object background style in image |
CN111489287B (en) * | 2020-04-10 | 2024-02-09 | 腾讯科技(深圳)有限公司 | Image conversion method, device, computer equipment and storage medium |
CN111815570B (en) * | 2020-06-16 | 2024-08-30 | 浙江大华技术股份有限公司 | Regional intrusion detection method and related device thereof |
CN112884773B (en) * | 2021-01-11 | 2022-03-04 | 天津大学 | Target segmentation model based on target attention consistency under background transformation |
CN113256592B (en) * | 2021-06-07 | 2021-10-08 | 中国人民解放军总医院 | Training method, system and device of image feature extraction model |
CN113538224B (en) * | 2021-09-14 | 2022-01-14 | 深圳市安软科技股份有限公司 | Image style migration method and device based on generation countermeasure network and related equipment |
CN113808011B (en) * | 2021-09-30 | 2023-08-11 | 深圳万兴软件有限公司 | Style migration method and device based on feature fusion and related components thereof |
CN113657560B (en) * | 2021-10-20 | 2022-04-15 | 南京理工大学 | Weak supervision image semantic segmentation method and system based on node classification |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108009525A (en) * | 2017-12-25 | 2018-05-08 | 北京航空航天大学 | UAV ground-specific-target recognition method based on convolutional neural networks |
- 2018-08-01 CN CN201810866277.0A patent/CN109033095B/en active Active
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108009525A (en) * | 2017-12-25 | 2018-05-08 | 北京航空航天大学 | UAV ground-specific-target recognition method based on convolutional neural networks |
Non-Patent Citations (2)
Title |
---|
DAU-GAN: Unsupervised Object Transfiguration via Deep Attention Unit; Zihan Ye et al.; BICS 2018; 2018-07-09; pp. 120-129 * |
Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks; Jun-Yan Zhu et al.; arXiv:1703.10593v1; 2017-03-30; pp. 1-18 * |
Also Published As
Publication number | Publication date |
---|---|
CN109033095A (en) | 2018-12-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109033095B (en) | Target transformation method based on attention mechanism | |
Denton et al. | Semi-supervised learning with context-conditional generative adversarial networks | |
CN111079532B (en) | Video content description method based on text self-encoder | |
CN104866900A (en) | Deconvolution neural network training method | |
Jiang et al. | When to learn what: Deep cognitive subspace clustering | |
CN106157254A (en) | Rarefaction representation remote sensing images denoising method based on non local self-similarity | |
Uddin et al. | A perceptually inspired new blind image denoising method using $ L_ {1} $ and perceptual loss | |
Pieters et al. | Comparing generative adversarial network techniques for image creation and modification | |
CN115984745A (en) | Moisture control method for black garlic fermentation | |
CN116342379A (en) | Flexible and various human face image aging generation system | |
CN111428181A (en) | Bank financing product recommendation method based on generalized additive model and matrix decomposition | |
CN115526223A (en) | Score-based generative modeling in a potential space | |
Wang et al. | Learning to hallucinate face in the dark | |
Zhou et al. | Personalized and occupational-aware age progression by generative adversarial networks | |
Zhu et al. | Multiview Deep Subspace Clustering Networks | |
CN105260736A (en) | Fast image feature representing method based on normalized nonnegative sparse encoder | |
Cong et al. | Gradient-semantic compensation for incremental semantic segmentation | |
Li et al. | Adaptive sparsity-regularized deep dictionary learning based on lifted proximal operator machine | |
Oza et al. | Semi-supervised image-to-image translation | |
CN116977343A (en) | Image processing method, apparatus, device, storage medium, and program product | |
CN115601257A (en) | Image deblurring method based on local features and non-local features | |
Hah et al. | Information‐Based Boundary Equilibrium Generative Adversarial Networks with Interpretable Representation Learning | |
Islam et al. | Class aware auto encoders for better feature extraction | |
CN113222100A (en) | Training method and device of neural network model | |
CN109840888A (en) | A kind of image super-resolution rebuilding method based on joint constraint |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||