CN109033095A - Object transformation method based on attention mechanism - Google Patents

Object transformation method based on attention mechanism

Info

Publication number
CN109033095A
CN109033095A
Authority
CN
China
Prior art keywords
attention
image
model
object transformation
background
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810866277.0A
Other languages
Chinese (zh)
Other versions
CN109033095B (en)
Inventor
胡伏原
叶子寒
李林燕
孙钰
付保川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University of Science and Technology
Original Assignee
Suzhou University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou University of Science and Technology
Priority to CN201810866277.0A
Publication of CN109033095A
Application granted
Publication of CN109033095B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/40Processing or translation of natural language
    • G06F40/58Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/10Image acquisition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present invention relates to an object transformation method based on an attention mechanism, comprising training a neural network model: step 1, initializing the parameters of the neural network model with random numbers; step 2, inputting an image x belonging to class X into the generator G of the model, where, in the encoding stage, x passes through a convolutional layer to compute the first-layer feature map $f_1$. Object transformation of images is then carried out using the neural network model obtained by the above training. By introducing an attention mechanism into the model, the model can recognize the target object to be converted in the object transfiguration task, thereby distinguishing target from background. Meanwhile, by constructing an attention-consistency loss function and a background-consistency loss function, the background consistency between the original image and the converted image is guaranteed.

Description

Object transformation method based on attention mechanism
Technical field
The present invention relates to image translation, and more particularly to an object transformation method based on an attention mechanism.
Background art
Object transformation (object transfiguration) is a special task within image translation whose purpose is to convert a target object of a specific class in an image into an object of another class. Image translation (image translation) aims to convert an original image into an image of a target style by learning the mapping relations between two classes of images; in recent years it has been applied in many areas, such as image super-resolution reconstruction and artistic style transfer. Researchers have proposed many effective transformation methods under supervised conditions. However, since acquiring paired data requires large labor and time costs, transformation methods under unsupervised conditions have become a research hotspot in image translation. Visual Attribute Transfer (VAT) is a representative of the methods based on convolutional neural networks (CNN): it uses features of different levels in the model to match the most probable corresponding features in another image. In addition, methods using generative adversarial networks (GAN) achieve more significant effects than the CNN-based methods. Isola P et al. explored the potential of GANs in image translation tasks. Subsequently, Zhu J. Y. et al. proposed using a cycle-consistent loss to solve the unsupervised image translation problem; they assume that the mapping relations learned in the image translation task are bijective, and thereby strengthen the effect of the model on image translation in unsupervised settings.
The conventional technology has the following technical problems:
The great majority of current image translation methods do not take into account the difference between the objects to be converted and the background region. In object transfiguration tasks, most models have difficulty effectively distinguishing the conversion target from the background and cannot guarantee consistency between the background of the original image and that of the converted image. As a result, the model may blur or discolor the image background during conversion, reducing the quality of the converted image.
Summary of the invention
Based on this, it is necessary, in view of the above technical problems, to provide an object transformation method based on an attention mechanism. By introducing an attention mechanism into the model, the model can recognize the target object to be converted in the object transfiguration task, thereby distinguishing target from background. Meanwhile, constructing an attention-consistency loss function and a background-consistency loss function guarantees the background consistency between the original image and the converted image.
An object transformation method based on an attention mechanism, comprising:
Training a neural network model:
Step 1: initialize the parameters of the neural network model with random numbers.
Step 2: input an image x belonging to class X into the generator G of the model; in the encoding stage, x passes through a convolutional layer to compute the first-layer feature map $f_1$.
Step 3: $f_1$ then passes through two branch networks: (a) a convolutional layer yields the second-layer feature map $\tilde{f}_2$ without attention-mask processing; (b) two convolutional layers followed by a deconvolution layer yield the attention mask $M_2$ corresponding to $\tilde{f}_2$. $M_2$ is multiplied element-wise with $\tilde{f}_2$, and the resulting product is added element-wise to $\tilde{f}_2$, giving the processed second-layer feature map $f_2$.
Step 4: $f_2$ is processed in the manner of step 3 to obtain the next-layer feature map $f_3$. Then $f_3$ passes through 6 residual convolutional layers with 3×3 kernels and stride 1 to further refine the features.
Step 5: enter the decoding stage, where deconvolution layers serve as the decoder. $f_3$ passes through two branch networks: (a) a deconvolution layer yields the feature map $\tilde{f}_4$ without attention-mask processing; (b) two deconvolution layers followed by a convolutional layer yield the attention mask $M_4$ corresponding to $\tilde{f}_4$. $M_4$ is multiplied element-wise with $\tilde{f}_4$, and the resulting product is added element-wise to $\tilde{f}_4$, giving the processed feature map $f_5$.
Step 6: enter the output stage, where $f_5$ passes through (a) a deconvolution layer to obtain the converted image y′, and (b) two deconvolution layers followed by a convolutional layer to obtain the attention mask $M_{G(x)}$ corresponding to y′.
Step 7: y′ is fed into the other generator F, and after the same operations as in steps 2-6, x′ and the corresponding attention mask $M_{F(G(x))}$ are obtained.
Step 8: x and x′ are input into the discriminator $D_X$, which returns the probability that the input image belongs to class X; similarly, y and y′ are input into the discriminator $D_Y$ to obtain the probabilities that y and y′ belong to class Y. From these the value of the adversarial loss function is calculated:
$L_{GAN}(G, D_Y) = \mathbb{E}_{y}[\log D_Y(y)] + \mathbb{E}_{x}[\log(1 - D_Y(G(x)))]$ #(1)
$L_{GAN}(F, D_X) = \mathbb{E}_{x}[\log D_X(x)] + \mathbb{E}_{y}[\log(1 - D_X(F(y)))]$ #(2)
Step 9: the value of the cycle-consistency loss function is calculated from x, x′, y, y′:
$L_{cyc}(G, F) = \|x' - x\|_1 + \|y' - y\|_1$ #(3)
Step 10: $M_{G(x)}$ is used to separate the backgrounds of x and y′ from the conversion target, and the background-change loss is calculated:
$L_{bg}(x, G) = \gamma \cdot \|B(x, M_{G(x)}) - B(y', M_{G(x)})\|_1$ #(4)
$B(x, M_{G(x)}) = H(x, 1 - M_{G(x)})$ #(5)
γ is set to 0.000075 to 0.0075; the value of the function H(a, b) is the element-wise product of a and b. $M_{F(G(x))}$ can likewise be used with y and x′ to calculate the background-change loss $L_{bg}(y, F)$.
Step 11: $M_{G(x)}$ and $M_{F(G(x))}$ are used to calculate the attention-change loss:
$L_{att}(x, G, F) = \alpha \cdot \|M_{G(x)} - M_{F(G(x))}\|_1 + \beta \cdot (M_{G(x)} + M_{F(G(x))})$ #(6)
α is set to 0.000003 to 0.00015, and β is set to 0.0000005 to 0.00005.
Step 12: a back-propagation algorithm with a learning rate of 0.00002 to 0.002 adjusts the model parameters according to the errors obtained in steps 8-11.
Step 13: y is treated as the input image and the error is calculated using the operations of steps 2-11, the difference being that the image first passes through generator F and then generator G; the model parameters are then adjusted by the method of step 12.
Step 14: steps 2-13 are repeated continually until the model parameters converge.
The object transformation of an image is carried out using the neural network model obtained by the above training.
The above object transformation method based on an attention mechanism introduces an attention mechanism into the model so that the model can recognize the target object to be converted in the object transfiguration task, thereby distinguishing target from background. Meanwhile, constructing the attention-consistency and background-consistency loss functions guarantees the background consistency between the original image and the converted image. A condensed sketch of one training iteration is given below.
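The following PyTorch-style sketch illustrates one training iteration covering steps 2-12 for the X→Y direction. It assumes the generators return an (image, attention mask) pair and the discriminators return class-membership probabilities; the helper names, the mean reductions, and the single combined generator update are illustrative assumptions, not details fixed by the patent.

```python
import torch

def train_step(G, F, D_X, D_Y, optimizer, x, y,
               alpha=0.000015, beta=0.000005, gamma=0.00075):
    # Steps 2-6: G maps x toward class Y and emits its attention mask.
    y_fake, m_gx = G(x)
    # Step 7: F maps the converted image back toward class X.
    x_rec, m_fgx = F(y_fake)

    eps = 1e-8  # numerical stability inside the logarithms
    # Step 8: generator side of the adversarial losses (Eqs. 1-2).
    l_adv = (torch.log(1 - D_Y(y_fake) + eps).mean()
             + torch.log(1 - D_X(x_rec) + eps).mean())
    # Step 9: cycle-consistency term for this direction (Eq. 3).
    l_cyc = torch.mean(torch.abs(x_rec - x))
    # Step 10: background-change loss (Eqs. 4-5), B(a, M) = a * (1 - M).
    l_bg = gamma * torch.mean(torch.abs(x * (1 - m_gx) - y_fake * (1 - m_gx)))
    # Step 11: attention-change loss (Eq. 6).
    l_att = (alpha * torch.mean(torch.abs(m_gx - m_fgx))
             + beta * torch.mean(m_gx + m_fgx))

    # Step 12: back-propagate and adjust the generator parameters.
    loss = l_adv + l_cyc + l_bg + l_att
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Step 13 repeats the same computation starting from y through F and then G, and the discriminators $D_X$, $D_Y$ are trained with the opposite sign on the adversarial term; per the embodiments below, Adam with a learning rate of 0.0002 is the preferred optimizer.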
In another embodiment, α is set to 0.000015.
In another embodiment, β is set to 0.000005.
In another embodiment, γ is set to 0.00075.
In another embodiment, the back-propagation algorithm is optimized by Adam.
In another embodiment, the learning rate of the back-propagation algorithm is 0.0002.
A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of any one of the above methods.
A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of any one of the above methods.
A processor for running a program, wherein the program, when run, executes any one of the above methods.
Brief description of the drawings
Fig. 1 is an overall schematic diagram of the model structure of an object transformation method based on an attention mechanism provided by an embodiment of the present application.
Fig. 2 shows the three different DAU structures in an object transformation method based on an attention mechanism provided by an embodiment of the present application. ($DAU_{decode}$ and $DAU_{final}$ are identical in structure; only the depth of the output attention mask differs.)
Fig. 3 shows the results of comparative experiments, on the ImageNet dataset, between an object transformation method based on an attention mechanism provided by an embodiment of the present application and the CycleGAN and VAT methods.
Fig. 4 shows the results of comparative experiments, on the CelebA dataset, between an object transformation method based on an attention mechanism provided by an embodiment of the present application and the CycleGAN and VAT methods.
Specific embodiment
In order to make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be appreciated that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit the present invention.
An object transformation method based on an attention mechanism, comprising:
Training a neural network model:
Step 1: initialize the parameters of the neural network model with random numbers.
Step 2: input an image x belonging to class X into the generator G of the model; in the encoding stage, x passes through a convolutional layer to compute the first-layer feature map $f_1$.
Step 3: $f_1$ then passes through two branch networks: (a) a convolutional layer yields the second-layer feature map $\tilde{f}_2$ without attention-mask processing; (b) two convolutional layers followed by a deconvolution layer yield the attention mask $M_2$ corresponding to $\tilde{f}_2$. $M_2$ is multiplied element-wise with $\tilde{f}_2$, and the resulting product is added element-wise to $\tilde{f}_2$, giving the processed second-layer feature map $f_2$.
Step 4: $f_2$ is processed in the manner of step 3 to obtain the next-layer feature map $f_3$. Then $f_3$ passes through 6 residual convolutional layers with 3×3 kernels and stride 1 to further refine the features.
Step 5: enter the decoding stage, where deconvolution layers serve as the decoder. $f_3$ passes through two branch networks: (a) a deconvolution layer yields the feature map $\tilde{f}_4$ without attention-mask processing; (b) two deconvolution layers followed by a convolutional layer yield the attention mask $M_4$ corresponding to $\tilde{f}_4$. $M_4$ is multiplied element-wise with $\tilde{f}_4$, and the resulting product is added element-wise to $\tilde{f}_4$, giving the processed feature map $f_5$.
Step 6: enter the output stage, where $f_5$ passes through (a) a deconvolution layer to obtain the converted image y′, and (b) two deconvolution layers followed by a convolutional layer to obtain the attention mask $M_{G(x)}$ corresponding to y′.
Step 7: y′ is fed into the other generator F, and after the same operations as in steps 2-6, x′ and the corresponding attention mask $M_{F(G(x))}$ are obtained.
Step 8: x and x′ are input into the discriminator $D_X$, which returns the probability that the input image belongs to class X; similarly, y and y′ are input into the discriminator $D_Y$ to obtain the probabilities that y and y′ belong to class Y. From these the value of the adversarial loss function is calculated:
$L_{GAN}(G, D_Y) = \mathbb{E}_{y}[\log D_Y(y)] + \mathbb{E}_{x}[\log(1 - D_Y(G(x)))]$ #(1)
$L_{GAN}(F, D_X) = \mathbb{E}_{x}[\log D_X(x)] + \mathbb{E}_{y}[\log(1 - D_X(F(y)))]$ #(2)
Step 9: the value of the cycle-consistency loss function is calculated from x, x′, y, y′:
$L_{cyc}(G, F) = \|x' - x\|_1 + \|y' - y\|_1$ #(3)
Step 10: $M_{G(x)}$ is used to separate the backgrounds of x and y′ from the conversion target, and the background-change loss is calculated:
$L_{bg}(x, G) = \gamma \cdot \|B(x, M_{G(x)}) - B(y', M_{G(x)})\|_1$ #(4)
$B(x, M_{G(x)}) = H(x, 1 - M_{G(x)})$ #(5)
γ is set to 0.000075 to 0.0075; the value of the function H(a, b) is the element-wise product of a and b. $M_{F(G(x))}$ can likewise be used with y and x′ to calculate the background-change loss $L_{bg}(y, F)$.
Step 11: $M_{G(x)}$ and $M_{F(G(x))}$ are used to calculate the attention-change loss:
$L_{att}(x, G, F) = \alpha \cdot \|M_{G(x)} - M_{F(G(x))}\|_1 + \beta \cdot (M_{G(x)} + M_{F(G(x))})$ #(6)
α is set to 0.000003 to 0.00015, and β is set to 0.0000005 to 0.00005.
Step 12: a back-propagation algorithm with a learning rate of 0.00002 to 0.002 adjusts the model parameters according to the errors obtained in steps 8-11.
Step 13: y is treated as the input image and the error is calculated using the operations of steps 2-11, the difference being that the image first passes through generator F and then generator G; the model parameters are then adjusted by the method of step 12.
Step 14: steps 2-13 are repeated continually until the model parameters converge.
The object transformation of an image is carried out using the neural network model obtained by the above training.
The above object transformation method based on an attention mechanism introduces an attention mechanism into the model so that the model can recognize the target object to be converted in the object transfiguration task, thereby distinguishing target from background. Meanwhile, constructing the attention-consistency and background-consistency loss functions guarantees the background consistency between the original image and the converted image.
In another embodiment, α is set to 0.000015.
In another embodiment, β is set to 0.000005.
In another embodiment, γ is set to 0.00075.
In another embodiment, the back-propagation algorithm is optimized by Adam.
In another embodiment, the learning rate of the back-propagation algorithm is 0.0002.
A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of any one of the above methods.
A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of any one of the above methods.
A processor for running a program, wherein the program, when run, executes any one of the above methods.
A concrete application scenario of the invention is described below:
The present invention studies enabling the model to distinguish target from background while it learns to map an image set X containing one class of target to an image set Y containing another class of target. Fig. 1 illustrates the framework of the model herein, which comprises four modules: a generator G, a generator F, a discriminator $D_X$, and a discriminator $D_Y$. G is used to learn the mapping function G: X → Y, and generator F learns the opposite mapping function F: Y → X. $D_X$ is used to distinguish original images {x} from translated images F(y); correspondingly, $D_Y$ is used to distinguish original images {y} from translated images G(x). In both generator G and generator F, we construct Deep Attention Units (DAU) to extract the key regions.
(1) Deep Attention Unit:
The attention in each module is calculated as follows. Herein, a Deep Attention Unit (DAU) is constructed to extract an attention mask $M \in \mathbb{R}^3$, giving the model the ability to distinguish target and background. The lower part of Fig. 1 illustrates the structure of the generator after the Deep Attention Units are added.
In the encoding stage (Encode Stage), as shown in the lower half of Fig. 1, given the (n−1)-th layer feature map $f_{n-1}$ (n ∈ {2, 3}) of an input image x, a convolutional layer is used as the encoder to obtain the next-layer feature map $\tilde{f}_n$ of x.
As shown in Fig. 2(a), the DAU encodes $\tilde{f}_n$ with two convolutional layers and then up-samples once with a deconvolution layer that uses the sigmoid function ($y = 1/(1+e^{-x})$) as its activation function, obtaining a mask $M_n$ of the same size as the feature map $\tilde{f}_n$:
$M_n = \mathrm{sigmoid}(\mathrm{deconv}(\mathrm{conv}(\mathrm{conv}(\tilde{f}_n))))$
In the decoding stage and the output stage, as shown in Fig. 2(b), Deep Attention Units are likewise used, denoted $DAU_{decode}$ and $DAU_{final}$, but their process is the opposite of $DAU_{encode}$: the mask is produced by deconvolution layers followed by a convolutional layer.
The codomain of the sigmoid function is [0, 1], so the attention mask $M_n$ can be regarded as a weight distribution over $\tilde{f}_n$ that enhances the expression of significant features and suppresses meaningless information. We take the element-wise product of $M_n$ and $\tilde{f}_n$, denoted H(·). In addition, with reference to residual networks and residual attention networks, we add a shortcut to suppress the vanishing-gradient problem.
The n-th layer feature map $f_n$ is finally obtained by the above operations:
$f_n = H(M_n, \tilde{f}_n) + \tilde{f}_n$
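As a concrete illustration, a minimal PyTorch sketch of an encode-stage DAU following the description above is shown below; the channel width, kernel sizes, ReLU activations, and the module name are assumptions made for illustration, and even spatial dimensions are assumed so that the deconvolution restores the input size.

```python
import torch
import torch.nn as nn

class DAUEncode(nn.Module):
    """Encode-stage Deep Attention Unit (sketch).

    Two convolutions encode the incoming feature map, a sigmoid-activated
    deconvolution up-samples once to form the mask M_n, and mask and
    features are combined with a residual shortcut:
        f_n = H(M_n, f~_n) + f~_n,  where H is the element-wise product.
    """
    def __init__(self, channels: int = 64):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, stride=1, padding=1),
            nn.ReLU(inplace=True),
        )
        # The deconvolution restores the spatial size; sigmoid maps to [0, 1].
        self.upsample = nn.Sequential(
            nn.ConvTranspose2d(channels, channels, kernel_size=4,
                               stride=2, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, f_tilde: torch.Tensor) -> torch.Tensor:
        mask = self.upsample(self.encode(f_tilde))  # attention mask M_n
        return mask * f_tilde + f_tilde             # masked features + shortcut
```

The decode- and output-stage units ($DAU_{decode}$, $DAU_{final}$) mirror this structure with deconvolution layers first and a convolutional layer producing the mask, as in steps 5 and 6 above.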
(2) Cycle-consistency loss function:
CycleGAN improves the effect of image translation using a cycle-consistency loss. Referring to the dual learning method in the machine translation field, it holds that for every image x in dataset X, the conversion cycle should map x back to the original image: x′ = F(y′) = F(G(x)) ≈ x, and correspondingly y′ = G(F(y)) ≈ y. Since the model herein is also a dual-learning structure, we likewise use the cycle-consistency loss $L_{cyc}$ to improve the model's converted images:
$L_{cyc}(G, F) = \|F(G(x)) - x\|_1 + \|G(F(y)) - y\|_1$ #(6)
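A direct sketch of this loss in PyTorch follows; generators returning only the translated image (masks omitted) and the mean reduction of the L1 norm are simplifying assumptions.

```python
import torch

def cycle_consistency_loss(x, y, gen_G, gen_F):
    """L_cyc(G, F) = ||F(G(x)) - x||_1 + ||G(F(y)) - y||_1  (Eq. 6)."""
    return (torch.mean(torch.abs(gen_F(gen_G(x)) - x))
            + torch.mean(torch.abs(gen_G(gen_F(y)) - y)))
```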
(3) Attention-consistency loss function:
Considering that the spatial position of the target in the image should remain unchanged throughout the conversion process F(G(x)), an attention-consistency loss (Attention Consistency Loss) is constructed herein to constrain the model:
$L_{att}(x, G, F) = \alpha \cdot \|M_{G(x)} - M_{F(G(x))}\|_1 + \beta \cdot (M_{G(x)} + M_{F(G(x))})$ #(7)
$M_{G(x)}$ and $M_{F(G(x))}$ respectively denote the masks output by the last layer of the model during the generation of G(x) and F(G(x)); the value of each element expresses the probability that the corresponding element in the original image belongs to the conversion target. The second term is a regularization term that can prevent the model from over-fitting. α and β are the two weights in the formula.
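A sketch of Eq. (7) in PyTorch; reducing both terms with a mean rather than a sum is an implementation assumption, and the default weights follow the preferred embodiments stated elsewhere in this description.

```python
import torch

def attention_consistency_loss(m_gx, m_fgx, alpha=0.000015, beta=0.000005):
    """L_att = alpha * ||M_G(x) - M_F(G(x))||_1 + beta * (M_G(x) + M_F(G(x)))."""
    consistency = torch.mean(torch.abs(m_gx - m_fgx))  # masks should agree
    regularizer = torch.mean(m_gx + m_fgx)  # discourages trivial all-one masks
    return alpha * consistency + beta * regularizer
```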
(4) Background-consistency loss function:
After the DAU obtains the attention mask corresponding to a feature map, the model can distinguish target and background. A background-consistency loss (Background Consistency Loss) is constructed herein:
$L_{bg}(x, G) = \gamma \cdot \|B(x, M_{G(x)}) - B(G(x), M_{G(x)})\|_1$ #(8)
$B(x, M_{G(x)}) = H(x, 1 - M_{G(x)})$ #(9)
γ is a hyperparameter. $B(x, M_{G(x)})$ is the background function; the value of each element in $1 - M_{G(x)}$ indicates the probability that the corresponding element in the original image belongs to the background. Taking the element-wise product of x and $1 - M_{G(x)}$ yields the background of x; $B(G(x), M_{G(x)})$ is obtained analogously.
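A corresponding PyTorch sketch of Eqs. (8)-(9); the mean reduction and the default γ are assumptions in line with the embodiments.

```python
import torch

def background_consistency_loss(x, g_x, m_gx, gamma=0.00075):
    """L_bg(x, G) = gamma * ||B(x, M_G(x)) - B(G(x), M_G(x))||_1,
    where B(a, M) = H(a, 1 - M) keeps only the background pixels."""
    bg_original   = x   * (1.0 - m_gx)  # background of the original image
    bg_translated = g_x * (1.0 - m_gx)  # background of the translated image
    return gamma * torch.mean(torch.abs(bg_original - bg_translated))
```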
(5) Adversarial loss function:
The adversarial loss (Adversarial Loss) can enhance the quality of the generated images. For the mapping function G: X → Y and its discriminator $D_Y$, it is expressed as:
$L_{GAN}(G, D_Y, X, Y) = \mathbb{E}_{y \sim p_{data}(y)}[\log D_Y(y)] + \mathbb{E}_{x \sim p_{data}(x)}[\log(1 - D_Y(G(x)))]$
G attempts to make the generated image G(x) indistinguishable from the images of dataset Y, while the purpose of $D_Y$ is to distinguish G(x) from y as far as possible. The purpose of G is to minimize this objective function; on the contrary, D attempts to maximize it.
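A sketch of this objective in PyTorch, assuming a discriminator that outputs probabilities in (0, 1); the small epsilon is an implementation detail added for numerical stability.

```python
import torch

def adversarial_objective(d_y, y_real, y_fake, eps=1e-8):
    """L_GAN(G, D_Y, X, Y): D_Y maximizes this value, while G minimizes
    the second term by making y_fake = G(x) resemble a real class-Y image."""
    return (torch.log(d_y(y_real) + eps).mean()
            + torch.log(1.0 - d_y(y_fake) + eps).mean())
```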
(6) Complete objective function:
The full objective is thus converted into a min-max optimization problem:
$G^*, F^* = \arg\min_{G,F} \max_{D_X,D_Y} \; L_{GAN}(G, D_Y, X, Y) + L_{GAN}(F, D_X, Y, X) + L_{cyc}(G, F) + L_{att}(x, G, F) + L_{bg}(x, G) + L_{bg}(y, F)$
The advantage of the invention is that the model can effectively identify the target object in the image and ignore the irrelevant background, thereby improving the final visual conversion effect; it achieves the best results in comparative experiments against several other current mainstream methods.
A Deep Attention Unit (DAU) module based on the attention mechanism is first constructed herein. The purpose of the module is to identify the target object in the image, guiding the model to exclude background interference and in turn improve the conversion effect.
The experiments are verified on two datasets, ImageNet and CelebA. ImageNet is a large-scale image dataset dedicated to machine vision research. We extracted 995 apple images, 1019 orange images, 1067 horse images, and 1334 zebra images from ImageNet for training the model.
Fig. 3 shows the results of the comparative experiments on the ImageNet dataset, and Fig. 4 shows the results of the comparative experiments on the CelebA dataset. It is evident from them that CycleGAN and VAT strongly affect the background of the original image. For example, in the second column of Fig. 3(a)(b), the leaves fade from green to grey. In Fig. 4, the VAT conversion fails completely: the face in the converted image is badly deformed, and the expected conversion characteristics do not appear. For example, in the conversion of Fig. 4(b) from images without glasses to images with glasses, VAT does not produce an image of a face with glasses. Our method, DAU-GAN, not only completes the conversion task but also effectively preserves the background of the original image. For example, in the horse-to-zebra conversion of Fig. 3(c), the zebra image generated by DAU-GAN not only retains the background but also has more natural stripes.
Table 1. Average background change value for each kind of converted image.
To confirm the effect of our method more precisely, we quantitatively measured the average change in the converted image backgrounds on the test set. Table 1 shows the experimental results. For every kind of conversion, the background change value of the images converted by DAU-GAN is the smallest, which convincingly demonstrates that our model can preserve the background during object transfiguration.
The technical features of the above embodiments can be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments have been described; however, as long as the combinations of these technical features contain no contradiction, they should all be considered within the scope of this specification.
The above embodiments express only several implementations of the present invention, and their description is comparatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be pointed out that, for those of ordinary skill in the art, various modifications and improvements can be made without departing from the inventive concept, and these all belong to the protection scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (9)

1. An object transformation method based on an attention mechanism, characterized by comprising:
training a neural network model:
step 1, initializing the parameters of the neural network model with random numbers;
step 2, inputting an image x belonging to class X into the generator G of the model, wherein, in the encoding stage, x passes through a convolutional layer to compute the first-layer feature map $f_1$;
step 3, passing $f_1$ through two branch networks: (a) a convolutional layer yields the second-layer feature map $\tilde{f}_2$ without attention-mask processing; (b) two convolutional layers followed by a deconvolution layer yield the attention mask $M_2$ corresponding to $\tilde{f}_2$; multiplying $M_2$ element-wise with $\tilde{f}_2$ and adding the resulting product element-wise to $\tilde{f}_2$ gives the processed second-layer feature map $f_2$;
step 4, processing $f_2$ in the manner of step 3 to obtain the next-layer feature map $f_3$, and then passing $f_3$ through 6 residual convolutional layers with 3×3 kernels and stride 1 to further refine the features;
step 5, entering the decoding stage, where deconvolution layers serve as the decoder: $f_3$ passes through two branch networks: (a) a deconvolution layer yields the feature map $\tilde{f}_4$ without attention-mask processing; (b) two deconvolution layers followed by a convolutional layer yield the attention mask $M_4$ corresponding to $\tilde{f}_4$; multiplying $M_4$ element-wise with $\tilde{f}_4$ and adding the resulting product element-wise to $\tilde{f}_4$ gives the processed feature map $f_5$;
step 6, entering the output stage, where $f_5$ passes through (a) a deconvolution layer to obtain the converted image y′, and (b) two deconvolution layers followed by a convolutional layer to obtain the attention mask $M_{G(x)}$ corresponding to y′;
step 7, feeding y′ into the other generator F and, after the same operations as in steps 2-6, obtaining x′ and the corresponding attention mask $M_{F(G(x))}$;
step 8, inputting x and x′ into the discriminator $D_X$, which returns the probability that the input image belongs to class X, and similarly inputting y and y′ into the discriminator $D_Y$ to obtain the probabilities that y and y′ belong to class Y, and therefrom calculating the value of the adversarial loss function:
$L_{GAN}(G, D_Y) = \mathbb{E}_{y}[\log D_Y(y)] + \mathbb{E}_{x}[\log(1 - D_Y(G(x)))]$ #(1)
$L_{GAN}(F, D_X) = \mathbb{E}_{x}[\log D_X(x)] + \mathbb{E}_{y}[\log(1 - D_X(F(y)))]$ #(2)
step 9, calculating the value of the cycle-consistency loss function from x, x′, y, y′:
$L_{cyc}(G, F) = \|x' - x\|_1 + \|y' - y\|_1$ #(3)
step 10, using $M_{G(x)}$ to separate the backgrounds of x and y′ from the conversion target and calculating the background-change loss:
$L_{bg}(x, G) = \gamma \cdot \|B(x, M_{G(x)}) - B(y', M_{G(x)})\|_1$ #(4)
$B(x, M_{G(x)}) = H(x, 1 - M_{G(x)})$ #(5)
wherein γ is set to 0.000075 to 0.0075, the value of the function H(a, b) is the element-wise product of a and b, and $M_{F(G(x))}$ can likewise be used with y and x′ to calculate the background-change loss $L_{bg}(y, F)$;
step 11, using $M_{G(x)}$ and $M_{F(G(x))}$ to calculate the attention-change loss:
$L_{att}(x, G, F) = \alpha \cdot \|M_{G(x)} - M_{F(G(x))}\|_1 + \beta \cdot (M_{G(x)} + M_{F(G(x))})$ #(6)
wherein α is set to 0.000003 to 0.00015 and β is set to 0.0000005 to 0.00005;
step 12, adjusting the model parameters with a back-propagation algorithm whose learning rate is 0.00002 to 0.002, according to the errors obtained in steps 8-11;
step 13, treating y as the input image and calculating the error using the operations of steps 2-11, the difference being that the image first passes through generator F and then generator G, and then adjusting the model parameters by the method of step 12;
step 14, repeating steps 2-13 continually until the model parameters converge; and
carrying out the object transformation of an image using the neural network model obtained by the above training.
2. The object transformation method based on an attention mechanism according to claim 1, characterized in that α is set to 0.000015.
3. The object transformation method based on an attention mechanism according to claim 1, characterized in that β is set to 0.000005.
4. The object transformation method based on an attention mechanism according to claim 1, characterized in that γ is set to 0.00075.
5. The object transformation method based on an attention mechanism according to claim 1, characterized in that the back-propagation algorithm is optimized by Adam.
6. The object transformation method based on an attention mechanism according to claim 1, characterized in that the learning rate of the back-propagation algorithm is 0.0002.
7. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the method of any one of claims 1 to 6.
8. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the steps of the method of any one of claims 1 to 6.
9. A processor, characterized in that the processor is configured to run a program, wherein the program, when run, executes the method of any one of claims 1 to 6.
CN201810866277.0A 2018-08-01 2018-08-01 Object transformation method based on attention mechanism Active CN109033095B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810866277.0A CN109033095B (en) 2018-08-01 2018-08-01 Object transformation method based on attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810866277.0A CN109033095B (en) 2018-08-01 2018-08-01 Object transformation method based on attention mechanism

Publications (2)

Publication Number Publication Date
CN109033095A true CN109033095A (en) 2018-12-18
CN109033095B CN109033095B (en) 2022-10-18

Family

ID=64647612

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810866277.0A Active CN109033095B (en) 2018-08-01 2018-08-01 Target transformation method based on attention mechanism

Country Status (1)

Country Link
CN (1) CN109033095B (en)


Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108009525A (en) * 2017-12-25 2018-05-08 北京航空航天大学 A kind of specific objective recognition methods over the ground of the unmanned plane based on convolutional neural networks

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JUN-YAN ZHU ET AL.: "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks", 《ARXIV:1703.10593V1》 *
ZIHAN YE ET AL.: "DAU-GAN: Unsupervised Object Transfiguration via Deep Attention Unit", 《BICS 2018》 *
胡光伟: "BP神经网络的训练算法" [Training Algorithm of BP Neural Networks], 《洞庭湖水沙时空演变及其对水资源安全的影响研究》 [Research on the Spatio-temporal Evolution of Water and Sediment in Dongting Lake and Its Influence on Water Resource Security] *

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109784197A (en) * 2018-12-21 2019-05-21 西北工业大学 Pedestrian's recognition methods again based on hole convolution Yu attention study mechanism
CN109712068A (en) * 2018-12-21 2019-05-03 云南大学 Image Style Transfer and analogy method for cucurbit pyrography
CN109784197B (en) * 2018-12-21 2022-06-07 西北工业大学 Pedestrian re-identification method based on hole convolution and attention mechanics learning mechanism
CN109829537A (en) * 2019-01-30 2019-05-31 华侨大学 Style transfer method and equipment based on deep learning GAN network children's garment clothes
CN109829537B (en) * 2019-01-30 2023-10-24 华侨大学 Deep learning GAN network children's garment based style transfer method and equipment
CN111325318A (en) * 2019-02-01 2020-06-23 北京地平线机器人技术研发有限公司 Neural network training method, neural network training device and electronic equipment
CN111325318B (en) * 2019-02-01 2023-11-24 北京地平线机器人技术研发有限公司 Neural network training method, neural network training device and electronic equipment
CN109902602A (en) * 2019-02-16 2019-06-18 北京工业大学 A kind of airfield runway foreign materials recognition methods based on confrontation Neural Network Data enhancing
CN109902602B (en) * 2019-02-16 2021-04-30 北京工业大学 Method for identifying foreign matter material of airport runway based on antagonistic neural network data enhancement
CN110033410A (en) * 2019-03-28 2019-07-19 华中科技大学 Image reconstruction model training method, image super-resolution rebuilding method and device
CN110084794B (en) * 2019-04-22 2020-12-22 华南理工大学 Skin cancer image identification method based on attention convolution neural network
CN110084794A (en) * 2019-04-22 2019-08-02 华南理工大学 A kind of cutaneum carcinoma image identification method based on attention convolutional neural networks
CN110634101A (en) * 2019-09-06 2019-12-31 温州大学 Unsupervised image-to-image conversion method based on random reconstruction
CN110634101B (en) * 2019-09-06 2023-01-31 温州大学 Unsupervised image-to-image conversion method based on random reconstruction
CN110766638A (en) * 2019-10-31 2020-02-07 北京影谱科技股份有限公司 Method and device for converting object background style in image
CN111489287B (en) * 2020-04-10 2024-02-09 腾讯科技(深圳)有限公司 Image conversion method, device, computer equipment and storage medium
CN111489287A (en) * 2020-04-10 2020-08-04 腾讯科技(深圳)有限公司 Image conversion method, image conversion device, computer equipment and storage medium
CN111815570A (en) * 2020-06-16 2020-10-23 浙江大华技术股份有限公司 Regional intrusion detection method and related device thereof
CN112884773B (en) * 2021-01-11 2022-03-04 天津大学 Target segmentation model based on target attention consistency under background transformation
CN112884773A (en) * 2021-01-11 2021-06-01 天津大学 Target segmentation model based on target attention consistency under background transformation
CN113256592A (en) * 2021-06-07 2021-08-13 中国人民解放军总医院 Training method, system and device of image feature extraction model
CN113256592B (en) * 2021-06-07 2021-10-08 中国人民解放军总医院 Training method, system and device of image feature extraction model
CN113538224B (en) * 2021-09-14 2022-01-14 深圳市安软科技股份有限公司 Image style migration method and device based on generation countermeasure network and related equipment
CN113538224A (en) * 2021-09-14 2021-10-22 深圳市安软科技股份有限公司 Image style migration method and device based on generation countermeasure network and related equipment
CN113808011A (en) * 2021-09-30 2021-12-17 深圳万兴软件有限公司 Feature fusion based style migration method and device and related components thereof
CN113808011B (en) * 2021-09-30 2023-08-11 深圳万兴软件有限公司 Style migration method and device based on feature fusion and related components thereof
CN113657560B (en) * 2021-10-20 2022-04-15 南京理工大学 Weak supervision image semantic segmentation method and system based on node classification
CN113657560A (en) * 2021-10-20 2021-11-16 南京理工大学 Weak supervision image semantic segmentation method and system based on node classification

Also Published As

Publication number Publication date
CN109033095B (en) 2022-10-18

Similar Documents

Publication Publication Date Title
CN109033095A (en) Object transformation method based on attention mechanism
Bar et al. Visual prompting via image inpainting
Lu et al. Evolving block-based convolutional neural network for hyperspectral image classification
Reed et al. Few-shot autoregressive density estimation: Towards learning to learn distributions
CN111340122B (en) Multi-modal feature fusion text-guided image restoration method
Liao et al. Learning deep parsimonious representations
Vemulapalli et al. Gaussian conditional random field network for semantic segmentation
Canchumuni et al. Recent developments combining ensemble smoother and deep generative networks for facies history matching
Lee et al. Understanding pure clip guidance for voxel grid nerf models
CN108764281A (en) A kind of image classification method learning across task depth network based on semi-supervised step certainly
Ma et al. Multi-feature fusion deep networks
Li Active learning for hyperspectral image classification with a stacked autoencoders based neural network
CN106934458A (en) Multilayer automatic coding and system based on deep learning
CN115578680A (en) Video understanding method
CN115984485A (en) High-fidelity three-dimensional face model generation method based on natural text description
Li et al. A deep neural network based quasi-linear kernel for support vector machines
Li et al. Diversified text-to-image generation via deep mutual information estimation
Wu et al. Transformer Autoencoder for K-means Efficient clustering
Oza et al. Semi-supervised image-to-image translation
CN112380843A (en) Random disturbance network-based open answer generation method
Han et al. A Large-Scale Network Construction and Lightweighting Method for Point Cloud Semantic Segmentation
Jiang et al. Multi-feature deep learning for face gender recognition
CN116258504A (en) Bank customer relationship management system and method thereof
Radwan et al. Distilling Part-whole Hierarchical Knowledge from a Huge Pretrained Class Agnostic Segmentation Framework
Ahn et al. Multi-branch neural architecture search for lightweight image super-resolution

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant