CN107330954A - A method for manipulating images by sliding attribute values based on an attenuation network - Google Patents
- Publication number
- CN107330954A (application number CN201710576667.XA)
- Authority
- CN
- China
- Prior art keywords
- image
- attribute
- discriminator
- here
- encoder
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
Abstract
The present invention proposes a method for manipulating images by sliding attribute values, based on an attenuation network. Its main components are: the attenuation network; the encoder-decoder architecture; the learning of attribute-invariant latent representations; the learning algorithm of the inference model; and the implementation of the encoder-decoder architecture as a neural network. A new encoder-decoder architecture is trained to reconstruct images by disentangling, in the latent space, the salient information of an image from its attribute values. Once trained, the inference model generates different versions of an input image by changing its attribute values. With continuous attribute values, the user can choose how strongly a given attribute is perceived in the generated image, which makes it possible to change a facial expression or update the color of an object with a sliding knob. Compared with existing methods, which train an adversarial network in pixel space by substituting attribute values during training, the present invention is simpler to train and scales to multiple attributes, and the inference model preserves the naturalness of the image while changing the perceived value of its attributes.
Description
Technical field
The present invention relates to the field of image reconstruction, and more particularly to a method for manipulating images by sliding attribute values based on an attenuation network.
Background
Image reconstruction refers to re-establishing an image from data obtained by measuring an object. The data used for reconstruction are usually acquired step by step over time, and a two-dimensional or three-dimensional image is then created from scattered or incomplete data. Some imaging techniques additionally require mathematical formulas to regenerate a clearer image that is more readable and useful. Image reconstruction is an important research branch of image processing; its significance lies in obtaining an image of the internal structure of the examined object without causing it any physical damage. It is of particular importance in many application fields, such as medical radiology, nuclear medicine, electron microscopy, radio and radar astronomy, light microscopy, holographic imaging, and theoretical vision.
The present invention proposes a method for manipulating images by sliding attribute values based on an attenuation network. A new encoder-decoder architecture is trained to reconstruct images by directly disentangling, in the latent space, the salient information of an image from its attribute values. After training, the inference model can produce different versions of an input image by changing its attribute values. With continuous attribute values, one can choose how strongly a particular attribute is perceived in the generated image, which allows the user to change the facial expression of a portrait or update the color of certain objects with a sliding knob. The current state-of-the-art methods train an adversarial network in pixel space by substituting attribute values during training; compared with these methods, the training scheme of the present invention is simpler and scales well to multiple attributes. In addition, the model of the present invention can substantially change the perceived value of an attribute in an image while preserving the naturalness of the image.
Summary of the invention
To solve the above problems, the present invention provides a method for manipulating images by sliding attribute values based on an attenuation network, which mainly comprises:
(1) the attenuation network;
(2) the encoder-decoder architecture;
(3) the learning of attribute-invariant latent representations;
(4) the learning algorithm of the inference model;
(5) the implementation of the encoder-decoder architecture as a neural network.
Wherein, for the attenuation network, let $\mathcal{X}$ be the image domain and $\mathcal{Y}$ the set of possible attributes associated with images in $\mathcal{X}$; for faces, typical attributes are wearing glasses or not, gender, and young or old. To simplify the exposition, attributes are considered binary, although the approach extends to categorical attributes. In this setting $\mathcal{Y} = \{0, 1\}^n$, where n is the number of attributes, and the training set contains m pairs of images and attributes, $\mathcal{D} = \{(x^1, y^1), \ldots, (x^m, y^m)\}$, with $x^i \in \mathcal{X}$ and $y^i \in \mathcal{Y}$. The final goal is to learn from $\mathcal{D}$ a model that, for any attribute vector y', generates a version of the input image x whose attribute values correspond to y'.
Wherein, the encoder-decoder architecture performs domain-adversarial training on the latent space. The encoder $E_{\theta_{enc}}: \mathcal{X} \to \mathbb{R}^N$ is a convolutional neural network with parameters $\theta_{enc}$ that maps an input image to its N-dimensional latent representation $E_{\theta_{enc}}(x)$. The decoder $D_{\theta_{dec}}: (\mathbb{R}^N, \mathcal{Y}) \to \mathcal{X}$ is a deconvolutional neural network with parameters $\theta_{dec}$ that produces a new version of the input image given its latent representation $E_{\theta_{enc}}(x)$ and any attribute vector y'. When the context is clear, D and E are written in place of $D_{\theta_{dec}}$ and $E_{\theta_{enc}}$. Whatever the exact architecture of the neural networks, the associated auto-encoding loss is a classical mean squared error (MSE), which measures the quality of the reconstruction of a training image x given its true attribute vector y, as shown in equation (1):

$$\mathcal{L}_{AE}(\theta_{enc}, \theta_{dec}) = \frac{1}{m} \sum_{(x, y) \in \mathcal{D}} \left\| D_{\theta_{dec}}\big(E_{\theta_{enc}}(x), y\big) - x \right\|_2^2 \qquad (1)$$

The exact choice of the image reconstruction loss is not essential here. To obtain images with sharper textures, an adversarial loss as in generative adversarial networks (GANs) can be used at this stage in addition to the MSE; a mean absolute error or mean squared error term is still necessary, however, to ensure that the reconstructed image matches the original image. Ideally, modifying y in D(E(x), y) produces images with different perceived attributes that are otherwise similar to the input image x.
Further, for the learning of attribute-invariant latent representations: without additional constraints, the decoder can ignore the attributes of the image, in which case modifying y at test time has no effect. To avoid this behavior, the solution is to learn latent representations that are invariant to the attributes. Invariance means that two versions x and x' of the same object that differ only in their attribute values, such as two images of the same person with and without glasses, should have the same latent representations E(x) and E(x'). When this invariance holds, the decoder must use the attributes to reconstruct the original image. Because the training set does not contain different versions of the same image, the constraint cannot simply be added to the loss. It is instead imposed through adversarial training on the latent space: an additional neural network, called the discriminator, is trained to identify the true attributes y of a training pair (x, y) given E(x). Invariance is obtained by training the encoder E so that the discriminator cannot identify the correct attributes. As in GANs, this corresponds to a two-player game in which the discriminator aims to maximize its ability to identify attributes, while E aims to prevent it from being a good discriminator.
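Further, for the discriminator objective: the discriminator outputs the probabilities of an attribute vector, $P_{\theta_{dis}}(y \mid E(x))$, where $\theta_{dis}$ are the parameters of the discriminator. Using the subscript k to denote the k-th attribute, $\log P_{\theta_{dis}}(y \mid E(x)) = \sum_{k=1}^{n} \log P_{\theta_{dis},k}(y_k \mid E(x))$. The loss of the discriminator depends on the current state of the encoder, as shown in equation (2):

$$\mathcal{L}_{dis}(\theta_{dis} \mid \theta_{enc}) = -\frac{1}{m} \sum_{(x, y) \in \mathcal{D}} \log P_{\theta_{dis}}\big(y \mid E_{\theta_{enc}}(x)\big) \qquad (2)$$

The purpose of the discriminator is thus to predict the attributes of an input image given its latent representation.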
Further, for the adversarial objective: the encoder must compute a latent representation that optimizes two objectives. The decoder should be able to reconstruct x given E(x) and y, and at the same time the discriminator should be unable to predict y given E(x). The discriminator is considered to be mistaken when it predicts $1 - y_k$ for attribute k. Given the discriminator parameters, the complete loss of the encoder-decoder architecture is shown in equation (3):

$$\mathcal{L}(\theta_{enc}, \theta_{dec} \mid \theta_{dis}) = \frac{1}{m} \sum_{(x, y) \in \mathcal{D}} \left\| D_{\theta_{dec}}\big(E_{\theta_{enc}}(x), y\big) - x \right\|_2^2 - \lambda_E \log P_{\theta_{dis}}\big(1 - y \mid E_{\theta_{enc}}(x)\big) \qquad (3)$$

where $\lambda_E > 0$ controls the trade-off between the quality of the image reconstruction and the invariance of the latent representation. Large values of $\lambda_E$ restrict the amount of information about x contained in E(x) and result in blurry images, whereas small values limit the decoder's dependence on the latent code y and lead to poor results when attributes are changed.
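The per-example terms of equations (1) to (3) can be sketched in plain Python. This is a minimal illustration, not the patent's implementation: images are assumed to be flat lists of pixel values, the discriminator output is assumed to be a list `p_one` in which `p_one[k]` is the predicted probability that attribute k equals 1, and all function names are hypothetical.

```python
import math

def reconstruction_loss(x, x_rec):
    """Squared L2 distance between an image and its reconstruction (flat pixel lists)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_rec))

def discriminator_loss(p_one, y):
    """-log P(y | E(x)), summed over attributes.

    p_one[k] is the assumed discriminator output: probability that attribute k is 1.
    """
    return -sum(math.log(p if yk == 1 else 1.0 - p) for p, yk in zip(p_one, y))

def encoder_decoder_loss(x, x_rec, p_one, y, lambda_e):
    """Reconstruction error minus lambda_e * log P(1 - y | E(x)), for one example."""
    flipped = [1 - yk for yk in y]  # the attribute vector the encoder wants predicted
    return reconstruction_loss(x, x_rec) + lambda_e * discriminator_loss(p_one, flipped)
```

With this formulation, a discriminator that confidently predicts the true attributes has a low loss of its own and makes the adversarial term of the encoder-decoder loss large, which is the two-player game described above.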
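Further, for the learning algorithm of the inference model: given the current state of the encoder, the optimal discriminator parameters satisfy $\theta_{dis}^*(\theta_{enc}) \in \arg\min_{\theta_{dis}} \mathcal{L}_{dis}(\theta_{dis} \mid \theta_{enc})$. Ignoring problems related to multiple or local minima, the overall objective is $\theta_{enc}^*, \theta_{dec}^* = \arg\min_{\theta_{enc}, \theta_{dec}} \mathcal{L}\big(\theta_{enc}, \theta_{dec} \mid \theta_{dis}^*(\theta_{enc})\big)$. In practice it is unreasonable to solve for $\theta_{dis}^*(\theta_{enc})$ at every update of $\theta_{enc}$. Following the usual practice of adversarial training for deep neural networks, the current value of $\theta_{dis}$ is used as an approximation of $\theta_{dis}^*(\theta_{enc})$, and all parameters are updated by stochastic gradient steps. Given a training example (x, y), the auto-encoder loss restricted to (x, y) is denoted $\mathcal{L}(\theta_{enc}, \theta_{dec} \mid \theta_{dis}, x, y)$ and the corresponding discriminator loss $\mathcal{L}_{dis}(\theta_{dis} \mid \theta_{enc}, x, y)$. Given the current parameters $\theta_{dis}^{(t)}$, $\theta_{enc}^{(t)}$ and $\theta_{dec}^{(t)}$ and the training example $(x^{(t)}, y^{(t)})$, the update at time t is given by equations (4) and (5), where $\eta$ denotes the step size:

$$\theta_{dis}^{(t+1)} = \theta_{dis}^{(t)} - \eta \nabla_{\theta_{dis}} \mathcal{L}_{dis}\big(\theta_{dis}^{(t)} \mid \theta_{enc}^{(t)}, x^{(t)}, y^{(t)}\big) \qquad (4)$$

$$\big[\theta_{enc}^{(t+1)}, \theta_{dec}^{(t+1)}\big] = \big[\theta_{enc}^{(t)}, \theta_{dec}^{(t)}\big] - \eta \nabla_{\theta_{enc}, \theta_{dec}} \mathcal{L}\big(\theta_{enc}^{(t)}, \theta_{dec}^{(t)} \mid \theta_{dis}^{(t+1)}, x^{(t)}, y^{(t)}\big) \qquad (5)$$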
Further, for the implementation of the encoder-decoder architecture as a neural network: let $C_k$ denote a convolution-ReLU (rectified linear unit) layer with k filters. Convolutions use kernels of size 4 × 4 with a stride of 2 and a padding of 1, so that each encoder layer halves the size of its input. Leaky ReLUs with a slope of 2 are used in the encoder, and plain ReLUs in the decoder. The encoder consists of the following seven layers, as shown in formula (6):
C16-C32-C64-C128-C256-C512-C512 (6)
Since input images are of size 256 × 256, the latent representation of an image consists of 512 feature maps of size 2 × 2. To provide the image attributes to the decoder, the latent code is appended to the input of every decoder layer. The latent code of an image is the concatenation of one-hot vectors representing its attribute values; binary attributes are represented as [1, 0] and [0, 1]. The latent code is appended as additional constant input channels to every convolution of the decoder. With n denoting the number of attributes, the decoder is symmetric to the encoder and uses transposed convolutions for upsampling, as shown in formula (7):
C512+2n-C512+2n-C256+2n-C128+2n-C64+2n-C32+2n-C16+2n (7)
The discriminator is a C512 layer followed by a fully connected neural network of two layers, of sizes 512 and n respectively.
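The sizes quoted above can be checked with a short sketch. It uses the standard convolution output-size formula, floor((size + 2·padding − kernel) / stride) + 1, together with the filter counts of formula (6). The function names are hypothetical, and the mapping of attribute value 0 to [1, 0] and 1 to [0, 1] is an assumption, since the text does not say which one-hot vector encodes which value.

```python
ENCODER_FILTERS = [16, 32, 64, 128, 256, 512, 512]  # formula (6)

def conv_out_size(size, kernel=4, stride=2, padding=1):
    """Spatial output size of one convolution layer (standard formula)."""
    return (size + 2 * padding - kernel) // stride + 1

def encoder_sizes(input_size=256):
    """Spatial size of the feature maps before and after each encoder layer."""
    sizes = [input_size]
    for _ in ENCODER_FILTERS:
        sizes.append(conv_out_size(sizes[-1]))
    return sizes

def attribute_code(y):
    """Concatenate one one-hot pair per binary attribute (assumed: 0 -> [1, 0], 1 -> [0, 1])."""
    code = []
    for yk in y:
        code.extend([1, 0] if yk == 0 else [0, 1])
    return code

def decoder_layer_subscripts(n):
    """Subscripts of formula (7): encoder widths in reverse order, each plus 2n code channels."""
    return [k + 2 * n for k in reversed(ENCODER_FILTERS)]
```

For a 256 × 256 input, `encoder_sizes()` ends at 2, which matches the 2 × 2 feature maps stated in the text, and `attribute_code` returns a vector of length 2n, which accounts for the "+2n" in every subscript of formula (7).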
Further, for the discriminator cost schedule: a variable weight is used for the discriminator loss coefficient $\lambda_E$. $\lambda_E$ is first set to 0, so that the model is trained as a normal auto-encoder; it is then increased linearly to 0.0001 over the first 500,000 iterations, slowly encouraging the model to produce invariant representations. Without this schedule, the encoder is observed to be strongly affected by the loss coming from the discriminator, even for very low values of $\lambda_E$.
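The schedule described above can be written as a small function. This is a sketch under the stated values (final weight 0.0001, reached after 500,000 iterations); the function and parameter names are hypothetical, and holding the weight constant after the ramp is an assumption, since the text only describes the first 500,000 iterations.

```python
def lambda_e_schedule(step, final_value=1e-4, warmup_steps=500_000):
    """Linearly increase lambda_E from 0 to final_value over warmup_steps iterations."""
    if step <= 0:
        return 0.0
    # Assumed: the weight stays at final_value once the ramp is complete.
    return final_value * min(step / warmup_steps, 1.0)
```

At step 0 the model therefore trains as a plain auto-encoder, and the adversarial term of equation (3) is introduced gradually.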
Further, for model selection: model selection is performed automatically using two criteria. The first is the reconstruction error on original images, measured by the MSE. The second uses a classifier trained to predict image attributes: at the end of each epoch, the attributes of every image in the validation set are swapped, and the performance of the classifier on the decoded images is measured. These two metrics are used to filter out potentially good models; the final model is selected on the basis of human evaluation of images from the training set reconstructed with swapped attributes.
Brief description of the drawings
Fig. 1 is a diagram of the encoder-decoder architecture of the method of the present invention for manipulating images by sliding attribute values based on an attenuation network.
Fig. 2 shows example image reconstructions of a flower for different values of the pink attribute, obtained with the method of the present invention.
Embodiment
It should be noted that, provided they do not conflict, the embodiments of this application and the features of those embodiments may be combined with one another. The present invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 is a diagram of the encoder-decoder architecture of the method of the present invention for manipulating images by sliding attribute values based on an attenuation network. The input in the figure is an image-attribute pair (x, y). The encoder maps x to its latent representation z. The discriminator is trained to predict y given z, whereas the encoder is trained so that the discriminator cannot predict y from z alone; the decoder therefore needs (z, y) to reconstruct the image x. The method mainly comprises the attenuation network, the encoder-decoder architecture, the learning of attribute-invariant latent representations, the learning algorithm of the inference model, and the implementation of the encoder-decoder architecture as a neural network.
Fig. 2 shows example image reconstructions of a flower for different values of the pink attribute, obtained with the method of the present invention. In the figure, increasing the value of the pink attribute makes the flower pinker, while decreasing it brings the flower closer to yellow or orange.
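The sliding behavior illustrated in Fig. 2 can be sketched as follows. The text states that binary attributes are encoded as [1, 0] and [0, 1] and that continuous attribute values can be used at generation time; interpolating linearly between the two one-hot vectors, as done here, is an assumption made for illustration, and both function names are hypothetical.

```python
def slider_code(alpha):
    """Map a slider position alpha to a two-channel attribute code.

    Assumed convention: alpha = 0 gives [1, 0], alpha = 1 gives [0, 1],
    and values in between blend the two linearly.
    """
    return [1.0 - alpha, alpha]

def slider_sweep(steps):
    """Attribute codes for `steps` evenly spaced slider positions from 0 to 1 (steps >= 2)."""
    return [slider_code(i / (steps - 1)) for i in range(steps)]
```

Each code in the sweep would be passed to the decoder together with the same latent representation E(x), yielding a sequence of images in which only the chosen attribute changes.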
For those skilled in the art, the present invention is not limited to the details of the above embodiments, and it can be realized in other specific forms without departing from its spirit and scope. Those skilled in the art may also make various changes and modifications to the invention without departing from its spirit and scope, and such improvements and modifications shall also be regarded as falling within the scope of protection of the invention. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the invention.
Claims (10)
1. A method for manipulating images by sliding attribute values based on an attenuation network, characterized by mainly comprising: an attenuation network (1); an encoder-decoder architecture (2); learning of attribute-invariant latent representations (3); a learning algorithm for the inference model (4); and an implementation of the encoder-decoder architecture as a neural network (5).
2. The attenuation network (1) according to claim 1, characterized in that $\mathcal{X}$ is the image domain and $\mathcal{Y}$ is the set of possible attributes associated with images in $\mathcal{X}$; for faces, typical attributes are wearing glasses or not, gender, and young or old; to simplify the exposition, attributes are considered binary, although the approach extends to categorical attributes; in this setting $\mathcal{Y} = \{0, 1\}^n$, where n is the number of attributes, and the training set contains m pairs of images and attributes, $\mathcal{D} = \{(x^1, y^1), \ldots, (x^m, y^m)\}$, with $x^i \in \mathcal{X}$ and $y^i \in \mathcal{Y}$; the final goal is to learn from $\mathcal{D}$ a model that, for any attribute vector y', generates a version of the input image x whose attribute values correspond to y'.
3. The encoder-decoder architecture (2) according to claim 1, characterized in that domain-adversarial training is performed on the latent space of the encoder-decoder architecture, wherein the encoder $E_{\theta_{enc}}: \mathcal{X} \to \mathbb{R}^N$ is a convolutional neural network with parameters $\theta_{enc}$ that maps an input image to its N-dimensional latent representation $E_{\theta_{enc}}(x)$, and the decoder $D_{\theta_{dec}}: (\mathbb{R}^N, \mathcal{Y}) \to \mathcal{X}$ is a deconvolutional neural network with parameters $\theta_{dec}$ that produces a new version of the input image given its latent representation $E_{\theta_{enc}}(x)$ and any attribute vector y'; when the context is clear, D and E are written in place of $D_{\theta_{dec}}$ and $E_{\theta_{enc}}$; whatever the exact architecture of the neural networks, the associated auto-encoding loss is a classical mean squared error (MSE), which measures the quality of the reconstruction of a training image x given its true attribute vector y, as shown in equation (1):

$$\mathcal{L}_{AE}(\theta_{enc}, \theta_{dec}) = \frac{1}{m} \sum_{(x, y) \in \mathcal{D}} \left\| D_{\theta_{dec}}\big(E_{\theta_{enc}}(x), y\big) - x \right\|_2^2 \qquad (1)$$

the exact choice of the image reconstruction loss is not essential here; to obtain images with sharper textures, an adversarial loss as in generative adversarial networks (GANs) can be used at this stage in addition to the MSE, but a mean absolute error or mean squared error term is still necessary to ensure that the reconstructed image matches the original image; ideally, modifying y in D(E(x), y) produces images with different perceived attributes that are otherwise similar to the input image x.
4. The learning of attribute-invariant latent representations (3) according to claim 1, characterized in that, without additional constraints, the decoder can ignore the attributes of the image, in which case modifying y at test time has no effect; to avoid this behavior, latent representations that are invariant to the attributes are learned; invariance means that two versions x and x' of the same object that differ only in their attribute values, such as two images of the same person with and without glasses, should have the same latent representations E(x) and E(x'); when this invariance holds, the decoder must use the attributes to reconstruct the original image; because the training set does not contain different versions of the same image, the constraint cannot simply be added to the loss, and it is instead imposed through adversarial training on the latent space, for which an additional neural network, called the discriminator, is trained to identify the true attributes y of a training pair (x, y) given E(x); invariance is obtained by training the encoder E so that the discriminator cannot identify the correct attributes; as in GANs, this corresponds to a two-player game in which the discriminator aims to maximize its ability to identify attributes, while E aims to prevent it from being a good discriminator.
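5. The discriminator objective according to claim 4, characterized in that the discriminator outputs the probabilities of an attribute vector, $P_{\theta_{dis}}(y \mid E(x))$, where $\theta_{dis}$ are the parameters of the discriminator; using the subscript k to denote the k-th attribute, $\log P_{\theta_{dis}}(y \mid E(x)) = \sum_{k=1}^{n} \log P_{\theta_{dis},k}(y_k \mid E(x))$; the loss of the discriminator depends on the current state of the encoder, as shown in equation (2):

$$\mathcal{L}_{dis}(\theta_{dis} \mid \theta_{enc}) = -\frac{1}{m} \sum_{(x, y) \in \mathcal{D}} \log P_{\theta_{dis}}\big(y \mid E_{\theta_{enc}}(x)\big) \qquad (2)$$

wherein the purpose of the discriminator is to predict the attributes of an input image given its latent representation.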
6. The adversarial objective according to claim 4, characterized in that the objective of the encoder is to
compute a latent representation that optimizes two objectives: first, the decoder should be able to
reconstruct the image x given E(x) and y; at the same time, the discriminator should be unable to predict y
given E(x). A mistake is considered to occur when the discriminator predicts $1 - y_k$ for attribute k, so
that, given the discriminator parameters, the complete loss of the encoder-decoder architecture is as shown
in equation (3):
$$\mathcal{L}(\theta_{enc}, \theta_{dec} \mid \theta_{dis}) = \frac{1}{m} \sum_{(x,y) \in \mathcal{D}} \left\| D_{\theta_{dec}}(E_{\theta_{enc}}(x), y) - x \right\|_2^2 - \lambda_E \log P_{\theta_{dis}}(1 - y \mid E_{\theta_{enc}}(x)) \quad (3)$$
where $\lambda_E > 0$ controls the trade-off between the quality of the image reconstruction and the invariance of
the latent representation. Larger values of $\lambda_E$ restrict the amount of information about x contained in
E(x) and can produce blurry images, while smaller values of $\lambda_E$ limit the decoder's dependence on the
latent code y and lead to poor results when attributes are changed.
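The trade-off controlled by λ_E in claim 6 can be made concrete with a per-sample sketch. It assumes flattened images given as lists of floats, binary attributes, and a discriminator output `probs[k]` equal to the predicted probability that attribute k is 1; the function name and argument layout are hypothetical.

```python
import math

def encoder_decoder_loss(x, x_rec, probs, y, lambda_e):
    """Per-sample loss: squared reconstruction error minus
    lambda_e * log P(1 - y | E(x)).

    x, x_rec: original and reconstructed image as flat lists of floats.
    probs[k]: assumed discriminator probability that attribute k is 1.
    y[k]:     true binary value of attribute k.
    """
    reconstruction = sum((r - t) ** 2 for r, t in zip(x_rec, x))
    # Log-probability the discriminator assigns to the *flipped* attributes 1 - y.
    log_p_flipped = sum(
        math.log(1.0 - p_k if y_k == 1 else p_k) for p_k, y_k in zip(probs, y)
    )
    return reconstruction - lambda_e * log_p_flipped
```

With `lambda_e = 0` the function reduces to the plain reconstruction error. For a positive `lambda_e`, the penalty grows as the discriminator becomes more confident in the true attribute, which is what pushes the encoder toward attribute-invariant representations.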
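7. The inference model learning algorithm (4) according to claim 1, characterized in that, given the current
state of the encoder, the optimal discriminator parameters satisfy
$\theta^*_{dis}(\theta_{enc}) \in \arg\min_{\theta_{dis}} \mathcal{L}_{dis}(\theta_{dis} \mid \theta_{enc})$. If problems related to multiple minima or local
minima are ignored, the overall objective function is
$\theta^*_{enc}, \theta^*_{dec} = \arg\min_{\theta_{enc}, \theta_{dec}} \mathcal{L}(\theta_{enc}, \theta_{dec} \mid \theta^*_{dis}(\theta_{enc}))$. In practice it is
unreasonable to solve for $\theta^*_{dis}(\theta_{enc})$ at every update of $\theta_{enc}$; following the practice of
adversarial training of deep neural networks, the current value of $\theta_{dis}$ is taken as an approximation of
$\theta^*_{dis}(\theta_{enc})$, and stochastic gradient updates are used for all parameters. Given a training sample
(x, y), the autoencoder loss restricted to (x, y) is denoted $\mathcal{L}(\theta_{enc}, \theta_{dec} \mid \theta_{dis}, x, y)$ and the
corresponding discriminator loss is $\mathcal{L}_{dis}(\theta_{dis} \mid \theta_{enc}, x, y)$. For the training example
$(x^{(t)}, y^{(t)})$, the updates are as shown in equations (4) and (5):
$$\theta_{dis}^{(t+1)} = \theta_{dis}^{(t)} - \eta \nabla_{\theta_{dis}} \mathcal{L}_{dis}(\theta_{dis}^{(t)} \mid \theta_{enc}^{(t)}, x^{(t)}, y^{(t)}) \quad (4)$$
$$[\theta_{enc}^{(t+1)}, \theta_{dec}^{(t+1)}] = [\theta_{enc}^{(t)}, \theta_{dec}^{(t)}] - \eta \nabla_{\theta_{enc}, \theta_{dec}} \mathcal{L}(\theta_{enc}^{(t)}, \theta_{dec}^{(t)} \mid \theta_{dis}^{(t+1)}, x^{(t)}, y^{(t)}) \quad (5)$$
where the current parameters at time t are $\theta_{dis}^{(t)}$, $\theta_{enc}^{(t)}$ and $\theta_{dec}^{(t)}$, and $\eta$ is the learning rate.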
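The alternating update described in claim 7 can be illustrated with a toy sketch. It is not the patent's implementation: parameters are plain lists of floats, gradients are taken by central finite differences rather than backpropagation, and `numeric_grad`, `alternating_step` and the two loss callables are hypothetical names introduced for illustration. What it shows is the ordering: the discriminator is updated first, and the encoder-decoder step then uses the updated discriminator.

```python
def numeric_grad(f, params, eps=1e-6):
    """Central finite-difference gradient of f at params (a list of floats)."""
    grads = []
    for i in range(len(params)):
        up = list(params)
        up[i] += eps
        down = list(params)
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads

def alternating_step(theta_ae, theta_dis, loss_dis, loss_ae, lr):
    """One stochastic update on a single training example.

    loss_dis(theta_dis, theta_ae): discriminator loss for this example.
    loss_ae(theta_ae, theta_dis):  encoder-decoder loss for this example.
    """
    # Discriminator step, encoder-decoder parameters held fixed.
    g_dis = numeric_grad(lambda d: loss_dis(d, theta_ae), theta_dis)
    theta_dis = [p - lr * g for p, g in zip(theta_dis, g_dis)]
    # Encoder-decoder step, using the freshly updated discriminator.
    g_ae = numeric_grad(lambda a: loss_ae(a, theta_dis), theta_ae)
    theta_ae = [p - lr * g for p, g in zip(theta_ae, g_ae)]
    return theta_ae, theta_dis
```

In a real training loop the two losses would be evaluated on a minibatch and differentiated automatically; the finite-difference gradient is used here only to keep the sketch dependency-free.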
8. The implementation (5) of the encoder-decoder framework in a neural network according to claim 1,
characterized in that the encoder-decoder architecture is adapted to a neural network. Let $C_k$ denote a
convolution-ReLU (rectified linear unit) layer with k filters. Convolutions use a kernel of size 4 × 4 with
a stride of 2 and a padding of 1, so that each layer of the encoder halves the size of its input. Leaky
ReLUs with a slope of 2 are used in the encoder and simple ReLUs in the decoder. The encoder consists of the
following 7 layers, as shown in formula (6):
$C_{16} - C_{32} - C_{64} - C_{128} - C_{256} - C_{512} - C_{512}$ (6)
Because the input images have size 256 × 256, the latent representation of an image consists of 512 feature
maps of size 2 × 2. To provide the image attributes to the decoder, the latent code is appended as input to
every layer of the decoder. The latent code of an image is the concatenation of the one-hot vectors
representing the values of its attributes, binary attributes being represented as [1, 0] and [0, 1]; the
latent code is thus appended as additional constant input channels to the convolutions of the decoder. With
n denoting the number of attributes, the decoder is symmetric to the encoder and uses transposed
convolutions for upsampling, as shown in formula (7):
$C_{512+2n} - C_{512+2n} - C_{256+2n} - C_{128+2n} - C_{64+2n} - C_{32+2n} - C_{16+2n}$ (7)
The discriminator is a $C_{512}$ layer followed by a fully connected neural network of two layers of size 512
and n, respectively.
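The shape arithmetic in claim 8 can be checked with a short sketch. It uses the standard convolution output-size formula and the filter counts of formula (6). The helper names are hypothetical, and the choice of which one-hot pattern encodes a present attribute ([0, 1] here) is an assumption, since the claim only states that the two patterns are used.

```python
ENCODER_FILTERS = [16, 32, 64, 128, 256, 512, 512]  # formula (6)

def conv_out_size(size, kernel=4, stride=2, padding=1):
    """Spatial output size of a convolution (standard formula)."""
    return (size + 2 * padding - kernel) // stride + 1

def encoder_shapes(input_size=256):
    """(channels, height, width) after each encoder layer."""
    shapes = []
    size = input_size
    for k in ENCODER_FILTERS:
        size = conv_out_size(size)
        shapes.append((k, size, size))
    return shapes

def one_hot_code(attributes):
    """Concatenate one one-hot pair per binary attribute (length 2n).
    Assumption: a present attribute is encoded as [0, 1], an absent one as [1, 0]."""
    code = []
    for a in attributes:
        code.extend([0, 1] if a else [1, 0])
    return code

def decoder_channels(n_attributes):
    """Channel counts of formula (7): the mirrored encoder filters plus 2n
    constant channels carrying the latent code."""
    return [k + 2 * n_attributes for k in reversed(ENCODER_FILTERS)]
```

Calling `encoder_shapes()` confirms that seven stride-2 layers reduce a 256 × 256 input to 512 feature maps of size 2 × 2, and `decoder_channels(n)` lists the channel counts of formula (7) for n attributes.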
9. The discriminator cost scheduling according to claim 8, characterized in that a variable weight is used
for the discriminator loss coefficient $\lambda_E$. $\lambda_E$ is first set to 0 and the model is trained as a normal
autoencoder; $\lambda_E$ is then increased linearly to 0.0001 over the first 500,000 iterations, so as to slowly
encourage the model to produce invariant representations. Without this scheduling, it is observed that the
loss from the discriminator strongly affects the encoder even when the value of $\lambda_E$ is very low.
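The schedule in claim 9 amounts to a clamped linear ramp. A minimal sketch follows; the function name and its parameters are hypothetical, while the final value 0.0001 and the 500,000-iteration ramp are taken from the claim.

```python
def lambda_e_schedule(iteration, final_value=1e-4, ramp_iterations=500_000):
    """Weight of the discriminator term at a given training iteration.

    Starts at 0 (plain autoencoder training), grows linearly, and stays at
    final_value once ramp_iterations has been reached.
    """
    clamped = min(max(iteration, 0), ramp_iterations)
    return final_value * clamped / ramp_iterations
```

Halfway through the ramp (iteration 250,000) the weight is 0.00005, and any iteration beyond 500,000 returns 0.0001.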
10. The model selection according to claim 8, characterized in that model selection is performed
automatically using two criteria. The first is the reconstruction error on the original images, measured by
MSE. For the second, a classifier is trained to predict image attributes; at the end of each epoch, the
attributes of each image in the validation set are swapped and the performance of the classifier on the
decoded images is measured. These two metrics are used to filter out potentially good models, and the final
model is selected by human evaluation of images from the training set reconstructed with swapped attributes.
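The automatic filtering step of claim 10 could look like the sketch below. The claim does not state thresholds or how candidates are ranked, so `max_mse`, `min_swap_accuracy`, the dictionary keys and the ordering by MSE are all assumptions made for illustration; the final choice among the shortlisted models remains a human judgement, as the claim states.

```python
def shortlist_models(candidates, max_mse, min_swap_accuracy):
    """Return the names of checkpoints that pass both automatic criteria.

    candidates: list of dicts with hypothetical keys
        "name"     - checkpoint identifier
        "mse"      - reconstruction MSE on original images
        "swap_acc" - classifier accuracy on images decoded with swapped attributes
    The result is ordered by increasing reconstruction error.
    """
    kept = [
        c for c in candidates
        if c["mse"] <= max_mse and c["swap_acc"] >= min_swap_accuracy
    ]
    return [c["name"] for c in sorted(kept, key=lambda c: c["mse"])]
```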
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710576667.XA CN107330954A (en) | 2017-07-14 | 2017-07-14 | A kind of method based on attenuation network by sliding attribute manipulation image |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107330954A (en) | 2017-11-07 |
Family
ID=60227199
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710576667.XA Withdrawn CN107330954A (en) | 2017-07-14 | 2017-07-14 | A kind of method based on attenuation network by sliding attribute manipulation image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107330954A (en) |
Non-Patent Citations (1)
Title |
---|
GUILLAUME LAMPLE et al.: "Fader Networks: Manipulating Images by Sliding Attributes", 《HTTPS://ARXIV.ORG/ABS/1706.00409V1》 * |
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10783394B2 (en) | 2017-06-20 | 2020-09-22 | Nvidia Corporation | Equivariant landmark transformation for landmark localization |
US10783393B2 (en) | 2017-06-20 | 2020-09-22 | Nvidia Corporation | Semi-supervised learning for landmark localization |
CN107665339B (en) * | 2017-09-22 | 2021-04-13 | 中山大学 | Method for realizing face attribute conversion through neural network |
CN107665339A (en) * | 2017-09-22 | 2018-02-06 | 中山大学 | A kind of method changed by neural fusion face character |
CN108171173A (en) * | 2017-12-29 | 2018-06-15 | 北京中科虹霸科技有限公司 | A kind of pupil generation of iris image U.S. and minimizing technology |
CN108280885A (en) * | 2018-01-09 | 2018-07-13 | 上海大学 | The holographic idol method of structure |
CN108280885B (en) * | 2018-01-09 | 2021-12-03 | 上海大学 | Method for constructing holographic even image |
CN110046693A (en) * | 2018-01-13 | 2019-07-23 | Arm有限公司 | Select the encoding option |
WO2019178893A1 (en) * | 2018-03-22 | 2019-09-26 | 深圳大学 | Motion blur image sharpening method and device, apparatus, and storage medium |
CN110363830B (en) * | 2018-04-10 | 2023-05-02 | 阿里巴巴集团控股有限公司 | Element image generation method, device and system |
CN110363830A (en) * | 2018-04-10 | 2019-10-22 | 阿里巴巴集团控股有限公司 | Element image generation method, apparatus and system |
CN108765261A (en) * | 2018-04-13 | 2018-11-06 | 北京市商汤科技开发有限公司 | Image conversion method and device, electronic equipment, computer storage media, program |
CN108765261B (en) * | 2018-04-13 | 2022-07-05 | 北京市商汤科技开发有限公司 | Image transformation method and device, electronic equipment and computer storage medium |
CN109064430B (en) * | 2018-08-07 | 2020-10-09 | 中国人民解放军陆军炮兵防空兵学院 | Cloud removing method and system for aerial region cloud-containing image |
CN109064430A (en) * | 2018-08-07 | 2018-12-21 | 中国人民解放军陆军炮兵防空兵学院 | It is a kind of to remove cloud method and system containing cloud atlas for region is taken photo by plane |
CN110880203A (en) * | 2018-09-04 | 2020-03-13 | 辉达公司 | Joint composition and placement of objects in a scene |
CN113902921A (en) * | 2018-11-30 | 2022-01-07 | 腾讯科技(深圳)有限公司 | Image processing method, device, equipment and storage medium |
CN113902921B (en) * | 2018-11-30 | 2022-11-25 | 腾讯科技(深圳)有限公司 | Image processing method, device, equipment and storage medium |
CN110322416A (en) * | 2019-07-09 | 2019-10-11 | 腾讯科技(深圳)有限公司 | Image processing method, device and computer readable storage medium |
CN110322416B (en) * | 2019-07-09 | 2022-11-18 | 腾讯科技(深圳)有限公司 | Image data processing method, apparatus and computer readable storage medium |
CN115210772A (en) * | 2020-01-03 | 2022-10-18 | 佩治人工智能公司 | System and method for processing electronic images for universal disease detection |
CN115210772B (en) * | 2020-01-03 | 2023-07-18 | 佩治人工智能公司 | System and method for processing electronic images for universal disease detection |
CN111400754A (en) * | 2020-03-11 | 2020-07-10 | 支付宝(杭州)信息技术有限公司 | Construction method and device of user classification system for protecting user privacy |
CN113205140A (en) * | 2021-05-06 | 2021-08-03 | 中国人民解放军海军航空大学航空基础学院 | Semi-supervised specific radiation source individual identification method based on generative countermeasure network |
CN113205140B (en) * | 2021-05-06 | 2022-11-15 | 中国人民解放军海军航空大学 | Semi-supervised specific radiation source individual identification method based on generative countermeasure network |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN107330954A (en) | A kind of method based on attenuation network by sliding attribute manipulation image | |
Bolin et al. | A perceptually based adaptive sampling algorithm | |
CN105719263B (en) | Visible ray and infrared image fusion method based on NSCT domains bottom visual signature | |
CN110110617A (en) | Medical image dividing method, device, electronic equipment and storage medium | |
Liu et al. | Criteria to evaluate the fidelity of image enhancement by MSRCR | |
CN110322416A (en) | Image processing method, device and computer readable storage medium | |
Bradley et al. | Retina-V1 model of detectability across the visual field | |
CN109361934A (en) | Image processing method, device, equipment and storage medium | |
CN105787867B (en) | The method and apparatus of processing video image based on neural network algorithm | |
CN110223234A (en) | Depth residual error network image super resolution ratio reconstruction method based on cascade shrinkage expansion | |
CN107590515A (en) | The hyperspectral image classification method of self-encoding encoder based on entropy rate super-pixel segmentation | |
CN105096286A (en) | Method and device for fusing remote sensing image | |
CN107895038A (en) | A kind of link prediction relation recommends method and device | |
CN108876754A (en) | A kind of remote sensing images missing data method for reconstructing based on depth convolutional neural networks | |
CN103065282A (en) | Image fusion method based on sparse linear system | |
CN110348352A (en) | Training method, terminal and storage medium for human face image age migration network | |
CN110415201A (en) | Single exposure super-resolution imaging method and device based on structure light and deep learning | |
CN110866364A (en) | Ground surface temperature downscaling method based on machine learning | |
CN111179196A (en) | Multi-resolution depth network image highlight removing method based on divide-and-conquer | |
Kaur et al. | Skin lesion segmentation using an improved framework of encoder‐decoder based convolutional neural network | |
Wang et al. | An image fusion algorithm based on lifting wavelet transform | |
Tufail et al. | Optimisation of transmission map for improved image defogging | |
CN117197627B (en) | Multi-mode image fusion method based on high-order degradation model | |
CN103198456A (en) | Remote sensing image fusion method based on directionlet domain hidden Markov tree (HMT) model | |
CN111768326B (en) | High-capacity data protection method based on GAN (gas-insulated gate bipolar transistor) amplified image foreground object | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | WW01 | Invention patent application withdrawn after publication | Application publication date: 20171107 |