CN108171324A - Variational auto-encoding mixture model - Google Patents
Variational auto-encoding mixture model
- Publication number: CN108171324A
- Application number: CN201711433048.1A
- Authority: CN (China)
- Prior art keywords
- model
- variational
- autoencoding
- latent variable
- weight
- Prior art date
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Classifications
- G - PHYSICS
- G06 - COMPUTING; CALCULATING OR COUNTING
- G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00 - Computing arrangements based on biological models
- G06N3/02 - Neural networks
- G06N3/04 - Architecture, e.g. interconnection topology
- G06N3/047 - Probabilistic or stochastic networks
- G06N3/08 - Learning methods
Abstract
The present invention relates to a variational auto-encoding mixture model. Its technical features are: the model is composed of K variational auto-encoder components, where each component is indicated by a K-dimensional binary random latent variable; the probabilistic decoder model and probabilistic encoder model of each component are formed from single-hidden-layer neural networks; and the posterior probability distribution of the indicator latent variable is represented by a neural network based on the stick-breaking construction. The invention is reasonably designed: it uses the auto-encoding mixture model to estimate the relationship between latent variables and samples, which improves the model's ability to generate samples; learning the weights of the mixture of generative models from the latent variable keeps sampling of the latent variables simple and allows the best generative component to be selected autonomously through the latent variable when sampling new data; at the same time, the invention effectively extends the latent-variable space, i.e. the probabilistic-encoding space, and improves the expressive precision of the model.
Description
Technical field
The invention belongs to the field of machine learning, and in particular to a variational auto-encoding mixture model.
Background technology
Variational autoencoders (Variational Autoencoders, VAEs) are an important class of representation models that use variational methods to approximately solve for a generative model (probabilistic decoder) and a recognition model (probabilistic encoder). Let X = {x1, x2, …, xN} denote a set of N independent, identically distributed samples. Each variable x = [x1, x2, …, xd]T is a d-dimensional vector, which may be discrete or continuous. The VAE assumes that the data x are generated by a conditional distribution pθ(x | z), where z is a continuous latent variable with prior distribution pθ(z), and θ denotes the model parameters. The learning task is to solve for the model parameters through the marginal likelihood pθ(x) and the posterior distribution pθ(z | x) of the latent variable z, i.e.:

pθ(x) = ∫ pθ(x | z) pθ(z) dz   (1)

Both the marginal likelihood and the posterior distribution are intractable. The VAE therefore introduces a free distribution qφ(z | x) to approximate the posterior pθ(z | x), converting the intractable integration into an optimization problem over qφ(z | x) with objective LVAEs(x, θ, φ), the variational lower bound:

LVAEs(x, θ, φ) = E_qφ(z|x)[log pθ(x | z)] − KL(qφ(z | x) || pθ(z))   (2)

In the VAE, the conditional distribution pθ(x | z) is called the generative model or probabilistic decoder, and the free distribution qφ(z | x) is called the recognition model or probabilistic encoder. Specifically, qφ(z | x) = N(z; μφ(x), Σφ(x)), where fθ(z), μφ(x) and Σφ(x) are implemented by single-hidden-layer neural networks. The VAE parameters {θ, φ} are learned by solving the optimization problem with stochastic gradient descent.
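The VAE machinery described above (a Gaussian encoder, a Bernoulli decoder, the reparameterization z = μ + δ·ε, and a Monte-Carlo estimate of the variational lower bound) can be sketched as follows. This is a toy illustration under assumed layer sizes and randomly initialized weights, not the patent's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, dz = 784, 200, 20  # data dim, hidden units, latent dim (illustrative sizes)

# Encoder q_phi(z|x) = N(z; mu, diag(delta^2)): one tanh hidden layer.
W3, b3 = 0.01 * rng.standard_normal((h, d)), np.zeros(h)
W4, b4 = 0.01 * rng.standard_normal((dz, h)), np.zeros(dz)
W5, b5 = 0.01 * rng.standard_normal((dz, h)), np.zeros(dz)
# Decoder p_theta(x|z): one tanh hidden layer, sigmoid output for binary pixels.
W1, b1 = 0.01 * rng.standard_normal((h, dz)), np.zeros(h)
W2, b2 = 0.01 * rng.standard_normal((d, h)), np.zeros(d)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def elbo(x):
    """One-sample Monte-Carlo estimate of the variational lower bound."""
    hid = np.tanh(W3 @ x + b3)
    mu, log_var = W4 @ hid + b4, W5 @ hid + b5
    eps = rng.standard_normal(dz)
    z = mu + np.exp(0.5 * log_var) * eps          # reparameterization trick
    y = sigmoid(W2 @ np.tanh(W1 @ z + b1) + b2)   # Bernoulli means
    log_px = np.sum(x * np.log(y + 1e-9) + (1 - x) * np.log(1 - y + 1e-9))
    # KL of a diagonal Gaussian q against the standard-normal prior, in closed form.
    kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)
    return log_px - kl

x = (rng.random(d) > 0.5).astype(float)  # a random binary "image"
print(elbo(x))
```

In practice the gradient of this estimate with respect to {θ, φ} flows through the reparameterized z, which is what makes stochastic-gradient training possible.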
In 2014, Karol Gregor et al., in "Deep AutoRegressive Networks", applied the concept of autoregressive networks to autoencoders, constructing a more complex autoregressive autoencoder network that can accurately fit the latent distribution of the samples. In 2016, Danilo Jimenez Rezende, in "Variational Inference with Normalizing Flows", used normalizing flows to make the posterior distribution of the latent variable in the VAE more complex, obtaining a more scalable family of distributions. Although both the autoregressive autoencoder and the normalizing-flow autoencoder improve the marginal likelihood of variational inference, both approaches disturb the distribution in feature space of the latent variables of different sample classes, so the samples generated from randomly sampled latent variables follow no regular distribution. In 2017, Serena Yeung proposed the "Epitomic Variational Autoencoder (eVAE)". Analyzing the intermediate hidden layer of the VAE, Yeung found that some intermediate hidden units are inactive for most samples, and that the values of those units change little across different samples, with small variance. Based on this observation, the eVAE divides the units of the intermediate hidden layer into several groups: each sample corresponds to one group of hidden units while the other groups are masked, and a latent variable is added to the model to specify which group of intermediate hidden units a sample corresponds to. How to effectively combine the VAE model with an indicator latent variable remains open; the precision and latent space of such models therefore still have shortcomings.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the prior art and to propose a variational auto-encoding mixture model that is reasonably designed, achieves high precision, and effectively extends the latent-variable space.
The present invention solves its technical problem by adopting the following technical scheme:
A variational auto-encoding mixture model is composed of K variational auto-encoder components. Each component is indicated by a K-dimensional binary random latent variable; the probabilistic decoder model and probabilistic encoder model of each component are formed from single-hidden-layer neural networks; and the posterior probability distribution of the indicator latent variable is represented by a neural network based on the stick-breaking construction.
Each component is represented as follows:
Let {θ1, θ2, …, θK} denote the parameters of the mixture components and π = [π1, π2, …, πK] their weights, with Σk πk = 1. The K-dimensional binary indicator latent variable c = [c1, c2, …, cK]T satisfies ck ∈ {0, 1} and Σk ck = 1; then πk = p(ck = 1) is the weight of the k-th component. In the variational auto-encoding mixture model, the probability distribution p(c | π) of the indicator latent variable and the conditional probability distribution p(x | z, c) of the generated data are, respectively:

p(c | π) = ∏k πk^ck,   p(x | z, c) = ∏k pθk(x | z)^ck

The joint probability distribution of the variational auto-encoding mixture model is:
p(x, z, c) = p(x | z, c) p(z) p(c | π).
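Sampling from the joint distribution above proceeds ancestrally: draw the indicator c from p(c | π), draw z from the prior p(z) = N(0, I), and then decode with the component that c selects. A minimal sketch, with hypothetical toy decoders and uniform weights (not the patent's code):

```python
import numpy as np

rng = np.random.default_rng(1)
K, dz, d = 4, 2, 6  # number of components, latent dim, data dim (illustrative)

# One hypothetical single-hidden-layer-free toy decoder per mixture component.
decoders = [(0.1 * rng.standard_normal((d, dz)), rng.standard_normal(d)) for _ in range(K)]
pi = np.full(K, 1.0 / K)  # mixture weights, summing to 1

def sample(n):
    """Ancestral sampling from p(x, z, c) = p(x|z,c) p(z) p(c|pi)."""
    xs = []
    for _ in range(n):
        c = rng.choice(K, p=pi)       # c ~ p(c|pi): the indicator picks one component
        z = rng.standard_normal(dz)   # z ~ p(z) = N(0, I)
        W, b = decoders[c]
        probs = 1.0 / (1.0 + np.exp(-(W @ z + b)))        # Bernoulli means
        xs.append((rng.random(d) < probs).astype(int))    # x ~ p(x|z,c)
    return np.array(xs)

X = sample(5)
print(X.shape)
```

Because exactly one ck equals 1, the products in p(c | π) and p(x | z, c) reduce to selecting a single component, which is what the indexing above implements.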
For a discrete vector x, the conditional distribution pθ(x | z) of the single-hidden-layer neural network is:

log pθ(x | z) = Σi [xi log yi + (1 − xi) log(1 − yi)]
y = fσ(W2 tanh(W1 z + b1) + b2)

where fσ is the sigmoid function, W1, b1 are the weight and bias of the input layer of the single-hidden-layer network, and W2, b2 the weight and bias of its output layer, so the parameters are θ = {W1, W2, b1, b2}.
For the continuous latent variable z, the conditional distribution qφ(z | x) based on a single-hidden-layer neural network is:

log qφ(z | x) = log N(z; μ, δ2I)
μ = W4 h + b4
log δ2 = W5 h + b5
h = tanh(W3 x + b3)

where W3, b3 are the weight and bias of the input layer of the single-hidden-layer network in the probabilistic encoder, and W4, b4, W5, b5 the weights and biases of its output layer, so the parameters are φ = {W3, W4, W5, b3, b4, b5}.
The posterior probability distribution of the indicator latent variable is represented as follows:
For the latent variable π, the posterior qη(π | z) is learned with a single-hidden-layer neural network:

qη(π | z) = Dir(π; α)
α = tanh(W7(W6 z + b6) + b7)

where W6, b6 are the weight and bias of the input layer of the single-hidden-layer network, and W7, b7 the weight and bias of its output layer, so the parameters are η = {W6, W7, b6, b7}.
The variational auto-encoding mixture model includes the model parameters of the probabilistic decoder model, the model parameters of the probabilistic encoder model, and the model parameters of the posterior probability distribution of the indicator latent variable; these parameters are computed by optimizing the objective function with gradient descent.
The advantages and positive effects of the present invention are:
The present invention is reasonably designed. It uses the auto-encoding mixture model to estimate the relationship between latent variables and samples, which improves the model's ability to generate samples. Learning the weights of the mixture of generative models from the latent variable keeps sampling of the latent variables simple and allows the best generative component to be selected autonomously through the latent variable when sampling new data. The invention effectively extends the latent-variable space, i.e. the probabilistic-encoding space, improves the expressive precision of the model, and generates samples efficiently.
Description of the drawings
Fig. 1 is the graphical-model structure diagram of the variational auto-encoding mixture model;
Fig. 2 shows the MNIST handwritten-digit data set;
Fig. 3 shows the convergence of the variational lower bound during training of the variational auto-encoding mixture model on the MNIST data set;
Fig. 4 shows new handwritten-digit samples generated after training the variational auto-encoding mixture model on the MNIST data set;
Fig. 5 shows handwritten digits generated by uniform sampling of the latent-variable space after training the variational auto-encoding mixture model on the MNIST data set.
Specific embodiments
Embodiments of the present invention are further described below with reference to the drawings.
As shown in Fig. 1, a variational auto-encoding mixture model is formed from K variational auto-encoder components. Let {θ1, θ2, …, θK} denote the parameters of the components and π = [π1, π2, …, πK] their weights, with Σk πk = 1. A K-dimensional binary random latent variable c = [c1, c2, …, cK]T is introduced, satisfying ck ∈ {0, 1} and Σk ck = 1; then πk = p(ck = 1) is the weight of the k-th component. The probability form of the variational auto-encoding mixture model is:

p(x | z, π) = Σk πk pθk(x | z)   (5)

The joint probability form of the variational auto-encoding mixture model is:

p(x, z, c) = p(x | z, c) p(z) p(c | π)   (6)

Computing the posterior distributions p(z | x) and p(c | x) of the latent variables z and c is intractable. Following the variational approximate-inference method, free distributions qφ(z | x) and qη(c | x) are introduced to convert the integration problem into an optimization problem; the derivation yields the variational lower bound:

log p(x) ≥ E_qφ(z|x)qη(c|x)[log p(x | z, c)] − KL(qφ(z | x) || p(z)) − KL(qη(c | x) || p(c | π)) = L(x, θ, φ, η)   (7)

Therefore, the variational optimization problem of the variational auto-encoding mixture model is:

max{θ,φ,η} L(x, θ, φ, η)   (8)

The learning task on the variational auto-encoding mixture model is to learn the model parameters {θ, φ, η} by solving the variational optimization problem (8). The graphical-model representation of the variational auto-encoding mixture model is shown in Fig. 1.
In the variational auto-encoding mixture model, the probabilistic decoder model (or generative model) pθ(x | z) and the probabilistic encoder model (or recognition model) qφ(z | x) are formed from single-hidden-layer neural networks. Specifically, when the vector x is discrete, the conditional distribution pθ(x | z) of the single-hidden-layer neural network is:

log pθ(x | z) = Σi [xi log yi + (1 − xi) log(1 − yi)]
y = fσ(W2 tanh(W1 z + b1) + b2)

where fσ is the sigmoid function, W1, b1 are the weight and bias of the input layer of the single-hidden-layer network in the probabilistic decoder, and W2, b2 the weight and bias of its output layer, so the parameters are θ = {W1, W2, b1, b2}.
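For binary x, the decoder likelihood just defined can be evaluated directly. A small sketch with hypothetical weights and toy sizes, purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
dz, h, d = 2, 8, 5  # latent dim, hidden units, data dim (illustrative)

# Hypothetical single-hidden-layer decoder parameters theta = {W1, W2, b1, b2}.
W1, b1 = rng.standard_normal((h, dz)), np.zeros(h)
W2, b2 = rng.standard_normal((d, h)), np.zeros(d)

def log_p_x_given_z(x, z):
    """Bernoulli log-likelihood log p_theta(x|z), y = sigmoid(W2 tanh(W1 z + b1) + b2)."""
    y = 1.0 / (1.0 + np.exp(-(W2 @ np.tanh(W1 @ z + b1) + b2)))
    return float(np.sum(x * np.log(y + 1e-9) + (1 - x) * np.log(1 - y + 1e-9)))

x = np.array([1, 0, 1, 1, 0], dtype=float)
z = rng.standard_normal(dz)
print(log_p_x_given_z(x, z))
```

Each output unit yi is the mean of an independent Bernoulli over pixel xi, which is why the log-likelihood is a sum of per-dimension terms.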
For the continuous latent variable z, the conditional distribution qφ(z | x) based on a single-hidden-layer neural network is:

log qφ(z | x) = log N(z; μ, δ2I)
μ = W4 h + b4
log δ2 = W5 h + b5
h = tanh(W3 x + b3)

where W3, b3 are the weight and bias of the input layer of the single-hidden-layer network in the probabilistic encoder, and W4, b4, W5, b5 the weights and biases of its output layer, so the parameters are φ = {W3, W4, W5, b3, b4, b5}.
The latent variable π takes its conjugate prior, the Dirichlet distribution Dir(α), and p(π) is approximated with the classical stick-breaking construction of the Dirichlet distribution, i.e.

(π1, π2, …, πK) ~ Dir(α1, α2, …, αK)   (11)
σk ~ Beta(1, αk), k = 1, …, K − 1   (12)
πk = σk ∏j<k (1 − σj)   (13)

In the variational auto-encoding mixture model, the Beta draws of the stick-breaking process above are realized with a differentiable mapping:

π1 = sigmoid(f1(z, η))   (14)
π2 = (1 − sigmoid(f1(z, η))) sigmoid(f2(z, η))   (15)
…
πK = (1 − sigmoid(fK−1(z, η))) ⋯ (1 − sigmoid(f1(z, η)))   (16)

This construction can be written compactly as:

πk = sigmoid(fk(z, η)) ∏j<k (1 − sigmoid(fj(z, η))), with sigmoid(fK(z, η)) ≡ 1 so that Σk πk = 1   (17)
For the latent variable π, the posterior qη(π | z) is learned with a single-hidden-layer neural network:

qη(π | z) = Dir(π; α)
α = tanh(W7(W6 z + b6) + b7)   (19)

where W6, b6 are the weight and bias of the input layer of the single-hidden-layer network, and W7, b7 the weight and bias of its output layer, so the parameters are η = {W6, W7, b6, b7}.
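The stick-breaking mapping of equations (14)-(16) guarantees that the weights lie on the simplex: each sigmoid breaks off a fraction of the remaining stick, and the K-th component takes whatever is left, so the weights always sum to 1. A minimal sketch, where the inputs stand in for hypothetical fk(z, η) values:

```python
import numpy as np

def stick_breaking(logits):
    """Map K-1 unconstrained values f_1..f_{K-1} to K mixture weights pi on the simplex."""
    v = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))  # sigmoid(f_k)
    pi, remaining = [], 1.0
    for vk in v:
        pi.append(remaining * vk)   # pi_k = sigmoid(f_k) * prod_{j<k}(1 - sigmoid(f_j))
        remaining *= (1.0 - vk)     # shrink the remaining stick
    pi.append(remaining)            # pi_K takes the rest of the stick
    return np.array(pi)

pi = stick_breaking([0.3, -1.2, 2.0])  # K = 4 weights from 3 logits
print(pi, pi.sum())
```

Because every operation here is differentiable in the logits, gradients can flow from the mixture weights back into the network producing fk(z, η).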
L(x, θ, φ, η) is the variational lower bound of the marginal likelihood of the variational auto-encoding mixture model, and the goal of training is to maximize this value. To obtain an estimate with minimal bias, a mini-batch of samples XM = {x(1), …, x(M)} is processed at a time. The latent variable z is sampled with the reparameterization trick, and the objective function L(x, θ, φ, η) is then optimized with stochastic gradient descent.
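The effect of mini-batch averaging on the reparameterized estimator can be checked numerically. In this toy sketch (assumed scalar encoder outputs and a made-up reconstruction term, for illustration only), averaging the one-sample lower-bound estimate over M = 8 draws reduces its variance roughly M-fold:

```python
import numpy as np

rng = np.random.default_rng(2)
mu, log_var = 0.5, -1.0  # hypothetical encoder outputs for a single sample x

def elbo_draw():
    """One reparameterized draw of a toy lower-bound estimate for fixed (mu, log_var)."""
    eps = rng.standard_normal()
    z = mu + np.exp(0.5 * log_var) * eps                   # z = mu + delta * eps
    log_px = -0.5 * (z - 1.0) ** 2                         # stand-in reconstruction term
    kl = 0.5 * (np.exp(log_var) + mu**2 - 1.0 - log_var)   # closed-form KL to N(0, 1)
    return log_px - kl

single = np.var([elbo_draw() for _ in range(2000)])
batched = np.var([np.mean([elbo_draw() for _ in range(8)]) for _ in range(2000)])
print(single, batched)  # averaging 8 draws shrinks the estimator variance about 8x
```

The KL term is deterministic given (μ, log δ2); only the reconstruction term is stochastic, which is why averaging over draws tames the gradient noise.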
The variational auto-encoding mixture model is illustrated below with the MNIST handwritten-digit data set shown in Fig. 2. The MNIST data set comes from the U.S. National Institute of Standards and Technology and contains the ten handwritten digits 0-9. The training set contains 60,000 digit images, written by 250 different people: 50% high-school students and 50% employees of the Census Bureau. A few MNIST examples are shown in Fig. 2. The variational auto-encoding mixture model is trained on the MNIST data set by solving optimization problem (8) to compute the model parameters {θ, φ, η} and the variational lower bound L(x, θ, φ, η). The convergence of the variational lower bound is shown in Fig. 3, where the abscissa is the number of iterations and the ordinate is the variational lower bound. After training, the generative model pθ(x | z) is obtained and can be used to generate new handwritten-digit samples; some newly generated samples are shown in Fig. 4. Fig. 5 shows handwritten digits generated by uniform sampling of the latent space z after training the variational auto-encoding mixture model on the MNIST data set.
It should be emphasized that the embodiments described above are illustrative rather than restrictive; the present invention therefore includes, but is not limited to, the embodiments described in the detailed description. Other embodiments obtained by those skilled in the art from the technical scheme of the present invention also fall within the scope of protection of the present invention.
Claims (5)
1. A variational auto-encoding mixture model, characterized in that: it is composed of K variational auto-encoder components; each component is indicated by a K-dimensional binary random latent variable; the probabilistic decoder model and probabilistic encoder model of each component are formed from single-hidden-layer neural networks; and the posterior probability distribution of the indicator latent variable is represented by a neural network based on the stick-breaking construction.
2. The variational auto-encoding mixture model according to claim 1, characterized in that each component is represented as follows:
Let {θ1, θ2, …, θK} denote the parameters of the mixture components and π = [π1, π2, …, πK] their weights, with Σk πk = 1. The K-dimensional binary indicator latent variable c = [c1, c2, …, cK]T satisfies ck ∈ {0, 1} and Σk ck = 1; then πk = p(ck = 1) is the weight of the k-th component. In the variational auto-encoding mixture model, the probability distribution p(c | π) of the indicator latent variable and the conditional probability distribution p(x | z, c) of the generated data are, respectively:

p(c | π) = ∏k πk^ck,   p(x | z, c) = ∏k pθk(x | z)^ck

The joint probability distribution of the variational auto-encoding mixture model is:
p(x, z, c) = p(x | z, c) p(z) p(c | π).
3. The variational auto-encoding mixture model according to claim 1, characterized in that the conditional distribution pθ(x | z) of the single-hidden-layer neural network is:

log pθ(x | z) = Σi [xi log yi + (1 − xi) log(1 − yi)]
y = fσ(W2 tanh(W1 z + b1) + b2)

where W1, b1 are the weight and bias of the input layer of the single-hidden-layer network in the probabilistic decoder, and W2, b2 the weight and bias of its output layer, so the parameters are θ = {W1, W2, b1, b2}.
For the continuous latent variable z, the conditional distribution qφ(z | x) based on a single-hidden-layer neural network is:

log qφ(z | x) = log N(z; μ, δ2I)
μ = W4 h + b4
log δ2 = W5 h + b5
h = tanh(W3 x + b3)

where W3, b3 are the weight and bias of the input layer of the single-hidden-layer network in the probabilistic encoder, and W4, b4, W5, b5 the weights and biases of its output layer, so the parameters are φ = {W3, W4, W5, b3, b4, b5}.
4. The variational auto-encoding mixture model according to claim 1, characterized in that the posterior probability distribution of the indicator latent variable is represented as follows:
For the latent variable π, the posterior qη(π | z) is learned with a single-hidden-layer neural network:

qη(π | z) = Dir(π; α)
α = tanh(W7(W6 z + b6) + b7)

where W6, b6 are the weight and bias of the input layer of the single-hidden-layer network, and W7, b7 the weight and bias of its output layer, so the parameters are η = {W6, W7, b6, b7}.
5. The variational auto-encoding mixture model according to claim 1, characterized in that the variational auto-encoding mixture model includes the model parameters of the probabilistic decoder model, the model parameters of the probabilistic encoder model, and the model parameters of the posterior probability distribution of the indicator latent variable; these parameters are computed by optimizing the objective function with gradient descent.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711433048.1A CN108171324A (en) | 2017-12-26 | 2017-12-26 | Variational auto-encoding mixture model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108171324A true CN108171324A (en) | 2018-06-15 |
Family
ID=62521049
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711433048.1A Pending CN108171324A (en) | 2017-12-26 | 2017-12-26 | A kind of variation own coding mixed model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108171324A (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108959551A * | 2018-06-29 | 2018-12-07 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method, device, storage medium and terminal device for mining neighbor semantics |
CN109543100A * | 2018-10-31 | 2019-03-29 | Shanghai Jiao Tong University | User-interest modeling method and system based on collaborative learning |
CN110161480A * | 2019-06-18 | 2019-08-23 | Xidian University | Radar target recognition method based on a semi-supervised deep probabilistic model |
CN110753239A * | 2018-07-23 | 2020-02-04 | Shenzhen Horizon Robotics | Video prediction method, video prediction device, electronic equipment and vehicle |
CN110753239B * | 2018-07-23 | 2022-03-08 | Shenzhen Horizon Robotics | Video prediction method, video prediction device, electronic equipment and vehicle |
CN111224670A * | 2018-11-27 | 2020-06-02 | Fujitsu | Autoencoder, and method and medium for training the same |
CN111224670B * | 2018-11-27 | 2023-09-15 | Fujitsu | Autoencoder, and method and medium for training the same |
CN111243045A * | 2020-01-10 | 2020-06-05 | Hangzhou Dianzi University | Image generation method based on a variational autoencoder with a Gaussian-mixture-model prior |
CN111243045B * | 2020-01-10 | 2023-04-07 | Hangzhou Dianzi University | Image generation method based on a variational autoencoder with a Gaussian-mixture-model prior |
EP3767542A1 * | 2019-07-17 | 2021-01-20 | Robert Bosch GmbH | Training, data synthesis and probability inference using a nonlinear conditional normalizing-flow model |
Family events
- 2017-12-26: CN application CN201711433048.1A filed; patent CN108171324A (en), status Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20180615 |