CN108171324A - A variational autoencoder mixture model - Google Patents

A variational autoencoder mixture model

Info

Publication number
CN108171324A
CN108171324A (application CN201711433048.1A)
Authority
CN
China
Prior art keywords
model
variation
coding
hidden variable
weight
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711433048.1A
Other languages
Chinese (zh)
Inventor
陈亚瑞
蒋硕然
赵青
杨巨成
张传雷
赵希
刘建征
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University of Science and Technology
Original Assignee
Tianjin University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University of Science and Technology
Priority to CN201711433048.1A
Publication of CN108171324A
Legal status: Pending


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/047: Probabilistic or stochastic networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Machine Translation (AREA)

Abstract

The present invention relates to a variational autoencoder mixture model. Its technical features are: it is composed of K variational autoencoder models, where each variational autoencoder model is indicated by a K-dimensional binary random hidden variable; the probabilistic decoding model and the probabilistic encoding model of each variational autoencoder model are formed by single-hidden-layer neural networks; and the posterior probability distribution of the indicator hidden variable is formed by a neural network based on a stick-breaking construction. The present invention is reasonably designed: it uses the autoencoder mixture model to estimate the relationship between hidden variables and samples, which improves the model's ability to generate samples; learning the weights of the mixture generative model from the hidden variables keeps sampling of the hidden variables simple and lets the best generative model be determined autonomously by the hidden variable when sampling generates a sample; and the invention effectively extends the latent variable space, i.e. the probabilistic encoding space, improving the expressive precision of the model.

Description

A variational autoencoder mixture model
Technical field
The invention belongs to the technical field of machine learning, and in particular relates to a variational autoencoder mixture model.
Background technology
Variational autoencoders (VAEs) are an important class of representation models, which use variational methods to approximately solve for the generative model (probabilistic decoder) and the recognition model (probabilistic encoder). Let X = {x_1, x_2, ..., x_N} denote a set of N independent and identically distributed samples. The variable x = [x_1, x_2, ..., x_d]^T is a d-dimensional vector, which may be discrete or continuous. The VAEs model assumes that the data x are generated by a conditional distribution p_θ(x|z), where z is a continuous hidden variable with prior distribution p_θ(z) and θ denotes the model parameters. The learning task is then to solve for the model parameters by computing the marginal likelihood p_θ(x) and the posterior distribution p_θ(z|x) of the hidden variable z, i.e.:

p_θ(x) = ∫ p_θ(x|z) p_θ(z) dz   (1)

Computing the marginal likelihood and the posterior distribution is intractable. Variational autoencoding introduces a free distribution q_φ(z|x) to approximate the posterior distribution p_θ(z|x), converting the variational integration problem into an optimization problem over q_φ(z|x), whose objective L_VAEs(x, θ, φ) is solved approximately, i.e.:

log p_θ(x) ≥ E_{q_φ(z|x)}[log p_θ(x|z)] - KL(q_φ(z|x) ‖ p_θ(z)) = L_VAEs(x, θ, φ)   (2)

max_{θ,φ} L_VAEs(x, θ, φ)   (3)

In the variational autoencoder model, the conditional distribution p_θ(x|z) is called the generative model or probabilistic decoder, and the free distribution q_φ(z|x) is called the recognition model or probabilistic encoder. Specifically, p_θ(x|z) is parameterized by f_θ(z) and q_φ(z|x) = N(z; μ_φ(x), Σ_φ(x)), where f_θ(z), μ_φ(x) and Σ_φ(x) are formed by single-hidden-layer neural networks. Stochastic gradient descent is used to solve optimization problem (3) and learn the variational autoencoder parameters {θ, φ}.
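As a concrete illustration (not part of the patent text), the following minimal NumPy sketch computes a one-sample Monte Carlo estimate of the lower bound (2), assuming a Bernoulli decoder and the single-hidden-layer networks described above; all layer sizes and parameter initializations are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, hdim, dz = 784, 200, 20                   # input, hidden, latent sizes (assumed)

# Illustrative parameters theta = {W1, b1, W2, b2}, phi = {W3, b3, W4, b4, W5, b5}
W1, b1 = rng.normal(0, 0.01, (hdim, dz)), np.zeros(hdim)
W2, b2 = rng.normal(0, 0.01, (d, hdim)), np.zeros(d)
W3, b3 = rng.normal(0, 0.01, (hdim, d)), np.zeros(hdim)
W4, b4 = rng.normal(0, 0.01, (dz, hdim)), np.zeros(dz)
W5, b5 = rng.normal(0, 0.01, (dz, hdim)), np.zeros(dz)

sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

def elbo(x):
    """One-sample Monte Carlo estimate of L_VAEs(x, theta, phi)."""
    hq = np.tanh(W3 @ x + b3)                # encoder hidden layer
    mu, logvar = W4 @ hq + b4, W5 @ hq + b5  # q_phi(z|x) = N(mu, diag(exp(logvar)))
    z = mu + np.exp(0.5 * logvar) * rng.normal(size=dz)       # reparameterization
    y = sigmoid(W2 @ np.tanh(W1 @ z + b1) + b2)               # decoder mean
    log_px = np.sum(x * np.log(y) + (1 - x) * np.log(1 - y))  # Bernoulli log-likelihood
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1 - logvar)    # KL(q || N(0, I))
    return log_px - kl

x = (rng.random(d) > 0.5).astype(float)      # toy binary input
print(elbo(x))
```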
In 2014, Karol Gregor et al. in "Deep AutoRegressive Networks" applied the concept of autoregressive networks to autoencoders, constructing a more complex autoencoder network with autoregressive structure that can accurately fit the latent distribution of samples. In 2016, Danilo Jimenez Rezende in "Variational Inference with Normalizing Flows" used normalizing flows to make the posterior probability distribution of the hidden variables in the variational autoencoder model more complex, obtaining a more scalable family of distributions. Although both the autoregressive autoencoder and the normalizing-flow autoencoder improve the marginal likelihood of variational inference, both approaches disturb the distribution of the hidden variables of different sample classes in the feature space, so samples generated after random sampling of the hidden variables also lack a distributional pattern. In 2017, Serena Yeung proposed the "Epitomic Variational Autoencoder" (eVAE). Analyzing the intermediate hidden layer of the variational autoencoder model, Serena found that some intermediate hidden units stay inactive for most samples, while the values of those units change little across different samples and have small variance. Based on this observation, eVAE divides the units of the intermediate hidden layer into several groups: each sample corresponds to one group of intermediate hidden units, the other groups are masked, and a hidden variable is added to the model to specify which group of intermediate hidden units a sample corresponds to. How to effectively combine the variational autoencoder model with hidden variables remains an open question; the precision and latent space of such models therefore still have problems.
Summary of the invention
It is an object of the present invention to overcome the deficiencies of the prior art by proposing a variational autoencoder mixture model that is reasonably designed, has high precision, and effectively extends the latent variable space.
The present invention solves its technical problem by adopting the following technical scheme:
A variational autoencoder mixture model is composed of K variational autoencoder models. Each variational autoencoder model is indicated by a K-dimensional binary random hidden variable; the probabilistic decoding model and the probabilistic encoding model of each variational autoencoder model are formed by single-hidden-layer neural networks; and the posterior probability distribution of the indicator hidden variable is formed by a neural network based on a stick-breaking construction.
Each variational autoencoder model is represented as follows:
Let {θ_1, θ_2, ..., θ_K} denote the parameters of each mixture component and π = [π_1, π_2, ..., π_K] the weights of the components, with Σ_{k=1}^K π_k = 1. The K-dimensional binary indicator hidden variable c = [c_1, c_2, ..., c_K]^T satisfies c_k ∈ {0, 1} and Σ_{k=1}^K c_k = 1; then π_k = p(c_k = 1) is the weight of the k-th model. In the variational autoencoder mixture model, the probability distribution p(c|π) of the indicator hidden variable and the conditional probability distribution p(x|z, c) of the generated data are respectively:

p(c|π) = ∏_{k=1}^K π_k^{c_k}

p(x|z, c) = ∏_{k=1}^K p_{θ_k}(x|z)^{c_k}

The joint probability distribution of the variational autoencoder mixture model is:

p(x, z, c) = p(x|z, c) p(z) p(c|π).
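For illustration, the following sketch (assumed NumPy code; the K decoders and their sizes are illustrative, not the patent's trained parameters) draws a sample from the joint distribution by ancestral sampling: z ~ p(z), c ~ p(c|π), then x ~ p(x|z, c).

```python
import numpy as np

rng = np.random.default_rng(1)
K, dz, hdim, d = 5, 20, 200, 784
pi = np.full(K, 1.0 / K)                       # component weights, sum to 1
thetas = [(rng.normal(0, 0.01, (hdim, dz)), np.zeros(hdim),
           rng.normal(0, 0.01, (d, hdim)), np.zeros(d)) for _ in range(K)]

def sample_x():
    z = rng.normal(size=dz)                    # z ~ p(z) = N(0, I)
    c = rng.multinomial(1, pi)                 # one-hot c ~ p(c|pi)
    k = int(np.argmax(c))                      # index with c_k = 1
    W1, b1, W2, b2 = thetas[k]
    y = 1.0 / (1.0 + np.exp(-(W2 @ np.tanh(W1 @ z + b1) + b2)))
    return (rng.random(d) < y).astype(float)   # x ~ p_{theta_k}(x|z), Bernoulli

print(sample_x()[:10])
```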
The conditional distribution p_θ(x|z) of the single-hidden-layer neural network (for a discrete vector x) is:

log p_θ(x|z) = Σ_{i=1}^d [x_i log y_i + (1 - x_i) log(1 - y_i)]

y = f_σ(W2 tanh(W1 z + b1) + b2)

where W1, b1 denote the weights and bias of the input layer of the single-hidden-layer neural network in the probabilistic decoder, and W2, b2 the weights and bias of its output layer, so the parameters are θ = {W1, W2, b1, b2}.
For the continuous hidden variable z, the conditional distribution q_φ(z|x) based on a single-hidden-layer neural network is:

log q_φ(z|x) = log N(z; μ, δ²I)

μ = W4 h + b4

log δ² = W5 h + b5

h = tanh(W3 x + b3)

where W3, b3 denote the weights and bias of the input layer of the single-hidden-layer network in the probabilistic encoder, and W4, b4, W5, b5 the weights and biases of its output layer, so the parameters are φ = {W3, W4, W5, b3, b4, b5}.
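A small sketch of the encoder computation follows (illustrative NumPy code with assumed parameter shapes); it evaluates log q_φ(z|x) for a reparameterized sample z, which is the quantity needed when estimating the lower bound.

```python
import numpy as np

rng = np.random.default_rng(2)
d, hdim, dz = 784, 200, 20
W3, b3 = rng.normal(0, 0.01, (hdim, d)), np.zeros(hdim)
W4, b4 = rng.normal(0, 0.01, (dz, hdim)), np.zeros(dz)
W5, b5 = rng.normal(0, 0.01, (dz, hdim)), np.zeros(dz)

x = (rng.random(d) > 0.5).astype(float)
hq = np.tanh(W3 @ x + b3)                  # h = tanh(W3 x + b3)
mu, logvar = W4 @ hq + b4, W5 @ hq + b5    # mu and log delta^2
z = mu + np.exp(0.5 * logvar) * rng.normal(size=dz)   # reparameterized sample

# Diagonal Gaussian log-density log N(z; mu, diag(exp(logvar)))
log_q = -0.5 * np.sum(logvar + np.log(2 * np.pi) + (z - mu) ** 2 / np.exp(logvar))
print(log_q)
```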
The posterior probability distribution of the indicator hidden variable is represented as follows:
For the hidden variable π, a single-layer neural network is used to learn the posterior q_η(π|z):

α = tanh(W7 (W6 z + b6) + b7)

where W6, b6 denote the weights and bias of the input layer of the network and W7, b7 the weights and bias of its output layer, so the parameters are η = {W6, W7, b6, b7}.
The variational autoencoder mixture model comprises the model parameters of the probabilistic decoding model, the model parameters of the probabilistic encoding model, and the model parameters of the posterior probability distribution of the indicator hidden variable; these parameters are computed by optimizing the objective function with gradient descent.
The advantages and positive effects of the present invention are:
The present invention is reasonably designed. It uses the autoencoder mixture model to estimate the relationship between the hidden variables and the samples, which improves the model's ability to generate samples. Learning the weights of the mixture generative model from the hidden variables keeps sampling of the hidden variables simple, and lets the best generative model be determined autonomously by the hidden variable when sampling generates a sample. The present invention effectively extends the latent variable space, i.e. the probabilistic encoding space, improves the expressive precision of the model, and can generate samples efficiently.
Description of the drawings
Fig. 1 is the graphical-model structure diagram of the variational autoencoder mixture model;
Fig. 2 shows the MNIST handwritten digit data set;
Fig. 3 shows the convergence of the variational lower bound while training the variational autoencoder mixture model on the MNIST data set;
Fig. 4 shows new handwritten digit samples generated after training the variational autoencoder mixture model on the MNIST data set;
Fig. 5 shows handwritten digits generated by uniform sampling of the latent variable space after training the variational autoencoder mixture model on the MNIST data set.
Specific embodiments
Embodiments of the present invention are further described below with reference to the drawings.
As shown in Fig. 1, a variational autoencoder mixture model is formed from K variational autoencoder models. Let {θ_1, θ_2, ..., θ_K} denote the parameters of each mixture component and π = [π_1, π_2, ..., π_K] the weights of the components, with Σ_{k=1}^K π_k = 1. Introduce the K-dimensional binary random hidden variable c = [c_1, c_2, ..., c_K]^T, satisfying c_k ∈ {0, 1} and Σ_{k=1}^K c_k = 1; then π_k = p(c_k = 1) is the weight of the k-th model. The probability form of the variational autoencoder mixture model is:

p(c|π) = ∏_{k=1}^K π_k^{c_k}   (4)

p(x|z, c) = ∏_{k=1}^K p_{θ_k}(x|z)^{c_k}   (5)

The joint probability form under the variational autoencoder mixture model is:

p(x, z, c) = p(x|z, c) p(z) p(c|π)   (6)
Computing the posterior probability distributions p(z|x) and p(c|x) of the hidden variables z and c is intractable. Following the variational approximate inference method, free distributions q_φ(z|x) and q_η(c|x) are introduced to convert the variational integration problem into an optimization problem. The specific derivation (by Jensen's inequality) is as follows:

log p(x) = log ∫ Σ_c p(x, z, c) dz ≥ E_{q_φ(z|x) q_η(c|x)}[log p(x, z, c) - log q_φ(z|x) - log q_η(c|x)] = L(x, θ, φ, η)   (7)

Therefore, the variational optimization problem in the variational autoencoder mixture model is:

max_{θ,φ,η} L(x, θ, φ, η)   (8)

The learning task for the variational autoencoder mixture model is to learn the parameters {θ, φ, η} by solving the variational optimization problem (8). The graphical-model representation of the variational autoencoder mixture model is shown in Fig. 1.
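The following sketch (an illustrative NumPy estimate, not the patent's exact estimator) computes a one-sample Monte Carlo value of the mixture lower bound L(x, θ, φ, η). It takes the stick-breaking weights π(z) described below as the responsibilities over c; treating q_η(c|x) as equal to p(c|π(z)) is a simplifying assumption under which the KL term over c vanishes.

```python
import numpy as np

rng = np.random.default_rng(3)
K, d, hdim, dz = 5, 784, 200, 20
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

# Illustrative parameters: K decoders theta_k, one encoder phi, one pi-network eta.
dec = [(rng.normal(0, 0.01, (hdim, dz)), np.zeros(hdim),
        rng.normal(0, 0.01, (d, hdim)), np.zeros(d)) for _ in range(K)]
W3, b3 = rng.normal(0, 0.01, (hdim, d)), np.zeros(hdim)
W4, b4 = rng.normal(0, 0.01, (dz, hdim)), np.zeros(dz)
W5, b5 = rng.normal(0, 0.01, (dz, hdim)), np.zeros(dz)
Wf, bf = rng.normal(0, 0.5, (K - 1, dz)), np.zeros(K - 1)  # f_k(z, eta), k < K

def lower_bound(x):
    # Reparameterized sample from q_phi(z|x)
    hq = np.tanh(W3 @ x + b3)
    mu, logvar = W4 @ hq + b4, W5 @ hq + b5
    z = mu + np.exp(0.5 * logvar) * rng.normal(size=dz)
    kl_z = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1 - logvar)  # KL(q(z|x) || N(0,I))

    # Stick-breaking weights pi(z), equations (14)-(16) below
    v = sigmoid(Wf @ z + bf)
    pi, rest = np.empty(K), 1.0
    for k in range(K - 1):
        pi[k], rest = rest * v[k], rest * (1.0 - v[k])
    pi[K - 1] = rest

    # Expected reconstruction term: sum_k pi_k * log p_{theta_k}(x|z)
    rec = 0.0
    for k, (W1, b1, W2, b2) in enumerate(dec):
        y = sigmoid(W2 @ np.tanh(W1 @ z + b1) + b2)
        rec += pi[k] * np.sum(x * np.log(y) + (1 - x) * np.log(1 - y))
    return rec - kl_z   # KL over c vanishes under the simplifying assumption above

x = (rng.random(d) > 0.5).astype(float)
print(lower_bound(x))
```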
For the variational autoencoder mixture model, the probabilistic decoding model (or generative model) p_θ(x|z) and the probabilistic encoding model (or recognition model) q_φ(z|x) are formed by single-hidden-layer neural networks. Specifically, when the vector x is discrete, the conditional distribution p_θ(x|z) based on the single-hidden-layer neural network is:

log p_θ(x|z) = Σ_{i=1}^d [x_i log y_i + (1 - x_i) log(1 - y_i)]   (9)

y = f_σ(W2 tanh(W1 z + b1) + b2)

where W1, b1 denote the weights and bias of the input layer of the single-hidden-layer neural network in the probabilistic decoder, and W2, b2 the weights and bias of its output layer, so the parameters are θ = {W1, W2, b1, b2}.

For the continuous hidden variable z, the conditional distribution q_φ(z|x) based on the single-hidden-layer neural network is:

log q_φ(z|x) = log N(z; μ, δ²I), μ = W4 h + b4, log δ² = W5 h + b5, h = tanh(W3 x + b3)   (10)

where W3, b3 denote the weights and bias of the input layer of the single-hidden-layer network in the probabilistic encoder, and W4, b4, W5, b5 the weights and biases of its output layer, so the parameters are φ = {W3, W4, W5, b3, b4, b5}.
For the hidden variable π, its conjugate prior, the Dirichlet distribution Dir(α), is adopted, and p(π) is approximated with the classical stick-breaking construction of the Dirichlet distribution, i.e.

(π_1, π_2, ..., π_K) ~ Dir(α_1, α_2, ..., α_K)   (11)

v_k ~ Beta(α_k, Σ_{j=k+1}^K α_j)   (12)

π_k = v_k ∏_{j=1}^{k-1} (1 - v_j)   (13)
In the variational autoencoder mixture model, the Beta draws in the above stick-breaking process are realized by means of a differentiable mapping:

π_1 = sigmoid(f_1(z, η))   (14)

π_2 = (1 - sigmoid(f_1(z, η))) sigmoid(f_2(z, η))   (15)

...

π_K = (1 - sigmoid(f_{K-1}(z, η))) ... (1 - sigmoid(f_1(z, η)))   (16)

The above construction can be further written compactly as:

π_k = sigmoid(f_k(z, η)) ∏_{j=1}^{k-1} (1 - sigmoid(f_j(z, η))) for k < K, and π_K = ∏_{j=1}^{K-1} (1 - sigmoid(f_j(z, η)))   (17)
For the hidden variable π, a single-layer neural network is used to learn the posterior q_η(π|z):

α = tanh(W7 (W6 z + b6) + b7)   (19)

where W6, b6 denote the weights and bias of the input layer of the network and W7, b7 the weights and bias of its output layer, so the parameters are η = {W6, W7, b6, b7}.
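The differentiable stick-breaking mapping (14)-(16) can be sketched as follows (illustrative Python; realizing the functions f_k(z, η) as a single linear layer is an assumed choice, not mandated by the patent):

```python
import numpy as np

def stick_breaking_weights(z, Wf, bf):
    """Map a latent sample z to mixture weights pi that are positive and sum to 1."""
    v = 1.0 / (1.0 + np.exp(-(Wf @ z + bf)))     # sigmoid(f_k(z, eta)), k = 1..K-1
    K = len(v) + 1
    pi, rest = np.empty(K), 1.0                  # rest = remaining stick length
    for k in range(K - 1):
        pi[k] = rest * v[k]                      # break off a piece of the stick
        rest *= (1.0 - v[k])
    pi[K - 1] = rest                             # last component keeps the remainder
    return pi

rng = np.random.default_rng(4)
dz, K = 20, 5
Wf, bf = rng.normal(0, 0.5, (K - 1, dz)), np.zeros(K - 1)
pi = stick_breaking_weights(rng.normal(size=dz), Wf, bf)
print(pi, pi.sum())                              # weights are positive and sum to 1
```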
L(x, θ, φ, η) is the variational lower bound on the marginal likelihood of the variational autoencoder mixture model, and the goal of the model is to maximize this value. To obtain an estimate with minimal bias, a minibatch of samples X^M = {x^(1), ..., x^(M)} is processed at a time. The hidden variable z is sampled with the reparameterization trick, and the objective L(x, θ, φ, η) is then optimized by stochastic gradient descent.
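A minimal training-step sketch using PyTorch autograd is shown below; the module layout, layer sizes, and learning rate are assumptions for illustration, not the patent's exact configuration, and the π-weighted reconstruction term follows the same simplifying assumption as the estimate above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixtureVAE(nn.Module):
    def __init__(self, d=784, h=200, dz=20, K=5):
        super().__init__()
        self.enc_h = nn.Linear(d, h)
        self.enc_mu = nn.Linear(h, dz)
        self.enc_logvar = nn.Linear(h, dz)
        self.decoders = nn.ModuleList(
            [nn.Sequential(nn.Linear(dz, h), nn.Tanh(), nn.Linear(h, d))
             for _ in range(K)])
        self.f = nn.Linear(dz, K - 1)            # stick-breaking logits f_k(z, eta)

    def lower_bound(self, x):
        hq = torch.tanh(self.enc_h(x))
        mu, logvar = self.enc_mu(hq), self.enc_logvar(hq)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterize
        v = torch.sigmoid(self.f(z))
        ones = torch.ones_like(v[:, :1])
        rest = torch.cumprod(torch.cat([ones, 1 - v], dim=1), dim=1)  # prod_{j<k}(1-v_j)
        pi = torch.cat([v, ones], dim=1) * rest                  # weights sum to 1
        rec = 0.0
        for k, dec in enumerate(self.decoders):                  # pi-weighted Bernoulli
            logp = -F.binary_cross_entropy_with_logits(
                dec(z), x, reduction="none").sum(dim=1)
            rec = rec + pi[:, k] * logp
        kl = 0.5 * (logvar.exp() + mu**2 - 1 - logvar).sum(dim=1)
        return (rec - kl).mean()                 # minibatch lower-bound estimate

model = MixtureVAE()
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
x = torch.rand(32, 784).round()                  # toy binary minibatch X^M
loss = -model.lower_bound(x)                     # maximize L  <=>  minimize -L
opt.zero_grad()
loss.backward()
opt.step()
```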
The variational autoencoder mixture model is illustrated below using the MNIST handwritten digit data set shown in Fig. 2. The MNIST handwritten digit data set comes from the US National Institute of Standards and Technology and contains handwritten versions of the ten digits 0-9. The training set contains 60,000 digit images in total, written by 250 different people: 50% are high school students and 50% are staff of the Census Bureau. A few examples of MNIST handwritten digits are shown in Fig. 2. The variational autoencoder mixture model is trained on the MNIST handwritten digit data set, specifically by solving optimization problem (8) to compute the model parameters {θ, φ, η} and the variational lower bound L(x, θ, φ, η). The convergence of the variational lower bound is shown in Fig. 3, where the abscissa is the number of iterations and the ordinate is the variational lower bound. After training the variational autoencoder model, the generative model p_θ(x|z) is obtained, and new handwritten digit samples can be generated with it; some newly generated samples are shown in Fig. 4. Fig. 5 shows handwritten digits generated by uniform sampling of the hidden variable space z after training the variational autoencoder mixture model on the MNIST data set.
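Assuming the MixtureVAE sketch above has been trained, new digits could be generated roughly as follows (illustrative code; the 28x28 reshape assumes MNIST-sized inputs, and `model` refers to the previous sketch):

```python
import torch

@torch.no_grad()
def generate(model, n=16):
    dz = model.f.in_features
    z = torch.randn(n, dz)                         # z ~ p(z) = N(0, I)
    v = torch.sigmoid(model.f(z))
    ones = torch.ones_like(v[:, :1])
    rest = torch.cumprod(torch.cat([ones, 1 - v], dim=1), dim=1)
    pi = torch.cat([v, ones], dim=1) * rest        # stick-breaking weights
    ks = torch.multinomial(pi, 1).squeeze(1)       # c ~ p(c|pi): one component per z
    imgs = torch.stack([torch.sigmoid(model.decoders[int(k)](zi))
                        for k, zi in zip(ks, z)])  # decoder means as pixel values
    return imgs.view(n, 28, 28)

samples = generate(model)                          # `model` from the previous sketch
```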
It should be emphasized that the described embodiments of the present invention are illustrative rather than limiting. The present invention therefore includes, and is not limited to, the embodiments described in the detailed description; any other embodiments obtained by those skilled in the art from the technical scheme of the present invention likewise fall within the protection scope of the present invention.

Claims (5)

1. A variational autoencoder mixture model, characterized in that: it is composed of K variational autoencoder models; each variational autoencoder model is indicated by a K-dimensional binary random hidden variable; the probabilistic decoding model and the probabilistic encoding model of each variational autoencoder model are formed by single-hidden-layer neural networks; and the posterior probability distribution of the indicator hidden variable is formed by a neural network based on a stick-breaking construction.
2. The variational autoencoder mixture model according to claim 1, characterized in that each variational autoencoder model is represented as follows:
Let {θ_1, θ_2, ..., θ_K} denote the parameters of each mixture component and π = [π_1, π_2, ..., π_K] the weights of the components, with Σ_{k=1}^K π_k = 1. The K-dimensional binary indicator hidden variable c = [c_1, c_2, ..., c_K]^T satisfies c_k ∈ {0, 1} and Σ_{k=1}^K c_k = 1; then π_k = p(c_k = 1) is the weight of the k-th model. In the variational autoencoder mixture model, the probability distribution p(c|π) of the indicator hidden variable and the conditional probability distribution p(x|z, c) of the generated data are respectively:

p(c|π) = ∏_{k=1}^K π_k^{c_k}

p(x|z, c) = ∏_{k=1}^K p_{θ_k}(x|z)^{c_k}

The joint probability distribution of the variational autoencoder mixture model is:

p(x, z, c) = p(x|z, c) p(z) p(c|π).
3. The variational autoencoder mixture model according to claim 1, characterized in that: the conditional distribution p_θ(x|z) of the single-hidden-layer neural network is:

log p_θ(x|z) = Σ_{i=1}^d [x_i log y_i + (1 - x_i) log(1 - y_i)]

y = f_σ(W2 tanh(W1 z + b1) + b2)

where W1, b1 denote the weights and bias of the input layer of the single-hidden-layer neural network in the probabilistic decoder, and W2, b2 the weights and bias of its output layer, so the parameters are θ = {W1, W2, b1, b2}.

For the continuous hidden variable z, the conditional distribution q_φ(z|x) based on a single-hidden-layer neural network is:

log q_φ(z|x) = log N(z; μ, δ²I)

μ = W4 h + b4

log δ² = W5 h + b5

h = tanh(W3 x + b3)

where W3, b3 denote the weights and bias of the input layer of the single-hidden-layer network in the probabilistic encoder, and W4, b4, W5, b5 the weights and biases of its output layer, so the parameters are φ = {W3, W4, W5, b3, b4, b5}.
4. The variational autoencoder mixture model according to claim 1, characterized in that the posterior probability distribution of the indicator hidden variable is represented as follows:
For the hidden variable π, a single-layer neural network is used to learn the posterior q_η(π|z):

α = tanh(W7 (W6 z + b6) + b7)

where W6, b6 denote the weights and bias of the input layer of the network and W7, b7 the weights and bias of its output layer, so the parameters are η = {W6, W7, b6, b7}.
5. The variational autoencoder mixture model according to claim 1, characterized in that: the variational autoencoder mixture model comprises the model parameters of the probabilistic decoding model, the model parameters of the probabilistic encoding model, and the model parameters of the posterior probability distribution of the indicator hidden variable; the above parameters are computed by optimizing the objective function using gradient descent.
CN201711433048.1A 2017-12-26 2017-12-26 A variational autoencoder mixture model Pending CN108171324A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711433048.1A CN108171324A (en) 2017-12-26 2017-12-26 A variational autoencoder mixture model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711433048.1A CN108171324A (en) 2017-12-26 2017-12-26 A variational autoencoder mixture model

Publications (1)

Publication Number Publication Date
CN108171324A true CN108171324A (en) 2018-06-15

Family

ID=62521049

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711433048.1A Pending CN108171324A (en) 2017-12-26 2017-12-26 A variational autoencoder mixture model

Country Status (1)

Country Link
CN (1) CN108171324A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108959551A (en) * 2018-06-29 2018-12-07 北京百度网讯科技有限公司 Method for digging, device, storage medium and the terminal device of neighbour's semanteme
CN110753239A (en) * 2018-07-23 2020-02-04 深圳地平线机器人科技有限公司 Video prediction method, video prediction device, electronic equipment and vehicle
CN110753239B (en) * 2018-07-23 2022-03-08 深圳地平线机器人科技有限公司 Video prediction method, video prediction device, electronic equipment and vehicle
CN109543100A (en) * 2018-10-31 2019-03-29 上海交通大学 User interest modeling method and system based on Cooperative Study
CN111224670A (en) * 2018-11-27 2020-06-02 富士通株式会社 Auto encoder, and method and medium for training the same
CN111224670B (en) * 2018-11-27 2023-09-15 富士通株式会社 Automatic encoder, and method and medium for training the same
CN110161480A (en) * 2019-06-18 2019-08-23 西安电子科技大学 Radar target identification method based on semi-supervised depth probabilistic model
EP3767542A1 (en) * 2019-07-17 2021-01-20 Robert Bosch GmbH Training and data synthesis and probability inference using nonlinear conditional normalizing flow model
CN111243045A (en) * 2020-01-10 2020-06-05 杭州电子科技大学 Image generation method based on Gaussian mixture model prior variation self-encoder
CN111243045B (en) * 2020-01-10 2023-04-07 杭州电子科技大学 Image generation method based on Gaussian mixture model prior variation self-encoder

Similar Documents

Publication Publication Date Title
CN108171324A (en) A variational autoencoder mixture model
Xu et al. Ternary compression for communication-efficient federated learning
CN107463953B (en) Image classification method and system based on quality insertion in the noisy situation of label
CN109558945A (en) The method and device that artificial neural network and floating-point neural network are quantified
CN107808278A (en) A kind of Github open source projects based on sparse self-encoding encoder recommend method
CN108776820A (en) It is a kind of to utilize the improved random forest integrated approach of width neural network
CN104468413B (en) A kind of network service method and system
WO2018133596A1 (en) Continuous feature construction method based on nominal attribute
CN109446414A (en) A kind of software information website fast tag recommended method based on neural network classification
CN110334105A (en) A kind of flow data Outlier Detection Algorithm based on Storm
CN103793438B (en) A kind of parallel clustering method based on MapReduce
CN108804577A (en) A kind of predictor method of information label interest-degree
CN112085086A (en) Multi-source transfer learning method based on graph convolution neural network
Liu et al. Efficient federated learning for AIoT applications using knowledge distillation
CN111950611A (en) Big data two-classification distributed optimization method based on random gradient tracking technology
CN115659807A (en) Method for predicting talent performance based on Bayesian optimization model fusion algorithm
CN110830291A (en) Node classification method of heterogeneous information network based on meta-path
CN115098672A (en) User demand discovery method and system based on multi-view deep clustering
CN108363685A (en) Based on recurrence variation own coding model from media data document representation method
CN108710944A (en) One kind can train piece-wise linear activation primitive generation method
Sandhu et al. Software effort estimation using soft computing techniques
Pradier et al. Projected BNNs: Avoiding weight-space pathologies by learning latent representations of neural network weights
Huo et al. Statistical characteristics of dynamics for population migration driven by the economic interests
CN112286996A (en) Node embedding method based on network link and node attribute information
CN110264311A (en) A kind of business promotion accurate information recommended method and system based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180615