CN108805167B - Sparse depth confidence network image classification method based on Laplace function constraint - Google Patents

Info

Publication number
CN108805167B
Authority
CN
China
Prior art keywords
image
training
rbm
sparse
hidden layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810417793.5A
Other languages
Chinese (zh)
Other versions
CN108805167A (en)
Inventor
宋威
李蓓蓓
王晨妮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangnan University
Original Assignee
Jiangnan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangnan University
Priority to CN201810417793.5A
Publication of CN108805167A
Application granted
Publication of CN108805167B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155Bayesian classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a sparse deep belief network image classification method based on a Laplace function constraint, and belongs to the field of image processing and deep learning. Inspired by analyses of the primate visual cortex, the method introduces a penalty regularization term into the likelihood function in the unsupervised stage and maximizes the objective function with the contrastive divergence (CD) algorithm, while a Laplace sparsity constraint yields a sparse distribution over the training set, so that an intuitive feature representation can be learned from unlabeled data. Secondly, an improved sparse deep belief network is proposed in which the Laplace distribution induces the sparse state of the hidden layer nodes and the scale parameter of the distribution controls the sparsity strength. Finally, the parameters of the LSDBN network are trained by stochastic gradient descent. The method provided by the invention achieves the best recognition accuracy and retains good sparsity when few samples are available per class.

Description

Sparse depth confidence network image classification method based on Laplace function constraint
Technical Field
The invention relates to the field of image processing and deep learning, and in particular to an image classification method using a Laplace Sparse Deep Belief Network (LSDBN), that is, a sparse deep belief network constrained by a Laplace function.
Background
Existing image classification mainly relies on generative or discriminative models. These shallow models have limitations: with limited samples their capacity to express complex functions is restricted and their generalization ability suffers, which degrades classification performance. Image features also contain considerable noise and redundant information and require preprocessing, which consumes substantial time and resources. Excellent feature extraction algorithms and classification models are therefore an important research direction in image processing.
Deep learning has developed rapidly in recent years. In 2006, Hinton et al. proposed the Deep Belief Network (DBN) together with an unsupervised greedy layer-by-layer training algorithm, which addressed the tendency of deep neural networks to fall into local optima and set off a new wave of deep learning research in academia. A DBN obtains an abstract representation of the original data through multi-level feature transformation, which improves the accuracy of tasks such as classification and prediction. Because it learns features automatically and reduces data dimensionality, the DBN has become one of the most widely applied network structures in deep learning, and it has made breakthrough progress in fields such as speech recognition, image classification and face recognition.
An image classification algorithm built on a DBN learns the feature representation of every layer jointly and preserves the spatial information of the image features; it exploits the DBN's ability to learn classification features automatically and avoids the poor generality of traditional feature extraction algorithms. Although the DBN model has achieved encouraging results, feature homogenization occurs during training: a large number of common features arise, the posterior probabilities of the hidden layer units become high, and a useful feature representation of the data cannot be learned well. The problem is especially prominent when there are too few hidden layer units. The current remedy is to adjust the sparsity of the hidden layer nodes and reduce the similarity between the connection weight vectors, that is, to induce sparsity by adding a sparse penalty factor to the network. Studies of the human visual system show that only a few neurons are activated for a given stimulus. Inspired by this, researchers proposed sparse coding theory to simulate the sparse representation of the visual system.
In computer vision, sparse representations are regarded as unaffected by local deformation, and learning a sparse representation focuses on the most important characteristics of an object, so redundant features can be discarded and the effects of overfitting and noise are reduced. Introducing sparsity into the training of a Restricted Boltzmann Machine (RBM) is therefore a meaningful way to avoid feature homogenization. Researchers have proposed a variety of sparse RBM models for this purpose. Some introduced L0 regularization of the hidden layer unit activation probability into the RBM likelihood function, but solving L0 regularization is an NP-hard problem. Because L1 regularization is a convex optimization problem, others introduced L1 regularization of the hidden unit activation probability into the RBM likelihood function and proposed a new sparse deep belief network. Hinton proposed a cross-entropy sparse penalty factor that gives the hidden units overall sparsity. Lee et al. proposed a sparse RBM based on the sum of squared errors (SP-RBM). Researchers have also proposed sparse RBMs based on rate-distortion theory (SR-RBM), but there is no established way to obtain the SR-RBM distortion measure. In summary, for a DBN the sparse behavior of binary hidden units can be achieved with an RBM variant by specifying a "sparse target". However, this approach requires the "sparse target" to be set in advance, and all hidden layer nodes share the same sparsity in a given state.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a sparse depth confidence network image classification method based on Laplace function constraint.
The technical scheme of the invention is as follows:
A sparse deep belief network image classification method based on a Laplace function constraint comprises the following steps:
Step 1, selecting a training image data set and preprocessing the images to obtain a training data set;
Step 2, inputting the training data set preprocessed in step 1 into the LSDBN network model; using the contrastive divergence algorithm (CD-k), each layer of the Laplace-function-constrained Sparse Restricted Boltzmann Machine (LS-RBM) network is trained separately, unsupervised, from bottom to top, with the output of each lower LS-RBM network used as the input of the adjacent upper LS-RBM network; the parameter values (weights and biases) of each LS-RBM network are obtained through iterative training, finally yielding the high-level features of the input image data;
Step 3, taking the parameter values obtained in step 2 as initial values for the fine-tuning stage, and fine-tuning the whole LSDBN network with a top-down back-propagation algorithm to obtain the LSDBN network model;
Step 4, inputting the test image data set into the LSDBN network model obtained in step 3, performing the recognition test with a Softmax classifier, and finally outputting the image classification result.
Step 1 specifically comprises: converting the color image into a gray-level image by a binarization method, and normalizing the gray values of the gray-level image to [0, 1] to obtain the training data set; the normalization formula is:
$$x = \frac{\tilde{x} - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$
where $\tilde{x}$ is a feature value of the image data set, $x_{\max}$ and $x_{\min}$ are respectively the maximum and minimum over all features of the image data set, and $x$ is the normalized image data set.
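The min-max normalization of step 1 can be illustrated with a minimal Python sketch. The function name `min_max_normalize`, the sample gray values, and the handling of a constant image are assumptions made for illustration; the patent does not specify them.

```python
def min_max_normalize(values):
    """Scale a flat list of gray values into [0, 1] by min-max normalization."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Assumption: a constant image maps to all zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    span = x_max - x_min
    return [(v - x_min) / span for v in values]


gray = [0, 64, 128, 255]              # hypothetical 8-bit gray values
normalized = min_max_normalize(gray)  # smallest value maps to 0.0, largest to 1.0
```

In practice the minimum and maximum would be taken over the whole training set, as the text describes, and the same two values would be reused for the test images.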
The step 2 specifically comprises the following steps:
step 2.1, constructing an LSDBN network model, and setting parameter values of the LSDBN network model structure: visible layer nodes, hidden layer numbers, iteration times and fine adjustment times; the hidden layer nodes are determined according to the size of the characteristic dimension of the input image set;
step 2.2, taking the preprocessed training data set x as the input of a first LS-RBM, and training the LS-RBM by adopting a CD algorithm;
(1) The relationship between the visible layer and the hidden layer is expressed as an energy function:
$$E(v, h; \theta) = -\sum_{i=1}^{n}\sum_{j=1}^{m} W_{ij} v_i h_j - \sum_{i=1}^{n} a_i v_i - \sum_{j=1}^{m} b_j h_j \tag{2}$$
where $\theta = \{W_{ij}, a_i, b_j\}$ denotes the model parameters; $W_{ij}$ is the weight matrix between the visible layer and the hidden layer, $a_i$ is the bias of the visible layer nodes, and $b_j$ is the bias of the hidden layer nodes; $i$ indexes the features of the input image, i.e. the visible layer nodes, $n$ in total; $j$ indexes the hidden layer nodes, $m$ in total; $v_i$ denotes the i-th visible layer node and $h_j$ the j-th hidden layer node;
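(2) Based on the energy function of formula (2), the joint probability distribution of v and h in the RBM is obtained as:
$$P(v, h; \theta) = \frac{1}{Z(\theta)} \exp\big(-E(v, h; \theta)\big) \tag{3}$$
$$Z(\theta) = \sum_{v}\sum_{h} \exp\big(-E(v, h; \theta)\big) \tag{4}$$
where $Z(\theta)$ is the partition function, i.e. the sum of $\exp(-E(v, h; \theta))$ over all possible configurations of visible and hidden layer nodes, $a_i$ is the bias of the visible layer nodes, and $b_j$ is the bias of the hidden layer nodes;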
From the joint probability distribution of formula (3), the marginal probability distributions of the visible layer v and the hidden layer h are obtained respectively by the principle of the Bayes formula:
$$P(v; \theta) = \frac{1}{Z(\theta)} \sum_{h} \exp\big(-E(v, h; \theta)\big) \tag{5}$$
$$P(h; \theta) = \frac{1}{Z(\theta)} \sum_{v} \exp\big(-E(v, h; \theta)\big) \tag{6}$$
The conditional probability distributions of the visible layer v and the hidden layer h are then derived from the Bayes formula and the definition of the sigmoid activation function:
$$P(v_i = 1 \mid h; \theta) = \sigma\Big(a_i + \sum_{j=1}^{m} W_{ij} h_j\Big) \tag{7}$$
$$P(h_j = 1 \mid v; \theta) = \sigma\Big(b_j + \sum_{i=1}^{n} W_{ij} v_i\Big) \tag{8}$$
where $\sigma(\cdot)$ is the sigmoid activation function, namely the nonlinear mapping function of the neuron;
Using formula (7) and formula (8), the contrastive divergence algorithm obtains an approximate reconstruction $P(v; \theta)$ of the training image through one step of Gibbs sampling;
(3) solving P (v; theta) by using a maximum likelihood method to obtain an optimal value of theta; the likelihood function for LS-RBM is:
Figure BDA0001649842910000041
the optimal values of the parameters are:
Figure BDA0001649842910000042
After adding the sparse penalty term, the objective function optimized in LS-RBM pre-training is:
$$F = F_{unsup} + \lambda F_{sparse} \tag{11}$$
where $\lambda$ is the sparsity parameter that adjusts the relative importance of $F_{sparse}$, and $F_{sparse}$ denotes the sparse regularization function, given by:
Figure BDA0001649842910000043
Figure BDA0001649842910000044
where $L(q_j, \mu, b)$ is the Laplace probability density function; $q_j$ is the average conditional expectation of the j-th hidden layer unit over the given data; $p$ is a constant that controls the sparsity of the hidden units $h_j$; and $u$ is the scale parameter. $q_j$ is expressed as:
$$q_j = \frac{1}{m}\sum_{l=1}^{m} E\big[h_j^{(l)} \mid v^{(l)}\big] = \frac{1}{m}\sum_{l=1}^{m} g\Big(\sum_{i=1}^{n} v_i^{(l)} W_{ij} + b_j\Big)$$
where $E(\cdot)$ is the conditional expectation of the j-th hidden layer unit given the data, $l$ indexes the training images, $m$ is the number of training images, $h_j^{(l)}$ is the j-th hidden layer unit corresponding to the l-th image, $v^{(l)}$ is the visible layer unit corresponding to the l-th image, $g\big(\sum_i v_i^{(l)} W_{ij} + b_j\big)$ is the activation probability of the hidden layer unit $h_j$ given the visible layer v, and $g$ is the sigmoid function;
after adding the sparse penalty term, training LS-RBM aims at solving the optimal value of the objective function of the formula (10):
Figure BDA0001649842910000051
where $P(v^{(l)})$ is the likelihood function to be optimized for the LS-RBM, i.e. the distribution $P(v; \theta)$;
(4) Differentiating the objective function of the LS-RBM and applying gradient descent to update the weight matrix and the hidden layer bias, the derivatives being:
Figure BDA0001649842910000052
Figure BDA0001649842910000053
and substituting the derived parameter value into an updating formula of the parameter theta to obtain a new parameter value:
Figure BDA0001649842910000054
$$a^{(1)} := m a + \alpha (v_1 - v_2) \tag{26}$$
Figure BDA0001649842910000055
(5) Continuing to train the network with the new parameter values; by continually optimizing the objective function, the activation probability of the hidden layer units gradually approaches the given fixed value p, and a set of weight parameters and corresponding biases is learned. With suitable parameters, sparse feature vectors are found and redundant features in the image are suppressed, and the learned weights combine the main features of the image to reconstruct the input data, which completes the training of the first LS-RBM and the update of the corresponding parameter value θ.
Step 2.3, using $W^{(1)}$ and $b^{(1)}$ trained by the first LS-RBM, computing $P(h^{(1)} \mid v^{(1)}, W^{(1)}, b^{(1)})$ to obtain the input features of the second LS-RBM, and continuing by training the second LS-RBM with the algorithm of step 2.2;
Step 2.4, training recursively according to the above steps up to layer L-1, and obtaining a deep sparse DBN model, namely the LSDBN network model, after multiple loop iterations;
Step 2.5, initializing $W^{(L)}$ and $b^{(L)}$ of layer L, and using $\{W^{(1)}, W^{(2)}, \ldots, W^{(L)}\}$ and $\{b^{(1)}, b^{(2)}, \ldots, b^{(L)}\}$ to form a deep neural network with L layers, in which the output layer corresponds to the label data of the training image set and a Softmax classifier serves as the output layer;
the step 3 specifically comprises the following steps:
step 3.1, inputting the parameter value theta trained in the step 2 as an initial value of the parameter of the fine tuning stage into an LSDBN network model of the fine tuning stage;
step 3.2, calculating the activation value of each hidden layer unit by using a forward propagation algorithm;
step 3.3, calculating the output result of the forward propagation of the training image set and the error of the label corresponding to the image set, reversely propagating the error, and calculating the residual error of each hidden layer unit to represent the influence of the unit on the residual error; calculating partial derivatives by using the residual error of each unit, and updating the weight matrix and the offset by adopting a gradient descent method according to formulas (28) and (29) in each iteration until the maximum iteration number is reached to obtain a fine-tuned LSDBN network model;
$$W := W - \alpha \frac{\partial J(W, b)}{\partial W} \tag{28}$$
$$b := b - \alpha \frac{\partial J(W, b)}{\partial b} \tag{29}$$
where $\alpha$ is the learning rate, and $J(W, b)$ is the cost function obtained from the error between the actual model output and the corresponding label value.
The step 4 specifically comprises the following steps:
step 4.1, inputting the preprocessed test image set into the LSDBN network model finely adjusted in the step 3, and extracting main characteristics of the test image;
Step 4.2, inputting the main features of the test image into the Softmax classifier and outputting the probability of belonging to each category; for a classifier with k categories, the Softmax output is:
$$h_\theta\big(x^{(i)}\big) = \begin{bmatrix} p\big(y^{(i)} = 1 \mid x^{(i)}; \theta\big) \\ p\big(y^{(i)} = 2 \mid x^{(i)}; \theta\big) \\ \vdots \\ p\big(y^{(i)} = k \mid x^{(i)}; \theta\big) \end{bmatrix} = \frac{1}{\sum_{j=1}^{k} e^{\theta_j^{T} x^{(i)}}} \begin{bmatrix} e^{\theta_1^{T} x^{(i)}} \\ e^{\theta_2^{T} x^{(i)}} \\ \vdots \\ e^{\theta_k^{T} x^{(i)}} \end{bmatrix} \tag{30}$$
where $x^{(i)}$ is the i-th test image, $y^{(i)}$ is the label of the i-th test image, $p(y^{(i)} = k \mid x^{(i)}; \theta)$ is the probability that the i-th test image belongs to the k-th class, and the factor $\frac{1}{\sum_{j=1}^{k} e^{\theta_j^{T} x^{(i)}}}$ normalizes the probability distribution so that all probabilities sum to 1;
Step 4.3, calculating the classification accuracy from the probability that each test image belongs to a given class and the class label of that image, and finally outputting the classification result of the image.
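Steps 4.2 and 4.3 can be sketched in a few lines of plain Python. This is only an illustrative sketch: the parameter matrix `theta`, the two-dimensional `features`, and the `labels` are hypothetical values, and subtracting the maximum score before exponentiating is a common numerical-stability choice that the patent does not mention.

```python
import math


def softmax_probs(theta, x):
    """Class probabilities exp(theta_k . x) / sum_c exp(theta_c . x)."""
    scores = [sum(t * xi for t, xi in zip(theta_k, x)) for theta_k in theta]
    shift = max(scores)                       # subtracting the max avoids overflow
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def accuracy(theta, features, labels):
    """Fraction of samples whose most probable class equals the label."""
    correct = 0
    for x, y in zip(features, labels):
        probs = softmax_probs(theta, x)
        if probs.index(max(probs)) == y:
            correct += 1
    return correct / len(labels)


# Hypothetical 3-class parameters and 2-dimensional test features.
theta = [[2.0, 0.0], [0.0, 2.0], [-2.0, -2.0]]
features = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0], [1.0, 0.2]]
labels = [0, 1, 2, 1]
probs = softmax_probs(theta, features[0])
acc = accuracy(theta, features, labels)       # three of the four samples are correct
```

In the method described here, the feature vectors would be the top-layer activations of the fine-tuned LSDBN rather than hand-written lists.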
The beneficial effects of the invention are as follows. To give the model interpretive and discriminative capability, a penalty regularization term is first introduced into the likelihood function in the unsupervised stage, and CD training maximizes the objective function while the sparsity constraint yields a sparse distribution over the training set; useful low-level feature representations can thus be learned from unlabeled data, in line with the minimum-energy economy strategy of biological evolution. Secondly, a sparse deep belief network based on the Laplace function is provided: the Laplace distribution induces the sparse state of the hidden layer, and the scale parameter of the distribution controls the sparsity strength, so that sparse representations are learned and the network has stronger feature extraction capability.
Drawings
Fig. 1 is a flowchart of an image classification method of an LSDBN model in the present invention.
FIG. 2 shows the result of the classification accuracy of LS-RBM on the pendings data set in the present invention.
FIG. 3 shows the result of the classification accuracy of LSDBN on the pendings data set in the present invention.
Detailed Description
The following further describes a specific embodiment of the present invention with reference to the drawings and technical solutions.
Example 1:
As shown in fig. 1, a sparse deep belief network image classification method based on a Laplace function constraint includes the following specific steps:
Step 1, selecting a suitable training image data set and preprocessing its images to obtain a training data set.
During feature extraction for image classification, the color image is converted into a gray-level image through binarization and the gray values are normalized to [0, 1], so that features need to be extracted from only one two-dimensional gray-level matrix. The normalization formula is:
$$x = \frac{\tilde{x} - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$
where $\tilde{x}$ is a feature value of the image data set, $x_{\max}$ and $x_{\min}$ are respectively the maximum and minimum over all features of the image data set, and $x$ is the normalized image data set.
Step 2, using the preprocessed training data set to pre-train the LSDBN network model. The network is pre-trained on the input training data set: the contrastive divergence (CD-k) algorithm trains each LS-RBM separately, unsupervised and bottom-up, with the output of each lower LS-RBM used as the input of the next higher LS-RBM; iterative training yields the corresponding weights and biases and finally the high-level features of the data. The specific steps are as follows:
step 2.1, constructing an LSDBN network model, and setting parameter values of the LSDBN network model structure: the visible layer nodes are the characteristic dimension of the input data set, the hidden layer nodes are set according to the characteristic dimension of different data sets, the hidden layer number is set to be 2, the iteration number is 100, and the fine adjustment number is 100, so that a better convergence result can be obtained.
And 2.2, taking the preprocessed training data set as the input of the first LS-RBM, and training the LS-RBM by adopting a CD algorithm.
(1) Since the RBM is an energy-based model, the relationship between the visible layer and the hidden layer can be expressed with an energy function as:
$$E(v, h; \theta) = -\sum_{i=1}^{n}\sum_{j=1}^{m} W_{ij} v_i h_j - \sum_{i=1}^{n} a_i v_i - \sum_{j=1}^{m} b_j h_j \tag{2}$$
where $\theta = \{W_{ij}, a_i, b_j\}$ denotes the model parameters; $W_{ij}$ is the weight matrix between the visible layer and the hidden layer, $a_i$ is the bias of the visible layer nodes, and $b_j$ is the bias of the hidden layer nodes; $i$ indexes the features of the input image, i.e. the visible layer nodes, $n$ in total; $j$ indexes the hidden layer nodes, $m$ in total; $v_i$ denotes the i-th visible layer node and $h_j$ the j-th hidden layer node.
(2) According to the energy function of formula (2), every configuration of visible layer nodes and hidden layer nodes has a corresponding energy value. Once all parameters are fixed, the joint probability distribution of v and h in the RBM is defined, following the definition of the energy function and the principles of statistical thermodynamics, as:
$$P(v, h; \theta) = \frac{1}{Z(\theta)} \exp\big(-E(v, h; \theta)\big) \tag{3}$$
$$Z(\theta) = \sum_{v}\sum_{h} \exp\big(-E(v, h; \theta)\big) \tag{4}$$
In the above equations, $Z(\theta)$ is the partition function, i.e. the sum of $\exp(-E(v, h; \theta))$ over all possible configurations of visible and hidden layer nodes. What the RBM model learns is this joint probability distribution, i.e. how the data are generated.
Further, in practice the quantity of greatest interest is the probability distribution that the RBM defines over the observed data v, i.e. the marginal distribution of $P(v, h; \theta)$. The probability that the network assigns to a visible vector (a training sample) is obtained by summing over all possible hidden vectors:
$$P(v; \theta) = \frac{1}{Z(\theta)} \sum_{h} \exp\big(-E(v, h; \theta)\big) \tag{5}$$
Accordingly, the marginal probability distribution of the hidden layer h is:
$$P(h; \theta) = \frac{1}{Z(\theta)} \sum_{v} \exp\big(-E(v, h; \theta)\big) \tag{6}$$
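For a very small RBM, the energy, partition function, joint distribution and marginal distribution described above can be evaluated by brute-force enumeration. The sketch below assumes the standard binary RBM energy with weights `W`, visible biases `a` and hidden biases `b`; the 2-by-2 network and its numeric values are hypothetical and chosen only for illustration.

```python
import itertools
import math

# Hypothetical toy RBM: 2 visible units, 2 hidden units.
W = [[0.5, -0.3],
     [0.2, 0.8]]          # W[i][j]: weight between visible i and hidden j
a = [0.1, -0.2]           # visible biases a_i
b = [0.0, 0.3]            # hidden biases b_j


def energy(v, h):
    """E(v, h) = -sum_ij W_ij v_i h_j - sum_i a_i v_i - sum_j b_j h_j."""
    interaction = sum(W[i][j] * v[i] * h[j] for i in range(2) for j in range(2))
    visible_term = sum(a[i] * v[i] for i in range(2))
    hidden_term = sum(b[j] * h[j] for j in range(2))
    return -interaction - visible_term - hidden_term


states = list(itertools.product([0, 1], repeat=2))   # all binary configurations

# Partition function Z: sum of exp(-E) over every (v, h) pair.
Z = sum(math.exp(-energy(v, h)) for v in states for h in states)


def joint(v, h):
    """Joint probability P(v, h) = exp(-E(v, h)) / Z."""
    return math.exp(-energy(v, h)) / Z


def marginal_v(v):
    """P(v), obtained by summing the joint over all hidden configurations."""
    return sum(joint(v, h) for h in states)


total = sum(marginal_v(v) for v in states)           # sums to 1 up to rounding
```

Enumeration costs 2^(n+m) terms, so it is only feasible for toy sizes; this is why training relies on the sampling-based contrastive divergence approximation instead of computing Z exactly.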
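The conditional probability distributions of the visible layer and the hidden layer can be derived from the Bayes formula and the definition of the sigmoid function as follows:
$$P(v_i = 1 \mid h; \theta) = \sigma\Big(a_i + \sum_{j=1}^{m} W_{ij} h_j\Big) \tag{7}$$
$$P(h_j = 1 \mid v; \theta) = \sigma\Big(b_j + \sum_{i=1}^{n} W_{ij} v_i\Big) \tag{8}$$
where $\sigma(\cdot)$ denotes the sigmoid function.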
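The two conditional distributions are what make one-step Gibbs sampling cheap. The following sketch shows a plain CD-1 update on a single binary sample, without the Laplace sparsity term that the LS-RBM adds. The function names, the learning rate `lr`, the 4-by-3 network size and the random initialization are assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, b):
    """P(h_j = 1 | v) = sigmoid(sum_i W_ij v_i + b_j)."""
    return [sigmoid(sum(W[i][j] * v[i] for i in range(len(v))) + b[j])
            for j in range(len(b))]


def visible_probs(h, W, a):
    """P(v_i = 1 | h) = sigmoid(sum_j W_ij h_j + a_i)."""
    return [sigmoid(sum(W[i][j] * h[j] for j in range(len(h))) + a[i])
            for i in range(len(a))]


def sample(probs, rng):
    """Draw one binary state per unit from its activation probability."""
    return [1 if rng.random() < p else 0 for p in probs]


def cd1_update(v0, W, a, b, lr, rng):
    """One CD-1 step on a single binary sample; updates W, a, b in place."""
    ph0 = hidden_probs(v0, W, b)
    h0 = sample(ph0, rng)
    pv1 = visible_probs(h0, W, a)      # reconstruction probabilities
    v1 = sample(pv1, rng)
    ph1 = hidden_probs(v1, W, b)
    for i in range(len(a)):
        for j in range(len(b)):
            # Positive phase minus negative phase.
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        a[i] += lr * (v0[i] - v1[i])
    for j in range(len(b)):
        b[j] += lr * (ph0[j] - ph1[j])
    return pv1


rng = random.Random(0)
W = [[rng.gauss(0.0, 0.01) for _ in range(3)] for _ in range(4)]  # 4 visible, 3 hidden
a = [0.0] * 4
b = [0.0] * 3
reconstruction = cd1_update([1, 0, 1, 1], W, a, b, lr=0.1, rng=rng)
```

Using the hidden probabilities rather than sampled states in the update is a common variance-reduction choice, not something the patent prescribes.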
(3) For a given training sample, training the RBM model means finding the value of the parameter θ at which the RBM fits the training samples well. Therefore, the log-likelihood of $P(v; \theta)$ is maximized by the maximum likelihood method to obtain the optimal value of θ, i.e.
Figure BDA0001649842910000093
Suppose a training set $\{v^{(1)}, \ldots, v^{(m)}\}$ is given. The unsupervised pre-training optimization model with a sparse penalty term is defined as follows:
$$F = F_{unsup} + \lambda F_{sparse} \tag{10}$$
where $F_{unsup}$ denotes the likelihood function of the RBM, namely
Figure BDA0001649842910000094
$\lambda$ is the sparse penalty parameter, and $F_{sparse}$ denotes an arbitrary sparse regularization function.
Based on statistical theory, the main objective of a sparse RBM is to make most hidden layer nodes zero; in other words, the activation probability of the hidden layer units is close to zero. If a hidden layer unit is sparse (inactive in most cases), the feature of that unit represents only a small portion of the training data.
By defining a sparse regularization term, the average activation probability over the training data can be reduced, which keeps the "activation rate" of the model neurons (corresponding to the random variables $h_j$) at a rather low level so that neuron activation is sparse. This requires the distribution of hidden layer unit activation probabilities to be sharply peaked with heavy tails.
Based on the above analysis and inspired by compressive sensing theory, the invention induces the sparse state of the hidden layer units by penalizing with the Laplace function, which has a heavy tail and behaves differently according to the deviation of the hidden layer unit activation probability from the fixed parameter p. The function also has a scale parameter that controls the degree of sparsity.
In probability theory and mathematical statistics, the Laplace distribution is a continuous distribution with a sharper peak and heavier tails than the normal distribution. In the process of solving for the sparse solution, most features gradually gather on the two sides of the function, that is, closer to zero, while a few hidden units have a higher activation probability and are activated. After training, the activated hidden units represent the main features of the data, so a sparse feature representation of the model is easier to achieve.
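The peak and tail behavior of the Laplace distribution can be checked numerically. The sketch below compares a Laplace density with a normal density of the same variance; the scale value 1.0 and the evaluation points 0 and 6 are arbitrary choices for illustration.

```python
import math


def laplace_pdf(x, mu, scale):
    """Laplace density: exp(-|x - mu| / scale) / (2 * scale)."""
    return math.exp(-abs(x - mu) / scale) / (2.0 * scale)


def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))


scale = 1.0
sigma = math.sqrt(2.0) * scale      # matches the Laplace variance 2 * scale**2

# Ratios above 1 mean the Laplace density is larger at that point.
peak_ratio = laplace_pdf(0.0, 0.0, scale) / normal_pdf(0.0, 0.0, sigma)
tail_ratio = laplace_pdf(6.0, 0.0, scale) / normal_pdf(6.0, 0.0, sigma)
```

Both ratios exceed 1: the Laplace density is higher at its center and far out in the tail, and lower in between. A smaller scale parameter sharpens the peak further, which is the lever the text describes for controlling sparsity strength.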
The objective function optimization formula after adding the sparse regularization term is as follows:
Figure BDA0001649842910000101
the laplacian sparse penalty term is defined as follows:
Figure BDA0001649842910000102
Figure BDA0001649842910000103
where $L(q_j, \mu, b)$ is the Laplace probability density function; $q_j$ is the average conditional expectation of the j-th hidden layer unit over the given data; $p$ is a constant that controls the sparsity of the hidden units $h_j$: with this parameter fixed, optimizing the objective function drives the average activation probability of the hidden layer nodes as close to $p$ as possible; $u$ is the scale parameter, whose value can be varied to control the degree of sparsity. $q_j$ is expressed as follows:
$$q_j = \frac{1}{m}\sum_{l=1}^{m} E\big[h_j^{(l)} \mid v^{(l)}\big] = \frac{1}{m}\sum_{l=1}^{m} g\Big(\sum_{i=1}^{n} v_i^{(l)} W_{ij} + b_j\Big)$$
where $E(\cdot)$ is the conditional expectation of the j-th hidden layer node given the data, $m$ is the number of training samples, $g\big(\sum_i v_i^{(l)} W_{ij} + b_j\big)$ is the activation probability of the hidden layer unit $h_j$ given the visible layer v, and $g$ is the sigmoid function.
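The average activation $q_j$ and its deviation from the target p can be computed directly. In the sketch below, `mean_activation` follows the description of $q_j$ as an average of sigmoid activations, while `laplace_log_penalty` is only an assumed form of the penalty (the sum over hidden units of the log of a Laplace density with location p and scale u); the patent's exact penalty formula is not recoverable from this text. The data, weights and parameter values are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mean_activation(data, W, b):
    """q_j: average over the training samples of P(h_j = 1 | v^(l))."""
    n_hidden = len(b)
    q = [0.0] * n_hidden
    for v in data:
        for j in range(n_hidden):
            q[j] += sigmoid(sum(W[i][j] * v[i] for i in range(len(v))) + b[j])
    return [total / len(data) for total in q]


def laplace_log_penalty(q, p, u):
    """Assumed penalty: sum_j log Laplace(q_j; location p, scale u)."""
    return sum(-abs(qj - p) / u - math.log(2.0 * u) for qj in q)


data = [[1, 0, 1], [0, 1, 1]]                 # hypothetical binary samples
W = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]      # zero weights: every q_j is 0.5
b = [0.0, 0.0]
q = mean_activation(data, W, b)

# Under the assumed form, activations at the target p score higher than q = 0.5.
at_target = laplace_log_penalty([0.1, 0.1], p=0.1, u=0.5)
off_target = laplace_log_penalty(q, p=0.1, u=0.5)
```

Under this assumed form, maximizing the penalty term pulls each $q_j$ toward p, and a smaller u makes the pull stronger for a given deviation.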
Further, the objective function for training the LS-RBM is as follows:
Figure BDA0001649842910000106
In the above equation, the first term is the log-likelihood term and the second term is the sparse penalty term, where λ is the parameter of the penalty term and expresses its relative importance against the data-distribution term in the objective function. The sparse regularization term is therefore maximized at the same time as the log-likelihood function is maximized.
(4) The bias of the hidden layer directly influences the sparsity of the hidden units, so that only the weight matrix and the bias of the hidden layer are updated. Wherein the gradient of the sparse regularization term is calculated as follows:
Figure BDA0001649842910000111
Figure BDA0001649842910000112
wherein the derivation result of the first term in the above formula is shown as follows:
Figure BDA0001649842910000113
Figure BDA0001649842910000114
the second term expands as follows:
Figure BDA0001649842910000115
Figure BDA0001649842910000116
where $\sigma_j = \sum_i W_{ij} v_i + b_j$ denotes the input of hidden unit j. Each term of the expanded formula is differentiated as follows:
Figure BDA0001649842910000117
Figure BDA0001649842910000118
derivation of the hidden layer bias can result:
Figure BDA0001649842910000121
and bringing the derived parameter value into a parameter updating model to obtain a new parameter value:
Figure BDA0001649842910000122
$$a := m a + \alpha (v_1 - v_2) \tag{26}$$
Figure BDA0001649842910000123
(5) After the parameter values are updated, the network continues training with the new values. By continually optimizing the objective function, the activation probability of the hidden layer nodes gradually approaches the given fixed value p while a set of weight parameters and corresponding biases is learned. With suitable parameters, sparse feature vectors can be found and redundant features suppressed, and the learned weights combine the main features to reconstruct the input data, which improves the robustness of the algorithm to noise. This completes the training of the first LS-RBM and its parameter values.
Step 2.3: using the $W^{(1)}$ and $b^{(1)}$ trained by the first LS-RBM, compute $P(h^{(1)} \mid v^{(1)}, W^{(1)}, b^{(1)})$ to obtain the input features of the second LS-RBM, and train the second LS-RBM with the algorithm of step 2.2;
Step 2.4: training proceeds recursively in this manner up to layer L-1, where the parameters and output of each trained lower LS-RBM serve as the input data of the next higher layer in the model, i.e. the next LS-RBM. After multiple loop iterations, a deep sparse DBN model, i.e. the LSDBN model, is learned;
Step 2.5: initialize $W^{(L)}$ and $b^{(L)}$ of layer L, and use $\{W^{(1)}, W^{(2)}, \ldots, W^{(L)}\}$ and $\{b^{(1)}, b^{(2)}, \ldots, b^{(L)}\}$ to form a deep neural network with L layers, in which the output layer corresponds to the label data and a Softmax classifier is used as the output layer;
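The greedy layer-wise procedure of steps 2.3 to 2.5 can be sketched as follows. This is an illustration only: `train_rbm` is a hypothetical plain CD-1 trainer standing in for the LS-RBM training of step 2.2 (it has no sparse term), the layer sizes and epoch counts are arbitrary, and the Softmax output layer of step 2.5 is left out.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, rng, epochs=5, lr=0.1):
    """Plain CD-1 stand-in for the LS-RBM training of step 2.2 (assumption)."""
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    a = np.zeros(n_visible)
    b = np.zeros(n_hidden)
    for _ in range(epochs):
        h0 = sigmoid(data @ W + b)
        h_sample = (rng.random(h0.shape) < h0).astype(float)
        v1 = sigmoid(h_sample @ W.T + a)
        h1 = sigmoid(v1 @ W + b)
        W += lr * (data.T @ h0 - v1.T @ h1) / len(data)
        a += lr * (data - v1).mean(axis=0)
        b += lr * (h0 - h1).mean(axis=0)
    return W, a, b

def pretrain_stack(x, hidden_sizes, rng):
    """Train one RBM per layer; each layer's P(h=1 | v) feeds the next layer."""
    params = []
    layer_input = x
    for n_hidden in hidden_sizes:
        W, a, b = train_rbm(layer_input, n_hidden, rng)
        params.append((W, b))                        # kept for fine-tuning
        layer_input = sigmoid(layer_input @ W + b)   # features for the next RBM
    return params, layer_input
```

A call such as `pretrain_stack(x, [500, 200], np.random.default_rng(0))` would return the per-layer `(W, b)` pairs that serve as initial values for fine-tuning, together with the top-level features.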
Step 3: further optimize the LSDBN. The parameter values obtained in the pre-training stage are taken as the initial values of the fine-tuning stage, and the whole LSDBN network is fine-tuned. The invention uses a top-down supervised learning algorithm, the back-propagation algorithm, to fine-tune the whole network: the training samples and test data are input for training, and the error is propagated backward from top to bottom to optimize the network. The main steps are as follows:
Step 3.1: with the network parameters trained in the pre-training stage as the initial values of the fine-tuning stage, input the training samples into the network for optimization.
Step 3.2: calculate the activation value of each layer of neurons using the forward propagation algorithm.
Step 3.3: calculate the error between the forward-propagation output for the training image set and the labels corresponding to the image set, propagate the error backward, and calculate the residual of each hidden layer unit, which represents that unit's contribution to the error; use the residual of each unit to calculate the partial derivatives, and in each iteration update the weight matrix and the bias by gradient descent according to formulas (28) and (29), until the maximum number of iterations is reached, giving the fine-tuned LSDBN network model;
Figure BDA0001649842910000131
Figure BDA0001649842910000132
where α is the learning rate and $J(W, b)$ is the cost function of the network, computed from the actual output values and the target values. The gradient descent iteration is repeated to reduce the value of $J(W, b)$.
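Steps 3.2 and 3.3 can be sketched in NumPy as one back-propagation pass followed by a gradient-descent update of the generic form $W := W - \alpha\,\partial J / \partial W$, $b := b - \alpha\,\partial J / \partial b$. The sketch assumes sigmoid hidden layers, a softmax output layer and a cross-entropy cost; formulas (28) and (29) themselves are in figures not reproduced here, so this may differ in detail from the patent. All function names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(x, params):
    """Step 3.2: activations of every layer; the last layer is softmax."""
    acts = [x]
    for W, b in params[:-1]:
        acts.append(sigmoid(acts[-1] @ W + b))
    W, b = params[-1]
    acts.append(softmax(acts[-1] @ W + b))
    return acts

def cost(x, y_onehot, params):
    """Cross-entropy cost J(W, b), averaged over the batch (assumed form)."""
    probs = forward(x, params)[-1]
    return -np.mean(np.sum(y_onehot * np.log(probs + 1e-12), axis=1))

def finetune_step(x, y_onehot, params, alpha=0.5):
    """Step 3.3: back-propagate residuals, then apply one descent update."""
    acts = forward(x, params)
    delta = (acts[-1] - y_onehot) / x.shape[0]    # residual at the output layer
    new_params = []
    for k in range(len(params) - 1, -1, -1):
        W, b = params[k]
        grad_W = acts[k].T @ delta
        grad_b = delta.sum(axis=0)
        if k > 0:                                 # residual of the layer below
            delta = (delta @ W.T) * acts[k] * (1.0 - acts[k])
        new_params.append((W - alpha * grad_W, b - alpha * grad_b))
    return new_params[::-1]
```

In the method described, `params` would start from the pre-trained weights and biases rather than from random values, and `finetune_step` would be repeated until the maximum number of iterations is reached.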
Step 4: input the test image data set into the LSDBN network model trained in step 3 and perform the recognition test with a Softmax classifier to output the image classification result. For image classification, a classifier is placed at the top layer of the network and the test data are input for classification. To make the network widely applicable, the invention uses a Softmax classifier; for a k-class classifier, the output is:
Figure BDA0001649842910000133
where θ is a parameter matrix containing the weights and biases; it has k rows, and each row holds the parameters of the classifier for one class.
Figure BDA0001649842910000134
normalizes the probability distribution so that all probabilities sum to 1. Thus, the cost function $J(\theta)$ of the fine-tuning stage is:
Figure BDA0001649842910000135
where 1{·} is the indicator function: 1{a true statement} = 1 and 1{a false statement} = 0.
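The Softmax output and the indicator-based cost can be sketched as below. The sketch assumes the standard softmax-regression cost $J(\theta) = -\frac{1}{m} \sum_i \sum_j 1\{y^{(i)} = j\} \log p(y^{(i)} = j \mid x^{(i)}; \theta)$ without a weight-decay term, and it stores the bias as the last column of θ; both choices are assumptions, since the patent's own formula is in a figure not reproduced here. The function names are hypothetical.

```python
import numpy as np

def softmax_probs(theta, x):
    """theta: (k, d + 1), one row per class, bias in the last column; x: (m, d)."""
    x_aug = np.hstack([x, np.ones((x.shape[0], 1))])   # constant input for the bias
    scores = x_aug @ theta.T
    scores = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)            # each row sums to 1

def softmax_cost(theta, x, y):
    """Mean negative log-probability of the true class; y holds labels 0..k-1."""
    probs = softmax_probs(theta, x)
    m, k = probs.shape
    indicator = (y[:, None] == np.arange(k)).astype(float)   # 1{y_i = j}
    return -np.sum(indicator * np.log(probs)) / m
```

With θ set to zeros every class receives probability 1/k, so the cost equals log k, which gives a quick sanity check before training.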
The method provided by the invention is evaluated on the MNIST handwriting database and the PenDigits handwriting recognition data set.
Example 2: experiments on MNIST handwriting database
The MNIST handwriting data set includes 60000 training samples and 10000 test samples, each a picture of 28 x 28 pixels. To facilitate the extraction of image features, different numbers of images per class are drawn from the 60000 training samples for experimental analysis. The model has 784 visible layer nodes and 500 hidden layer nodes; the learning rate is set to 1, the batch block number is 100, the maximum number of iterations is 100, and the model is trained with the CD algorithm using one Gibbs sampling step (CD-1).
Table 1 shows the sparsity measurement results of the invention on the MNIST data set, compared with two other sparse models. Sparsity is measured by the following formula:
Figure BDA0001649842910000141
For a sparse model, higher sparsity means higher algorithm stability and stronger robustness. As can be seen from Table 1, the LS-RBM has a higher sparsity value than the SP-RBM and SR-RBM and can learn a sparser representation.
TABLE 1 sparsity measurement results on MNIST dataset
Figure BDA0001649842910000142
Table 2 shows the classification accuracy of the LS-RBM on the MNIST data set for different numbers of samples per class, compared with the artificial neural network (ANN), autoencoder (AE), restricted Boltzmann machine (RBM), SR-RBM and SP-RBM methods. Table 2 shows that the LS-RBM method of the invention achieves the best recognition accuracy for every number of samples per class on the MNIST data set. In particular, with 1000 samples per class it reaches a recognition rate of 96.8%, which is 3% higher than that of the second-best SR-RBM algorithm, indicating that the proposed method has better feature extraction capability.
TABLE 2 LS-RBM classification accuracy results on MNIST dataset
Figure BDA0001649842910000143
In order to learn deep features, Table 3 shows the classification accuracy of the LSDBN of the invention on the MNIST data set for different numbers of samples per class, compared with the DBN, SP-DBN and SR-DBN built from the RBM, SP-RBM and SR-RBM respectively. In a deeper sparse model, each added layer extracts more abstract features, and the presence of redundant features affects the final classification. Table 3 shows that the LSDBN learns more discriminative features, improving on the second-best SP-DBN by up to 3 percentage points, which indicates that the LSDBN is more robust to interference from redundant features.
TABLE 3 LSDBN Classification accuracy results on MNIST dataset
Figure BDA0001649842910000151
Example 3: experiments on the Pendigits handwriting recognition dataset
The Pen-Based Recognition of Handwritten Digits (PenDigits) data set comprises 10992 data samples divided into 10 classes, of which 7494 are training data and 3498 are test data; each sample has 16 features. Different numbers of samples per class are used for the analysis. The model has 16 visible layer nodes and 10 hidden layer nodes; the learning rate is set to 1, the batch block number is 100, and the maximum number of iterations is 1000.
Fig. 2 shows the classification accuracy of the LS-RBM on the PenDigits handwriting recognition data set for different numbers of samples per class. For most algorithms, classification accuracy increases with the number of samples per class. Even when each class has only a few dozen samples, the LS-RBM algorithm still achieves the best classification accuracy on the PenDigits data set, and the feature representation learned by the invention is more discriminative than those of the SP-RBM and SR-RBM.
In order to learn deep features, building on the RBM, SP-RBM, SR-RBM and LS-RBM experiments, a second RBM is trained on the activation probabilities of the first RBM's hidden layer units (giving the DBN, SP-DBN, SR-DBN and LS-DBN respectively); the number of hidden units is again set to 10 and the number of iterations to 1000. The classification accuracy of each model is tested on the PenDigits test set, with the results shown in Fig. 3. The results show that, even with few samples, the method of the invention still captures the main features through the sparsity constraint, improving classification accuracy by 2.7% to 6% over the DBN model, which demonstrates the applicability of the algorithm to low-dimensional data sets.

Claims (5)

1. A sparse depth confidence network image classification method based on Laplace function constraint is characterized by comprising the following steps:
step 1, selecting a training image data set, and carrying out image preprocessing to obtain a training data set;
step 2, inputting the training data set preprocessed in step 1 into an LSDBN network model, training the LS-RBM network of each layer separately and without supervision from bottom to top using a contrastive divergence (CD) algorithm, using the output of a lower-layer LS-RBM network as the input of the adjacent upper-layer LS-RBM network, obtaining the parameter values of each LS-RBM network through iterative training, and finally obtaining the high-level features of the input image data; the parameter values are the weights and biases;
step 3, taking the parameter values obtained in the step 2 as initial values of a fine tuning stage, and fine tuning the whole LSDBN network by adopting a top-down back propagation algorithm to obtain an LSDBN network model;
step 4, inputting the test image data set into the LSDBN network model obtained in the step 3, performing identification test by adopting a Softmax classifier, and finally outputting an image classification result;
the step 2 specifically comprises the following steps:
step 2.1, constructing an LSDBN network model, and setting the structural parameter values of the LSDBN network model: the number of visible layer nodes, the number of hidden layers, the number of iterations and the number of fine-tuning iterations; the hidden layer nodes are determined according to the feature dimension of the input image set;
step 2.2, taking the preprocessed training data set x as the input of a first LS-RBM, and training the LS-RBM by adopting a CD algorithm;
(1) the relationship between the visible layer and the hidden layer is expressed as an energy function:
Figure FDA0003558330060000011
where θ represents the parameters of the model, i.e. $\theta = \{W_{ij}, a_i, b_j\}$; $W_{ij}$ is the weight matrix between the visible layer and the hidden layer, $a_i$ is the bias of the visible layer nodes, and $b_j$ is the bias of the hidden layer nodes; i indexes the features of the input image, i.e. the visible layer nodes, of which there are n in total; j indexes the hidden layer nodes, of which there are m in total; $v_i$ represents the i-th visible layer node and $h_j$ represents the j-th hidden layer node;
(2) based on the energy function formula (2), the joint probability distribution of v and h in the RBM is obtained as follows:
Figure FDA0003558330060000012
Figure FDA0003558330060000013
where $Z(\theta)$ is the sum over all possible pairs of visible and hidden layer node states, $a_i$ is the bias of the visible layer nodes, and $b_j$ is the bias of the hidden layer nodes;
the marginal probability distributions of the visible layer v and the hidden layer h are obtained from the joint probability distribution of formula (3) using Bayes' formula:
Figure FDA0003558330060000021
Figure FDA0003558330060000022
and the conditional probability distributions of the visible layer v and the hidden layer h are derived using Bayes' formula and the definition of the sigmoid activation function:
Figure FDA0003558330060000023
Figure FDA0003558330060000024
wherein σ (·) is a sigmoid activation function, namely a nonlinear mapping function of the neuron;
an approximate reconstruction $P(v; \theta)$ of the training image is obtained with the contrastive divergence algorithm through one step of Gibbs sampling using formula (7) and formula (8);
(3) solving $P(v; \theta)$ by the maximum likelihood method to obtain the optimal value of θ; the likelihood function of the LS-RBM is:
Figure FDA0003558330060000025
the optimal values of the parameters are:
Figure FDA0003558330060000026
after adding the sparse penalty term, the objective function of LS-RBM pre-training optimization is as follows:
$F = F_{unsup} + \lambda F_{sparse}$ (11)
wherein λ is the sparsity parameter, which adjusts the relative importance of $F_{sparse}$, and $F_{sparse}$ denotes the sparse regularization function, given by:
Figure FDA0003558330060000027
Figure FDA0003558330060000028
wherein $L(q_j, p, u)$ is the Laplace probability density function, $q_j$ represents the average of the conditional expectation of the j-th hidden layer unit given the data, p is a constant that controls the sparsity of the n hidden units $h_j$, and u represents the scale parameter; $q_j$ is expressed as:
Figure FDA0003558330060000031
where E(·) is the conditional expectation of the j-th hidden layer unit given the data, l indexes the training images, m is the number of training images,
Figure FDA0003558330060000032
is the j-th hidden layer unit corresponding to the l-th picture, $v^{(l)}$ is the visible layer unit corresponding to the l-th picture,
Figure FDA0003558330060000033
is the activation probability of the hidden layer unit $h_j$ given the visible layer v, g is the sigmoid function, and $\sigma_j$ represents the input of hidden layer unit j;
after the sparse regularization term is added, the objective of training the LS-RBM is to solve the optimal value of the objective function of formula (10):
Figure FDA0003558330060000034
wherein $P(v^{(l)})$ is the likelihood function to be optimized for the LS-RBM, i.e. the distribution $P(v; \theta)$;
(4) differentiating the objective function of the LS-RBM and updating the weight matrix and the hidden layer bias by gradient descent, the derivative formulas being:
Figure FDA0003558330060000035
Figure FDA0003558330060000036
and substituting the derived parameter value into an updating formula of the parameter theta to obtain a new parameter value:
Figure FDA0003558330060000037
$a^{(1)} := m\,a + \alpha\,(v_1 - v_2)$ (26)
Figure FDA0003558330060000038
wherein α is a learning rate;
(5) continuing to train the network with the new parameter values, driving the activation probability of the hidden layer units gradually toward the given fixed value p by continuously optimizing the objective function, learning a set of weight parameters and corresponding biases, finding sparse feature vectors with these parameters, suppressing redundant features present in the image, and reconstructing the input data by combining the learned weights with the main features of the image, thereby completing the training of the first LS-RBM and the update of the corresponding parameter value θ;
step 2.3, using the $W^{(1)}$ and $b^{(1)}$ trained by the first LS-RBM, computing $P(h^{(1)} \mid v^{(1)}, W^{(1)}, b^{(1)})$ to obtain the input features of the second LS-RBM, and continuing to train the second LS-RBM with the algorithm of step 2.2;
step 2.4, training recursively according to the above steps up to layer L-1, and obtaining a deep sparse DBN model, i.e. the LSDBN network model, after multiple loop iterations;
step 2.5, initializing $W^{(L)}$ and $b^{(L)}$ of layer L, and using $\{W^{(1)}, W^{(2)}, \ldots, W^{(L)}\}$ and $\{b^{(1)}, b^{(2)}, \ldots, b^{(L)}\}$ to form a deep neural network with L layers, wherein the output layer corresponds to the label data of the training image set and a Softmax classifier is used as the output layer.
2. The method according to claim 1, wherein step 1 is specifically: converting the color image into a gray level image by a binarization method, and normalizing the gray values of the gray level image to the interval [0,1] to obtain a training data set; wherein the normalization formula is:
Figure FDA0003558330060000041
wherein
Figure FDA0003558330060000042
is a feature value of the image data set, $x_{max}$ and $x_{min}$ are respectively the maximum and minimum of all features of the image data set, and x is the normalized image data set.
3. The method according to claim 1 or 2, characterized in that said step 3 is in particular:
step 3.1, inputting the parameter value theta trained in the step 2 as an initial value of the parameter of the fine tuning stage into an LSDBN network model of the fine tuning stage;
step 3.2, calculating the activation value of each hidden layer unit by using a forward propagation algorithm;
step 3.3, calculating the error between the forward-propagation output for the training image set and the labels corresponding to the image set, propagating the error backward, and calculating the residual of each hidden layer unit, which represents that unit's contribution to the error; calculating the partial derivatives from the residual of each unit, and in each iteration updating the weight matrix and the bias by gradient descent according to formulas (28) and (29), until the maximum number of iterations is reached, to obtain a fine-tuned LSDBN network model;
Figure FDA0003558330060000043
Figure FDA0003558330060000044
where J (W, b) is a cost function obtained by calculating the error of the actual model output value and the corresponding label value.
4. The method according to claim 1 or 2, characterized in that said step 4 is in particular:
step 4.1, inputting the preprocessed test image set into the LSDBN network model finely adjusted in the step 3, and extracting main characteristics of the test image;
step 4.2, inputting the main features of the test image into a Softmax classifier, which outputs the probability of belonging to each class; for k classes, the probability output by the Softmax classifier is:
Figure FDA0003558330060000051
wherein $x^{(i)}$ is the i-th test image, $y^{(i)}$ represents the label of the i-th test image, and $p(y^{(i)} = k \mid x^{(i)}; \theta)$ represents the probability that the i-th test image belongs to the k-th class,
Figure FDA0003558330060000052
normalizing the probability distribution to make the sum of all probabilities be 1;
step 4.3, calculating the classification accuracy from the probability that each test image belongs to each class and the class label corresponding to the image, and finally outputting the classification result of the image.
5. The method according to claim 3, wherein step 4 is specifically:
step 4.1, inputting the preprocessed test image set into the LSDBN network model finely adjusted in the step 3, and extracting main characteristics of the test image;
step 4.2, inputting the main features of the test image into a Softmax classifier, which outputs the probability of belonging to each class; for k classes, the probability output by the Softmax classifier is:
Figure FDA0003558330060000053
wherein $x^{(i)}$ is the i-th test image, $y^{(i)}$ represents the label of the i-th test image, and $p(y^{(i)} = k \mid x^{(i)}; \theta)$ represents the probability that the i-th test image belongs to the k-th class,
Figure FDA0003558330060000054
normalizing the probability distribution to make the sum of all probabilities be 1;
step 4.3, calculating the classification accuracy from the probability that each test image belongs to each class and the class label corresponding to the image, and finally outputting the classification result of the image.
CN201810417793.5A 2018-05-04 2018-05-04 Sparse depth confidence network image classification method based on Laplace function constraint Active CN108805167B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810417793.5A CN108805167B (en) 2018-05-04 2018-05-04 Sparse depth confidence network image classification method based on Laplace function constraint

Publications (2)

Publication Number Publication Date
CN108805167A CN108805167A (en) 2018-11-13
CN108805167B true CN108805167B (en) 2022-05-13

Family

ID=64093241

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810417793.5A Active CN108805167B (en) 2018-05-04 2018-05-04 Sparse depth confidence network image classification method based on Laplace function constraint

Country Status (1)

Country Link
CN (1) CN108805167B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109246608A (en) * 2018-11-16 2019-01-18 重庆小富农康农业科技服务有限公司 A kind of point-to-point localization method in interior based on WIFI location fingerprint big data analysis
CN109635931A (en) * 2018-12-14 2019-04-16 吉林大学 A kind of equipment running status evaluation method based on depth conviction net
CN110147834A (en) * 2019-05-10 2019-08-20 上海理工大学 Fine granularity image classification method based on rarefaction bilinearity convolutional neural networks
CN110209813A (en) * 2019-05-14 2019-09-06 天津大学 A kind of incident detection and prediction technique based on autocoder
CN110188692B (en) * 2019-05-30 2023-06-06 南通大学 Enhanced cyclic cascading method for effective target rapid identification
CN110543918B (en) * 2019-09-09 2023-03-24 西北大学 Sparse data processing method based on regularization and data augmentation
CN111368686B (en) * 2020-02-27 2022-10-25 西安交通大学 Electroencephalogram emotion classification method based on deep learning
CN112188210A (en) * 2020-09-27 2021-01-05 铜仁学院 DVC side information solving method adopting deep belief network
CN112286996A (en) * 2020-11-23 2021-01-29 天津大学 Node embedding method based on network link and node attribute information
CN113095381B (en) * 2021-03-29 2024-04-05 西安交通大学 Underwater sound target identification method and system based on improved DBN
CN113313175B (en) * 2021-05-28 2024-02-27 北京大学 Image classification method of sparse regularized neural network based on multi-element activation function
CN115049814B (en) * 2022-08-15 2022-11-08 聊城市飓风工业设计有限公司 Intelligent eye protection lamp adjusting method adopting neural network model

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104077595A (en) * 2014-06-15 2014-10-01 北京工业大学 Deep belief network image recognition method based on Bayesian regularization
CN104732249A (en) * 2015-03-25 2015-06-24 武汉大学 Deep learning image classification method based on popular learning and chaotic particle swarms
CN106067042A (en) * 2016-06-13 2016-11-02 西安电子科技大学 Polarization SAR sorting technique based on semi-supervised degree of depth sparseness filtering network
CN107528824A (en) * 2017-07-03 2017-12-29 中山大学 A kind of depth belief network intrusion detection method based on two-dimensionses rarefaction

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
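Deep learning model based on restricted Boltzmann machine and its application; 张艳霞; "China Excellent Master's Theses Full-text Database, Information Science and Technology"; 20170215; 15-50 *
Research on image classification algorithms based on deep learning; 房雪键; "China Excellent Master's Theses Full-text Database, Information Science and Technology"; 20170215; 13-21 *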

Also Published As

Publication number Publication date
CN108805167A (en) 2018-11-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant