CN108805167B - Sparse depth confidence network image classification method based on Laplace function constraint - Google Patents
- Publication number: CN108805167B
- Application number: CN201810417793.5A
- Authority: CN (China)
- Prior art keywords: image, training, RBM, sparse, hidden layer
- Legal status: Active (an assumption as listed by Google, not a legal conclusion)
Classifications
- G06F18/214: Pattern recognition; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/241: Pattern recognition; Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/24155: Pattern recognition; Classification techniques based on parametric or probabilistic models; Bayesian classification
- G06N3/045: Neural networks; Architecture, e.g. interconnection topology; Combinations of networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a sparse deep belief network image classification method based on a Laplace function constraint, and belongs to the field of image processing and deep learning. Inspired by analyses of the primate visual cortex, the method first introduces a penalty regularization term into the likelihood function of the unsupervised stage and maximizes the objective function with the contrastive divergence (CD) algorithm; at the same time, a Laplace sparsity constraint yields a sparse distribution over the training set, so that intuitive feature representations can be learned from unlabeled data. Second, an improved sparse deep belief network is proposed in which the Laplace distribution induces the sparse state of the hidden layer nodes and the scale parameter of the distribution controls the strength of the sparsity. Finally, the parameters of the LSDBN network are trained with stochastic gradient descent. Among the methods compared, the proposed method achieves the best recognition accuracy and good sparsity when only a few samples per class are available.
Description
Technical Field
The invention relates to the field of image processing and deep learning, and in particular to an image classification method using a Laplace Sparse Deep Belief Network (LSDBN), i.e. a sparse deep belief network constrained by a Laplace function.
Background
Existing image classification mainly uses methods based on generative or discriminative models. These shallow models have limitations: with limited samples their ability to express complex functions is restricted and their generalization suffers, which degrades classification performance. Image features also contain much noise and redundant information and require preprocessing, which consumes considerable time and resources. Good feature extraction algorithms and classification models are therefore an important research direction in image processing.
Deep learning has developed rapidly in recent years. In 2006 Hinton et al. proposed the Deep Belief Network (DBN) together with an unsupervised greedy layer-by-layer training algorithm, which addressed the tendency of deep neural networks to fall into local optima and set off a new wave of deep learning research in academia. A DBN obtains abstract representations of the raw data through multi-level feature transformations, which improves accuracy on tasks such as classification and prediction. Because it learns features automatically and reduces data dimensionality, the DBN has become one of the most widely used network structures in deep learning, and it has made breakthrough progress in fields such as speech recognition, image classification and face recognition.
An image classification algorithm built on a DBN learns the feature representation of every layer as a whole and preserves the spatial information of the image features; it exploits the DBN's ability to learn classification features automatically and avoids the poor generality of traditional feature extraction algorithms. Although the DBN model has achieved encouraging results, feature homogenization occurs during training: a large number of common features are learned, the posterior probabilities of the hidden layer units become high, and useful feature representations of the data are not learned well. The problem is especially pronounced when the number of hidden layer units is too small. The current remedy is to adjust the sparsity of the hidden layer nodes and reduce the similarity between the columns of the connection weight matrix, i.e. to sparsify the network by adding a sparse penalty factor. Studies of the human visual system show that only a few neurons are activated in response to a given stimulus. Inspired by these studies, researchers proposed sparse coding theory to model the sparse representation used by the visual system.
In computer vision, sparse representations are considered insensitive to local deformation; learning a sparse representation focuses on the most important characteristics of the object, so redundant features can be discarded and the effects of overfitting and noise are reduced. Introducing sparsity into the training of a Restricted Boltzmann Machine (RBM) is therefore a meaningful way to avoid feature homogenization. Researchers have proposed a variety of sparse RBM models for this purpose. Some have tried introducing L0 regularization of the hidden unit activation probabilities into the RBM likelihood function, but solving the L0-regularized problem is NP-hard. Because L1 regularization yields a convex optimization problem, other researchers introduced L1 regularization of the hidden unit activation probabilities into the RBM likelihood function and proposed a new sparse deep belief network. Hinton proposed a cross-entropy sparse penalty factor, which gives the hidden units overall sparsity. Lee et al. proposed a sparse RBM based on the sum of squared errors (SP-RBM). Others proposed sparse RBMs based on rate-distortion theory (SR-RBM), but no principled way of obtaining the SR-RBM distortion measure exists. In summary, for a DBN, sparse behavior of the binary hidden units can be obtained by specifying a "sparse target" in an RBM variant. However, this approach requires the "sparse target" to be set in advance, and all hidden layer nodes have the same sparsity in a given state.
Disclosure of Invention
To solve the problems in the prior art, the invention provides a sparse deep belief network image classification method based on a Laplace function constraint.
The technical scheme of the invention is as follows:
a sparse deep belief network image classification method based on a Laplace function constraint comprises the following steps:
step 1, selecting a training image data set and preprocessing the images to obtain a training data set;
step 2, pre-training the LSDBN network model with the preprocessed training data set: each LS-RBM is trained bottom-up and without supervision using the contrastive divergence (CD) algorithm, the output of each LS-RBM serving as the input of the next, to obtain the parameter values;
step 3, taking the parameter values obtained in step 2 as the initial values of a fine-tuning stage, and fine-tuning the whole LSDBN network with a top-down back propagation algorithm to obtain the LSDBN network model;
step 4, inputting the test image data set into the LSDBN network model obtained in step 3, performing the recognition test with a Softmax classifier, and finally outputting the image classification result.
The step 1 specifically comprises the following steps: converting the color image into a grayscale image by a binarization method, and normalizing the gray values of the grayscale image to [0,1] to obtain the training data set, where the normalization formula is:
$$x = \frac{\hat{x} - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$
where $\hat{x}$ is a feature value of the image data set, $x_{\max}$ and $x_{\min}$ are respectively the maximum and minimum over all features of the image data set, and $x$ is the normalized image data set.
The step 2 specifically comprises the following steps:
step 2.1, constructing an LSDBN network model and setting its structural parameters: the number of visible layer nodes, the number of hidden layer nodes, the number of hidden layers, the number of iterations and the number of fine-tuning iterations; the number of hidden layer nodes is determined by the feature dimension of the input image set;
step 2.2, taking the preprocessed training data set x as the input of a first LS-RBM, and training the LS-RBM by adopting a CD algorithm;
(1) the relationship between the visible layer and the hidden layer is expressed as an energy function:
where θ denotes the model parameters, i.e. $\theta = \{W_{ij}, a_i, b_j\}$; $W_{ij}$ is the weight matrix between the visible layer and the hidden layer, $a_i$ is the bias of the visible layer nodes and $b_j$ is the bias of the hidden layer nodes; i indexes the features of the input image, i.e. the visible layer nodes, of which there are n; j indexes the hidden layer nodes, of which there are m; $v_i$ denotes the i-th visible layer node and $h_j$ the j-th hidden layer node;
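The energy function itself did not survive in this text. Assuming the standard binary RBM and the symbols defined above, formula (2) would plausibly read as follows; this is a reconstruction, not the patent's own typesetting.

```latex
E(v, h; \theta) = -\sum_{i=1}^{n} a_i v_i - \sum_{j=1}^{m} b_j h_j
                  - \sum_{i=1}^{n} \sum_{j=1}^{m} v_i W_{ij} h_j \tag{2}
```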
(2) based on the energy function formula (2), the joint probability distribution of v and h in the RBM is obtained as follows:
where Z(θ) is the normalizing partition function, obtained by summing over all possible pairs of visible layer and hidden layer configurations, $a_i$ is the bias of the visible layer nodes and $b_j$ is the bias of the hidden layer nodes;
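The joint distribution is also missing from this text. Under the usual Boltzmann form for an energy-based model, formula (3) and its partition function would be the following; treat this as a reconstruction from the standard RBM definition.

```latex
P(v, h; \theta) = \frac{1}{Z(\theta)} \, e^{-E(v, h; \theta)} \tag{3}

Z(\theta) = \sum_{v} \sum_{h} e^{-E(v, h; \theta)}
```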
from the joint probability distribution of formula (3), the marginal probability distributions of the visible layer v and the hidden layer h are obtained respectively:
and the conditional probability distributions of the visible layer v and the hidden layer h are derived using Bayes' formula and the definition of the sigmoid activation function:
where σ(·) is the sigmoid activation function, i.e. the nonlinear mapping function of the neuron;
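The marginal and conditional distributions referred to above are not present in this text. For a binary RBM they take the standard forms below. The tags (7) and (8) follow the reference in the next step, and which conditional carries which number is an assumption; the numbers of the other formulas are not recoverable.

```latex
P(v; \theta) = \frac{1}{Z(\theta)} \sum_{h} e^{-E(v, h; \theta)}
\qquad
P(h; \theta) = \frac{1}{Z(\theta)} \sum_{v} e^{-E(v, h; \theta)}

P(h_j = 1 \mid v; \theta) = \sigma\Big(b_j + \sum_{i=1}^{n} v_i W_{ij}\Big) \tag{7}

P(v_i = 1 \mid h; \theta) = \sigma\Big(a_i + \sum_{j=1}^{m} W_{ij} h_j\Big) \tag{8}

\sigma(x) = \frac{1}{1 + e^{-x}}
```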
using formulas (7) and (8), an approximate reconstruction P(v; θ) of the training image is obtained by one step of Gibbs sampling with the contrastive divergence algorithm;
(3) solving P (v; theta) by using a maximum likelihood method to obtain an optimal value of theta; the likelihood function for LS-RBM is:
the optimal values of the parameters are:
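Neither formula is present in this text. A conventional way to write the log-likelihood over the training images and its maximizer is sketched below, with m taken as the number of training images (the text also uses m for the number of hidden nodes, so the symbol is overloaded in the source).

```latex
\mathcal{L}(\theta) = \sum_{l=1}^{m} \log P\big(v^{(l)}; \theta\big)

\theta^{*} = \arg\max_{\theta} \; \mathcal{L}(\theta)
```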
after adding the sparse penalty term, the objective function of LS-RBM pre-training optimization is as follows:
$$F = F_{unsup} + \lambda F_{sparse} \tag{11}$$
where λ is a sparsity parameter that adjusts the relative importance of $F_{sparse}$, and $F_{sparse}$ denotes the sparse regularization function, given by:
where $L(q_j, p, u)$ is the Laplace probability density function; $q_j$ is the average of the conditional expectations of the j-th hidden layer unit over the given data; p is a constant that controls the sparsity of the hidden units $h_j$; u is the scale parameter; and $q_j$ is expressed as:
where E(·) is the conditional expectation of the j-th hidden layer unit given the data; l indexes the training images and m is the number of training images; $h_j^{(l)}$ is the j-th hidden layer unit corresponding to the l-th image and $v^{(l)}$ is the visible layer unit corresponding to the l-th image; $P(h_j^{(l)} = 1 \mid v^{(l)})$ is the activation probability of hidden layer unit $h_j$ given the visible layer v; and g is the sigmoid function;
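The formulas for the sparse term and for q_j are missing from this text. A form consistent with the definitions above is sketched here, assuming the standard Laplace density with location p and scale u and a plain sum over hidden units; whether the patent sums the density directly or applies a further transformation cannot be confirmed from the text.

```latex
F_{sparse} = \sum_{j} L(q_j, p, u),
\qquad
L(q_j, p, u) = \frac{1}{2u} \exp\!\Big(-\frac{|q_j - p|}{u}\Big)

q_j = \frac{1}{m} \sum_{l=1}^{m} E\big[h_j^{(l)} \mid v^{(l)}\big]
    = \frac{1}{m} \sum_{l=1}^{m} g\Big(\sum_{i=1}^{n} v_i^{(l)} W_{ij} + b_j\Big)
```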
after adding the sparse penalty term, training LS-RBM aims at solving the optimal value of the objective function of the formula (10):
where $P(v^{(l)})$ is the likelihood function to be optimized for the LS-RBM, i.e. the distribution P(v; θ);
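The objective is not present in this text. Combining the log-likelihood with the Laplace term weighted by λ, and assuming both are maximized as the description later states, it would plausibly take this form:

```latex
\max_{\theta} \;\; \sum_{l=1}^{m} \log P\big(v^{(l)}\big)
  \; + \; \lambda \sum_{j} L(q_j, p, u)
```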
(4) differentiating the objective function of the LS-RBM and applying gradient descent to update the weight matrix and the hidden layer bias, where the derivatives are:
and substituting the derivatives into the update formula of the parameter θ to obtain the new parameter values:
a(1):=ma+α(v1-v2) (26)
(5) continuing to train the network with the new parameter values; by continually optimizing the objective function, the activation probability of the hidden layer units gradually approaches the given fixed value p, and a set of weight parameters and corresponding biases is learned; with suitable parameters, sparse feature vectors are found and redundant features in the image are suppressed, the learned weights are combined using the main features of the image to reconstruct the input data, and the training of the first LS-RBM and the update of the corresponding parameter value θ are completed.
step 2.3, using $W^{(1)}$ and $b^{(1)}$ trained by the first LS-RBM, computing $P(h^{(1)} \mid v^{(1)}, W^{(1)}, b^{(1)})$ to obtain the input features of the second LS-RBM, and training the second LS-RBM with the algorithm of step 2.2;
step 2.4, training recursively in this way up to layer L-1; after multiple loop iterations a deep sparse DBN model, i.e. the LSDBN network model, is obtained;
step 2.5, initializing $W^{(L)}$ and $b^{(L)}$ of layer L, and using $\{W^{(1)}, W^{(2)}, \ldots, W^{(L)}\}$ and $\{b^{(1)}, b^{(2)}, \ldots, b^{(L)}\}$ to form a deep neural network with L layers, in which the output layer corresponds to the label data of the training image set and a Softmax classifier serves as the output layer;
the step 3 specifically comprises the following steps:
step 3.1, inputting the parameter value theta trained in the step 2 as an initial value of the parameter of the fine tuning stage into an LSDBN network model of the fine tuning stage;
step 3.2, calculating the activation value of each hidden layer unit by using a forward propagation algorithm;
step 3.3, calculating the error between the forward-propagated output for the training image set and the corresponding labels, back-propagating the error, and calculating the residual of each hidden layer unit, which represents that unit's contribution to the output error; computing the partial derivatives from the residual of each unit, and in each iteration updating the weight matrix and the biases by gradient descent according to formulas (28) and (29) until the maximum number of iterations is reached, to obtain the fine-tuned LSDBN network model;
where α is the learning rate, and J(W, b) is the cost function computed from the error between the actual model output and the corresponding label value.
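Formulas (28) and (29) are not present in this text. For plain gradient descent with learning rate α on the cost J(W, b), the per-layer updates conventionally read as below; the layer superscript is an assumption of this sketch.

```latex
W^{(l)} := W^{(l)} - \alpha \, \frac{\partial J(W, b)}{\partial W^{(l)}} \tag{28}

b^{(l)} := b^{(l)} - \alpha \, \frac{\partial J(W, b)}{\partial b^{(l)}} \tag{29}
```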
The step 4 specifically comprises the following steps:
step 4.1, inputting the preprocessed test image set into the LSDBN network model finely adjusted in the step 3, and extracting main characteristics of the test image;
step 4.2, inputting the main features of the test image into the Softmax classifier and outputting the probability of belonging to each class, where the output of a k-class Softmax classifier is:
where $x^{(i)}$ is the i-th test image, $y^{(i)}$ is the label of the i-th test image, $p(y^{(i)} = k \mid x^{(i)}; \theta)$ is the probability that the i-th test image belongs to the k-th class, and the normalizing factor scales the probability distribution so that all probabilities sum to 1;
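The Softmax output formula is missing from this text. In the usual notation, with θ_c denoting the parameter row of class c (an index introduced here to avoid reusing k), it would read:

```latex
p\big(y^{(i)} = k \mid x^{(i)}; \theta\big)
  = \frac{\exp\big(\theta_k^{T} x^{(i)}\big)}
         {\sum_{c} \exp\big(\theta_c^{T} x^{(i)}\big)}
```

The denominator is the normalizing factor mentioned above; the sum runs over all classes.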
step 4.3, calculating the classification accuracy from the probability that each test image belongs to a given class and the class label of that image, and finally outputting the image classification result.
The invention has the following beneficial effects. To give the model interpretability and discriminative ability, a penalty regularization term is first introduced into the likelihood function of the unsupervised stage and the objective function is maximized by CD training; at the same time, the sparsity constraint yields a sparse distribution over the training set, so that useful low-level feature representations are learned from unlabeled data, in line with the energy-minimizing economy strategy of biological evolution. Second, a sparse deep belief network based on the Laplace function is proposed: the Laplace distribution induces the sparse state of the hidden layer, and the scale parameter of the distribution controls the sparsity strength, so that sparse representations are learned and the network has stronger feature extraction capability.
Drawings
Fig. 1 is a flowchart of an image classification method of an LSDBN model in the present invention.
FIG. 2 shows the classification accuracy of LS-RBM on the Pendigits data set in the present invention.
FIG. 3 shows the classification accuracy of LSDBN on the Pendigits data set in the present invention.
Detailed Description
The following further describes a specific embodiment of the present invention with reference to the drawings and technical solutions.
Example 1:
As shown in Fig. 1, a sparse deep belief network image classification method based on a Laplace function constraint comprises the following specific steps:
Step 1, selecting a suitable training image data set and preprocessing the images to obtain the training data set.
For feature extraction, the color image is converted into a grayscale image by binarization and the gray values are normalized to [0,1], so that features need only be extracted from a single two-dimensional gray-level matrix. The normalization formula is:
$$x = \frac{\hat{x} - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$
where $\hat{x}$ is a feature value of the image data set, $x_{\max}$ and $x_{\min}$ are respectively the maximum and minimum over all features of the image data set, and $x$ is the normalized image data set.
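A minimal Python sketch of the min-max normalization in step 1, assuming the image is already a two-dimensional list of gray values. The function name `min_max_normalize` and the handling of a constant image are choices of this sketch; it also takes the extrema over a single image, whereas the text takes them over all features of the data set.

```python
def min_max_normalize(image):
    """Scale a 2-D list of gray values to [0, 1] via (x - min) / (max - min)."""
    flat = [value for row in image for value in row]
    x_min, x_max = min(flat), max(flat)
    if x_max == x_min:
        # Constant image: the formula is undefined, so this sketch returns zeros.
        return [[0.0 for _ in row] for row in image]
    scale = x_max - x_min
    return [[(value - x_min) / scale for value in row] for row in image]


gray = [[0, 128], [64, 255]]
normalized = min_max_normalize(gray)
```

For a whole data set, the same function could be applied with `x_min` and `x_max` computed once over all images.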
Step 2, using the preprocessed training data set to pre-train the LSDBN network model. The network is pre-trained on the input training data set: the contrastive divergence (CD-k) algorithm is used to train each LS-RBM separately, unsupervised and bottom-up, with the output of each lower LS-RBM serving as the input of the LS-RBM above it; iterative training yields the corresponding weights and biases and, finally, the high-level features of the data. The specific steps are as follows:
Step 2.1, constructing the LSDBN network model and setting its structural parameters: the number of visible layer nodes equals the feature dimension of the input data set; the number of hidden layer nodes is set according to the feature dimension of each data set; the number of hidden layers is set to 2, the number of iterations to 100 and the number of fine-tuning iterations to 100, which gives a good convergence result.
Step 2.2, taking the preprocessed training data set as the input of the first LS-RBM and training the LS-RBM with the CD algorithm.
(1) Since RBM is an energy-based model, the relationship between the visible layer and the hidden layer can be expressed using an energy function as:
where θ denotes the model parameters, i.e. $\theta = \{W_{ij}, a_i, b_j\}$; $W_{ij}$ is the weight matrix between the visible layer and the hidden layer, $a_i$ is the bias of the visible layer nodes and $b_j$ is the bias of the hidden layer nodes; i indexes the features of the input image, i.e. the visible layer nodes, of which there are n; j indexes the hidden layer nodes, of which there are m; $v_i$ denotes the i-th visible layer node and $h_j$ the j-th hidden layer node.
(2) According to the energy function formula (2), every joint assignment of the visible layer nodes and hidden layer nodes has a corresponding energy value. Once all parameters are fixed, the joint probability distribution of v and h in the RBM is defined, following the definition of the energy function and the principles of statistical thermodynamics, as:
In the above equation, Z(θ) is the normalizing partition function, the sum over all possible pairs of visible layer and hidden layer configurations. What the RBM learns is this joint probability distribution, i.e. a generative model of the data.
Further, for a practical problem the quantity of most interest is the probability distribution that the RBM defines over the observed data v, i.e. the marginal distribution of P(v, h; θ). The probability that the network assigns to the visible units (the training data) is obtained by summing over all possible hidden unit configurations:
Accordingly, the marginal probability distribution of the hidden layer h is:
The conditional probability distributions of the visible layer and the hidden layer can be derived from Bayes' formula and the definition of the sigmoid function as follows:
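The two conditionals drive the contrastive divergence step used for pre-training. Below is a minimal pure-Python sketch of one CD-1 update for a plain binary RBM, without the sparsity term and without momentum. The function names, the learning rate and the use of reconstruction probabilities rather than sampled visible states are assumptions of this sketch, not details taken from the patent.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, b):
    """P(h_j = 1 | v) = sigmoid(b_j + sum_i v_i * W[i][j])."""
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]


def visible_probs(h, W, a):
    """P(v_i = 1 | h) = sigmoid(a_i + sum_j W[i][j] * h_j)."""
    return [sigmoid(a[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(a))]


def sample(probs, rng):
    """Draw binary states from independent Bernoulli probabilities."""
    return [1.0 if rng.random() < p else 0.0 for p in probs]


def cd1_update(v0, W, a, b, lr, rng):
    """One CD-1 step for a single training vector; updates W, a, b in place."""
    ph0 = hidden_probs(v0, W, b)        # positive phase
    h0 = sample(ph0, rng)               # one Gibbs half-step up
    v1 = visible_probs(h0, W, a)        # reconstruction (probabilities)
    ph1 = hidden_probs(v1, W, b)        # negative phase
    for i in range(len(a)):
        for j in range(len(b)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        a[i] += lr * (v0[i] - v1[i])
    for j in range(len(b)):
        b[j] += lr * (ph0[j] - ph1[j])
    return v1


rng = random.Random(0)
n_visible, n_hidden = 4, 3
W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
a = [0.0] * n_visible
b = [0.0] * n_hidden
reconstruction = cd1_update([1.0, 0.0, 1.0, 0.0], W, a, b, lr=0.1, rng=rng)
```

In the LS-RBM, the gradient of the Laplace sparsity term would be added to the updates of `W` and `b` after this step.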
(3) For a given set of training samples, training the RBM means finding the value of the parameter θ at which the RBM fits the training samples best. The log-likelihood of the RBM, log P(v | θ), is therefore maximized by the maximum likelihood method to obtain the optimal value of θ:
Suppose a training set $\{v^{(1)}, \ldots, v^{(m)}\}$ is given; the unsupervised pre-training optimization model with a sparse penalty term is defined as follows:
$$F = F_{unsup} + \lambda F_{sparse} \tag{10}$$
where $F_{unsup}$ is the likelihood function of the RBM, i.e. as shown in equation (4-5); λ is the sparse penalty parameter; and $F_{sparse}$ denotes an arbitrary sparse regularization function.
From a statistical point of view, the goal of a sparse RBM is for most hidden layer nodes to be zero, in other words for the activation probabilities of the hidden layer units to be close to zero. If a hidden layer unit is sparse (inactive in most cases), the feature of that unit represents only a small portion of the training data.
By defining a sparse regularization term, the average activation probability over the training data can be reduced, which keeps the "activation rate" of the model neurons (corresponding to the random variables $h_j$) at a fairly low level, so that neuron activations are sparse. This requires the distribution of hidden unit activation probabilities to be sharply peaked with heavy tails.
Based on the above analysis and inspired by compressive sensing theory, the invention induces the sparse state of the hidden layer units with a Laplace function penalty. The Laplace function is heavy-tailed and penalizes differently depending on how far the hidden unit activation probability deviates from the fixed parameter p. It also has a scale parameter that controls the degree of sparsity.
In probability theory and mathematical statistics, the Laplace distribution is a continuous distribution with heavier tails than the normal distribution. While the sparse solution is being found, most features gradually come to lie on the two sides of the function, that is, closer to zero, while a few hidden units have a higher activation probability and are activated. Through training, the activated hidden units represent the main features of the data, which makes a sparse feature representation easier to achieve.
The objective function after adding the sparse regularization term is:
The Laplace sparse penalty term is defined as follows:
where $L(q_j, p, u)$ is the Laplace probability density function; $q_j$ is the average of the conditional expectations of the j-th hidden layer unit over the given data; p is a constant that controls the sparsity of the hidden units $h_j$: by setting its value, the optimization drives the average activation probability of the hidden layer nodes as close to p as possible; u is the scale parameter, whose value controls the degree of sparsity. $q_j$ is given by:
where E(·) is the conditional expectation of the j-th hidden layer node given the data, m is the number of training samples, $P(h_j^{(l)} = 1 \mid v^{(l)})$ is the activation probability of hidden layer unit $h_j$ given the visible layer v, and g is the sigmoid function.
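A small Python sketch of how the average activation q_j and a Laplace sparsity term could be evaluated. It assumes the standard Laplace density with location p and scale u, summed over hidden units; the exact form used in the patent is not recoverable from this text, and all function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mean_activations(data, W, b):
    """q_j = (1/m) * sum_l sigmoid(sum_i v_i^(l) * W[i][j] + b_j)."""
    m = len(data)
    q = []
    for j in range(len(b)):
        total = 0.0
        for v in data:
            total += sigmoid(sum(v[i] * W[i][j] for i in range(len(v))) + b[j])
        q.append(total / m)
    return q


def laplace_pdf(q, p, u):
    """Laplace density with location p and scale u, evaluated at q."""
    return math.exp(-abs(q - p) / u) / (2.0 * u)


def sparse_term(data, W, b, p, u):
    """Assumed form of F_sparse: sum of Laplace densities over hidden units."""
    return sum(laplace_pdf(q_j, p, u) for q_j in mean_activations(data, W, b))


data = [[1.0, 0.0], [0.0, 1.0]]
W = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, -2.0]
q = mean_activations(data, W, b)
value = sparse_term(data, W, b, p=0.5, u=0.1)
```

The density peaks where q_j equals p, so maximizing this term pulls each unit's average activation toward p, and a smaller u makes the peak sharper.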
Further, the objective function for training the LS-RBM is as follows:
In the above equation, the first term is the log-likelihood term and the second term is the sparse penalty term; λ is the weight of the penalty term and expresses its relative importance in the objective function with respect to fitting the data distribution. The sparse regularization term is thus maximized at the same time as the log-likelihood function.
(4) The bias of the hidden layer directly influences the sparsity of the hidden units, so only the weight matrix and the hidden layer bias are updated. The gradient of the sparse regularization term is calculated as follows:
where the derivative of the first term in the above formula is:
The second term expands as follows:
where $\sigma_j = \sum_i W_{ij} v_i + b_j$ denotes the input of hidden unit j. The derivative of each term of the expansion is:
Differentiating with respect to the hidden layer bias gives:
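The derivation itself is missing from this text, so the following is only an illustrative sketch. Assuming the sparsity contribution of hidden unit j is the Laplace density L(q_j, p, u), the chain rule gives its gradient with respect to the hidden bias b_j, which the code checks against a finite difference. Function names and test values are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def laplace_pdf(q, p, u):
    return math.exp(-abs(q - p) / u) / (2.0 * u)


def hidden_inputs(data, W, b, j):
    """sigma_j = sum_i W[i][j] * v_i + b_j for every training vector."""
    return [sum(v[i] * W[i][j] for i in range(len(v))) + b[j] for v in data]


def sparse_term_j(data, W, b, j, p, u):
    """Assumed sparsity contribution of hidden unit j: L(q_j, p, u)."""
    acts = [sigmoid(s) for s in hidden_inputs(data, W, b, j)]
    return laplace_pdf(sum(acts) / len(acts), p, u)


def sparse_grad_bias(data, W, b, j, p, u):
    """d L(q_j, p, u) / d b_j by the chain rule."""
    acts = [sigmoid(s) for s in hidden_inputs(data, W, b, j)]
    q_j = sum(acts) / len(acts)
    sign = (q_j > p) - (q_j < p)        # sign(q_j - p); 0 at the kink q_j == p
    dL_dq = -sign / u * laplace_pdf(q_j, p, u)
    dq_db = sum(s * (1.0 - s) for s in acts) / len(acts)
    return dL_dq * dq_db


data = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[0.3, -0.2], [0.1, 0.4]]
b = [0.2, -0.1]
analytic = sparse_grad_bias(data, W, b, j=0, p=0.1, u=0.5)

eps = 1e-6
numeric = (sparse_term_j(data, W, [b[0] + eps, b[1]], 0, 0.1, 0.5)
           - sparse_term_j(data, W, [b[0] - eps, b[1]], 0, 0.1, 0.5)) / (2 * eps)
```

The gradient with respect to a weight W_ij has the same structure, with each term s(1 - s) additionally multiplied by the visible value v_i.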
Substituting the derivatives into the parameter update model gives the new parameter values:
a:=ma+α(v1-v2) (26)
(5) After the parameter values are updated, the network continues to be trained with the new values. By continually optimizing the objective function, the activation probability of the hidden layer nodes gradually approaches the given fixed value p, and a set of weight parameters and corresponding biases is learned. With suitable parameters, sparse feature vectors can be found and redundant features suppressed; the learned weights are combined using the main features to reconstruct the input data, which improves the robustness of the algorithm to noise. This completes the training of the first LS-RBM and yields the corresponding parameter values.
Step 2.3, using $W^{(1)}$ and $b^{(1)}$ trained by the first LS-RBM, computing $P(h^{(1)} \mid v^{(1)}, W^{(1)}, b^{(1)})$ to obtain the input features of the second LS-RBM, and training the second LS-RBM with the algorithm of step 2.2;
Step 2.4, training recursively in this way up to layer L-1, where the parameters and output of the trained lowest-layer LS-RBM serve as the data for the next higher layer of the model, i.e. the next LS-RBM; after multiple loop iterations a deep sparse DBN model, i.e. the LSDBN model, is learned;
Step 2.5, initializing $W^{(L)}$ and $b^{(L)}$ of layer L, and using $\{W^{(1)}, W^{(2)}, \ldots, W^{(L)}\}$ and $\{b^{(1)}, b^{(2)}, \ldots, b^{(L)}\}$ to form a deep neural network with L layers, in which the output layer corresponds to the label data and a Softmax classifier serves as the output layer;
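Steps 2.3 to 2.5 describe greedy layer-wise stacking, which can be sketched in Python as follows. The trainer is passed in as a callable: `train_rbm` is a hypothetical interface standing in for the LS-RBM training of step 2.2, and `zero_trainer` is only a placeholder so that the sketch runs.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def propagate_up(batch, W, b):
    """Return P(h | v, W, b) for every sample; this feeds the next LS-RBM."""
    return [[sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(b))] for v in batch]


def pretrain_stack(data, hidden_sizes, train_rbm):
    """Greedy layer-wise pre-training.

    train_rbm(batch, n_hidden) is a hypothetical callable supplied by the
    caller; it must return (W, b) for one trained LS-RBM.
    """
    layers, layer_input = [], data
    for n_hidden in hidden_sizes:
        W, b = train_rbm(layer_input, n_hidden)
        layers.append((W, b))
        layer_input = propagate_up(layer_input, W, b)
    return layers, layer_input


def zero_trainer(batch, n_hidden):
    """Placeholder for demonstration only: returns untrained zero parameters."""
    n_visible = len(batch[0])
    return [[0.0] * n_hidden for _ in range(n_visible)], [0.0] * n_hidden


layers, top_features = pretrain_stack([[1.0, 0.0, 1.0]], [4, 2], zero_trainer)
```

The returned `layers` would initialize the deep network, and `top_features` would be the input to the Softmax output layer before fine-tuning.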
Step 3, further optimizing the LSDBN. The parameter values obtained in the pre-training stage are taken as the initial values of the fine-tuning stage, and the whole LSDBN network is fine-tuned. The invention fine-tunes the whole network with a top-down supervised learning algorithm, the back propagation algorithm: training samples and test data are input for training, and the error is propagated backwards from top to bottom to optimize the network. The main steps are as follows:
Step 3.1, taking the network parameters trained in the pre-training stage as the initial values of the fine-tuning stage, and inputting the training samples into the network for fine-tuning.
Step 3.2, calculating the activation value of each layer of neurons with the forward propagation algorithm.
Step 3.3, calculating the error between the forward-propagated output for the training image set and the corresponding labels, back-propagating the error, and calculating the residual of each hidden layer unit, which represents that unit's contribution to the output error; computing the partial derivatives from the residual of each unit, and in each iteration updating the weight matrix and the biases by gradient descent according to formulas (28) and (29) until the maximum number of iterations is reached, to obtain the fine-tuned LSDBN network model;
where α is the learning rate and J(W, b) is the cost function of the network, computed from the actual output values and the target values. The gradient descent iteration is repeated to reduce the value of J(W, b).
Step 4, inputting the test image data set into the LSDBN network model trained in step 3 and performing the recognition test with a Softmax classifier to output the image classification result. For image classification, a classifier is placed at the top layer of the network and the test data are input for classification. For generality, the invention uses a Softmax classifier; for a k-class classifier, the output is:
where θ is the parameter matrix containing the weights and biases; each row holds the classifier parameters of one class, k rows in total. The normalizing factor scales the probability distribution so that all probabilities sum to 1. The cost function J(θ) of the fine-tuning stage is then:
where 1{·} is the indicator function, that is, 1{an expression whose value is true} = 1 and 1{an expression whose value is false} = 0.
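A minimal Python sketch of the Softmax output and of a cross-entropy cost built from the indicator function. It assumes the plain averaged negative log-likelihood without a weight-decay term, and folds no separate bias into θ; both are simplifications of this sketch, since the patent's own cost formula is not present in this text.

```python
import math


def softmax(scores):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_probabilities(theta, x):
    """theta has one row per class; returns p(y = k | x; theta) for every k."""
    return softmax([sum(t * xi for t, xi in zip(row, x)) for row in theta])


def cost(theta, samples, labels):
    """J(theta) = -(1/m) * sum_i sum_k 1{y_i = k} * log p(y_i = k | x_i; theta)."""
    total = 0.0
    for x, y in zip(samples, labels):
        # The indicator keeps only the log-probability of the true class y.
        total += math.log(class_probabilities(theta, x)[y])
    return -total / len(samples)


theta = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
probs = class_probabilities(theta, [2.0, 0.0])
loss = cost(theta, [[2.0, 0.0], [0.0, 2.0]], [0, 1])
```

A bias can be included by appending a constant 1 to every feature vector and one extra column to `theta`.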
The method provided by the invention is evaluated on the MNIST handwritten digit database and the Pendigits handwriting recognition data set.
Example 2: experiments on MNIST handwriting database
The MNIST handwritten digit data set contains 60000 training samples and 10000 test samples, each image being 28 x 28 pixels. To facilitate the extraction of image features, different numbers of images per class are drawn from the 60000 training samples for experimental analysis. The model has 784 visible layer nodes and 500 hidden layer nodes; the learning rate is set to 1, the number of batch blocks to 100 and the maximum number of iterations to 100, and the model is trained with the CD algorithm using one Gibbs step (CD-1).
Table 1 shows the sparsity measured on the MNIST data set for the present invention, compared with the other two sparse models. The sparsity measure is given by the following formula:
For a sparse model, the higher the sparsity, the more stable and the more robust the algorithm. As can be seen from Table 1, LS-RBM has a higher sparsity value than SP-RBM and SR-RBM, i.e., it learns a sparser representation.
TABLE 1 sparsity measurement results on MNIST dataset
Table 2 shows the classification accuracy of the LS-RBM on the MNIST data set for different numbers of samples per class, compared with an artificial neural network (ANN), an autoencoder (AE), a restricted Boltzmann machine (RBM), SR-RBM and SP-RBM. As Table 2 shows, the LS-RBM method of the invention achieves the best recognition accuracy for every number of samples per class. In particular, with 1000 samples per class it reaches a recognition rate of 96.8%, which is 3% higher than that of the second-best algorithm, SR-RBM, indicating that the proposed method has a better feature extraction capability.
TABLE 2 LS-RBM classification accuracy results on MNIST dataset
To learn deep features, Table 3 shows the classification accuracy of the LSDBN on the MNIST data set for different numbers of samples per class, compared with the DBN, SP-DBN and SR-DBN built from RBM, SP-RBM and SR-RBM, respectively. In a deeper sparse model, each added layer extracts more abstract features, and redundant features can degrade the final classification result. As Table 3 shows, the LSDBN learns more discriminative features and improves on the second-best model, SP-DBN, by up to 3 percentage points, which indicates that the LSDBN is more robust to interference from redundant features.
TABLE 3 LSDBN Classification accuracy results on MNIST dataset
Example 3: experiments on the Pendigits handwriting recognition dataset
The Pen-Based Recognition of Handwritten Digits (PenDigits) data set contains 10992 data samples in 10 classes, split into 7494 training samples and 3498 test samples, each described by 16 features; different numbers of samples per class are used for the analysis. The model has 16 visible layer nodes and 10 hidden layer nodes; the learning rate is set to 1, the batch number to 100 and the maximum number of iterations to 1000.
Fig. 2 shows the classification accuracy of the LS-RBM on the PenDigits handwriting recognition data set for different numbers of samples per class. For most algorithms, the classification accuracy increases with the number of samples per class. The LS-RBM algorithm achieves the best classification accuracy on the PenDigits data set even when each class has only a few dozen samples, which indicates that the feature representation learned by the invention is more discriminative than those of SP-RBM and SR-RBM.
To learn deep features, building on the RBM, SP-RBM, SR-RBM and LS-RBM experiments, a second RBM is trained on the activation probabilities of the first RBM's hidden layer units (giving DBN, SP-DBN, SR-DBN and LS-DBN, respectively); the number of hidden units is again set to 10 and the number of iterations to 1000 for each model. The classification accuracy of each model is evaluated on the PenDigits test set, with the results shown in Fig. 3. The results show that, even with few samples, the sparsity constraint still captures the main features: the classification accuracy is 2.7% to 6% higher than that of the DBN model, which demonstrates the applicability of the algorithm to low-dimensional data sets.
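The stacking step described above, where the hidden activation probabilities of one RBM become the input of the next, can be sketched as follows. The layer sizes (16 inputs, 10 hidden units per layer) follow the text; everything else is an assumption for illustration: the weights are random placeholders standing in for trained parameters, the data are random, and `propagate_up` is a hypothetical helper name.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def propagate_up(v, layers):
    """Pass data through a stack of (W, b) pairs, returning each layer's
    hidden activation probabilities."""
    activations = []
    for W, b in layers:
        v = sigmoid(v @ W + b)   # output of this layer is the next layer's input
        activations.append(v)
    return activations

rng = np.random.default_rng(0)
X = rng.random((32, 16))         # stand-in for 16-feature PenDigits samples

# Placeholder parameters; in practice these come from training each RBM in turn.
layer1 = (0.1 * rng.standard_normal((16, 10)), np.zeros(10))
layer2 = (0.1 * rng.standard_normal((10, 10)), np.zeros(10))

H1, H2 = propagate_up(X, [layer1, layer2])
```

Here `H1` is what the second RBM would be trained on, and `H2` is the representation that would be passed to the classifier.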
Claims (5)
1. A sparse depth confidence network image classification method based on Laplace function constraint is characterized by comprising the following steps:
step 1, selecting a training image data set, and carrying out image preprocessing to obtain a training data set;
step 2, inputting the training data set preprocessed in step 1 into an LSDBN network model; training the LS-RBM network of each layer separately and without supervision, from bottom to top, using the contrastive divergence (CD) algorithm, with the output of each lower LS-RBM network serving as the input of the adjacent upper LS-RBM network; obtaining the parameter values of each LS-RBM network through iterative training, and finally obtaining the high-level features of the input image data; the parameter values being the weights and biases;
step 3, taking the parameter values obtained in the step 2 as initial values of a fine tuning stage, and fine tuning the whole LSDBN network by adopting a top-down back propagation algorithm to obtain an LSDBN network model;
step 4, inputting the test image data set into the LSDBN network model obtained in the step 3, performing identification test by adopting a Softmax classifier, and finally outputting an image classification result;
the step 2 specifically comprises the following steps:
step 2.1, constructing an LSDBN network model and setting the parameter values of the LSDBN network model structure: the visible layer nodes, the number of hidden layers, the number of iterations and the number of fine-tuning iterations; the hidden layer nodes are determined according to the feature dimension of the input image set;
step 2.2, taking the preprocessed training data set x as the input of a first LS-RBM, and training the LS-RBM by adopting a CD algorithm;
(1) the relationship between the visible layer and the hidden layer is expressed as an energy function:
$E(v, h; \theta) = -\sum_{i=1}^{n} a_i v_i - \sum_{j=1}^{m} b_j h_j - \sum_{i=1}^{n} \sum_{j=1}^{m} v_i W_{ij} h_j$ (2)
where θ denotes the parameters of the model, i.e., $\theta = \{W_{ij}, a_i, b_j\}$; $W_{ij}$ is the weight matrix between the visible layer and the hidden layer, $a_i$ is the bias of the visible layer nodes, and $b_j$ is the bias of the hidden layer nodes; i indexes the features of the input image, i.e., the visible layer nodes, of which there are n; j indexes the hidden layer nodes, of which there are m; $v_i$ denotes the i-th visible layer node and $h_j$ the j-th hidden layer node;
(2) based on the energy function of formula (2), the joint probability distribution of v and h in the RBM is:
$P(v, h; \theta) = \frac{1}{Z(\theta)} e^{-E(v, h; \theta)}$ (3)
where Z(θ) is the normalization term, i.e., the sum of $e^{-E(v, h; \theta)}$ over all possible pairs of visible and hidden layer states, $a_i$ is the bias of the visible layer nodes, and $b_j$ is the bias of the hidden layer nodes;
using Bayes' rule, the marginal probability distributions of the visible layer units v and the hidden layer units h are obtained from the joint probability distribution of formula (3):
and, using Bayes' rule and the definition of the sigmoid activation function, the conditional probability distributions of the visible layer units v and the hidden layer units h are derived:
$P(h_j = 1 \mid v; \theta) = \sigma\left(b_j + \sum_{i=1}^{n} v_i W_{ij}\right)$ (7)
$P(v_i = 1 \mid h; \theta) = \sigma\left(a_i + \sum_{j=1}^{m} W_{ij} h_j\right)$ (8)
where σ(·) is the sigmoid activation function, i.e., the nonlinear mapping function of the neuron;
obtaining an approximate reconstruction P(v; θ) of the training image with the contrastive divergence algorithm, through one step of Gibbs sampling using formulas (7) and (8);
(3) solving P(v; θ) by the maximum likelihood method to obtain the optimal value of θ; the likelihood function of the LS-RBM is:
the optimal values of the parameters are:
after the sparse penalty term is added, the objective function optimized in LS-RBM pre-training is:
$F = F_{unsup} + \lambda F_{sparse}$ (11)
where λ is the sparsity parameter that adjusts the relative importance of $F_{sparse}$, and $F_{sparse}$ denotes the sparse regularization function, given by:
where $L(q_j, p, u)$ is the Laplace probability density function; $q_j$ denotes the average conditional expectation of the j-th hidden layer unit given the data; p is a constant that controls the sparsity of the n hidden units $h_j$; u is the scale parameter; and $q_j$ is given by:
where E(·) is the conditional expectation of the j-th hidden layer unit given the data; l indexes the training images; m is the number of training images; $h_j^{(l)}$ is the j-th hidden layer unit corresponding to the l-th image; $v^{(l)}$ is the visible layer unit corresponding to the l-th image; $P(h_j^{(l)} = 1 \mid v^{(l)})$ is the activation probability of hidden layer unit $h_j$ given the visible layer v; g is the sigmoid function; and $\sigma_j$ denotes the input of hidden layer unit j;
after the sparse regularization term is added, the objective of training the LS-RBM is to find the optimal value of the objective function of formula (10):
where $P(v^{(l)})$ is the likelihood function to be optimized for the LS-RBM, i.e., the distribution P(v; θ);
(4) differentiating the objective function of the LS-RBM and updating the weight matrix and the hidden layer bias by gradient descent, the derivatives being:
and substituting the derivatives into the update formula for the parameter θ to obtain the new parameter values:
$a^{(1)} := m a + \alpha (v_1 - v_2)$ (26)
where α is the learning rate;
(5) continuing to train the network with the new parameter values; by continually optimizing the objective function, the activation probability of the hidden layer units gradually approaches the given fixed value p, and a set of weight parameters and corresponding biases is learned; with suitable parameters, sparse feature vectors are found and the redundant features in the image are suppressed, so that the input data is reconstructed from the main features of the image combined with the learned weights; this completes the training of the first LS-RBM and the update of the corresponding parameter value θ;
step 2.3, using $W^{(1)}$ and $b^{(1)}$ trained by the first LS-RBM, obtaining the input features of the second LS-RBM via $P(h^{(1)} \mid v^{(1)}, W^{(1)}, b^{(1)})$, and then training the second LS-RBM with the algorithm of step 2.2;
step 2.4, training recursively according to the above steps until layer L-1 has been trained, and obtaining a deep sparse DBN model, namely the LSDBN network model, after multiple loop iterations;
step 2.5, initializing $W^{(L)}$ and $b^{(L)}$ of layer L, and using $\{W^{(1)}, W^{(2)}, \ldots, W^{(L)}\}$ and $\{b^{(1)}, b^{(2)}, \ldots, b^{(L)}\}$ to form a deep neural network with L layers, wherein the output layer corresponds to the label data of the training image set, and a Softmax classifier is used as the output layer.
2. The method according to claim 1, wherein step 1 is specifically: converting the color image into a grayscale image by a binarization method, and normalizing the gray values of the grayscale image to the range [0, 1] to obtain the training data set; wherein the normalization formula is:
3. The method according to claim 1 or 2, characterized in that said step 3 is in particular:
step 3.1, inputting the parameter value theta trained in the step 2 as an initial value of the parameter of the fine tuning stage into an LSDBN network model of the fine tuning stage;
step 3.2, calculating the activation value of each hidden layer unit by using a forward propagation algorithm;
step 3.3, calculating the error between the forward-propagation output for the training image set and the labels of that image set, back-propagating the error, and calculating the residual of each hidden layer unit, which represents that unit's contribution to the output error; calculating the partial derivatives from the residual of each unit, and in each iteration updating the weight matrix and the bias by gradient descent according to formulas (28) and (29), until the maximum number of iterations is reached and the fine-tuned LSDBN network model is obtained;
$W := W - \alpha \frac{\partial J(W, b)}{\partial W}$ (28)
$b := b - \alpha \frac{\partial J(W, b)}{\partial b}$ (29)
where α is the learning rate and J(W, b) is a cost function obtained from the error between the actual model output value and the corresponding label value.
4. The method according to claim 1 or 2, characterized in that said step 4 is in particular:
step 4.1, inputting the preprocessed test image set into the LSDBN network model finely adjusted in the step 3, and extracting main characteristics of the test image;
step 4.2, inputting the main features of the test image into a Softmax classifier, which outputs the probability of belonging to each class; for a k-class Softmax classifier the output probabilities are:
$h_\theta(x^{(i)}) = \begin{bmatrix} p(y^{(i)} = 1 \mid x^{(i)}; \theta) \\ p(y^{(i)} = 2 \mid x^{(i)}; \theta) \\ \vdots \\ p(y^{(i)} = k \mid x^{(i)}; \theta) \end{bmatrix} = \frac{1}{\sum_{j=1}^{k} e^{\theta_j^{T} x^{(i)}}} \begin{bmatrix} e^{\theta_1^{T} x^{(i)}} \\ e^{\theta_2^{T} x^{(i)}} \\ \vdots \\ e^{\theta_k^{T} x^{(i)}} \end{bmatrix}$
where $x^{(i)}$ is the i-th test image, $y^{(i)}$ is the label of the i-th test image, $p(y^{(i)} = k \mid x^{(i)}; \theta)$ is the probability that the i-th test image belongs to the k-th class, and $1/\sum_{j=1}^{k} e^{\theta_j^{T} x^{(i)}}$ normalizes the probability distribution so that all probabilities sum to 1;
step 4.3, calculating the classification accuracy from the probability that each test image belongs to each class and the class label corresponding to the image, and finally outputting the classification result of the image.
5. The method according to claim 3, wherein step 4 is specifically:
step 4.1, inputting the preprocessed test image set into the LSDBN network model finely adjusted in the step 3, and extracting main characteristics of the test image;
step 4.2, inputting the main features of the test image into a Softmax classifier, which outputs the probability of belonging to each class; for a k-class Softmax classifier the output probabilities are:
$h_\theta(x^{(i)}) = \begin{bmatrix} p(y^{(i)} = 1 \mid x^{(i)}; \theta) \\ p(y^{(i)} = 2 \mid x^{(i)}; \theta) \\ \vdots \\ p(y^{(i)} = k \mid x^{(i)}; \theta) \end{bmatrix} = \frac{1}{\sum_{j=1}^{k} e^{\theta_j^{T} x^{(i)}}} \begin{bmatrix} e^{\theta_1^{T} x^{(i)}} \\ e^{\theta_2^{T} x^{(i)}} \\ \vdots \\ e^{\theta_k^{T} x^{(i)}} \end{bmatrix}$
where $x^{(i)}$ is the i-th test image, $y^{(i)}$ is the label of the i-th test image, $p(y^{(i)} = k \mid x^{(i)}; \theta)$ is the probability that the i-th test image belongs to the k-th class, and $1/\sum_{j=1}^{k} e^{\theta_j^{T} x^{(i)}}$ normalizes the probability distribution so that all probabilities sum to 1;
step 4.3, calculating the classification accuracy from the probability that each test image belongs to each class and the class label corresponding to the image, and finally outputting the classification result of the image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810417793.5A CN108805167B (en) | 2018-05-04 | 2018-05-04 | Sparse depth confidence network image classification method based on Laplace function constraint |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108805167A CN108805167A (en) | 2018-11-13 |
CN108805167B true CN108805167B (en) | 2022-05-13 |
Family
ID=64093241
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108805167B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||