CN109146007B - Solid waste intelligent treatment method based on dynamic deep belief network - Google Patents
- Publication number: CN109146007B (application CN201810768405.8A)
- Authority: CN (China)
- Prior art keywords: ddbn, rbm, solid waste, training, neuron
- Prior art date: 2018-07-13
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Classifications
- G06V30/194 — Character recognition; recognition using electronic means; references adjustable by an adaptive method, e.g. learning
- G06N3/061 — Neural networks; physical realisation using biological neurons, e.g. biological neurons connected to an integrated circuit
- G06N3/08 — Neural networks; learning methods
Abstract
The invention provides an intelligent solid waste treatment method based on a dynamic deep belief network (DDBN), belonging to the fields of deep learning and intelligent solid waste treatment. The method first introduces a DDBN that uses a dynamic branch-growing and pruning algorithm, so that during training the network can add hidden-layer neurons and hidden layers according to the current training state and remove redundant neurons, effectively optimizing the DDBN's network structure. The DDBN's ability to extract the main features of raw data is then exploited to describe the random, discrete, and nonlinear feature vectors of the solid waste, making the state features of the time series easier to identify while ensuring that the main information of the raw data is not lost. Finally, based on the extracted state description of the solid waste, the DDBN predicts the optimal combustion behavior for that state, reducing the waste of resources caused by blind combustion and realizing intelligent treatment of solid waste.
Description
Technical Field
The invention belongs to the fields of deep learning and intelligent solid waste treatment. It provides a dynamic deep belief network (DDBN) model that uses a dynamic branch-growing and pruning algorithm to effectively optimize the network structure of a deep belief network, in order to solve the problem of intelligently treating the large volume of solid waste produced by the light industry.
Background
The light industry currently faces heavy environmental-protection pressure and demanding pollution-reduction tasks. With the development of the national economy, market demand for fermentation and paper-making products has grown substantially; although the pollution intensity per unit product has dropped markedly in recent years, the total emission of industrial solid waste continues to rise with the rapid expansion of production capacity. To meet the industry's energy-saving and emission-reduction targets, new waste-treatment methods must be studied and applied to raise the pollution control and treatment level of producers and to support the industry's emission-reduction efforts.
Deep learning has developed rapidly in recent years. In 2006, Hinton et al. proposed the deep belief network (DBN) together with an unsupervised greedy layer-by-layer training algorithm, alleviating the tendency of deep neural networks to fall into local optima and triggering a new wave of deep learning research. A DBN obtains abstract representations of raw data through multi-level feature transformation, improving the accuracy of tasks such as classification and prediction; with its ability to learn features automatically and reduce data dimensionality, it has become one of the most widely used network structures in deep learning.
Exploiting the DBN's ability to extract the main features of raw data, the DBN can effectively describe the state of the random, discrete, and nonlinear feature vectors of solid waste, making the state features of the time series easier to identify without losing the main information of the raw data. Based on the extracted state description of the solid waste, the DBN can then predict the optimal combustion behavior for that state, greatly reducing the waste of resources caused by blind combustion and realizing intelligent treatment of solid waste.
However, to solve a high-complexity problem such as this one, a DBN must appropriately increase its hidden-layer neurons and hidden layers. At present, the numbers of hidden-layer neurons and hidden layers are still chosen by manual experiment, and the network structure stays fixed during training; the result is large error, high computational cost, and low efficiency. A new way of designing the DBN structure is therefore needed, one that lets the network dynamically grow and prune branches according to the current training state, so that the structure is optimized and the intelligent solid waste treatment problem is better solved.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides an intelligent solid waste treatment method based on a Dynamic Deep Belief Network (DDBN).
The technical scheme of the invention is as follows:
a solid waste intelligent treatment method based on a dynamic deep belief network comprises the following steps:
step 1, measuring solid waste to obtain a solid waste data set, preprocessing the solid waste data set, and dividing to obtain a training data set and a testing data set.
The preprocessing is as follows: normalize the solid waste data set to [0, 1] with the normalization formula

x = (x̂ - x_min) / (x_max - x_min)   (1)

where x̂ is a feature value of the solid waste data set, x_max and x_min are respectively the maximum and minimum over all features of the data set, and x is the normalized solid waste data.
Step 2: input the training data set obtained after the preprocessing of step 1 into the DDBN model, and use the contrastive divergence (CD) algorithm to train each layer's restricted Boltzmann machine (RBM) individually, unsupervised, from bottom to top. During training, optimize the current RBM's network structure with the dynamic branch-growing and pruning algorithm; the network structure and parameter values of each RBM are obtained by iterative training, finally yielding the high-level features of the training data set. The parameter values are the weights and biases. The specific operations are:
Step 2.1: construct the DDBN network model and set the DDBN's parameter values: the number of visible-layer neurons, the initial numbers of hidden-layer neurons and hidden layers, the learning rate, the number of iterations, and the number of fine-tuning iterations. The number of visible-layer neurons equals the feature dimension of the training data set.
Step 2.2: input the preprocessed training data set into the first-layer RBM, pre-train the RBM with the CD algorithm, and optimize the current RBM's network structure with the dynamic branch-growing and pruning algorithm during training.
(1) The energy function E(v, h; θ) of the RBM and the joint probability distribution P(v, h; θ) of the visible- and hidden-layer neurons are:

E(v, h; θ) = -Σ_i b_i·v_i - Σ_j c_j·h_j - Σ_i Σ_j v_i·w_ij·h_j   (2)

P(v, h; θ) = e^(-E(v, h; θ)) / Z,   Z = Σ_{v,h} e^(-E(v, h; θ))   (3)

where v_i (1 ≤ i ≤ I) and h_j (1 ≤ j ≤ J) respectively denote the visible-layer and hidden-layer neurons, w is the weight matrix between the visible and hidden layers, b and c are respectively the biases of the visible-layer and hidden-layer neurons, θ = {w, b, c} denotes the parameters of the model, and Z sums over all possible pairs of visible- and hidden-layer states.
Using Bayes' rule, the marginal probability distributions of the visible-layer neurons v and the hidden-layer neurons h follow from formula (3):

P(v; θ) = (1/Z) Σ_h e^(-E(v, h; θ))   (4)

P(h; θ) = (1/Z) Σ_v e^(-E(v, h; θ))   (5)
and deducing the conditional probability distribution of the visual layer neuron v and the hidden layer neuron h by using a Bayesian formula:
and (3) obtaining an approximate reconstruction P (v; theta) of the training sample through one-step Gibbs sampling by using a contrast divergence algorithm according to the formula (6) and the formula (7), and then updating a network parameter theta to be { w, b, c } according to a reconstruction error.
(2) During training, the network structure of the RBM is optimized by the dynamic branch-growing and pruning algorithm according to the current training state.
The change in Weight w is monitored using the WD (Weight Distance) method:
WD_j[m] = Met(w_j[m], w_j[m-1])   (8)
where w_j[m] is the weight vector of hidden-layer neuron j after m iterations and Met denotes a metric function, such as the Euclidean distance. The WD value reflects the change of hidden-layer neuron j's weight vector between two iterations.
The branch-growing condition is considered from both the local and the global aspect.
The local condition is defined as:

max_WD_j[m] = max_n( WD_j^n[m] )   (9)

where WD_j^n[m] is the WD value of the jth hidden neuron for the nth input sample in the mth iteration, j = 1, 2, ..., J, J is the number of hidden-layer neurons, and max(·) is the maximum function.
The global condition is defined as:

iratio_j[m] = N' / N   (10)

where N is the number of samples in the training data set and N' is the number of samples for which the jth neuron's WD value increased compared with the last iteration, i.e. WD_j^n[m] > WD_j^n[m-1].
The local and global conditions consider hidden-layer neuron j with respect to a single input sample and to all input samples, respectively. Multiplying the two gives the branch-growing condition:
max_WD_j[m] * iratio_j[m] > y(m)   (11)
where y(m) is a curve used as a variable threshold (formula (12)): a monotonically decreasing function of the iteration number, with m the current iteration number, numepochs the maximum number of iterations, u the curvature of the curve, and y_max and y_min the maximum and minimum of the curve. When the jth neuron satisfies formula (11), the neuron is split into two neurons, and every parameter of the new neuron is initialized to 0.
When the RBM training is complete, pruning begins: the standard deviation of the hidden-layer neurons' activation probabilities over all samples is used as the pruning criterion, where the standard deviation is

σ(j) = sqrt( (1/N) · Σ_{n=1..N} ( P(n, j) - μ_j )² )   (13)

with n = 1, 2, ..., N, N the number of samples in the training data set, j the index of the hidden-layer neuron, P(n, j) the activation probability of the jth neuron on the nth input sample, and μ_j the average activation probability of the jth neuron over all input samples.
The pruning condition is:

σ(j) < θ_A   (14)

where θ_A is a threshold. When the jth neuron satisfies formula (14), the neuron and all of its parameters are removed. At the same time, a trade-off curve between the pruning ratio and the prediction accuracy is drawn, and θ_A is chosen from this curve so that more redundant neurons are removed while the original accuracy is preserved.
After pruning, the RBM is retrained so that the remaining neurons can compensate for the removed ones; one round of pruning plus retraining counts as one iteration. At each iteration the threshold θ_A is updated:

θ_A ← θ_A + δ[iter]   (15)

δ[iter] raises the threshold at each pruning iteration so that more neurons can be removed. Each iteration is a greedy search: using the trade-off curve of each pruning round, the best pruning ratio can be found without losing accuracy, and δ[iter] is set so that θ_A matches the pruning ratio required by that iteration.
Step 2.3: after the current RBM's structure is determined, the energy function is used as the condition for adding a new RBM:

mean(E_1, E_2, ..., E_L) > θ_L   (16)

where E_l is the total energy of the lth-layer RBM, computed from formula (2), l = 1, 2, ..., L, L is the current number of DDBN layers, mean(·) is the averaging function, and θ_L is a threshold. When condition (16) is satisfied, a new RBM layer is added with its parameters initialized in the same way as the first layer's; the output of the current RBM then becomes the input of the newly added RBM.
Step 2.4: train the network cyclically according to steps 2.2 and 2.3 to obtain the network structure of the DDBN.
Step 3: take the DDBN network structure and parameter values obtained in step 2 as the initial values of the fine-tuning stage, and fine-tune the whole DDBN with a top-down back-propagation algorithm to obtain the final DDBN network model. The specific operations are:
Step 3.1: with the DDBN structure and parameters θ trained in step 2 as initial values, add an output layer after the last RBM to predict the combustion behavior (temperature, pressure, and gas flow) suited to the training samples, and input the training data set to begin fine-tuning the whole DDBN network.
Step 3.2: compute the activation probability of each hidden-layer neuron using the forward propagation algorithm.
Step 3.3: compute the prediction produced by forward propagation of the training samples and compare it with the actual results to obtain the loss function:

J(t) = (1/N) · Σ_{n=1..N} ( y_n - y'_n )²   (17)

where t is the current fine-tuning iteration, N is the number of samples in the training data set, and y_n and y'_n are respectively the actual and predicted results of the nth training sample. The error between the actual and predicted results is back-propagated, and the weight w and bias c are updated by gradient descent according to formulas (18) and (19):

w ← w - α · ∂J(t)/∂w   (18)

c ← c - α · ∂J(t)/∂c   (19)

where α is the learning rate.
The gradient descent method is applied iteratively, fine-tuning the whole DDBN from top to bottom to reduce J(t), until the maximum number of fine-tuning iterations is reached and the final DDBN network model is obtained.
Step 4: input the test data set into the final DDBN network model obtained in step 3 and output the prediction. The specific operations are:
Step 4.1: input the preprocessed test data set into the DDBN model fine-tuned in step 3 and extract the main features of the solid waste through the RBMs.
Step 4.2: input the main features of the test samples into the final output layer and predict the combustion behavior suited to the test samples, including temperature, pressure, and gas flow.
The beneficial effects of the invention are as follows. To give the model both feature-extraction and prediction ability, a DDBN using the dynamic branch-growing and pruning algorithm is proposed, so that the network structure can change with the current training state: hidden-layer neurons and hidden layers are added and redundant neurons are removed, effectively optimizing the DDBN's structure, replacing manual experiments, and overcoming the difficulty of network structure design. The DDBN's ability to extract the main features of the raw data is then exploited to describe the random, discrete, and nonlinear feature vectors of the solid waste, making the state features of the time series easier to identify while ensuring that the main information of the raw data is not lost. Finally, based on the extracted state description of the solid waste, the DDBN predicts the optimal combustion behavior for that state, including temperature, pressure, and gas flow, greatly reducing the waste of resources caused by blind combustion and realizing intelligent treatment of solid waste.
Drawings
FIG. 1 is a schematic representation of the operation of adding hidden layer neurons in accordance with the present invention;
FIG. 2 is a schematic diagram of the operation of removing redundant neurons in the present invention;
FIG. 3 is a schematic diagram illustrating the operation of adding a hidden layer according to the present invention;
FIG. 4 is a flow chart of a training process of a DDBN model in the invention;
FIG. 5 is a diagram showing the effect of the intelligent treatment of the solid waste according to the present invention;
Detailed Description
The following further describes a specific embodiment of the present invention with reference to the drawings and technical solutions.
As shown in fig. 4, an intelligent solid waste treatment method based on a dynamic deep belief network includes the following specific steps:
Step 1: measure solid waste data such as GDP, hazardous waste, solid waste volume, smelting residue, furnace ash, furnace slag, and tailings to obtain a solid waste data set, and preprocess the data to obtain training and test data sets.
Because solid waste data are often not on the same order of magnitude, the data set is normalized to [0, 1], which helps speed up network training. The normalization formula is:

x = (x̂ - x_min) / (x_max - x_min)   (1)

where x̂ is a feature value of the solid waste data set, x_max and x_min are respectively the maximum and minimum over all features of the data set, and x is the normalized solid waste data.
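As an illustration, a minimal preprocessing sketch in Python/NumPy is given below. The function name, the synthetic data, and the train/test split are our assumptions, not the patent's; the sketch follows the patent's formula, which uses the global maximum and minimum over all features.

```python
import numpy as np

def normalize(data):
    """Min-max normalize a solid-waste feature matrix to [0, 1],
    using the global max/min over all features as in formula (1)."""
    x_min, x_max = data.min(), data.max()
    return (data - x_min) / (x_max - x_min)

# Synthetic stand-in for a measured solid waste data set:
# 1000 samples x 7 features, split 800/200 into train/test.
raw = np.random.RandomState(0).rand(1000, 7) * 500.0
x = normalize(raw)
x_train, x_test = x[:800], x[800:]
assert 0.0 <= x.min() and x.max() <= 1.0
```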
Step 2: input the preprocessed training data set into the DDBN model for pre-training, and use the contrastive divergence (CD) algorithm to train each layer's restricted Boltzmann machine (RBM) individually, unsupervised, from bottom to top. During training, the current RBM's network structure is optimized by the dynamic branch-growing and pruning algorithm, which adds new hidden-layer neurons and removes redundant ones. After the current RBM finishes training, if the DDBN satisfies the layer-generation condition, a new RBM layer is added, the output of the current RBM becomes the input of the new RBM, and the new RBM's parameters are initialized in the same way as the first layer's. Through multiple iterations the network structure, weights, and biases of every RBM are obtained, finally yielding the high-level features of the data. The specific operations are:
Step 2.1: construct the DDBN network model and set the DDBN's parameter values: the number of visible-layer neurons equals the feature dimension of the training data set; the initial numbers of hidden-layer neurons and hidden layers are set to 10 and 1 respectively; the learning rate is set to 0.1; and the numbers of pre-training and fine-tuning iterations are both 100.
Step 2.2: take the preprocessed training data set as the input of the first-layer RBM, pre-train the RBM with the CD algorithm, and optimize the current RBM's network structure with the dynamic branch-growing and pruning algorithm during training, adding new hidden-layer neurons and removing redundant ones.
(1) The RBM is a stochastic network based on an energy model: every configuration of the visible-layer neurons v and hidden-layer neurons h has a corresponding energy value. From the definition of the energy function and the principles of statistical thermodynamics, the joint probability distribution of v and h can be obtained. The energy function E(v, h; θ) and the joint probability distribution P(v, h; θ) are:

E(v, h; θ) = -Σ_i b_i·v_i - Σ_j c_j·h_j - Σ_i Σ_j v_i·w_ij·h_j   (2)

P(v, h; θ) = e^(-E(v, h; θ)) / Z,   Z = Σ_{v,h} e^(-E(v, h; θ))   (3)

where v_i (1 ≤ i ≤ I) and h_j (1 ≤ j ≤ J) respectively denote the visible-layer and hidden-layer neurons, w is the weight matrix between the visible and hidden layers, b and c are respectively the biases of the visible-layer and hidden-layer neurons, θ = {w, b, c} denotes the parameters of the model, and Z sums over all possible pairs of visible- and hidden-layer states.
Using Bayes' rule, the marginal probability distributions of the visible-layer neurons v and the hidden-layer neurons h can be obtained from formula (3):

P(v; θ) = (1/Z) Σ_h e^(-E(v, h; θ))   (4)

P(h; θ) = (1/Z) Σ_v e^(-E(v, h; θ))   (5)
the goal of the RBM network training is to solve for θ ═ { w, b, c }, so that under this parameter the RBM can fit the input samples very large, making P (v; θ) maximum, i.e., solve for the maximum likelihood estimates of the input samples. However, to obtain the maximum likelihood estimate, all possible cases need to be computed, the computation is exponential growing, so the RBM is estimated using the contrast divergence algorithm.
The conditional probability distributions of the visible- and hidden-layer neurons are then derived with Bayes' formula:

P(h_j = 1 | v; θ) = σ(c_j + Σ_i v_i·w_ij)   (6)

P(v_i = 1 | h; θ) = σ(b_i + Σ_j w_ij·h_j)   (7)

where σ(·) is the sigmoid function.
according to the formula (6) and the formula (7), approximate reconstruction P (v; theta) of the training sample is obtained through one-step Gibbs sampling by using a contrast divergence algorithm, and then the network parameter theta is updated according to the reconstruction error.
(2) During training, the network structure of the RBM is optimized by the dynamic branch-growing and pruning algorithm according to the current training state.
In DBNs, the weights w play a decisive role in network training. The invention therefore proposes a weight distance (WD) method to monitor the change of the weights w:

WD_j[m] = Met(w_j[m], w_j[m-1])   (8)

where w_j[m] is the weight vector of hidden-layer neuron j after m iterations and Met is a metric function, such as the Euclidean distance. The WD value reflects the change of hidden-layer neuron j's weight vector between two iterations. In general, the weight vector of neuron j converges after a period of training, i.e., its WD becomes smaller and smaller. If some weight vectors still fluctuate with large amplitude after many iterations, this should be taken as a sign that there are too few hidden-layer neurons to map the input samples; in that case the number of neurons must be increased to improve the network's performance. The invention considers the branch-growing condition from both the local and the global aspect. The local condition is defined as:
max_WD_j[m] = max_n( WD_j^n[m] )   (9)

where WD_j^n[m] is the WD value of the jth hidden neuron for the nth input sample in the mth iteration, j = 1, 2, ..., J, J is the number of hidden-layer neurons, and max(·) is the maximum function. The global condition is defined as:
iratio_j[m] = N' / N   (10)

where N is the number of training samples and N' is the number of samples for which the jth neuron's WD value increased compared with the last iteration, i.e. WD_j^n[m] > WD_j^n[m-1].
The local and global conditions consider hidden-layer neuron j with respect to a single input sample and to all input samples, respectively. Find the maximum WD value max_WD_j[m] of neuron j over the samples and the ratio iratio_j[m] of samples whose WD value increased, then multiply the two to obtain the branch-growing condition:
max_WD_j[m] * iratio_j[m] > y(m)   (11)
where y (m) is a curve, which is used as a variable threshold, defined as:
where m is the current iteration number, numepochs is the maximum iteration number, u represents the curvature of the curve, ymaxAnd yminRespectively the maximum and minimum of the curve. In the training process, if the network develops to a good direction, the values of max _ WD and iratio become smaller and smaller, so a curve y (m) is used as the variable threshold of the branching-up condition, and when h is in the process>At 0, y (m) is a monotonically decreasing curve. If the jth neuron satisfies equation (11), the neuron will be divided into two neurons, and each parameter of the new neuron is 0.
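A sketch of the growth test of formulas (9)-(11) follows. The exponential form of threshold_curve is an assumption (the patent states only that y(m) decreases monotonically from y_max to y_min with curvature u), and all names and shapes are ours.

```python
import numpy as np

def threshold_curve(m, numepochs, y_max=1.0, y_min=0.01, u=5.0):
    """Variable threshold y(m): assumed exponential decay from y_max
    to y_min; the patent specifies only a monotonically decreasing
    curve with curvature u."""
    return y_min + (y_max - y_min) * np.exp(-u * m / numepochs)

def neurons_to_split(wd_cur, wd_prev, m, numepochs):
    """Indices of hidden neurons meeting the growth condition (11).

    wd_cur, wd_prev: (J, N) per-neuron, per-sample WD values at
    iterations m and m-1 (the shapes are our assumption)."""
    max_wd = wd_cur.max(axis=1)                # local condition (9)
    iratio = (wd_cur > wd_prev).mean(axis=1)   # global condition (10)
    return np.where(max_wd * iratio > threshold_curve(m, numepochs))[0]

def split_neuron(w, c, j):
    """Split hidden neuron j: append a new neuron whose parameters
    are all 0, as the patent specifies."""
    w = np.hstack([w, np.zeros((w.shape[0], 1))])
    c = np.append(c, 0.0)
    return w, c
```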
When RBM training is complete, pruning begins. The purpose of the RBM is to extract the main features of the input samples, i.e., the activation probabilities of the hidden-layer neurons. These features should be discriminative, which facilitates further use of the data. If a neuron's activation probability is close to its average over all samples, the feature it extracts is not discriminative, i.e., the neuron is redundant. To obtain a more compact network structure, such redundant neurons must be removed. The invention uses the standard deviation to measure how dispersed a hidden-layer neuron's activation probabilities are over all samples:

σ(j) = sqrt( (1/N) · Σ_{n=1..N} ( P(n, j) - μ_j )² )   (13)

where n = 1, 2, ..., N, N is the number of input samples, j indexes the hidden-layer neurons, P(n, j) is the activation probability of the jth neuron on the nth input sample, and μ_j is the average activation probability of the jth neuron over all input samples. A small standard deviation means the values stay close to the mean, so the feature extracted by that neuron is not discriminative and the redundant neuron must be removed. The pruning condition is:
σ(j) < θ_A   (14)
where θ_A is a threshold. If the jth neuron satisfies formula (14), the neuron and all of its parameters are removed. At the same time, the invention draws a trade-off curve between the pruning ratio and the prediction accuracy, and θ_A is chosen from this curve so that more redundant neurons are removed while the original accuracy is preserved. Furthermore, after pruning, the current RBM is retrained so that the remaining neurons can compensate for the removed ones. This step is crucial: one round of pruning plus retraining counts as one iteration. Iterative pruning removes a few neurons at a time and retrains repeatedly to compensate; over several iterations, a higher pruning ratio can be reached without losing accuracy. The threshold θ_A is updated at each iteration:
θ_A ← θ_A + δ[iter]   (15)
δ[iter] raises the threshold at each pruning iteration so that more neurons can be removed. Each iteration is a greedy search: using the trade-off curve of each pruning round, the best pruning ratio can be found without losing accuracy, and δ[iter] is set so that θ_A matches the pruning ratio required by that iteration.
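A pruning sketch under the same assumptions follows (the names are ours; the retraining loop and the exact δ[iter] schedule are omitted for brevity):

```python
import numpy as np

def prune_redundant(w, c, p_act, theta_a):
    """Remove hidden neurons whose activation-probability standard
    deviation over all samples falls below theta_a (formula (14)).

    p_act: (N, J) matrix of activation probabilities P(n, j).
    Returns the pruned w, c and the indices of surviving neurons.
    """
    sigma = p_act.std(axis=0)            # formula (13): per-neuron std
    keep = np.where(sigma >= theta_a)[0]
    return w[:, keep], c[keep], keep

# Iterative pruning: after each round, retrain the RBM so the remaining
# neurons compensate, then raise the threshold per formula (15):
#   theta_a += delta[it]
```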
Step 2.3: after the current RBM's structure is determined, growth of the hidden layers is considered. From formula (4) it can be seen that P(v; θ) decreases as E(v, h; θ) increases, so to maximize P(v; θ) the energy function E(v, h; θ) should be as small as possible. If the total energy of the DBN is greater than a threshold, the DBN lacks the capacity to represent the data, and a new RBM layer can be added. The invention therefore uses the energy function as the condition for adding a new RBM:

mean(E_1, E_2, ..., E_L) > θ_L   (16)

where E_l is the total energy of the lth-layer RBM, computed from formula (2), l = 1, 2, ..., L, L is the current number of DDBN layers, mean(·) is the averaging function, and θ_L is a threshold. If condition (16) is satisfied, a new RBM layer is added with its parameters initialized in the same way as the first layer's, and the output of the current RBM is taken as the input of the newly added RBM.
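A sketch of the layer-growth check follows; it reflects our reading of condition (16), and the batch-mean energy and all names are assumptions:

```python
import numpy as np

def rbm_energy(v, h, w, b, c):
    """Mean energy E(v, h; theta) of formula (2) over a batch:
    v: (N, I), h: (N, J), w: (I, J), b: (I,), c: (J,)."""
    interaction = np.einsum('ni,ij,nj->n', v, w, h)
    return np.mean(-(v @ b) - (h @ c) - interaction)

def should_add_layer(layer_energies, theta_l):
    """Condition (16): add a new RBM when the mean of the per-layer
    total energies exceeds the threshold theta_l."""
    return np.mean(layer_energies) > theta_l
```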
Step 2.4: train the network cyclically according to steps 2.2 and 2.3 to learn a deep DDBN network structure.
Step 3: further optimize the DDBN by fine-tuning. The network structure and parameter values obtained in the pre-training stage serve as the initial values of the fine-tuning stage, and the whole DDBN is fine-tuned with a back-propagation algorithm: the training error is propagated backward from top to bottom, optimizing the network into the final DDBN model. The specific operations are:
Step 3.1: with the network structure and parameters θ from pre-training as initial values, add an output layer after the last RBM. The output layer has 3 neurons whose outputs represent temperature, pressure, and gas flow respectively, used to predict the combustion behavior suited to the training samples. The training data set is then fed into the network for fine-tuning.
Step 3.2: compute the activation probability of each hidden-layer neuron using the forward propagation algorithm.
Step 3.3: compute the prediction produced by forward propagation of the training samples and compare it with the actual results to obtain the loss function:

J(t) = (1/N) · Σ_{n=1..N} ( y_n - y'_n )²   (17)

where t is the current fine-tuning iteration, N is the number of training samples, and y_n and y'_n are respectively the actual and predicted results of the nth training sample. The error between the actual and predicted results is back-propagated, and the weight w and bias c are updated by gradient descent according to formulas (18) and (19):

w ← w - α · ∂J(t)/∂w   (18)

c ← c - α · ∂J(t)/∂c   (19)

where α is the learning rate. The gradient descent method is applied iteratively, fine-tuning the whole DDBN from top to bottom to reduce J(t), until the maximum number of fine-tuning iterations is reached and the final DDBN network model is obtained.
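An illustrative fine-tuning step is sketched below. For brevity it back-propagates only into the linear output layer, whereas the patent fine-tunes the whole network top-down; the squared-error loss matches the reconstruction of formula (17) above, and every name is an assumption.

```python
import numpy as np

def finetune_step(x, y, rbm_layers, out_w, out_b, lr=0.1):
    """One fine-tuning step on the output layer.

    x: (N, I) inputs; y: (N, 3) targets (temperature, pressure, gas flow);
    rbm_layers: list of (w, c) pairs from pre-training;
    out_w: (J_L, 3), out_b: (3,) output-layer parameters, updated in place.
    """
    a = x
    for w, c in rbm_layers:                    # forward pass, step 3.2
        a = 1.0 / (1.0 + np.exp(-(a @ w + c)))
    y_pred = a @ out_w + out_b                 # linear output layer
    err = y_pred - y
    loss = np.mean(np.sum(err ** 2, axis=1))   # squared-error loss J(t)
    n = x.shape[0]
    out_w -= lr * a.T @ err / n                # gradient-descent update (18)
    out_b -= lr * err.mean(axis=0)             # gradient-descent update (19)
    return loss
```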
Step 4: input the preprocessed test data set into the DDBN model obtained in the fine-tuning stage, extract the main features of the test samples through the RBMs, and feed them into the final output layer, whose three outputs give the temperature, pressure, and gas flow, i.e., the combustion behavior predicted to suit the test samples.
The collected solid waste data set is used to test the proposed method. The data set contains 1000 samples: 800 training samples and 200 test samples. Each sample has 7 features, so the number of visible-layer neurons is set to 7; each sample has 3 outputs, namely temperature, pressure, and gas flow, corresponding to its combustion behavior.
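For orientation, the cd1_update sketch above can be wired into a pre-training loop with the embodiment's hyperparameters (10 initial hidden neurons, learning rate 0.1, 100 epochs); the data here is synthetic, and the growth, pruning, and layer-addition calls are elided.

```python
import numpy as np

rng = np.random.RandomState(0)
x_train = rng.rand(800, 7)          # stand-in for preprocessed features

w = 0.01 * rng.randn(7, 10)         # 7 visible -> 10 initial hidden neurons
b, c = np.zeros(7), np.zeros(10)
for epoch in range(100):
    err = cd1_update(x_train, w, b, c, lr=0.1)  # CD-1 sketch defined above
    # ...growth (11), pruning (14), and layer check (16) would go here...
print("final reconstruction error:", err)
```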
The test results show that, compared with the traditional manual control method, the intelligent solid waste treatment method based on the dynamic deep belief network saves 30% of the treatment time, while the treatment effect still reaches the nationally regulated solid waste treatment indices. The proposed method can therefore treat solid waste effectively, saving time and cost and realizing efficient intelligent treatment.
Claims (5)
1. A solid waste intelligent treatment method based on a dynamic deep belief network is characterized by comprising the following steps:
step 1, measuring solid waste to obtain a solid waste data set, preprocessing the solid waste data set, and dividing to obtain a training data set and a testing data set;
step 2, inputting the training data set obtained after the preprocessing in step 1 into a DDBN model, using the contrastive divergence algorithm to train each layer's restricted Boltzmann machine (RBM) individually, unsupervised, from bottom to top, optimizing the network structure of the current RBM through the dynamic branch-growing and pruning algorithm during training, obtaining the network structure and parameter values of each RBM through iterative training, and finally obtaining the high-level features of the training data set; the parameter values are the weights and biases;
step 3, taking the DDBN network structure and the parameter values obtained in the step 2 as initial values of a fine tuning stage, and fine tuning the whole DDBN network by using a top-down back propagation algorithm to obtain a final DDBN network model; the specific process is as follows:
step 3.1, taking the DDBN network structure and the parameter value theta trained in the step 2 as initial values of a fine tuning stage, adding an output layer after the last layer of RBM for predicting combustion behaviors including temperature, pressure and gas flow suitable for the training data set sample, and inputting the training data set to start fine tuning the whole DDBN network;
step 3.2, calculating the activation probability of each hidden layer neuron by using a forward propagation algorithm;
step 3.3, calculating the prediction obtained by forward propagation of the training samples and comparing it with the actual results to obtain the loss function:

J(t) = (1/N) · Σ_{n=1..N} ( y_n - y'_n )²   (1)

where t is the current fine-tuning iteration, N is the number of samples in the training data set, and y_n and y'_n are respectively the actual and predicted results of the nth training sample; the error between the actual and predicted results is propagated backward, and the weight w and bias c are updated with the gradient descent method according to formulas (2) and (3):

w ← w - α · ∂J(t)/∂w   (2)

c ← c - α · ∂J(t)/∂c   (3)

where α is the learning rate;
iteratively using a gradient descent method to finely adjust the whole DDBN network from top to bottom to reduce the value of J (t) until the maximum fine adjustment times are reached to obtain a final DDBN network model;
and 4, inputting the test data set into the final DDBN network model obtained in the step 3, and finally outputting a prediction result.
2. The method for intelligently processing the solid waste based on the dynamic deep belief network as claimed in claim 1, wherein the preprocessing in step 1 is: normalizing the solid waste data set to [0, 1] with the normalization formula

x = (x̂ - x_min) / (x_max - x_min)   (4)

where x̂ is a feature value of the solid waste data set, x_max and x_min are respectively the maximum and minimum over all features of the data set, and x is the normalized solid waste data.
3. The method for intelligently processing the solid waste based on the dynamic deep belief network as claimed in claim 1 or 2, wherein the specific process of the step 2 is as follows:
step 2.1, constructing the DDBN network model and setting the DDBN's parameter values: the number of visible-layer neurons, the initial numbers of hidden-layer neurons and hidden layers, the learning rate, the number of iterations, and the number of fine-tuning iterations; wherein the number of visible-layer neurons is the feature dimension of the training data set;
step 2.2, inputting the preprocessed training data set into the first-layer RBM, pre-training the RBM with the CD algorithm, and optimizing the network structure of the current RBM with the dynamic branch-growing and pruning algorithm during training;
(1) the energy function E(v, h; θ) of the RBM and the joint probability distribution P(v, h; θ) of the visible- and hidden-layer neurons are:

E(v, h; θ) = -Σ_i b_i·v_i - Σ_j c_j·h_j - Σ_i Σ_j v_i·w_ij·h_j   (5)

P(v, h; θ) = e^(-E(v, h; θ)) / Z,   Z = Σ_{v,h} e^(-E(v, h; θ))   (6)

where v_i (1 ≤ i ≤ I) and h_j (1 ≤ j ≤ J) respectively denote the visible-layer and hidden-layer neurons, w is the weight matrix between the visible and hidden layers, b and c are respectively the biases of the visible-layer and hidden-layer neurons, θ = {w, b, c} denotes the parameters of the model, and Z sums over all possible pairs of visible- and hidden-layer states;
the marginal probability distributions of the visible-layer neurons v and the hidden-layer neurons h are obtained from formula (6) using Bayes' rule:

P(v; θ) = (1/Z) Σ_h e^(-E(v, h; θ))   (7)

P(h; θ) = (1/Z) Σ_v e^(-E(v, h; θ))   (8)
and the conditional probability distributions of the visible- and hidden-layer neurons are derived with Bayes' formula:

P(h_j = 1 | v; θ) = σ(c_j + Σ_i v_i·w_ij)   (9)

P(v_i = 1 | h; θ) = σ(b_i + Σ_j w_ij·h_j)   (10)

where σ(·) is the sigmoid function;
using the contrastive divergence algorithm with formulas (9) and (10), an approximate reconstruction of the training sample distribution P(v; θ) is obtained through one-step Gibbs sampling, and the network parameters θ = {w, b, c} are then updated according to the reconstruction error;
(2) in the training process, optimizing the network structure of the RBM through the dynamic branch-growing and pruning algorithm according to the current training state;
the change in weight w is monitored using the weight distance WD method:
WD_j[m] = Met(w_j[m], w_j[m-1])   (11)
where w_j[m] is the weight vector of hidden-layer neuron j after m iterations and Met denotes a metric function, such as the Euclidean distance; the WD value reflects the change of hidden-layer neuron j's weight vector between two iterations;
the branch-growing condition is considered from both the local and the global aspect;
the local condition is defined as:

max_WD_j[m] = max_n( WD_j^n[m] )   (12)

where WD_j^n[m] is the WD value of the jth hidden neuron for the nth input sample in the mth iteration, j = 1, 2, ..., J, J is the number of hidden-layer neurons, and max(·) is the maximum function;
the global condition is defined as:

iratio_j[m] = N' / N   (13)

where N is the number of samples in the training data set and N' is the number of samples for which the jth neuron's WD value increased compared with the last iteration, i.e. WD_j^n[m] > WD_j^n[m-1];
the local and global conditions consider hidden-layer neuron j with respect to a single input sample and to all input samples, respectively; multiplying the two gives the branch-growing condition:

max_WD_j[m] * iratio_j[m] > y(m)   (14)
where y(m) is a curve used as a variable threshold (formula (15)): a monotonically decreasing function of the iteration number, with m the current iteration number, numepochs the maximum number of iterations, u the curvature of the curve, and y_max and y_min the maximum and minimum of the curve; when the jth neuron satisfies formula (14), the neuron is split into two neurons, and every parameter of the new neuron is 0;
when the RBM training is complete, pruning begins: the standard deviation of the hidden-layer neurons' activation probabilities over all samples is used as the pruning criterion, where the standard deviation is

σ(j) = sqrt( (1/N) · Σ_{n=1..N} ( P(n, j) - μ_j )² )   (16)

with n = 1, 2, ..., N, N the number of samples in the input training data set, j the index of the hidden-layer neuron, P(n, j) the activation probability of the jth neuron on the nth input sample, and μ_j the average activation probability of the jth neuron over all input samples;
the pruning condition is:
σ(j) < θ_A   (17)
where θ_A is a threshold; when the jth neuron satisfies formula (17), the neuron and all of its parameters are removed; at the same time, a trade-off curve between the pruning ratio and the prediction accuracy is drawn, and θ_A is chosen from this curve so that more redundant neurons are removed while the original accuracy is preserved;
after pruning, the RBM is retrained so that the remaining neurons can compensate for the removed ones; one round of pruning plus retraining counts as one iteration, and at each iteration the threshold θ_A is updated:
θ_A ← θ_A + δ[iter]   (18)
δ[iter] raises the threshold at each pruning iteration so that more neurons can be removed; each iteration is a greedy search: using the trade-off curve of each pruning round, the best pruning ratio can be found without losing accuracy, and δ[iter] is set so that θ_A matches the pruning ratio required by that iteration;
step 2.3, after the current RBM's structure is determined, using the energy function as the condition for adding a new RBM:

mean(E_1, E_2, ..., E_L) > θ_L   (19)

where E_l is the total energy of the lth-layer RBM, computed from formula (5), l = 1, 2, ..., L, L is the current number of DDBN layers, mean(·) is the averaging function, and θ_L is a threshold; when condition (19) is satisfied, a new RBM layer is added with its parameters initialized in the same way as the first layer's; the output of the current RBM is then taken as the input of the newly added RBM;
and 2.4, training the network circularly according to the steps 2.2 and 2.3 to obtain the network structure of the DDBN.
4. The method for intelligently processing the solid waste based on the dynamic deep belief network as claimed in claim 1 or 2, wherein the specific process of the step 4 is as follows:
step 4.1, inputting the preprocessed test data set into the DDBN network model finely adjusted in the step 3, and extracting the main characteristics of the solid waste through RBM;
and 4.2, inputting the main features of the test samples into the final output layer, and predicting the combustion behavior suited to the test samples, including temperature, pressure, and gas flow.
5. The method for intelligently processing the solid waste based on the dynamic deep belief network as claimed in claim 3, wherein the specific process of the step 4 is as follows:
step 4.1, inputting the preprocessed test data set into the DDBN network model finely adjusted in the step 3, and extracting the main characteristics of the solid waste through RBM;
and 4.2, inputting the main features of the test samples into the final output layer, and predicting the combustion behavior suited to the test samples, including temperature, pressure, and gas flow.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810768405.8A CN109146007B (en) | 2018-07-13 | 2018-07-13 | Solid waste intelligent treatment method based on dynamic deep belief network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810768405.8A CN109146007B (en) | 2018-07-13 | 2018-07-13 | Solid waste intelligent treatment method based on dynamic deep belief network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109146007A CN109146007A (en) | 2019-01-04 |
CN109146007B true CN109146007B (en) | 2021-08-27 |
Family
ID=64800535
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810768405.8A Active CN109146007B (en) | 2018-07-13 | 2018-07-13 | Solid waste intelligent treatment method based on dynamic deep belief network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109146007B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110108672B (en) * | 2019-04-12 | 2021-07-06 | 南京信息工程大学 | Aerosol extinction coefficient inversion method based on deep belief network |
CN111366123B (en) * | 2020-03-06 | 2021-03-26 | 大连理工大学 | Part surface roughness and cutter wear prediction method based on multi-task learning |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107269335A (en) * | 2017-07-28 | 2017-10-20 | 浙江大学 | The rubbish and gas combustion-gas vapor combined cycle system of a kind of use combustion gas garbage drying |
CN107729988A (en) * | 2017-09-30 | 2018-02-23 | 北京工商大学 | Blue-green alga bloom Forecasting Methodology based on dynamic depth confidence network |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107247260B (en) * | 2017-07-06 | 2019-12-03 | 合肥工业大学 | A kind of RFID localization method based on adaptive depth confidence network |
CN108197427B (en) * | 2018-01-02 | 2020-09-04 | 山东师范大学 | Protein subcellular localization method and device based on deep convolutional neural network |
- 2018-07-13: Application CN201810768405.8A filed; granted as CN109146007B (status: Active)
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107269335A (en) * | 2017-07-28 | 2017-10-20 | 浙江大学 | The rubbish and gas combustion-gas vapor combined cycle system of a kind of use combustion gas garbage drying |
CN107729988A (en) * | 2017-09-30 | 2018-02-23 | 北京工商大学 | Blue-green alga bloom Forecasting Methodology based on dynamic depth confidence network |
Non-Patent Citations (2)
Title |
---|
An adaptive learning method of Deep Belief Network by layer generation; Shin Kamada et al.; IEEE; 2017-12-31; pp. 2967-2970 * |
A method for dynamically constructing a deep belief network model (一种动态构建深度信念网络模型方法); Wu Qiang et al.; Journal of China University of Metrology (中国计量大学学报); 2018-03-15; pp. 64-69 * |
Also Published As
Publication number | Publication date |
---|---|
CN109146007A (en) | 2019-01-04 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |