CN109146007B - Solid waste intelligent treatment method based on dynamic deep belief network - Google Patents

Solid waste intelligent treatment method based on dynamic deep belief network

Info

Publication number
CN109146007B
CN109146007B (application CN201810768405.8A)
Authority
CN
China
Prior art keywords
ddbn
rbm
solid waste
training
neuron
Prior art date
Legal status
Active
Application number
CN201810768405.8A
Other languages
Chinese (zh)
Other versions
CN109146007A (en)
Inventor
宋威
张士昱
王晨妮
Current Assignee
Jiangnan University
Original Assignee
Jiangnan University
Priority date
Filing date
Publication date
Application filed by Jiangnan University
Priority to CN201810768405.8A
Publication of CN109146007A
Application granted
Publication of CN109146007B
Status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/192 Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
    • G06V30/194 References adjustable by an adaptive method, e.g. learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/061 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using biological neurons, e.g. biological neurons connected to an integrated circuit
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods


Abstract

The invention provides an intelligent solid waste treatment method based on a dynamic deep belief network (DDBN), belonging to the fields of deep learning and intelligent solid waste treatment. The method first introduces a DDBN trained with a dynamic grow-and-prune (branch increase and decrease) algorithm, so that during training the network can add hidden layer neurons and hidden layers according to the current training state and remove redundant neurons, effectively optimizing the DDBN's network structure. Exploiting the DDBN's ability to extract the main features of the raw data, the method then uses the DDBN to describe the random, discrete, and nonlinear feature vectors of the solid waste, so that the state features of the time series become easier to identify while the main information of the raw data is preserved. Finally, from the extracted state description of the solid waste, the DDBN predicts the optimal combustion behavior for that state, reducing the resource waste caused by blind combustion and realizing intelligent treatment of solid waste.

Description

Solid waste intelligent treatment method based on dynamic deep belief network
Technical Field
The invention belongs to the fields of deep learning and intelligent solid waste treatment. It provides a Dynamic Deep Belief Network (DDBN) model trained with a dynamic grow-and-prune algorithm, which can effectively optimize the network structure of a deep belief network and thereby solve the problem of intelligently treating the large volume of solid waste produced in light industry.
Background
The development of light industry currently faces great environmental pressure and demanding pollution-reduction treatment tasks. With the growth of the national economy, market demand for fermentation and paper-making products has increased sharply; although the pollution emission intensity per unit product has fallen markedly in recent years, the total emission of industrial solid waste is still rising because of rapidly expanding production capacity. To meet the industry's energy-saving and emission-reduction targets, new waste treatment methods must be studied; their application can raise the pollution control and treatment level of production enterprises and support the industry's emission-reduction efforts.
Deep learning has developed rapidly in recent years. In 2006, Hinton et al. proposed the Deep Belief Network (DBN) together with an unsupervised greedy layer-by-layer training algorithm, alleviating the tendency of deep neural networks to fall into local optima and triggering a new wave of deep learning research in academia. A DBN obtains an abstract representation of the raw data through multiple levels of feature transformation, improving the accuracy of tasks such as classification and prediction; with its ability to learn features automatically and to reduce data dimensionality, it has become one of the most widely used network structures in deep learning applications.
Exploiting the DBN's ability to extract the main features of the raw data, a DBN can effectively describe the state of the random, discrete, and nonlinear feature vectors of solid waste, making the state features of the time series easier to identify without losing the main information of the raw data. From the extracted state description of the solid waste, a DBN can then predict the optimal combustion behavior for that state, greatly reducing the resource waste caused by blind combustion and realizing intelligent treatment of solid waste.
However, to solve a problem of high complexity such as this one, a DBN must add hidden layer neurons and hidden layers appropriately. At present, the numbers of hidden neurons and hidden layers are still chosen by manual experiment, and the network structure remains fixed during training; this leads to large errors, high computational cost, and low efficiency. A new DBN structure design method is therefore needed, one that lets the network dynamically grow and prune during training according to the current training state and so optimizes the network structure, in order to better solve the intelligent solid waste treatment problem.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides an intelligent solid waste treatment method based on a Dynamic Deep Belief Network (DDBN).
The technical scheme of the invention is as follows:
a solid waste intelligent treatment method based on a dynamic deep belief network comprises the following steps:
step 1, measuring solid waste to obtain a solid waste data set, preprocessing the solid waste data set, and dividing to obtain a training data set and a testing data set.
The preprocessing is as follows: normalize the solid waste data set to the range [0, 1] using

$$x = \frac{\hat{x} - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$

where $\hat{x}$ is a raw feature value of the solid waste data set, $x_{\max}$ and $x_{\min}$ are the maximum and minimum over all features of the data set, and $x$ is the normalized value.
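As an illustration, formula (1) can be realized in a few lines of NumPy. This is a minimal sketch, assuming the min and max are taken over the whole data set as the text describes; the 1000-by-7 array is a hypothetical stand-in for the measured solid waste data:

```python
import numpy as np

def normalize(data: np.ndarray) -> np.ndarray:
    """Min-max normalize a solid waste data set to [0, 1], per formula (1)."""
    x_min, x_max = data.min(), data.max()  # min/max over all features of the data set
    return (data - x_min) / (x_max - x_min)

raw = np.random.rand(1000, 7) * 100.0  # hypothetical measurements: 1000 samples x 7 features
x = normalize(raw)
assert x.min() >= 0.0 and x.max() <= 1.0
```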
Step 2: input the training data set preprocessed in step 1 into the DDBN model and train each layer's Restricted Boltzmann Machine (RBM) individually and unsupervised, from bottom to top, with the Contrastive Divergence (CD) algorithm. During training, the network structure of the current RBM is optimized by the dynamic grow-and-prune algorithm. Iterative training yields the network structure and the parameter values (weights and biases) of each RBM, and finally the high-level features of the training data set. The specific operations are as follows:
step 2.1, constructing a DDBN network model, and setting the parameter values of DDBN: visual layer neurons, initial hidden layer neurons, and hidden layer numbers, learning rate, iteration times, and fine tuning times. Wherein, the number of visual layer neurons is the feature dimension of the training data set.
Step 2.2: input the preprocessed training data set into the first-layer RBM, pre-train the RBM with the CD algorithm, and optimize the network structure of the current RBM with the dynamic grow-and-prune algorithm during training.
(1) The energy function E(v, h; θ) of the RBM and the joint probability distribution P(v, h; θ) of the visible and hidden layer neurons are:

$$E(v,h;\theta) = -\sum_{i=1}^{I} b_i v_i - \sum_{j=1}^{J} c_j h_j - \sum_{i=1}^{I}\sum_{j=1}^{J} v_i w_{ij} h_j \tag{2}$$

$$P(v,h;\theta) = \frac{1}{Z}\,e^{-E(v,h;\theta)}, \qquad Z = \sum_{v,h} e^{-E(v,h;\theta)} \tag{3}$$

where $v_i$ (1 ≤ i ≤ I) and $h_j$ (1 ≤ j ≤ J) denote the visible layer and hidden layer neurons, w is the weight matrix between the visible and hidden layers, b and c are the biases of the visible and hidden neurons respectively, θ = {w, b, c} denotes the model parameters, and the partition function Z sums over all possible visible-hidden neuron pairs.
Using the Bayes formula, the marginal probability distributions of the visible layer neurons v and the hidden layer neurons h follow from formula (3):

$$P(v;\theta) = \frac{1}{Z}\sum_{h} e^{-E(v,h;\theta)} \tag{4}$$

$$P(h;\theta) = \frac{1}{Z}\sum_{v} e^{-E(v,h;\theta)} \tag{5}$$
The conditional probability distributions of the visible layer neurons v and the hidden layer neurons h are likewise derived with the Bayes formula:

$$P(h_j = 1 \mid v) = \sigma\Big(c_j + \sum_{i=1}^{I} v_i w_{ij}\Big) \tag{6}$$

$$P(v_i = 1 \mid h) = \sigma\Big(b_i + \sum_{j=1}^{J} w_{ij} h_j\Big) \tag{7}$$

where σ(x) = 1/(1 + e^{−x}) is the sigmoid function.
According to formulas (6) and (7), the contrastive divergence algorithm obtains an approximate reconstruction P(v; θ) of the training sample through one-step Gibbs sampling, and the network parameters θ = {w, b, c} are then updated from the reconstruction error.
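The CD step just described can be sketched as follows. This is a schematic CD-1 update assuming binary visible units, a mini-batch of samples, and a simple reconstruction-error report, not the patent's exact implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, w, b, c, lr=0.1):
    """One contrastive-divergence (CD-1) step: one-step Gibbs sampling per formulas (6)-(7).
    v0: (N, I) visible batch; w: (I, J) weights; b: (I,) visible bias; c: (J,) hidden bias."""
    ph0 = sigmoid(v0 @ w + c)                         # P(h=1|v0), formula (6)
    h0 = (np.random.rand(*ph0.shape) < ph0).astype(float)
    pv1 = sigmoid(h0 @ w.T + b)                       # reconstruction P(v=1|h0), formula (7)
    ph1 = sigmoid(pv1 @ w + c)                        # hidden probabilities of the reconstruction
    n = v0.shape[0]
    w += lr * (v0.T @ ph0 - pv1.T @ ph1) / n          # update theta = {w, b, c}
    b += lr * (v0 - pv1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return np.mean((v0 - pv1) ** 2)                   # reconstruction error
```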
(2) During training, the network structure of the RBM is optimized by the dynamic grow-and-prune algorithm according to the current training state.
The change of the weights w is monitored with the Weight Distance (WD) method:

$$WD_j[m] = \mathrm{Met}(w_j[m], w_j[m-1]) \tag{8}$$

where $w_j[m]$ is the weight vector of hidden layer neuron j after m iterations and Met denotes a metric function, for example the Euclidean distance. The value of WD reflects the change of hidden neuron j's weight vector between two consecutive iterations.
The growth condition is considered from both the local and the global aspect.
The local condition is defined as:

$$\mathrm{max\_WD}_j[m] = \max_{1 \le n \le N} WD_j^n[m] \tag{9}$$

where $WD_j^n[m]$ is the WD value of the j-th hidden neuron for the n-th input sample in the m-th iteration, j = 1, 2, ..., J, J is the number of hidden neurons, and max(·) is the maximum function.
The global condition is defined as:

$$\mathrm{iratio}_j[m] = \frac{N'}{N} \tag{10}$$

where N is the number of samples in the training data set and N′ is the number of samples for which the WD value of the j-th neuron increased compared with the previous iteration, i.e. $N' = \lvert\{\, n : WD_j^n[m] > WD_j^n[m-1] \,\}\rvert$.
The local and global conditions consider hidden neuron j with respect to a single input sample and to all input samples, respectively. Multiplying the two conditions gives the growth condition:

$$\mathrm{max\_WD}_j[m] \cdot \mathrm{iratio}_j[m] > y(m) \tag{11}$$
where y(m) is a curve used as a variable threshold, defined as:

$$y(m) = (y_{\max} - y_{\min})\Big(1 - \frac{m}{\mathrm{numepochs}}\Big)^{u} + y_{\min} \tag{12}$$

where m is the current iteration, numepochs is the maximum number of iterations, u controls the curvature of the curve, and $y_{\max}$ and $y_{\min}$ are the maximum and minimum values of the curve. When the j-th neuron satisfies formula (11), the neuron is split into two neurons, and every parameter of the new neuron is initialized to 0.
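A sketch of this growth check follows. The per-sample WD bookkeeping and the closed form used for y(m) are assumptions: formula (12) is reproduced here as one plausible monotonically decreasing curve running from y_max down to y_min with curvature u, matching the description:

```python
import numpy as np

def y_threshold(m, numepochs, y_max, y_min, u):
    """Variable threshold y(m): decreases from y_max to y_min with curvature u (formula (12))."""
    return (y_max - y_min) * (1.0 - m / numepochs) ** u + y_min

def neurons_to_split(wd, m, numepochs, y_max=0.5, y_min=0.01, u=2.0):
    """wd: (M, N, J) array of WD_j^n[m] values per iteration, sample, and hidden neuron (m >= 1).
    Returns a boolean mask of hidden neurons satisfying the growth condition (11)."""
    max_wd = wd[m].max(axis=0)                   # local condition, formula (9)
    iratio = (wd[m] > wd[m - 1]).mean(axis=0)    # global condition N'/N, formula (10)
    return max_wd * iratio > y_threshold(m, numepochs, y_max, y_min, u)
```

Each neuron flagged by the mask would then be split into two, with every parameter of the new neuron initialized to 0.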
When the RBM training is complete, pruning begins. The standard deviation of each hidden neuron's activation probability over all samples is used as the pruning criterion:

$$\sigma(j) = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\big(P(n,j) - \mu_j\big)^2} \tag{13}$$

where N is the number of samples in the input training data set, j denotes the j-th hidden layer neuron, P(n, j) is the activation probability of the j-th neuron for the n-th input sample, and $\mu_j$ is the average activation probability of the j-th neuron over all input samples.
The pruning condition is:

$$\sigma(j) < \theta_A \tag{14}$$

where $\theta_A$ is a threshold. When the j-th neuron satisfies formula (14), the neuron and all of its parameters are removed. A trade-off curve between the pruning ratio and the prediction accuracy is also plotted, and $\theta_A$ is chosen from this curve so that as many redundant neurons as possible are removed while the original accuracy is preserved.
After pruning, the RBM is retrained so that the remaining neurons can compensate for the removed ones; one round of pruning followed by retraining constitutes one iteration. At each iteration the threshold $\theta_A$ is updated:

$$\theta_A \leftarrow \theta_A + \delta[\mathrm{iter}] \tag{15}$$

The increment δ[iter] raises the threshold at each pruning iteration so that more neurons can be removed. Each iteration is a greedy search: from the trade-off curve of each pruning round, the best pruning ratio that loses no accuracy can be found, and δ[iter] is set so that $\theta_A$ matches the pruning ratio required by that iteration.
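The pruning test of formulas (13) and (14) reduces to a standard deviation over the activation-probability matrix. The sketch below assumes that matrix is available as an (N, J) array; the toy usage constructs one redundant (near-constant) neuron by hand:

```python
import numpy as np

def prune_mask(activation_probs, theta_a):
    """activation_probs: (N, J) matrix of P(n, j).
    True marks redundant neurons to remove, per formulas (13)-(14)."""
    sigma = activation_probs.std(axis=0)  # population std over the N samples, formula (13)
    return sigma < theta_a

# Toy usage: neuron 0 fires near-constantly (redundant), neuron 1 is discriminative
probs = np.column_stack([np.full(800, 0.5), np.random.rand(800)])
print(prune_mask(probs, theta_a=0.05))  # -> [ True False ]
```

After the flagged neurons are removed, the RBM is retrained and θ_A is raised by δ[iter] per formula (15), so that successive greedy iterations can prune further without losing accuracy.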
Step 2.3: after the structure of the current RBM is determined, an energy function is used as the condition for adding a new RBM:

$$\mathrm{mean}(E_1, E_2, \ldots, E_L) > \theta_L \tag{16}$$

where $E_l$ is the total energy of the l-th layer RBM, obtained from formula (2), l = 1, 2, ..., L, L is the current number of DDBN layers, mean(·) is the average function, and $\theta_L$ is a threshold. When formula (16) is satisfied, a new RBM layer is added, its parameters initialized in the same way as the first layer's; the output of the current RBM is then taken as the input of the newly added RBM.
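The layer-growth test can be sketched directly from formulas (2) and (16). The batch-energy helper and the use of a plain mean over the per-layer total energies are assumptions based on the text:

```python
import numpy as np

def rbm_total_energy(v, h, w, b, c):
    """Total energy of one RBM layer over a batch of (v, h) states, from formula (2)."""
    per_sample = -v @ b - h @ c - np.einsum('ni,ij,nj->n', v, w, h)
    return float(per_sample.sum())

def should_add_layer(layer_energies, theta_l):
    """layer_energies: [E_1, ..., E_L]. Add a new RBM layer when formula (16) holds."""
    return float(np.mean(layer_energies)) > theta_l
```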
Step 2.4: train the network cyclically according to steps 2.2 and 2.3 to obtain the network structure of the DDBN.
Step 3: take the DDBN network structure and parameter values obtained in step 2 as the initial values of the fine-tuning stage, and fine-tune the whole DDBN network with a top-down back-propagation algorithm to obtain the final DDBN network model. The specific operations are as follows:
and 3.1, taking the DDBN network structure and the parameter value theta trained in the step 2 as initial values of a fine tuning stage, adding an output layer after the last layer of RBM for predicting combustion behaviors including temperature, pressure and gas flow suitable for the training data set sample, and inputting the training data set to start fine tuning the whole DDBN network.
Step 3.2: compute the activation probability of every hidden layer neuron with the forward propagation algorithm.
Step 3.3: compute the prediction obtained by forward-propagating the training samples and compare it with the actual result to obtain the loss function:

$$J(t) = \frac{1}{2N}\sum_{n=1}^{N}\big(y_n - y'_n\big)^2 \tag{17}$$

where t is the current fine-tuning step, N is the number of samples in the training data set, and $y_n$ and $y'_n$ are the actual and predicted results of the n-th training sample. The error between the actual and predicted results is propagated backwards, and the weights w and biases c are updated by gradient descent according to formulas (18) and (19):

$$w \leftarrow w - \alpha\frac{\partial J(t)}{\partial w} \tag{18}$$

$$c \leftarrow c - \alpha\frac{\partial J(t)}{\partial c} \tag{19}$$
where α is the learning rate.
Gradient descent is applied iteratively to fine-tune the whole DDBN network from top to bottom, decreasing the value of J(t), until the maximum number of fine-tuning steps is reached and the final DDBN network model is obtained.
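The fine-tuning stage is ordinary back-propagation on the unrolled network. The sketch below assumes a hypothetical model object exposing forward, backward, and params methods, and uses the squared-error loss J(t) of formula (17):

```python
import numpy as np

def finetune(model, x_train, y_train, lr=0.1, max_steps=100):
    """Top-down fine-tuning of the whole DDBN by gradient descent (formulas (17)-(19))."""
    for t in range(max_steps):
        y_pred = model.forward(x_train)                               # step 3.2: forward pass
        loss = np.sum((y_train - y_pred) ** 2) / (2 * len(x_train))   # J(t), formula (17)
        grads = model.backward(y_train - y_pred)                      # back-propagate the error
        for p, g in zip(model.params(), grads):
            p -= lr * g                                               # updates per formulas (18)-(19)
        if t % 10 == 0:
            print(f"fine-tune step {t}: J(t) = {loss:.6f}")
    return model
```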
Step 4: input the test data set into the final DDBN network model obtained in step 3 and output the prediction. The specific operations are as follows:
Step 4.1: input the preprocessed test data set into the DDBN network model fine-tuned in step 3 and extract the main features of the solid waste through the RBMs.
Step 4.2: input the main features of the test samples into the final output layer and predict the appropriate combustion behavior, including temperature, pressure, and gas flow.
The invention has the following beneficial effects. To give the model both feature-extraction and prediction capability, a DDBN using the dynamic grow-and-prune algorithm is proposed, so that the network structure can change with the current training state during training, adding hidden layer neurons and hidden layers and removing redundant neurons. This effectively optimizes the DDBN's network structure, replaces manual experimentation, and overcomes the difficulty of network structure design. Exploiting the DDBN's ability to extract the main features of the raw data, the DDBN then describes the random, discrete, and nonlinear feature vectors of the solid waste, making the state features of the time series easier to identify while preserving the main information of the raw data. Finally, from the extracted state description of the solid waste, the DDBN predicts the optimal combustion behavior for that state, including temperature, pressure, and gas flow, greatly reducing the resource waste caused by blind combustion and realizing intelligent treatment of solid waste.
Drawings
FIG. 1 is a schematic representation of the operation of adding hidden layer neurons in accordance with the present invention;
FIG. 2 is a schematic diagram of the operation of removing redundant neurons in the present invention;
FIG. 3 is a schematic diagram illustrating the operation of adding a hidden layer according to the present invention;
FIG. 4 is a flow chart of a training process of a DDBN model in the invention;
FIG. 5 is a diagram showing the effect of the intelligent treatment of the solid waste according to the present invention;
Detailed Description
The following further describes a specific embodiment of the present invention with reference to the drawings and technical solutions.
As shown in fig. 4, an intelligent solid waste treatment method based on a dynamic deep belief network includes the following specific steps:
step 1, measuring data such as GDP, dangerous objects, solid waste amount, smelting waste residue, furnace ash, furnace slag, tailings and the like of solid waste to obtain a solid waste data set, and preprocessing the data to obtain a training and testing data set.
Because solid waste data are often not of the same order of magnitude, the data set is normalized to [0, 1], which also helps to speed up network training. The normalization formula is:

$$x = \frac{\hat{x} - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$

where $\hat{x}$ is a raw feature value of the solid waste data set, $x_{\max}$ and $x_{\min}$ are the maximum and minimum over all features of the data set, and $x$ is the normalized value.
Step 2: input the preprocessed training data set into the DDBN model for pre-training, training each Restricted Boltzmann Machine (RBM) layer individually and unsupervised, from bottom to top, with the Contrastive Divergence (CD) algorithm. During training, the network structure of the current RBM is optimized by the dynamic grow-and-prune algorithm, which adds new hidden neurons and removes redundant ones. After the current RBM finishes training, if the DDBN satisfies the layer-generation condition, a new RBM layer is added; the output of the current RBM becomes the input of the new RBM, whose parameters are initialized in the same way as the first layer's. Repeated iterations yield each RBM's network structure together with its weights and biases, and finally the high-level features of the data. The specific operations are as follows:
step 2.1, constructing a DDBN network model, and setting the parameter values of DDBN: the number of visual layer neurons is the feature dimension of a training data set, the initial hidden layer neurons and the hidden layer neurons are respectively set to be 10 and 1, the learning rate is set to be 0.1, the number of pre-training iterations is 100, and the number of fine-tuning iterations is 100.
Step 2.2: take the preprocessed training data set as the input of the first-layer RBM, pre-train the RBM with the CD algorithm, and optimize the network structure of the current RBM during training with the dynamic grow-and-prune algorithm, which adds new hidden neurons and removes redundant ones.
(1) The RBM is a stochastic network based on an energy model: every configuration of the visible layer neurons v and the hidden layer neurons h has a corresponding energy value. From the definition of the energy function and the principles of statistical thermodynamics, the joint probability distribution of v and h can be obtained. The energy function E(v, h; θ) and the joint distribution P(v, h; θ) of v and h are:

$$E(v,h;\theta) = -\sum_{i=1}^{I} b_i v_i - \sum_{j=1}^{J} c_j h_j - \sum_{i=1}^{I}\sum_{j=1}^{J} v_i w_{ij} h_j \tag{2}$$

$$P(v,h;\theta) = \frac{1}{Z}\,e^{-E(v,h;\theta)}, \qquad Z = \sum_{v,h} e^{-E(v,h;\theta)} \tag{3}$$

where $v_i$ (1 ≤ i ≤ I) and $h_j$ (1 ≤ j ≤ J) denote the visible and hidden layer neurons, w is the weight matrix between the visible and hidden layers, b and c are the biases of the visible and hidden neurons respectively, θ = {w, b, c} denotes the model parameters, and the partition function Z sums over all possible visible-hidden neuron pairs.
Using the Bayes formula, the marginal probability distributions of the visible layer neurons v and the hidden layer neurons h are obtained from formula (3):

$$P(v;\theta) = \frac{1}{Z}\sum_{h} e^{-E(v,h;\theta)} \tag{4}$$

$$P(h;\theta) = \frac{1}{Z}\sum_{v} e^{-E(v,h;\theta)} \tag{5}$$
the goal of the RBM network training is to solve for θ ═ { w, b, c }, so that under this parameter the RBM can fit the input samples very large, making P (v; θ) maximum, i.e., solve for the maximum likelihood estimates of the input samples. However, to obtain the maximum likelihood estimate, all possible cases need to be computed, the computation is exponential growing, so the RBM is estimated using the contrast divergence algorithm.
The conditional probability distributions of the visible layer neurons v and the hidden layer neurons h are derived with the Bayes formula:

$$P(h_j = 1 \mid v) = \sigma\Big(c_j + \sum_{i=1}^{I} v_i w_{ij}\Big) \tag{6}$$

$$P(v_i = 1 \mid h) = \sigma\Big(b_i + \sum_{j=1}^{J} w_{ij} h_j\Big) \tag{7}$$

where σ(x) = 1/(1 + e^{−x}) is the sigmoid function. According to formulas (6) and (7), the contrastive divergence algorithm obtains an approximate reconstruction P(v; θ) of the training sample through one-step Gibbs sampling, and the network parameters θ are then updated from the reconstruction error.
(2) During training, the network structure of the RBM is optimized by the dynamic grow-and-prune algorithm according to the current training state.
In a DBN, the weights w play the decisive role in network training. The invention therefore proposes a method called Weight Distance (WD) to monitor the change of the weights w:

$$WD_j[m] = \mathrm{Met}(w_j[m], w_j[m-1]) \tag{8}$$

where $w_j[m]$ is the weight vector of hidden layer neuron j after m iterations and Met is a metric function, for example the Euclidean distance. The value of WD reflects the change of hidden neuron j's weight vector between two consecutive iterations. In general, the weight vector of neuron j converges after a period of training, i.e. WD becomes smaller and smaller. If some weight vectors still fluctuate with large amplitude after many iterations, this should be attributed to a lack of hidden neurons for mapping the input samples; in that case, the number of neurons must be increased to improve the performance of the network. The invention considers the growth condition from both the local and the global aspect. The local condition is defined as:
$$\mathrm{max\_WD}_j[m] = \max_{1 \le n \le N} WD_j^n[m] \tag{9}$$

where $WD_j^n[m]$ is the WD value of the j-th hidden neuron for the n-th input sample in the m-th iteration, j = 1, 2, ..., J, J is the number of hidden neurons, and max(·) is the maximum function. The global condition is defined as:

$$\mathrm{iratio}_j[m] = \frac{N'}{N} \tag{10}$$

where N is the number of training samples and N′ is the number of samples for which the WD value of the j-th neuron increased compared with the previous iteration, i.e. $N' = \lvert\{\, n : WD_j^n[m] > WD_j^n[m-1] \,\}\rvert$.
The local and global conditions consider hidden neuron j with respect to a single input sample and to all input samples, respectively: $\mathrm{max\_WD}_j[m]$ is the maximum WD value of neuron j over the samples, and $\mathrm{iratio}_j[m]$ is the ratio of samples whose WD value increased. Multiplying the two conditions gives the growth condition:

$$\mathrm{max\_WD}_j[m] \cdot \mathrm{iratio}_j[m] > y(m) \tag{11}$$
where y(m) is a curve used as a variable threshold, defined as:

$$y(m) = (y_{\max} - y_{\min})\Big(1 - \frac{m}{\mathrm{numepochs}}\Big)^{u} + y_{\min} \tag{12}$$

where m is the current iteration, numepochs is the maximum number of iterations, u controls the curvature of the curve, and $y_{\max}$ and $y_{\min}$ are the maximum and minimum values of the curve. During training, if the network is developing in a good direction, the values of max_WD and iratio become smaller and smaller, so the curve y(m), which is monotonically decreasing when u > 0, is used as the variable threshold of the growth condition. If the j-th neuron satisfies formula (11), the neuron is split into two neurons, and every parameter of the new neuron is 0.
Pruning begins when the RBM training is complete. The purpose of the RBM is to extract the main features of the input samples, i.e. the activation probabilities of the hidden neurons; these features should be discriminative and support further applied study of the data. If the activation probability of a neuron is close to its average value for all samples, the feature extracted by that neuron is not discriminative: the neuron is redundant. To obtain a more compact network structure, these redundant neurons must be removed. The invention uses the standard deviation to measure how dispersed a hidden neuron's activation probabilities are over all samples:

$$\sigma(j) = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\big(P(n,j) - \mu_j\big)^2} \tag{13}$$

where N is the number of input samples, j denotes the j-th hidden layer neuron, P(n, j) is the activation probability of the j-th neuron for the n-th input sample, and $\mu_j$ is the average activation probability of the j-th neuron over all input samples. A small standard deviation means the values stay close to the mean, i.e. the feature extracted by this neuron is not discriminative, so the redundant neuron must be removed. The pruning condition is:

$$\sigma(j) < \theta_A \tag{14}$$
where $\theta_A$ is a threshold. If the j-th neuron satisfies formula (14), the neuron and all of its parameters are removed. The invention also plots a trade-off curve between the pruning ratio and the prediction accuracy and chooses $\theta_A$ from this curve, so that as many redundant neurons as possible are removed while the original accuracy is preserved. Furthermore, after pruning, the current RBM is retrained so that the remaining neurons can compensate for the removed ones. This step is crucial; one round of pruning followed by retraining constitutes one iteration. Iterative pruning removes few neurons at a time and retrains repeatedly to compensate, and over several iterations a higher pruning ratio can be reached without losing accuracy. The threshold $\theta_A$ is updated at each iteration:

$$\theta_A \leftarrow \theta_A + \delta[\mathrm{iter}] \tag{15}$$

The increment δ[iter] raises the threshold at each pruning iteration so that more neurons can be removed. Each iteration is a greedy search: from the trade-off curve of each pruning round, the best pruning ratio that loses no accuracy can be found, and δ[iter] is set so that $\theta_A$ matches the pruning ratio required by that iteration.
Step 2.3: after the structure of the current RBM is determined, the growth of hidden layers is considered. From formula (4) it can be seen that P(v; θ) is inversely related to E(v, h; θ), so to maximize P(v; θ) the energy function E(v, h; θ) should be as small as possible. If the total energy of the DBN is greater than a threshold, the DBN lacks the capability to represent the data, and a new RBM layer can be added. The invention therefore uses the energy function as the condition for adding a new RBM:

$$\mathrm{mean}(E_1, E_2, \ldots, E_L) > \theta_L \tag{16}$$

where $E_l$ is the total energy of the l-th layer RBM, obtained from formula (2), l = 1, 2, ..., L, L is the current number of DDBN layers, mean(·) is the average function, and $\theta_L$ is a threshold. If formula (16) is satisfied, a new RBM layer is added, its parameters initialized in the same way as the first layer's, and the output of the current RBM is taken as the input of the newly added RBM.
Step 2.4: train the network cyclically according to steps 2.2 and 2.3, learning a deep DDBN network structure.
Step 3: further optimize the DDBN by fine-tuning. The network structure and parameter values obtained in the pre-training stage serve as the initial values of the fine-tuning stage, and the whole DDBN network is fine-tuned. The invention uses the back-propagation algorithm for this: training errors are propagated backwards from top to bottom and the network is optimized to obtain the final DDBN network model. The specific operations are as follows:
and 3.1, taking the network structure and the parameter value theta trained in the pre-training stage as initial values of the fine-tuning stage, and adding an output layer after the last layer of RBM. The output layer is provided with 3 neurons, and the outputs of the neurons represent the temperature, the pressure and the gas flow respectively and are used for predicting the combustion behavior suitable for the training sample. And inputting a training data set into the network in the fine tuning stage for optimization.
Step 3.2: compute the activation probability of every hidden layer neuron with the forward propagation algorithm.
Step 3.3: compute the prediction obtained by forward-propagating the training samples and compare it with the actual result to obtain the loss function:

$$J(t) = \frac{1}{2N}\sum_{n=1}^{N}\big(y_n - y'_n\big)^2 \tag{17}$$

where t is the current fine-tuning step, N is the number of training samples, and $y_n$ and $y'_n$ are the actual and predicted results of the n-th training sample. The error between the actual and predicted results is propagated backwards, and the weights w and biases c are updated by gradient descent according to formulas (18) and (19):

$$w \leftarrow w - \alpha\frac{\partial J(t)}{\partial w} \tag{18}$$

$$c \leftarrow c - \alpha\frac{\partial J(t)}{\partial c} \tag{19}$$

where α is the learning rate. Gradient descent is applied iteratively to fine-tune the whole DDBN network from top to bottom, decreasing the value of J(t), until the maximum number of fine-tuning steps is reached and the final DDBN network model is obtained.
Step 4: input the preprocessed test data set into the DDBN network model obtained in the fine-tuning stage, extract the main features of the test samples through the RBMs, and feed them to the final output layer, whose outputs represent the values of temperature, pressure, and gas flow, i.e. the predicted combustion behavior appropriate for the test samples.
The collected solid waste data set was tested with the proposed method. The data set contains 1000 samples: 800 training samples and 200 test samples. Each sample has 7 features, so the number of visible layer neurons is set to 7; each sample has 3 outputs corresponding to its combustion behavior, namely temperature, pressure, and gas flow.
The results show that, compared with the traditional manual control method, the intelligent solid waste treatment method based on the dynamic deep belief network saves 30% of the treatment time, while the treatment quality still meets the nationally regulated solid waste treatment indices. The proposed method can therefore treat solid waste effectively, saving time and cost and achieving efficient intelligent treatment.

Claims (5)

1. A solid waste intelligent treatment method based on a dynamic deep belief network is characterized by comprising the following steps:
step 1, measuring solid waste to obtain a solid waste data set, preprocessing the solid waste data set, and dividing to obtain a training data set and a testing data set;
step 2, inputting the training data set preprocessed in step 1 into the DDBN model, training each layer's restricted Boltzmann machine RBM individually and unsupervised from bottom to top with the contrastive divergence algorithm, optimizing the network structure of the current RBM with the dynamic grow-and-prune algorithm during training, obtaining the network structure and parameter values of each RBM by iterative training, and finally obtaining the high-level features of the training data set; the parameter values are the weights and biases;
step 3, taking the DDBN network structure and the parameter values obtained in the step 2 as initial values of a fine tuning stage, and fine tuning the whole DDBN network by using a top-down back propagation algorithm to obtain a final DDBN network model; the specific process is as follows:
step 3.1, taking the DDBN network structure and the parameters θ trained in step 2 as the initial values of the fine-tuning stage, adding an output layer after the last RBM layer to predict the combustion behavior, including temperature, pressure, and gas flow, appropriate for the training data set samples, and inputting the training data set to begin fine-tuning the whole DDBN network;
step 3.2, calculating the activation probability of each hidden layer neuron by using a forward propagation algorithm;
step 3.3, computing the prediction obtained by forward-propagating the training samples and comparing it with the actual result to obtain the loss function:

$$J(t) = \frac{1}{2N}\sum_{n=1}^{N}\big(y_n - y'_n\big)^2 \tag{1}$$

where t is the current fine-tuning step, N is the number of samples in the training data set, and $y_n$ and $y'_n$ are the actual and predicted results of the n-th training sample; the error between the actual and predicted results is propagated backwards, and the weight w and the bias c are updated by gradient descent according to formulas (2) and (3):
$$w \leftarrow w - \alpha\frac{\partial J(t)}{\partial w} \tag{2}$$

$$c \leftarrow c - \alpha\frac{\partial J(t)}{\partial c} \tag{3}$$
wherein α is the learning rate;
applying gradient descent iteratively to fine-tune the whole DDBN network from top to bottom, decreasing the value of J(t), until the maximum number of fine-tuning steps is reached and the final DDBN network model is obtained;
and 4, inputting the test data set into the final DDBN network model obtained in the step 3, and finally outputting a prediction result.
2. The method for intelligently processing the solid waste based on the dynamic deep belief network as claimed in claim 1, wherein the preprocessing in step 1 is: normalizing the solid waste data set to [0, 1] with the normalization formula:

$$x = \frac{\hat{x} - x_{\min}}{x_{\max} - x_{\min}} \tag{4}$$

where $\hat{x}$ is a raw feature value of the solid waste data set, $x_{\max}$ and $x_{\min}$ are the maximum and minimum over all features of the data set, and $x$ is the normalized value.
3. The method for intelligently processing the solid waste based on the dynamic deep belief network as claimed in claim 1 or 2, wherein the specific process of the step 2 is as follows:
step 2.1, constructing the DDBN network model and setting its parameters: the numbers of visible layer neurons, initial hidden layer neurons, and hidden layers, the learning rate, the number of iterations, and the number of fine-tuning steps; the number of visible layer neurons equals the feature dimension of the training data set;
step 2.2, inputting the preprocessed training data set into the first-layer RBM, pre-training the RBM with the CD algorithm, and optimizing the network structure of the current RBM with the dynamic grow-and-prune algorithm during training;
(1) the energy function E(v, h; θ) of the RBM and the joint probability distribution P(v, h; θ) of the visible and hidden layer neurons are:

$$E(v,h;\theta) = -\sum_{i=1}^{I} b_i v_i - \sum_{j=1}^{J} c_j h_j - \sum_{i=1}^{I}\sum_{j=1}^{J} v_i w_{ij} h_j \tag{5}$$

$$P(v,h;\theta) = \frac{1}{Z}\,e^{-E(v,h;\theta)}, \qquad Z = \sum_{v,h} e^{-E(v,h;\theta)} \tag{6}$$

where $v_i$ (1 ≤ i ≤ I) and $h_j$ (1 ≤ j ≤ J) denote the visible and hidden layer neurons, w is the weight matrix between the visible and hidden layers, b and c are the biases of the visible and hidden neurons respectively, θ = {w, b, c} denotes the model parameters, and the partition function Z sums over all possible visible-hidden neuron pairs;
using the Bayes formula, the marginal probability distributions of the visible layer neurons v and the hidden layer neurons h are obtained from formula (6):

$$P(v;\theta) = \frac{1}{Z}\sum_{h} e^{-E(v,h;\theta)} \tag{7}$$

$$P(h;\theta) = \frac{1}{Z}\sum_{v} e^{-E(v,h;\theta)} \tag{8}$$
the conditional probability distributions of the visible layer neurons v and the hidden layer neurons h are derived with the Bayes formula:

$$P(h_j = 1 \mid v) = \sigma\Big(c_j + \sum_{i=1}^{I} v_i w_{ij}\Big) \tag{9}$$

$$P(v_i = 1 \mid h) = \sigma\Big(b_i + \sum_{j=1}^{J} w_{ij} h_j\Big) \tag{10}$$

where σ(x) = 1/(1 + e^{−x}) is the sigmoid function;
obtaining an approximate reconstruction P(v; θ) of the training sample through one-step Gibbs sampling with the contrastive divergence algorithm according to formulas (9) and (10), and then updating the network parameters θ = {w, b, c} from the reconstruction error;
(2) during training, optimizing the network structure of the RBM with the dynamic grow-and-prune algorithm according to the current training state;
the change of the weights w is monitored with the weight distance (WD) method:

$$WD_j[m] = \mathrm{Met}(w_j[m], w_j[m-1]) \tag{11}$$

where $w_j[m]$ is the weight vector of hidden layer neuron j after m iterations and Met denotes a metric function, for example the Euclidean distance; the value of WD reflects the change of hidden neuron j's weight vector between two consecutive iterations;
the growth condition is considered from both the local and the global aspect;
the local condition is defined as:

$$\mathrm{max\_WD}_j[m] = \max_{1 \le n \le N} WD_j^n[m] \tag{12}$$

where $WD_j^n[m]$ is the WD value of the j-th hidden neuron for the n-th input sample in the m-th iteration, j = 1, 2, 3, ..., J, J is the number of hidden neurons, and max(·) is the maximum function;
the global condition is defined as:

$$\mathrm{iratio}_j[m] = \frac{N'}{N} \tag{13}$$

where N is the number of samples in the training data set and N′ is the number of samples for which the WD value of the j-th neuron increased compared with the previous iteration, i.e. $N' = \lvert\{\, n : WD_j^n[m] > WD_j^n[m-1] \,\}\rvert$;
the local and global conditions consider hidden neuron j with respect to a single input sample and to all input samples, respectively; multiplying the two conditions gives the growth condition:

$$\mathrm{max\_WD}_j[m] \cdot \mathrm{iratio}_j[m] > y(m) \tag{14}$$
where y(m) is a curve used as a variable threshold, defined as:

$$y(m) = (y_{\max} - y_{\min})\Big(1 - \frac{m}{\mathrm{numepochs}}\Big)^{u} + y_{\min} \tag{15}$$

where m is the current iteration, numepochs is the maximum number of iterations, u controls the curvature of the curve, and $y_{\max}$ and $y_{\min}$ are the maximum and minimum values of the curve; when the j-th neuron satisfies formula (14), the neuron is split into two neurons, and every parameter of the new neuron is 0;
when the RBM training is complete, pruning begins: the standard deviation of each hidden neuron's activation probability over all samples is used as the pruning criterion, with the standard deviation formula:

$$\sigma(j) = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\big(P(n,j) - \mu_j\big)^2} \tag{16}$$

where N is the number of samples in the input training data set, j denotes the j-th hidden layer neuron, P(n, j) is the activation probability of the j-th neuron for the n-th input sample, and $\mu_j$ is the average activation probability of the j-th neuron over all input samples;
the pruning condition is:

$$\sigma(j) < \theta_A \tag{17}$$

where $\theta_A$ is a threshold; when the j-th neuron satisfies formula (17), the neuron and all of its parameters are removed; at the same time, a trade-off curve between the pruning ratio and the prediction accuracy is plotted, and $\theta_A$ is chosen from this curve so that as many redundant neurons as possible are removed while the original accuracy is preserved;
after pruning, the RBM is retrained so that the remaining neurons can compensate for the removed ones; one round of pruning followed by retraining constitutes one iteration; at each iteration the threshold $\theta_A$ is updated:

$$\theta_A \leftarrow \theta_A + \delta[\mathrm{iter}] \tag{18}$$

the increment δ[iter] raises the threshold at each pruning iteration so that more neurons can be removed; each iteration is a greedy search: from the trade-off curve of each pruning round, the best pruning ratio that loses no accuracy can be found, and δ[iter] is set so that $\theta_A$ matches the pruning ratio required by that iteration;
step 2.3, after the structure of the current RBM is determined, using an energy function as the condition for adding a new RBM:

$$\mathrm{mean}(E_1, E_2, \ldots, E_L) > \theta_L \tag{19}$$

where $E_l$ is the total energy of the l-th layer RBM, obtained from formula (5), l = 1, 2, ..., L, L is the current number of DDBN layers, mean(·) is the average function, and $\theta_L$ is a threshold; when formula (19) is satisfied, a new RBM layer is added, its parameters initialized in the same way as the first layer's; the output of the current RBM is then taken as the input of the newly added RBM;
and 2.4, training the network circularly according to the steps 2.2 and 2.3 to obtain the network structure of the DDBN.
4. The method for intelligently processing the solid waste based on the dynamic deep belief network as claimed in claim 1 or 2, wherein the specific process of the step 4 is as follows:
step 4.1, inputting the preprocessed test data set into the DDBN network model fine-tuned in step 3, and extracting the main features of the solid waste through the RBMs;
step 4.2, inputting the main features of the test samples into the final output layer and predicting the appropriate combustion behavior, including temperature, pressure, and gas flow.
5. The method for intelligently processing the solid waste based on the dynamic deep belief network as claimed in claim 3, wherein the specific process of the step 4 is as follows:
step 4.1, inputting the preprocessed test data set into the DDBN network model fine-tuned in step 3, and extracting the main features of the solid waste through the RBMs;
step 4.2, inputting the main features of the test samples into the final output layer and predicting the appropriate combustion behavior, including temperature, pressure, and gas flow.
CN201810768405.8A 2018-07-13 2018-07-13 Solid waste intelligent treatment method based on dynamic deep belief network Active CN109146007B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810768405.8A CN109146007B (en) 2018-07-13 2018-07-13 Solid waste intelligent treatment method based on dynamic deep belief network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810768405.8A CN109146007B (en) 2018-07-13 2018-07-13 Solid waste intelligent treatment method based on dynamic deep belief network

Publications (2)

Publication Number Publication Date
CN109146007A CN109146007A (en) 2019-01-04
CN109146007B 2021-08-27

Family

ID=64800535

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810768405.8A Active CN109146007B (en) 2018-07-13 2018-07-13 Solid waste intelligent treatment method based on dynamic deep belief network

Country Status (1)

Country Link
CN (1) CN109146007B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110108672B (en) * 2019-04-12 2021-07-06 南京信息工程大学 Aerosol extinction coefficient inversion method based on deep belief network
CN111366123B (en) * 2020-03-06 2021-03-26 大连理工大学 Part surface roughness and cutter wear prediction method based on multi-task learning

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107269335A (en) * 2017-07-28 2017-10-20 浙江大学 The rubbish and gas combustion-gas vapor combined cycle system of a kind of use combustion gas garbage drying
CN107729988A (en) * 2017-09-30 2018-02-23 北京工商大学 Blue-green alga bloom Forecasting Methodology based on dynamic depth confidence network

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107247260B (en) * 2017-07-06 2019-12-03 合肥工业大学 A kind of RFID localization method based on adaptive depth confidence network
CN108197427B (en) * 2018-01-02 2020-09-04 山东师范大学 Protein subcellular localization method and device based on deep convolutional neural network

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107269335A (en) * 2017-07-28 2017-10-20 浙江大学 The rubbish and gas combustion-gas vapor combined cycle system of a kind of use combustion gas garbage drying
CN107729988A (en) * 2017-09-30 2018-02-23 北京工商大学 Blue-green alga bloom Forecasting Methodology based on dynamic depth confidence network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
An adaptive learning method of Deep Belief Network by layer generation; Shin Kamada et al.; IEEE; 2017-12-31; pp. 2967-2970 *
A method for dynamically constructing a deep belief network model (一种动态构建深度信念网络模型方法); Wu Qiang et al.; Journal of China Jiliang University (《中国计量大学学报》); 2018-03-15; pp. 64-69 *

Also Published As

Publication number Publication date
CN109146007A (en) 2019-01-04


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant