CN112862173B - Lake and reservoir cyanobacterial bloom prediction method based on a self-organizing deep belief echo state network
- Publication number: CN112862173B (application CN202110126626.7A)
- Authority
- CN
- China
- Prior art keywords: echo state network, sub-reservoir, self-organizing, deep belief
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A20/00—Water conservation; Efficient water supply; Efficient water use
- Y02A20/152—Water filtration
Abstract
The invention discloses a lake and reservoir cyanobacterial bloom prediction method based on a self-organizing deep belief echo state network, belonging to the cross-disciplinary field of cyanobacterial bloom prediction and information science. The method screens the input and output variables with a mutual information criterion, constructs the structure of a deep belief echo state network, designs self-organizing mechanisms for the deep belief network and the echo state network respectively, and obtains a self-organizing deep belief echo state network model after structural optimization, so that lake and reservoir cyanobacterial blooms can be predicted effectively and subsequent treatment facilitated. The method fully learns the deep features of the training data, and its self-organizing mechanism dynamically adjusts the number of hidden-layer neurons and sub-reservoirs. It is suitable for lake and reservoir cyanobacterial bloom data containing outliers such as measurement noise, and improves the accuracy and robustness of the prediction results.
Description
Technical Field
The invention belongs to the cross-disciplinary field of cyanobacterial bloom prediction and information science, and in particular relates to a lake and reservoir cyanobacterial bloom prediction method based on a self-organizing deep belief echo state network.
Background
A cyanobacterial bloom in lakes and reservoirs is a pollution phenomenon in which algae and plankton in eutrophic waters reproduce abnormally fast, so that a blue-green algal layer visible to the naked eye gathers at the surface and thickly covers the water. As municipal and industrial wastewater is continuously discharged into lakes and reservoirs, the concentration of nutrients such as nitrogen and phosphorus keeps rising, providing the environmental basis for bloom outbreaks. In general, factors such as water temperature, wind speed and nutrient levels influence the outbreak of cyanobacterial blooms, so these indicators can serve as the basis for targeted prediction, early warning and treatment. Because the bloom formation process is chaotic, time-series prediction is carried out with the chlorophyll-a concentration as the characteristic output variable and water temperature, nutrients and similar factors as the modeling input variables. Scholars in the environmental and biological fields have studied the formation mechanism of lake and reservoir cyanobacterial blooms extensively, including modeling environmental factors and plankton dynamics, and have captured the basic laws of bloom formation well. Although such mechanistic models are interpretable, bloom evolution is a complex and sensitive nonlinear dynamic process, and building a mechanistic model with satisfactory quantitative prediction accuracy from the existing body of research remains difficult.
As technology develops and data become increasingly accessible, data-driven methods based mainly on machine learning are attracting more and more attention in cyanobacterial bloom prediction. However, existing lake and reservoir cyanobacterial bloom prediction methods still fall short in prediction accuracy and robustness.
Disclosure of Invention
The invention provides a cyanobacterial bloom prediction method based on a self-organizing deep belief echo state network, aiming to overcome the insufficient accuracy and poor robustness of existing lake and reservoir cyanobacterial bloom prediction methods. After the input and output variables are determined, the structure of a deep belief echo state network is constructed, self-organizing mechanisms are designed for the deep belief network and the echo state network respectively, and a self-organizing deep belief echo state network model is obtained after structural optimization, so that lake and reservoir cyanobacterial blooms can be predicted effectively and treated in time.
The invention provides a lake and reservoir cyanobacterial bloom prediction method based on a self-organizing deep belief echo state network, comprising the following four steps:
Step one, determining the input and output variables of the deep belief echo state network model;
According to domain knowledge, the characteristic variable of the lake and reservoir cyanobacterial bloom is taken as the output variable, and the influencing variables are screened from the candidate water quality variables as input variables on the basis of a mutual information criterion.
Step two, establishing the structure of the deep belief echo state network;
The structure comprises a deep belief network and an echo state network; in particular, the echo state network adopts a modular sub-reservoir structure, and its output weight matrix is solved with a robust loss function.
Step three, designing the self-organizing mechanism of the deep belief echo state network and optimizing the network;
After the structure is constructed, an importance index for the neurons is first defined, then self-organizing mechanisms are designed for the deep belief network and the echo state network respectively, and the network is trained and optimized to obtain the self-organizing deep belief echo state network model.
Step four, predicting based on the self-organizing deep belief echo state network model;
The trained self-organizing deep belief echo state network model is used to predict the cyanobacterial bloom.
Compared with other existing methods, the proposed method is both feasible and effective.
The invention has the following advantages:
1. The invention constructs a self-organizing deep belief echo state network model for lake and reservoir cyanobacterial bloom prediction that fully learns the deep features of the training data, thereby predicting blooms effectively.
2. The invention proposes a neuron importance index that measures the importance of neurons, serves as the basis for the self-organizing mechanism design, and aids the training and optimization of the deep belief echo state network.
3. The invention designs a self-organizing mechanism for the deep belief network and for the echo state network respectively, so that the model determines its own network structure during training, realizing dynamic adjustment of the number of hidden-layer neurons and sub-reservoirs.
4. The echo state network part solves its output weight matrix with a robust loss function, so the proposed model is suitable for lake and reservoir cyanobacterial bloom data containing outliers such as measurement noise, and the accuracy and robustness of the prediction results are improved.
Drawings
FIG. 1 is a flow chart of the lake and reservoir cyanobacterial bloom prediction method based on the self-organizing deep belief echo state network provided by the invention;
FIG. 2 is a flow chart of establishing the deep belief echo state network structure in the present invention;
FIG. 3 is a flow chart of establishing and training the self-organizing mechanism of the deep belief echo state network structure in the present invention;
FIG. 4A is a schematic diagram of the mutual information values between the output chlorophyll-a concentration and the lagged input variables in the embodiment of the present invention;
FIG. 4B is a schematic diagram of the mutual information values between the input variables and the chlorophyll-a concentration in the embodiment of the present invention;
FIGS. 5A, 5B and 5C are, respectively, the convergence curve of the number of hidden-layer neurons of the deep belief network within the self-organizing structure, the convergence curve of the reservoir size of the echo state network, and the convergence curve of the training error RMSE of the model during training in the embodiment;
FIG. 6 is a schematic comparison between the lake and reservoir cyanobacterial bloom prediction results of the invention and those of other conventional prediction methods in the embodiment;
FIG. 7 is a comparison of the lake and reservoir cyanobacterial bloom prediction results obtained after adding different proportions of outliers to the training data in the embodiment of the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and examples.
The invention provides a lake and reservoir cyanobacterial bloom prediction method based on a self-organizing deep belief echo state network, which comprises a deep belief network and an echo state network. To predict blooms well, the structure of the echo state network must be optimized effectively and the features of the input variables refined in a targeted way. The deep belief network is a deep neural network model based on an energy function; it can avoid poor local minima and performs well on time-series prediction problems. The method uses the unsupervised learning process of the deep belief network to extract deep features of the time-series input variables, then models these features with the echo state network to predict the chlorophyll-a concentration at the next time step. This improves the model's ability to process temporal information and facilitates cyanobacterial bloom prediction.
To address the structural design problem of the neural network, the invention defines a neuron importance index with the mutual information method, from which the importance index of hidden-layer neurons and the importance index of sub-reservoirs are defined respectively, and dynamic adjustment of the number of hidden-layer neurons and sub-reservoirs is realized through the designed self-organizing mechanisms. In addition, a robust loss function is used to solve the output weight matrix of the echo state network, improving its robustness. As a result, the proposed prediction method performs well and robustly on time-series data containing outliers such as measurement noise, is suitable for modeling and predicting real lake and reservoir cyanobacterial blooms, and can support prediction and early warning of bloom outbreaks.
The invention provides a lake and reservoir cyanobacterial bloom prediction method based on a self-organizing deep belief echo state network; its flow is shown in FIG. 1 and mainly comprises the following four steps:
Step one, determining the input and output variables of the deep belief echo state network model;
To construct the self-organizing deep belief echo state network model, the input and output variables are first determined. In this embodiment, the output variable is the chlorophyll-a concentration, and the input variables are screened from the multiple water quality variables that influence bloom formation in lakes and reservoirs. The invention takes the mutual information method as the screening criterion. Mutual information measures the degree of interdependence between two variables and can describe their nonlinear correlation: the larger the mutual information value, the stronger the correlation. By computing the mutual information between each candidate water quality variable and the output variable, suitable input variables can be screened according to the required prediction accuracy, speed and other conditions. Here, a candidate water quality variable is selected as an input variable when its mutual information value with the output variable exceeds a set threshold (e.g., 0.2); otherwise it is eliminated. The screened input variables and the output variable then jointly participate in the training and prediction of the model.
Step two, establishing a structure of a deep confidence echo state network;
the self-organizing deep confidence echo state network model is composed of a deep confidence network based on limited Boltzmann machine stacking and a modular echo state network based on a sub-reserve pool. The deep confidence echo state network model firstly extracts deep features of input variables through a conventional deep confidence network. The limited Boltzmann machine is a basic unit forming a deep belief network, and comprises two layers of neurons, wherein one layer is a visible layer and is used for inputting variables; the other layer is a hidden layer for extracting deep features of the input variables. In particular, the deep confidence network part of the deep confidence echo state network is formed by stacking two limited Boltzmann machines. Specifically, as shown in fig. 2, the structure for establishing the deep confidence echo state includes the following steps:
inputting the input variable into a deep belief network, carrying out unsupervised learning through a contrast divergence method, and training the deep belief network to extract deep features of the input variable.
Inputting the deep features output by the hidden layer of the deep belief network into an echo state network, initializing the weight matrix of the deep features and the weight matrix of a sub-reserve pool by the echo state network, and collecting an internal state matrix.
The echo state network in the deep confidence echo state network is an echo state network based on a sub-reserve pool. The echo state network not only can meet the echo state characteristics, but also can reduce the complexity of parameter setting. The reserve pool in the echo state network without output feedback in the deep confidence echo state network comprises a plurality of sub reserve pools, and each sub reserve pool is mutually independent, so that the decoupling of partial neurons in the reserve pool is ensured.
Let the number of sub-reservoirs in the original reservoir be $N_{total}$ and let each sub-reservoir contain $n_{sub}$ neurons. The weight matrix $W^*_{res}$ of the reservoir formed by the $N_{total}$ sub-reservoirs is then block-diagonal:

$$W^*_{res} = \mathrm{diag}(W_1, W_2, \ldots, W_{N_{total}}) \qquad (1)$$

where $W_i$ ($1 \le i \le N_{total}$) is the weight matrix of the $i$-th sub-reservoir, generated by singular value decomposition, i.e. $W_i = U_i S_i V_i$. The diagonal matrix $S_i$ is generated randomly from a given singular value distribution, and each sub-reservoir matrix is fully connected internally. Since $n_{sub}$ is the size of every sub-reservoir, all sub-reservoir weight matrices are $n_{sub} \times n_{sub}$. $U_i = (u_{pk})$ and $V_i = (v_{pk})$ are two simultaneously generated random orthogonal matrices with $u_{pk}, v_{pk} \in (-1, 1)$, $p = 1, 2, \ldots, n_{sub}$, $k = 1, 2, \ldots, n_{sub}$.
The mathematical expressions of the sub-reservoir-based echo state network are:

$$x_i(n) = f_{res}\big(W^{in}_i u(n) + W_i\, x_i(n-1)\big) \qquad (2)$$

$$y(n) = W^{out} x(n) \qquad (3)$$

where $u(n)$ is the $K \times 1$ input vector at time $n$, i.e. the deep features extracted by the deep belief network at time $n$, and $K$ is the number of neurons in the last hidden layer of the deep belief network; $x_i(n)$ is the $n_{sub} \times 1$ state vector of the $i$-th sub-reservoir at time $n$, and $x(n)$ is the concatenation of the $N_{total}$ sub-reservoir states; $y(n)$ is the output value of the echo state network at time $n$. $W^{in}_i$ is the $n_{sub} \times K$ input weight matrix of the $i$-th sub-reservoir, and $W^{out}$ is the $1 \times (N_{total} \times n_{sub})$ output weight matrix. $f_{res}$ is the activation function of the reservoir neurons, taken as the sigmoid function.
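Equations (1) to (3) can be sketched as follows, assuming small illustrative sizes. The random orthogonal factors come from QR decompositions, and the singular values are drawn below 1 so the echo state property holds; both are implementation choices, not details fixed by the patent:

```python
import numpy as np

rng = np.random.default_rng(42)
N_total, n_sub, K = 3, 5, 4          # sub-reservoir count, size, input dimension

def random_orthogonal(n):
    # QR decomposition of a Gaussian matrix gives a random orthogonal factor
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

def make_sub_reservoir():
    # W_i = U_i S_i V_i with singular values drawn in (0.1, 0.9)
    s = np.diag(rng.uniform(0.1, 0.9, n_sub))
    return random_orthogonal(n_sub) @ s @ random_orthogonal(n_sub)

# Block-diagonal reservoir weight matrix, eq. (1)
W_res = np.zeros((N_total * n_sub, N_total * n_sub))
for i in range(N_total):
    W_res[i*n_sub:(i+1)*n_sub, i*n_sub:(i+1)*n_sub] = make_sub_reservoir()

W_in = rng.uniform(-1.0, 1.0, (N_total * n_sub, K))   # stacked input weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def update(x, u):
    # eq. (2) for all sub-reservoirs at once; the block-diagonal W_res
    # keeps the sub-reservoir states mutually decoupled
    return sigmoid(W_in @ u + W_res @ x)

x = np.zeros(N_total * n_sub)
for _ in range(10):                   # wash out the initial transient
    x = update(x, rng.uniform(-1.0, 1.0, K))
```

The readout of eq. (3) is then a single matrix product of a trained $1 \times (N_{total} \times n_{sub})$ weight vector with the concatenated state `x`.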
Here, to overcome the effect of the initial transient, the internal states from time $n_{min}+1$ to time $L_{train}$ are collected into the internal state matrix $H = [x(n_{min}+1), \ldots, x(L_{train})]^T$, whose corresponding desired output vector is $T = [t(n_{min}+1), \ldots, t(L_{train})]^T$, where $t(n_{min}+1)$ is the desired output value at time $n_{min}+1$.
In addition, to overcome the ill-conditioned solutions that outliers such as measurement noise may cause, and to improve the robustness of the prediction, the output weight matrix is solved iteratively with a robust loss function containing L2 regularization.
Initialize the iteration counter $k = 1$ for solving the output weight matrix and initialize the robust weight matrix as an identity matrix. In each iteration, compute the robust loss function and the robust residual scale estimate, update the robust weight matrix according to the robust weight function, and recompute the output weight matrix. The regularized robust loss function $E(k)$ and the output weight matrix $W^{out[k]}$ at iteration $k$ are:

$$E(k) = \sum_{n=n_{min}+1}^{L_{train}} \rho\!\left(\frac{\xi^{[k]}(n)}{s^{[k]}}\right) + C\,\big\|W^{out[k]}\big\|^2 \qquad (4)$$

$$W^{out[k]} = \big(H^T Q^{[k]} H + C I\big)^{-1} H^T Q^{[k]} T \qquad (5)$$

where $C$ is the regularization coefficient, $I$ is the $(N_{total} \times n_{sub}) \times (N_{total} \times n_{sub})$ identity matrix, $\|\cdot\|$ is the 2-norm, $\rho(\cdot)$ is the robust objective function, and $\xi^{[k]}(n) = T(n) - y^{[k]}(n)$ is the training error at time $n$ in iteration $k$. The residual robust scale estimate at iteration $k$ is

$$s^{[k]} = \frac{\mathrm{MAR}^{[k]}}{0.6745} \qquad (6)$$

where MAR is the median absolute deviation of the residuals.
$Q^{[k]}$ denotes the $(L_{train}-n_{min}) \times (L_{train}-n_{min})$ diagonal robust weight matrix with entries $w(\xi^{[k]}(n)/s^{[k]})$, where $w(\cdot)$ is the robust weight function. In the invention the Welsch function is taken as the robust weight function; the robust objective function $\rho(\cdot)$ and the robust weight function $w(\cdot)$ are then:

$$\rho(z) = \frac{k_{set}^2}{2}\Big(1 - \exp\!\big(-(z/k_{set})^2\big)\Big) \qquad (7)$$

$$w(z) = \exp\!\big(-(z/k_{set})^2\big) \qquad (8)$$

where $z$ is the function variable and $k_{set} = \mu k_{def}$, with $\mu$ a robustness coefficient chosen empirically; for the Welsch function, the coefficient is $k_{def} = 2.985$.
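The iterative robust solve of equations (5) to (8) can be sketched on synthetic data; the matrix sizes, the regularization constant, the fixed iteration count and the injected outlier are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic state matrix H and desired outputs T with one gross outlier
H = rng.uniform(-1.0, 1.0, (200, 6))
w_true = rng.uniform(-1.0, 1.0, 6)
T = H @ w_true
T[50] += 10.0                       # simulated detection-noise outlier

C = 1e-6                            # regularization coefficient
k_set = 1.0 * 2.985                 # k_set = mu * k_def with mu = 1 (Welsch)

# k = 1: robust weight matrix initialized as the identity
w_out = np.linalg.solve(H.T @ H + C * np.eye(6), H.T @ T)
for _ in range(20):
    xi = T - H @ w_out                                # residuals xi[k]
    mar = np.median(np.abs(xi - np.median(xi)))       # median absolute residual
    s = max(mar / 0.6745, 1e-12)                      # robust scale, eq. (6)
    q = np.exp(-(xi / (k_set * s)) ** 2)              # Welsch weights, eq. (8)
    Q = np.diag(q)
    w_out = np.linalg.solve(H.T @ Q @ H + C * np.eye(6),   # eq. (5)
                            H.T @ Q @ T)
```

Because the Welsch weight decays to zero for large scaled residuals, the outlying sample is effectively excluded after the first reweighting, while the inliers keep weights near one.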
Step three, designing the self-organizing mechanisms of the deep belief network and the echo state network and training them;
The invention designs a self-organizing mechanism and a corresponding training process for the deep belief network and for the echo state network respectively. That is, on the basis of step two, the hidden-layer neurons of the deep belief network and the sub-reservoirs of the echo state network are adjusted in every iteration of their respective training processes.
As shown in FIG. 3, for the hidden-layer neurons of the deep belief network, first initialize the iteration counter $k_1 = 1$, train the weight matrices of the deep belief network by contrastive divergence, and compute the importance index of the neurons in each layer. At any iteration $k_1$, the importance index $NI_j^l(k_1)$ of the $j$-th neuron in layer $l$ is defined as:

$$NI_j^l(k_1) = MI\big(v_j^l, h_j^l\big)\, MI\big(h_j^l, T\big) \qquad (9)$$

where $v_j^l$ and $h_j^l$ are the input and output of the $j$-th neuron of layer $l$, $MI(v_j^l, h_j^l)$ is the mutual information value between them, and $MI(h_j^l, T)$ is the mutual information value between $h_j^l$ and the desired output vector $T$. For the deep belief network part, the self-organizing process of the hidden-layer neurons includes splitting and pruning; the specific mechanism based on neuron importance is as follows.
(1) Splitting mechanism of hidden-layer neurons: at iteration $k_1$, the higher $NI_j^l(k_1)$, the more active the neuron is in processing information. The invention therefore splits the most active neuron in the hidden layer, i.e. the $j$-th neuron of layer $l$ is split into two neurons when it satisfies:

$$NI_j^l(k_1) = \max_{1 \le j' \le M_l(k_1)} NI_{j'}^l(k_1) \qquad (10)$$

where $M_l(k_1)$ is the total number of neurons in layer $l$ at iteration $k_1$.
(2) Pruning mechanism of hidden-layer neurons: when $NI_j^l(k_1)$ is low, the neuron processes information weakly and should be considered for deletion. The invention therefore defines an adaptive pruning threshold $NI_{th}^l(k_1)$ at iteration $k_1$ as:

$$NI_{th}^l(k_1) = \frac{\beta}{M_l(k_1)} \sum_{j'=1}^{M_l(k_1)} NI_{j'}^l(k_1) \qquad (11)$$

where $\beta \in (0, 1]$. The $j$-th neuron is then deleted when it satisfies $NI_j^l(k_1) < NI_{th}^l(k_1)$.
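The split-and-prune decision can be sketched as below. The adaptive threshold is taken here as β times the layer's mean importance, an assumption standing in for the patent's exact expression, and all names are illustrative:

```python
def split_and_prune(importance, beta=0.5):
    """One self-organizing step for a hidden layer.

    `importance` maps neuron index -> importance value NI; beta is in (0, 1].
    Returns the index of the split candidate (the most active neuron) and
    the list of indices falling below the adaptive pruning threshold.
    """
    most_active = max(importance, key=importance.get)          # split candidate
    ni_th = beta * sum(importance.values()) / len(importance)  # adaptive threshold
    to_prune = [j for j, ni in importance.items()
                if ni < ni_th and j != most_active]
    return most_active, to_prune
```

For example, with importances {0: 0.9, 1: 0.05, 2: 0.4, 3: 0.02} and β = 0.5, neuron 0 is marked for splitting and neurons 1 and 3 fall below the threshold and are pruned.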
After the iterative training of the number of hidden-layer neurons and the weight matrices of the deep belief network, the iterative training of the number of sub-reservoirs and the output weight matrix of the echo state network can be carried out. Take the output vector of the last hidden layer of the trained deep belief network as the input of the echo state network, initialize its iteration counter $k_2 = 1$, define the control parameter vector, and randomly generate temporary reservoir weights and temporary input weights of the same size as the original reservoir. The specific sub-reservoir screening and growing mechanisms of the echo state network are as follows:
(1) The screening mechanism of the child reserve pool: the invention defines the importance index of the ith sub-reserve pool in the reserve poolComprises the following steps:
whereinIs the input vector of the p-th neuron of the ith sub-reservoir,and the output vector of the p-th neuron of the ith sub-reservoir. Thus, training to k in iterations 2 Randomly generating temporary sub-reserves (1, 2, \8230; i, … ,i max (k 2 ) Sorting according to the size of the importance indexes:the invention defines the self-adaptive screening threshold as follows:
S_th(k_2) = NS'_sub(INT(α · i_max(k_2)))    (12)
where INT(·) is the integer function and NS'_sub is the sorted vector of sub-reservoir importance indexes. α ∈ (0, 1) is a user-defined control parameter that controls the degree of sub-reservoir screening in each cycle. The parameter may take several values, which together form a control parameter vector, subject to α_1 < α_2 < … < α_{N_α}, where N_α is the dimension of the control parameter vector.
The training goal of the echo state network is to minimize the robust loss function of Equation (4). To ensure that the screened reservoir performs at least as well as the sub-reservoir set before screening, at iteration step k_2 the ith sub-reservoir must satisfy the following condition:
and when the robust loss function E(k_2) of all sub-reservoirs satisfying the condition is less than or equal to the historical minimum of the robust loss function, those sub-reservoirs are retained as the new reservoir and the remaining sub-reservoirs are deleted. The screened sub-reservoirs are then taken as the temporary reservoir and the training error is calculated.
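A minimal sketch of this screening step, assuming the importance indexes have already been computed; the retention test against the historical loss minimum is omitted, and the indexing convention for the threshold of Equation (12) is an assumption:

```python
import numpy as np

def screen_subreservoirs(importances, alpha=0.5):
    """Sub-reservoir screening after Eq. (12): sort the i_max candidate
    sub-reservoirs by importance (descending) and keep those whose importance
    is at least the adaptive threshold S_th = NS'_sub(INT(alpha * i_max)).
    Returns the indices of the retained sub-reservoirs."""
    order = np.argsort(importances)[::-1]   # indices sorted by descending importance
    sorted_imp = importances[order]         # NS'_sub, the sorted importance vector
    i_max = len(importances)
    cut = max(1, int(alpha * i_max))        # INT(alpha * i_max), kept >= 1
    threshold = sorted_imp[cut - 1]         # S_th(k2), assuming 1-based indexing
    return order[sorted_imp >= threshold].tolist()

print(screen_subreservoirs(np.array([0.2, 0.9, 0.5, 0.7, 0.1]), alpha=0.6))
```

With α = 0.6 and five candidates, the three most important sub-reservoirs survive the screening.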
(2) Growing mechanism for sub-reservoirs: after screening, sub-reservoir growth is performed. The temporary reservoir is taken as the new reservoir and merged with a newly, randomly generated sub-reservoir; the output weight matrix of the merged echo state network is then:
where H_o is the state matrix corresponding to the reservoir after the screening mechanism completes and H_g is the state matrix corresponding to the grown reservoir; the identity matrix in the expression has a dimension determined by the total number of grown sub-reservoirs. Further, based on Equation (14), the updated mathematical expression of the merged output weight matrix is:
where I_o is an identity matrix of dimension (N_o × n_sub) × (N_o × n_sub), N_o is the number of sub-reservoirs after the screening mechanism completes, I_g is an identity matrix of dimension n_sub × n_sub, and I_L is an identity matrix of dimension (L_train − n_min) × (L_train − n_min).
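The recomputation of the output weights over the merged states can be sketched as follows. This is a plain L2-regularized least-squares readout; Equation (15) additionally carries the robust weight matrix and the block identity matrices above, which are omitted here, and the placement of the regularization coefficient C follows one common convention rather than the patent's exact formula:

```python
import numpy as np

def grown_output_weights(H_o, H_g, T, C=1e-7):
    """After merging a newly generated sub-reservoir, recompute the ESN output
    weights from the concatenated state matrix [H_o | H_g] by ridge regression.
    H_o: states of the screened reservoir, H_g: states of the grown sub-reservoir,
    T: desired outputs. The robust weighting of Eq. (15) is omitted for brevity."""
    H = np.hstack([H_o, H_g])                        # states of kept + grown pools
    gram = H.T @ H + C * np.eye(H.shape[1])          # regularized Gram matrix
    return np.linalg.solve(gram, H.T @ T)            # updated output weight matrix

rng = np.random.default_rng(0)
H_o = rng.standard_normal((50, 10))   # 50 time steps, 10 kept reservoir states
H_g = rng.standard_normal((50, 5))    # one grown sub-reservoir of size 5
T = rng.standard_normal((50, 1))
W = grown_output_weights(H_o, H_g, T)
print(W.shape)
```

The output weight matrix grows with the merged state dimension, here from 10 to 15 columns of reservoir state.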
The self-organizing deep confidence echo state network model is thus obtained.
Predicting based on the self-organizing deep confidence echo state network model;
Through the design of the self-organizing mechanism, the self-organizing deep confidence echo state network model automatically learns, during training, an appropriate number of hidden-layer neurons for the deep belief network and an appropriate number of sub-reservoirs for the echo state network, while simultaneously solving the weight matrices of each neural network. Feeding the input variables into the trained model yields a prediction of the characterization index of lake and reservoir cyanobacterial blooms, namely the chlorophyll-a concentration.
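The forward pass of the trained model described above can be sketched as follows. All weight shapes are illustrative, bias terms and the reservoir's state feedback over multiple time steps are simplified, and sigmoid activations are used for both the deep belief network layers and the reservoir (the sigmoid reservoir activation is stated in claim 1):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def sdbesn_predict(u, dbn_weights, W_in, W_res, W_out, x_prev=None):
    """Forward pass of the trained model: the deep belief network extracts deep
    features from the input, the (block-diagonal) reservoir maps them to an
    internal state, and the output weights read off the chlorophyll-a
    prediction. Shapes are illustrative; biases are omitted."""
    h = u
    for W in dbn_weights:                    # DBN feature extraction (sigmoid layers)
        h = sigmoid(W @ h)
    if x_prev is None:
        x_prev = np.zeros(W_res.shape[0])    # zero previous reservoir state
    x = sigmoid(W_in @ h + W_res @ x_prev)   # reservoir state update
    return float(W_out @ x)                  # predicted chlorophyll-a

rng = np.random.default_rng(2)
dbn = [rng.standard_normal((7, 8)), rng.standard_normal((6, 7))]  # 8-7-6 front end
W_in = rng.uniform(-1, 1, (120, 6))          # input weights in [-1, 1]
W_res = 0.1 * rng.standard_normal((120, 120))
W_out = rng.standard_normal(120)
y = sdbesn_predict(rng.standard_normal(8), dbn, W_in, W_res, W_out)
print(isinstance(y, float))
```

The 8-7-6-120-1 shapes mirror the structure reported in the embodiment below, but the random weights here are placeholders for trained ones.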
The technical solution of the present invention is further illustrated by the following examples.
The first embodiment is as follows:
the embodiment provides a lake and reservoir cyanobacterial bloom prediction method based on a self-organizing deep confidence echo state network, which specifically comprises the following implementation steps:
step one, determining an input variable and an output variable of a prediction model;
The data in this embodiment come from the water quality data set of West Falmouth Harbor, USA. The data set contains 6 water quality variables; Table 1 gives the abbreviation, unit and meaning of each variable in the data set.
TABLE 1 Water quality variables information
The sampling interval of the data is 20 minutes, and acquisition runs from 18:01 on June 6, 2017 to 13:21 on August 31, 2017, giving 2491 groups of data in total. To overcome the influence of redundant indices on the modeling effect, the experiment measures the correlation between each water quality variable and the output variable, the chlorophyll-a concentration, by their mutual information value. The experiment considers not only the correlation of the water quality variables but also the autoregressive character of the chlorophyll-a concentration time series. As seen in Fig. 4A, the mutual information value of the lagged chlorophyll-a variable gradually decreases as the lag time increases. Fig. 4B shows the mutual information values of the 5 water quality variables with the chlorophyll-a concentration at the next moment. The experiment selects the water quality variables whose mutual information value exceeds 0.2. The input variables of the self-organizing deep confidence echo state network are therefore the water temperature, salinity, oxygen saturation, specific conductivity, the chlorophyll-a concentration at the current moment and the chlorophyll-a concentrations at the three lagged moments, and the output variable is the chlorophyll-a concentration at the next moment. That is, there are 8 input variables and 1 output variable.
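The mutual-information screening can be sketched as follows. The patent does not state which MI estimator is used, so a simple histogram estimate is assumed here, and the variable names and synthetic series are purely illustrative:

```python
import numpy as np

def mutual_info(x, y, bins=16):
    """Binned (histogram) estimate of mutual information I(X;Y) in nats.
    The patent does not specify its MI estimator; this is one simple choice."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0  # avoid log(0) on empty bins
    return float((pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz])).sum())

def select_inputs(candidates, target, threshold=0.2):
    """Keep candidate variables whose MI with next-step chlorophyll-a exceeds
    the 0.2 threshold used in the embodiment."""
    return [name for name, series in candidates.items()
            if mutual_info(series, target) > threshold]

rng = np.random.default_rng(1)
chl = rng.standard_normal(2000)                 # stand-in chlorophyll-a series
candidates = {
    "water_temp": chl + 0.3 * rng.standard_normal(2000),  # informative (synthetic)
    "unrelated": rng.standard_normal(2000),               # independent noise
}
print(select_inputs(candidates, chl))
```

Only the informative synthetic variable clears the 0.2 threshold; the independent one is removed, mirroring the screening described above.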
Step two, establishing a structure of a deep confidence echo state network;
In the experiment for predicting cyanobacterial blooms in lakes and reservoirs, the self-organizing deep confidence echo state network discards the first 200 data points before collecting states into the state matrix; the training data length is 1600 and the test data length is 691. The hidden layers of the deep belief network part are initialized to 3-3, the number of training iterations is 50, the learning batch size is 50, the learning rate is 0.1, and β is 0.98. The elements of the input weight matrix of the echo state network are initialized in the range [-1, 1], the singular values of the diagonal matrix in the SVD are taken from [0.1, 0.99], the sub-reservoir size is 5, the regularization coefficient C is 1e-7, the robust coefficient μ is 1, the number of iterations for solving the output weight matrix is 15, the number of iterations of the reservoir self-organizing process is 50, and the control parameter vector is (0.5, 0.6, 0.7, 0.8, 0.9).
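Generating one sub-reservoir weight matrix W_i = U_i S_i V_i with singular values in [0.1, 0.99], as the parameters above prescribe, might look like this. Sampling the orthogonal factors via QR decomposition is an assumption; the patent only requires U_i and V_i to be random orthogonal matrices:

```python
import numpy as np

def make_subreservoir(n_sub=5, s_min=0.1, s_max=0.99, rng=None):
    """Generate one sub-reservoir weight matrix W_i = U_i S_i V_i: S_i is
    diagonal with singular values drawn from [s_min, s_max], and U_i, V_i are
    random orthogonal matrices (sampled here via QR, one common approach)."""
    rng = rng or np.random.default_rng(0)
    S = np.diag(rng.uniform(s_min, s_max, size=n_sub))   # prescribed singular values
    U, _ = np.linalg.qr(rng.uniform(-1, 1, (n_sub, n_sub)))
    V, _ = np.linalg.qr(rng.uniform(-1, 1, (n_sub, n_sub)))
    return U @ S @ V

W = make_subreservoir()
sv = np.linalg.svd(W, compute_uv=False)
print(bool(sv.max() < 1.0))
```

Because every singular value lies below 1, the spectral norm of each W_i is below 1, which helps guarantee the echo state property of the block-diagonal reservoir.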
Designing and optimizing a self-organizing mechanism and a training process of the deep confidence echo state network;
The self-organizing processes of the hidden layers and the reservoir in the self-organizing deep confidence echo state network are shown in Fig. 5A and Fig. 5B. In Fig. 5A, the neurons of the first hidden layer H1 and the second hidden layer H2 finally stabilize at 7 and 6, respectively, so the final hidden-layer structure is 7-6. During reservoir-size learning, the number of training iterations is set to 100. As shown in Fig. 5B, the reservoir size iteratively converges to 120 under the self-organizing mechanism, containing 24 sub-reservoirs in total. The structure of the self-organizing deep confidence echo state network in this experiment is therefore 8-7-6-120-1. Fig. 5C shows the convergence curve of the root mean square error (RMSE) during training; the training error finally converges to near its minimum value of 0.383.
Predicting based on the self-organizing deep confidence echo state network model;
FIG. 6 compares the lake and reservoir cyanobacterial bloom predictions of the self-organizing deep confidence echo state network with those of other echo state network methods. Relative to the other echo state network models, the self-organizing deep confidence echo state network (SDBMESN) provided by the embodiment of the invention effectively learns the evolution of cyanobacterial blooms in lakes and reservoirs. Table 2 lists the comprehensive training and test performance (neural network structure and RMSE index) of the basic echo state network (OESN), the regularized echo state network (RESN), the growing echo state network (GESN), the adaptive regularized echo state network (DRESN) and the deep confidence echo state network (DBESN). The lake and reservoir cyanobacterial bloom prediction method based on the self-organizing deep confidence echo state network thus achieves high prediction accuracy and good generalization. At the same time, the reservoir of the self-organizing deep confidence echo state network is smaller than those of the other echo state networks, giving it the simplest neural network structure. In each set of experiments the DBESN uses the same neural network structure as the self-organizing deep confidence echo state network; even so, with identical structures, the prediction performance of the DBESN remains lower than that of the proposed method.
The self-organizing mechanism not only simplifies the structure but also, during self-organization, retains the neurons and sub-reservoirs with relatively better performance, so that the neurons and reservoir of the self-organizing deep confidence echo state network achieve better prediction and a stronger ability to process dynamic information. The self-organizing deep confidence echo state network is therefore well suited to predicting cyanobacterial blooms in lakes and reservoirs.
TABLE 2 Cyanobacterial bloom prediction results and comparison of different methods
The self-organizing deep confidence echo state network of this embodiment takes the robust loss function as its objective function, which improves the robustness of time-series prediction against outliers such as monitoring noise. To verify this property, impulse-function noise in proportions of 10% to 40% was added to the training samples of the embodiment data set. The test results are shown in FIG. 7: the robustness of the self-organizing deep confidence echo state network of the embodiment is clearly superior to that of the other echo state networks.
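The Welsch weighting named in claim 1 is what produces this robustness: it smoothly downweights large residuals so that impulsive outliers barely influence the output-weight solution. The exact formulas appear only as images in the source, so the sketch below assumes the standard Welsch form with the document's constants (k_def = 2.985, robust coefficient μ = 1):

```python
import numpy as np

K_DEF = 2.985  # default tuning constant stated in claim 1

def welsch_rho(z, mu=1.0):
    """Welsch robust objective rho(z) with k_set = mu * K_DEF (standard form,
    assumed; the patent's formula is an image in the source)."""
    k = mu * K_DEF
    return (k**2 / 2.0) * (1.0 - np.exp(-((z / k) ** 2)))

def welsch_weight(z, mu=1.0):
    """Matching weight function w(z) = rho'(z) / z = exp(-(z/k)^2)."""
    k = mu * K_DEF
    return np.exp(-((z / k) ** 2))

# Small residuals keep weight near 1; large (outlier) residuals are
# driven toward zero weight.
print(round(float(welsch_weight(0.1)), 3), round(float(welsch_weight(10.0)), 3))
```

In the iterative solution of the output weights, these weights populate the diagonal robust weight matrix, so samples corrupted by impulse noise contribute almost nothing to the regularized least-squares fit.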
Claims (2)
1. A lake and reservoir cyanobacterial bloom prediction method based on a self-organizing deep confidence echo state network comprises the following steps:
determining an input variable and an output variable of a deep confidence echo state network;
the input variables take the mutual information method as the judgment criterion: when the mutual information value between a candidate water quality variable and the output variable is greater than the set threshold of 0.2, the candidate water quality variable is selected as an input variable; otherwise it is removed;
the output variable is the characteristic variable chlorophyll a concentration of the lake and reservoir cyanobacterial bloom;
the screened input variables and output variables participate in the training and prediction of the deep confidence echo state network;
step two, establishing a structure of a deep confidence echo state network;
the structure of the deep confidence echo state network comprises a deep belief network and an echo state network, wherein the echo state network adopts a modular reservoir structure; the method comprises the following specific steps:
2.1, adopting a restricted Boltzmann machine as the basic unit of the deep belief network, and extracting deep features of the input variables;
2.2, learning the deep layer characteristics and predicting the chlorophyll a concentration at the next moment by an echo state network;
the reservoir in the deep confidence echo state network structure comprises a plurality of sub-reservoirs, each sub-reservoir being independent; the number of sub-reservoirs is set to N_total, and the reservoir weight matrix formed by the N_total sub-reservoirs is a block diagonal matrix, namely:
wherein each block W_i is the weight matrix corresponding to the ith sub-reservoir, 1 ≤ i ≤ N_total;
W_i is generated by singular value decomposition, i.e. W_i = U_i S_i V_i;
the diagonal matrix S_i is randomly generated from a given singular value distribution, and the weight matrix inside each sub-reservoir is fully connected, p = 1, 2, …, n_sub, where n_sub is the size of the ith sub-reservoir;
U_i and V_i are two random orthogonal matrices generated simultaneously, where u_pk, v_pk ∈ (−1, 1), p = 1, 2, …, n_sub, k = 1, 2, …, n_sub;
to overcome the effect of the initial transient, the internal state matrix H = [x(n_min+1), …, x(L_train)]^T is collected from time n_min+1 to time L_train; its corresponding desired output vector is T = [t(n_min+1), …, t(L_train)]^T, where t(n_min+1) is the desired output value at time n_min+1;
the output weight matrix is solved iteratively using a robust loss function containing L2 regularization; the robust loss function E(k) with the regularization term at iteration step k and the solution of the output weight matrix are respectively:
wherein C is the regularization coefficient,
I is an identity matrix of dimension (N_total × n_sub) × (N_total × n_sub),
ρ(·) is the robust objective function, ξ^[k](n) = T(n) − y^[k](n) is the training error at time n at iteration step k, the residual robust scale estimate at iteration step k is computed from MAR, the median absolute deviation,
the robust weight matrix is of dimension (L_train − n_min) × (L_train − n_min), and w(·) is the robust weight function, taken as the Welsch function;
the robust objective function ρ (-) and the robust weighting function w (-) are respectively:
wherein z is the variable, k_set = μ·k_def, μ is the robust coefficient, and k_def = 2.985;
Designing a self-organization mechanism of the deep confidence echo state network and training the deep confidence echo state network;
the adjustment of the hidden-layer neurons of the deep belief network and of the sub-reservoirs in the echo state network is realized in each iteration of the respective training processes;
for each hidden-layer neuron of the deep belief network, the importance index of a neuron at iteration step k_1 is defined as follows:
(3.1) splitting mechanism for hidden-layer neurons: when the jth neuron of the l-th layer satisfies the following condition:
the jth neuron splits into two neurons, where the normalizing term in the condition is the total number of layer-l neurons at iteration step k_1;
(3.2) pruning mechanism for hidden-layer neurons: the adaptive pruning threshold at iteration step k_1 is defined as follows:
the self-organizing mechanism of the echo state network comprises sub-reservoir screening and growing mechanisms; specifically,
(3.3) screening mechanism for sub-reservoirs: the importance index of the ith sub-reservoir in the reservoir is defined as:
where the two quantities in the definition are the input vector and the output vector of the pth neuron of the ith sub-reservoir; at iteration step k_2, i_max(k_2) temporary sub-reservoirs {1, 2, …, i_max(k_2)} consistent with the structure of the original reservoir are randomly generated and sorted by the size of their importance indexes; the adaptive screening threshold is defined as follows:
S_th(k_2) = NS'_sub(INT(α · i_max(k_2)))    (12)
wherein INT(·) is the integer function, NS'_sub is the sorted vector of sub-reservoir importance indexes, and α ∈ (0, 1) is a user-defined control parameter;
when, at iteration step k_2, the ith sub-reservoir satisfies the following condition:
and the robust loss function E(k_2) of all such sub-reservoirs is less than or equal to the historical minimum of the robust loss function, these sub-reservoirs are retained and the remaining sub-reservoirs are deleted;
predicting based on the self-organizing deep confidence echo state network model;
predicting the cyanobacterial bloom by using the self-organizing deep confidence echo state network model;
the method is characterized in that:
the mathematical expression of the echo state network is as follows:
u(n) is the K × 1 input vector at time n, where K is the number of neurons in the last hidden layer of the deep belief network, i.e. u(n) is the deep feature of the deep belief network at time n;
y (n) is the output value of the echo state network at the moment n;
the input weight matrix of the ith sub-reservoir is of dimension n_sub × K, and the output weight matrix is of dimension 1 × (N_total × n_sub);
f_res is the activation function of the reservoir neurons, taken as a sigmoid function;
(3.4) growing mechanism for sub-reservoirs: the screened sub-reservoirs are merged with a newly, randomly generated sub-reservoir, and the weight matrix of the output vector of the merged echo state network is:
H_o is the state matrix corresponding to the reservoir after the screening mechanism completes;
H_g is the state matrix corresponding to the grown reservoir;
the identity matrix in the expression has a dimension determined by the total number of merged and grown sub-reservoirs;
I_o is an identity matrix of dimension (N_o × n_sub) × (N_o × n_sub);
N_o is the number of sub-reservoirs after the screening mechanism completes;
I_g is an identity matrix of dimension n_sub × n_sub;
I_L is an identity matrix of dimension (L_train − n_min) × (L_train − n_min).
2. The lake and reservoir cyanobacterial bloom prediction method based on the self-organizing deep confidence echo state network according to claim 1, characterized in that: the input variables of the deep confidence echo state network structure further include the chlorophyll-a concentrations at the three lagged moments.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110126626.7A CN112862173B (en) | 2021-01-29 | 2021-01-29 | Lake and reservoir cyanobacterial bloom prediction method based on self-organizing deep confidence echo state network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112862173A CN112862173A (en) | 2021-05-28 |
CN112862173B true CN112862173B (en) | 2022-10-11 |
Family
ID=75986842
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110126626.7A Active CN112862173B (en) | 2021-01-29 | 2021-01-29 | Lake and reservoir cyanobacterial bloom prediction method based on self-organizing deep confidence echo state network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112862173B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114282639B (en) * | 2021-12-24 | 2024-02-02 | 上海应用技术大学 | Water bloom early warning method based on chaos theory and BP neural network |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105510546B (en) * | 2015-12-27 | 2017-06-16 | 北京工业大学 | A kind of biochemical oxygen demand (BOD) BOD intelligent detecting methods based on self-organizing Recurrent RBF Neural Networks |
CN107506857B (en) * | 2017-08-14 | 2020-05-08 | 北京工商大学 | Urban lake and reservoir cyanobacterial bloom multivariable prediction method based on fuzzy support vector machine |
CN108416460B (en) * | 2018-01-19 | 2022-01-28 | 北京工商大学 | Blue algae bloom prediction method based on multi-factor time sequence-random depth confidence network model |
CN109886454B (en) * | 2019-01-10 | 2021-03-02 | 北京工业大学 | Freshwater environment bloom prediction method based on self-organizing deep belief network and related vector machine |
CN111860306B (en) * | 2020-07-19 | 2024-06-14 | 陕西师范大学 | Electroencephalogram signal denoising method based on width depth echo state network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||