CN112862173B - Lake and reservoir cyanobacterial bloom prediction method based on self-organizing deep confidence echo state network - Google Patents


Info

Publication number
CN112862173B
CN112862173B
Authority
CN
China
Prior art keywords
sub-reservoir, echo state network, self-organizing, deep
Prior art date
Legal status
Active
Application number
CN202110126626.7A
Other languages
Chinese (zh)
Other versions
CN112862173A (en)
Inventor
张慧妍
胡博
王小艺
王立
孙茜
王昭洋
Current Assignee
Beijing Technology and Business University
Original Assignee
Beijing Technology and Business University
Priority date
Filing date
Publication date
Application filed by Beijing Technology and Business University
Priority to CN202110126626.7A
Publication of CN112862173A
Application granted
Publication of CN112862173B
Status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10: Services
    • G06Q50/26: Government or public services
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A20/00: Water conservation; Efficient water supply; Efficient water use
    • Y02A20/152: Water filtration


Abstract

The invention discloses a lake and reservoir cyanobacterial bloom prediction method based on a self-organizing deep confidence echo state network, belonging to the technical field at the intersection of cyanobacterial bloom prediction and information science. The method screens the input and output variables with a mutual information method, constructs the structure of a deep confidence echo state network, designs self-organization mechanisms for the deep confidence network and the echo state network respectively, and obtains a self-organizing deep confidence echo state network model after optimizing the structure with these self-organization mechanisms, so as to predict lake and reservoir cyanobacterial blooms effectively and facilitate their subsequent treatment. The method fully learns the deep features of the training data and, through the self-organizing mechanism designed for the deep confidence echo state network, dynamically adjusts the number of hidden-layer neurons and sub-reservoirs; it is suited to lake and reservoir cyanobacterial bloom data containing outliers such as detection noise, and can improve the precision and robustness of the prediction results.

Description

Lake and reservoir cyanobacterial bloom prediction method based on self-organizing deep confidence echo state network
Technical Field
The invention belongs to the technical field of blue algae water bloom prediction and information science cross fusion, and particularly relates to a lake and reservoir blue algae water bloom prediction method based on a self-organizing deep confidence echo state network.
Background
Cyanobacterial bloom in lakes and reservoirs refers to the pollution phenomenon in which algae and plankton in eutrophic lakes and reservoirs reproduce abnormally fast, so that a large amount of blue-green algal scum visible to the naked eye gathers on the surface layer of the water body and thickly covers the water surface. As urban and industrial wastewater is continuously discharged into lakes and reservoirs, the content of nutrients such as nitrogen and phosphorus in the water body grows higher and higher, which provides the environmental foundation for outbreaks of cyanobacterial blooms. In general, factors such as water temperature, wind speed and nutrients influence the outbreak of cyanobacterial blooms in lakes and reservoirs; these indexes can therefore provide a basis for targeted prediction, early warning and treatment of cyanobacterial blooms. The generation process of lake and reservoir cyanobacterial blooms has a chaotic character, and time series prediction is carried out with the chlorophyll-a concentration as the characterization output variable and water temperature, nutrients and the like as modeling input variables. Scholars in the environmental and biological fields have extensively researched the formation mechanism of lake and reservoir cyanobacterial blooms, including the modeling of environmental factors and plankton dynamics, and have captured the basic laws of bloom generation well. Although mechanism models have good interpretability, the evolution of lake and reservoir cyanobacterial blooms is a complex nonlinear dynamic process with a certain sensitivity, and it is difficult to establish a mechanism model of ideal quantitative prediction precision on the basis of existing research.
With the development of technology, the accessibility of data keeps improving, and the application of data-driven methods, mainly machine learning algorithms, in the field of cyanobacterial bloom prediction is attracting more and more attention. However, existing lake and reservoir cyanobacterial bloom prediction methods still fall short in prediction precision and robustness.
Disclosure of Invention
The invention provides a cyanobacterial bloom prediction method based on a self-organizing deep confidence echo state network, which aims to effectively solve the problems of insufficient precision and poor robustness of existing lake and reservoir cyanobacterial bloom prediction methods. After determining the input and output variables, the structure of a deep confidence echo state network is constructed, self-organization mechanisms are designed for the deep confidence network and the echo state network respectively, and a self-organizing deep confidence echo state network model is obtained after optimizing the structure with these mechanisms, so as to effectively predict lake and reservoir cyanobacterial blooms and facilitate their subsequent treatment.
The invention provides a lake and reservoir cyanobacterial bloom prediction method based on a self-organizing deep confidence echo state network, which comprises the following four steps:
Step one, determining an input variable and an output variable of the deep confidence echo state network model;
The characteristic variable of the lake and reservoir cyanobacterial bloom is determined as the output variable according to domain knowledge, and the influence variables of the bloom are screened out of the candidate water quality variables as input variables on the basis of the mutual information method.
Step two, establishing a structure of a deep confidence echo state network;
the method comprises the steps of constructing a structure of a deep confidence echo state network, wherein the structure comprises a deep confidence network and an echo state network, and particularly, the echo state network adopts a modularized sub-reserve pool structure and adopts a robust loss function to solve an output weight matrix of the echo state network.
Step three, designing a self-organization mechanism of the deep confidence echo state network and optimizing the network;
after the structure of the deep confidence echo state network is constructed, firstly, the importance index of a neuron is defined, then, respective self-organization mechanisms of the deep confidence network and the echo state network are respectively designed, and the deep confidence echo state network is trained and optimized to obtain a self-organization deep confidence echo state network model.
Step four, predicting based on the self-organizing deep confidence echo state network model;
and predicting the cyanobacterial bloom by using the self-organizing deep confidence echo state network model.
Compared with other methods in the prior art, the method provided by the invention has the advantages of feasibility and effectiveness.
The invention has the advantages that:
1. the invention constructs a self-organizing deep belief echo state network model for forecasting the lake and reservoir cyanobacterial bloom, and can fully learn the deep characteristics of training data, thereby realizing the effective forecasting of the lake and reservoir cyanobacterial bloom.
2. The invention provides a neuron importance index for measuring the importance degree of neurons, and the neuron importance index is used as the basis of self-organizing mechanism design and is beneficial to training and optimizing the deep confidence echo state network.
3. The invention designs a self-organization mechanism for the deep belief network and the echo state network respectively, so that the deep belief echo state network model can automatically determine the network structure in the training process, and the dynamic adjustment of the number of hidden layer neurons and sub reserve pools is realized.
4. In the invention, the echo state network part utilizes a robust loss function to solve the output weight matrix. Therefore, the proposed self-organization deep confidence echo state network model is suitable for lake and reservoir cyanobacterial bloom data containing abnormal values such as detection noise and the like, and can improve the accuracy and robustness of a prediction result.
Drawings
FIG. 1 is a flow chart of a lake and reservoir cyanobacterial bloom prediction method based on a self-organizing deep confidence echo state network provided by the invention;
FIG. 2 is a flow chart of establishing a deep confidence echo state network structure according to the present invention;
FIG. 3 is a flow chart of the establishment and training optimization of the self-organizing mechanism of the deep confidence echo state network structure in the present invention;
FIG. 4A is a schematic diagram of the mutual information values between the output chlorophyll-a concentration and the lagged input variables in the embodiment of the present invention;
FIG. 4B is a schematic diagram of the mutual information values between the input variables and the chlorophyll-a concentration in the embodiment of the present invention;
fig. 5A, 5B, and 5C are a convergence graph of the number of neurons in the hidden layer of the deep belief network in the self-organizing deep belief echo state network structure, a convergence graph of the size of the reservoir of the echo state network, and a training error RMSE convergence graph of the deep belief echo state network model in the training process of the embodiment, respectively;
FIG. 6 is a schematic diagram showing the comparison between the predicted result of cyanobacterial bloom in lakes and reservoirs and the results of other conventional prediction methods in the embodiment of the present invention;
FIG. 7 is a comparison diagram of the predicted result of cyanobacterial bloom in lakes and reservoirs obtained by adding abnormal values with different proportions to the training data in the embodiment of the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and examples.
The invention provides a lake and reservoir cyanobacterial bloom prediction method based on a self-organizing deep confidence echo state network, where the self-organizing deep confidence echo state network comprises a deep belief network and an echo state network. To better predict lake and reservoir cyanobacterial blooms, the structure of the echo state network must be effectively optimized, and the features of the input variables must be refined in a targeted way. The deep belief network is a deep neural network model based on an energy function; it can overcome the defect of local minima and performs well in time series prediction problems. The method uses the unsupervised learning process of the deep belief network to extract the deep features of the time series data in the input variables, then adopts the echo state network to model these deep features and predict the chlorophyll-a concentration at the next moment. This improves the capacity of the self-organizing deep confidence echo state network model to process time series information and facilitates the prediction of cyanobacterial blooms.
In order to solve the optimization design problem of the neural network structure, the invention defines a neuron importance index by adopting a mutual information method, further defines the importance index of the hidden layer neuron and the importance index of the sub reserve pool respectively, and realizes the dynamic adjustment of the number of the hidden layer neuron and the sub reserve pool by designing a self-organization mechanism. In addition, the invention also adds a robust loss function for solving the output weight matrix of the echo state network so as to improve the robustness of the echo state network. Therefore, the prediction method provided by the invention has good prediction performance and good robustness on time sequence data containing abnormal values such as detection noise and the like, is suitable for modeling and prediction of practical lake and reservoir cyanobacterial bloom, and can provide prediction and early warning support for outbreak of the lake and reservoir cyanobacterial bloom.
The invention provides a lake and reservoir cyanobacteria bloom prediction method based on a self-organizing deep confidence echo state network, the flow of which is shown in figure 1, and the method mainly comprises the following four steps:
Step one, determining an input variable and an output variable of the deep confidence echo state network model;
the self-organizing deep confidence echo state network model is constructed by respectively determining an input variable and an output variable of the deep confidence echo state network model. In this embodiment, the output variable is determined as the concentration of chlorophyll a, and the input variable needs to be screened from a plurality of water quality variables affecting the generation of blue-green algae in lakes and reservoirs. The invention takes a mutual information method as a judgment criterion for screening input variables. Mutual information is used as a method for measuring the degree of interdependence between two variables, and can describe the nonlinear correlation of the two variables. When the mutual information value between the variables is larger, the correlation between the variables is higher. By respectively calculating mutual information values of the candidate water quality variables and the output variables, proper water quality variables can be screened as input variables according to the conditions of prediction precision requirements, speed and the like. Here, when the mutual information value of the candidate water quality variable and the output variable is greater than a set threshold (e.g., 0.2), the candidate water quality variable is selected as the input variable, otherwise, the candidate water quality variable is eliminated. And the screened input variables and the screened output variables participate in the training and prediction of the deep confidence echo state network model together.
Step two, establishing a structure of a deep confidence echo state network;
the self-organizing deep confidence echo state network model is composed of a deep confidence network based on limited Boltzmann machine stacking and a modular echo state network based on a sub-reserve pool. The deep confidence echo state network model firstly extracts deep features of input variables through a conventional deep confidence network. The limited Boltzmann machine is a basic unit forming a deep belief network, and comprises two layers of neurons, wherein one layer is a visible layer and is used for inputting variables; the other layer is a hidden layer for extracting deep features of the input variables. In particular, the deep confidence network part of the deep confidence echo state network is formed by stacking two limited Boltzmann machines. Specifically, as shown in fig. 2, the structure for establishing the deep confidence echo state includes the following steps:
The input variables are fed into the deep belief network, unsupervised learning is carried out by the contrastive divergence method, and the deep belief network is trained to extract deep features of the input variables.
Inputting the deep features output by the hidden layer of the deep belief network into an echo state network, initializing the weight matrix of the deep features and the weight matrix of a sub-reserve pool by the echo state network, and collecting an internal state matrix.
The echo state network in the deep confidence echo state network is an echo state network based on sub-reservoirs. Such an echo state network can both satisfy the echo state property and reduce the complexity of parameter setting. The reservoir of this echo state network (which has no output feedback) comprises multiple sub-reservoirs, each mutually independent, which guarantees the decoupling of parts of the neurons in the reservoir.
Let the number of sub-reservoirs in the original reservoir be $N_{total}$, and let each sub-reservoir contain $n_{sub}$ neurons. The weight matrix $W^{*}_{res}$ of the reservoir formed by the $N_{total}$ sub-reservoirs is then a block-diagonal matrix, i.e.,

$$ W^{*}_{res} = \mathrm{diag}(W_1, W_2, \ldots, W_{N_{total}}) \tag{1} $$

where $W_i$ ($1 \le i \le N_{total}$) is the weight matrix of the $i$-th sub-reservoir. $W_i$ is generated by singular value decomposition, i.e. $W_i = U_i S_i V_i$, where the diagonal matrix $S_i = \mathrm{diag}(s_1, s_2, \ldots, s_{n_{sub}})$ is generated randomly from a given singular-value distribution, and the sub-reservoir matrix is fully connected internally. $n_{sub}$ is the size of the $i$-th sub-reservoir, i.e. every sub-reservoir weight matrix in the invention is an $n_{sub} \times n_{sub}$ matrix. $U_i = (u_{pk})$ and $V_i = (v_{pk})$ are two random orthogonal matrices generated simultaneously, with $u_{pk}, v_{pk} \in (-1, 1)$, $p = 1, 2, \ldots, n_{sub}$, $k = 1, 2, \ldots, n_{sub}$.
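The block-diagonal reservoir construction described above can be sketched as follows. This is a hedged illustration: the QR-based generation of the orthogonal factors and the uniform singular-value distribution below 1 are assumptions (the patent only requires W_i = U_i S_i V_i with singular values drawn from a given distribution), and the function names are illustrative.

```python
import numpy as np

def make_sub_reservoir(n_sub, max_singular=0.9, rng=None):
    """One sub-reservoir weight matrix W_i = U_i @ S_i @ V_i with a prescribed singular spectrum."""
    rng = rng or np.random.default_rng(0)
    # Random orthogonal factors U_i, V_i from the QR decomposition of Gaussian matrices.
    U, _ = np.linalg.qr(rng.standard_normal((n_sub, n_sub)))
    V, _ = np.linalg.qr(rng.standard_normal((n_sub, n_sub)))
    # Singular values drawn from (0, max_singular) so the spectrum stays below 1.
    S = np.diag(rng.uniform(0.0, max_singular, n_sub))
    return U @ S @ V

def make_reservoir(n_total, n_sub, rng=None):
    """Block-diagonal reservoir weight matrix built from n_total independent sub-reservoirs."""
    rng = rng or np.random.default_rng(0)
    blocks = [make_sub_reservoir(n_sub, rng=rng) for _ in range(n_total)]
    W = np.zeros((n_total * n_sub, n_total * n_sub))
    for i, B in enumerate(blocks):
        W[i*n_sub:(i+1)*n_sub, i*n_sub:(i+1)*n_sub] = B
    return W
```

Because the blocks are placed on the diagonal, the sub-reservoirs remain mutually decoupled, and the singular values of the full matrix are simply the union of the blocks' singular values.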
The mathematical expressions of the echo state network based on sub-reservoirs are:

$$ x_i(n) = f_{res}\left( W_{in}^{i}\, u(n) + W_i\, x_i(n-1) \right) \tag{2} $$

$$ y(n) = W_{out}\, x(n) \tag{3} $$

where $u(n)$ is the $K \times 1$ input vector at time $n$, i.e. the deep features extracted by the deep belief network at time $n$, and $K$ is the number of neurons in the last hidden layer of the deep belief network; $x(n) = [x_1(n), x_2(n), \ldots, x_{N_{total}}(n)]$, where $x_i(n)$ is the $1 \times n_{sub}$ state vector of the $i$-th sub-reservoir at time $n$; and $y(n)$ is the output value of the echo state network at time $n$. $W_{in}$ is the input weight matrix, where $W_{in}^{i}$ is the $n_{sub} \times K$ input weight matrix of the $i$-th sub-reservoir, and $W_{out}$ is the $1 \times (N_{total} \times n_{sub})$ output weight matrix. $f_{res}$ is the activation function of the reservoir neurons, taken as the sigmoid function.
Here, to overcome the effect of the initial transient, the internal state matrix $H = [x(n_{min}+1), \ldots, x(L_{train})]^{T}$ is collected from time $n_{min}+1$ to time $L_{train}$; its corresponding desired output vector is $T = [t(n_{min}+1), \ldots, t(L_{train})]^{T}$, where $t(n_{min}+1)$ is the desired output value at time $n_{min}+1$.
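The state update and transient-discarding state collection above can be illustrated with a short sketch. The sigmoid reservoir activation and the discarding of the first n_min states follow the text; the zero initial state and the function name are assumptions.

```python
import numpy as np

def esn_collect_states(W_res, W_in, U, n_min=10):
    """Run the reservoir x(n) = f_res(W_in u(n) + W_res x(n-1)) over inputs U (T x K)
    and collect the internal state matrix H after discarding the first n_min transients."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    steps, _ = U.shape
    x = np.zeros(W_res.shape[0])
    states = []
    for n in range(steps):
        x = sigmoid(W_in @ U[n] + W_res @ x)
        if n >= n_min:            # overcome the effect of the initial transient
            states.append(x.copy())
    return np.asarray(states)     # H has (steps - n_min) rows of reservoir states
```

The returned matrix plays the role of $H$ in the output-weight solution that follows.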
In addition, to overcome the ill-conditioned solution problem that outliers such as detection noise may cause, and to improve the robustness of prediction, the output weight matrix $W_{out}$ is solved iteratively with a robust loss function containing L2 regularization. The iteration counter of the output-weight solution is initialized to $k = 1$ and the robust weight matrix is initialized to the identity matrix; in each iteration, the robust loss function and the residual robust scale estimate are calculated, the robust weight matrix is updated according to the robust weight function, and the output weight matrix is recalculated. At iteration $k$, the robust loss function $E(k)$ combining the regularization term and the solution of the output weight matrix $W_{out}^{[k]}$ are, respectively:

$$ E(k) = \sum_{n=n_{min}+1}^{L_{train}} \rho\!\left( \frac{\xi^{[k]}(n)}{\hat{s}^{[k]}} \right) + \frac{1}{2C} \left\| W_{out}^{[k]} \right\|^{2} \tag{4} $$

$$ W_{out}^{[k]} = \left( H^{T} \Omega^{[k]} H + \frac{I}{C} \right)^{-1} H^{T} \Omega^{[k]} T \tag{5} $$

where $C$ is the regularization coefficient, $I$ is the $(N_{total} \times n_{sub}) \times (N_{total} \times n_{sub})$ identity matrix, $\|\cdot\|$ is the 2-norm, $\rho(\cdot)$ is the robust objective function, $\xi^{[k]}(n) = T(n) - y^{[k]}(n)$ is the training error at time $n$ in iteration $k$, and $\hat{s}^{[k]}$ is the residual robust scale estimate at iteration $k$, computed from MAR, the median absolute deviation of the residuals. $\Omega^{[k]}$ is the $(L_{train} - n_{min}) \times (L_{train} - n_{min})$ robust weight matrix, whose entries are given by the robust weight function $w(\cdot)$. In the invention the Welsch function is taken as the robust weight function; its robust objective function $\rho(\cdot)$ and robust weight function $w(\cdot)$ are, respectively:

$$ \rho(z) = \frac{k_{set}^{2}}{2} \left[ 1 - \exp\!\left( -\left( \frac{z}{k_{set}} \right)^{2} \right) \right] \tag{6} $$

$$ w(z) = \exp\!\left( -\left( \frac{z}{k_{set}} \right)^{2} \right) \tag{7} $$

where $z$ is the function variable and $k_{set} = \mu k_{def}$, with $\mu$ the robust coefficient. The robust weight function is selected empirically; with the Welsch function chosen as the robust weight function, the coefficient is $k_{def} = 2.985$.
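The iteratively reweighted solution sketched by equations (4) to (7) can be illustrated as follows, under stated assumptions: the fixed iteration count, the MAD-based scale estimate with the conventional 0.6745 consistency factor, and the function names are not specified by the patent; the Welsch weight $w(z) = \exp(-(z/k_{set})^2)$ and the ridge-type solve $(H^T \Omega H + I/C)^{-1} H^T \Omega T$ follow the text.

```python
import numpy as np

def welsch_weight(residual, k_set):
    """Welsch robust weight w(z) = exp(-(z / k_set)^2)."""
    return np.exp(-(residual / k_set) ** 2)

def robust_output_weights(H, T, C=100.0, mu=1.0, k_def=2.985, n_iter=10):
    """Iteratively reweighted ridge solve of the ESN output weights:
    W = (H^T Omega H + I/C)^(-1) H^T Omega T, with Omega diagonal Welsch weights."""
    n_samples, n_features = H.shape
    w = np.ones(n_samples)                      # robust weights start as the identity
    for _ in range(n_iter):
        Omega = np.diag(w)
        W = np.linalg.solve(H.T @ Omega @ H + np.eye(n_features) / C,
                            H.T @ Omega @ T)
        xi = T - H @ W                          # residuals at this iteration
        # Robust scale from the median absolute deviation (0.6745 factor assumed).
        s = np.median(np.abs(xi - np.median(xi))) / 0.6745 or 1.0
        w = welsch_weight(xi / s, mu * k_def)
    return W
```

Because the Welsch weight decays to zero for large scaled residuals, gross outliers in the training targets are effectively ignored after a few reweighting passes.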
Step three, designing self-organization mechanisms for the deep confidence network and the echo state network and training them;
the invention respectively designs a self-organization mechanism and a corresponding training process aiming at a deep confidence network and an echo state network. Namely, in the step two, on the basis of the step two, the adjustment of the hidden layer neuron of the deep confidence network and the neutron reserve pool of the echo state network is respectively realized in each iteration of the respective training process.
As shown in FIG. 3, for the hidden-layer neurons of the deep belief network, the iterative training counter is first initialized to $k_1 = 1$, the weight matrix of the deep belief network is trained by the contrastive divergence method, and the importance index of the neurons of each layer is calculated. At any iteration $k_1$, the importance index $NI_{j}^{l}(k_1)$ of the $j$-th neuron of the $l$-th layer is defined as:

$$ NI_{j}^{l}(k_1) = MI\!\left( h_{in,j}^{l}, h_{out,j}^{l} \right) \cdot MI\!\left( h_{out,j}^{l}, T \right) \tag{8} $$

where $h_{in,j}^{l}$ and $h_{out,j}^{l}$ are, respectively, the input and the output of the $j$-th neuron of the $l$-th layer, $MI(h_{in,j}^{l}, h_{out,j}^{l})$ is the mutual information value between them, and $MI(h_{out,j}^{l}, T)$ is the mutual information value between $h_{out,j}^{l}$ and the desired output vector $T$. For the deep belief network part, the self-organization process of the hidden-layer neurons includes splitting and deletion; the specific self-organization mechanism based on neuron importance is as follows.
(1) Splitting mechanism of hidden-layer neurons: at iteration $k_1$, the higher $NI_{j}^{l}(k_1)$ is, the more active the neuron is in processing information. The invention therefore splits the most active neuron in the hidden layer; that is, when the $j$-th neuron of the $l$-th layer satisfies the condition

$$ NI_{j}^{l}(k_1) = \max_{1 \le j' \le n_{l}(k_1)} NI_{j'}^{l}(k_1) \tag{9} $$

the $j$-th neuron is split into two neurons, where $n_{l}(k_1)$ is the total number of neurons of the $l$-th layer at iteration $k_1$.

(2) Pruning mechanism of hidden-layer neurons: when $NI_{j}^{l}(k_1)$ is low, the neuron processes information weakly and should be considered for deletion. The invention therefore defines the adaptive pruning threshold at iteration $k_1$ as:

$$ NI_{th}^{l}(k_1) = \beta \cdot \frac{1}{n_{l}(k_1)} \sum_{j=1}^{n_{l}(k_1)} NI_{j}^{l}(k_1) \tag{10} $$

where $\beta \in (0, 1]$. According to the above formula, when the $j$-th neuron satisfies $NI_{j}^{l}(k_1) < NI_{th}^{l}(k_1)$, the $j$-th neuron is deleted.
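The split-and-prune bookkeeping can be sketched as follows. This is hedged: the patent states that the most active neuron is split and that neurons below an adaptive threshold parameterized by β in (0, 1] are pruned; the mean-based threshold used here is an assumption, as are the function and variable names.

```python
def self_organize_layer(importances, beta=0.5):
    """One self-organization pass over a hidden layer: split the most active neuron
    and prune neurons whose importance falls below the adaptive threshold
    beta * mean(importance). Returns the index to split and the indices to prune.
    (The mean-based threshold is an assumption; the patent only fixes beta in (0, 1].)"""
    n = len(importances)
    split_idx = max(range(n), key=lambda j: importances[j])
    threshold = beta * sum(importances) / n
    prune_idx = [j for j in range(n)
                 if importances[j] < threshold and j != split_idx]
    return split_idx, prune_idx
```

In a training loop, the returned indices would drive the duplication of the split neuron's weights and the removal of the pruned neurons before the next contrastive-divergence pass.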
After the iterative training of the number of hidden-layer neurons and of the weight matrix of the deep belief network, the iterative training of the number of sub-reservoirs and of the output weight matrix of the echo state network can be carried out. The output vector of the last hidden layer of the trained deep belief network is taken as the input of the echo state network; the iteration counter of the echo state network is initialized to $k_2 = 1$, the control parameter vector is user-defined, and temporary reservoir weights and temporary input weights consistent with the original reservoir size are randomly generated. The specific sub-reservoir screening and growing mechanisms of the echo state network are as follows:
(1) Screening mechanism of the sub-reservoirs: the invention defines the importance index $NS_{sub}^{i}$ of the $i$-th sub-reservoir in the reservoir as:

$$ NS_{sub}^{i} = \frac{1}{n_{sub}} \sum_{p=1}^{n_{sub}} MI\!\left( x_{in,p}^{i}, x_{out,p}^{i} \right) \tag{11} $$

where $x_{in,p}^{i}$ is the input vector of the $p$-th neuron of the $i$-th sub-reservoir and $x_{out,p}^{i}$ is the output vector of the $p$-th neuron of the $i$-th sub-reservoir. Thus, at iteration $k_2$, temporary sub-reservoirs $(1, 2, \ldots, i_{max}(k_2))$ are randomly generated and sorted by the size of the importance index:

$$ NS'_{sub}(1) \ge NS'_{sub}(2) \ge \cdots \ge NS'_{sub}(i_{max}(k_2)) $$

The invention defines the adaptive screening threshold as:

$$ S_{th}(k_2) = NS'_{sub}\!\left( INT(\alpha\, i_{max}(k_2)) \right) \tag{12} $$

where $INT(\cdot)$ is the rounding (integer) function and $NS'_{sub}$ is the sorted sub-reservoir importance vector. $\alpha \in (0, 1)$ is a user-defined control parameter, which controls the screening degree of the sub-reservoirs in each cycle. The parameter may take several values $\alpha_1, \alpha_2, \ldots, \alpha_{N_{\alpha}}$, which together form the control parameter vector $[\alpha_1, \alpha_2, \ldots, \alpha_{N_{\alpha}}]$, subject to $\alpha_1 < \alpha_2 < \cdots < \alpha_{N_{\alpha}}$, where $N_{\alpha}$ is the dimension of the control parameter vector.
The training goal of the echo state network is to minimize the robust loss function of equation (4). To ensure that the performance of the screened reservoir is kept at least as good as that of the sub-reservoir set before screening, at iteration $k_2$ the $i$-th sub-reservoir is considered when it satisfies the condition:

$$ NS_{sub}^{i} \ge S_{th}(k_2) \tag{13} $$

When the robust loss function value $E(k_2)$ of all the sub-reservoirs satisfying the condition is less than or equal to the minimum of the historical robust loss function values, these sub-reservoirs are retained as the new reservoir and the remaining sub-reservoirs are deleted. The screened sub-reservoirs are taken as the temporary reservoir and the training error is calculated.
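The screening rule of equations (11) to (13) can be sketched as follows. The descending sort, the INT(α·i_max) rank threshold, and the retention test NS ≥ S_th follow the text; the robust-loss comparison against the historical minimum is omitted for brevity, and the names are illustrative.

```python
def screen_sub_reservoirs(importances, alpha):
    """Sub-reservoir screening: sort importance indices in descending order, take the
    adaptive threshold S_th = NS'(INT(alpha * i_max)), and keep the sub-reservoirs
    whose importance reaches the threshold (the robust-loss check is omitted here)."""
    i_max = len(importances)
    ranked = sorted(importances, reverse=True)   # NS'_sub, descending
    cut = max(int(alpha * i_max), 1)             # INT(alpha * i_max), used as a 1-based rank
    s_th = ranked[cut - 1]
    return [i for i, ns in enumerate(importances) if ns >= s_th]
```

A smaller α keeps fewer, more important sub-reservoirs; sweeping α over the control parameter vector reproduces the different screening degrees mentioned in the text.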
(2) Growth mechanism of the sub-reservoirs: after screening, the sub-reservoirs are grown; the temporary reservoir is taken as the new reservoir and merged with a newly generated random sub-reservoir, whereupon the output weight matrix of the merged echo state network is:

W_out^merge = (H_m^T H_m + (1/C)·I_m)^{−1} H_m^T T,  with H_m = [H_o, H_g]  (14)

wherein H_o is the state matrix corresponding to the reservoir after the screening mechanism has completed, H_g is the state matrix corresponding to the grown sub-reservoir, H_m is the state matrix after merging, and I_m is the (N_m×n_sub)×(N_m×n_sub)-dimensional identity matrix, N_m being the total number of sub-reservoirs after growth. Based on equation (14), the updated mathematical expression of the merged output weight matrix W_out^merge is:

W_out^merge = [W_o − (H_o^T H_o + I_o/C)^{−1} H_o^T H_g W_g ; W_g]  (15)

with W_g = (H_g^T P H_g + I_g/C)^{−1} H_g^T P T and P = I_L − H_o (H_o^T H_o + I_o/C)^{−1} H_o^T, wherein W_o is the output weight matrix of the screened reservoir, I_o is the (N_o×n_sub)×(N_o×n_sub)-dimensional identity matrix, N_o is the number of sub-reservoirs after the screening mechanism has completed, I_g is the n_sub×n_sub-dimensional identity matrix, and I_L is the (L_train−n_min)×(L_train−n_min)-dimensional identity matrix.
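The merge-and-resolve step can be illustrated with a plain regularized least-squares solve over the concatenated state matrix [H_o, H_g]. This is a sketch only: the robust weighting matrix Ψ is omitted, and the sizes and regularization value are illustrative, not the embodiment's:

```python
import numpy as np

rng = np.random.default_rng(0)
L, n_o, n_g, C = 50, 10, 5, 100.0        # samples, kept/grown state widths, reg. coeff.

H_o = rng.standard_normal((L, n_o))      # states of the screened (kept) reservoir
H_g = rng.standard_normal((L, n_g))      # states of the newly grown sub-reservoir
T = rng.standard_normal((L, 1))          # desired output vector

H_m = np.hstack([H_o, H_g])              # merged state matrix [H_o, H_g]
I_m = np.eye(n_o + n_g)
# Regularized least-squares output weights over the merged reservoir
W_out = np.linalg.solve(H_m.T @ H_m + I_m / C, H_m.T @ T)

err_merged = np.linalg.norm(T - H_m @ W_out)
```

An incremental update such as equation (15) avoids refactorizing the merged system from scratch by reusing the quantities already computed for H_o.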
The self-organizing deep confidence echo state network model is thus obtained.
Predicting based on the self-organizing deep confidence echo state network model;
through the design of the self-organizing mechanism, the self-organizing deep confidence echo state network model automatically learns and optimizes, during training, the appropriate number of hidden-layer neurons in the deep confidence network and the number of sub-reservoirs in the echo state network, while simultaneously solving the weight matrix of each neural network. Inputting the input variables into the trained self-organizing deep confidence echo state network model then yields the prediction of the characterization index of lake and reservoir cyanobacterial blooms, namely the chlorophyll a concentration.
The technical solution of the present invention is further illustrated by the following examples.
The first embodiment is as follows:
the embodiment provides a lake and reservoir cyanobacterial bloom prediction method based on a self-organizing deep confidence echo state network, which specifically comprises the following implementation steps:
step one, determining an input variable and an output variable of a prediction model;
the data in this example are derived from the water quality data set of West Falmouth Harbor, USA. The data set contains 6 water quality variables; Table 1 gives the abbreviation, unit and meaning of each variable in the data set.
TABLE 1 Water quality variable information (abbreviation, unit and meaning of each of the six variables in the data set)
The data were sampled every 20 minutes; acquisition starts at 18:01 in June 2017 and ends at 13:21 on 31 August 2017, giving 2,491 groups of data in total. To limit the influence of redundant indices on the modeling effect, the experiment measures the correlation between each water quality variable and the output variable, the chlorophyll a concentration, by its mutual information value. Both the correlation among the water quality variables and the autoregressive character of the chlorophyll a concentration time series are considered. As can be seen from Fig. 4A, the mutual information value of the lagged chlorophyll a variable gradually decreases as the lag time increases. Fig. 4B shows the mutual information values of the 5 water quality variables with respect to the chlorophyll a concentration at the next time step. The experiment selects the water quality variables whose mutual information value exceeds 0.2. The input variables of the self-organizing deep confidence echo state network are therefore the water temperature, salinity, oxygen saturation, specific conductivity and chlorophyll a concentration at the current time, together with the chlorophyll a concentration at the three preceding times, and the output variable is the chlorophyll a concentration at the next time. That is, there are 8 input variables and 1 output variable.
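The mutual-information screening just described can be reproduced in miniature. The histogram estimator below is our choice (the text does not specify one), and the data are synthetic stand-ins for the water quality series:

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram estimate of mutual information I(X;Y) in nats."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)         # marginal of X
    py = pxy.sum(axis=0, keepdims=True)         # marginal of Y
    nz = pxy > 0                                # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(1)
chl = rng.standard_normal(2000)                 # stand-in for chlorophyll-a series
related = chl + 0.1 * rng.standard_normal(2000) # informative candidate variable
noise = rng.standard_normal(2000)               # unrelated candidate variable

mi_related = mutual_information(related, chl)
mi_noise = mutual_information(noise, chl)
# Keep candidates whose MI with the output exceeds the 0.2 threshold used in the text.
selected = [name for name, mi in [("related", mi_related), ("noise", mi_noise)] if mi > 0.2]
```

Only the informative candidate passes the 0.2 cutoff, mirroring how weakly related water quality variables are excluded from the model input.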
Step two, establishing a structure of a deep confidence echo state network;
in the experiment for predicting the cyanobacterial bloom in the lake and reservoir, the self-organizing deep confidence echo state network begins collecting states into the state matrix after a washout run of 200 data points; the training data length is 1600 and the test data length is 691. The hidden layers of the deep confidence network part are initialized to 3-3 neurons, the number of training iterations is 50, the learning batch size is 50, the learning rate is 0.1, and β is 0.98. The elements of the input weight matrix of the echo state network are initialized in the range [−1, 1], the singular values of the diagonal matrix in the SVD are taken in [0.1, 0.99], the size of each sub-reservoir is 5, the regularization coefficient C is 1e-7, the robust coefficient μ is 1, the number of iterations for solving the output weight matrix is 15, the number of iterations of the reservoir self-organizing process is 50, and the uniformly spaced control parameter vector α is (0.5, 0.6, 0.7, 0.8, 0.9).
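The reservoir initialization just parameterized — sub-reservoir weight matrices W_i = U_i S_i V_i with singular values drawn from [0.1, 0.99], assembled into a block-diagonal reservoir — can be sketched as follows. Generating the random orthogonal factors via QR of a Gaussian matrix is our assumption; the text only requires them to be random orthogonal matrices:

```python
import numpy as np

def sub_reservoir_weights(n_sub, s_min=0.1, s_max=0.99, rng=None):
    """W_i = U_i S_i V_i with prescribed singular values in [s_min, s_max]."""
    rng = np.random.default_rng(rng)
    s = rng.uniform(s_min, s_max, n_sub)                      # singular values
    U, _ = np.linalg.qr(rng.standard_normal((n_sub, n_sub)))  # random orthogonal
    V, _ = np.linalg.qr(rng.standard_normal((n_sub, n_sub)))  # random orthogonal
    return U @ np.diag(s) @ V

def block_diagonal_reservoir(n_total, n_sub, rng=None):
    """Block-diagonal reservoir weight matrix: one independent block per sub-pool."""
    rng = np.random.default_rng(rng)
    W = np.zeros((n_total * n_sub, n_total * n_sub))
    for i in range(n_total):
        W[i*n_sub:(i+1)*n_sub, i*n_sub:(i+1)*n_sub] = sub_reservoir_weights(n_sub, rng=rng)
    return W

W_res = block_diagonal_reservoir(n_total=4, n_sub=5, rng=3)
spectral_norm = np.linalg.norm(W_res, 2)   # largest singular value of W_res
```

Keeping every prescribed singular value below 1 bounds the spectral norm of W_res, which supports the echo state property of the reservoir.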
step three, designing and optimizing the self-organizing mechanism and the training process of the deep confidence echo state network;
the self-organizing process of the hidden layers and the reservoir in the self-organizing deep confidence echo state network is shown in Figs. 5A and 5B. In Fig. 5A, the neuron counts of the first hidden layer H1 and the second hidden layer H2 finally stabilize at 7 and 6, respectively, so the final hidden-layer structure is 7-6. During reservoir size learning, the number of training iterations is set to 100. As shown in Fig. 5B, the reservoir size iteratively converges to 120 under the self-organizing mechanism, comprising 24 sub-reservoirs in total. The structure of the self-organizing deep confidence echo state network in this experiment is therefore 8-7-6-120-1. Fig. 5C shows the convergence curve of the root mean square error (RMSE) during training; the training error of the self-organizing deep confidence echo state network finally converges to near its minimum value of 0.383.
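At each time step, the converged 8-7-6-120-1 structure runs a modular-reservoir update: every sub-reservoir is driven by the deep feature u(n) and its own previous state, and a linear readout maps the concatenated states to the output. A small sketch with illustrative sizes (the update form is the standard ESN state equation, which the image-only expressions in the original presumably match):

```python
import numpy as np

def esn_step(u, x_prev, W_in_blocks, W_blocks):
    """One modular state update: x_i(n) = sigmoid(W_in^i u(n) + W_i x_i(n-1))."""
    states = []
    for W_in_i, W_i, x_i in zip(W_in_blocks, W_blocks, x_prev):
        pre = W_in_i @ u + W_i @ x_i                 # n_sub-dim pre-activation
        states.append(1.0 / (1.0 + np.exp(-pre)))    # sigmoid reservoir activation
    return states

rng = np.random.default_rng(5)
K, n_sub, n_total = 8, 5, 3                          # input dim, sub-pool size, #sub-pools
W_in_blocks = [rng.uniform(-1, 1, (n_sub, K)) for _ in range(n_total)]
W_blocks = [0.5 * rng.standard_normal((n_sub, n_sub)) for _ in range(n_total)]
W_out = rng.standard_normal((1, n_total * n_sub))    # linear readout weights

u = rng.standard_normal(K)
x = [np.zeros(n_sub) for _ in range(n_total)]
x = esn_step(u, x, W_in_blocks, W_blocks)
y = float((W_out @ np.concatenate(x))[0])            # y(n) = W_out x(n)
```

With the embodiment's sizes, K = 6, n_sub = 5 and 24 sub-reservoirs would reproduce the 120-unit reservoir of the 8-7-6-120-1 structure.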
step four, predicting based on the self-organizing deep confidence echo state network model;
FIG. 6 compares the lake and reservoir cyanobacterial bloom prediction results of the self-organizing deep confidence echo state network with those of other echo state network methods. Relative to the other echo state network models, the self-organizing deep confidence echo state network (SDBMESN) of the embodiment of the invention effectively learns the evolution law of the cyanobacterial bloom in the lake and reservoir. Table 2 shows the overall performance of the basic echo state network (OESN), the regularized echo state network (RESN), the growing echo state network (GESN), the adaptive regularized echo state network (DRESN) and the deep confidence echo state network (DBESN) in training and testing, including the neural network structure and the RMSE index. The lake and reservoir cyanobacterial bloom prediction method based on the self-organizing deep confidence echo state network thus attains high prediction precision and good generalization capability. At the same time, the reservoir of the self-organizing deep confidence echo state network is smaller than those of the other echo state networks, giving it the simplest neural network structure. In each set of experiments the DBESN uses the same neural network structure as the self-organizing deep confidence echo state network; yet, with identical structures, the prediction performance of the DBESN remains below that of the self-organizing deep confidence echo state network provided by the invention.
The self-organizing mechanism of the self-organizing deep confidence echo state network not only simplifies the structure but also retains, during self-organization, the neurons and sub-reservoirs with relatively better performance, so that the neurons and reservoirs of the network achieve a better prediction effect and an improved ability to process dynamic information. The self-organizing deep confidence echo state network is therefore well suited to predicting lake and reservoir cyanobacterial blooms.
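The retention of relatively important neurons can be made concrete with a toy split/prune pass over a layer's importance indices. The mean/std-based split rule and mean-scaled prune rule below are illustrative stand-ins of ours; the patent's exact thresholds (with its β parameter) appear only in image-form equations:

```python
import numpy as np

def adjust_layer(ni, beta=0.5):
    """One split/prune pass over a layer's neuron importance indices.

    Assumed illustrative rules: split a neuron whose index exceeds the
    layer mean + std; prune a neuron whose index falls below beta * mean.
    """
    ni = np.asarray(ni, dtype=float)
    split = ni > ni.mean() + ni.std()     # high-importance neurons duplicate
    prune = ni < beta * ni.mean()         # low-importance neurons are removed
    keep = ~prune
    # each split neuron contributes one extra neuron in the next iteration
    new_size = int(keep.sum() + split.sum())
    return split, prune, new_size

ni = [0.9, 0.30, 0.28, 0.02, 0.31]        # hypothetical importance indices
split, prune, new_size = adjust_layer(ni)
```

Run over many iterations, such a pass keeps the layer size balanced: strong neurons are duplicated, weak ones are removed, and the count settles (here at 5).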
TABLE 2 Cyanobacterial bloom prediction experiment results and comparison of the different methods
In the self-organizing deep confidence echo state network of this embodiment, using the robust loss function as the objective function improves the robustness of time-series prediction against abnormal values such as monitoring noise. To verify this property, impulse disturbances were added to 10% to 40% of the training samples of the example data set. The test results are shown in Fig. 7: the robustness of the self-organizing deep confidence echo state network of the embodiment of the invention is clearly superior to that of the other echo state networks.
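This robustness experiment can be mimicked in miniature with an iteratively reweighted ridge solve using the Welsch weight function (k_def = 2.985 as in the text; the MAD-based scale with the conventional 0.6745 factor and the remaining constants are our illustrative choices). Despite 10% impulse-corrupted targets, the robust solve recovers the output weights:

```python
import numpy as np

def robust_output_weights(H, T, C=100.0, k_set=2.985, iters=15):
    """Iteratively reweighted ridge solve for the output weights (Welsch weights)."""
    N = H.shape[1]
    Psi = np.eye(H.shape[0])                        # start from unit weights
    W = np.zeros((N, 1))
    for _ in range(iters):
        W = np.linalg.solve(H.T @ Psi @ H + np.eye(N) / C, H.T @ Psi @ T)
        xi = (T - H @ W).ravel()                    # training errors
        mar = np.median(np.abs(xi - np.median(xi))) # median absolute residual
        s = max(mar / 0.6745, 1e-12)                # robust residual scale
        Psi = np.diag(np.exp(-(xi / (s * k_set)) ** 2))  # Welsch weights
    return W

rng = np.random.default_rng(4)
H = rng.standard_normal((300, 8))                   # stand-in reservoir states
w_true = rng.standard_normal((8, 1))
T = H @ w_true + 0.01 * rng.standard_normal((300, 1))
T[:30] += 15.0                                      # 10% impulse-type outliers

W_robust = robust_output_weights(H, T)
err = np.linalg.norm(W_robust - w_true)
```

The outlier residuals get near-zero Welsch weights after the first pass, so the final solve is effectively fitted on the clean samples only.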

Claims (2)

1. A lake and reservoir cyanobacterial bloom prediction method based on a self-organizing deep confidence echo state network comprises the following steps:
step one, determining the input variables and the output variable of the deep confidence echo state network;
the input variables are selected using the mutual information method as the judgment criterion: when the mutual information value between a candidate water quality variable and the output variable is greater than the set threshold of 0.2, the candidate water quality variable is selected as an input variable; otherwise, the candidate water quality variable is removed;
the output variable is the characteristic variable chlorophyll a concentration of the lake and reservoir cyanobacterial bloom;
the screened input variables and output variables participate in the training and prediction of the deep confidence echo state network;
step two, establishing a structure of a deep confidence echo state network;
the structure of the deep confidence echo state network comprises a deep confidence network and an echo state network, wherein the echo state network adopts a modular reserve pool structure; the method comprises the following specific steps:
2.1, adopting a restricted Boltzmann mechanism as a basic unit of a depth confidence network, and extracting deep features of an input variable;
2.2, learning the deep layer characteristics and predicting the chlorophyll a concentration at the next moment by an echo state network;
the reservoir in the deep confidence echo state network structure comprises a plurality of sub-reservoirs, each of which is independent; the number of sub-reservoirs is set to N_total, and the reservoir weight matrix W_res formed by the N_total sub-reservoirs is a block-diagonal matrix, namely:

W_res = diag(W_1, W_2, …, W_{N_total})

wherein each weight matrix element W_i is the weight matrix corresponding to the i-th sub-reservoir, 1 ≤ i ≤ N_total;
W_i is generated by singular value decomposition, i.e. W_i = U_i S_i V_i;
the diagonal matrix S_i = diag(s_1, s_2, …, s_{n_sub}) is generated randomly from a given singular value distribution, and the weight matrix inside the sub-reservoir is fully connected, with s_p the p-th singular value, p = 1, 2, …, n_sub, and n_sub the size of the i-th sub-reservoir;
U_i = (u_pk) and V_i = (v_pk) are two random orthogonal matrices generated simultaneously, where u_pk, v_pk ∈ (−1, 1), p = 1, 2, …, n_sub, k = 1, 2, …, n_sub;
to overcome the effect of the initial transient, the internal state matrix H = [x(n_min+1), …, x(L_train)]^T is collected from time n_min+1 to time L_train, with corresponding desired output vector T = [t(n_min+1), …, t(L_train)]^T, where t(n_min+1) is the desired output value at time n_min+1;
the output weight matrix W_out is solved iteratively using a robust loss function with L2 regularization; the robust loss function E(k) containing the regularization term at iteration step k and the corresponding solution of the output weight matrix W_out^[k] are respectively:

E(k) = (1/(2C))·||W_out^[k]||₂² + Σ_{n=n_min+1}^{L_train} ρ(ξ^[k](n)/ŝ^[k])

W_out^[k] = (H^T Ψ^[k−1] H + (1/C)·I)^{−1} H^T Ψ^[k−1] T

wherein C is the regularization coefficient;
I is the (N_total×n_sub)×(N_total×n_sub)-dimensional identity matrix;
||·||₂ is the 2-norm;
ρ(·) is the robust objective function, and ξ^[k](n) = T(n) − y^[k](n) is the training error at the n-th time at iteration step k;
ŝ^[k] is the robust scale estimate of the residual at iteration step k, obtained from MAR, the median absolute deviation;
Ψ^[k] is the (L_train−n_min)×(L_train−n_min)-dimensional robust weight matrix whose diagonal entries are given by the robust weight function w(·), taken as the Welsch function;
the robust objective function ρ(·) and the robust weight function w(·) are respectively:

ρ(z) = (k_set²/2)·(1 − exp(−z²/k_set²))

w(z) = exp(−z²/k_set²)

wherein z is the variable, k_set = μ·k_def, μ is the robust coefficient, and k_def = 2.985;
step three, designing the self-organizing mechanism of the deep confidence echo state network and training the deep confidence echo state network;
in each iteration of the respective training processes, the hidden-layer neurons of the deep belief network and the sub-reservoirs in the echo state network are adjusted respectively;
for each hidden-layer neuron of the deep belief network, the importance index NI_j^l(k_1) of the j-th neuron of layer l at iteration step k_1 is defined in terms of the following quantities: u_j^l, the input of the j-th neuron of layer l; h_j^l, the output of the j-th neuron of layer l; MI(u_j^l, h_j^l), the mutual information value between u_j^l and h_j^l; and MI(h_j^l, T), the mutual information value between h_j^l and the desired output vector T;
(3.1) splitting mechanism of the hidden-layer neurons: when the importance index of the j-th neuron of layer l satisfies the splitting condition, the j-th neuron splits into two neurons, N_l(k_1) being the total number of neurons of layer l at iteration step k_1;
(3.2) pruning mechanism of the hidden-layer neurons: an adaptive pruning threshold at iteration step k_1 is defined with the parameter β ∈ (0, 1]; when the importance index of the j-th neuron falls below this pruning threshold, the j-th neuron is deleted;
the self-organizing mechanism of the echo state network comprises a sub-reservoir screening mechanism and a sub-reservoir growth mechanism; specifically,
(3.3) screening mechanism of the sub-reservoirs: the importance index NS^i_sub of the i-th sub-reservoir in the reservoir is defined in terms of the mutual information between u_p^i, the input vector of the p-th neuron of the i-th sub-reservoir, and x_p^i, the output vector of the p-th neuron of the i-th sub-reservoir; at training iteration step k_2, i_max(k_2) temporary sub-reservoirs {1, 2, …, i_max(k_2)} consistent with the structure of the original reservoir are randomly generated and sorted according to the size of their importance indices, giving the sorted importance vector NS'_sub;
the adaptive screening threshold is defined as follows:

S_th(k_2) = NS'_sub(INT(α·i_max(k_2)))  (12)

wherein INT(·) is the rounding-to-integer function and α ∈ (0, 1) is a user-defined control parameter;
at the end of iteration step k_2, the i-th sub-reservoir must satisfy the condition:

NS^i_sub ≥ S_th(k_2)  (13)

and when the robust loss function E(k_2) of all sub-reservoirs satisfying this condition is less than or equal to the minimum of the historical robust loss function, these sub-reservoirs are retained and the remaining sub-reservoirs are deleted;
step four, predicting based on the self-organizing deep confidence echo state network model;
the cyanobacterial bloom is predicted using the self-organizing deep confidence echo state network model;
the method is characterized in that:
the mathematical expressions of the echo state network are:

x_i(n) = f_res(W_in^i·u(n) + W_i·x_i(n−1))

y(n) = W_out·x(n)^T

wherein u(n) is the K×1-dimensional input vector at time n, K is the number of neurons of the last hidden layer of the deep belief network, i.e. u(n) is the deep feature of the deep belief network at time n;
x(n) = [x_1(n), x_2(n), …, x_{N_total}(n)], and x_i(n) is the 1×n_sub-dimensional state vector of the i-th sub-reservoir at time n;
y(n) is the output value of the echo state network at time n;
W_in is the input weight matrix, and W_in^i is the n_sub×K-dimensional input weight matrix of the i-th sub-reservoir;
W_out is the 1×(N_total×n_sub)-dimensional output weight matrix;
f_res is the activation function of the reservoir neurons, taken as the sigmoid function;
(3.4) growth mechanism of the sub-reservoirs: the screened sub-reservoirs are merged with a newly generated random sub-reservoir, and the output weight matrix of the echo state network after merging is:

W_out^merge = (H_m^T H_m + (1/C)·I_m)^{−1} H_m^T T,  with H_m = [H_o, H_g]  (14)

H_o is the state matrix corresponding to the reservoir after the screening mechanism has completed;
H_g is the state matrix corresponding to the grown sub-reservoir;
H_m is the state matrix after merging;
I_m is the (N_m×n_sub)×(N_m×n_sub)-dimensional identity matrix, N_m being the total number of sub-reservoirs after merging and growth;
based on equation (14), the updated mathematical expression of the output weight matrix W_out^merge is:

W_out^merge = [W_o − (H_o^T H_o + I_o/C)^{−1} H_o^T H_g W_g ; W_g]  (15)

with W_g = (H_g^T P H_g + I_g/C)^{−1} H_g^T P T and P = I_L − H_o (H_o^T H_o + I_o/C)^{−1} H_o^T, wherein:
W_o is the output weight matrix of the screened reservoir;
I_o is the (N_o×n_sub)×(N_o×n_sub)-dimensional identity matrix;
N_o is the number of sub-reservoirs after the screening mechanism has completed;
I_g is the n_sub×n_sub-dimensional identity matrix;
I_L is the (L_train−n_min)×(L_train−n_min)-dimensional identity matrix.
2. The lake and reservoir cyanobacterial bloom prediction method based on the self-organizing deep confidence echo state network according to claim 1, characterized in that: the input variables of the deep confidence echo state network structure further comprise the chlorophyll a concentration at the three preceding moments.
CN202110126626.7A 2021-01-29 2021-01-29 Lake and reservoir cyanobacterial bloom prediction method based on self-organizing deep confidence echo state network Active CN112862173B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110126626.7A CN112862173B (en) 2021-01-29 2021-01-29 Lake and reservoir cyanobacterial bloom prediction method based on self-organizing deep confidence echo state network


Publications (2)

Publication Number Publication Date
CN112862173A CN112862173A (en) 2021-05-28
CN112862173B true CN112862173B (en) 2022-10-11

Family

ID=75986842







Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant