CN115587538A - Lake and reservoir cyanobacterial bloom prediction system based on SMRELM model - Google Patents

Info

Publication number
CN115587538A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211248275.8A
Other languages
Chinese (zh)
Inventor
张慧妍
刘明伟
王小艺
王立
许继平
孙茜
王昭洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Technology and Business University
Original Assignee
Beijing Technology and Business University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Technology and Business University

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Hardware Design (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Geometry (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a lake and reservoir cyanobacterial bloom prediction system based on an SMRELM model, characterized in that a cyanobacterial bloom time-series feature information extraction module (100), an S-ELM model (200), and an ECM model (30) are added to the traditional cyanobacterial bloom prediction process. The prediction steps are as follows: the chlorophyll a concentration is used as the characterization index describing the formation of a cyanobacterial bloom and is used to construct a time-series feature set; the chlorophyll a concentration time series is reconstructed with a sliding window, and the reconstructed input samples are used to train a manifold regularization extreme learning machine; according to the similarity conditions, switching prediction of the cyanobacterial bloom is realized by combining the time-series feature set with the manifold regularization extreme learning machine; and an error compensation model is established using improved fuzzy C-means clustering and a T-S fuzzy neural network to correct the result of the prediction model. The invention addresses the low accuracy of the traditional batch extreme learning machine in cyanobacterial bloom prediction.

Description

Lake and reservoir cyanobacterial bloom prediction system based on SMRELM model
Technical Field
The invention relates to the technical field of cyanobacterial bloom prediction, and in particular to a lake and reservoir cyanobacterial bloom prediction system based on an SMRELM model.
Background
Eutrophication of a water body is a pollution phenomenon in which, under the combined action of natural factors and human activities, the nutrients in the water increase, causing excessive plant growth and a change in the ecological balance of the water body. The causes of eutrophication are complex and diverse and involve ecological, social, economic, and other factors. Cyanobacterial blooms in lakes and reservoirs are a common ecological disaster in eutrophic lakes: the cyanobacteria produce toxins and, as they die and decompose, deplete the oxygen in the water, which disrupts the normal dissolved oxygen balance, further degrades water quality, threatens human health, and causes serious economic loss and social impact.
"Early warning research on cyanobacterial bloom in the drinking water source of Taihu Lake, Suzhou" (author: Schroetian) was published in December 2019. Page 12 of that work introduces the process of establishing a cyanobacterial bloom prediction model; the specific flow is shown in Fig. 1.
"Thoughts on the formation mechanism of cyanobacterial blooms in large shallow eutrophic lakes" (authors: Kong Fanxiang and Gao Guang) was published in Acta Ecologica Sinica, volume 25, no. 3, March 2005. The paper states: "In the large shallow lakes of the middle and lower reaches of the Yangtze River, which have four distinct seasons and strong disturbance, the growth of cyanobacteria and the formation of blooms can be divided into 4 stages: dormancy, recovery, biomass increase (growth), and floating and aggregation. The physiological characteristics of the cyanobacteria and the dominant environmental factors differ in each stage. In winter, the dormancy of bloom-forming cyanobacteria is mainly influenced by low temperature and darkness; the recovery process in spring is mainly controlled by the temperature and dissolved oxygen at the lake sediment surface; the substances and energy required for photosynthesis and cell division determine the growth of bloom-forming cyanobacteria in spring and summer; and once suitable meteorological and hydrological conditions exist, the large populations of bloom-forming cyanobacteria accumulated in the water float to the surface and aggregate to form a visible bloom. Research on the formation mechanism of cyanobacterial blooms needs to identify the trigger factors or specific factors of each main physiological stage leading to bloom formation, and to study in depth the physiological characteristics of the cyanobacteria at each stage. Only then is it possible to gradually understand the formation mechanism of cyanobacterial blooms, predict each process of their occurrence, and find more targeted control measures."
Current prediction models for cyanobacterial blooms in lakes and reservoirs fall mainly into mechanism-driven models and data-driven models. A mechanism-driven model is based on the physical, biological, and chemical factors of lake and reservoir cyanobacterial blooms: a system of differential equations is established and solved numerically, the causes of bloom outbreaks are analyzed, and the evolution pattern and trend are predicted. Although mechanism-driven models are more interpretable, they need many parameters to start up and run, such as water depth and underwater shear force, which makes them relatively complex to apply in practice. A data-driven model can mine the internal pattern of lake and reservoir cyanobacterial bloom evolution from accumulated time-series data without prior parameters, such as the specific bloom mechanism, during model building, and it has received wide attention in lake and reservoir cyanobacterial bloom prediction.
Disclosure of Invention
The technical problem to be solved by the invention is the low accuracy of existing lake and reservoir cyanobacterial bloom prediction methods. A lake and reservoir cyanobacterial bloom prediction model, the SMRELM model, is constructed by effectively mining the time-series features of lake and reservoir cyanobacterial blooms and combining them with error compensation.
The invention is a lake and reservoir cyanobacterial bloom prediction system based on an SMRELM model, improved in that:
a cyanobacterial bloom time-series feature information extraction module (100) is arranged in the cyanobacterial bloom dynamic change trend module (10);
an S-ELM model (200) is arranged at the output of the cyanobacterial bloom dynamic change trend module (10);
an ECM model (30) is arranged at the output of the S-ELM model (200).
The invention discloses a lake and reservoir cyanobacterial bloom prediction system based on an SMRELM model, characterized in that a cyanobacterial bloom time-series feature information extraction module, an S-ELM model, and an ECM model are added to the traditional cyanobacterial bloom prediction process. The prediction steps are as follows: the chlorophyll a concentration is used as the characterization index describing the formation of a cyanobacterial bloom and is used to construct a time-series feature set; the chlorophyll a concentration time series is reconstructed with a sliding window, and the reconstructed input samples are used to train a manifold regularization extreme learning machine; according to the similarity conditions, switching prediction of the cyanobacterial bloom is realized by combining the time-series feature set with the manifold regularization extreme learning machine; and an error compensation model is established using improved fuzzy C-means clustering and a fuzzy neural network to correct the result of the prediction model. The invention addresses the low accuracy of the traditional extreme learning machine in cyanobacterial bloom prediction.
The SMRELM model has the advantages that:
(1) The invention analyzes the chlorophyll a concentration information CYB_LD of historical cyanobacterial blooms and, according to a set floating recovery threshold and a set continuous growth threshold, extracts from CYB_LD representative candidate time-series feature information of the cyanobacterial bloom (the CYB_LD_HX information). Then, according to distance similarity, repeated bloom time-series features are eliminated from the CYB_LD_HX information to obtain the prior time-series feature information of the cyanobacterial bloom (the CYBTF information), which provides prior local shape features for the invention.
(2) The invention divides the CYBTF information into comparison feature information (DB_CYB information) and prediction feature information (CS_CYB information). The trend similarity and the distance similarity between the DB_CYB information and the local prediction segment are calculated; when the threshold condition is met, the prediction feature is taken as the prediction model's value for the next moment, and otherwise the extreme learning machine is used directly for prediction. Combining the prior local shape features in this way improves the prediction accuracy of the extreme learning machine prediction model. In addition, a manifold regularization term is introduced to reduce the effect of random initialization on the extreme learning machine and to improve its generalization ability.
(3) By combining the characteristics of subtractive clustering and considering both the intra-class compactness and the inter-class separation of fuzzy C-means clustering, the invention provides an improved fuzzy C-means clustering algorithm. This algorithm is combined with a fuzzy neural network prediction model to obtain an improved error compensation model. The error compensation model is trained and optimized so that it compensates the prediction result of the S-ELM prediction model.
(4) The invention constructs a comprehensive prediction model based on time-series features and error compensation, and achieves effective prediction of lake and reservoir cyanobacterial blooms by extracting their time-series features and making full use of the error data.
Drawings
FIG. 1 is a flow chart of the establishment of a traditional cyanobacterial bloom prediction model.
FIG. 1A shows the different chlorophyll a concentrations in a cyanobacterial bloom after inversion.
Fig. 2 is a topology structure diagram of a conventional T-S fuzzy neural network.
FIG. 2A is a flow chart of the invention for constructing an MRELM model.
FIG. 3 is a structural block diagram of the lake and reservoir cyanobacterial bloom prediction system based on the SMRELM model.
FIG. 4 is a flow chart of blue algae bloom prediction in lakes and reservoirs by applying the SMRELM model of the invention.
Fig. 5 shows the prediction curves for different distance similarity thresholds in the embodiment.
FIG. 6 compares the lake and reservoir cyanobacterial bloom prediction results of the embodiment with those of other existing prediction methods.
FIG. 7 compares the stability of the lake and reservoir cyanobacterial bloom prediction of the embodiment with that of the models of other existing prediction methods.
10. Cyanobacterial bloom dynamic change trend module
100. Cyanobacterial bloom time-series feature information extraction module
20. Chlorophyll a concentration prediction index module
200. S-ELM model
30. ECM model
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples.
In the present invention, the switching prediction model is abbreviated as the S-ELM (Switching Extreme Learning Machine) model. The improved T-S fuzzy neural network is abbreviated as the ECM (Error Compensation Model) model, that is, an error compensation model based on the fuzzy neural network. The SMRELM model is the abbreviation for the comprehensive prediction model, based on time-series features and error compensation, that is composed of the S-ELM model and the ECM model.
The ECM model of the invention is an improvement of the fuzzification layer within the T-S fuzzy neural network structure. For the T-S fuzzy neural network structure, see section 3.1 of "PM2.5 prediction research based on T-S fuzzy neural network" (authors: Qiao Junfei, Cai Jie, and Han Honggui), published in Control Engineering of China, volume 25, no. 3, March 2018. The structure of the T-S fuzzy neural network is shown in FIG. 2. The ECM model of the invention improves the fuzzification layer of the structure shown in FIG. 2 by generating the membership functions in a different way. The effects of the ECM model are that the number of Gaussian membership functions in the fuzzification layer no longer has to be set manually, and that the influence of outliers on the prediction accuracy is reduced.
According to the similarity switching condition, the S-ELM model decides whether to use the trained MRELM model or the prior time-series feature information CYBTF of the cyanobacterial bloom, thereby improving the prediction accuracy of the chlorophyll a concentration in the cyanobacterial bloom.
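The switching decision described above can be sketched in Python as follows. This is only an illustration under stated assumptions: this section does not give the similarity formulas or thresholds, so the Euclidean distance, the step-direction trend measure, the function names, and the default thresholds `dist_max` and `trend_min` are all hypothetical stand-ins, and `fallback` stands in for the trained MRELM predictor.

```python
import math


def distance_similarity(a, b):
    """Euclidean distance between two equal-length segments (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def trend_similarity(a, b):
    """Fraction of consecutive steps whose direction agrees (assumed measure)."""
    if len(a) < 2 or len(a) != len(b):
        raise ValueError("segments must have equal length of at least 2")

    def signs(s):
        # +1 for a rise, -1 for a fall, 0 for no change
        return [(y > x) - (y < x) for x, y in zip(s, s[1:])]

    sa, sb = signs(a), signs(b)
    return sum(1 for p, q in zip(sa, sb) if p == q) / len(sa)


def switching_predict(segment, features, fallback, dist_max=1.0, trend_min=1.0):
    """Pick the prior feature's prediction if a comparison part matches the
    local segment; otherwise defer to the fallback predictor.

    features: list of (comparison_part, prediction_value) pairs, a hypothetical
    encoding of the DB_CYB / CS_CYB split of the CYBTF information.
    """
    best = None
    for comparison, prediction in features:
        d = distance_similarity(segment, comparison)
        if d <= dist_max and trend_similarity(segment, comparison) >= trend_min:
            if best is None or d < best[0]:
                best = (d, prediction)
    return best[1] if best is not None else fallback(segment)
```

With one stored feature `([1.0, 2.0, 3.0], 4.0)`, a rising segment close to it returns the stored prediction, while a falling segment is handed to the fallback predictor.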
In the present invention, the manifold regularization extreme learning machine is abbreviated as the MRELM (Manifold Regularized Extreme Learning Machine) model.
Fig. 1A displays the different chlorophyll a concentrations in a cyanobacterial bloom after inversion. The chlorophyll a concentration values at the sampling time can be read from Fig. 1A, where they are represented by different shades of gray. Because these chlorophyll a concentration values differ, the invention performs feature extraction on the cyanobacterial bloom features again (the added cyanobacterial bloom time-series feature information extraction module 100). The extracted local shape features participate in the construction of the S-ELM model and are also the basis for the switching prediction decision.
In the invention, four stages of the cyanobacterial bloom are newly defined by combining the four stages disclosed in "Thoughts on the formation mechanism of cyanobacterial blooms in large shallow eutrophic lakes", the chlorophyll a concentration value at which a cyanobacterial bloom forms, and the selection thresholds designed in the invention. The four stages are the floating recovery stage, the continuous growth stage, the outbreak stage, and the apoptosis stage.
Referring to Fig. 1 and Fig. 3, in the invention, first, a cyanobacterial bloom time-series feature information extraction module 100 is added to the cyanobacterial bloom dynamic change trend module 10; second, an S-ELM model 200 is added after the cyanobacterial bloom dynamic change trend module 10; third, processing by the ECM model 30 is required before the predicted chlorophyll a concentration value is output.
In the invention, the historical cyanobacterial bloom analysis information in the cyanobacterial bloom dynamic change trend module 10 comprises two parts. One part is the historical cyanobacterial bloom analysis information used to construct the S-ELM model, denoted $CYB$:

$$CYB = \{data_1, data_2, \dots, data_\delta\}$$

The other part is the historical cyanobacterial bloom analysis information used to construct the ECM model, denoted $CYB^{ECM}$:

$$CYB^{ECM} = \{data^{ECM}_1, data^{ECM}_2, \dots, data^{ECM}_f\}$$

The subscript $\delta$ is the total number of pieces of historical cyanobacterial bloom analysis information used to construct the S-ELM model.
The subscript $f$ is the total number of pieces of historical cyanobacterial bloom analysis information used to construct the ECM model.
$data_1$ denotes the 1st piece of historical cyanobacterial bloom analysis information used to construct the S-ELM model in the cyanobacterial bloom dynamic change trend module 10.
$data_2$ denotes the 2nd piece of historical cyanobacterial bloom analysis information used to construct the S-ELM model in the cyanobacterial bloom dynamic change trend module 10.
$data_\delta$ denotes the last piece of historical cyanobacterial bloom analysis information used to construct the S-ELM model in the cyanobacterial bloom dynamic change trend module 10.
$data^{ECM}_1$ denotes the 1st piece of historical cyanobacterial bloom analysis information used to construct the ECM model in the cyanobacterial bloom dynamic change trend module 10.
$data^{ECM}_2$ denotes the 2nd piece of historical cyanobacterial bloom analysis information used to construct the ECM model in the cyanobacterial bloom dynamic change trend module 10.
$data^{ECM}_f$ denotes the last piece of historical cyanobacterial bloom analysis information used to construct the ECM model in the cyanobacterial bloom dynamic change trend module 10.
In the invention, each piece of historical cyanobacterial bloom analysis information consists of a sampling time and a chlorophyll a concentration value. Namely:
The sampling time of $data_1$ is denoted $time\_data_1$; the chlorophyll a concentration value of $data_1$ is denoted $ld\_data_1$.
The sampling time of $data_2$ is denoted $time\_data_2$; the chlorophyll a concentration value of $data_2$ is denoted $ld\_data_2$.
The sampling time of $data_\delta$ is denoted $time\_data_\delta$; the chlorophyll a concentration value of $data_\delta$ is denoted $ld\_data_\delta$.
The sampling time of $data^{ECM}_1$ is denoted $time\_data^{ECM}_1$; the chlorophyll a concentration value of $data^{ECM}_1$ is denoted $ld\_data^{ECM}_1$.
The sampling time of $data^{ECM}_2$ is denoted $time\_data^{ECM}_2$; the chlorophyll a concentration value of $data^{ECM}_2$ is denoted $ld\_data^{ECM}_2$.
The sampling time of $data^{ECM}_f$ is denoted $time\_data^{ECM}_f$; the chlorophyll a concentration value of $data^{ECM}_f$ is denoted $ld\_data^{ECM}_f$.
In the present invention, the cyanobacterial bloom analysis information in $CYB$ and $CYB^{ECM}$ is ordered by sampling time.
The historical cyanobacterial bloom analysis information used to construct the S-ELM model is then

$$CYB = \{(time\_data_1, ld\_data_1), (time\_data_2, ld\_data_2), \dots, (time\_data_\delta, ld\_data_\delta)\}$$

The chlorophyll a concentration values obtained from $CYB$ are denoted $CYB\_LD$ (for short, the concentration for constructing the S-ELM model):

$$CYB\_LD = \{ld\_data_1, ld\_data_2, \dots, ld\_data_\delta\}$$

The historical cyanobacterial bloom analysis information used to construct the ECM model is then

$$CYB^{ECM} = \{(time\_data^{ECM}_1, ld\_data^{ECM}_1), (time\_data^{ECM}_2, ld\_data^{ECM}_2), \dots, (time\_data^{ECM}_f, ld\_data^{ECM}_f)\}$$

The chlorophyll a concentration values obtained from $CYB^{ECM}$ are denoted $CYB^{ECM}\_LD$ (for short, the concentration for constructing the ECM model):

$$CYB^{ECM}\_LD = \{ld\_data^{ECM}_1, ld\_data^{ECM}_2, \dots, ld\_data^{ECM}_f\}$$
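As a minimal sketch of the data layout just described, each piece of analysis information can be held as a (sampling time, chlorophyll a concentration) record, ordered by sampling time, and reduced to the concentration sequence CYB_LD. The class name `BloomRecord`, the function name, and the use of ISO-8601 date strings for the sampling time are assumptions made for illustration; the text does not specify a storage format.

```python
from typing import List, NamedTuple, Tuple


class BloomRecord(NamedTuple):
    """One piece of historical bloom analysis information (hypothetical type)."""
    time: str   # sampling time; ISO-8601 text sorts chronologically (assumption)
    ld: float   # chlorophyll a concentration value


def sort_and_extract(records: List[BloomRecord]) -> Tuple[List[BloomRecord], List[float]]:
    """Order records by sampling time and return them with the concentration series."""
    ordered = sorted(records, key=lambda r: r.time)
    return ordered, [r.ld for r in ordered]
```

The same helper would serve for both CYB and the ECM-model data set, since the two differ only in which records they contain.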
(I) Constructing the manifold regularization extreme learning machine model, denoted the MRELM model
In the invention, the manifold regularization extreme learning machine follows the method disclosed in section 2 of "Speech recognition system based on manifold regularization extreme learning machine" (author: Xu Jiaming), published in Acta Automatica Sinica in September 2015.
In the invention, the grid search method follows the method disclosed in section 2 of "Determination of kernel function parameters of support vector machine based on grid search" (author: waning), published in the Periodical of Ocean University of China in September 2005.
The specific construction steps of the MRELM model are as follows:
Construction step A: construct the network architecture of the MRELM model.
In the invention, the network architecture of the MRELM model comprises an input layer, a hidden layer, and an output layer.
Construction step B: establish the input layer information of the MRELM model.
First, the input layer of the MRELM model receives the concentration for constructing the S-ELM model, $CYB\_LD = \{ld\_data_1, ld\_data_2, \dots, ld\_data_\delta\}$.
Second, the input layer of the MRELM model sets the sliding window width of the MRELM model, denoted $H_w$. The sliding window width takes values in $H_w \in \{3, 4, \dots, \tau\}$, where $\tau$ is the maximum sliding window width.
Third, according to $H_w$, the input layer of the MRELM model takes the chlorophyll a concentrations in $CYB\_LD = \{ld\_data_1, ld\_data_2, \dots, ld\_data_\delta\}$ as the concentration-training sample set, denoted $D\_ALL$:

$$D\_ALL = \{LD\_DATA,\ T\_DATA\}$$

$LD\_DATA$ denotes the training chlorophyll a concentration values:

$$LD\_DATA = [LD_1; LD_2; \dots; LD_h]$$

$T\_DATA$ denotes the sliding-window chlorophyll a concentration values:

$$T\_DATA = [T_1; T_2; \dots; T_h]$$

The subscript $h$ is the total number of concentration-training sample input sequences in the MRELM model.
$LD_1$ denotes the 1st concentration-training sample input sequence of the MRELM model.
$LD_2$ denotes the 2nd concentration-training sample input sequence of the MRELM model.
$LD_h$ denotes the $h$-th concentration-training sample input sequence of the MRELM model.
$T_1$ denotes the 1st sliding-window concentration of the MRELM model.
$T_2$ denotes the 2nd sliding-window concentration of the MRELM model.
$T_h$ denotes the $h$-th sliding-window concentration of the MRELM model.

For example, when the sliding window width is $H_w = 3$:

$$LD_1 = [ld\_data_1,\ ld\_data_2,\ ld\_data_3], \quad T_1 = ld\_data_4$$

$$LD_2 = [ld\_data_2,\ ld\_data_3,\ ld\_data_4], \quad T_2 = ld\_data_5$$

$$LD_h = [ld\_data_{\delta-3},\ ld\_data_{\delta-2},\ ld\_data_{\delta-1}], \quad T_h = ld\_data_\delta$$

$ld\_data_3$ denotes the chlorophyll a concentration value of the 3rd piece of historical cyanobacterial bloom analysis information $data_3$.
$ld\_data_4$ denotes the chlorophyll a concentration value of the 4th piece of historical cyanobacterial bloom analysis information $data_4$.
$ld\_data_5$ denotes the chlorophyll a concentration value of the 5th piece of historical cyanobacterial bloom analysis information $data_5$.
$ld\_data_{\delta-3}$ denotes the chlorophyll a concentration value of the $(\delta-3)$-th piece of historical cyanobacterial bloom analysis information $data_{\delta-3}$.
$ld\_data_{\delta-2}$ denotes the chlorophyll a concentration value of the $(\delta-2)$-th piece of historical cyanobacterial bloom analysis information $data_{\delta-2}$.
$ld\_data_{\delta-1}$ denotes the chlorophyll a concentration value of the $(\delta-1)$-th piece of historical cyanobacterial bloom analysis information $data_{\delta-1}$.
$ld\_data_\delta$ denotes the chlorophyll a concentration value of the $\delta$-th piece of historical cyanobacterial bloom analysis information $data_\delta$.

Then, when the sliding window width is $H_w = 3$, the concentration-training sample set is:

$$D\_ALL = \left\{ LD\_DATA = \begin{bmatrix} ld\_data_1 & ld\_data_2 & ld\_data_3 \\ ld\_data_2 & ld\_data_3 & ld\_data_4 \\ \vdots & \vdots & \vdots \\ ld\_data_{\delta-3} & ld\_data_{\delta-2} & ld\_data_{\delta-1} \end{bmatrix},\ T\_DATA = \begin{bmatrix} ld\_data_4 \\ ld\_data_5 \\ \vdots \\ ld\_data_\delta \end{bmatrix} \right\}$$
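The sliding-window reconstruction above can be sketched in Python as follows. The function name and the list-of-lists representation are illustrative choices, not part of the patent text; the window semantics follow the $H_w = 3$ example, in which each input sequence holds $H_w$ consecutive concentrations and its target is the next concentration.

```python
def sliding_window_samples(cyb_ld, window):
    """Rebuild a concentration series into (LD_DATA, T_DATA).

    cyb_ld: chlorophyll a concentrations ordered by sampling time.
    window: sliding window width H_w.
    Returns h = len(cyb_ld) - window input sequences and their targets.
    """
    if window < 1 or len(cyb_ld) <= window:
        raise ValueError("need more samples than the window width")
    h = len(cyb_ld) - window
    # Row i holds ld_data[i .. i+window-1]; its target is ld_data[i+window].
    ld_data = [list(cyb_ld[i:i + window]) for i in range(h)]
    t_data = [cyb_ld[i + window] for i in range(h)]
    return ld_data, t_data
```

For a five-value series and a window of 3, this yields two input sequences, consistent with $h = \delta - H_w$ in the example above.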
Construction step C: set the number of neurons in the hidden layer of the MRELM model.
In the invention, the number of neurons in the hidden layer of the MRELM model is denoted $L_h$, with the range $1 < L_h \le \kappa$, where $\kappa$ is the maximum number of hidden layer neurons. Each neuron of the hidden layer receives excitatory connections from all neurons of the input layer; that is, $LD\_DATA$ passes through the feature mapping of the neurons in the hidden layer to give the output information of the hidden layer, denoted $H_{out}$:

$$H_{out} = [H^{out}_1, H^{out}_2, \dots, H^{out}_{L_h}]$$

$H^{out}_1$ is the $(h \times 1)$-dimensional output obtained when $LD\_DATA$ passes through the 1st neuron in the hidden layer.
$H^{out}_2$ is the $(h \times 1)$-dimensional output obtained when $LD\_DATA$ passes through the 2nd neuron in the hidden layer.
$H^{out}_{L_h}$ is the $(h \times 1)$-dimensional output obtained when $LD\_DATA$ passes through the $L_h$-th neuron in the hidden layer.
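A minimal sketch of the hidden layer feature mapping follows, assuming the usual extreme learning machine setup of fixed random input weights and biases with a sigmoid activation. This section does not name the activation function or the initialization range, so both are assumptions, as are the function names.

```python
import math
import random


def make_hidden_layer(window, n_hidden, seed=0):
    """Draw fixed random input weights and biases (uniform in [-1, 1], assumed)."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(window)]
               for _ in range(n_hidden)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    return weights, biases


def hidden_output(ld_data, weights, biases):
    """Map LD_DATA (h rows) to H_out with h rows and L_h columns."""
    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    return [
        [sigmoid(sum(w * x for w, x in zip(w_row, sample)) + b)
         for w_row, b in zip(weights, biases)]
        for sample in ld_data
    ]
```

Column $j$ of the returned matrix corresponds to the $(h \times 1)$ output of the $j$-th hidden neuron described above.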
Construction step D: construct the output of the MRELM model.
In the invention, the manifold regularization extreme learning machine requires that the output space $H_{out}$ of $LD\_DATA$ in the hidden layer preserve the local geometry that $LD\_DATA$ has in the input layer. That is, if two training sample input sequences $LD_{h-1}$ and $LD_h$ are highly similar in the input layer, they are also highly similar in the output space of the hidden layer. This reduces the influence of randomness and improves the generalization performance of the manifold regularization extreme learning machine.
$LD_{h-1}$ denotes the $(h-1)$-th concentration-training sample input sequence of the MRELM model.
When the output layer of the MRELM model contains 1 neuron, the output information corresponding to that neuron is denoted $LDD_{out}$:

$$LDD_{out} = [ldd_1; ldd_2; \dots; ldd_h], \qquad LDD_{out} = H_{out} \times \beta$$

where $\beta$ denotes the weights between the hidden layer and the output layer.
$ldd_1$ denotes the output value corresponding to the 1st concentration-training sample input sequence of the MRELM model.
$ldd_2$ denotes the output value corresponding to the 2nd concentration-training sample input sequence of the MRELM model.
$ldd_h$ denotes the output value corresponding to the $h$-th concentration-training sample input sequence of the MRELM model.
In the invention, the MRELM model is obtained through construction steps A to D. The optimization of the MRELM model is shown in FIG. 2A.
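This section does not state how $\beta$ is computed. As a hedged illustration only, one form commonly used for manifold-regularized extreme learning machines is given below. The penalty coefficient $C$, the manifold coefficient $\lambda$, and the graph Laplacian $L$ (built from the pairwise similarity of the input sequences) are assumed symbols that do not appear in the text above, and the patent's own formulation may differ.

```latex
% Objective: small weights, small fitting error, and smoothness of the
% outputs H_out * beta over the similarity graph of the input sequences.
\min_{\beta}\;
  \frac{1}{2}\lVert \beta \rVert^{2}
  + \frac{C}{2}\lVert T\_DATA - H_{out}\beta \rVert^{2}
  + \frac{\lambda}{2}\,\beta^{\top} H_{out}^{\top} L\, H_{out}\,\beta

% Setting the gradient with respect to beta to zero (L symmetric) gives
\beta = \left( I + C\,H_{out}^{\top} H_{out}
        + \lambda\,H_{out}^{\top} L\, H_{out} \right)^{-1}
        C\,H_{out}^{\top}\, T\_DATA
```

The third term is what ties highly similar input sequences to similar outputs, which matches the local-geometry requirement described in construction step D.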
Construction step E: optimize the MRELM model.
Construction step E101: based on the grid search method, set the training sample subsets; the number of training sample subsets after division is denoted $b$.
Construction step E102: divide $D\_ALL$ into $b$ concentration-training sample subsets of the same size, denoted $SUB\_D$:

$$SUB\_D = \{BLD,\ T\_BLD\}$$

$BLD$ denotes the time-ordered sample chlorophyll a concentration values, and $BLD = [bld_1; bld_2; \dots; bld_b]$.
$T\_BLD$ denotes the time-ordered sample sliding-window chlorophyll a concentration values, and $T\_BLD = [t\_bld_1; t\_bld_2; \dots; t\_bld_b]$.
$bld_1$ denotes the 1st cyanobacterial bloom time-ordered concentration-training sample input subset divided based on the grid search method.
$bld_2$ denotes the 2nd cyanobacterial bloom time-ordered concentration-training sample input subset divided based on the grid search method.
$bld_b$ denotes the $b$-th cyanobacterial bloom time-ordered concentration-training sample input subset divided based on the grid search method.
$t\_bld_1$ denotes the 1st cyanobacterial bloom time-ordered sliding-window concentration divided based on the grid search method.
$t\_bld_2$ denotes the 2nd cyanobacterial bloom time-ordered sliding-window concentration divided based on the grid search method.
$t\_bld_b$ denotes the $b$-th cyanobacterial bloom time-ordered sliding-window concentration divided based on the grid search method.
In the present invention, $\delta$, $H_w$, and $b$ satisfy the constraint

$$(\delta - H_w) \bmod b = 0$$

which ensures that the $b$ divided training sample subsets are the same size, and $\delta$, $H_w$, $b$, and $\frac{\delta - H_w}{b}$ are all positive integers.
For example, when H w =3,b =4, the 1 st training subset bld divided based on the grid search method 1 The middle element is
Figure BDA0003887391270000074
And
Figure BDA0003887391270000075
Figure BDA0003887391270000076
denotes the first
Figure BDA0003887391270000077
Individual historical cyanobacterial bloom analysis information
Figure BDA0003887391270000078
Chlorophyll a concentration value of.
Figure BDA0003887391270000079
Denotes the first
Figure BDA00038873912700000710
Individual history cyanobacterial bloom analysis information
Figure BDA00038873912700000711
Chlorophyll a concentration value of (a).
Figure BDA00038873912700000712
Denotes the first
Figure BDA00038873912700000713
Individual historical cyanobacterial bloom analysis information
Figure BDA00038873912700000714
Chlorophyll a concentration value of.
Figure BDA00038873912700000715
Is shown as
Figure BDA00038873912700000716
Individual historical cyanobacterial bloom analysis information
Figure BDA00038873912700000717
Chlorophyll a concentration value of (a).
For example, when H_w = 3 and b = 4, the elements of the 2nd training subset bld_2 divided based on the grid search method are
Figure BDA00038873912700000718
Figure BDA0003887391270000081
denotes, for the historical cyanobacterial bloom analysis information numbered
Figure BDA0003887391270000082
namely
Figure BDA0003887391270000083
its chlorophyll a concentration value.
Figure BDA0003887391270000084
denotes, for the historical cyanobacterial bloom analysis information numbered
Figure BDA0003887391270000085
namely
Figure BDA0003887391270000086
its chlorophyll a concentration value.
Figure BDA0003887391270000087
denotes, for the historical cyanobacterial bloom analysis information numbered
Figure BDA0003887391270000088
namely
Figure BDA0003887391270000089
its chlorophyll a concentration value.
Figure BDA00038873912700000810
denotes, for the historical cyanobacterial bloom analysis information numbered
Figure BDA00038873912700000811
namely
Figure BDA00038873912700000812
its chlorophyll a concentration value.
For example, when H_w = 3 and b = 4, the elements of the 3rd training subset bld_3 divided based on the grid search method are
Figure BDA00038873912700000813
and
Figure BDA00038873912700000814
Figure BDA00038873912700000815
denotes, for the historical cyanobacterial bloom analysis information numbered
Figure BDA00038873912700000816
namely
Figure BDA00038873912700000817
its chlorophyll a concentration value.
Figure BDA00038873912700000818
denotes, for the historical cyanobacterial bloom analysis information numbered
Figure BDA00038873912700000819
namely
Figure BDA00038873912700000820
its chlorophyll a concentration value.
Figure BDA00038873912700000821
denotes, for the historical cyanobacterial bloom analysis information numbered
Figure BDA00038873912700000822
namely
Figure BDA00038873912700000823
its chlorophyll a concentration value.
Figure BDA00038873912700000824
denotes, for the historical cyanobacterial bloom analysis information numbered
Figure BDA00038873912700000825
namely
Figure BDA00038873912700000826
its chlorophyll a concentration value.
For example, when H_w = 3 and b = 4, the elements of the 4th training subset bld_4 divided based on the grid search method are
Figure BDA00038873912700000827
and
Figure BDA00038873912700000828
Construction step E103: setting the training sample subset SUB_D_Training and the evaluation sample subset SUB_D_Evaluation.
In the present invention, SUB_D_Training is used for learning in the MRELM model.
In the present invention, SUB_D_Evaluation is used for evaluation in the trained MRELM model.
In the present invention, SUB_D_Evaluation is the b-th cyanobacterial bloom time series-concentration training sample input subset selected from SUB_D; SUB_D_Training consists of all the other cyanobacterial bloom time series-concentration training sample input subsets in SUB_D apart from SUB_D_Evaluation.
For example, when b = 4, the 4th cyanobacterial bloom time series-concentration training sample input subset [bld_4, t_bld_4] is selected from SUB_D = [bld_1, t_bld_1; bld_2, t_bld_2; bld_3, t_bld_3; bld_4, t_bld_4] and denoted SUB_D_Evaluation, i.e., SUB_D_Evaluation = [bld_4, t_bld_4]; SUB_D_Training then contains the remaining elements, i.e., SUB_D_Training = [bld_1, t_bld_1; bld_2, t_bld_2; bld_3, t_bld_3].
bld_3 denotes the 3rd cyanobacterial bloom time series-concentration training sample input subset divided based on the grid search method.
bld_4 denotes the 4th cyanobacterial bloom time series-concentration training sample input subset divided based on the grid search method.
t_bld_3 denotes the 3rd cyanobacterial bloom time series-sliding window concentration divided based on the grid search method.
t_bld_4 denotes the 4th cyanobacterial bloom time series-sliding window concentration divided based on the grid search method.
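The division into b equal subsets and the hold-out of the b-th subset can be sketched as follows. This is a minimal illustration, not the patented procedure: the constraint on δ, H_w and b survives only as an image, so the sketch assumes that a window of width H_w is paired with the value that follows it (giving δ − H_w samples) and that this count must be divisible by b. The function names and the toy series are hypothetical.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], float]  # (H_w consecutive concentrations, next concentration)


def sliding_windows(series: Sequence[float], h_w: int) -> List[Sample]:
    """Pair every run of h_w consecutive values with the value that follows it."""
    return [(list(series[i:i + h_w]), series[i + h_w])
            for i in range(len(series) - h_w)]


def split_equal(samples: List[Sample], b: int) -> List[List[Sample]]:
    """Cut the samples into b contiguous subsets of identical size (assumed rule)."""
    if len(samples) % b != 0:
        raise ValueError("number of samples must be divisible by b")
    size = len(samples) // b
    return [samples[i * size:(i + 1) * size] for i in range(b)]


def hold_out_last(subsets: List[List[Sample]]):
    """Use the b-th subset for evaluation and all other subsets for training."""
    training = [s for subset in subsets[:-1] for s in subset]
    return training, subsets[-1]


series = [float(v) for v in range(11)]  # hypothetical CYB_LD with delta = 11
subsets = split_equal(sliding_windows(series, h_w=3), b=4)
sub_d_training, sub_d_evaluation = hold_out_last(subsets)
```

With δ = 11, H_w = 3 and b = 4 this yields four subsets of two samples each, of which the last plays the role of SUB_D_Evaluation.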
Construction step E104: with reference to formula 12 and formula 15 in "Speech recognition system based on the manifold regularized extreme learning machine", the two weighting coefficients in the objective function of the MRELM model are denoted the first weighting coefficient C_1 and the second weighting coefficient C_2, and the weight is denoted β. SUB_D_Training is fed into the MRELM model for learning to obtain the weight β.
Construction step E105: bld_4 in SUB_D_Evaluation is fed into the trained MRELM model for evaluation; the model outputs the chlorophyll a concentration evaluation value corresponding to the evaluation sample, denoted LDD_out_BLD_Evaluation.
Construction step E106: calculating the root mean square error between t_bld_4 in SUB_D_Evaluation and LDD_out_BLD_Evaluation, denoted JFC;
in the present invention, a smaller JFC indicates a higher prediction accuracy of the MRELM model.
Construction step E107: setting the weighting coefficients C_1 and C_2, and calculating the root mean square error values under different parameter conditions;
the value ranges of the weighting coefficients are set as C_1 = [10^-8, 10^-7, …, ψ_1] and C_2 = [10^-8, 10^-7, …, ψ_2], where ψ_1 is the maximum value of the first weighting coefficient C_1 and ψ_2 is the maximum value of the second weighting coefficient C_2.
According to the values of H_w, L_h, C_1 and C_2, the sample subset bld_b of SUB_D_Evaluation is taken as input; for each combination of values examined, the MRELM model outputs the corresponding evaluation value, and the root mean square error between that evaluation value and t_bld_b is calculated as
Figure BDA0003887391270000091
For example, with b = 4, when H_w = 3, L_h = 2, C_1 = 10^-8 and C_2 = 10^-8, the sample subset bld_4 of SUB_D_Evaluation is taken as input, and the root mean square error between the evaluation value output by the MRELM model under this condition and t_bld_4 is
Figure BDA0003887391270000092
For example, with b = 4, when H_w = 3, L_h = 3, C_1 = 10^-8 and C_2 = 10^-8, the sample subset bld_4 of SUB_D_Evaluation is taken as input, and the root mean square error between the evaluation value output by the MRELM model under this condition and t_bld_4 is
Figure BDA0003887391270000093
For example, with b = 4, when H_w = τ, L_h = κ, C_1 = ψ_1 and C_2 = ψ_2, the sample subset bld_4 of SUB_D_Evaluation is taken as input, and the root mean square error between the evaluation value output by the MRELM model under this condition and t_bld_4 is
Figure BDA0003887391270000101
Construction step E108: selecting the minimum root mean square error value obtained in construction step E107, denoted JFC_min; the values of H_w, L_h, C_1 and C_2 corresponding to JFC_min are taken as the parameters of the MRELM model, which completes the optimization and yields the trained MRELM model.
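Construction steps E106 to E108 amount to an exhaustive grid search that keeps the parameter combination with the smallest root mean square error. The sketch below shows only that selection logic. The MRELM model itself is not implemented: `toy_score` is a hypothetical stand-in for "train on SUB_D_Training, evaluate on SUB_D_Evaluation, return JFC", and the candidate lists are illustrative values, not ranges taken from the patent.

```python
import itertools
import math
from typing import Callable, Dict, List, Tuple


def rmse(predicted: List[float], measured: List[float]) -> float:
    """Root mean square error between evaluation values and measured values (JFC)."""
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(measured))


def grid_search(grid: Dict[str, list],
                score: Callable[[Dict[str, float]], float]) -> Tuple[Dict[str, float], float]:
    """Try every combination in the grid and keep the one with the lowest score."""
    names = list(grid)
    best_params, best_score = None, math.inf
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current < best_score:  # ties keep the first combination encountered
            best_params, best_score = params, current
    return best_params, best_score


# Illustrative candidate values only.
grid = {
    "H_w": [2, 3, 4],
    "L_h": [2, 3],
    "C_1": [10.0 ** e for e in range(-8, -5)],
    "C_2": [10.0 ** e for e in range(-8, -5)],
}


def toy_score(p: Dict[str, float]) -> float:
    """Hypothetical stand-in for the JFC of an MRELM trained with parameters p."""
    return abs(p["H_w"] - 3) + abs(p["L_h"] - 2) + p["C_1"] + p["C_2"]


best_params, jfc_min = grid_search(grid, toy_score)
```

In a real pipeline `score` would wrap model training and call `rmse` on the evaluation subset; the search loop itself would stay unchanged.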
(II) Feature extraction of cyanobacterial bloom
In the invention, the evolution process of the cyanobacterial bloom refers to the four-stage theory of cyanobacterial bloom disclosed in section 2.1 of "Thoughts on the formation mechanism of cyanobacterial bloom in large shallow eutrophic lakes", by Kong Fanxiang, published in Acta Ecologica Sinica in March 2005.
In the invention, the method for extracting the feature information of the cyanobacterial bloom refers to the feature extraction method disclosed in section 4 of "Time series feature learning with labeled and unlabeled data", by Wang Haishuai, published in Pattern Recognition in December 2018.
In the invention, the cyanobacterial bloom time series feature information extraction module 100 is arranged in the cyanobacterial bloom dynamic change trend module 10. Its purpose is to extract, from the existing cyanobacterial bloom data in the cyanobacterial bloom dynamic change trend module 10, the effective time series features required by the invention.
Feature extraction step 1: receiving the chlorophyll a concentration information of the cyanobacterial bloom;
the chlorophyll a concentration information in
Figure BDA0003887391270000102
of the cyanobacterial bloom dynamic change trend module 10 is received and denoted CYB_LD, with CYB_LD = [ld_data_1, ld_data_2, …, ld_data_δ].
Feature extraction step 2: setting the chlorophyll a concentration threshold information;
the chlorophyll a concentration related parameters set in the invention are as follows:
the critical value of the floating recovery stage of the cyanobacterial bloom is denoted cu;
the critical value of the continuous growth stage of the cyanobacterial bloom is denoted cv;
the outbreak threshold of the cyanobacterial bloom is denoted cw.
In the invention, the chlorophyll a concentration in the floating recovery stage of cyanobacterial bloom in lakes and reservoirs generally behaves as follows: the chlorophyll a concentration value in the water body (e.g., ld_data_δ) is greater than or equal to cu and less than cv (i.e., cu ≤ ld_data_δ < cv). As ld_data_δ gradually rises, once ld_data_δ is greater than or equal to cv (i.e., ld_data_δ ≥ cv), the cyanobacterial bloom in the lake or reservoir enters the continuous growth stage, which lasts until ld_data_δ is greater than or equal to the outbreak threshold cw (i.e., ld_data_δ ≥ cw), indicating that the cyanobacterial bloom in the lake or reservoir is about to break out.
In the invention, the critical value cu of the floating recovery stage is set according to the maximum chlorophyll a concentration ld_data_max in CYB_LD and the floating recovery stage adjustment coefficient p_float, i.e., cu = p_float × ld_data_max. The critical value cv of the continuous growth stage is set according to the maximum chlorophyll a concentration ld_data_max in CYB_LD and the continuous growth stage adjustment coefficient p_growth, i.e., cv = p_growth × ld_data_max.
Figure BDA0003887391270000103
p_growth = 1.76 × p_float
In the present invention, the unit of the sampling time is the day. The chlorophyll a concentration is expressed in mg/L.
In the invention, the chlorophyll a concentration value at which a cyanobacterial bloom forms is denoted ld_data_form, and ld_data_form = 0.01 mg/L (see "Thoughts on the formation mechanism of cyanobacterial bloom in large shallow eutrophic lakes").
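The threshold rules above can be illustrated with a short sketch. The relations cu = p_float × ld_data_max, cv = p_growth × ld_data_max and p_growth = 1.76 × p_float come from the text; the numeric value of p_float is given only in an image, so the value 0.25 used below is purely hypothetical, as are the outbreak threshold 0.05 mg/L, the function names and the label for concentrations below cu.

```python
from typing import Tuple


def stage_thresholds(ld_data_max: float, p_float: float) -> Tuple[float, float]:
    """Return (cu, cv) from the maximum concentration and the adjustment coefficient."""
    p_growth = 1.76 * p_float          # relation stated in the text
    return p_float * ld_data_max, p_growth * ld_data_max


def classify_stage(ld_data: float, cu: float, cv: float, cw: float) -> str:
    """Map a chlorophyll a concentration (mg/L) to a bloom stage, assuming cu < cv < cw."""
    if ld_data >= cw:
        return "outbreak"
    if ld_data >= cv:
        return "continuous_growth"
    if ld_data >= cu:
        return "floating_recovery"
    return "below_recovery_threshold"  # label chosen for this sketch only


cu, cv = stage_thresholds(ld_data_max=0.08, p_float=0.25)  # hypothetical inputs
stages = [classify_stage(v, cu, cv, cw=0.05) for v in (0.01, 0.03, 0.04, 0.06)]
```

Only values in the continuous growth interval cv ≤ ld_data < cw are carried forward to feature extraction step 3.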
Feature extraction step 3: extracting the feature information of the continuous growth stage;
in the invention, according to the chlorophyll a concentration interval cv ≤ ld_data_δ < cw, the candidate cyanobacterial bloom feature information is extracted from CYB_LD and denoted CYB_LD_HX, and
Figure BDA0003887391270000111
The subscript Q is the total number of items of candidate cyanobacterial bloom feature information.
Figure BDA0003887391270000112
denotes the 1st selected item of candidate cyanobacterial bloom feature information.
Figure BDA0003887391270000113
denotes the 2nd selected item of candidate cyanobacterial bloom feature information.
Figure BDA0003887391270000114
denotes the Q-th selected item of candidate cyanobacterial bloom feature information.
Feature extraction step 4: removing repeated cyanobacterial bloom feature information;
in the invention, repeated cyanobacterial bloom feature information is removed from CYB_LD_HX according to distance similarity to obtain the cyanobacterial bloom prior time series feature information, denoted CYBTF, with CYBTF = {cyb_1, cyb_2, …, cyb_W}. The subscript W is the total number of cyanobacterial bloom prior time series feature sequences.
In the present invention, the distance similarity refers to the similarity between chlorophyll a concentrations obtained by calculating the Euclidean distance.
In the invention, each item of time series feature information in the cyanobacterial bloom prior time series feature information CYBTF consists of two parts, namely comparison feature information and prediction feature information, and is denoted
Figure BDA0003887391270000115
cyb_1 denotes the 1st item of cyanobacterial bloom prior time series feature information.
Figure BDA0003887391270000116
denotes the 1st comparison feature information.
Figure BDA0003887391270000117
denotes the 1st prediction feature information.
cyb_2 denotes the 2nd item of cyanobacterial bloom prior time series feature information.
Figure BDA0003887391270000118
denotes the 2nd comparison feature information.
Figure BDA0003887391270000119
denotes the 2nd prediction feature information.
cyb_W denotes the W-th item of cyanobacterial bloom prior time series feature information.
Figure BDA00038873912700001110
denotes the W-th comparison feature information.
Figure BDA00038873912700001111
denotes the W-th prediction feature information.
For example, cyb_1 = [ld_data_1, ld_data_2, …, ld_data_ε], where ε denotes the total number of items of cyanobacterial bloom prior time series feature information belonging to cyb_1. If the length of cyb_1 is
Figure BDA00038873912700001112
then the comparison feature information consists of the first ε−1 chlorophyll a concentration values in cyb_1 and is denoted
Figure BDA00038873912700001113
Figure BDA00038873912700001114
corresponds to the last value ld_data_ε in cyb_1 and is denoted
Figure BDA00038873912700001115
For example,
Figure BDA00038873912700001116
Figure BDA00038873912700001117
denotes the identification number of the cyanobacterial bloom prior time series feature information belonging to cyb_2, and φ denotes the total number of items of cyanobacterial bloom prior time series feature information belonging to cyb_2. If the length of cyb_2 is
Figure BDA00038873912700001118
then
Figure BDA00038873912700001119
consists of the first
Figure BDA00038873912700001120
chlorophyll a concentration values in cyb_2 and is denoted
Figure BDA00038873912700001121
Figure BDA00038873912700001122
corresponds to the last value ld_data_φ in cyb_2 and is denoted
Figure BDA0003887391270000121
For example,
Figure BDA0003887391270000122
Figure BDA0003887391270000123
denotes the identification number of the cyanobacterial bloom prior time series feature information belonging to cyb_W, and ξ denotes the total number of items of cyanobacterial bloom prior time series feature information belonging to cyb_W. If the length of cyb_W is
Figure BDA0003887391270000124
then
Figure BDA0003887391270000125
consists of the first
Figure BDA0003887391270000126
chlorophyll a concentration values in cyb_W and is denoted
Figure BDA0003887391270000127
Figure BDA0003887391270000128
corresponds to the last value ld_data_ξ in cyb_W and is denoted
Figure BDA0003887391270000129
In the invention, the cyanobacterial bloom prior time series feature information CYBTF is obtained through feature extraction steps 1 to 4. CYBTF participates in the construction and prediction of the S-ELM model and provides the invention with effective prior local shape feature information on the evolution of cyanobacterial bloom.
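Feature extraction step 4 and the split of each cyb into comparison and prediction parts can be sketched as follows. The split (all values except the last form the comparison feature, the last value is the prediction feature) follows the text. The de-duplication rule is an assumption: the text only says that repeats are removed according to Euclidean distance similarity, so the tolerance `tol` and the restriction to equal-length candidates are hypothetical choices.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length concentration sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def drop_repeats(candidates: List[List[float]], tol: float) -> List[List[float]]:
    """Keep a candidate only if no already-kept candidate of equal length lies within tol."""
    kept: List[List[float]] = []
    for cand in candidates:
        repeated = any(len(cand) == len(k) and euclidean(cand, k) < tol for k in kept)
        if not repeated:
            kept.append(list(cand))
    return kept


def split_feature(cyb: Sequence[float]) -> Tuple[List[float], float]:
    """Return (comparison feature, prediction feature) for one prior feature sequence."""
    return list(cyb[:-1]), cyb[-1]


cyb_ld_hx = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.001], [1.0, 2.0]]  # hypothetical candidates
cybtf = drop_repeats(cyb_ld_hx, tol=0.01)
comparison_1, prediction_1 = split_feature(cybtf[0])
```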
In the invention, the MRELM model trained in part (I) and the cyanobacterial bloom prior time series feature information CYBTF together form the S-ELM model, which performs switching prediction. The chlorophyll a concentration in the cyanobacterial bloom dynamic change trend module 10 is set as the prediction output index of the prediction index module 20, that is, both the input variable and the output variable of the S-ELM model are the chlorophyll a concentration.
When the data CYB_LD_next at the next prediction moment are determined from the current data information CYB_LD_current, the trend similarity sl and the distance similarity D_Euclidean between the local prediction segment pseg of the current chlorophyll a concentration and each item of comparison feature information in the extracted cyanobacterial bloom prior time series feature information CYBTF need to be calculated. When the threshold conditions are met, the corresponding prediction feature is used as the predicted value of the S-ELM model at the next moment; otherwise, the system switches to the trained MRELM model to predict the next moment.
In the present invention, the local prediction segment pseg refers to the time series subsequence that is the nearest neighbor of the data CYB_LD_next at the next prediction moment.
In the invention, taking the W-th comparison feature information
Figure BDA00038873912700001210
as an example, the trend similarity sl in the S-ELM model is obtained by calculating the average absolute slope values lp_comparison and lp_pseg of the comparison feature information
Figure BDA00038873912700001211
and of the local prediction segment pseg of equal length, and then calculating |lp_comparison − lp_pseg|.
The average absolute slope value is calculated as shown in formula (1):
Figure BDA00038873912700001212
Figure BDA00038873912700001213
denotes, for the comparison feature information
Figure BDA00038873912700001214
its average absolute slope value.
Figure BDA00038873912700001215
denotes the length of the comparison feature information
Figure BDA00038873912700001216
which is
Figure BDA00038873912700001217
ft(a) denotes, in the comparison feature information
Figure BDA00038873912700001218
the a-th chlorophyll a concentration value.
ft(a+1) denotes, in the comparison feature information
Figure BDA00038873912700001219
the (a+1)-th chlorophyll a concentration value.
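A possible reading of the average absolute slope and of the trend similarity sl is sketched below. Formula (1) is available only as an image, so the normalization used here is an assumption: the absolute differences |ft(a+1) − ft(a)| are averaged over the n − 1 consecutive pairs, with a sampling interval of one day. The function names are hypothetical.

```python
from typing import Sequence


def mean_abs_slope(ft: Sequence[float]) -> float:
    """Average of |ft(a+1) - ft(a)| over consecutive pairs (unit sampling interval assumed)."""
    n = len(ft)
    if n < 2:
        raise ValueError("at least two concentration values are required")
    return sum(abs(ft[a + 1] - ft[a]) for a in range(n - 1)) / (n - 1)


def trend_similarity(comparison: Sequence[float], pseg: Sequence[float]) -> float:
    """sl = |lp_comparison - lp_pseg| for a comparison feature and an equal-length pseg."""
    if len(comparison) != len(pseg):
        raise ValueError("comparison feature and pseg must have equal length")
    return abs(mean_abs_slope(comparison) - mean_abs_slope(pseg))


sl = trend_similarity([1.0, 3.0, 2.0], [0.0, 1.0, 2.0])
```

A small sl means the two segments rise and fall at a similar average rate, which is the condition checked first by the switching rule.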
(III) Model switching conditions
According to the set trend similarity threshold sl_threshold: when sl is greater than or equal to sl_threshold (sl ≥ sl_threshold), the system switches to the trained MRELM model to predict the chlorophyll a concentration at the next moment; when sl is less than sl_threshold (sl < sl_threshold), the Euclidean distance D_Euclidean is calculated between the comparison feature information
Figure BDA00038873912700001220
and the local prediction segment pseg of equal length.
If D_Euclidean is less than the set distance similarity threshold td_threshold (D_Euclidean < td_threshold), the prediction feature information
Figure BDA00038873912700001221
is selected as the predicted value of the chlorophyll a concentration at the next moment; if D_Euclidean is greater than or equal to the set distance similarity threshold td_threshold (D_Euclidean ≥ td_threshold), the system switches to the trained MRELM model to predict the chlorophyll a concentration at the next moment.
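The switching rule for a single comparison feature can be written as a small decision function. This sketch follows the two-threshold logic stated above; the trained MRELM model is represented by a hypothetical callable `mrelm_predict`, the trend similarity sl is assumed to have been computed beforehand, and how several matching features would be ranked is not covered.

```python
import math
from typing import Callable, Sequence


def s_elm_predict(pseg: Sequence[float],
                  comparison: Sequence[float],
                  prediction: float,
                  sl: float,
                  sl_threshold: float,
                  td_threshold: float,
                  mrelm_predict: Callable[[Sequence[float]], float]) -> float:
    """Return the next-moment chlorophyll a concentration according to the switching rule."""
    if sl >= sl_threshold:
        return mrelm_predict(pseg)          # trends differ too much: use MRELM
    d_euclidean = math.sqrt(sum((c - p) ** 2 for c, p in zip(comparison, pseg)))
    if d_euclidean < td_threshold:
        return prediction                   # prior feature matches: reuse its prediction
    return mrelm_predict(pseg)              # shapes too far apart: use MRELM


def fallback(segment: Sequence[float]) -> float:
    """Hypothetical stand-in for the trained MRELM model."""
    return -1.0


matched = s_elm_predict([1.0, 2.0], [1.0, 2.1], 9.0, 0.1, 0.2, 0.5, fallback)
far = s_elm_predict([1.0, 2.0], [5.0, 6.0], 9.0, 0.1, 0.2, 0.5, fallback)
trend_off = s_elm_predict([1.0, 2.0], [1.0, 2.1], 9.0, 0.3, 0.2, 0.5, fallback)
```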
In the invention, all the predicted chlorophyll a concentration values output by the S-ELM model are denoted LD_out, and
Figure BDA0003887391270000131
The subscript f is the total number of predicted chlorophyll a concentration values output by the S-ELM model.
Figure BDA0003887391270000132
denotes the 1st predicted chlorophyll a concentration value output by the S-ELM model.
Figure BDA0003887391270000133
denotes the 2nd predicted chlorophyll a concentration value output by the S-ELM model.
Figure BDA0003887391270000134
denotes the f-th predicted chlorophyll a concentration value output by the S-ELM model.
(IV) Constructing an improved error compensation model based on the fuzzy neural network, denoted the ECM model
In order to further improve the accuracy of the prediction model, the invention uses the input information and the output information of the S-ELM model as the input information of the ECM model to compensate the predicted chlorophyll a concentration value, thereby further improving the prediction accuracy of the chlorophyll a concentration.
In the invention, the T-S fuzzy neural network structure refers to section 3.1 of "Research on PM2.5 prediction based on T-S fuzzy neural network", by Qiao Junfei, Cai Jie and Han Honggui, published in Control Engineering, volume 25, no. 3, March 2018.
In the invention, the fuzzy C-means algorithm refers to the fuzzy C-means clustering disclosed in section 3 of "Mathematical Modeling", by Yang Guiyuan, Shanghai University of Finance and Economics Press, February 2015.
In the invention, subtractive clustering refers to the subtractive clustering disclosed in section 2.2 of "Prediction of peak discharge power of power batteries based on ANFIS and subtractive clustering", by Sun Bingxiang, published in Transactions of China Electrotechnical Society in February 2015.
Construction step 1: setting the input layer information of the ECM model;
in the present invention, the input layer of the ECM model, in a first aspect, receives the historical cyanobacterial bloom analysis information
Figure BDA0003887391270000135
in a second aspect, receives the output information of the corresponding S-ELM model
Figure BDA0003887391270000136
in a third aspect, compares the chlorophyll a concentration values of
Figure BDA0003887391270000137
with
Figure BDA0003887391270000138
and calculates the difference, denoted BIA, with BIA = [bia_1; bia_2; …; bia_f];
in a fourth aspect, constructs the training sample set of the ECM model from CYB_ECM, LD_out and BIA, denoted X_ALL = [X, BIA].
X denotes the training sample sequences input to the ECM model, and
Figure BDA0003887391270000141
x_1 denotes the 1st training sample sequence of the input layer in the ECM model.
x_2 denotes the 2nd training sample sequence of the input layer in the ECM model.
x_f denotes the f-th training sample sequence of the input layer in the ECM model.
bia_1 denotes the training sample error value corresponding to
Figure BDA0003887391270000142
namely the difference between the measured value and the predicted value of the chlorophyll a concentration at the next moment.
bia_2 denotes the training sample error value corresponding to
Figure BDA0003887391270000143
namely the difference between the measured value and the predicted value of the chlorophyll a concentration at the next moment.
bia_f denotes the training sample error value corresponding to
Figure BDA0003887391270000144
namely the difference between the measured value and the predicted value of the chlorophyll a concentration at the next moment.
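The construction of X_ALL = [X, BIA] can be illustrated as below. Two points are assumptions. First, the sign: the bias is taken as measured minus predicted, which is the choice that makes "S-ELM prediction plus ECM output" approach the measured value. Second, the content of each x_k: its exact layout is given only in an image, so the sketch uses the H_w-value input window alone, which matches the H_w per-input parameters listed for the deblurring layer. The function name and sample numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def build_ecm_training_set(windows: Sequence[Sequence[float]],
                           predicted: Sequence[float],
                           measured: Sequence[float]) -> Tuple[List[List[float]], List[float]]:
    """Return (X, BIA): one input sequence x_k and one error value bia_k per prediction."""
    if not (len(windows) == len(predicted) == len(measured)):
        raise ValueError("windows, predicted and measured must have the same length f")
    x = [list(w) for w in windows]                        # x_k: assumed to be the input window
    bia = [m - p for m, p in zip(measured, predicted)]    # bia_k: measured minus predicted
    return x, bia


windows = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]   # hypothetical S-ELM inputs
ld_out = [4.5, 4.0]                            # hypothetical S-ELM predictions
measured = [4.0, 5.0]                          # hypothetical measured next-moment values
x_train, bia_train = build_ecm_training_set(windows, ld_out, measured)
```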
Construction step 2: setting the fuzzy layer information of the ECM model;
in the invention, the membership function of the fuzzy layer in the ECM model is a Gaussian membership function, denoted Gfun(mea, ψ). The mean mea is given by the cluster centers generated by the improved fuzzy C-means clustering, and the standard deviation ψ is obtained by weighting the chlorophyll a concentration values in each row of samples in X with the membership values generated by the improved fuzzy C-means clustering algorithm.
In the invention, the information output by the fuzzy layer is obtained by applying the Gaussian membership function Gfun(mea, ψ) to each input sequence of the input data X in the training sample set, and is denoted U = [u_1, u_2, …, u_(c×τ)], where τ denotes the maximum value of the sliding window width and c denotes the number of clusters generated by the improved fuzzy C-means clustering algorithm, i.e., the number of rules. The membership value of each chlorophyll a concentration value in each training sample input sequence in X is calculated with the determined membership function.
u_1 denotes the 1st item of information output after the fuzzy layer in the construction stage.
u_2 denotes the 2nd item of information output after the fuzzy layer in the construction stage.
u_(c×τ) denotes the (c×τ)-th item of information output after the fuzzy layer in the construction stage.
In the invention, the improved fuzzy C-means clustering algorithm differs from the conventional one in three respects. X is first fed into a subtractive clustering algorithm, and the resulting number of clusters and cluster centers are used as the initial values of the conventional fuzzy C-means clustering algorithm, which avoids the subjectivity of setting the number of clusters manually. Each training sample input sequence is given a different weight according to its density value ω (the density value is calculated with reference to formula 7 in "Prediction of peak discharge power of power batteries based on ANFIS and subtractive clustering"). The objective function of the conventional fuzzy C-means clustering algorithm is improved by considering the intra-class compactness and the inter-class separation at the same time.
The objective function J of the improved fuzzy C-means clustering algorithm provided by the invention is:
Figure BDA0003887391270000145
The subscript f is the total number of predicted chlorophyll a concentration values output by the S-ELM model.
k denotes the identification number of a training sample input sequence.
c denotes the number of rules, i denotes the identification number of the i-th cluster, j denotes the identification number of the j-th cluster, and i and j are not the same cluster.
m denotes the fuzzy coefficient.
Figure BDA0003887391270000151
denotes the m-th power of the membership value of the k-th training sample input sequence to the i-th class center.
ω_k is the density value of the k-th training sample input sequence and serves as its weight.
Figure BDA0003887391270000152
is the Euclidean distance from the k-th training sample input sequence to the i-th class center, and
Figure BDA0003887391270000153
x_k denotes the k-th training sample sequence of the input layer in the ECM model.
v_i denotes the i-th class center.
v_j denotes the j-th class center.
γ is a proportion coefficient.
Figure BDA0003887391270000154
is the Euclidean distance from the i-th cluster center to the j-th cluster center, and
Figure BDA0003887391270000155
η is a regularization term coefficient.
In the present invention, the formula for solving g_ik is:
Figure BDA0003887391270000156
g_ik denotes the membership value of the k-th training sample input sequence to the i-th class center.
l denotes the identification number of the l-th cluster, and l, i and j are not the same cluster.
Figure BDA0003887391270000157
is the Euclidean distance from the k-th training sample input sequence to the l-th class center, and
Figure BDA0003887391270000158
x_k denotes the k-th training sample sequence of the input layer in the ECM model.
v_l denotes the l-th class center.
v_j denotes the j-th class center.
Figure BDA0003887391270000159
is the Euclidean distance from the i-th cluster center to the j-th cluster center, and
Figure BDA00038873912700001510
In the present invention, the formula for calculating the i-th class center v_i is:
Figure BDA00038873912700001511
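For orientation only, the conventional fuzzy C-means updates that the improved algorithm starts from can be sketched as follows. This is not the improved objective J: the inter-class separation term (with γ and η) is omitted because its exact form is available only as an image. The membership update is the standard one, and the center update shown is the one that follows when each sample carries a weight ω_k in the compactness term alone. The function names and the toy data are hypothetical.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fcm_memberships(samples, centers, m: float = 2.0) -> List[List[float]]:
    """Conventional FCM: g[i][k] = 1 / sum_l (d_ik / d_lk) ** (2 / (m - 1))."""
    power = 2.0 / (m - 1.0)
    g = [[0.0] * len(samples) for _ in centers]
    for k, x in enumerate(samples):
        dists = [euclidean(x, v) for v in centers]
        if 0.0 in dists:                       # sample sits exactly on a center
            g[dists.index(0.0)][k] = 1.0
            continue
        for i, d_ik in enumerate(dists):
            g[i][k] = 1.0 / sum((d_ik / d_lk) ** power for d_lk in dists)
    return g


def weighted_centers(samples, g, weights, m: float = 2.0) -> List[List[float]]:
    """Center update with per-sample weights omega_k (separation term omitted)."""
    dim = len(samples[0])
    centers = []
    for row in g:
        coeffs = [w * (g_ik ** m) for w, g_ik in zip(weights, row)]
        total = sum(coeffs)
        centers.append([sum(c * x[d] for c, x in zip(coeffs, samples)) / total
                        for d in range(dim)])
    return centers


samples = [[0.0], [1.0], [4.0]]                # hypothetical one-dimensional inputs
g = fcm_memberships(samples, centers=[[0.0], [4.0]])
v = weighted_centers(samples, g, weights=[1.0, 1.0, 1.0])
```

In the improved algorithm described in the text, the initial centers would come from subtractive clustering and the weights from the density values ω_k.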
In the present invention, the output information of x_f after the fuzzy layer is denoted
Figure BDA00038873912700001512
Figure BDA00038873912700001513
denotes the 1st item of training information output after x_f passes through the fuzzy layer.
Figure BDA00038873912700001514
denotes the 2nd item of training information output after x_f passes through the fuzzy layer.
Figure BDA00038873912700001515
denotes the (c×τ)-th item of training information output after x_f passes through the fuzzy layer.
Construction step 3: setting the rule layer information of the ECM model;
in the invention, each rule in the rule layer of the ECM model is obtained by multiplying the output information of the fuzzy layer; the output information of the rule layer is then denoted
Figure BDA0003887391270000161
Figure BDA0003887391270000162
denotes the output value obtained for x_f by the 1st of the c rules of the rule layer.
Figure BDA0003887391270000163
denotes the output value obtained for x_f by the 2nd of the c rules of the rule layer.
Figure BDA0003887391270000164
denotes the output value obtained for x_f by the q-th of the c rules of the rule layer.
Construction step 4: setting the normalization layer information of the ECM model;
in the invention, the output information of the normalization layer of the ECM model is the ratio of the output information of each rule in the rule layer to the sum of the output information of all the rules, and is denoted
Figure BDA0003887391270000165
Figure BDA0003887391270000166
denotes the output value obtained for x_f by the 1st of the c rules of the normalization layer.
Figure BDA0003887391270000167
denotes the output value obtained for x_f by the 2nd of the c rules of the normalization layer.
Figure BDA0003887391270000168
denotes the output value obtained for x_f by the q-th of the c rules of the normalization layer.
Construction step 5: setting the deblurring layer information of the ECM model;
in the present invention, the deblurring layer of the ECM model, in a first aspect, receives the output information RA_xf of the normalization layer.
In a second aspect, it receives a training sample input sequence of the ECM model, taking x_f as an example.
The output of the deblurring layer
Figure BDA0003887391270000169
is calculated from RA_xf, x_f and the deblurring layer parameters MP = [mp_1; mp_2; …; mp_Hw; p_cot] according to formula (4).
mp_1 is the deblurring layer parameter for the 1st chlorophyll a concentration in x_f.
mp_2 is the deblurring layer parameter for the 2nd chlorophyll a concentration in x_f.
mp_Hw is the deblurring layer parameter for the H_w-th chlorophyll a concentration in x_f.
p_cot is the constant term in the deblurring layer.
Figure BDA00038873912700001610
denotes the output value obtained for x_f by the 1st of the c rules of the deblurring layer.
Figure BDA00038873912700001611
denotes the output value obtained for x_f by the 2nd of the c rules of the deblurring layer.
Figure BDA00038873912700001612
denotes the output value obtained for x_f by the q-th of the c rules of the deblurring layer.
Figure BDA00038873912700001613
Construction step 6: setting the output layer information of the ECM model;
in the present invention, the output information of the output layer of the ECM model for x_f is obtained by summing the output information of the deblurring layer, and is denoted
Figure BDA00038873912700001614
and
Figure BDA00038873912700001615
Similarly, for the input data X in the training sample set, the output information of the ECM model output layer is denoted ME_out, and
Figure BDA00038873912700001616
The output information of the output layer of the ECM model for x_1 is obtained by summing the output information of the deblurring layer, and is
Figure BDA00038873912700001617
The output information of the output layer of the ECM model for x_2 is obtained by summing the output information of the deblurring layer, and is
Figure BDA00038873912700001618
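Construction steps 2 to 6 describe one forward pass through a T-S fuzzy network, which can be sketched as below. The layer order (Gaussian memberships, product rule layer, normalization, deblurring or defuzzification, summation) follows the text. Two details are assumptions because the formulas are images: the Gaussian is written as exp(−(x − mea)² / (2ψ²)), and formula (4) is taken to be the usual first-order T-S consequent, in which each rule multiplies its normalized strength by a linear function of x with parameters mp_1 … mp_Hw and constant p_cot. All names and numbers are hypothetical.

```python
import math
from typing import List, Sequence


def gfun(x: float, mea: float, psi: float) -> float:
    """Gaussian membership function (assumed form)."""
    return math.exp(-((x - mea) ** 2) / (2.0 * psi ** 2))


def ecm_forward(x: Sequence[float],
                means: Sequence[Sequence[float]],
                sigmas: Sequence[Sequence[float]],
                mp: Sequence[Sequence[float]]) -> float:
    """One forward pass for a single input sequence x with c rules.

    means[q][d], sigmas[q][d]: membership parameters of rule q for input d.
    mp[q]: [mp_1, ..., mp_Hw, p_cot] of rule q (assumed per-rule parameters).
    """
    # Fuzzy layer: c x len(x) membership values.
    u = [[gfun(xd, m, s) for xd, m, s in zip(x, mq, sq)] for mq, sq in zip(means, sigmas)]
    # Rule layer: product of the memberships belonging to each rule.
    r: List[float] = []
    for uq in u:
        strength = 1.0
        for value in uq:
            strength *= value
        r.append(strength)
    # Normalization layer: each rule strength divided by the sum of all strengths.
    total = sum(r)
    ra = [rq / total for rq in r]
    # Deblurring layer: normalized strength times the rule's linear consequent.
    de = [raq * (sum(p * xd for p, xd in zip(mpq[:-1], x)) + mpq[-1])
          for raq, mpq in zip(ra, mp)]
    # Output layer: sum over all rules.
    return sum(de)


me_out_f = ecm_forward(
    x=[1.0, 2.0],
    means=[[1.0, 2.0], [1.0, 2.0]],
    sigmas=[[1.0, 1.0], [1.0, 1.0]],
    mp=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
)
```

With two identical rules the normalized strengths are both 0.5, so the output is 0.5 × 1.0 + 0.5 × 2.0.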
Parameter optimization constructs a squared error loss function from the bias BIA and the ECM model output information ME_out, and then updates the mean and standard deviation of the Gaussian membership function in the ECM model and the parameters in the deblurring layer by gradient descent. When the set number of training iterations is reached, the sum of squared errors between BIA and the ME_out obtained in each iteration is calculated, the parameters corresponding to the minimum sum of squared errors are selected as the final parameters of the ECM model, and the training of the ECM model is complete.
In the invention, the S-ELM model and the ECM model form a comprehensive prediction model based on time series features and error compensation, denoted the SMRELM model. The predicted chlorophyll a concentration value of the SMRELM model is obtained by summing the output information LD_out of the S-ELM model and the output information ME_out of the ECM model.
(V) Generation of membership Functions
In the invention, the Gaussian membership function Gfun(mea, psi) is established by feeding the training sample input set X into the improved fuzzy C-means clustering algorithm and deriving a mean and a standard deviation from the result. The mean mea is the cluster center v_i generated by the improved fuzzy C-means clustering; the standard deviation psi is obtained by a weighted calculation in which the membership values g generated by the improved fuzzy C-means clustering algorithm serve as the weights of the chlorophyll a concentration values in each row of samples in X.
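As a rough illustration of how a Gaussian membership function can be parameterised from a cluster center and membership-weighted data, consider the sketch below. The exact weighting formula of the patent is contained in its figures and is not reproduced here; the membership-weighted standard deviation used in weighted_std is an assumption, and the function names are hypothetical.

```python
import math


def weighted_std(values, weights, center):
    """Membership-weighted standard deviation around a cluster center (assumed form)."""
    total = sum(weights)
    var = sum(w * (x - center) ** 2 for x, w in zip(values, weights)) / total
    return math.sqrt(var)


def gfun(x, mea, psi):
    """Gaussian membership value of x for mean mea and standard deviation psi."""
    return math.exp(-((x - mea) ** 2) / (2.0 * psi ** 2))
```

A sample located exactly at the cluster center receives membership 1, and the membership decays with the distance measured in units of psi.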
In the improved fuzzy C-means clustering algorithm, the cluster center v_i is calculated as:
Figure BDA0003887391270000171
The membership degree g_ik is calculated as:
Figure BDA0003887391270000172
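For orientation, the update rules of the textbook fuzzy C-means algorithm are sketched below for one-dimensional samples. This is explicitly not the improved algorithm of the invention, whose formulas additionally involve sample weights and the coefficients gamma and eta and are given only in the figures; the function names are hypothetical.

```python
def fcm_memberships(xs, centers, m=2.0):
    """Return g[i][k], the membership of sample k in cluster i (textbook FCM)."""
    power = 2.0 / (m - 1.0)
    g = [[0.0] * len(xs) for _ in centers]
    for k, x in enumerate(xs):
        d = [abs(x - v) for v in centers]
        hits = [i for i, di in enumerate(d) if di == 0.0]
        if hits:  # the sample coincides with one or more centers
            for i in hits:
                g[i][k] = 1.0 / len(hits)
            continue
        for i, di in enumerate(d):
            g[i][k] = 1.0 / sum((di / dj) ** power for dj in d)
    return g


def fcm_centers(xs, g, m=2.0):
    """Return the cluster centers as membership-weighted means (textbook FCM)."""
    return [
        sum(gik ** m * x for gik, x in zip(row, xs)) / sum(gik ** m for gik in row)
        for row in g
    ]
```

Alternating the two functions until the centers stop moving gives the standard algorithm; the memberships of each sample sum to 1 across clusters.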
(VI) Cyanobacterial bloom prediction using the SMRELM model
In the invention, the combination of the S-ELM model and the ECM model used for cyanobacterial bloom prediction is called the SMRELM model. FIG. 4 is a flow chart of lake and reservoir cyanobacterial bloom prediction by the SMRELM model.
Prediction step one: receiving test information;
In the invention, the test information of the cyanobacterial bloom is represented as a set and recorded as TCYB = {tdata_1, tdata_2, …, tdata_σ}.
tdata_1 represents the 1st piece of cyanobacterial bloom analysis information used for prediction.
tdata_2 represents the 2nd piece of cyanobacterial bloom analysis information used for prediction.
tdata_σ represents the last piece of cyanobacterial bloom analysis information used for prediction.
The subscript σ is the total number of pieces of cyanobacterial bloom analysis information used for prediction, referred to as the test total.
For convenience of description, tdata_σ is also used to denote any piece of cyanobacterial bloom analysis information used for prediction.
In the invention, each piece of cyanobacterial bloom analysis information used for prediction carries a sampling time and a chlorophyll a concentration value. Namely:
The sampling time of tdata_1 is recorded as time_tdata_1.
The sampling time of tdata_2 is recorded as time_tdata_2.
The sampling time of tdata_σ is recorded as time_tdata_σ.
The chlorophyll a concentration value of tdata_1 is recorded as tld_data_1.
The chlorophyll a concentration value of tdata_2 is recorded as tld_data_2.
The chlorophyll a concentration value of tdata_σ is recorded as tld_data_σ.
The test information of the cyanobacterial bloom is then:
Figure BDA0003887391270000181
The chlorophyll a concentration values of all items in TCYB = {tdata_1, tdata_2, …, tdata_σ} are recorded as TCYB_LD = [tld_data_1, tld_data_2, …, tld_data_σ].
Prediction step two: applying the S-ELM model to predict the cyanobacterial bloom;
According to the time sequence formed by the historical cyanobacterial bloom analysis information CYB and CYB_ECM and the current data information TCYB_LD_current of the prediction phase, a local prediction segment pseg of the current chlorophyll a concentration is calculated; the trend similarity sl and the distance similarity D_Euclidean between each piece of comparison feature information in pseg and the cyanobacterial bloom prior time sequence characteristic information CYBTF are then calculated. When the model switching condition is met, the corresponding prediction feature is used as the next-moment prediction value of the S-ELM model, the output information of the S-ELM model is recorded as TLD_out, and prediction step three is executed; otherwise, the trained MRELM model is used to predict the next moment, and prediction step nine is executed.
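The switching logic of prediction step two can be sketched as follows. Several points are assumptions made purely for illustration: the trend measure (mean absolute difference of first differences), the use of the last point of a matching prior feature as the prediction, and all names (trend_gap, predict_next, fallback) are hypothetical; the patent defines its own similarity measures and switching condition elsewhere. The default thresholds 0.1 and 0.2 are the values reported in Example 1.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def trend_gap(a, b):
    """Mean absolute difference of first differences (an assumed trend measure)."""
    da = [a[i + 1] - a[i] for i in range(len(a) - 1)]
    db = [b[i + 1] - b[i] for i in range(len(b) - 1)]
    return sum(abs(x - y) for x, y in zip(da, db)) / len(da)


def predict_next(history, features, fallback, sl_max=0.1, d_max=0.2):
    """Use a matching prior feature if one exists, otherwise the fallback model."""
    best = None
    for feat in features:
        head, nxt = feat[:-1], feat[-1]
        if len(head) < 2 or len(history) < len(head):
            continue
        pseg = history[-len(head):]  # local segment of the latest observations
        d = euclidean(pseg, head)
        if trend_gap(pseg, head) <= sl_max and d <= d_max:
            if best is None or d < best[0]:
                best = (d, nxt)
    if best is not None:
        return best[1], "feature"
    return fallback(history), "mrelm"  # fallback stands in for the MRELM model
```

The second element of the returned pair records which branch produced the prediction, which mirrors the branch into prediction step three or prediction step nine.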
Prediction step three: input layer information in the application of the ECM model;
In the invention, in a first aspect the input layer of the ECM model receives the data information of the S-ELM model;
in a second aspect it receives the output information TLD_out of the corresponding S-ELM model.
A test sample set for the ECM model is constructed from the data information of the two aspects and recorded as TX = [tx_1, tx_2, …, tx_s]^T.
tx_1 represents the 1st test sample sequence.
tx_2 represents the 2nd test sample sequence.
tx_s represents the s-th test sample sequence.
Prediction step four: output information of the fuzzification layer in the application of the ECM model;
In the invention, the output information of the fuzzification layer, TU = [tu_1, tu_2, …, tu_{c×τ}], is obtained by computing the membership value of each input sequence in the test sample set TX with the Gaussian membership function Gfun(mea, psi).
tu_1 represents the 1st piece of information output by the fuzzification layer in the prediction stage.
tu_2 represents the 2nd piece of information output by the fuzzification layer in the prediction stage.
tu_{c×τ} represents the (c×τ)-th piece of information output by the fuzzification layer in the prediction stage.
Taking tx_s as an example, the output information of the fuzzification layer is recorded as
Figure BDA0003887391270000182
Figure BDA0003887391270000183
Represents the 1st piece of information output for tx_s by the fuzzification layer in the prediction stage.
Figure BDA0003887391270000184
Represents the 2nd piece of information output for tx_s by the fuzzification layer in the prediction stage.
Figure BDA0003887391270000185
Represents the (c×τ)-th piece of information output for tx_s by the fuzzification layer in the prediction stage.
Prediction step five: output information of the rule layer in the application of the ECM model;
In the invention, the rule layer in the ECM model receives the output information of the fuzzification layer, and the output information of the rule layer
Figure BDA0003887391270000191
is obtained by multiplying the output information of the fuzzification layer.
Figure BDA0003887391270000192
Represents the output value obtained for tx_s through the 1st of the c rules of the rule layer.
Figure BDA0003887391270000193
Represents the output value obtained for tx_s through the 2nd of the c rules of the rule layer.
Figure BDA0003887391270000194
Represents the output value obtained for tx_s through the q-th of the c rules of the rule layer.
Prediction step six: output information of the normalization layer in the application of the ECM model;
In the invention, the normalization layer in the ECM model receives the output information of the rule layer, and the output information of the normalization layer
Figure BDA0003887391270000195
is obtained as the ratio of the output information of each rule in the rule layer to the sum of the output information of all rules.
Figure BDA0003887391270000196
Represents the output value obtained for tx_s through the 1st of the c rules of the normalization layer.
Figure BDA0003887391270000197
Represents the output value obtained for tx_s through the 2nd of the c rules of the normalization layer.
Figure BDA0003887391270000198
Represents the output value obtained for tx_s through the q-th of the c rules of the normalization layer.
Prediction step seven: output information of the defuzzification layer in the application of the ECM model;
In the invention, in a first aspect the defuzzification layer in the ECM model receives the output information of the normalization layer
Figure BDA0003887391270000199
In a second aspect it receives the test sample set of the ECM model, taking tx_s as an example.
The output information of the defuzzification layer
Figure BDA00038873912700001910
is calculated from
Figure BDA00038873912700001911
tx_s and the parameters in the defuzzification layer
Figure BDA00038873912700001912
according to formula (5).
Figure BDA00038873912700001913
Represents the output value obtained for tx_s through the 1st of the c rules of the defuzzification layer.
Figure BDA00038873912700001914
Represents the output value obtained for tx_s through the 2nd of the c rules of the defuzzification layer.
Figure BDA00038873912700001915
Represents the output value obtained for tx_s through the q-th of the c rules of the defuzzification layer.
The superscript T denotes transposition.
Prediction step eight: output information of the output layer in the application of the ECM model;
In the invention, the output value of the output layer of the ECM model
Figure BDA00038873912700001916
is obtained from the output information of the defuzzification layer
Figure BDA00038873912700001917
by summation
Figure BDA00038873912700001918
and
Figure BDA00038873912700001919
Similarly, the output information corresponding to the test sample set TX is recorded as ALE_out, and
Figure BDA00038873912700001920
The output information of the ECM model output layer for tx_1 is obtained by summing the output information of the defuzzification layer
Figure BDA00038873912700001921
The output information of the ECM model output layer for tx_2 is obtained by summing the output information of the defuzzification layer
Figure BDA00038873912700001922
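The layer-by-layer computation of prediction steps four to eight can be summarised in a short sketch. It assumes a first-order Takagi-Sugeno style consequent in the defuzzification layer, because formula (5) is available only as a figure; the function name ecm_forward and the parameter layout (mea and psi as c rows of τ values, mp as c rows of τ+1 values) are hypothetical.

```python
import math


def ecm_forward(tx, mea, psi, mp):
    """Forward pass for one test sample tx with tau inputs and c rules."""
    # fuzzification layer: c x tau Gaussian membership values
    tu = [
        [math.exp(-((x - m) ** 2) / (2.0 * s ** 2)) for x, m, s in zip(tx, mrow, srow)]
        for mrow, srow in zip(mea, psi)
    ]
    # rule layer: product of the membership values belonging to each rule
    rule = [math.prod(row) for row in tu]
    # normalization layer: each rule output divided by the sum over all rules
    total = sum(rule)
    norm = [r / total for r in rule]
    # defuzzification layer: assumed linear consequent p0 + p1*x1 + ... per rule
    defuzz = [
        w * (p[0] + sum(pi * x for pi, x in zip(p[1:], tx)))
        for w, p in zip(norm, mp)
    ]
    # output layer: sum of the defuzzification layer outputs
    return sum(defuzz), norm
```

If all rules share the same constant consequent, the output equals that constant regardless of the input, which is a quick sanity check on the normalization.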
Prediction step nine: final output information of the output layer of the SMRELM model;
In the invention, the final chlorophyll a concentration prediction value FINAL_out is obtained by summing the output information TLD_out of the S-ELM model and the output information ALE_out of the ECM model.
Example 1
The data in this example come from the water quality data set at the sites of the great villa. The data set has a sampling interval of 4 hours; sampling starts at 12:04 on 20 June 2009 and ends at 10:08 on the 27th day of a month in 2012, and the data set comprises 6342 groups of data. The experiment uses daily data obtained by averaging the chlorophyll a concentration values of each day, i.e. a sampling interval of 24 hours, which gives 1086 groups of data. Of these, 900 groups are training data: the first 600 groups are used to construct the MRELM model and to extract the cyanobacterial bloom prior time sequence characteristic information, and the following 300 groups are used to test the prediction performance of the S-ELM model and to construct the ECM model. The remaining 186 groups are used for the final test of the SMRELM model.
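The preprocessing described above (daily averaging of 4-hourly readings, followed by a 600/300/186 split) could look roughly like the sketch below. The record format, a list of (timestamp, value) pairs, and all function names are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def four_hourly(start, values):
    """Attach 4-hourly timestamps, beginning at start, to a list of readings."""
    return [(start + timedelta(hours=4 * i), v) for i, v in enumerate(values)]


def daily_means(records):
    """Average all readings of each calendar day; returns [(date, mean)] by date."""
    buckets = defaultdict(list)
    for stamp, value in records:
        buckets[stamp.date()].append(value)
    return [(day, sum(vals) / len(vals)) for day, vals in sorted(buckets.items())]


def split_sets(daily, n_feature=600, n_ecm=300):
    """Split daily data into MRELM/feature data, ECM data and final test data."""
    feature_part = daily[:n_feature]
    ecm_part = daily[n_feature:n_feature + n_ecm]
    test_part = daily[n_feature + n_ecm:]
    return feature_part, ecm_part, test_part
```

With 1086 daily values, split_sets returns parts of 600, 300 and 186 items, matching the partition used in this example.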
In the MRELM model, the sliding window width H_w is searched over [3, 4, …, 10], the number of hidden-layer neurons over [10, 15, …, 40], and the weighting coefficients C_1 and C_2 each over [10^-8, 10^-7, …, 10^8]. Comparing the root mean square error under the different parameter settings gives an optimal sliding window width of 3, 10 hidden-layer neurons, and weighting coefficients C_1 = 10^-8 and C_2 = 10^-4.
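A minimal sketch of the sliding-window sample construction and of the parameter grid described above is given below. The evaluate callback, which would train an MRELM model for one parameter combination and return its root mean square error, is hypothetical and is not implemented here; the other function names are also illustrative.

```python
import itertools


def make_windows(series, width):
    """Turn a series into (input window, next value) training pairs."""
    return [(series[i:i + width], series[i + width]) for i in range(len(series) - width)]


def parameter_grid():
    """All (H_w, L_h, C_1, C_2) combinations in the ranges given in the text."""
    widths = range(3, 11)                       # 3, 4, ..., 10
    hidden = range(10, 41, 5)                   # 10, 15, ..., 40
    coeffs = [10.0 ** e for e in range(-8, 9)]  # 10^-8, ..., 10^8
    return itertools.product(widths, hidden, coeffs, coeffs)


def grid_search(evaluate):
    """Return the combination for which evaluate(params) is smallest."""
    return min(parameter_grid(), key=evaluate)
```

The grid contains 8 × 7 × 17 × 17 = 16184 combinations, so each call to evaluate should be cheap, which is one practical advantage of extreme learning machines with their closed-form output weights.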
In the cyanobacterial bloom feature extraction experiment, ld_data_max is 29.35 mg/L, the adjustment coefficient p_float of the floating recovery stage is 0.34, the adjustment coefficient p_growth of the continuous growth stage is 0.6, and cw is 40 mg/L. From p_float, p_growth and ld_data_max, cu is calculated as 10 mg/L and cv as 17.61 mg/L. The length of the cyanobacterial bloom prior time sequence characteristic information ranges over [3, 4, …, 7]. According to cu and cv, candidate cyanobacterial bloom feature information with a rising trend (CYB_LD_HX information) is computed. Repeated cyanobacterial bloom feature sequences are then removed from the CYB_LD_HX information according to the distance similarity, which yields the cyanobacterial bloom prior time sequence characteristic information (CYBTF information).
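The two thresholds quoted above follow directly from the products of the adjustment coefficients and the maximum concentration, as the short check below shows. The function name thresholds is hypothetical; the numerical inputs are the values given in this example.

```python
def thresholds(ld_max, p_float, p_growth):
    """Return (cu, cv): the floating-recovery and continuous-growth thresholds."""
    cu = p_float * ld_max   # upper critical value of the floating recovery stage
    cv = p_growth * ld_max  # lower critical value of the continuous growth stage
    return cu, cv


cu, cv = thresholds(ld_max=29.35, p_float=0.34, p_growth=0.6)
print(round(cu, 3), round(cv, 2))
```

The product 0.34 × 29.35 is 9.979 mg/L, which the text rounds to 10 mg/L, and 0.6 × 29.35 is 17.61 mg/L. The ratio 0.6 / 0.34 is approximately 1.76, in line with the relation p_2 = 1.76 × p_1 stated in claim 1.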
In the S-ELM model, the trend similarity threshold is set to 0.1 and the distance similarity threshold ranges over [0, 0.05, …, 1.5]; prediction switches between the MRELM model and CYBTF according to the similarity threshold switching condition and the selected similarity thresholds. Fig. 5 shows the effect of different distance similarity thresholds on the prediction performance. As can be seen from Fig. 5, the optimal distance similarity threshold is 0.2.
In the ECM model, the improved fuzzy C-means clustering algorithm is applied to the latter 300 groups of data to obtain the Gaussian membership function Gfun(mea, psi). The improved algorithm avoids setting the number of Gaussian membership functions manually and reduces the influence of outliers in the test data. In the improved fuzzy C-means clustering algorithm, the fuzzy coefficient m is 2, the subtractive clustering parameter rad is 0.5, the specific gravity coefficient γ is 0.04, and the regular term coefficient η is 0.03.
FIG. 6 shows the lake and reservoir cyanobacterial bloom prediction results of the SMRELM model in comparison with the prediction results of other models. FIG. 7 is a box plot of the root mean square error over 10 experiments for each model, used to compare the stability of the models. Table 1 shows the overall test performance of the BP neural network, the dynamic recurrent neural network (ELMAN) and the SMRELM model, measured by the root mean square error (RMSE), the mean absolute percentage error (MAPE), the normalized root mean square error (NRMSE) and the correlation coefficient (R^2). The comparison shows that the SMRELM model in this embodiment has the lowest RMSE (mean), MAPE and NRMSE and the highest R^2, and that its stability is second only to ELMAN. The SMRELM model in this embodiment therefore achieves higher prediction accuracy and is suitable for lake and reservoir cyanobacterial bloom prediction.
TABLE 1 Lake and reservoir cyanobacterial bloom prediction results and comparison of different methods
Model     RMSE (mean)   MAPE     NRMSE    R^2
BP        1.3917        15.75%   0.3622   0.8721
ELMAN     1.3404        14.40%   0.3488   0.8796
SMRELM    1.3180        14.36%   0.3429   0.8825
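The four indexes in Table 1 can be computed along the following lines. The patent does not state how NRMSE is normalized or how R^2 is defined; this sketch assumes normalization by the population standard deviation of the observations and takes R^2 as the squared Pearson correlation coefficient. The function names are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mape(obs, pred):
    """Mean absolute percentage error in percent; observations must be non-zero."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def nrmse(obs, pred):
    """RMSE divided by the standard deviation of the observations (assumed form)."""
    mean = sum(obs) / len(obs)
    std = math.sqrt(sum((o - mean) ** 2 for o in obs) / len(obs))
    return rmse(obs, pred) / std


def r_squared(obs, pred):
    """Squared Pearson correlation between observations and predictions (assumed form)."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov * cov / (var_o * var_p)
```

In Table 1 the ratio of RMSE to NRMSE is nearly the same for all three models (about 3.84), which is consistent with a fixed normalizer computed from the test observations, although the table alone does not determine which one was used.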

Claims (2)

1. A lake and reservoir cyanobacterial bloom prediction system based on an SMRELM model comprises a cyanobacterial bloom dynamic change trend module (10) and a chlorophyll a concentration extraction index module (20); it is characterized by also comprising: the cyanobacterial bloom time sequence characteristic information extraction module (100), the S-ELM model (200) and the ECM model (30);
the cyanobacterial bloom time sequence characteristic information extraction module (100) is arranged in the cyanobacterial bloom dynamic change trend module (10);
the S-ELM model (200) is arranged in a chlorophyll a concentration extraction index module (20);
the ECM model (30) is arranged at the output end of the chlorophyll a concentration extraction index module (20);
the cyanobacterial bloom time sequence characteristic information extraction module (100) is used for extracting effective time sequence characteristics of the cyanobacterial bloom data in the cyanobacterial bloom dynamic change trend module (10);
Feature extraction step 1: receiving the chlorophyll a concentration information of the cyanobacterial bloom;
first, the concentration CYB_LD used to construct the S-ELM model is received in the cyanobacterial bloom dynamic change trend module (10);
Feature extraction step 2: setting the chlorophyll a concentration threshold information;
at least the upper critical value of the floating recovery stage of the cyanobacterial bloom is set and recorded as cu_upper; cu_upper = p_1 × ld_data_max, where p_1 represents the first chlorophyll a concentration regulation coefficient, and
Figure FDA0003887391260000011
ld_data_max represents the maximum chlorophyll a concentration selected from CYB_LD;
the lower critical value of the continuous growth stage of the cyanobacterial bloom is recorded as cv_lower; cv_lower = p_2 × ld_data_max, where p_2 represents the second chlorophyll a concentration regulation coefficient, and p_2 = 1.76 × p_1;
Feature extraction step 3: extracting the feature information of the continuous growth stage;
according to the chlorophyll a concentration interval cv_lower ≤ ld_data_δ, candidate cyanobacterial bloom chlorophyll a concentration feature information is extracted from CYB_LD and recorded as CYB_LD_HX;
Feature extraction step 4: removing repeated cyanobacterial bloom feature information;
according to the distance similarity, repeated cyanobacterial bloom feature information is removed from CYB_LD_HX to obtain the cyanobacterial bloom prior time sequence characteristic information, recorded as CYBTF;
the S-ELM model (200) construction comprises the following steps:
Construction step A: constructing the network architecture of the MRELM model;
the network architecture of the MRELM model consists of an input layer, a hidden layer and an output layer;
Construction step B: setting the input layer information of the MRELM model;
in a first aspect, the input layer of the MRELM model receives the concentration CYB_LD used to construct the S-ELM model;
in a second aspect, the sliding window width of the MRELM model is set and recorded as H_w; the sliding window width ranges over H_w = [3, 4, …, τ], where τ is the maximum sliding window width;
in a third aspect, the chlorophyll a concentrations in CYB_LD are divided according to H_w into training sample sequences, recorded as LD_DATA;
Construction step C: setting the number of hidden-layer neurons of the MRELM model;
the number of hidden-layer neurons of the MRELM model is recorded as L_h, with 1 < L_h ≤ κ, where κ is the maximum number of hidden-layer neurons; each hidden-layer neuron receives excitation connections from all input-layer neurons, i.e. LD_DATA is passed through the feature mapping of the hidden-layer neurons to obtain the hidden layer output information, recorded as H_out;
Construction step D: constructing the output of the MRELM model;
manifold regularization means that LD_DATA, in the output space HID_out of the hidden layer, preserves the local geometric structure it has in the input layer; if a training sample sequence
Figure FDA0003887391260000022
and another training sample sequence
Figure FDA0003887391260000023
are highly similar in the input layer, they are also highly similar in the output space of the hidden layer, which reduces the influence of randomness and improves the generalization performance of the extreme learning machine;
when the output layer of the MRELM model comprises 1 neuron, the output information corresponding to that neuron is recorded as LDD_out; LDD_out = H_out × β, where β represents the weights between the hidden layer and the output layer;
obtaining an MRELM model after the construction steps A to C;
Construction step E: optimizing the MRELM model;
Construction step E101: the number of training sample subsets divided based on the grid search method is set to b;
Construction step E102: D_ALL is divided into b training sample subsets of equal size, recorded as
Figure FDA0003887391260000021
BLD denotes sampling time ordering-chlorophyll a concentration value, and BLD = [bld_1; bld_2; …; bld_b];
T_BLD denotes sampling time ordering-sliding window-chlorophyll a concentration value, and T_BLD = [t_bld_1; t_bld_2; …; t_bld_b];
bld 1 Representing the 1 st training subset of the cyanobacterial bloom time sequence divided based on the grid search method;
bld 2 representing the training subset of the 2 nd cyanobacterial bloom time sequence divided based on the grid search method;
bld b representing a b-th cyanobacterial bloom time sequence training subset divided based on a grid search method;
t_bld 1 the 1 st cyanobacterial bloom time sequence-sliding window concentration divided based on the grid search method is represented;
t_bld 2 representing the 2 nd cyanobacterial bloom time sequence-sliding window concentration divided based on the grid search method;
t_bld b representing the concentration of the sliding window which is the b-th cyanobacterial bloom time sequence divided based on the grid search method;
Construction step E103: setting the training sample subset SUB_D_training and the evaluation sample subset SUB_D_evaluation;
SUB_D_training is put into the MRELM model for learning;
SUB_D_evaluation is put into the trained MRELM model for evaluation;
SUB_D_evaluation is the b-th cyanobacterial bloom time sequence-concentration training sample input subset selected from SUB_D; SUB_D_training then consists of all cyanobacterial bloom time sequence-concentration training sample input subsets in SUB_D other than SUB_D_evaluation;
Construction step E104: the two weighting coefficients in the MRELM model objective function are recorded as the first weighting coefficient C_1 and the second weighting coefficient C_2, and the weight is recorded as β; SUB_D_training is put into the MRELM model for learning to obtain the weight β;
Construction step E105: bld_4 in SUB_D_evaluation is put into the trained MRELM model for evaluation, and the evaluated chlorophyll a concentration values corresponding to the evaluation samples are output and recorded as LDD_out_BLD_evaluation;
Construction step E106: the root mean square error between t_bld_4 in SUB_D_evaluation and LDD_out_BLD_evaluation is calculated and recorded as JFC;
in the invention, the smaller JFC is, the higher the prediction accuracy of the MRELM model is;
Construction step E107: setting the value ranges of the weighting coefficients C_1, C_2 and calculating the root mean square error values under different parameter conditions;
the value ranges of the weighting coefficients are set as C_1 = [10^-8, 10^-7, …, ψ_1] and C_2 = [10^-8, 10^-7, …, ψ_2], where ψ_1 is the maximum value of the first weighting coefficient C_1 and ψ_2 is the maximum value of the second weighting coefficient C_2;
according to the values of H_w, L_h, C_1, C_2, the sample subset bld_b of SUB_D_evaluation is taken as input; for each combination of values examined, the MRELM model outputs the corresponding evaluation values, and the root mean square error between the evaluation values and t_bld_b is calculated
Figure FDA0003887391260000041
Construction step E108: the minimum root mean square error value in construction step E107 is selected and recorded as JFC_min; the H_w, L_h, C_1, C_2 corresponding to JFC_min are taken as the parameters of the MRELM model, which completes the optimization and yields the trained MRELM model;
the construction step of the ECM model (30) comprises the following steps:
Construction step 1: setting the input layer information of the ECM model;
in a first aspect, the input layer of the ECM model receives the historical cyanobacterial bloom analysis information CYB_ECM;
in a second aspect, it receives the output information LD_out of the corresponding S-ELM model;
in a third aspect, the chlorophyll a concentration values in CYB_ECM are compared with LD_out and their difference is calculated and recorded as BIA;
in a fourth aspect, a training sample set for the ECM model is constructed from CYB_ECM, LD_out and BIA and recorded as X_ALL = [X, BIA];
X represents a training sample sequence input in the ECM model;
Construction step 2: setting the fuzzification layer information of the ECM model;
the membership function of the fuzzification layer in the ECM model is a Gaussian membership function, recorded as Gfun(mea, psi); the mean mea is the cluster center generated by the improved fuzzy C-means clustering, and the standard deviation psi is obtained by a weighted calculation in which the membership values generated by the improved fuzzy C-means clustering algorithm serve as the weights of the chlorophyll a concentration values in each row of samples in X;
the objective function of the improved fuzzy C-means clustering algorithm is
Figure FDA0003887391260000042
h represents the total number of training samples;
k represents a training sample identification number;
c represents the total clustering number, namely the rule number;
i represents the identification number of the ith cluster, j represents the identification number of the jth cluster, and i and j are not the same cluster;
m represents a blurring coefficient;
Figure FDA0003887391260000051
represents the m-th power of the membership degree of the k-th training sample with respect to the i-th cluster center;
ω_k is the weight of the k-th training sample;
Figure FDA0003887391260000052
is the Euclidean distance from the k-th training sample to the i-th cluster center, and
Figure FDA0003887391260000053
Gamma is a specific gravity coefficient;
Figure FDA0003887391260000054
is the Euclidean distance from the ith cluster center to the jth cluster center, and
Figure FDA0003887391260000055
eta is a regular term coefficient;
v i representing the class i center;
solving to obtain g ik Is calculated by the formula
Figure FDA0003887391260000056
l represents the identification number of the l-th cluster, i represents the identification number of the i-th cluster, j represents the identification number of the j-th cluster, and l, i and j are not the same cluster;
Figure FDA0003887391260000057
is the Euclidean distance from the k-th training sample to the l-th cluster center, and
Figure FDA0003887391260000058
Figure FDA0003887391260000059
is the Euclidean distance from the ith cluster center to the jth cluster center, and
Figure FDA00038873912600000510
class i center v i Is calculated by the formula
Figure FDA00038873912600000511
x k A k training sample sequence representing an input layer in the error compensation model;
v j representing class j class centers;
Construction step 3: setting the rule layer information of the ECM model;
each rule in the rule layer of the ECM model is obtained by multiplying the output information of the fuzzification layer, and the output information of the rule layer is recorded as
Figure FDA00038873912600000512
Construction step 4: setting the normalization layer information of the ECM model;
the output information of the normalization layer of the ECM model is obtained as the ratio of the output information of each rule in the rule layer to the sum of the output information of all rules, and the output information of the normalization layer is recorded as
Figure FDA0003887391260000061
Construction step 5: setting the defuzzification layer information of the ECM model;
in a first aspect, the defuzzification layer of the ECM model receives the output information of the normalization layer
Figure FDA0003887391260000062
A second aspect is for receiving an input sequence of training samples, in x, for an ECM model f For example;
output of deblurring layer
Figure FDA0003887391260000063
Is formed by
Figure FDA0003887391260000064
x f And the parameter MP in the deblurred layer is calculated, i.e.
Figure FDA0003887391260000065
Construction step 6: setting the output layer information of the ECM model;
the output information of the ECM model output layer for x_f is obtained by summing the output information of the defuzzification layer
Figure FDA0003887391260000066
And recording the output information of the ECM model output layer corresponding to the X in the input data in the training sample set as ME out
An error compensation model, namely an ECM model, is obtained through the construction steps 1 to 6.
2. The SMRELM model-based lake and reservoir cyanobacterial bloom prediction system according to claim 1, characterized in that: the steps for predicting the blue algae bloom in the lake and reservoir comprise:
Prediction step one: receiving test information;
the test information of the cyanobacterial bloom is recorded as TCYB;
Prediction step two: applying the S-ELM model to predict the cyanobacterial bloom;
according to the time sequence formed by the historical cyanobacterial bloom analysis information CYB and CYB_ECM and the current data information TCYB_LD_current of the prediction phase, a local prediction segment pseg of the current chlorophyll a concentration is calculated, and the trend similarity sl and the distance similarity D_Euclidean between each piece of comparison feature information in pseg and the cyanobacterial bloom prior time sequence characteristic information CYBTF are then calculated; when the model switching condition is met, the corresponding prediction feature is used as the next-moment prediction value of the S-ELM model, the output information of the S-ELM model is recorded as TLD_out, and prediction step three is executed; otherwise, the trained MRELM model is used to predict the next moment, and prediction step nine is executed;
Prediction step three: input layer information in the application of the ECM model;
in a first aspect, the input layer of the ECM model receives the data information of the S-ELM model;
in a second aspect, it receives the output information TLD_out of the corresponding S-ELM model;
a test sample set for the ECM model is constructed from the data information of the two aspects and recorded as TX;
Prediction step four: output information of the fuzzification layer in the application of the ECM model;
the output information TU of the fuzzification layer is obtained by computing the membership value of each input sequence in the test sample set TX with the Gaussian membership function Gfun(mea, psi);
Prediction step five: output information of the rule layer in the application of the ECM model;
the rule layer in the ECM model receives the output information of the fuzzification layer, and the output information of the rule layer
Figure FDA0003887391260000071
is obtained by multiplying the output information of the fuzzification layer;
Prediction step six: output information of the normalization layer in the application of the ECM model;
the normalization layer in the ECM model receives the output information of the rule layer, and the output information of the normalization layer
Figure FDA0003887391260000072
is obtained as the ratio of the output information of each rule in the rule layer to the sum of the output information of all rules;
Prediction step seven: output information of the defuzzification layer in the application of the ECM model;
in a first aspect, the defuzzification layer in the ECM model receives the output information of the normalization layer
Figure FDA0003887391260000073
A second aspect is for receiving a set of test samples of an ECM model at tx s For example;
output information of deblurring layer
Figure FDA0003887391260000074
By
Figure FDA0003887391260000075
tx s And parameter MP in the deblurred layer T Calculating to obtain;
Prediction step eight: output information of the output layer in the application of the ECM model;
the output value of the output layer of the ECM model
Figure FDA0003887391260000076
is obtained from the output information of the defuzzification layer
Figure FDA0003887391260000077
by summation
Figure FDA0003887391260000078
the output information corresponding to the test sample set TX is recorded as ALE_out;
Prediction step nine: final output information of the output layer of the SMRELM model;
the final chlorophyll a concentration prediction value FINAL_out is obtained by summing the output information TLD_out of the S-ELM model and the output information ALE_out of the ECM model.
CN202211248275.8A 2022-07-08 2022-10-12 Lake and reservoir cyanobacterial bloom prediction system based on SMRELM model Pending CN115587538A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210806836 2022-07-08
CN2022108068365 2022-07-08

Publications (1)

Publication Number Publication Date
CN115587538A (en) 2023-01-10

Family

ID=84779667

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211248275.8A Pending CN115587538A (en) 2022-07-08 2022-10-12 Lake and reservoir cyanobacterial bloom prediction system based on SMRELM model

Country Status (1)

Country Link
CN (1) CN115587538A (en)

Similar Documents

Publication Publication Date Title
Huan et al. Prediction of dissolved oxygen in aquaculture based on gradient boosting decision tree and long short-term memory network: A study of Chang Zhou fishery demonstration base, China
CN109308544B (en) Blue algae bloom prediction method based on contrast divergence-long and short term memory network
Wang et al. An approach of improved Multivariate Timing-Random Deep Belief Net modelling for algal bloom prediction
CN106056127A (en) GPR (gaussian process regression) online soft measurement method with model updating
CN110889085A (en) Intelligent wastewater monitoring method and system based on complex network multiple online regression
CN113554466B (en) Short-term electricity consumption prediction model construction method, prediction method and device
Gutiérrez-Estrada et al. A heuristic approach to predicting water beetle diversity in temporary and fluctuating waters
CN114417740B (en) Deep sea breeding situation sensing method
CN111242380A (en) Lake (reservoir) eutrophication prediction method based on artificial intelligence algorithm
Wang et al. An approach of recursive timing deep belief network for algal bloom forecasting
CN111079926A (en) Equipment fault diagnosis method with self-adaptive learning rate based on deep learning
CN107729988B (en) Blue algae bloom prediction method based on dynamic deep belief network
CN115456245A (en) Prediction method for dissolved oxygen in tidal river network area
CN117113735B (en) Algal bloom intelligent early warning method and system based on multilayer ecological model
Ni et al. An improved attention-based bidirectional LSTM model for cyanobacterial bloom prediction
Huan et al. River dissolved oxygen prediction based on random forest and LSTM
Peng et al. Monitoring of wastewater treatment process based on multi-stage variational autoencoder
CN109146007B (en) Solid waste intelligent treatment method based on dynamic deep belief network
CN109408896A (en) A kind of anaerobic sewage processing gas production multi-element intelligent method for real-time monitoring
Fantin‐Cruz et al. Zooplankton density prediction in a flood lake (Pantanal–Brazil) using artificial neural networks
CN109214513B (en) Solid-liquid waste intelligent coupling treatment method based on adaptive deep belief network
CN115587538A (en) Lake and reservoir cyanobacterial bloom prediction system based on SMRELM model
CN116432832A (en) Water quality prediction method based on XGBoost-LSTM prediction model
Xia et al. Environmental factor assisted chlorophyll-a prediction and water quality eutrophication grade classification: A comparative analysis of multiple hybrid models based on a SVM
CN112862173B (en) Lake and reservoir cyanobacterial bloom prediction method based on self-organizing deep confidence echo state network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination