CN116362394A - Synergistic prediction method and system for marine algae growth pollution - Google Patents

Info

Publication number
CN116362394A
Authority
CN
China
Prior art keywords
algae growth
marine algae
pollution
related data
growth
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310301771.3A
Other languages
Chinese (zh)
Inventor
刘树堂
张丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University filed Critical Shandong University
Priority to CN202310301771.3A priority Critical patent/CN116362394A/en
Publication of CN116362394A publication Critical patent/CN116362394A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • Data Mining & Analysis (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Operations Research (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Mathematical Physics (AREA)
  • Development Economics (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Educational Administration (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Game Theory and Decision Science (AREA)
  • Evolutionary Biology (AREA)
  • Quality & Reliability (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Algebra (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Farming Of Fish And Shellfish (AREA)

Abstract

The invention discloses a method and a system for synergistic prediction of marine algae growth pollution, belonging to the technical field of algae pollution treatment. Drawing on existing empirical results on algae growth factors and on nonlinear complex system theory, statistical analysis and multiple linear regression prediction are carried out on real algae growth data to obtain the core factors affecting algae growth, so that marine algae pollution can be treated. The method identifies the factors influencing marine algae growth according to the actual conditions of different sea areas, so that marine algae pollution can be treated in a targeted manner; it addresses the lack, in the prior art, of an effective method for predicting the factors that influence marine algae pollution.

Description

Synergistic prediction method and system for marine algae growth pollution
Technical Field
The invention relates to the technical field of algae pollution treatment, and in particular to a method and a system for synergistic prediction of marine algae growth pollution.
Background
The statements in this section merely relate to the background of the present disclosure and may not necessarily constitute prior art.
Under particular environmental conditions, excessive proliferation and dense aggregation of certain marine algae (i.e., "harmful algal blooms") cause marine algae pollution. Marine algae pollution is currently assessed in one of two ways. The traditional way relies on regular ship surveys and periodic manual observation from shore, which makes sudden changes in red tides difficult to detect. Alternatively, the dynamics of green tides are monitored three-dimensionally by multi-source satellite remote sensing, unmanned aerial vehicles, ships and shore-based vehicles; their evolution is predicted with ecological dynamics models, drift prediction models and the like, and changes in their scale are predicted from the long-term equilibrium and causal relationship between green tide coverage area and sea temperature.
All of the above marine algae pollution predictions require a deep understanding of changes in algae abundance or density. However, because algae growth is related in a complex, nonlinear way to sea temperature, chemical proportions, other predatory algae and so on, algae show different growth characteristics across time periods and vertical distributions when weather, wind speed, temperature and other conditions differ. In recent years, human fishing, sewage discharge, water eutrophication, natural climate change and the like have upset the ecological balance of aquatic ecosystems, leading to phenomena such as reduced trophic levels and degradation of dominant populations.
Thus, even though it is broadly understood that increased fertilizer use and global warming are possible causes of worsening algal blooms, there is currently no effective method for treating marine algae pollution. Although machine learning and similar algorithms are currently used to classify the factors influencing algae growth, they do not analyze those factors and therefore cannot help in controlling marine algae pollution.
Disclosure of Invention
To remedy the shortcomings of the prior art, the invention provides a synergistic prediction method and system for marine algae growth pollution, together with an electronic device and a computer-readable storage medium. Drawing on existing empirical results on algae growth factors and on nonlinear complex system theory, statistical analysis and multiple linear regression prediction are performed on real algae growth data from different areas.
In a first aspect, the invention provides a marine algae growth pollution synergistic prediction method;
a marine algae growth pollution synergistic prediction method comprises the following steps:
acquiring marine algae growth related data, and preprocessing the marine algae growth related data;
inputting the preprocessed marine algae growth related data into a preset synergistic prediction model for processing, and obtaining the synergistic relationship among the variables influencing marine algae growth, so as to inhibit marine algae growth pollution according to the synergistic relationship; wherein inputting the preprocessed marine algae growth related data into the preset synergistic prediction model for processing comprises:
performing stationarity analysis on the preprocessed marine algae growth related data to obtain a multivariate temporal dynamic growth trend curve of the algae;
normalizing the preprocessed marine algae growth related data, and determining the key influencing variable and the secondary influencing variables of algae growth pollution based on the multivariate temporal dynamic growth trend curve;
calculating a correlation coefficient matrix between the key influencing variable and the secondary influencing variables, and establishing the synergistic relationship between them; testing the synergistic relationship between the key influencing variable and the secondary influencing variables using the Johansen cointegration test (jcitest), and establishing a multiple linear regression model.
Further, performing stationarity analysis on the preprocessed marine algae growth related data to obtain the multivariate temporal dynamic growth trend curve specifically comprises:
analyzing, from the preprocessed marine algae growth related data, the data for which the frequency of marine algae growth pollution or the polluted area shows a decreasing trend, and obtaining the multivariate temporal dynamic growth trend curve.
Further, determining the key influencing variable and the secondary influencing variables of algae growth pollution based on the multivariate temporal dynamic growth trend curve comprises:
based on the multivariate temporal dynamic growth trend curve, taking the chlorophyll a concentration of the algae as the key influencing variable of algae growth pollution;
and performing correlation analysis on the basic marine water quality data in the marine algae growth related data to obtain the secondary influencing variables with high correlation.
Preferably, secondary influencing variables with low correlation are deleted, and stationarity analysis is performed on the processed data using the Dickey-Fuller test;
whether the core influencing variable follows a normal distribution is judged by Q-Q plot analysis.
Further, the multiple linear regression model is expressed as
$Y = 1.63X_1 + 4.6516X_2 + 3.6104X_3 - 0.91236X_4 + 1.0207X_5 - 2.563X_6 + 0.52605X_7 + 0.33509X_8 + 0.85703X_9 - 0.023949$
where $Y$ is the chlorophyll a concentration, $X_1$ is the five-day biochemical oxygen demand, $X_2$ is the ammonia nitrogen content, $X_3$ is the temperature, $X_4$ is the salinity, $X_5$ is the phaeopigment content, $X_6$ is the silicon content, $X_7$ is the dissolved oxygen content, $X_8$ is the total nitrogen content, and $X_9$ is the total phosphorus content;
or,
the multiple linear regression model is expressed as
$Y = -0.023616X_1 + 1.4527X_2 - 0.29186X_3 + 0.055603X_4 + 0.13686$
where $Y$ is the silicon-phosphorus ratio, $X_1$ is the mean chlorophyll a concentration of the water column, $X_2$ is silicate, $X_3$ is dissolved inorganic nitrogen, and $X_4$ is the nitrogen-phosphorus ratio.
Further, preprocessing marine algae growth-related data includes:
constructing a marine algae growth related data set, and judging whether the missing value in the data set affects marine algae growth pollution prediction;
if yes, supplementing the missing value by an interpolation method; if not, deleting the missing value;
non-numerical data not related to marine algae growth pollution predictions is deleted.
Further, the marine algae growth related data includes chlorophyll concentration, seawater nutrient concentrations, and seawater quality data.
In a second aspect, the invention provides a marine algae growth pollution synergistic prediction system;
a marine algae growth pollution synergistic prediction system, comprising:
a data processing module configured to: acquire marine algae growth related data, and preprocess the marine algae growth related data;
a synergistic relationship prediction module configured to: input the preprocessed marine algae growth related data into a preset synergistic prediction model for processing, and obtain the synergistic relationship among the variables influencing marine algae growth, so as to inhibit marine algae growth pollution according to the synergistic relationship; wherein inputting the preprocessed marine algae growth related data into the preset synergistic prediction model for processing comprises:
performing stationarity analysis on the preprocessed marine algae growth related data to obtain a multivariate temporal dynamic growth trend curve of the algae;
normalizing the preprocessed marine algae growth related data, and determining the key influencing variable and the secondary influencing variables of algae growth pollution based on the multivariate temporal dynamic growth trend curve;
calculating a correlation coefficient matrix between the key influencing variable and the secondary influencing variables, and establishing the synergistic relationship between them; testing the synergistic relationship between the key influencing variable and the secondary influencing variables using the Johansen cointegration test (jcitest), and establishing a multiple linear regression model, so as to adjust the proportions of the secondary influencing variables according to the synergistic relationship and inhibit marine algae growth pollution.
In a third aspect, the present invention provides an electronic device;
an electronic device comprising a memory and a processor and computer instructions stored on the memory and running on the processor, which when executed by the processor, perform the steps of the marine algae growth pollution synergistic prediction method described above.
In a fourth aspect, the present invention provides a computer-readable storage medium;
a computer-readable storage medium storing computer instructions which, when executed by a processor, perform the steps of the marine algae growth pollution synergistic prediction method described above.
Compared with the prior art, the invention has the beneficial effects that:
According to the technical solution provided by the invention, a time-dynamic comparative multiple linear regression model is constructed from real algae growth related data, based on the complex nonlinear dynamic characteristics of algae growth and the characteristics of actual algae growth data, and taking into account the economic development level of each region. This helps to ascertain how human activities influence algae pollution and to determine the key factors influencing marine algae growth pollution, so that these factors can be regulated to inhibit the growth of marine algae and remedy marine algae growth pollution.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
FIG. 1 is a schematic flow chart provided in an embodiment of the present invention;
FIG. 2 is a graph showing the dynamic growth trend of algae in a sea area according to the present invention; wherein, (a) is a time dynamic growth trend curve schematic diagram of water nutrient salt, and (b) is a time dynamic growth trend curve schematic diagram of chlorophyll a;
FIG. 3 is a schematic diagram of the correlations between the silicon-phosphorus ratio and other related data in a sea area with an average level of economic development, according to the embodiment of the present invention;
FIG. 4 is a schematic diagram showing a correlation test between the silicon-phosphorus ratio of a sea area and other related data in accordance with the present invention; wherein, (a) is a scatter diagram and regression confidence interval diagram of the predicted value and the true value, (b) is a normal distribution check diagram of the residual error of the predicted value and the true value, and (c) is a scatter diagram of the relation between the fitting value and the residual error value;
FIG. 5 is a graph showing the multi-element time dynamic growth trend of algae in a sea area with high economic development level according to the embodiment of the invention; wherein, (a) is a time dynamic growth trend curve diagram of chlorophyll a, and (b) is a time dynamic growth trend curve diagram of water nutrient salt;
FIG. 6 is a schematic diagram of correlation coefficients between variables such as nutrient salts and chlorophyll a of a water body in a sea area with a high economic development level, which is provided by the embodiment of the invention;
FIG. 7 is a schematic diagram of the test of the synergistic relationship between chlorophyll a and the secondary influencing variables in a sea area with a high economic development level according to the embodiment of the present invention; wherein (a) is a graph of the actual and fitted chlorophyll a concentrations, (b) is a residual plot, (c) is a schematic diagram of the probability distribution of the residuals, and (d) is a scatter plot, with regression confidence interval, of the model predictions against the existing statistical data.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present invention. As used herein, unless the context clearly indicates otherwise, the singular forms also are intended to include the plural forms, and furthermore, it is to be understood that the terms "comprises" and "comprising" and any variations thereof are intended to cover non-exclusive inclusions, such as, for example, processes, methods, systems, products or devices that comprise a series of steps or units, are not necessarily limited to those steps or units that are expressly listed, but may include other steps or units that are not expressly listed or inherent to such processes, methods, products or devices.
Embodiments of the invention and features of the embodiments may be combined with each other without conflict.
Example 1
With reference to FIG. 1: existing machine learning algorithms pay no attention to the correlations and causal relationships behind the factors influencing algae growth, and therefore do not quantify how individual factors or combinations of factors (i.e., multivariate synergy) affect algae concentration. Here, a synergistic relationship (cointegration, in time-series terms) refers to a long-term stable relationship exhibited by the time series produced by the actual data-generating process; regression analysis and prediction are meaningful only if such a relationship exists among the time series. If the time series are non-stationary and regression analysis is applied directly, the significance and goodness of fit of the regression may all appear good even though no regression relationship actually exists (spurious regression). Therefore, the invention provides a marine algae growth pollution synergistic prediction method, which comprises the following steps:
S1, acquiring marine algae growth related data and preprocessing it; specifically, constructing a marine algae growth related data set and judging whether the missing values in the data set affect the synergistic prediction of marine algae growth pollution; if so, filling in the missing values by interpolation; if not, deleting them; and deleting non-numerical data unrelated to the synergistic prediction of marine algae growth pollution.
S2, inputting the preprocessed marine algae growth related data into a preset synergistic prediction model for processing, and obtaining the synergistic relationship among the variables influencing marine algae growth, so as to inhibit marine algae growth pollution according to the synergistic relationship; wherein inputting the preprocessed marine algae growth related data into the preset synergistic prediction model for processing comprises:
S201, performing stationarity analysis on the preprocessed marine algae growth related data to obtain a multivariate temporal dynamic growth trend curve of the algae; specifically, analyzing, from the preprocessed data, the marine algae growth related data for which the frequency of marine algae growth pollution or the polluted area shows a decreasing trend, and obtaining the multivariate temporal dynamic growth trend curve.
According to pattern dynamics in nonlinear dynamical systems theory, spatial or temporal changes in marine algae abundance are related to behaviors such as diffusion and migration. As an open, dissipative system far from thermodynamic equilibrium, the algae population exhibits complex behavior under the effects of predation, toxins and the like. A defining feature of a complex system is that, spatially, the overall change cannot be predicted from local rules and, temporally, short-term rules do not predict long-term change well. More data is therefore not necessarily better; good short-term predictions must be based on actual algae growth data from each sea area and on analysis of its specific dynamic trend.
Specifically, the preprocessed marine algae growth related data are log-transformed and their stationarity is analyzed, so as to identify the data for which the frequency of marine algae growth pollution or the polluted area shows a decreasing trend and to obtain the multivariate temporal dynamic growth trend curve.
S202, normalizing the preprocessed marine algae growth related data, and determining the key influencing variable and the secondary influencing variables of algae growth pollution based on the multivariate temporal dynamic growth trend curve. The specific steps are as follows:
S2021, obtaining the key influencing variable of algae growth pollution based on the multivariate temporal dynamic growth trend curve;
S2022, performing correlation analysis on the basic marine water quality data in the marine algae growth related data to obtain the secondary influencing variables with high correlation;
S2023, deleting secondary influencing variables with low correlation, and performing stationarity analysis on the processed data using the Dickey-Fuller test; the specific procedure is as follows:
(1) Assume that the time series under test follows the autoregressive model
$Y_t = \delta Y_{t-1} + \varepsilon_t$,
where $\varepsilon_t$ is the random error term;
(2) Test whether $\lambda = \delta - 1$ is 0 in the differenced regression model
$\Delta Y_t = \lambda Y_{t-1} + \varepsilon_t$;
(3) If $\lambda = 0$, the time series is a unit root process, the most basic non-stationary process, and is therefore non-stationary. The critical values for the significance level are obtained by Monte Carlo simulation.
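The test regression in steps (1) to (3) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patent's implementation: it fits the no-constant regression by ordinary least squares, the helper names `dickey_fuller_t` and `ar1` are hypothetical, the sample series is simulated, and the 5% critical value of about -1.95 is the commonly tabulated large-sample value for the no-constant, no-trend case.

```python
import math
import random


def dickey_fuller_t(series):
    """Return (lambda_hat, t_statistic) for dY_t = lambda * Y_{t-1} + e_t (no constant)."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series, series[1:])]
    sxx = sum(x * x for x in lagged)
    lam = sum(x * d for x, d in zip(lagged, diffs)) / sxx
    residuals = [d - lam * x for x, d in zip(lagged, diffs)]
    sigma2 = sum(e * e for e in residuals) / (len(diffs) - 1)
    return lam, lam / math.sqrt(sigma2 / sxx)


def ar1(phi, n, seed):
    """Simulated series Y_t = phi * Y_{t-1} + noise (illustrative data only)."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


CRITICAL_5PCT = -1.95  # assumed approximate Dickey-Fuller critical value
lam, t_stat = dickey_fuller_t(ar1(0.2, 500, seed=0))
is_stationary = t_stat < CRITICAL_5PCT  # reject the unit root when t is very negative
```

The statistic is compared with Dickey-Fuller critical values rather than with the usual Student t table, because under the unit root hypothesis its distribution is non-standard.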
S203, calculating a correlation coefficient matrix between the key influencing variable and the secondary influencing variables, and selecting the independent variables that affect the predicted variable; then, based on the definition of the synergistic relationship, analyzing the influencing factors that have a synergistic relationship to obtain the specific coefficients and statistical indicators of the multiple linear regression model; testing the synergistic relationship between the key influencing variable and the secondary influencing variables using the Johansen cointegration test (jcitest), and establishing the multiple linear regression model; and judging by Q-Q plot analysis whether the residuals of the multiple linear regression prediction model follow a normal distribution. A Q-Q plot (quantile-quantile plot) works by computing the quantiles of a set of sample data and comparing them with the quantiles of a known distribution, in order to judge whether the data follow that distribution.
The multiple linear regression model expresses the synergistic relationship between the key influencing variable and the secondary influencing variables, so that the proportions of the secondary influencing variables can be adjusted according to that relationship to inhibit marine algae growth pollution.
The normality test determines whether the residuals of the multiple linear regression model follow a normal distribution; only when they do can the model be considered optimal.
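The quantile comparison behind a Q-Q plot can be sketched as follows. This is a minimal sketch, assuming standardized residuals and the plotting positions (i + 0.5)/n; the function name `qq_points` and the sample residuals are illustrative and do not come from the patent.

```python
from statistics import NormalDist, mean, stdev


def qq_points(residuals):
    """Pair each sorted, standardized residual with a standard-normal quantile."""
    n = len(residuals)
    mu, sd = mean(residuals), stdev(residuals)
    ordered = sorted((r - mu) / sd for r in residuals)
    normal = NormalDist()
    theoretical = [normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


# Illustrative residuals only; real ones come from the fitted regression model.
points = qq_points([-1.2, 0.3, 0.1, -0.4, 1.1, 0.2, -0.1])
```

If the residuals are close to normal, the points lie near the line y = x; systematic curvature at the ends suggests heavy or light tails.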
Specifically, correlation coefficients are first calculated pairwise on the logarithms of the original data to obtain the correlation coefficient matrix, which is then displayed graphically.
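The pairwise calculation on log-transformed data can be sketched as below. This is a minimal sketch assuming Pearson correlation and strictly positive measurements; the column names and values are made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def log_correlation_matrix(columns):
    """Take logs of every column, then correlate all pairs of columns."""
    names = list(columns)
    logged = {name: [math.log(v) for v in columns[name]] for name in names}
    matrix = [[pearson(logged[a], logged[b]) for b in names] for a in names]
    return names, matrix


# Illustrative values only (not the Deep Bay data).
names, matrix = log_correlation_matrix({
    "chlorophyll_a": [1.2, 3.4, 2.2, 5.1, 4.0],
    "ammonia_nitrogen": [0.3, 0.9, 0.5, 1.4, 1.0],
    "salinity": [31.0, 27.5, 30.1, 25.2, 26.8],
})
```

The resulting matrix is symmetric with ones on the diagonal, which is what a heat-map style figure of the correlation coefficients displays.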
Then, the synergistic relationship between the key influencing variable and the secondary influencing variables is tested using jcitest, i.e., the Johansen cointegration test, which checks whether an apparent causal relationship between (non-stationary) series is merely a spurious regression. Cointegration theory can be used to interpret the predicted phenomena correctly. The Johansen test is a multi-equation technique that uses maximum likelihood estimation to test for cointegration among multiple variables. The specific test procedure is as follows:
(1) Determine the lag order of the marine algae growth related data using the unit root test;
(2) Construct a vector autoregressive (VAR) model, taking k variables and lag order 1 as an example:
Figure BDA0004145316500000101
where $Y_{1,t}$, $Y_{2,t}$ are the dependent variables, $X_{1,t}$, $X_{2,t}$ are the independent variables, $\beta_{1,t}$, $\beta_{2,t}$ and the other coefficients indexed by $1,t$ and $2,t$ are regression coefficients, and the random error terms $u_{1,t}$, $u_{2,t}$ satisfy $u_{1,t}, u_{2,t} \sim \mathrm{IID}(0, \sigma^2)$ and $\mathrm{Cov}(u_{1,t}, u_{2,t}) = 0$;
(3) Construct a vector error correction model (VECM), whose residual sequences $e_{1,t}$, $e_{2,t}$ satisfy:
Figure BDA0004145316500000102
(4) Analyze the rank of the impact matrix in the VECM,
Figure BDA0004145316500000103
(4.1) Apply the trace test to the matrix M:
Figure BDA0004145316500000104
where
Figure BDA0004145316500000105
are the eigenvalues of the matrix M arranged in descending order;
(4.2) Test the number of non-zero eigenvalues of the matrix M to determine the synergistic relationship.
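For orientation, the trace statistic of the Johansen test is usually written as below. This is the standard textbook form, offered as an assumption about what the equation images above contain rather than a transcription of them; $T$ (the number of observations) and $r$ (the hypothesized number of cointegrating relations) are symbols introduced here.

```latex
\lambda_{\mathrm{trace}}(r) = -T \sum_{i=r+1}^{k} \ln\bigl(1 - \hat{\lambda}_i\bigr),
\qquad
\hat{\lambda}_1 \ge \hat{\lambda}_2 \ge \dots \ge \hat{\lambda}_k
```

The null hypothesis of at most $r$ relations is rejected when the statistic exceeds its critical value; testing $r = 0, 1, \dots$ in turn gives the number of non-zero eigenvalues, and hence the number of long-run relations among the $k$ variables.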
Further, in some embodiments, the synergistic prediction method for marine algae growth pollution disclosed in this embodiment is described in detail with reference to FIGS. 2 to 7, taking a sea area with a high level of economic development as an example.
A marine algae growth pollution synergistic prediction method comprises the following steps:
Step 1: establish an algae growth related data set and check whether the missing values in the data set affect the final statistical analysis; if they do not, delete them; if they do, fill them in by interpolation or a similar method. Where the data set gives only an upper or lower bound rather than a specific value, the bound (threshold) value is used. Based on existing analyses of the nonlinear dynamical systems related to algae growth and on analysis of actual and field algae data, irrelevant non-numerical data are deleted.
The algae growth related data set consists of seawater quality data for the surface layer at DM1 (data source: epd.gov.hk), covering 1986/2/14 to 2021/12/4, and includes five-day biochemical oxygen demand (mg/L), ammonia nitrogen (mg/L), chlorophyll a (μg/L), dissolved oxygen saturation (%), nitrate nitrogen (mg/L), pH, phaeopigment (μg/L), salinity, temperature (°C), inorganic nitrogen (mg/L), total nitrogen (mg/L), and total phosphorus (mg/L).
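The cleaning rules of Step 1 can be sketched as follows. This is a minimal sketch under stated assumptions: missing values are represented as `None`, interpolation is linear between the nearest known neighbours, and the table contents and helper names are hypothetical rather than taken from the DM1 data.

```python
def numeric_columns(table):
    """Keep only columns whose non-missing entries are all numbers."""
    return {
        name: values
        for name, values in table.items()
        if all(v is None or isinstance(v, (int, float)) for v in values)
    }


def interpolate_missing(values):
    """Fill interior gaps (None) by linear interpolation between known neighbours."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            weight = (i - left) / (right - left)
            filled[i] = filled[left] + weight * (filled[right] - filled[left])
    return filled


def drop_missing(values):
    """Delete missing entries when they do not affect the analysis."""
    return [v for v in values if v is not None]


# Hypothetical records for illustration only.
table = {
    "chlorophyll_a": [2.0, None, 4.0, 5.0],
    "salinity": [30.0, 31.0, None, 33.0],
    "station": ["DM1", "DM1", "DM1", "DM1"],
}
clean = numeric_columns(table)
clean["chlorophyll_a"] = interpolate_missing(clean["chlorophyll_a"])
```

Whether a gap is interpolated or dropped is a per-variable judgment; in a time-aligned table, dropping a value usually means dropping the whole sampling date so that the remaining columns stay aligned.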
Step 2: using theories of marine algae growth such as nonlinear dynamical systems and complex systems, together with statistical analysis of real data, statistically analyze the algae growth related data of a region that is economically developed but in which the frequency of marine algae pollution or the polluted area shows a decreasing trend, and give its multivariate long-term temporal dynamics.
Step 3: to remove the effect of differing units, normalize the data using the min-max method. Based on experimental and empirical analyses of algae growth, chlorophyll a in the ocean can be regarded as the output variable (core influencing variable) of pollution phenomena such as red tides or green tides. First, using the basic marine water quality data for Deep Bay, analyze the correlations among five-day biochemical oxygen demand (mg/L), ammonia nitrogen (mg/L), total phosphorus (mg/L), chlorophyll a (μg/L), temperature (°C), transparency (m), pH, phaeopigment (μg/L), dissolved oxygen (mg/L), dissolved oxygen saturation (%), salinity (psu), nitrate nitrogen (mg/L), nitrite nitrogen (mg/L) and orthophosphate phosphorus (μg/L), that is, the correlations between the core influencing variable and the secondary influencing variables.
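Min-max normalization rescales each variable to the interval [0, 1], which removes the dependence on units. A minimal sketch follows; the function name and the handling of constant columns are illustrative choices, and the sample temperatures are made up.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros by convention.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Illustrative temperatures in degrees Celsius.
scaled_temperature = min_max_normalize([10.0, 20.0, 30.0])
```

Because each variable is rescaled with its own minimum and maximum, regression coefficients fitted on normalized data are comparable across variables measured in different units.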
According to FIG. 6, ammonia nitrogen, total phosphorus and orthophosphate phosphorus are highly correlated; total nitrogen, total phosphorus and orthophosphate phosphorus are therefore removed and the ammonia nitrogen data retained, which prevents highly correlated factors from degrading the accuracy of the prediction and preserves independence among the independent variables. Because there are still too many influencing factors, irrelevant ones are further deleted, leaving nine factors (five-day biochemical oxygen demand, ammonia nitrogen, salinity, phaeopigment, temperature, dissolved oxygen, total phosphorus and the others enumerated as $X_1$ to $X_9$ in Step 4). Testing the processed data set with the augmented Dickey-Fuller (ADF) test shows that it is not stationary; after first-order differencing, it is found to be stationary.
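One simple way to drop highly correlated predictors is a greedy pass over the correlation matrix, sketched below. The threshold of 0.8, the variable names and the correlation values are all assumptions made for illustration; the patent does not state how the selection was carried out.

```python
def prune_correlated(names, corr, threshold=0.8):
    """Keep a variable only if it is not strongly correlated with one already kept."""
    kept = []
    for i in range(len(names)):
        if all(abs(corr[i][j]) < threshold for j in kept):
            kept.append(i)
    return [names[i] for i in kept]


# Made-up correlation values for illustration only.
names = ["ammonia_nitrogen", "total_nitrogen", "salinity"]
corr = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
selected = prune_correlated(names, corr)
```

The result depends on the order of the variables, so the variable preferred on domain grounds (here ammonia nitrogen) should be listed first.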
Step 4: Statistical analysis of the correlation coefficient matrix of the algae growth related data shows no obvious linear correlation among the stationary marine algae growth related data. Nine factors are retained as the most important core variables: $X_1$ five-day biochemical oxygen demand, $X_2$ ammonia nitrogen, $X_3$ temperature, $X_4$ salinity, $X_5$ pheopigment, $X_6$ silicon, $X_7$ dissolved oxygen, $X_8$ total nitrogen and $X_9$ total phosphorus. Meanwhile, Fig. 3 shows that the output variable $Y$ (chlorophyll a) has no independent linear correlation with any of the other factors, so it should be considered whether chlorophyll a has a synergistic relationship with the other variables. The cointegration relationship among the marine algae growth related data is tested with JCITest (the Johansen cointegration test), and the test result confirms that such a relationship exists. A multiple linear regression model is then built:
$$Y = 1.63X_1 + 4.6516X_2 + 3.6104X_3 - 0.91236X_4 + 1.0207X_5 - 2.563X_6 + 0.52605X_7 + 0.33509X_8 + 0.85703X_9 - 0.023949 \tag{1}$$
The estimated coefficients and corresponding statistics of this multiple linear regression model are shown in Table 1.
Table 1. Statistics of the multiple linear regression
Figure BDA0004145316500000121
Figure BDA0004145316500000131
The F statistic is 59.7 and the p-value is 7.36e-66. Model (1) and Table 1 show that the chlorophyll a concentration has a significant synergistic relationship with five-day biochemical oxygen demand, ammonia nitrogen, salinity, pheopigment, silicon and dissolved oxygen. Of these, salinity and silicon have a significant negative effect on the chlorophyll a concentration, and the other four factors have a significant positive effect.
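As a reading aid, model (1) can be written out as a small function. The coefficients are copied from equation (1); the function names are hypothetical, "pheopigment" denotes the demagnesiated pigment $X_5$, and the inputs are assumed to be in the same preprocessed form the model was fitted on (the source does not restate that form here).

```python
import numpy as np

# Coefficients of model (1), in the order X1..X9 given in step 4.
MODEL_1_TERMS = [
    ("five-day biochemical oxygen demand", 1.63),
    ("ammonia nitrogen", 4.6516),
    ("temperature", 3.6104),
    ("salinity", -0.91236),
    ("pheopigment", 1.0207),
    ("silicon", -2.563),
    ("dissolved oxygen", 0.52605),
    ("total nitrogen", 0.33509),
    ("total phosphorus", 0.85703),
]
MODEL_1_INTERCEPT = -0.023949

def predict_chlorophyll_a(x):
    """Evaluate model (1) for one sample x = (X1, ..., X9)."""
    x = np.asarray(x, dtype=float)
    coefs = np.array([c for _, c in MODEL_1_TERMS])
    return float(x @ coefs + MODEL_1_INTERCEPT)

def negative_terms():
    """Variables whose increase lowers the predicted chlorophyll a."""
    return [name for name, c in MODEL_1_TERMS if c < 0]
```

Reading the signs this way reproduces the statement above: only salinity and silicon carry negative coefficients. Whether a coefficient is statistically significant still has to be judged from Table 1, not from its sign.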
Further, in some embodiments, the synergistic prediction method for marine algae growth pollution of this embodiment is described with reference to Figs. 2 to 7, taking a sea area with an average level of economic development as an example. The specific steps are as follows:
Step 1: Establish an algae growth related dataset and check whether the missing values in the datasets affect the final statistical analysis result. If they do not, delete them; if they do, fill them in by interpolation or a similar method. Where the dataset gives no specific value but only an upper or lower bound, the bound (threshold) value is used. Based on existing analyses of the nonlinear dynamical systems related to algae growth, and on analyses of actual algae data and field data, irrelevant nonlinear data are deleted. The marine algae growth related data include the surface chlorophyll a concentration, the bottom chlorophyll a concentration, the water column average chlorophyll a concentration (mg/L), and seawater nutrient data such as the nitrogen-phosphorus ratio and the silicon-nitrogen ratio, covering March 1997 to November 2010.
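The handling of missing and bound-only values in step 1 might look like the sketch below. It assumes that bound-only entries are recorded as strings such as `<0.5`, that empty cells mark missing values, and that linear interpolation is the chosen fill method; none of these details is specified in the source, and the function names are hypothetical.

```python
import numpy as np

def parse_measurement(text):
    """Convert a raw cell to float; bound-only cells such as '<0.5' use the bound."""
    text = text.strip()
    if text == "":
        return float("nan")       # missing value
    if text[0] in "<>":
        return float(text[1:])    # keep the reported threshold
    return float(text)

def fill_missing_linear(values):
    """Fill NaN entries by linear interpolation over the sample index."""
    values = np.asarray(values, dtype=float)
    index = np.arange(len(values))
    known = ~np.isnan(values)
    return np.interp(index, index[known], values[known])

raw_cells = ["1.0", "", "<0.5", "2.0"]   # hypothetical column of readings
parsed = np.array([parse_measurement(c) for c in raw_cells])
filled = fill_missing_linear(parsed)
```

Here the empty second cell is filled with the midpoint of its neighbours, and the `<0.5` entry is kept at its threshold of 0.5.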
Step 2: Using theoretical and practical results on nonlinear dynamical systems and complex systems of marine algae growth, statistically analyze the algae growth related data of regions that are economically developed but where the occurrence frequency of marine algae pollution or the polluted area shows a decreasing trend, and give the multivariate time dynamic long-term trend.
Step 3: According to the correlation coefficients of the algae growth related data (see Fig. 6), the surface, bottom and water column average chlorophyll a concentrations are clearly correlated with one another, and the silicon-phosphorus ratio, the silicon-nitrogen ratio and silicate are clearly linearly correlated. Therefore the water column average chlorophyll a concentration, silicate, dissolved inorganic nitrogen, the nitrogen-phosphorus ratio and the silicon-phosphorus ratio are retained as core variables, where the silicon-phosphorus ratio is the output variable and the others are core explanatory variables.
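One way to mechanize this kind of selection is a greedy filter over the correlation matrix, sketched below. The 0.9 threshold, the greedy keep-first rule, the function name `prune_correlated` and the five sample rows are all assumptions made for illustration; the source selects variables by inspecting Fig. 6 and does not state a threshold.

```python
import numpy as np

def prune_correlated(data, names, threshold=0.9):
    """Keep a variable only if its |correlation| with every kept one is below threshold."""
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    kept = []
    for j in range(len(names)):
        if all(abs(corr[j, k]) < threshold for k in kept):
            kept.append(j)
    return [names[j] for j in kept]

names = ["water column average chlorophyll a", "surface chlorophyll a", "silicate"]
# Hypothetical readings: the second column is exactly twice the first.
data = np.array([
    [1.0, 2.0, 2.0],
    [2.0, 4.0, -1.0],
    [3.0, 6.0, 0.0],
    [4.0, 8.0, 1.0],
    [5.0, 10.0, -2.0],
])
selected = prune_correlated(data, names)
```

In this toy input the surface concentration duplicates the water column average and is dropped, while silicate (correlation -0.6 with the first column) is kept. The order of `names` decides which member of a correlated group survives.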
Step 4: Establish the multiple linear regression model of algae growth for the sea area with an average level of economic development, as follows:
$$Y = -0.023616X_1 + 1.4527X_2 - 0.29186X_3 + 0.055603X_4 + 0.13686 \tag{2}$$
The corresponding estimated coefficients and statistics are shown in table 2.
Table 2. Estimated coefficients and corresponding statistics of the multiple linear regression
Figure BDA0004145316500000141
Figure BDA0004145316500000151
The F statistic is 32.6 and the p-value is 6.63e-13. Model (2) and Table 2 show that the silicon-phosphorus ratio has a significant synergistic relationship with the water column average chlorophyll a concentration, silicate, dissolved inorganic nitrogen and the nitrogen-phosphorus ratio. Of these, dissolved inorganic nitrogen has a significant negative effect on the silicon-phosphorus ratio, and the nitrogen-phosphorus ratio has a significant positive effect.
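For readers who want to see how estimated coefficients and an overall F statistic of this kind are obtained, the following sketch fits an ordinary least squares model with NumPy. The data are synthetic: they are generated from the coefficients of model (2) plus small noise purely to give the fit something to recover, and they are not the patent's measurements. The function name `fit_ols` is hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with intercept; returns (coefficients, F statistic)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    design = np.column_stack([X, np.ones(n)])   # last column is the intercept
    beta, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    # Overall F statistic with (k, n - k - 1) degrees of freedom.
    f_stat = ((ss_tot - ss_res) / k) / (ss_res / (n - k - 1))
    return beta, f_stat

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))                   # synthetic explanatory variables
true_beta = np.array([-0.023616, 1.4527, -0.29186, 0.055603, 0.13686])
y = X @ true_beta[:4] + true_beta[4] + rng.normal(scale=0.01, size=200)
beta, f_stat = fit_ols(X, y)
```

The returned `beta` holds the four slopes followed by the intercept. A p-value for the F statistic could then be taken from the F distribution with (k, n - k - 1) degrees of freedom, for example with `scipy.stats.f.sf`.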
Models (1) and (2) show that, for the Back Bay area, where the level of economic development is high, the chlorophyll a concentration can be significantly reduced by increasing the salinity and silicon content, or by decreasing the ammonia nitrogen content, five-day biochemical oxygen demand, pheopigment content and dissolved oxygen content. For the bay area with a lower level of economic development, the silicon-phosphorus ratio can be significantly reduced by increasing the dissolved inorganic nitrogen content, or by decreasing the silicate content and the nitrogen-phosphorus ratio.
Embodiment Two
This embodiment discloses a synergistic prediction system for marine algae growth pollution, comprising:
a data processing module configured to: acquire marine algae growth related data and preprocess the marine algae growth related data;
a synergistic relationship prediction module configured to: input the preprocessed marine algae growth related data into a preset synergistic prediction model for processing, and obtain the synergistic relationship between the variables influencing marine algae growth, so that marine algae growth pollution can be suppressed according to the synergistic relationship; wherein inputting the preprocessed marine algae growth related data into the preset synergistic prediction model for processing comprises:
performing stability analysis on the preprocessed marine algae growth related data to obtain a multivariate time dynamic growth trend curve of the algae;
normalizing the preprocessed marine algae growth related data, and determining the key influence variable and the secondary influence variables of algae growth pollution based on the multivariate time dynamic growth trend curve of the algae;
calculating the correlation coefficient matrix between the key influence variable and the secondary influence variables, and establishing the cointegration relationship between them; and testing the synergistic relationship between the key influence variable and the secondary influence variables with the JCITest method, and establishing a multiple linear regression model.
It should be noted that the data processing module and the synergistic relationship prediction module correspond to the steps in Embodiment One; the examples and application scenarios implemented by the modules are the same as those of the corresponding steps, but are not limited to what is disclosed in Embodiment One. It should also be noted that the modules described above may be executed as part of a system in a computer system, for example as a set of computer-executable instructions.
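The two-module structure can be pictured with a deliberately reduced sketch. The class and method names are hypothetical, the data processing module only drops rows with missing values, and the prediction module only normalizes and fits a regression; the stability analysis and the JCITest step described above are omitted here.

```python
import numpy as np

class DataProcessingModule:
    """Acquires and preprocesses data; this sketch only drops incomplete rows."""
    def preprocess(self, table):
        table = np.asarray(table, dtype=float)
        return table[~np.isnan(table).any(axis=1)]

class SynergyPredictionModule:
    """Min-max normalizes the table and fits a linear model (last column = output)."""
    def fit(self, table):
        lo, hi = table.min(axis=0), table.max(axis=0)
        scaled = (table - lo) / np.where(hi == lo, 1.0, hi - lo)
        design = np.column_stack([scaled[:, :-1], np.ones(len(scaled))])
        coef, _, _, _ = np.linalg.lstsq(design, scaled[:, -1], rcond=None)
        return coef   # slopes followed by the intercept

# Hypothetical table: one explanatory column and one output column.
raw = np.array([
    [0.0, 1.0],
    [1.0, 3.0],
    [np.nan, 9.0],
    [2.0, 5.0],
])
clean = DataProcessingModule().preprocess(raw)
coef = SynergyPredictionModule().fit(clean)
```

Keeping preprocessing and model fitting in separate classes mirrors the module split in this embodiment, so either part could be replaced without touching the other.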
Embodiment Three
The third embodiment of the invention provides an electronic device comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor; when the computer instructions are executed by the processor, the steps of the synergistic prediction method for marine algae growth pollution are performed.
Embodiment Four
The fourth embodiment of the invention provides a computer-readable storage medium for storing computer instructions which, when executed by a processor, perform the steps of the synergistic prediction method for marine algae growth pollution.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Each of the foregoing embodiments has its own emphasis; for details not described in one embodiment, refer to the related description of another embodiment.
The above description covers only preferred embodiments of the present invention and is not intended to limit it; those skilled in the art may make various modifications and variations to the present invention. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within its scope of protection.

Claims (10)

1. A synergistic prediction method for marine algae growth pollution, characterized by comprising the following steps:
acquiring marine algae growth related data, and preprocessing the marine algae growth related data;
inputting the preprocessed marine algae growth related data into a preset synergistic prediction model for processing, and obtaining the synergistic relationship between the variables influencing marine algae growth, so that marine algae growth pollution can be suppressed according to the synergistic relationship; wherein inputting the preprocessed marine algae growth related data into the preset synergistic prediction model for processing comprises:
performing stability analysis on the preprocessed marine algae growth related data to obtain a multivariate time dynamic growth trend curve of the algae;
normalizing the preprocessed marine algae growth related data, and determining the key influence variable and the secondary influence variables of algae growth pollution based on the multivariate time dynamic growth trend curve of the algae;
calculating the correlation coefficient matrix between the key influence variable and the secondary influence variables, and establishing the cointegration relationship between them; and testing the synergistic relationship between the key influence variable and the secondary influence variables with the JCITest method, and establishing a multiple linear regression model.
2. The synergistic prediction method for marine algae growth pollution according to claim 1, wherein performing stability analysis on the preprocessed marine algae growth related data to obtain the multivariate time dynamic growth trend curve of the algae specifically comprises:
analyzing, based on the preprocessed marine algae growth related data, the marine algae growth related data for which the occurrence frequency of marine algae growth pollution or the polluted area shows a decreasing trend, to obtain the multivariate time dynamic growth trend curve.
3. The synergistic prediction method for marine algae growth pollution according to claim 1, wherein determining the key influence variable and the secondary influence variables of algae growth pollution based on the multivariate time dynamic growth trend curve of the algae comprises:
based on the multivariate time dynamic growth trend curve of the algae, taking the chlorophyll a concentration of the algae as the key influence variable of algae growth pollution;
and performing correlation analysis on the basic marine water quality data in the marine algae growth related data to obtain the secondary influence variables with high correlation.
4. The synergistic prediction method for marine algae growth pollution according to claim 3, further comprising:
deleting secondary influence variables with low correlation, and performing stationarity analysis on the processed data using the Dickey-Fuller test;
determining by Q-Q plot analysis whether the core influence variable follows a normal distribution.
5. The synergistic prediction method for marine algae growth pollution according to claim 1, wherein the multiple linear regression model is expressed as
$$Y = 1.63X_1 + 4.6516X_2 + 3.6104X_3 - 0.91236X_4 + 1.0207X_5 - 2.563X_6 + 0.52605X_7 + 0.33509X_8 + 0.85703X_9 - 0.023949$$
where $Y$ is the chlorophyll a concentration, $X_1$ is the five-day biochemical oxygen demand, $X_2$ is the ammonia nitrogen content, $X_3$ is the temperature, $X_4$ is the salinity, $X_5$ is the pheopigment content, $X_6$ is the silicon content, $X_7$ is the dissolved oxygen content, $X_8$ is the total nitrogen content and $X_9$ is the total phosphorus content;
or,
the multiple linear regression model is expressed as
$$Y = -0.023616X_1 + 1.4527X_2 - 0.29186X_3 + 0.055603X_4 + 0.13686$$
where $Y$ is the silicon-phosphorus ratio, $X_1$ is the water column average chlorophyll a concentration, $X_2$ is silicate, $X_3$ is dissolved inorganic nitrogen and $X_4$ is the nitrogen-phosphorus ratio.
6. The synergistic prediction method for marine algae growth pollution according to claim 1, wherein preprocessing the marine algae growth related data comprises:
constructing a marine algae growth related data set, and judging whether the missing value in the data set affects marine algae growth pollution prediction;
if yes, supplementing the missing value by an interpolation method; if not, deleting the missing value;
non-numerical data not related to marine algae growth pollution predictions is deleted.
7. The synergistic prediction method for marine algae growth pollution according to claim 1, wherein the marine algae growth related data include chlorophyll concentration, seawater nutrient concentration and seawater quality data.
8. A synergistic prediction system for marine algae growth pollution, comprising:
a data processing module configured to: acquire marine algae growth related data and preprocess the marine algae growth related data;
a synergistic relationship prediction module configured to: input the preprocessed marine algae growth related data into a preset synergistic prediction model for processing, and obtain the synergistic relationship between the variables influencing marine algae growth, so that marine algae growth pollution can be suppressed according to the synergistic relationship; wherein inputting the preprocessed marine algae growth related data into the preset synergistic prediction model for processing comprises:
performing stability analysis on the preprocessed marine algae growth related data to obtain a multivariate time dynamic growth trend curve of the algae;
normalizing the preprocessed marine algae growth related data, and determining the key influence variable and the secondary influence variables of algae growth pollution based on the multivariate time dynamic growth trend curve of the algae;
calculating the correlation coefficient matrix between the key influence variable and the secondary influence variables, and establishing the cointegration relationship between them; and testing the synergistic relationship between the key influence variable and the secondary influence variables with the JCITest method, and establishing a multiple linear regression model.
9. An electronic device comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor, wherein the computer instructions, when executed by the processor, perform the steps of the method of any one of claims 1-7.
10. A computer-readable storage medium storing computer instructions which, when executed by a processor, perform the steps of the method of any one of claims 1-7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310301771.3A CN116362394A (en) 2023-03-22 2023-03-22 Synergistic prediction method and system for marine algae growth pollution

Publications (1)

Publication Number Publication Date
CN116362394A (en) 2023-06-30

Family

ID=86941429

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310301771.3A Pending CN116362394A (en) 2023-03-22 2023-03-22 Synergistic prediction method and system for marine algae growth pollution

Country Status (1)

Country Link
CN (1) CN116362394A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117093956A (en) * 2023-10-19 2023-11-21 美赞臣婴幼儿营养品技术(广州)有限公司 Method and device for predicting tap density of dry-mixed finished product
CN117093956B (en) * 2023-10-19 2024-02-20 美赞臣婴幼儿营养品技术(广州)有限公司 Method and device for predicting tap density of dry-mixed finished product
CN117725345A (en) * 2024-02-08 2024-03-19 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Multi-source remote sensing green tide growth rate measuring method based on green tide biomass density
CN117725345B (en) * 2024-02-08 2024-05-31 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Multi-source remote sensing green tide growth rate measuring method based on green tide biomass density

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination