CN117077819A - Water quality prediction method - Google Patents

Publication number
CN117077819A
Authority
CN
China
Prior art keywords
water quality
antibody
data
function
svr
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311106497.0A
Other languages
Chinese (zh)
Inventor
陈爱华
郑金洪
黄健萌
占沛远
范贵源
张传琦
何惺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University
Priority to CN202311106497.0A
Publication of CN117077819A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/18 Water
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/27 Regression, e.g. linear or logistic regression
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]

Abstract

The invention provides a water quality prediction method that uses a support vector regression (SVR) algorithm to predict dissolved oxygen in water. An artificial immune algorithm (AIA) optimizes the SVR parameters C and g, which reduces the subjective influence of human factors and improves the generality and performance of the SVR. Correlation coefficients are calculated between the output of the SVR model and the various water quality parameters, and the parameters with higher correlation coefficients are selected as model inputs to improve the accuracy of the algorithm. The predicted values of the invention are closer to the true values, and the improved algorithm can be used for early prediction of dissolved oxygen.

Description

Water quality prediction method
Technical Field
The invention relates to the technical field of water quality detection, in particular to a water quality prediction method.
Background
Lake water quality affects the water safety of the surrounding organisms, including humans, so water quality must be predicted in order to take precautionary measures in advance. Because water quality systems are complex, traditional prediction algorithms have difficulty modelling them as effective nonlinear systems.
Existing techniques include the following: forecasting water quality with a grey neural network and correcting the error residuals with a Markov model, which brings the forecast closer to the true value; combining a grey neural network with an artificial neural network to predict water quality; and optimizing the time series with a subdivision extrapolation limit method and a multi-reference weighted fuzzy prediction method, where test results show that time series prediction designed with the subdivision extrapolation limit method performs well.
Disclosure of Invention
The invention provides a water quality prediction method whose predicted values are closer to the true values and whose performance is better. The improved algorithm can be used for early prediction of dissolved oxygen.
The invention adopts the following technical scheme.
A water quality prediction method uses a support vector regression (SVR) algorithm to predict dissolved oxygen in water. The SVR parameters C and g are optimized by an artificial immune algorithm (AIA) to reduce the subjective influence of human factors and improve the generality and performance of the SVR. Correlation coefficients are calculated between the output of the SVR model and the various water quality parameters, and the water quality parameters with higher correlation coefficients are selected as model inputs to improve the accuracy of the algorithm.
The method comprises the following steps:
Step S1: select the water quality data that have a higher correlation coefficient with dissolved oxygen as the input nodes of the algorithm, where the water quality data comprise water temperature, conductivity, total phosphorus and chemical oxygen demand, and dissolved oxygen is the output node of the algorithm; normalize the historical water quality data and divide them into a training set and a test set;
Step S2: construct the SVR water quality prediction model, and use the antibodies generated by the artificial immune algorithm as parameter C and parameter g of the SVR model;
Step S3: feed the training set obtained in step S1 into the model, and compare the prediction accuracy of the SVR model for dissolved oxygen under different values of parameters C and g;
Step S4: use the prediction accuracy of the SVR as the affinity function of the artificial immune algorithm, and retain the parameters with high propagation probability as memory cells;
Step S5: to keep the algorithm from falling into a local optimum, apply random mutation to the low-affinity antibodies among the memory cells, finally forming a new parent population;
Step S6: re-screen the parent population newly generated by the artificial immune algorithm by applying step S3 again, until the iteration ends;
Step S7: after the iteration ends, the resulting parameters C and g are the optimal values and define the optimal model; feed the test set into this model to obtain the predicted dissolved oxygen values.
The correlation coefficient in step S1 is the introduced correlation coefficient CC, which is used to select appropriate water quality data as input nodes. CC indicates how closely two variables are related, in particular how closely their trends follow each other.
The correlation coefficient CC is defined as:
$$CC = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} \qquad \text{(formula one)}$$
where X and Y are the water quality data and the dissolved oxygen data being compared, cov(X, Y) is the covariance between the two, and σ_X and σ_Y are their standard deviations. |CC| < 0.4 indicates weak correlation, 0.4 < |CC| < 0.7 medium correlation, and |CC| > 0.7 strong correlation.
The regression support vector machine water quality prediction model in step S2 is specifically as follows:
Assume a set of training samples L(x, y), where x is the input data of a training sample (the other water quality data) and y is the corresponding output data (the dissolved oxygen data). To determine the relation between the two, a linear regression function is established in a high-dimensional feature space:
$$f(x) = w\,\Phi(x) + b \qquad \text{(formula two)}$$
where Φ(x) is a nonlinear mapping function. To solve for w and b, slack variables ξ_i and ξ_i* are introduced. The mathematical expression is:
The constraint conditions are:
To solve formula four, the Lagrange function is introduced and the problem is converted to its dual form:
The constraint conditions are:
where K(x_i, z_i) is the kernel function.
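The expressions referred to in this passage are not reproduced in the text. For orientation only, the standard ε-insensitive SVR formulation that matches the description is sketched below. The symbols ε (width of the insensitive tube), α_i and α_i* (Lagrange multipliers) and n (number of training samples) are assumptions taken from the textbook form, the numbering of formulas three to six is inferred from the reference to formula four, and the kernel is written on pairs of training inputs as K(x_i, x_j).

```latex
\min_{w,\,b,\,\xi,\,\xi^*}\ \frac{1}{2}\lVert w\rVert^2
  + C\sum_{i=1}^{n}\left(\xi_i + \xi_i^*\right)
\qquad\text{(formula three)}

\text{s.t.}\quad
\begin{cases}
y_i - w\,\Phi(x_i) - b \le \varepsilon + \xi_i \\
w\,\Phi(x_i) + b - y_i \le \varepsilon + \xi_i^* \\
\xi_i,\ \xi_i^* \ge 0,\quad i = 1,\dots,n
\end{cases}
\qquad\text{(formula four)}

\max_{\alpha,\,\alpha^*}\
  -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}
     (\alpha_i-\alpha_i^*)(\alpha_j-\alpha_j^*)\,K(x_i,x_j)
  -\varepsilon\sum_{i=1}^{n}(\alpha_i+\alpha_i^*)
  +\sum_{i=1}^{n} y_i(\alpha_i-\alpha_i^*)
\qquad\text{(formula five)}

\text{s.t.}\quad \sum_{i=1}^{n}(\alpha_i-\alpha_i^*) = 0,
\qquad 0 \le \alpha_i,\ \alpha_i^* \le C
\qquad\text{(formula six)}
```

In this form the penalty factor C enters the primal problem as the weight on the slack variables and the dual problem as the upper bound on the multipliers, which is why its value controls how strictly errors are treated.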
The behaviour of the SVR model under different parameters C and g in step S3 is as follows. C is the penalty factor and determines how strictly the overall SVR model treats errors: the larger C is, the stricter the function is about errors, and genuine data are more easily rejected; the smaller C is, the more tolerant the function is of errors, and the more likely its screening effect is lost;
The kernel function K(x_i, z_i) is the Gaussian radial basis function (RBF). The RBF kernel reduces the weight of data points far from the plane, so it handles high- and low-frequency data faster than other kernel functions and helps the regression support vector machine find a suitable plane faster. Parameter g of the RBF affects generalization performance by controlling the range of action of the Gaussian function. If g is too large, the range of action is too small and some of the data are not covered; if g is too small, the Gaussian function acts on too much data, which weakens its ability to separate the data, so that the model trains poorly on the training set and predicts worse on the test set.
The propagation probability in step S4 is calculated as follows:
Step A1: analyze the problem. The ideal predicted value is taken as the antigen and the parameters C and g as the antibody; the difference between the value predicted by the SVR and the true value is used as the affinity function;
Step A2: generate the initial antibody population at random;
Step A3: evaluate the antibody population. The artificial immune algorithm evaluates an antibody population by two criteria: first, the affinity between antibody and antigen, i.e. the affinity function of step A1; second, the concentration, which measures the similarity among antibodies. The concentration expression is:
where N is the total number of antibodies and S_{v,s} is the similarity between antibody v and antibody s. The similarity expression is:
where k_{v,s} is the number of bits at which antibody v and antibody s are identical, and L is the antibody length;
The propagation probability is then calculated from the antibody-antigen affinity and the antibody concentration; the higher the propagation probability, the more likely the antibody is to be selected into the memory bank and the parent population. The propagation probability expression is:
where α is a constant and A_v is the affinity function. From this expression, the higher the affinity, the higher the propagation probability, and the higher the concentration of an individual, the lower its propagation probability.
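The three expressions referred to in step A3 are not reproduced in the text. One set of forms consistent with the stated definitions is sketched below as an assumption, not as the patent's own formulas. The symbols ρ_v (concentration of antibody v) and P_v (propagation probability) are introduced here for illustration, and the exponential term in P_v is only one possible choice that makes the probability fall as concentration rises.

```latex
S_{v,s} = \frac{k_{v,s}}{L}
\qquad
\rho_v = \frac{1}{N}\sum_{s=1}^{N} S_{v,s}

P_v = \alpha\,\frac{A_v}{\sum_{s=1}^{N} A_s}
    + (1-\alpha)\,\frac{e^{-\rho_v}}{\sum_{s=1}^{N} e^{-\rho_s}},
\qquad 0 \le \alpha \le 1
```

With this choice the values P_v sum to one over the population, a larger affinity A_v raises P_v, and a larger concentration ρ_v lowers it, matching the behaviour described in the text.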
In step S5, the new parent population is generated as follows:
Step B1: generate the memory bank and a new antibody population. The antibodies are sorted by similarity from high to low and those with the highest similarity are retained as the memory bank; the antibodies are then sorted by propagation probability from high to low and the first N individuals form the new antibody population;
Step B2: crossover and mutation. Crossover and mutation are applied to each antibody of the population produced in step B1 to obtain a new antibody population;
Step B3: generate the new generation of the parent population. The new antibody population obtained in step B2 is combined with the memory bank obtained in step B1 to form the new-generation parent population.
The method is used for predicting water quality changes in lakes.
The invention is a water quality prediction method based on a regression support vector machine and an artificial immune algorithm. First, the correlation between the data to be predicted and the other water quality data is calculated, and the water quality data with high correlation coefficients are taken as the input data of the optimization algorithm. Because the regression support vector machine is strongly affected by parameters C and g, the variance of the SVR output is used as the fitness of the artificial immune algorithm, whose optimizing capability is used to find the best values of C and g. The SVR model is then rebuilt with these values so that the regression support vector machine outputs the optimal predicted value.
The invention predicts dissolved oxygen in water with a support vector regression (SVR) model whose parameters C and g are optimized by the artificial immune algorithm (AIA), which reduces the subjective influence of human factors and improves the generality and performance of the support vector machine. To improve accuracy, correlation coefficients are calculated between the model output and the various water quality parameters, and the parameters with high correlation coefficients are selected as model inputs. Finally, the prediction results are compared with those of other models. The experimental results show that the new model has a smaller variance and a smaller maximum error than the SVR and GRNN models, so its predicted values are closer to the true values. The improved algorithm can be used for early prediction of dissolved oxygen.
Compared with the prior art, the technical scheme of the invention has the following beneficial effects:
The invention provides a novel water quality prediction method. Correlation coefficients are used to select the water quality data that serve as input nodes of the algorithm, which avoids poor predictions caused by choosing the wrong input nodes when predicting different water quality variables. Parameters C and g of the regression support vector machine are tuned by the artificial immune algorithm, which not only has good optimizing capability but also introduces the concept of propagation probability; this preserves antibody diversity and keeps the algorithm from being trapped in a local optimum. The method can quickly predict future changes in lake water quality and so help prevent deterioration of the water body.
Drawings
The invention is described in further detail below with reference to the attached drawings and detailed description:
FIG. 1 is a schematic flow chart of the present invention;
FIG. 2 is a schematic diagram comparing the predicted results of the present invention with other algorithms.
Detailed Description
As shown in the figures, the method predicts dissolved oxygen in water using a support vector regression (SVR) algorithm. The SVR parameters C and g are optimized by an artificial immune algorithm (AIA) to reduce the subjective influence of human factors and improve the generality and performance of the SVR. Correlation coefficients are calculated between the output of the SVR model and the various water quality parameters, and the water quality parameters with higher correlation coefficients are selected as model inputs to improve the accuracy of the algorithm.
The method comprises the following steps:
Step S1: select the water quality data that have a higher correlation coefficient with dissolved oxygen as the input nodes of the algorithm, where the water quality data comprise water temperature, conductivity, total phosphorus and chemical oxygen demand, and dissolved oxygen is the output node of the algorithm; normalize the historical water quality data and divide them into a training set and a test set;
Step S2: construct the SVR water quality prediction model, and use the antibodies generated by the artificial immune algorithm as parameter C and parameter g of the SVR model;
Step S3: feed the training set obtained in step S1 into the model, and compare the prediction accuracy of the SVR model for dissolved oxygen under different values of parameters C and g;
Step S4: use the prediction accuracy of the SVR as the affinity function of the artificial immune algorithm, and retain the parameters with high propagation probability as memory cells;
Step S5: to keep the algorithm from falling into a local optimum, apply random mutation to the low-affinity antibodies among the memory cells, finally forming a new parent population;
Step S6: re-screen the parent population newly generated by the artificial immune algorithm by applying step S3 again, until the iteration ends;
Step S7: after the iteration ends, the resulting parameters C and g are the optimal values and define the optimal model; feed the test set into this model to obtain the predicted dissolved oxygen values.
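The data preparation in step S1 can be illustrated with a minimal Python sketch. The text does not state the normalization formula or the split ratio, so min-max scaling to [0, 1] and a chronological 80/20 split are assumptions here, and the function names are hypothetical.

```python
def min_max_normalize(column):
    """Scale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in column]


def chronological_split(rows, train_fraction=0.8):
    """Keep time order: the earliest rows train, the latest rows test."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Synthetic illustrative values, not data from the patent.
water_temp = [4.0, 10.0, 16.0, 22.0, 28.0]
scaled = min_max_normalize(water_temp)
train, test = chronological_split(scaled, train_fraction=0.8)
print(scaled, train, test)
```

A chronological split is used in the sketch because the data form a monthly time series; a random split would be an equally valid reading of the text.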
The correlation coefficient in step S1 is the introduced correlation coefficient CC, which is used to select appropriate water quality data as input nodes. CC indicates how closely two variables are related, in particular how closely their trends follow each other.
The correlation coefficient CC is defined as:
$$CC = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} \qquad \text{(formula one)}$$
where X and Y are the water quality data and the dissolved oxygen data being compared, cov(X, Y) is the covariance between the two, and σ_X and σ_Y are their standard deviations. |CC| < 0.4 indicates weak correlation, 0.4 < |CC| < 0.7 medium correlation, and |CC| > 0.7 strong correlation.
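The input selection based on CC can be sketched in Python as follows. The function names, the synthetic series, and the 0.4 selection threshold are assumptions for illustration; the text only says that data with higher correlation coefficients are selected and does not say how a value exactly equal to 0.4 or 0.7 is classified.

```python
import math


def pearson_cc(xs, ys):
    """Pearson correlation: cov(X, Y) / (std(X) * std(Y))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


def strength(cc):
    """Label |CC| with the thresholds from the text (boundaries are a choice)."""
    magnitude = abs(cc)
    if magnitude > 0.7:
        return "strong"
    if magnitude > 0.4:
        return "medium"
    return "weak"


def select_inputs(candidates, target, threshold=0.4):
    """Names of candidate series whose |CC| with the target exceeds threshold."""
    return [name for name, series in candidates.items()
            if abs(pearson_cc(series, target)) > threshold]


# Synthetic illustrative series, not data from the patent.
dissolved_oxygen = [12.0, 10.0, 8.0, 6.0]
candidates = {
    "water_temp": [5.0, 10.0, 15.0, 20.0],   # falls as oxygen rises
    "water_depth": [2.0, 2.1, 2.1, 2.0],     # unrelated to oxygen
}
print(select_inputs(candidates, dissolved_oxygen))
```

The absolute value is used for selection because a strongly negative coefficient is as informative for the model as a strongly positive one.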
The regression support vector machine water quality prediction model in step S2 is specifically as follows:
Assume a set of training samples L(x, y), where x is the input data of a training sample (the other water quality data) and y is the corresponding output data (the dissolved oxygen data). To determine the relation between the two, a linear regression function is established in a high-dimensional feature space:
$$f(x) = w\,\Phi(x) + b \qquad \text{(formula two)}$$
where Φ(x) is a nonlinear mapping function. To solve for w and b, slack variables ξ_i and ξ_i* are introduced. The mathematical expression is:
The constraint conditions are:
To solve formula four, the Lagrange function is introduced and the problem is converted to its dual form:
The constraint conditions are:
where K(x_i, z_i) is the kernel function.
The behaviour of the SVR model under different parameters C and g in step S3 is as follows. C is the penalty factor and determines how strictly the overall SVR model treats errors: the larger C is, the stricter the function is about errors, and genuine data are more easily rejected; the smaller C is, the more tolerant the function is of errors, and the more likely its screening effect is lost;
The kernel function K(x_i, z_i) is the Gaussian radial basis function (RBF). The RBF kernel reduces the weight of data points far from the plane, so it handles high- and low-frequency data faster than other kernel functions and helps the regression support vector machine find a suitable plane faster. Parameter g of the RBF affects generalization performance by controlling the range of action of the Gaussian function. If g is too large, the range of action is too small and some of the data are not covered; if g is too small, the Gaussian function acts on too much data, which weakens its ability to separate the data, so that the model trains poorly on the training set and predicts worse on the test set.
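The effect of g on the range of action of the Gaussian function can be seen in a few lines of Python. The text does not give the kernel formula, so the common form K(x, z) = exp(-g * ||x - z||^2), in which g multiplies the squared distance, is an assumption, and the function name is hypothetical.

```python
import math


def rbf_kernel(x, z, g):
    """Gaussian RBF kernel exp(-g * ||x - z||^2) for equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-g * sq_dist)


x = [0.0, 0.0]
z = [1.0, 1.0]  # squared distance to x is 2
for g in (0.01, 1.0, 100.0):
    # Small g: distant points still look similar (wide range of action).
    # Large g: the kernel value collapses toward 0 (narrow range of action).
    print(f"g={g}: K(x, z) = {rbf_kernel(x, z, g):.6f}")
```

Under this form a very large g makes every training point influence only its immediate neighbourhood, while a very small g makes all points look alike to the model, which corresponds to the two failure modes described in the text.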
The propagation probability in step S4 is calculated as follows:
Step A1: analyze the problem. The ideal predicted value is taken as the antigen and the parameters C and g as the antibody; the difference between the value predicted by the SVR and the true value is used as the affinity function;
Step A2: generate the initial antibody population at random;
Step A3: evaluate the antibody population. The artificial immune algorithm evaluates an antibody population by two criteria: first, the affinity between antibody and antigen, i.e. the affinity function of step A1; second, the concentration, which measures the similarity among antibodies. The concentration expression is:
where N is the total number of antibodies and S_{v,s} is the similarity between antibody v and antibody s. The similarity expression is:
where k_{v,s} is the number of bits at which antibody v and antibody s are identical, and L is the antibody length;
The propagation probability is then calculated from the antibody-antigen affinity and the antibody concentration; the higher the propagation probability, the more likely the antibody is to be selected into the memory bank and the parent population. The propagation probability expression is:
where α is a constant and A_v is the affinity function. From this expression, the higher the affinity, the higher the propagation probability, and the higher the concentration of an individual, the lower its propagation probability.
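The evaluation in step A3 can be sketched in Python for bit-string antibodies. Similarity follows the definition in the text (matching bits divided by length). The concentration as a mean similarity that includes the antibody itself, the exponential weighting in the propagation probability, the value alpha=0.7, and all function names are assumptions made for illustration, because the expressions themselves are not reproduced in the text.

```python
import math


def similarity(v, s):
    """Fraction of positions at which two equal-length antibodies match (k / L)."""
    matches = sum(1 for a, b in zip(v, s) if a == b)
    return matches / len(v)


def concentration(index, population):
    """Mean similarity of one antibody to every antibody in the population."""
    v = population[index]
    return sum(similarity(v, s) for s in population) / len(population)


def propagation_probabilities(affinities, concentrations, alpha=0.7):
    """Assumed form: rises with affinity share, falls with concentration."""
    total_affinity = sum(affinities)
    weights = [math.exp(-c) for c in concentrations]
    total_weight = sum(weights)
    return [alpha * a / total_affinity + (1 - alpha) * w / total_weight
            for a, w in zip(affinities, weights)]


# Two identical antibodies and one distinct antibody, all with equal affinity.
population = ["1100", "1100", "0011"]
affinities = [0.5, 0.5, 0.5]
concentrations = [concentration(i, population) for i in range(len(population))]
probabilities = propagation_probabilities(affinities, concentrations)
print(concentrations, probabilities)
```

In this example the distinct antibody has the lowest concentration and therefore the highest propagation probability, which is how the concentration term preserves diversity.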
In step S5, the new parent population is generated as follows:
Step B1: generate the memory bank and a new antibody population. The antibodies are sorted by similarity from high to low and those with the highest similarity are retained as the memory bank; the antibodies are then sorted by propagation probability from high to low and the first N individuals form the new antibody population;
Step B2: crossover and mutation. Crossover and mutation are applied to each antibody of the population produced in step B1 to obtain a new antibody population;
Step B3: generate the new generation of the parent population. The new antibody population obtained in step B2 is combined with the memory bank obtained in step B1 to form the new-generation parent population.
The method is used for predicting water quality changes in lakes.
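The loop formed by steps A1 to A3 and B1 to B3 can be sketched end to end in Python. This is a simplified illustration under several assumptions, not the patent's implementation: `toy_error` stands in for the SVR prediction error, antibodies are real-valued (C, g) pairs rather than bit strings, concentration is the share of antibodies within a fixed distance in log space, the memory bank keeps the lowest-error antibodies (a common choice in immune algorithms, whereas the text describes ranking by similarity), and the search bounds, population sizes and all names are hypothetical.

```python
import math
import random

BOUNDS = ((0.1, 1000.0), (0.001, 10.0))  # assumed (C range, g range)


def toy_error(C, g):
    """Stand-in for the SVR prediction error; smallest at C=10, g=0.1."""
    return (math.log10(C) - 1.0) ** 2 + (math.log10(g) + 1.0) ** 2


def clip(value, lo, hi):
    return max(lo, min(hi, value))


def random_antibody(rng):
    """Draw (C, g) log-uniformly inside BOUNDS."""
    return tuple(clip(10 ** rng.uniform(math.log10(lo), math.log10(hi)), lo, hi)
                 for lo, hi in BOUNDS)


def mutate(antibody, rng, scale=0.3):
    """Perturb each parameter in log10 space, then clip to BOUNDS."""
    return tuple(clip(value * 10 ** rng.gauss(0.0, scale), lo, hi)
                 for value, (lo, hi) in zip(antibody, BOUNDS))


def concentration(v, population, radius=0.5):
    """Share of the population lying within `radius` of v in log10 space."""
    log_v = [math.log10(x) for x in v]
    near = sum(math.dist(log_v, [math.log10(x) for x in s]) < radius
               for s in population)
    return near / len(population)


def run_aia(error_fn, n=20, memory_size=4, generations=30, alpha=0.7, seed=0):
    """Return the best (C, g) found and the best error of each generation."""
    rng = random.Random(seed)
    population = [random_antibody(rng) for _ in range(n)]        # step A2
    history = []
    for _ in range(generations):
        errors = [error_fn(*ab) for ab in population]
        affinity = [1.0 / (1.0 + e) for e in errors]             # step A1
        weights = [math.exp(-concentration(ab, population)) for ab in population]
        total_a, total_w = sum(affinity), sum(weights)
        prob = [alpha * a / total_a + (1 - alpha) * w / total_w  # step A3
                for a, w in zip(affinity, weights)]
        order = range(len(population))
        by_error = sorted(order, key=lambda i: errors[i])
        by_prob = sorted(order, key=lambda i: prob[i], reverse=True)
        memory = [population[i] for i in by_error[:memory_size]]  # step B1
        parents = [population[i] for i in by_prob[:n]]
        history.append(errors[by_error[0]])
        offspring = []
        for parent in parents:                                   # step B2
            mate = rng.choice(parents)
            child = (parent[0], mate[1])  # crossover: C from parent, g from mate
            offspring.append(mutate(child, rng))
        population = offspring + memory                          # step B3
    best = min(population, key=lambda ab: error_fn(*ab))
    return best, history


best, history = run_aia(toy_error)
print("best (C, g):", best, "error:", toy_error(*best))
```

Because the memory bank is copied unchanged into the next generation, the best error in `history` can never get worse from one generation to the next. To tune a real model, `toy_error` would be replaced by a function that trains an SVR with the given C and g and returns its prediction error on held-out data.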
Examples:
The validity of the algorithm is verified with the following example:
The example data are 108 groups of monthly data from January 2007 to December 2015 from observation station No. 0 of the Taihu Lake station (120°22.217'E, 31°53.983'N), sampled once in the middle of each month. With dissolved oxygen as the output data, correlation coefficients were calculated against water temperature, conductivity, total nitrogen, total phosphorus, transparency, water depth, pH, chemical oxygen demand and ammonia nitrogen. The results are shown in the following table:
TABLE 1 Correlation coefficients between dissolved oxygen and each water quality variable
In general, |CC| < 0.4 indicates weak correlation, 0.4 < |CC| < 0.7 medium correlation, and |CC| > 0.7 strong correlation. Water temperature, conductivity, total phosphorus and chemical oxygen demand are selected as the input nodes of the algorithm.
As shown in FIG. 2, the predicted values of the improved algorithm are closer to the true values and fluctuate less than those of the original algorithm. Because the size of the improvement cannot be judged by eye, this section compares the variance and the maximum error between predicted and true values to assess the algorithms.
TABLE 2 Comparison of variance and maximum error
Model      Variance   Maximum error (mg/L)
AIA-SVR    0.19153    0.76558
SVR        0.6248     1.3952
GRNN       0.39799    1.19
As shown in the table above, the variance between the predicted and true values is 0.19153 for the AIA-SVR model, with a maximum error of 0.76558 mg/L; 0.6248 for the SVR model, with a maximum error of 1.3952 mg/L; and 0.39799 for the GRNN model, with a maximum error of 1.19 mg/L. The AIA-SVR model therefore has both a smaller variance and a smaller maximum error than the SVR and GRNN models, so its predicted values are closer to the true values and its performance is better.
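The two comparison metrics can be computed as in the following Python sketch. The text does not define "variance of the predicted and true values", so it is interpreted here as the mean squared difference between prediction and truth; that reading, the function names, and the sample values (synthetic, not the patent's data) are assumptions.

```python
def error_variance(predicted, actual):
    """Mean squared difference between predictions and true values."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def max_abs_error(predicted, actual):
    """Largest absolute difference between a prediction and its true value."""
    return max(abs(p - a) for p, a in zip(predicted, actual))


# Synthetic dissolved oxygen values in mg/L, for illustration only.
actual = [8.5, 9.0, 7.0]
predicted = [8.0, 9.5, 7.0]
print(error_variance(predicted, actual), max_abs_error(predicted, actual))
```

The maximum error carries the unit of the measurement (mg/L), while the variance under this reading is in squared units.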

Claims (8)

1. A water quality prediction method, characterized in that: the method predicts dissolved oxygen in water using a support vector regression (SVR) algorithm, and optimizes parameters C and g of the SVR with an artificial immune algorithm (AIA) to reduce the subjective influence of human factors and improve the generality and performance of the SVR.
2. The water quality prediction method according to claim 1, characterized in that the method comprises the following steps:
Step S1: select the water quality data that have a higher correlation coefficient with dissolved oxygen as the input nodes of the algorithm, where the water quality data comprise water temperature, conductivity, total phosphorus and chemical oxygen demand, and dissolved oxygen is the output node of the algorithm; normalize the historical water quality data and divide them into a training set and a test set;
Step S2: construct the SVR water quality prediction model, and use the antibodies generated by the artificial immune algorithm as parameter C and parameter g of the SVR model;
Step S3: feed the training set obtained in step S1 into the model, and compare the prediction accuracy of the SVR model for dissolved oxygen under different values of parameters C and g;
Step S4: use the prediction accuracy of the SVR as the affinity function of the artificial immune algorithm, and retain the parameters with high propagation probability as memory cells;
Step S5: to keep the algorithm from falling into a local optimum, apply random mutation to the low-affinity antibodies among the memory cells, finally forming a new parent population;
Step S6: re-screen the parent population newly generated by the artificial immune algorithm by applying step S3 again, until the iteration ends;
Step S7: after the iteration ends, the resulting parameters C and g are the optimal values and define the optimal model; feed the test set into this model to obtain the predicted dissolved oxygen values.
3. The water quality prediction method according to claim 2, characterized in that: the correlation coefficient in step S1 is the introduced correlation coefficient CC, which is used to select appropriate water quality data as input nodes; CC indicates how closely two variables are related, in particular how closely their trends follow each other;
the correlation coefficient CC is defined as:
$$CC = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} \qquad \text{(formula one)}$$
where X and Y are the water quality data and the dissolved oxygen data being compared, cov(X, Y) is the covariance between the two, and σ_X and σ_Y are their standard deviations; |CC| < 0.4 indicates weak correlation, 0.4 < |CC| < 0.7 medium correlation, and |CC| > 0.7 strong correlation.
4. The water quality prediction method according to claim 2, characterized in that: the regression support vector machine water quality prediction model in step S2 is specifically as follows:
assume a set of training samples L(x, y), where x is the input data of a training sample (the other water quality data) and y is the corresponding output data (the dissolved oxygen data); to determine the relation between the two, a linear regression function is established in a high-dimensional feature space:
$$f(x) = w\,\Phi(x) + b \qquad \text{(formula two)}$$
where Φ(x) is a nonlinear mapping function. To solve for w and b, slack variables ξ_i and ξ_i* are introduced. The mathematical expression is:
the constraint conditions are:
to solve formula four, the Lagrange function is introduced and the problem is converted to its dual form:
the constraint conditions are:
where K(x_i, z_i) is the kernel function.
5. The method for predicting water quality as claimed in claim 4, wherein: the behavior of the SVR model under different parameters c and g in step S3 is specifically as follows: c is a penalty factor that determines how strictly the SVR model as a whole treats errors; as the value of c increases, the function's requirement on error values becomes stricter, so that valid data are easily excluded excessively; as the value of c decreases, the function's requirement on error values becomes looser, so that the screening effect of the function is more likely to fail;
the kernel function $K(x_i, z_i)$ is the Gaussian radial basis function (RBF); the RBF kernel reduces the weight of data points far from the plane, so it can process high-frequency and low-frequency data faster than other kernel functions and can help the regression-type support vector machine find a suitable plane faster than other kernel functions; the parameter g of the RBF affects generalization performance by controlling the range of action of the Gaussian function: if g is too large, the range of action of the Gaussian function is too small, so that some data are left unclassified; if g is too small, the Gaussian function acts on too much data, which reduces the effectiveness of data classification, so that a good training result cannot be obtained on the training set and the prediction results on the test set deteriorate.
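The effect of g described in claim 5 can be seen numerically. The sketch below assumes the common convention $K(x, z) = \exp(-g\,\lVert x - z\rVert^2)$ (the one used by LIBSVM for its gamma parameter); the patent text does not state the kernel formula, so this form, the function name, and the sample points are assumptions for illustration.

```python
import math


def rbf_kernel(x, z, g):
    """Gaussian RBF kernel, assuming the form K(x, z) = exp(-g * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-g * squared_distance)


# Two hypothetical, already-normalized samples at squared distance 2.
x = [0.0, 0.0]
z = [1.0, 1.0]

for g in (0.01, 1.0, 100.0):
    # Large g: the kernel value collapses toward 0, so each sample only
    # influences its immediate neighborhood (narrow range of action).
    # Small g: the value stays near 1, so distant samples look alike
    # (wide range of action).
    print(f"g={g}: K(x, z)={rbf_kernel(x, z, g):.6g}")
```

Under this convention a very large g makes every sample nearly orthogonal to all others, while a very small g makes all samples nearly indistinguishable, which corresponds to the two failure modes the claim describes.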
6. The method for predicting water quality as claimed in claim 4, wherein: the specific steps of the propagation probability calculation in step S4 are as follows:
step A1, analyzing the problem; the ideal predicted value is taken as the antigen and the parameters c and g are taken as the antibody; the difference between the predicted value generated by the SVR and the true value is used as the affinity function;
step A2, generating an initial antibody population; an initial antibody population is generated randomly;
step A3, evaluating the antibody population; the artificial immune algorithm evaluates the antibody population by two criteria: first, the affinity between antibody and antigen, namely the affinity function of step A1; second, the concentration of the antibody, which reflects its similarity to the other antibodies; the concentration expression is:
wherein N is the total number of antibodies and $S_{v,s}$ is the similarity between antibody v and antibody s; the similarity expression is:
wherein $k_{v,s}$ is the number of bits at which antibody v and antibody s are identical, and L is the length of the antibody;
the propagation probability is then calculated from the antibody-antigen affinity and the antibody concentration; the higher the propagation probability, the higher the probability of being selected into the memory bank and the parent population; the propagation probability expression is:
wherein α is a constant and $A_v$ is the affinity function; it follows from the above expression that the higher an individual's affinity, the higher its propagation probability, and the higher an individual's concentration, the lower its propagation probability.
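The concentration, similarity, and propagation probability expressions of claim 6 are not reproduced in this text, so the following is only a sketch under stated assumptions. Similarity is taken as $k_{v,s}/L$ and concentration as the mean similarity of an antibody to the whole population (including itself); both follow the variable definitions in the claim but are not confirmed by it. The propagation probability formula is a hypothetical form chosen solely because it has the two properties the claim states: it rises with affinity and falls with concentration. Antibodies are assumed to be equal-length bit lists, and affinities are assumed to be positive scores where larger means a smaller prediction error.

```python
import math


def similarity(v, s):
    """Assumed S_{v,s} = k_{v,s} / L: fraction of positions where v and s agree."""
    if len(v) != len(s):
        raise ValueError("antibodies must have the same length")
    return sum(1 for a, b in zip(v, s) if a == b) / len(v)


def concentration(population, v_index):
    """Assumed C_v: mean similarity of antibody v to all N antibodies (itself included)."""
    v = population[v_index]
    return sum(similarity(v, s) for s in population) / len(population)


def propagation_probabilities(affinities, concentrations, alpha=0.7):
    """Hypothetical P_v = alpha * A_v / sum(A) + (1 - alpha) * exp(-C_v) / sum(exp(-C)).

    Not the patent's formula; it only reproduces the stated monotonic behavior.
    The values sum to 1 across the population.
    """
    total_affinity = sum(affinities)
    diversity = [math.exp(-c) for c in concentrations]
    total_diversity = sum(diversity)
    return [
        alpha * a / total_affinity + (1 - alpha) * d / total_diversity
        for a, d in zip(affinities, diversity)
    ]


# Three hypothetical 4-bit antibodies with equal affinity.
population = [[1, 0, 1, 1], [1, 0, 1, 0], [0, 1, 0, 0]]
affinities = [0.5, 0.5, 0.5]
concentrations = [concentration(population, i) for i in range(len(population))]
probabilities = propagation_probabilities(affinities, concentrations)
print(concentrations)
print(probabilities)
```

With equal affinities, the third antibody, which differs most from the others and therefore has the lowest concentration, receives the highest probability; this is the diversity-preserving effect the concentration term is meant to provide.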
7. A water quality prediction method according to claim 2, characterized in that: in step S5, the generation of the new parent population specifically comprises the following steps:
step B1, generating a memory bank and a new antibody population; the antibodies are sorted by similarity from high to low, and those with the highest similarity are retained as the memory bank; the antibodies are sorted by propagation probability from high to low, and the first N individuals form the new antibody population;
step B2, crossover and mutation; based on the antibody population generated in step B1, crossover and mutation are performed on each antibody to obtain a new antibody population;
step B3, generating a new generation of the parent population; the new antibody population obtained in step B2 is combined with the memory bank obtained in step B1 to form the new-generation parent population.
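Steps B1 to B3 can be sketched as one generation update. Everything below is illustrative: the claim does not specify the crossover or mutation operators, so single-point crossover and bit-flip mutation are assumed; the score used to rank the memory bank is passed in as `memory_scores` because the claim does not define how it is computed per antibody; all names and parameter values are hypothetical. Antibodies are assumed to be bit lists of length at least 2.

```python
import random


def crossover(a, b, rng):
    """Single-point crossover (assumed operator): head of a joined to tail of b."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(antibody, rate, rng):
    """Bit-flip mutation (assumed operator): flip each bit with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in antibody]


def next_parent_population(population, memory_scores, probabilities,
                           n_memory, n_select, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    indices = range(len(population))

    # Step B1: keep the top-ranked antibodies as the memory bank, and take the
    # first n_select antibodies by propagation probability as the new population.
    by_memory = sorted(indices, key=lambda i: memory_scores[i], reverse=True)
    memory_bank = [list(population[i]) for i in by_memory[:n_memory]]
    by_probability = sorted(indices, key=lambda i: probabilities[i], reverse=True)
    selected = [population[i] for i in by_probability[:n_select]]

    # Step B2: crossover and mutation for each selected antibody.
    offspring = []
    for antibody in selected:
        mate = rng.choice(selected)
        child = crossover(antibody, mate, rng)
        offspring.append(mutate(child, mutation_rate, rng))

    # Step B3: offspring plus the untouched memory bank form the new parents.
    return offspring + memory_bank


# Hypothetical 6-bit antibodies and scores.
population = [
    [1, 0, 1, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1, 1],
]
memory_scores = [0.9, 0.1, 0.5, 0.3]
probabilities = [0.1, 0.4, 0.3, 0.2]

parents = next_parent_population(population, memory_scores, probabilities,
                                 n_memory=1, n_select=2)
print(parents)
```

Keeping the memory bank out of the crossover and mutation step means the best-ranked antibodies survive each generation unchanged, while the rest of the population continues to explore.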
8. A water quality prediction method according to claim 2, characterized in that: the method is used for predicting changes in the water quality of lakes.
CN202311106497.0A 2023-08-30 2023-08-30 Water quality prediction method Pending CN117077819A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311106497.0A CN117077819A (en) 2023-08-30 2023-08-30 Water quality prediction method

Publications (1)

Publication Number Publication Date
CN117077819A true CN117077819A (en) 2023-11-17

Family

ID=88709581

Country Status (1)

Country Link
CN (1) CN117077819A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117633721A (en) * 2024-01-25 2024-03-01 水利部交通运输部国家能源局南京水利科学研究院 Urban river network transparency prediction method driven by mechanism model and data in combined mode
CN117633721B (en) * 2024-01-25 2024-04-09 水利部交通运输部国家能源局南京水利科学研究院 Urban river network transparency prediction method driven by mechanism model and data in combined mode

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination