CN112446168A - Effluent BOD concentration soft measurement method based on MIC and RBFNN - Google Patents

Effluent BOD concentration soft measurement method based on MIC and RBFNN

Info

Publication number
CN112446168A
Authority
CN
China
Prior art keywords
samples
effluent
sample
clustering
value
Prior art date
Legal status
Pending
Application number
CN202011169471.7A
Other languages
Chinese (zh)
Inventor
乔俊飞
石文强
李文静
Current Assignee
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date
Filing date
Publication date
Application filed by Beijing University of Technology
Priority to CN202011169471.7A
Publication of CN112446168A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00: Computer-aided design [CAD]
    • G06F30/20: Design optimisation, verification or simulation
    • G06F30/27: Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/23: Clustering techniques
    • G06F18/232: Non-hierarchical techniques
    • G06F18/2321: Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213: Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a soft measurement method for effluent BOD concentration based on the maximal information coefficient (MIC) and a radial basis function neural network (RBFNN). It addresses problems in measuring effluent BOD concentration in current sewage treatment processes: long waiting times, expensive instruments and equipment, and the need to develop, deploy, and maintain soft measurement hardware independently. Based on the biochemical reaction characteristics of sewage treatment, the method uses the MIC and an RBFNN to perform soft measurement of effluent BOD concentration, a key water quality parameter, thereby solving the problems that effluent BOD concentration is difficult to measure and that soft measurement hardware must be developed and deployed independently. The results show that the radial basis function neural network can predict the effluent BOD concentration of sewage treatment with reasonable accuracy, which helps improve the monitoring of effluent BOD concentration in the sewage treatment process.

Description

Effluent BOD concentration soft measurement method based on MIC and RBFNN
Technical Field
Based on the biochemical reaction characteristics of sewage treatment, the method uses the maximal information coefficient (MIC) and a radial basis function neural network (RBFNN) to predict the effluent BOD concentration, a key water quality parameter in the sewage treatment process. Effluent BOD concentration is an important parameter for characterizing water pollution and the degree of sewage treatment, and has a significant impact on the environment. Online prediction of effluent BOD concentration is an important part of sewage treatment; the invention belongs to the fields of artificial intelligence and sewage treatment.
Background
The urban sewage treatment process is a complex biochemical reaction process with large time lags. It is characterized by diversity, randomness, uncertainty, strong coupling, high nonlinearity, and strong time variation. Detecting and controlling key water quality parameters is an important precondition for the stable and efficient operation of sewage treatment plants.
Effluent BOD is one of the key parameters describing sewage characteristics and an important index of overall sewage treatment performance. However, conventional effluent BOD detection is performed offline, and a measured value is available only after several days. Because the sewage treatment process is also strongly nonlinear and time-varying, BOD is difficult to measure accurately.
Effluent BOD concentration can be obtained by manual laboratory assay, but the procedure is complex and takes 5 days from sampling to result. This time lag can seriously affect sewage treatment performance, and the assay itself easily causes secondary pollution. Compared with manual sampling and assay, online detection instruments shorten the detection time and avoid accidental errors caused by manual operation, but they are very expensive to purchase and maintain.
To measure effluent BOD concentration quickly and accurately, many researchers have proposed soft measurement methods. Soft measurement establishes a mathematical relationship between easily measured process variables and a target variable that is difficult to measure directly, and estimates the target variable through mathematical calculation and estimation methods. It can therefore measure variables that, for technical or economic reasons, are currently impossible or difficult to detect directly with sensors.
On this basis, the invention provides a soft measurement method for effluent BOD concentration based on the maximal information coefficient and a radial basis function neural network, realizing online prediction of effluent BOD concentration.
Disclosure of Invention
The invention provides an effluent BOD concentration prediction method based on the maximal information coefficient and a radial basis function neural network. The method trains the radial basis function neural network on production data from a sewage treatment plant and corrects the network parameters, realizing real-time measurement of effluent BOD concentration, solving the problem that effluent BOD concentration is difficult to measure in real time during sewage treatment, and reducing the production cost of sewage treatment;
The invention adopts the following technical scheme. The method for predicting effluent BOD concentration based on the maximal information coefficient and a radial basis function neural network comprises the following steps:
Step 1, determining auxiliary variables: perform correlation analysis on the raw water quality data collected from the sewage treatment plant using the maximal information coefficient (MIC). Calculate the correlation coefficient between each water quality parameter and the effluent BOD as shown in formulas (1) and (2), and select the variables whose correlation coefficient is greater than 0.5. The auxiliary variables strongly correlated with effluent BOD concentration obtained in this way are: effluent total nitrogen concentration, effluent ammonia nitrogen concentration, influent total nitrogen concentration, influent BOD concentration, influent ammonia nitrogen concentration, DO concentration of the biochemical tank, and phosphate concentration of the influent tank;
$$I(X;Y) = \sum_{x \in X} \sum_{y \in Y} p(x,y) \log_2 \frac{p(x,y)}{p(x)\,p(y)} \tag{1}$$
$$\mathrm{MIC}(X;Y) = \max_{|X||Y| < B(N)} \frac{I(X;Y)}{\log_2 \min(|X|, |Y|)} \tag{2}$$
where $I(X;Y)$ represents the mutual information of X and Y, $p(x,y)$ represents the joint probability distribution of X and Y, $p(x)$ and $p(y)$ represent the marginal probability distributions of X and Y respectively, $|X|$ and $|Y|$ are the numbers of grid cells along each variable, N represents the number of samples, and $B(N)$ is a function of the sample size with value $N^{0.6}$;
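The screening in step 1 can be illustrated with a small sketch. This is a simplified approximation, not the patent's implementation: a full MIC computation optimizes the grid partition, whereas the sketch below assumes equal-width grids only, so it generally gives a lower bound on the true MIC. The function names (`approx_mic`, `equal_width_bins`, `mutual_information`) are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xb, yb):
    """Mutual information (in bits) of two equally long lists of bin labels."""
    n = len(xb)
    pxy, px, py = Counter(zip(xb, yb)), Counter(xb), Counter(yb)
    return sum((c / n) * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def equal_width_bins(values, k):
    """Assign each value to one of k equal-width bins (assumption: no grid optimization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    return [min(int((v - lo) / (hi - lo) * k), k - 1) for v in values]


def approx_mic(x, y, alpha=0.6):
    """Search all grid sizes with nx * ny < N**alpha and keep the best normalized MI."""
    budget = len(x) ** alpha          # B(N) = N^0.6
    best = 0.0
    for nx in range(2, int(budget) // 2 + 1):
        for ny in range(2, int(budget) // nx + 1):
            if nx * ny >= budget:
                continue
            mi = mutual_information(equal_width_bins(x, nx), equal_width_bins(y, ny))
            best = max(best, mi / math.log2(min(nx, ny)))
    return best
```

Under these assumptions, a water quality parameter would be kept as an auxiliary variable when `approx_mic(parameter, effluent_bod) > 0.5`.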
Step 2, determining the initial cluster centers of the K-means clustering algorithm from the feature data screened in step 1: the sample densities and the distances between samples are used to determine K initial cluster centers for the K-means algorithm. Step 2 comprises the following steps:
Step 2.1, data normalization: normalize the training data and the test data according to formula (3) to reduce the influence of different dimensions (units) on the result;
$$x_{\mathrm{normal}} = \frac{x - \min}{\max - \min} \tag{3}$$
where $x_{\mathrm{normal}}$ represents the normalized data, min represents the minimum value of the variable over all samples, max represents the maximum value of the variable over all samples, and x represents the original value of the data;
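A minimal sketch of the min-max normalization described above, together with the inverse mapping that is applied to predictions later in the method. The function names are hypothetical, and the sketch assumes a single variable stored as a list of floats.

```python
def min_max_normalize(values):
    """Scale one variable to [0, 1]; also return min and max for later denormalization."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("variable is constant; min-max normalization is undefined")
    return [(v - lo) / (hi - lo) for v in values], lo, hi


def denormalize(values, lo, hi):
    """Inverse mapping: x_real = x_normal * (max - min) + min."""
    return [v * (hi - lo) + lo for v in values]
```

For example, `min_max_normalize([2.0, 4.0, 6.0])` yields the normalized values `[0.0, 0.5, 1.0]` together with the stored minimum and maximum.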
Step 2.2, determining candidate samples for clustering: calculate the Euclidean distances between all pairs of samples in the data and sort the distances in ascending order. Take the mean of the upper and lower quartiles of the distances as the distance threshold R, and calculate the density $\mathrm{density}_i$ of the i-th sample according to formula (4). Sort all densities in ascending order, take the mean of the upper and lower quartiles of the density values of all samples as the density threshold, and select the samples whose density is greater than or equal to this threshold as candidate samples;
$$\mathrm{density}_i = \sum_{j=1}^{N} \chi\left(R - \lVert x_i - x_j \rVert\right) \tag{4}$$
$$\chi(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \tag{5}$$
where $\lVert \cdot \rVert$ denotes the norm (modulus) operation, N represents the number of input samples, and $\chi(x)$ is a threshold function whose value is 0 or 1;
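The candidate selection of step 2.2 can be sketched as follows. This is an illustration under stated assumptions: quartiles are computed with Python's `statistics.quantiles` (its default method may differ from the one used by the authors), a sample counts itself in its own density, and the function names are hypothetical.

```python
import math
from statistics import quantiles


def quartile_mean(values):
    """Mean of the lower and upper quartiles of a list of numbers."""
    q1, _, q3 = quantiles(values, n=4)
    return (q1 + q3) / 2


def candidate_samples(samples):
    """Return the indices of high-density samples and the distance threshold R."""
    n = len(samples)
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    pair_dists = [dist[i][j] for i in range(n) for j in range(i + 1, n)]
    radius = quartile_mean(pair_dists)                      # distance threshold R
    # density_i = number of samples within distance R of sample i (self included)
    density = [sum(1 for j in range(n) if dist[i][j] <= radius) for i in range(n)]
    threshold = quartile_mean(density)                      # density threshold
    return [i for i in range(n) if density[i] >= threshold], radius
```

With four tightly grouped points and one distant outlier, the outlier has a low density and is excluded from the candidate set, which is the intended effect of the density filter.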
Step 2.3, determining the initial cluster centers of the K-means clustering algorithm: determine the final number of cluster centers K. From the candidate samples, take the two samples with the largest mutual distance as initial cluster centers, denoted $C_1$ and $C_2$, and delete them from the candidate set. Assign each remaining candidate sample to the nearest center according to the shortest Euclidean distance, forming two sample clusters $S_1$ and $S_2$. Calculate the distances from the samples in $S_1$ to $C_1$ and from the samples in $S_2$ to $C_2$, and take the sample in each cluster that is farthest from its initial cluster center, denoted $C_{11}$ and $C_{21}$, with the corresponding farthest distances denoted $d_1$ and $d_2$. If $d_1 \ge d_2$, remove $C_{11}$ from the original sample set and add it to the initial cluster centers as $C_3$; otherwise, remove $C_{21}$ from the original sample set and add it to the initial cluster centers as $C_3$;
Step 2.4, calculating the remaining initial cluster centers: assign the remaining samples to the existing initial cluster centers according to the nearest-distance principle, and denote the resulting clusters $S_1, S_2, \ldots, S_m$, where m is the number of existing clusters. Calculate the distance from the sample points in each cluster to its cluster center, record these distances as $d_1, d_2, \ldots, d_m$, and take $d_{m+1} = \max\{d_1, d_2, \ldots, d_m\}$, where
Figure BDA0002746843440000033
h is an empirical value in the range [0, 1]. The sample corresponding to $d_{m+1}$ is taken as a new cluster center, denoted $C_{m+1}$. If m+1 = K, all initial cluster centers have been determined and the step ends; if m+1 < K, the step is repeated;
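Steps 2.3 and 2.4 amount to a farthest-point seeding of the cluster centers, which can be sketched as below. The sketch makes two assumptions: each $d_i$ is read as the largest sample-to-center distance within cluster i, and the condition involving the empirical value h is omitted because its formula is not legible in the source. The function name `initial_centers` is hypothetical.

```python
import math


def initial_centers(candidates, k):
    """Pick k initial centers: the two farthest candidates first, then repeatedly
    the sample that lies farthest from its own (nearest) existing center."""
    pool = list(candidates)
    pairs = ((a, b) for a in range(len(pool)) for b in range(a + 1, len(pool)))
    i, j = max(pairs, key=lambda ab: math.dist(pool[ab[0]], pool[ab[1]]))
    centers = [pool[i], pool[j]]
    pool = [p for idx, p in enumerate(pool) if idx not in (i, j)]

    while len(centers) < k and pool:
        # Distance of a sample to the center of the cluster it would be assigned to.
        def distance_to_own_center(p):
            return min(math.dist(p, c) for c in centers)

        farthest = max(pool, key=distance_to_own_center)
        centers.append(farthest)
        pool.remove(farthest)
    return centers
```

Choosing the globally farthest sample is equivalent to comparing the per-cluster farthest distances $d_1, \ldots, d_m$ and taking the largest, as the steps describe.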
Step 3, determining the center, width, and weight parameters of the radial basis function neural network in the soft measurement model: substitute the K initial cluster centers obtained in step 2 into the original K-means algorithm to obtain the clustering result, use the clustering result as the center parameters of the radial basis functions, and set the initial weight between each hidden layer node and the output layer node to 1. Step 3 comprises the following steps:
Step 3.1, calculate the Euclidean distances between the samples in the data set and the existing cluster centers, and assign each sample to its nearest cluster center to form clusters;
Step 3.2, calculate the mean of all samples in each cluster and take it as the new cluster center;
Step 3.3, repeat steps 3.1 and 3.2 until the cluster centers no longer change or the number of iterations reaches a specified upper limit, obtaining K cluster centers;
Step 3.4, use the K cluster centers as the centers of the radial basis functions, and for each center take the shortest Euclidean distance from that center to the other centers as the width parameter of the corresponding radial basis function.
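Steps 3.1 to 3.4 can be sketched as a plain K-means loop followed by the nearest-center width rule. This is a minimal illustration; the function names and the default iteration limit are assumptions, and empty clusters simply keep their previous center.

```python
import math


def kmeans(samples, centers, max_iter=100):
    """Standard K-means started from the given initial centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for s in samples:                                   # step 3.1: assign to nearest center
            nearest = min(range(len(centers)), key=lambda i: math.dist(s, centers[i]))
            clusters[nearest].append(s)
        new_centers = [                                     # step 3.2: cluster means
            tuple(sum(col) / len(cluster) for col in zip(*cluster)) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:                          # step 3.3: stop when unchanged
            break
        centers = new_centers
    return centers


def rbf_widths(centers):
    """Step 3.4: width of each center = shortest distance to any other center."""
    return [min(math.dist(c, o) for j, o in enumerate(centers) if j != i)
            for i, c in enumerate(centers)]
```

For two well-separated groups such as `[(0.0,), (2.0,)]` and `[(10.0,), (12.0,)]`, the centers converge to the group means and both widths equal the distance between the two means.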
Step 4, determining the topology of the radial basis function neural network for predicting effluent BOD concentration. Step 4 comprises the following steps:
Step 4.1, determining the number of input layer nodes: the layer has n neurons, where n is the number of auxiliary variables determined in step 1. Each node represents one input variable $x_i$, where i is the index of the input variable, and the layer passes the input values directly to the hidden layer;
$$x_i, \quad i = 1, 2, \ldots, n \tag{6}$$
Step 4.2, determining the number of hidden layer nodes and their centers and widths: the layer has m neurons, where m is the number K of cluster centers determined in steps 2 and 3. The centers of the radial basis functions are the cluster centers obtained by the K-means algorithm, and each width is the shortest Euclidean distance from that cluster center to the other cluster centers. The hidden layer transfer function is a radial basis function, usually a standard Gaussian function, as shown in formula (7);
Figure BDA0002746843440000041
where $c_i$ represents the center parameter of the i-th hidden layer node and $\sigma_i$ represents the width parameter of the i-th hidden layer node;
Step 4.3, determining the output layer connection weights: the output layer has one node, whose output is given by formula (8). The initial connection weight between each hidden layer node and the output node is set to 1;
$$y_j = \sum_{i=1}^{m} w_i\, \varphi_i(x_j) \tag{8}$$
where $y_j$ represents the network output for the j-th input sample $x_j$, $\varphi_i(x_j)$ is the output of the i-th hidden layer node given by formula (7), and $w_i$ represents the connection weight between the i-th hidden layer node and the output node.
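The forward pass of the network in step 4 can be sketched as follows. The exact Gaussian of formula (7) is not legible in the source, so the common form $\exp(-\lVert x - c_i \rVert^2 / (2\sigma_i^2))$ is assumed here; the function names are hypothetical.

```python
import math


def gaussian(x, center, width):
    """Hidden node output; the 2 * width**2 scaling is an assumed form of formula (7)."""
    return math.exp(-math.dist(x, center) ** 2 / (2 * width ** 2))


def rbf_output(x, centers, widths, weights):
    """Single output node: weighted sum of the hidden node outputs."""
    return sum(w * gaussian(x, c, s) for w, c, s in zip(weights, centers, widths))
```

With all weights initialized to 1, as the method specifies, the initial network output is simply the sum of the hidden node activations.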
Step 5, adjusting the parameters of the radial basis function neural network of the soft measurement model. Step 5 comprises the following steps:
Step 5.1, determining the parameters to be updated: the parameters to be adjusted are the output weights of the radial basis function neural network. All weight parameters are arranged into a row vector, denoted $\Delta$, as shown in formula (9);
$$\Delta = [w_1, w_2, \ldots, w_m] \tag{9}$$
where $w_m$ represents the connection weight between the m-th hidden layer node and the output node;
Step 5.2, iteratively adjusting the weight parameters of the radial basis function neural network using the LM (Levenberg-Marquardt) algorithm: calculate the gradient vector, the Jacobian matrix, and the Hessian-like matrix from the current network input. The gradient vector g is calculated as shown in formula (10), the Jacobian matrix $j_p$ as shown in formula (11), and the Hessian-like matrix Q as shown in formula (12);
$$g = \sum_{p=1}^{P} j_p^T e_p \tag{10}$$
$$j_p = \left[\frac{\partial e_p}{\partial w_1}, \frac{\partial e_p}{\partial w_2}, \ldots, \frac{\partial e_p}{\partial w_m}\right] \tag{11}$$
$$Q = \sum_{p=1}^{P} j_p^T j_p \tag{12}$$
$$e_p = y_d - y_o \tag{13}$$
where P denotes the total number of samples, p denotes the index of the sample currently input to the network, $y_d$ is the desired output of the network, $y_o$ is the actual output of the network, and $j_p^T$ is the transpose of the Jacobian matrix $j_p$. Updating the parameters: update the output weights of the radial basis function neural network according to formula (14);
$$\Delta_{k+1} = \Delta_k - (Q_k + \mu_k I)^{-1} g_k \tag{14}$$
where k represents the current training iteration and $\mu$ represents the learning rate, with initial value 1. When the network error decreases, $\mu$ is reduced to 1/10 of its value in the previous iteration; otherwise it is increased to 10 times its previous value. The upper limit of $\mu$ is 10^15 and the lower limit is 10^-15;
If the absolute change in error between two successive parameter updates is less than 10^-10, or the number of iterations in a single adjustment cycle reaches the upper limit, end the parameter adjustment for this cycle, input the next sample, and repeat step 5.2;
If all training samples have been traversed but the error is still not below the target value and the number of traversals has not reached the upper limit, input the first sample of the training set again and repeat step 5.2; otherwise, end the parameter adjustment process;
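The weight adjustment of step 5.2 can be sketched as a Levenberg-Marquardt loop over the output weights. This is a simplified batch variant, not the per-sample cycling described above: it accumulates Q and g over all samples in each iteration and rejects steps that increase the error. It assumes the hidden layer outputs are precomputed in a matrix `phi` (one row per sample), and all function names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def sse(phi, targets, w):
    """Sum of squared errors e_p = y_d - y_o over all samples."""
    return sum((t - sum(wi * p for wi, p in zip(w, row))) ** 2
               for row, t in zip(phi, targets))


def lm_fit(phi, targets, w, mu=1.0, max_iter=100, tol=1e-10):
    """Adjust output weights w with the update w <- w - (Q + mu*I)^-1 g."""
    m = len(w)
    err = sse(phi, targets, w)
    for _ in range(max_iter):
        q = [[0.0] * m for _ in range(m)]
        g = [0.0] * m
        for row, t in zip(phi, targets):
            e = t - sum(wi * p for wi, p in zip(w, row))    # e_p = y_d - y_o
            j = [-p for p in row]                           # j_p = d e_p / d w
            for a in range(m):
                g[a] += j[a] * e
                for b in range(m):
                    q[a][b] += j[a] * j[b]
        for a in range(m):
            q[a][a] += mu                                   # Q + mu * I
        step = solve(q, g)
        candidate = [wi - si for wi, si in zip(w, step)]
        new_err = sse(phi, targets, candidate)
        if new_err < err:                                   # error decreased: accept, shrink mu
            converged = err - new_err < tol
            w, err = candidate, new_err
            mu = max(mu / 10, 1e-15)
            if converged:
                break
        else:                                               # error increased: reject, grow mu
            mu = min(mu * 10, 1e15)
    return w
```

Because the network output is linear in the output weights, the Jacobian row for each sample is just the negated hidden layer output, which keeps the sketch short.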
Step 5.3, use the test samples as the input of the radial basis function neural network to obtain the normalized predicted effluent BOD concentration, and denormalize the result according to formula (15) to obtain the actual predicted effluent BOD concentration;
$$x_{\mathrm{real}} = x_{\mathrm{normal}} \cdot (\max - \min) + \min \tag{15}$$
where $x_{\mathrm{real}}$ represents the actual (denormalized) prediction data, $x_{\mathrm{normal}}$ represents the normalized prediction data, and min and max are the minimum and maximum values used in formula (3);
Step 6, package the soft measurement model obtained in step 5 into a jar file and import it into a JavaWeb project. Deploy the service on a cloud server and access the project through a browser; after production data are uploaded, the server calls the radial basis function neural network program to make the prediction and returns the result to the client.
The main features of the invention are:
(1) To address the problem that the effluent BOD concentration of current sewage treatment plants cannot be measured in real time, the invention uses the maximal information coefficient algorithm to extract 7 variables that are highly correlated with effluent BOD concentration, which simplifies the neural network input and improves the processing speed of the radial basis function neural network;
(2) The urban sewage treatment process is a complex biochemical reaction process with large time lags, characterized by diversity, randomness, uncertainty, strong coupling, high nonlinearity, and strong time variation. Therefore, based on measured data from an actual sewage treatment plant, the invention uses a radial basis function neural network to predict effluent BOD concentration, which offers relatively high prediction accuracy and strong adaptability to complex working conditions;
Note in particular: the use of the 7 auxiliary variables screened by the maximal information coefficient algorithm, and the radial basis function neural network initialization based on the improved K-means algorithm, both fall within the scope of the invention;
Drawings
FIG. 1 is a structural diagram of the radial basis function neural network of the invention
FIG. 2 is a training result chart of the effluent BOD concentration prediction method of the invention
FIG. 3 is a training error chart of the effluent BOD concentration prediction method of the invention
FIG. 4 is a test result chart of the effluent BOD concentration prediction method of the invention
FIG. 5 is a test error chart of the effluent BOD concentration prediction method of the invention
Detailed Description
The invention provides a soft measurement method for effluent BOD concentration based on the maximal information coefficient and a radial basis function neural network. Auxiliary variables are screened using the maximal information coefficient, the radial basis function neural network is initialized using an improved K-means algorithm, and the output weights of the network are adjusted using the second-order LM algorithm. The method realizes real-time measurement of effluent BOD concentration and solves the problem that effluent BOD concentration is difficult to measure in real time during sewage treatment;
The experimental data come from the production and operation data of a sewage treatment plant in Beijing. Measured data for effluent total nitrogen concentration, effluent ammonia nitrogen concentration, influent total nitrogen concentration, influent BOD concentration, influent ammonia nitrogen concentration, DO concentration of the biochemical tank, and phosphate concentration of the influent tank are selected as experimental sample data. There are 365 groups of samples in total, divided into two parts: the first 280 groups are used as training samples and the remaining 85 groups as test samples;
A method for predicting effluent BOD concentration based on the maximal information coefficient and a radial basis function neural network, characterized by comprising the following steps:
Step 1, determining auxiliary variables: perform correlation analysis on the raw water quality data collected from the sewage treatment plant using the maximal information coefficient (MIC). Calculate the correlation coefficient between each water quality parameter and the effluent BOD as shown in formulas (16) and (17), and select the variables whose correlation coefficient is greater than 0.5. The auxiliary variables strongly correlated with effluent BOD concentration obtained in this way are: effluent total nitrogen concentration, effluent ammonia nitrogen concentration, influent total nitrogen concentration, influent BOD concentration, influent ammonia nitrogen concentration, DO concentration of the biochemical tank, and phosphate concentration of the influent tank;
$$I(X;Y) = \sum_{x \in X} \sum_{y \in Y} p(x,y) \log_2 \frac{p(x,y)}{p(x)\,p(y)} \tag{16}$$
$$\mathrm{MIC}(X;Y) = \max_{|X||Y| < B(N)} \frac{I(X;Y)}{\log_2 \min(|X|, |Y|)} \tag{17}$$
where $I(X;Y)$ represents the mutual information of X and Y, $p(x,y)$ represents the joint probability distribution of X and Y, $p(x)$ and $p(y)$ represent the marginal probability distributions of X and Y respectively, $|X|$ and $|Y|$ are the numbers of grid cells along each variable, N represents the number of samples, and $B(N)$ is a function of the sample size with value $N^{0.6}$;
Step 2, determining the initial cluster centers of the K-means clustering algorithm from the feature data screened in step 1: the sample densities and the distances between samples are used to determine K initial cluster centers for the K-means algorithm. Step 2 comprises the following steps:
Step 2.1, data normalization: normalize the training data and the test data according to formula (18) to reduce the influence of different dimensions (units) on the result;
$$x_{\mathrm{normal}} = \frac{x - \min}{\max - \min} \tag{18}$$
where $x_{\mathrm{normal}}$ represents the normalized data, min represents the minimum value of the variable over all samples, max represents the maximum value of the variable over all samples, and x represents the original value of the data;
Step 2.2, determining candidate samples for clustering: calculate the Euclidean distances between all pairs of samples in the data and sort the distances in ascending order. Take the mean of the upper and lower quartiles of the distances as the distance threshold R, and calculate the density $\mathrm{density}_i$ of the i-th sample according to formula (19). Sort all densities in ascending order, take the mean of the upper and lower quartiles of the density values of all samples as the density threshold, and select the samples whose density is greater than or equal to this threshold as candidate samples;
$$\mathrm{density}_i = \sum_{j=1}^{N} \chi\left(R - \lVert x_i - x_j \rVert\right) \tag{19}$$
$$\chi(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \tag{20}$$
where $\lVert \cdot \rVert$ denotes the norm (modulus) operation, N represents the number of input samples, and $\chi(x)$ is a threshold function whose value is 0 or 1;
Step 2.3, determining the initial cluster centers of the K-means clustering algorithm: determine the final number of cluster centers K. From the candidate samples, take the two samples with the largest mutual distance as initial cluster centers, denoted $C_1$ and $C_2$, and delete them from the candidate set. Assign each remaining candidate sample to the nearest center according to the shortest Euclidean distance, forming two sample clusters $S_1$ and $S_2$. Calculate the distances from the samples in $S_1$ to $C_1$ and from the samples in $S_2$ to $C_2$, and take the sample in each cluster that is farthest from its initial cluster center, denoted $C_{11}$ and $C_{21}$, with the corresponding farthest distances denoted $d_1$ and $d_2$. If $d_1 \ge d_2$, remove $C_{11}$ from the original sample set and add it to the initial cluster centers as $C_3$; otherwise, remove $C_{21}$ from the original sample set and add it to the initial cluster centers as $C_3$;
Step 2.4, calculating the remaining initial cluster centers: assign the remaining samples to the existing initial cluster centers according to the nearest-distance principle, and denote the resulting clusters $S_1, S_2, \ldots, S_m$, where m is the number of existing clusters. Calculate the distance from the sample points in each cluster to its cluster center, record these distances as $d_1, d_2, \ldots, d_m$, and take $d_{m+1} = \max\{d_1, d_2, \ldots, d_m\}$, where
Figure BDA0002746843440000083
h is an empirical value in the range [0, 1]. The sample corresponding to $d_{m+1}$ is taken as a new cluster center, denoted $C_{m+1}$. If m+1 = K, all initial cluster centers have been determined and the step ends; if m+1 < K, the step is repeated;
Step 3, determining the center, width, and weight parameters of the radial basis function neural network in the soft measurement model: substitute the K initial cluster centers obtained in step 2 into the original K-means algorithm to obtain the clustering result, use the clustering result as the center parameters of the radial basis functions, and set the initial weight between each hidden layer node and the output layer node to 1. Step 3 comprises the following steps:
Step 3.1, calculate the Euclidean distances between the samples in the data set and the existing cluster centers, and assign each sample to its nearest cluster center to form clusters;
Step 3.2, calculate the mean of all samples in each cluster and take it as the new cluster center;
Step 3.3, repeat steps 3.1 and 3.2 until the cluster centers no longer change or the number of iterations reaches a specified upper limit, obtaining K cluster centers;
Step 3.4, use the K cluster centers as the centers of the radial basis functions, and for each center take the shortest Euclidean distance from that center to the other centers as the width parameter of the corresponding radial basis function.
Step 4, determining the topology of the radial basis function neural network for predicting effluent BOD concentration. Step 4 comprises the following steps:
Step 4.1, determining the number of input layer nodes: the layer has n neurons, where n is the number of auxiliary variables determined in step 1. Each node represents one input variable $x_i$, where i is the index of the input variable, and the layer passes the input values directly to the hidden layer;
$$x_i, \quad i = 1, 2, \ldots, n \tag{21}$$
Step 4.2, determining the number of hidden layer nodes and their centers and widths: the layer has m neurons, where m is the number K of cluster centers determined in steps 2 and 3. The centers of the radial basis functions are the cluster centers obtained by the K-means algorithm, and each width is the shortest Euclidean distance from that cluster center to the other cluster centers. The hidden layer transfer function is a radial basis function, usually a standard Gaussian function, as shown in formula (22);
Figure BDA0002746843440000091
where $c_i$ represents the center parameter of the i-th hidden layer node and $\sigma_i$ represents the width parameter of the i-th hidden layer node;
Step 4.3, determining the output layer connection weights: the output layer has one node, whose output is given by formula (23). The initial connection weight between each hidden layer node and the output node is set to 1;
$$y_j = \sum_{i=1}^{m} w_i\, \varphi_i(x_j) \tag{23}$$
where $y_j$ represents the network output for the j-th input sample $x_j$, $\varphi_i(x_j)$ is the output of the i-th hidden layer node given by formula (22), and $w_i$ represents the connection weight between the i-th hidden layer node and the output node.
Step 5,
Adjusting the parameters of the radial basis function neural network of the soft measurement model, wherein the step 5 comprises the following steps;
step 5.1 determining the parameters to be updated: the parameters to be adjusted are the output weights of the radial basis function neural network, all the weight parameters are arranged into a row vector, and the row vector is marked as delta, and the value of the delta is shown as a formula (24);
Δ=[w1,w2,...,wm] (24)
wherein wmRepresenting the connection weight of the mth hidden layer node and the output node; step 5.2, circularly adjusting weight parameters of the radial basis function neural network by using an LM algorithm: calculating a gradient vector, a Jacobian matrix and a Hessian-like matrix according to the input of the current network, wherein the gradient vector g is calculated as shown in a formula (26), and the Jacobian matrix j is calculated as shown in a formula (26)pThe calculation of (2) is shown as a formula (26), and the calculation of the Hessian-like matrix Q is shown as a formula (27);
Figure BDA0002746843440000101
Figure BDA0002746843440000102
Figure BDA0002746843440000103
$e_p = y_d - y_o$ (28)
where P denotes the total number of samples, p denotes the index of the sample currently input to the network, $y_d$ is the desired output of the network, $y_o$ is the actual output of the network, and $j_p^T$ denotes the transpose of the Jacobian matrix $j_p$; updating the parameters to be updated: the output weights of the radial basis function neural network are updated according to formula (29);
$\Delta_{k+1} = \Delta_k - (Q_k + \mu_k I)^{-1} g_k$ (29)
wherein k denotes the current training iteration and μ denotes the learning rate, with an initial value of 1; when the network error decreases, μ is reduced to 1/10 of its value in the previous iteration, otherwise it is increased to 10 times its value in the previous iteration; the upper limit of μ is 10^15 and the lower limit of μ is 10^-15;
if the absolute value of the change in error between two successive parameter updates is less than 10^-10, or the number of iterations in a single adjustment cycle reaches the upper limit, the parameter adjustment for this cycle ends, the next sample is input, and step 5.2 is repeated;
if all training samples have been traversed but the error is still not smaller than the target value and the number of traversals has not reached the upper limit, the first sample of the training sample set is input again and step 5.2 is repeated; otherwise the parameter adjustment process ends;
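The weight adjustment of step 5 can be sketched as follows. This is a simplified illustration under stated assumptions: formulas (25) to (27) are only available as images, so the standard Levenberg-Marquardt quantities are assumed (J is the derivative of the errors with respect to the weights, Q = JᵀJ, g = Jᵀe); the update is applied to the whole batch rather than sample by sample as in the text; and the function name `lm_train_weights`, the iteration cap and the lower bound 1e-15 on μ are hypothetical choices.

```python
import numpy as np

def lm_train_weights(phi, y_d, w, mu=1.0, max_iter=100, tol=1e-10):
    """LM fit of the output weights; phi is the (P x m) hidden-layer output matrix."""
    err = y_d - phi @ w                      # e_p = y_d - y_o for every sample
    sse = float(err @ err)
    for _ in range(max_iter):
        J = -phi                             # d e_p / d w_i = -phi_i(x_p)
        Q = J.T @ J                          # Hessian-like matrix (assumed form)
        g = J.T @ err                        # gradient vector (assumed form)
        # Update rule of formula (29): w <- w - (Q + mu*I)^-1 g
        w_new = w - np.linalg.solve(Q + mu * np.eye(len(w)), g)
        err_new = y_d - phi @ w_new
        sse_new = float(err_new @ err_new)
        if sse_new < sse:
            change = sse - sse_new
            w, err, sse = w_new, err_new, sse_new
            mu = max(mu / 10.0, 1e-15)       # error decreased: shrink mu tenfold
            if change < tol:                 # error change below 10^-10: stop
                break
        else:
            mu = min(mu * 10.0, 1e15)        # error grew: reject step, enlarge mu
    return w
```

Because only the output weights are trained, the problem is linear in the weights and the iteration converges quickly once μ becomes small.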
step 6, inputting the test data into the trained radial basis function neural network to obtain the predicted effluent BOD concentration; the MATLAB program is packaged into a jar file using MATLAB and added to a Java project, so that the effluent BOD soft measurement is carried out in Java by calling the corresponding API;
the training results of the radial basis function neural network are shown in Fig. 2, X axis: sample number (count), Y axis: effluent BOD concentration in mg/L; the dotted line is the actual effluent BOD concentration and the solid line is the output of the radial basis function neural network; the error between the actual effluent BOD concentration and the output of the radial basis function neural network is shown in Fig. 3, X axis: sample number (count), Y axis: effluent BOD concentration in mg/L;
the prediction results are shown in Fig. 4, X axis: sample number (count), Y axis: effluent BOD concentration in mg/L; the dotted line is the actual effluent BOD concentration and the solid line is the predicted effluent BOD concentration; the error between the actual and the predicted effluent BOD concentration is shown in Fig. 5, X axis: sample number (count), Y axis: predicted effluent BOD concentration in mg/L;
Tables 1-18 show the experimental data of the present invention; the auxiliary variables have been normalized (normalization interval [-1, 1]). Tables 1 to 7 show the auxiliary variable data used in training, Table 8 the actual training outputs, Table 9 the outputs of the radial basis function neural network during training, Tables 10 to 16 the auxiliary variable data of the test samples, Table 17 the actual test outputs, and Table 18 the effluent BOD concentration values predicted by the present invention.
TABLE 1 Auxiliary variable: effluent total nitrogen concentration
Figure BDA0002746843440000111
Figure BDA0002746843440000121
TABLE 2 Auxiliary variable: effluent ammonia nitrogen concentration
Figure BDA0002746843440000122
Figure BDA0002746843440000131
TABLE 3 Auxiliary variable: influent total nitrogen concentration
-0.529 0.740 0.047 -0.017 0.339 0.044 0.666 -0.018 -0.522
0.083 0.012 0.065 -0.352 -0.326 0.273 -0.312 -0.270 -0.803
0.049 0.669 -0.714 -0.018 0.058 -0.295 -0.207 -0.531 0.820
0.037 -0.809 0.026 -0.226 -0.360 -0.148 0.162 -0.250 0.056
-0.299 -0.377 -0.177 -0.258 0.073 -0.022 -0.477 -0.322 -0.109
0.499 0.016 -0.686 -0.738 -0.015 -0.544 0.290 -0.276 -0.224
-0.273 -0.514 0.106 -0.377 -0.529 -0.873 -0.201 0.833 -0.418
-0.268 -0.298 -0.061 -0.609 0.753 -0.566 0.522 0.120 -0.501
-0.051 -0.253 0.317 0.002 0.027 -0.008 -0.417 -0.161 0.378
-0.436 -0.528 -0.212 -0.509 -0.396 0.042 -0.737 -0.381 -1.000
0.371 -0.558 -0.566 -0.534 0.658 0.367 -0.268 -0.258 0.060
-0.398 -0.317 0.147 -0.521 0.008 -0.588 0.049 0.052 -0.091
0.793 -0.242 -0.283 -0.479 0.198 -0.320 -0.533 -0.682 0.309
-0.462 -0.246 -0.400 -0.229 -0.227 0.013 0.004 -0.297 -0.448
-0.495 -0.352 -0.869 0.780 -0.036 -0.252 0.092 -0.589 -0.558
-0.236 -0.143 -0.057 -0.711 0.835 -0.234 -0.934 0.415 -0.257
-0.285 0.248 0.382 -0.521 0.059 -0.105 0.749 -0.194 0.037
-0.276 -0.263 0.259 -0.530 -0.544 0.032 -0.512 -0.336 -0.209
-0.606 -0.196 0.228 -0.551 -0.090 -0.349 0.326 -0.565 -0.275
0.004 -0.478 -0.701 0.040 -0.518 -0.033 -0.206 -0.415 0.504
-0.746 -0.212 -0.363 -0.146 0.075 0.601 0.206 -0.580 -0.085
-0.391 -0.308 -0.419 -0.257 -0.680 -0.604 0.019 -0.240 0.590
-0.638 -0.282 -0.327 -0.524 -0.267 -0.334 -0.553 -0.233 0.257
-0.243 0.191 -0.283 -0.374 0.284 -0.339 -0.516 -0.188 -0.191
-0.037 -0.613 -0.268 0.259 0.806 -0.363 -0.495 -0.631 -0.422
-0.274 0.070 -0.322 -0.529 0.087 -0.323 -0.613 0.208 0.374
-0.266 0.334 0.234 0.183 -0.536 -0.135 -0.303 0.386 -0.237
0.241 -0.281 -0.253 -0.664 -0.007 -0.646 -0.400 -0.259 -0.217
-0.420 -0.260 0.060 0.115 -0.033 0.066 0.167 -0.256 0.012
-0.425 -0.582 0.029 -0.440 -0.148 -0.490 -0.350 -0.282 0.916
0.363 -0.538 -0.272 -0.220 -0.216 0.727 0.015 -0.555 -0.018
0.009
TABLE 4 Auxiliary variable: influent BOD concentration
Figure BDA0002746843440000132
Figure BDA0002746843440000141
TABLE 5 Auxiliary variable: influent ammonia nitrogen concentration
Figure BDA0002746843440000142
Figure BDA0002746843440000151
TABLE 6 Auxiliary variable: biochemical tank DO concentration
Figure BDA0002746843440000152
Figure BDA0002746843440000161
TABLE 7 Auxiliary variable: influent tank phosphate concentration
Figure BDA0002746843440000162
Figure BDA0002746843440000171
TABLE 8 Measured effluent BOD concentration (mg/L)
10.371 12.957 12.529 14.829 12.871 14.143 14.700 12.600 12.929
13.029 12.729 13.843 10.800 10.557 14.671 11.543 11.686 11.857
12.971 13.386 11.700 12.857 12.543 11.600 11.314 11.029 12.100
13.829 11.171 13.114 10.843 11.071 12.386 11.929 10.857 12.843
12.000 12.380 12.029 11.543 12.557 12.343 10.700 11.486 12.900
15.100 12.914 12.171 10.100 12.714 10.857 14.800 11.814 10.986
11.386 13.100 13.943 11.686 10.900 11.214 11.800 14.300 10.843
10.971 10.200 12.814 11.114 12.814 10.943 14.214 13.871 10.686
12.800 10.671 15.329 12.686 14.800 12.643 10.943 12.271 13.457
10.900 11.443 10.457 11.200 11.129 12.857 12.043 10.729 11.300
13.314 12.071 11.900 10.614 13.471 13.243 11.371 10.629 13.971
10.986 11.514 13.886 10.543 12.357 11.029 12.771 12.900 12.200
12.386 10.800 11.629 10.600 14.371 12.100 11.000 11.086 13.043
11.600 11.357 10.400 10.900 11.786 14.486 12.671 11.571 12.730
11.057 10.814 11.671 12.529 12.400 11.100 13.857 12.457 10.243
11.814 10.500 11.871 12.100 13.643 12.243 11.486 15.300 11.057
11.086 15.700 13.529 12.271 12.786 12.457 14.500 11.300 12.943
11.029 11.200 14.657 10.214 12.414 12.814 10.714 11.400 11.143
12.414 11.957 14.514 12.243 12.157 12.240 13.800 12.529 11.671
13.457 11.400 10.171 12.586 12.686 15.000 10.700 11.729 13.129
11.129 10.600 12.310 12.000 13.800 12.043 14.200 10.600 12.857
12.450 10.500 12.590 11.743 11.657 10.400 12.600 11.000 13.843
12.314 10.371 11.457 11.857 11.129 12.170 10.457 11.343 14.543
11.800 14.029 10.357 10.971 13.014 11.643 10.529 11.957 12.314
12.771 11.571 10.514 12.986 12.243 10.671 11.757 11.200 10.900
10.457 12.614 11.157 12.757 14.600 11.457 12.386 14.157 13.386
10.543 13.071 12.957 12.900 12.586 12.586 11.200 13.600 10.914
14.414 10.414 11.629 10.243 13.629 11.614 10.786 10.914 10.800
10.329 11.371 12.600 12.714 12.757 12.671 14.229 11.743 12.686
10.500 10.186 14.314 11.443 12.429 11.814 10.371 10.700 14.100
13.171 10.200 11.286 11.329 10.771 13.100 13.286 11.000 13.800
13.814
TABLE 9 radial basis function neural network training output (mg/L)
11.711 12.926 13.370 13.929 13.699 13.212 14.130 12.596 12.069
13.417 12.604 13.309 10.690 11.376 13.536 11.372 10.925 11.351
13.296 14.083 11.259 13.129 12.622 12.221 11.249 11.771 13.480
13.327 11.299 13.104 10.996 11.283 12.688 12.881 10.652 12.557
11.263 12.554 12.803 10.952 12.542 13.136 10.521 11.135 13.428
13.989 13.263 11.682 11.011 12.879 10.772 14.258 12.404 10.910
11.571 11.901 12.963 11.333 11.032 10.899 12.253 13.645 11.359
11.012 10.870 13.063 10.709 13.230 10.518 13.390 13.167 11.098
12.941 11.116 13.569 12.553 13.914 12.522 11.365 12.842 13.206
11.348 11.943 10.524 10.675 11.095 12.534 10.581 11.326 10.577
13.518 11.389 11.323 12.048 12.790 13.604 11.021 10.792 13.069
11.454 11.291 13.087 11.746 13.370 10.581 13.157 12.545 12.558
13.577 10.654 10.970 10.706 13.727 12.598 11.036 10.612 13.768
11.094 11.254 11.060 10.928 12.616 13.605 12.921 10.943 12.575
11.789 11.277 10.593 13.383 12.791 11.060 13.261 12.126 10.491
12.628 10.749 12.553 11.833 13.749 12.657 10.812 13.862 11.063
11.014 13.550 13.474 11.854 12.603 12.708 14.024 11.055 13.258
10.700 11.427 14.058 11.018 11.642 12.545 12.078 11.168 11.059
11.452 12.894 13.927 11.601 12.810 12.472 13.531 11.350 12.449
13.174 11.718 11.338 12.726 11.838 14.069 10.940 11.306 13.732
10.400 11.028 12.472 12.586 12.759 13.561 13.795 10.607 13.396
12.553 11.405 12.599 10.757 11.502 10.595 13.058 10.984 13.026
11.774 10.872 11.104 11.812 11.101 12.428 11.396 11.357 13.879
10.696 13.344 11.197 11.343 13.617 11.358 10.699 12.540 12.725
13.232 11.660 11.075 13.674 13.182 11.403 11.877 10.821 10.640
10.919 12.613 10.844 11.675 13.709 11.138 11.585 13.583 13.514
10.716 13.744 13.595 13.583 11.681 13.053 11.125 13.541 10.716
13.845 10.926 10.971 11.333 13.113 11.511 11.417 10.930 11.109
10.886 11.259 13.202 13.448 12.912 12.524 13.670 12.504 13.304
10.730 10.846 13.289 11.187 13.062 11.458 10.817 10.799 13.946
13.435 11.462 11.080 11.299 10.900 12.797 13.324 10.729 13.217
13.194
Test samples
TABLE 10 Auxiliary variable: effluent total nitrogen concentration
Figure BDA0002746843440000181
Figure BDA0002746843440000191
TABLE 11 Auxiliary variable: effluent ammonia nitrogen concentration
0.484 0.029 0.282 -0.800 -0.245 -0.938 -0.386 -0.580 -0.399
-0.688 -0.679 0.362 0.565 -0.097 -0.789 -0.951 -0.643 0.289
0.273 -0.545 -1.000 -0.841 -0.080 -0.437 0.180 0.321 -0.179
0.343 -0.628 0.354 -0.713 0.427 0.183 -0.617 -0.529 0.584
0.403 -0.461 -0.383 -0.716 -0.630 0.432 0.256 0.430 0.208
-0.299 0.825 -0.506 -0.792 -0.761 -0.825 0.597 0.214 0.494
0.987 -0.567 -0.989 -0.369 -0.094 -0.594 0.305 -0.940 0.266
0.357 -0.670 -0.469 0.273 0.610 0.344 -0.756 0.255 0.224
0.805 0.591 0.412 -0.950 -0.114 0.266 -0.675 0.268 0.394
-0.555 0.393 -0.485 -0.443
TABLE 12 Auxiliary variable: influent total nitrogen concentration
-0.374 -0.310 -0.489 0.224 0.077 -0.216 0.183 -0.579 0.150
-0.122 -0.545 -0.379 -0.634 -0.277 -0.013 0.147 -0.302 -0.503
-0.478 0.767 -0.157 0.266 -0.433 0.073 -0.561 -0.628 0.022
-0.686 0.453 -0.627 1.000 -0.540 -0.405 -0.486 0.022 -0.451
-0.290 0.116 -0.512 -0.175 -0.504 -0.556 -0.491 -0.662 -0.263
-0.331 -0.672 -0.307 0.136 0.359 0.208 -0.426 -0.248 -0.936
-0.618 0.385 0.174 0.063 -0.462 0.582 -0.330 -0.002 -0.286
-0.250 0.332 0.128 -0.238 0.111 -0.344 0.036 -0.294 -0.274
-0.523 -0.590 -0.341 0.175 -0.061 -0.247 -0.453 -0.277 -0.418
-0.528 -0.485 -0.158 0.382
TABLE 13 Auxiliary variable: influent BOD concentration
Figure BDA0002746843440000192
Figure BDA0002746843440000201
TABLE 14 Auxiliary variable: influent ammonia nitrogen concentration
-0.240 -0.305 -0.356 0.101 0.140 -0.248 0.460 -0.493 0.675
-0.049 -0.427 -0.446 -0.316 0.026 0.166 0.105 -0.448 -0.478
-0.279 0.885 -0.040 0.448 -0.092 0.067 -0.417 -0.446 0.071
-0.533 0.346 -0.519 0.706 -0.260 -0.051 -0.504 0.037 0.057
-0.406 0.355 -0.392 0.451 -0.410 -0.441 -0.487 -0.383 -0.368
-0.348 0.020 -0.400 0.425 0.433 0.040 -0.276 -0.359 0.099
-0.514 0.295 0.263 0.173 -0.348 0.464 -0.351 0.013 -0.402
-0.421 0.452 0.470 -0.442 -0.492 -0.438 0.306 -0.584 -0.415
-0.433 -0.248 -0.446 0.363 -0.019 -0.447 -0.417 -0.395 -0.237
-0.298 -0.345 -0.233 0.448
TABLE 15 Auxiliary variable: biochemical tank DO concentration
-0.325 0.564 0.342 -0.654 -0.218 -0.169 -0.383 0.111 -0.300
-0.342 0.539 -0.259 0.523 -0.374 -0.177 -0.243 0.704 0.169
-0.029 0.169 -0.276 -0.300 -0.202 0.251 0.267 0.449 -0.193
-0.119 0.103 0.070 -0.128 0.350 -0.218 0.358 -0.399 0.037
0.152 -0.457 0.440 -0.259 0.539 0.424 0.004 0.350 0.646
0.597 0.613 0.473 -0.366 -0.597 -0.473 0.514 0.671 0.259
0.383 0.556 -0.185 -0.358 -0.202 -0.169 0.202 -0.259 0.572
0.399 0.012 -0.383 0.564 0.630 0.235 -0.111 0.556 0.556
0.070 0.152 0.440 -0.366 -0.366 0.218 0.358 0.309 0.193
0.712 0.407 -0.333 0.572
TABLE 16 Auxiliary variable: influent tank phosphate concentration
-0.832 -0.802 -0.835 -0.204 -0.646 -0.085 -0.527 -0.822 -0.572
-0.591 -0.808 -0.916 -0.952 0.772 -0.547 -0.042 -0.796 -0.867
-0.880 -0.686 0.080 -0.220 0.345 -0.572 -0.892 -0.881 -0.610
-0.976 -0.719 -0.936 -0.363 -0.761 0.509 -0.790 -0.731 -0.853
-0.789 -0.618 -0.793 -0.654 -0.811 -0.846 -0.863 -0.783 -0.794
-0.815 -0.829 -0.800 -0.204 -0.451 -0.193 -0.856 -0.769 -0.983
-0.898 -0.724 -0.047 -0.631 0.181 -0.569 -0.858 0.087 -0.774
-0.795 -0.693 -0.549 -0.770 -0.668 -0.889 -0.534 -0.869 -0.784
-0.870 -0.914 -0.898 -0.310 -0.369 -0.828 -0.865 -0.819 -0.820
-0.899 -0.812 -0.261 -0.729
TABLE 17 Actual effluent BOD concentration (mg/L)
10.300 11.514 10.286 14.286 12.500 11.886 13.200 11.529 13.143
12.743 11.486 11.029 10.157 12.171 12.729 14.400 11.600 10.800
10.243 12.671 12.100 14.000 12.660 12.800 11.100 10.200 12.771
10.129 14.586 10.314 13.900 12.600 12.520 11.229 12.629 10.600
10.286 13.086 11.443 12.114 10.886 10.800 11.000 12.243 11.457
11.429 12.229 11.571 14.086 13.100 12.929 10.271 11.714 11.257
11.043 14.957 12.614 12.729 12.800 14.900 10.657 14.657 11.400
10.714 15.500 13.000 10.829 11.900 10.614 12.643 11.143 11.300
10.771 10.386 11.114 13.900 12.529 10.986 11.771 11.200 11.286
11.857 11.400 11.971 11.986
TABLE 18 radial basis function neural network prediction output (mg/L)
10.649 11.010 10.688 13.822 12.576 12.594 13.666 11.784 13.533
13.225 11.761 11.020 11.259 12.607 13.251 13.677 11.374 10.608
10.738 13.392 12.955 13.725 12.588 13.029 10.772 10.861 12.520
10.871 13.602 10.879 14.053 11.343 12.553 11.908 12.818 10.370
10.906 13.547 12.161 12.669 11.883 10.968 10.718 11.676 11.012
11.091 11.312 11.351 13.434 13.887 13.655 11.092 10.960 10.553
11.108 13.339 13.508 12.608 12.563 14.101 11.334 13.538 11.498
10.825 13.643 13.498 11.225 10.513 11.421 13.242 11.130 11.409
10.817 10.779 11.174 13.111 12.648 10.864 11.229 10.909 11.042
11.390 11.018 12.642 13.101

Claims (5)

1. A soft measurement method for effluent BOD concentration based on MIC and RBFNN, characterized by comprising the following steps:
step 1, determining auxiliary variables: correlation analysis is performed on the raw water quality parameter data collected from the sewage treatment plant using the maximum information coefficient (MIC); the correlation coefficient between each water quality parameter and the effluent BOD is calculated as shown in formula (1), and the variables with a correlation coefficient greater than 0.5 are selected, giving the following auxiliary variables strongly correlated with the effluent BOD concentration: the effluent total nitrogen concentration, the effluent ammonia nitrogen concentration, the influent total nitrogen concentration, the influent BOD concentration, the influent ammonia nitrogen concentration, the biochemical tank DO concentration and the influent tank phosphate concentration;
Figure FDA0002746843430000011
Figure FDA0002746843430000012
wherein I(X, Y) denotes the mutual information of X and Y, p(X, Y) denotes the joint probability density function of X and Y, p(X) and p(Y) denote the probability density functions of X and Y respectively, N denotes the number of samples, and B(N) is a function of the number of samples whose value is $N^{0.6}$;
Step 2, determining an initial clustering center of the K-means clustering algorithm on the basis of the feature data screened in the step 1: determining K initial clustering centers of the K-means algorithm by using the sample density and the distance between the samples;
step 3, determining the center, width and weight parameters of the radial basis function neural network: the K initial cluster centers obtained in step 2 are substituted into the original K-means algorithm to obtain a clustering result; the clustering result of the K-means algorithm is taken as the center parameters of the radial basis functions, and the initial weight between each hidden layer node and the output layer node is set to 1;
step 4, determining a topological structure of a radial basis function neural network for predicting the BOD concentration of the effluent;
step 5, adjusting the radial basis function neural network parameters of the soft measurement model;
step 6, packaging the soft measurement model obtained in step 5 into a jar file, importing the jar file into a Java web project, and deploying the service on a cloud server; a browser is used to access the project and upload production data, the server calls the radial basis function neural network program to make a prediction, and the predicted result is returned to the client.
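The variable screening of step 1 can be illustrated with a rough MIC approximation. This is a sketch under stated assumptions, not the exact computation of formulas (1) and (2), which are only available as images: the standard definition (mutual information over an a x b grid, normalized by log2 min(a, b), with the grid size bounded by B(N) = N^0.6) is assumed, and equal-width bins replace the grid optimization of the full MIC algorithm. The function names are hypothetical.

```python
import numpy as np

def mutual_information(x, y, bins_x, bins_y):
    """Mutual information (in bits) of x and y on an equal-width grid."""
    joint, _, _ = np.histogram2d(x, y, bins=(bins_x, bins_y))
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)      # marginal of x, shape (a, 1)
    py = pxy.sum(axis=0, keepdims=True)      # marginal of y, shape (1, b)
    mask = pxy > 0                           # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[mask] * np.log2(pxy[mask] / (px @ py)[mask])))

def approx_mic(x, y):
    """Largest normalized mutual information over grids with a*b at most N^0.6."""
    budget = len(x) ** 0.6                   # B(N) = N^0.6
    best = 0.0
    for a in range(2, int(budget // 2) + 1):
        for b in range(2, int(budget // a) + 1):
            mi = mutual_information(x, y, a, b)
            best = max(best, mi / np.log2(min(a, b)))
    return best
```

A candidate water quality parameter would then be kept as an auxiliary variable when its score against the effluent BOD exceeds 0.5, as stated in step 1.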
2. The soft measurement method for effluent BOD (biochemical oxygen demand) concentration based on MIC (maximum information coefficient) and RBFNN (radial basis function neural network) according to claim 1, wherein step 2 comprises the following steps:
step 2.1 data normalization: normalizing the training data and the test data according to the formula (3) to reduce the influence of different dimensions on the result;
Figure FDA0002746843430000021
wherein $x_{normal}$ denotes the normalized data, min denotes the minimum value of the variable over all samples, max denotes the maximum value of the variable over all samples, and x denotes the original value of the data;
step 2.2, determining the candidate samples for clustering: the Euclidean distances between all pairs of samples in the data are calculated and sorted in ascending order, and the mean of the upper and lower quartiles of the distances is taken as the distance threshold R; the density of the i-th sample is calculated according to formula (4); all densities are sorted in ascending order, the mean of the upper and lower quartiles of the density values of all samples is taken as the density threshold, and the samples whose density is greater than or equal to the threshold are selected as candidate samples;
Figure FDA0002746843430000022
Figure FDA0002746843430000023
wherein ‖·‖ denotes the modulus (norm) operation, N denotes the number of input samples, and χ(x) is a threshold function whose value is 0 or 1;
step 2.3, determining the initial cluster centers of the K-means clustering algorithm: the final number K of cluster centers is determined; the two candidate samples with the largest mutual distance are taken as initial cluster centers, denoted $C_1$ and $C_2$, and these two samples are deleted from the candidate set; each remaining candidate sample is assigned to its nearest center according to the shortest Euclidean distance principle, forming two sample clusters $S_1$ and $S_2$; the distances from the samples in cluster $S_1$ to $C_1$ and from the samples in cluster $S_2$ to $C_2$ are calculated, the sample in each cluster farthest from its existing initial cluster center is taken as $C_{11}$ and $C_{21}$ respectively, and the two farthest distances are denoted $d_1$ and $d_2$; if $d_1 \ge d_2$, $C_{11}$ is removed from the original sample set and added to the initial cluster centers, denoted $C_3$; otherwise $C_{21}$ is removed from the original sample set and added to the initial cluster centers, denoted $C_3$;
step 2.4, calculating the remaining initial cluster centers: the remaining samples are assigned to the nearest initial cluster centers, and the resulting clusters are denoted $S_1, S_2, \ldots, S_m$; the distance from each sample point to the center of its cluster is calculated, and the distances for the clusters are denoted $d_1, d_2, \ldots, d_m$, where m is the number of existing clusters; take $d_{m+1} = \max\{d_1, d_2, \ldots, d_m\}$, and take the first
Figure FDA0002746843430000024
where h is an empirical value in the range [0, 1]; the sample corresponding to $d_{m+1}$ is taken as a new cluster center, denoted $C_{m+1}$; if m + 1 = K, all initial cluster centers have been determined and step 2.4 ends; if m + 1 < K, step 2.4 is continued.
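Steps 2.2 to 2.4 can be sketched as a density filter followed by a farthest-point selection. This is an illustration under stated assumptions: formula (4) is only available as an image, so the density of a sample is assumed to be the number of other samples within distance R; the rule involving the empirical value h is not legible in the source and is omitted; and all function names are hypothetical.

```python
import numpy as np

def pairwise_distances(X):
    """Matrix of Euclidean distances between all rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def quartile_mean(values):
    """Mean of the lower and upper quartiles, used for both thresholds."""
    q1, q3 = np.percentile(values, [25, 75])
    return 0.5 * (q1 + q3)

def select_candidates(X):
    """Step 2.2: keep samples whose density reaches the density threshold."""
    D = pairwise_distances(X)
    R = quartile_mean(D[np.triu_indices(len(X), k=1)])   # distance threshold
    # Assumed density: number of other samples within distance R.
    density = (D <= R).sum(axis=1) - 1
    return X[density >= quartile_mean(density)]

def initial_centers(cand, K):
    """Steps 2.3-2.4: start from the farthest pair, then add farthest samples."""
    assert 2 <= K <= len(cand)
    D = pairwise_distances(cand)
    i, j = np.unravel_index(np.argmax(D), D.shape)
    chosen = [int(i), int(j)]
    while len(chosen) < K:
        # Distance of each candidate to its nearest already-chosen center,
        # i.e. to the center of the cluster it would be assigned to.
        nearest = D[:, chosen].min(axis=1)
        nearest[chosen] = -1.0               # never re-select an existing center
        chosen.append(int(np.argmax(nearest)))
    return cand[chosen]
```

Picking the sample with the largest distance to its own cluster center, over all clusters, is the same as picking the sample whose nearest chosen center is farthest away, which is why a single `argmax` suffices in the loop.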
3. The soft measurement method for effluent BOD (biochemical oxygen demand) concentration based on MIC (maximum information coefficient) and RBFNN (radial basis function neural network) according to claim 1, wherein step 3 comprises the following steps:
step 3.1, calculating Euclidean distance between the samples in the data set and the existing clustering centers, and distributing the samples to the clustering centers closest to the samples to form clustering clusters;
step 3.2, solving the mean value of all samples in each cluster, and taking the mean value as a new cluster center;
step 3.3, repeating the steps 3.1 and 3.2, ending clustering when the clustering centers are not changed or the cycle number reaches a specified upper limit, and obtaining K clustering centers;
step 3.4, taking the K cluster centers as the centers of the radial basis functions, and taking the shortest Euclidean distance from each center to the other centers as the width parameter of the radial basis function corresponding to that center.
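Steps 3.1 to 3.4 correspond to the standard K-means iteration followed by a nearest-center width rule, which can be sketched as follows. The function names `kmeans` and `rbf_widths`, the iteration cap of 100, and the handling of empty clusters (the old center is kept) are assumptions made for this illustration.

```python
import numpy as np

def kmeans(X, centers, max_iter=100):
    """Steps 3.1-3.3: plain K-means started from the given initial centers."""
    centers = centers.astype(float).copy()
    for _ in range(max_iter):
        # Step 3.1: assign every sample to its nearest center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Step 3.2: each new center is the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(len(centers))
        ])
        # Step 3.3: stop once the centers no longer change.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers

def rbf_widths(centers):
    """Step 3.4: width of each node = distance to the nearest other center."""
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)              # ignore the distance to itself
    return d.min(axis=1)
```

The returned centers and widths are what the hidden layer of the network in step 4 would use.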
4. The soft measurement method for effluent BOD (biochemical oxygen demand) concentration based on MIC (maximum information coefficient) and RBFNN (radial basis function neural network) according to claim 1, wherein step 4 comprises the following steps:
step 4.1, determining the number of input layer nodes: the layer has n neurons, where n is the number of auxiliary variables determined in step 1; each node represents an input variable $x_i$, and the purpose of this layer is to pass the input values directly to the hidden layer; i denotes the sample sequence number;
$x_i,\ i = 1, 2, \ldots, n$ (6)
step 4.2, determining the number of hidden layer nodes and the width and center of each hidden layer node: the layer has m neurons in total, where m is the number K of cluster centers determined by the K-means algorithm in step 2; the centers of the radial basis functions are the clustering result determined in step 2, and the width of each node is the Euclidean distance from its cluster center to the nearest other cluster center; the hidden layer transfer function is the radial basis function, for which a standard Gaussian function is usually selected, as shown in formula (7);
Figure FDA0002746843430000031
wherein $c_i$ denotes the center parameter of the i-th hidden layer node and $\sigma_i$ denotes the width parameter of the i-th hidden layer node;
step 4.3, determining the output layer connection weights: the output layer has a single node, whose output is given by formula (8); the initial connection weight between each hidden layer node and the output node is set to 1;
Figure FDA0002746843430000032
wherein $y_j$ denotes the output obtained when the j-th input sample $x_j$ is fed to the network, and $w_i$ denotes the connection weight between the i-th hidden layer node and the output node.
5. The soft measurement method for effluent BOD (biochemical oxygen demand) concentration based on MIC (maximum information coefficient) and RBFNN (radial basis function neural network) according to claim 1, wherein step 5 comprises the following steps:
step 5.1, determining the parameters to be updated: the parameters to be adjusted are the output weights of the radial basis function neural network; all weight parameters are arranged into a row vector denoted Δ, whose value is given by formula (9);
$\Delta = [w_1, w_2, \ldots, w_m]$ (9)
wherein $w_m$ denotes the connection weight between the m-th hidden layer node and the output node; step 5.2, iteratively adjusting the weight parameters of the radial basis function neural network using the Levenberg-Marquardt (LM) algorithm: a gradient vector, a Jacobian matrix and a Hessian-like matrix are calculated from the current network input, wherein the gradient vector g is calculated as shown in formula (10), the Jacobian matrix $j_p$ as shown in formula (11), and the Hessian-like matrix Q as shown in formula (12);
Figure FDA0002746843430000041
Figure FDA0002746843430000042
Figure FDA0002746843430000043
$e_p = y_d - y_o$ (13)
where P denotes the total number of samples, p denotes the index of the sample currently input to the network, $y_d$ is the desired output of the network, $y_o$ is the actual output of the network, and $j_p^T$ denotes the transpose of the Jacobian matrix $j_p$; updating the parameters to be updated: the output weights of the radial basis function neural network are updated according to formula (14);
$\Delta_{k+1} = \Delta_k - (Q_k + \mu_k I)^{-1} g_k$ (14)
wherein k denotes the current training iteration and μ denotes the learning rate, with an initial value of 1; when the network error decreases, μ is reduced to 1/10 of its value in the previous iteration, otherwise it is increased to 10 times its value in the previous iteration; the upper limit of μ is 10^15 and the lower limit of μ is 10^-15;
if the absolute value of the change in error between two successive parameter updates is less than 10^-10, or the number of iterations in a single adjustment cycle reaches the upper limit, the parameter adjustment for this cycle ends, the next sample is input, and step 5.2 is repeated;
if all training samples have been traversed but the error is still not smaller than the target value and the number of traversals has not reached the upper limit, the first sample of the training sample set is input again and step 5.2 is repeated; otherwise the parameter adjustment process ends;
step 5.3, the test samples are used as the input of the radial basis function neural network to obtain the normalized predicted value of the effluent BOD concentration, and the result is denormalized according to formula (15) to obtain the actual predicted value of the effluent BOD concentration;
$x_{real} = x_{normal} \times (max - min) + min$ (15)
wherein $x_{real}$ denotes the actual predicted data and $x_{normal}$ denotes the normalized predicted data.
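The denormalization of formula (15) and a matching forward normalization can be sketched as a round trip. Formula (3) is only available as an image, so the forward mapping shown here is an assumption: it is simply the algebraic inverse of formula (15). The tabulated auxiliary variables span [-1, 1], so the scaling actually used for the inputs may differ. The function names are hypothetical.

```python
import numpy as np

def normalize(x, lo, hi):
    """Assumed forward mapping (inverse of formula (15)): scales [lo, hi] to [0, 1]."""
    return (x - lo) / (hi - lo)

def denormalize(x_normal, lo, hi):
    """Formula (15): x_real = x_normal * (max - min) + min."""
    return x_normal * (hi - lo) + lo

# Round trip on a few effluent BOD values (mg/L) taken from Table 17.
bod = np.array([10.300, 11.514, 14.286, 15.500])
lo, hi = bod.min(), bod.max()
restored = denormalize(normalize(bod, lo, hi), lo, hi)
```

The min and max used for denormalization must be the same values that were used when the training targets were normalized; otherwise the predictions are returned on the wrong scale.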
CN202011169471.7A 2020-10-28 2020-10-28 Effluent BOD concentration soft measurement method based on MIC and RBFNN Pending CN112446168A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011169471.7A CN112446168A (en) 2020-10-28 2020-10-28 Effluent BOD concentration soft measurement method based on MIC and RBFNN

Publications (1)

Publication Number Publication Date
CN112446168A true CN112446168A (en) 2021-03-05

Family

ID=74736616

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011169471.7A Pending CN112446168A (en) 2020-10-28 2020-10-28 Effluent BOD concentration soft measurement method based on MIC and RBFNN

Country Status (1)

Country Link
CN (1) CN112446168A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109978024A (en) * 2019-03-11 2019-07-05 北京工业大学 A kind of water outlet BOD prediction technique based on interconnection module neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
乔俊飞等: "改进K-means算法优化RBF神经网络的出水氨氮预测", 《控制工程》, pages 375 - 379 *
李文静等: "基于互信息和自组织RBF神经网络的出水BOD软测量方法", 《化工学报》, pages 687 - 695 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116029589A (en) * 2022-12-14 2023-04-28 浙江问源环保科技股份有限公司 Rural domestic sewage animal and vegetable oil online monitoring method based on two-section RBF
CN116029589B (en) * 2022-12-14 2023-08-22 浙江问源环保科技股份有限公司 Rural domestic sewage animal and vegetable oil online monitoring method based on two-section RBF
CN117983134A (en) * 2024-04-03 2024-05-07 山西华凯伟业科技有限公司 Quantitative feeding control method and system for industrial production

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination