CN112819087B - Method for detecting abnormality of BOD sensor of outlet water based on modularized neural network - Google Patents


Info

Publication number
CN112819087B
Authority
CN
China
Legal status
Active
Application number
CN202110185682.8A
Other languages
Chinese (zh)
Other versions
CN112819087A (en
Inventor
李文静
张竣凯
Current Assignee
Beijing University of Technology
Original Assignee
Beijing University of Technology

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/18 Water
    • G01N33/1806 Biological oxygen demand [BOD] or chemical oxygen demand [COD]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A20/00 Water conservation; Efficient water supply; Efficient water use
    • Y02A20/152 Water filtration

Abstract

A method for detecting abnormalities of an effluent BOD sensor based on a modularized neural network relates to the field of artificial intelligence and is applied directly to sewage treatment. To address the problem that drift and burst abnormalities of the effluent BOD sensor in the current sewage treatment process cannot be detected in real time under complex working conditions, the invention uses a density-based clustering algorithm to classify input samples automatically by working condition, and uses a mutual-information-based method to extract auxiliary variables of the effluent BOD as the input variables of each sub-network in the modularized network. A self-organizing RBF neural network based on error correction is designed as the sub-network, and the network is trained with an improved Levenberg-Marquardt (LM) algorithm to increase training speed. The results show that the abnormality detection method has a compact structure, detects abnormalities of the effluent BOD sensor in the sewage treatment process quickly and accurately, and provides technical support for the safe and stable operation of sewage treatment.

Description

Method for detecting abnormality of BOD sensor of outlet water based on modularized neural network
Technical field:
The invention relates to the field of artificial intelligence, is applied directly to abnormality detection of BOD sensors in sewage treatment, and in particular relates to a method for detecting abnormalities of an effluent BOD sensor based on a modularized neural network.
Background art:
Biochemical oxygen demand (BOD) is an important parameter reflecting the degree to which a water body is polluted by organic matter. It is an important index for evaluating sewage quality and an important control parameter of the sewage treatment process, so detecting and measuring BOD is of great significance for sewage treatment. The current standard method for measuring BOD is the dilution and inoculation method, but its procedure is complicated, its measurement period is long, and it suffers from serious lag, so it cannot reflect changes of BOD in the water body in time. In recent years BOD sensors have come into wide use, and their measurements can meet the requirements of BOD measurement. Under complex working conditions, however, the sensor is subject to vibration, pH changes, short-term strong loads and similar disturbances, so it easily becomes abnormal in ways that are difficult to detect, which affects the sewage treatment process. How to monitor sensor abnormalities quickly and ensure the normal operation of sewage treatment is therefore an important difficulty.
The soft measurement method takes an indirect approach: it uses easily measured variables and a constructed model to estimate, in real time, variables that are difficult or impossible to measure. It is a method for obtaining key water quality parameters in the sewage treatment process and can be used to detect whether the BOD sensor is abnormal. The invention designs a method for detecting abnormalities of the sewage treatment effluent BOD based on a modularized neural network. Complex working conditions are simulated through several weather conditions, online soft measurement of the effluent BOD concentration is performed, and online abnormality detection of the BOD sensor is realized based on the soft-measured values.
Summary of the invention:
1. Technical problem to be solved by the invention.
The invention provides a method for detecting abnormalities of a sewage treatment effluent BOD sensor based on a modularized neural network. Sewage quality samples are clustered by density-based clustering on characteristic variables, and auxiliary variables are selected by a mutual-information-based method to improve detection accuracy for the effluent BOD sensor. A modularized neural network whose sub-networks are based on error correction is designed to perform real-time soft measurement of the effluent BOD value in the sewage treatment process, and abnormality detection is then performed on the effluent BOD sensor, so that sensor abnormalities are found in time and the quality of sewage treatment is improved.
2. The specific technical scheme of the invention is as follows:
The invention provides a method for detecting abnormalities of a sewage treatment effluent biochemical oxygen demand (BOD) sensor based on a modularized neural network. The algorithm comprises the following steps:
Step 1: classifying water quality variables;
Collect actual water quality variable data from a sewage treatment plant. Denote the effluent BOD concentration $o=[o_1,o_2,\ldots,o_P]$ as the output variable, and denote $x=[x_1,x_2,\ldots,x_P]$ as the set of input variables, consisting of $P$ samples in total from a variety of weather conditions, where $x_p=[x_{p,1},x_{p,2},\ldots,x_{p,M}]$, $p=1,2,\ldots,P$, and $M$ is the number of water quality variables;
Step 1.1: preprocessing data; each variable is normalized to the range [0,1], and the calculation formula is as follows:
where $x$ is the set of input samples before normalization, $x_p$ is the p-th input sample, $X_p$ is the normalized p-th sample, $o_p$ is the target effluent BOD concentration corresponding to the p-th input sample, and $O_p$ is the normalized $o_p$; mean, max and min denote taking the average, maximum and minimum of a variable;
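The normalization formula itself is not reproduced in the text above. The following is a minimal sketch that assumes plain min-max scaling of each variable to [0, 1]; the function name `min_max_normalize` and the sample numbers are hypothetical, and the mean operation mentioned in the text is not used here.

```python
def min_max_normalize(samples):
    """Scale each column of `samples` (a list of equal-length rows) to [0, 1].

    Assumption: plain min-max scaling; a constant column maps to 0.0.
    """
    columns = list(zip(*samples))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for row in samples:
        normalized.append([
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ])
    return normalized


# Two water quality variables, three samples (made-up numbers).
raw = [[10.0, 200.0], [20.0, 400.0], [30.0, 300.0]]
scaled = min_max_normalize(raw)
```

The same scaling would be applied to the output variable, and the minima and maxima of the training data would have to be kept so that test samples are scaled consistently.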
Step 1.2: calculating the characteristic index DI of each input variable and selecting the characteristic variables, the calculation formula being as follows:
where $X_k$ is the set of data from the k-th weather condition in the data set; there are three different weather conditions (dry, rainy and stormy days), so $k=1,2,3$; $V$ is a decision function: if the p-th sample belongs to the k-th set, $V$ equals 1, otherwise $V$ equals 0;
Step 1.3: sort the sewage quality variables in descending order of DI value and set the threshold $\alpha$ to 3; variables whose DI value is higher than $\alpha$ are selected as characteristic variables for density clustering;
Step 1.4: the local density of each sample is calculated as follows:
$$\rho_p=\sum_{i\neq p}F(d_{p,i}-d_c) \tag{4}$$
where $\rho_p$ is the local density of the p-th sample; $d_{p,i}$ is the Euclidean distance between the characteristic variables of the p-th sample and the i-th sample; $d_c$ is the cut-off distance, set to 1; $F$ is an indicator function: when $d_{p,i}-d_c<0$, $F$ equals 1, otherwise $F$ equals 0;
Step 1.5: the relative distance of each sample is calculated as follows:
$$\delta_p=\min_{i:\,\rho_i>\rho_p}d_{p,i} \tag{5}$$
where $\delta_p$ is the relative distance of the p-th sample point, defined as the minimum distance between the p-th sample and any sample whose local density is higher than its own; for the sample with the highest local density, the relative distance equals the maximum distance between it and the other samples;
Step 1.6: determining the cluster centers; the p-th sample is taken as a cluster center when it meets the following condition:
$$\gamma_p>10\cdot\mathrm{mean}(\gamma) \tag{6}$$
where $\gamma_p$ is the cluster-center decision value of the p-th sample, $\gamma$ is the vector of the decision values of all samples, and mean takes the average of a vector; $\gamma_p$ is calculated as follows:
$$\gamma_p=\delta_p\rho_p \tag{7}$$
where $\rho_p$ is the local density of the p-th sample and $\delta_p$ is the relative distance of the p-th sample;
Step 1.7: traverse all samples according to the steps above and denote the number of sample points determined to be cluster centers as Z, so that there are Z clusters in total; each remaining sample is then assigned to the cluster of its nearest neighbor with higher local density, which completes the sample clustering;
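Steps 1.4 to 1.7 can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the patented implementation: the function name `density_peak_clusters` is hypothetical, density ties are broken by sample index (the text does not say how ties are handled), and the toy points are made up. Note that the condition in formula (6) can only be met when there are more than ten samples.

```python
import math


def density_peak_clusters(points, d_c=1.0, factor=10.0):
    """Cluster `points` following steps 1.4 to 1.7; returns (centers, labels)."""
    n = len(points)
    dist = [[math.dist(a, b) for b in points] for a in points]

    # Step 1.4: local density = number of other samples closer than d_c.
    rho = [sum(1 for i in range(n) if i != p and dist[p][i] < d_c)
           for p in range(n)]

    # Assumption: ties in density are broken by index, so exactly one
    # sample has no denser neighbor.
    def denser(i, p):
        return (rho[i], -i) > (rho[p], -p)

    # Step 1.5: relative distance and nearest neighbor of higher density.
    delta, parent = [0.0] * n, [-1] * n
    for p in range(n):
        higher = [i for i in range(n) if denser(i, p)]
        if higher:
            parent[p] = min(higher, key=lambda i: dist[p][i])
            delta[p] = dist[p][parent[p]]
        else:
            delta[p] = max(dist[p])

    # Step 1.6: a sample is a center when gamma_p > factor * mean(gamma).
    gamma = [d * r for d, r in zip(delta, rho)]
    mean_gamma = sum(gamma) / n
    centers = [p for p in range(n) if gamma[p] > factor * mean_gamma]

    # Step 1.7: visit samples from densest to sparsest; each non-center
    # inherits the label of its nearest denser neighbor.
    labels = [-1] * n
    for z, c in enumerate(centers):
        labels[c] = z
    order = sorted(range(n), key=lambda p: (rho[p], -p), reverse=True)
    for p in order:
        if labels[p] == -1 and parent[p] != -1:
            labels[p] = labels[parent[p]]
    return centers, labels


# Two tight, well-separated groups of 20 points each (made-up data).
points = ([(0.01 * i, 0.0) for i in range(20)]
          + [(10 + 0.01 * i, 10.0) for i in range(20)])
centers, labels = density_peak_clusters(points)
```

On this toy data the first point of each group is selected as a center, so Z would be 2 and each group would feed its own sub-network.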
Step 2: determining the auxiliary variables of the effluent BOD by the mutual information method; using the normalized sample data obtained in step 1, the mutual information between each input variable and the output variable is obtained;
Step 2.1: the entropy of each input variable is calculated as follows:
where $f(X_{p,m})$ is the probability density function of the m-th input variable;
Step 2.2: the joint entropy of each input variable and the output variable is calculated as follows:
where $f(X_{p,m},O_p)$ is the joint probability density function of the m-th input variable and the output variable;
Step 2.3: the mutual information between each input variable and the output variable is calculated as follows:
$$MI(X_m,O)=H(X_m)+H(O)-H(X_m,O) \tag{11}$$
Sort the sewage quality variables by MI value and set the threshold $\beta$ to 6.5; variables whose MI value is higher than $\beta$ are taken as auxiliary variables for predicting the effluent BOD, and the number of selected auxiliary variables is denoted $N$;
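The entropy formulas are not reproduced in the text above, so the following sketch makes its own assumptions: probabilities are estimated from equal-width histograms over normalized data in [0, 1], and natural logarithms are used. The function names and the bin count are hypothetical. Because MI values depend on the estimator, the threshold of 6.5 from the text would not carry over to this sketch unchanged.

```python
import math
from collections import Counter


def _bin_index(value, bins):
    """Map a value in [0, 1] to one of `bins` equal-width bins."""
    return min(int(value * bins), bins - 1)


def entropy(labels):
    """Shannon entropy (natural log) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((c / total) * math.log(c / total)
                for c in Counter(labels).values())


def mutual_information(x, o, bins=10):
    """MI(X, O) = H(X) + H(O) - H(X, O), as in formula (11)."""
    xb = [_bin_index(v, bins) for v in x]
    ob = [_bin_index(v, bins) for v in o]
    return entropy(xb) + entropy(ob) - entropy(list(zip(xb, ob)))


def select_auxiliary(columns, o, beta, bins=10):
    """Return (indices of variables with MI above `beta`, all MI scores)."""
    scores = [mutual_information(col, o, bins) for col in columns]
    return [m for m, s in enumerate(scores) if s > beta], scores
```

A variable identical to the output reaches the maximum score H(O), while a constant variable scores zero, which is the behavior the threshold-based selection relies on.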
Step 3: designing the structure of the modularized neural network prediction model for the effluent BOD;
Step 3.1: from the sample data normalized in step 1, extract the auxiliary variables obtained in step 2 as the input of the modularized network; the normalized effluent BOD concentration is the output variable of the modularized network;
Step 3.2: designing the network structure of the modularized neural network: the modularized neural network consists of Z sub-networks, where Z is the number of clusters obtained by the density-based sample clustering, that is, the Z clusters correspond to the Z sub-networks;
Step 3.3: designing the sub-network structure of the modularized neural network: each sub-network is a self-organizing RBF neural network optimized by an error correction algorithm and has three layers: an input layer, a hidden layer and an output layer. The topology of the self-organizing RBF neural network is N-H-1: the input layer contains N neurons, corresponding to the N normalized auxiliary variables extracted in step 3.1; the hidden layer contains H neurons; and the output layer contains 1 neuron, corresponding to the output variable;
Step 3.4: for the p-th training sample, the neural network input is $x_p=[x_{p,1},x_{p,2},\ldots,x_{p,N}]$, $p=1,2,\ldots,P$, where $x_{p,n}$ is the value of the n-th auxiliary variable of the p-th sample; the output of the output-layer neuron is then:
$$y_p=\sum_{h=1}^{H}w_h\varphi_h(x_p) \tag{12}$$
where $w_h$ is the connection weight between the h-th hidden-layer neuron and the output-layer neuron, and $\varphi_h$ is the activation function of the h-th hidden-layer neuron of the RBF neural network, defined as in formula (13):
where $c_h$ and $\sigma_h$ are the center and the width of the h-th hidden-layer neuron, respectively;
Step 3.5: the root mean square error function is selected as the performance index, defined as follows:
$$RMSE=\sqrt{\frac{1}{P}\sum_{p=1}^{P}(O_p-y_p)^2} \tag{14}$$
where $O_p$ is the desired output for the p-th sample, $y_p$ is the output of the output-layer neuron for the p-th sample, and $P$ is the number of training samples;
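The forward pass of one N-H-1 sub-network and the RMSE performance index can be sketched as follows. Formula (13) is not reproduced in the text, so the Gaussian form used in `rbf_activation` is an assumption, and all function names are hypothetical.

```python
import math


def rbf_activation(x, center, width):
    """Gaussian basis function; the exact form of formula (13) is assumed."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-sq_dist / width ** 2)


def rbf_output(x, centers, widths, weights):
    """Output of one sub-network: weighted sum of the hidden activations."""
    return sum(w * rbf_activation(x, c, s)
               for c, s, w in zip(centers, widths, weights))


def rmse(targets, outputs):
    """Root mean square error over all training samples."""
    return math.sqrt(sum((o - y) ** 2 for o, y in zip(targets, outputs))
                     / len(targets))
```

With no hidden neurons the sum is empty and the output is 0, which matches the starting point H = 0 used when the network is grown.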
Step 3.6: set the number H of hidden-layer neurons of the neural network to 0 and calculate the network output error for the p-th sample:
$$e_p=y_p-O_p \tag{15}$$
where $p=1,2,\ldots,P$; over all training samples, find the training sample with the largest error, as shown in formula (16):
where $e=[e_1,e_2,\ldots,e_P]^T$; add a new RBF neuron, so that the number of neurons becomes $H=H+1$;
the initial parameters of the hidden-layer neuron are set according to formulas (17)-(19):
$$c_{H+1}=X_{p_{max}} \tag{17}$$
$$\sigma_{H+1}=1 \tag{18}$$
$$w_{H+1}=1 \tag{19}$$
where $c_{H+1}$ and $\sigma_{H+1}$ are the center and the width of the newly added hidden-layer neuron, $w_{H+1}$ is the connection weight between that neuron and the output-layer neuron, and $X_{p_{max}}$ is the $p_{max}$-th input sample;
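The growth step can be sketched as below. Formula (16) is not reproduced in the text, so taking the largest error by absolute value is an assumption, as is the Gaussian activation; the names `network_output` and `add_neuron_at_largest_error` and the sample values are hypothetical.

```python
import math


def network_output(x, neurons):
    """Sum of Gaussian hidden units; each neuron is a dict with c, sigma, w."""
    return sum(
        n["w"] * math.exp(-sum((a - b) ** 2 for a, b in zip(x, n["c"]))
                          / n["sigma"] ** 2)
        for n in neurons
    )


def add_neuron_at_largest_error(inputs, targets, neurons):
    """Append one RBF neuron centered on the worst-fitted training sample."""
    errors = [network_output(x, neurons) - o for x, o in zip(inputs, targets)]
    # Assumption: "largest error" means largest absolute error.
    p_max = max(range(len(errors)), key=lambda p: abs(errors[p]))
    # Formulas (17)-(19): center at that sample, width 1, weight 1.
    neurons.append({"c": list(inputs[p_max]), "sigma": 1.0, "w": 1.0})
    return p_max


inputs = [[0.1, 0.2], [0.8, 0.9], [0.5, 0.5]]
targets = [0.2, 0.9, 0.4]
neurons = []                      # H = 0: the network outputs 0 everywhere
p_max = add_neuron_at_largest_error(inputs, targets, neurons)
```

Placing the new center on the worst-fitted sample puts the added capacity where the current network is least accurate, which is the idea behind the error-correction design.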
Step 3.7: under the current network structure, the vector $\Delta$ contains all parameters that need to be updated, namely
The update rule is as follows:
$$\Delta_{t+1}=\Delta_t-(Q_t+\mu_t I)^{-1}g_t \tag{21}$$
where $t$ is the iteration step, $Q$ is the quasi-Hessian matrix, $g$ is the gradient vector, $I$ is the identity matrix, and $\mu$ is the learning-rate parameter, set within [0.001, 0.1]; different values within this interval affect only the convergence speed of the network, not the result. The quasi-Hessian matrix $Q$ and the gradient vector $g$ are calculated according to formulas (22) and (23), respectively:
where $e_p$ is the network output error of the p-th sample, calculated according to formula (15), and $j_p$ is the Jacobian row vector of the corresponding sample, defined as follows:
According to formulas (12), (13) and (24), the following is obtained:
The row vectors of the Jacobian matrix are obtained from formulas (25)-(27), and the quasi-Hessian matrix $Q$ and the gradient vector $g$ are obtained after all training samples have been traversed once, so that the parameters can be updated according to the update formula (21);
During training, when $RMSE_{t+1}\le RMSE_t$, set $\mu_{t+1}=\mu_t/10$ and keep the current parameters of the neural network; otherwise, set $\mu_{t+1}=\mu_t\times 10$, restore the neural network parameters to their values before the adjustment, and update the network parameters based on the current $\mu$. Set the maximum number of iteration steps $t_{max}\in[100,500]$ and the desired $RMSE_d\in(0,0.02]$. The parameter learning process iterates continuously, and training of the current network stops when the iteration step $t=t_{max}$ or the current training RMSE is less than or equal to $RMSE_d$;
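The damped update and the adaptation of μ can be illustrated with a small, dependency-free sketch. The formulas for Q and g are not reproduced in the text, so the sample-by-sample accumulations used here (sum of j_p^T j_p and sum of j_p^T e_p) are assumed to be the usual ones for this kind of LM variant. The names `solve`, `lm_train` and `toy`, and the linear toy problem, are hypothetical and stand in for the RBF parameters.

```python
import math


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def lm_train(errors_and_jacobian, params, mu=0.01, t_max=100, rmse_d=0.02):
    """LM loop with the mu schedule from the text; returns (params, rmse)."""
    def rmse_of(p):
        errs, _ = errors_and_jacobian(p)
        return math.sqrt(sum(e * e for e in errs) / len(errs))

    rmse = rmse_of(params)
    n = len(params)
    for _ in range(t_max):
        if rmse <= rmse_d:
            break
        errs, rows = errors_and_jacobian(params)
        # Assumed accumulations: Q = sum j_p^T j_p, g = sum j_p^T e_p.
        q = [[sum(j[a] * j[b] for j in rows) for b in range(n)] for a in range(n)]
        g = [sum(j[a] * e for j, e in zip(rows, errs)) for a in range(n)]
        damped = [[q[a][b] + (mu if a == b else 0.0) for b in range(n)]
                  for a in range(n)]
        step = solve(damped, g)                    # (Q + mu I)^-1 g
        trial = [p - s for p, s in zip(params, step)]
        trial_rmse = rmse_of(trial)
        if trial_rmse <= rmse:
            params, rmse, mu = trial, trial_rmse, mu / 10   # accept the step
        else:
            mu *= 10                               # reject, keep old parameters
    return params, rmse


# Toy problem: fit o = 2x + 1 with parameters [a, b].
xs = [0.0, 0.5, 1.0, 1.5]
targets = [1.0, 2.0, 3.0, 4.0]


def toy(params):
    a, b = params
    errs = [a * x + b - o for x, o in zip(xs, targets)]   # e_p = y_p - O_p
    rows = [[x, 1.0] for x in xs]                         # d e_p / d(a, b)
    return errs, rows


fitted, final_rmse = lm_train(toy, [0.0, 0.0], rmse_d=1e-6)
```

For an RBF sub-network, `errors_and_jacobian` would return one Jacobian row per sample with the derivatives of the error with respect to every weight, center and width.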
Step 4: performing abnormality detection on the effluent BOD sensor; the abnormality scenarios considered are burst abnormality and drift abnormality of the sensor;
Step 4.1: burst abnormality detection of the effluent BOD sensor: the sensor is judged to have a BOD burst abnormality at the moment when the p-th test sample meets the following condition:
$$e_p>\xi \tag{28}$$
where $\xi$ is the burst abnormality threshold of the effluent BOD sensor, calculated as follows:
$$\xi=3\cdot\max(e) \tag{29}$$
where $e$ is the vector of final training errors of the modularized network, and max takes the maximum value of a vector;
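The burst check reduces to a threshold comparison. In this sketch errors are compared by absolute value so that readings that are too high or too low are both caught; that is an assumption, since the text states the condition on e_p directly. Function names and numbers are hypothetical.

```python
def burst_threshold(training_errors, factor=3.0):
    """Formula (29): xi = 3 * max(e), using absolute training errors."""
    return factor * max(abs(e) for e in training_errors)


def detect_burst(soft_values, sensor_values, xi):
    """Return indices of test samples whose absolute error exceeds xi."""
    flagged = []
    for p, (y, o) in enumerate(zip(soft_values, sensor_values)):
        e_p = abs(y - o)   # soft measurement minus sensor reading
        if e_p > xi:
            flagged.append(p)
    return flagged


# Made-up numbers: the second sensor reading jumps away from the soft value.
xi = burst_threshold([0.01, -0.02, 0.015])
flagged = detect_burst([0.5, 0.5, 0.5], [0.51, 0.9, 0.48], xi)
```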
Step 4.2: drift abnormality detection of the effluent BOD sensor: when the sensor drifts, its reading rises steadily, so the moving average of the error between the soft measurement of the modularized neural network and the effluent BOD sensor grows larger and larger. A drift abnormality of the BOD sensor is judged to occur when the p-th sample meets the following condition:
$$1.3\cdot\max(se_{train})<se_p \tag{30}$$
where $se_{train}$ is the vector of moving averages of the training-sample errors, and max takes the maximum value of a vector; $se_p$ is the moving average of the errors at the p-th test sample, calculated as follows:
$$se_p=(e_p+e_{p-1}+\cdots+e_{p-z})/z \tag{31}$$
where $z$ is the sliding distance, and the value of $z$ is 10.
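The drift check can be sketched with a simple moving average. Formula (31) as printed lists the terms from e_p down to e_{p-z}; this sketch assumes a window of exactly z errors and expects absolute errors as input. Function names and the error sequences are hypothetical.

```python
def moving_average(errors, z=10):
    """Mean of the last z errors at each position that has a full window."""
    return [sum(errors[p - z + 1:p + 1]) / z
            for p in range(z - 1, len(errors))]


def detect_drift(train_errors, test_errors, z=10, factor=1.3):
    """Return the first test index flagged as drift, or None."""
    limit = factor * max(moving_average(train_errors, z))
    for offset, se_p in enumerate(moving_average(test_errors, z)):
        if se_p > limit:
            return offset + z - 1
    return None


# Made-up errors: flat during training, steadily rising after test sample 9.
train_errors = [0.01] * 20
test_errors = [0.01] * 10 + [0.05 + 0.01 * k for k in range(10)]
first_drift = detect_drift(train_errors, test_errors)
```

Averaging over a window keeps a single noisy sample from triggering the drift alarm, while a sustained rise in the error pushes the average over the limit.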
3. Compared with the prior art, the invention has the following obvious advantages and beneficial effects:
To address the difficulty of detecting abnormalities of the effluent BOD sensor in the sewage treatment process under complex working conditions, a density clustering method based on characteristic variables groups the water quality samples by working condition, which makes abnormality detection more targeted. In addition, an online soft-measurement detection method for the effluent BOD sensor based on the modularized neural network is provided, which realizes real-time detection of the effluent BOD sensor and offers good real-time performance, high stability and high abnormality detection accuracy.
Drawings
FIG. 1 is a structural model diagram of the modularized network of the invention
FIG. 2 shows the variation of the training RMSE of sub-network 1 and sub-network 2 in the modularized neural network of this embodiment
FIG. 3 shows the detection results of this embodiment when the BOD sensor has a burst or drift abnormality
Detailed description of the embodiments:
The invention provides a method for detecting abnormalities of an effluent BOD sensor based on a modularized neural network. The output of the effluent BOD sensor is checked against an online soft measurement made by the neural network, which improves the accuracy and timeliness of abnormality detection for the effluent BOD sensor in the sewage treatment process, raises the level of real-time monitoring of effluent BOD in urban sewage treatment plants, and ensures the normal operation of the sewage treatment process.
The example of the invention uses water quality analysis data of a sewage plant from 2006, comprising 2688 groups of data from dry days and rainy days and 10 water quality variables: (1) influent flow rate; (2) effluent So (oxygen) concentration; (3) effluent Sno (nitrate and nitrite nitrogen) concentration; (4) influent Snh (NH4+ + NH3 nitrogen) concentration; (5) effluent Snh (NH4+ + NH3 nitrogen) concentration; (6) influent COD (chemical oxygen demand) concentration; (7) effluent COD (chemical oxygen demand) concentration; (8) influent TSS (total suspended solids) concentration; (9) effluent TSS (total suspended solids) concentration; (10) effluent BOD (biochemical oxygen demand) concentration. 50% of the data under each weather condition were randomly selected as training samples, and the remaining 50% were used as test samples.
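The per-condition 50/50 split described above can be sketched as follows. The function name `split_by_condition`, the fixed seed and the toy labels are hypothetical; the text does not say how the random selection was carried out.

```python
import random


def split_by_condition(samples, conditions, train_fraction=0.5, seed=0):
    """Randomly split each weather condition's samples into train and test."""
    rng = random.Random(seed)
    train, test = [], []
    for condition in sorted(set(conditions)):
        group = [s for s, c in zip(samples, conditions) if c == condition]
        rng.shuffle(group)
        cut = int(len(group) * train_fraction)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


# Eight sample ids, four per weather condition (made-up labels).
samples = list(range(8))
conditions = ["dry"] * 4 + ["rain"] * 4
train, test = split_by_condition(samples, conditions)
```

Splitting inside each condition keeps both weather types equally represented in the training and test sets.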
The method for detecting abnormalities of the effluent BOD sensor based on the modularized neural network comprises the following steps:
Step 1: classifying water quality variables;
Collect actual water quality variable data from a sewage treatment plant. Denote the effluent BOD concentration $o=[o_1,o_2,\ldots,o_P]$ as the output variable, and denote $x=[x_1,x_2,\ldots,x_P]$ as the set of input variables, consisting of $P$ samples in total from a variety of weather conditions, where $x_p=[x_{p,1},x_{p,2},\ldots,x_{p,M}]$, $p=1,2,\ldots,P$, and $M$ is the number of water quality variables;
Step 1.1: preprocessing data; each variable is normalized to the range [0,1], and the calculation formula is as follows:
where $x$ is the set of input samples before normalization, $x_p$ is the p-th input sample, $X_p$ is the normalized p-th sample, $o_p$ is the target effluent BOD concentration corresponding to the p-th input sample, and $O_p$ is the normalized $o_p$; mean, max and min denote taking the average, maximum and minimum of a variable;
Step 1.2: calculating the characteristic index DI of each input variable and selecting the characteristic variables, the calculation formula being as follows:
where $X_k$ is the set of data from the k-th weather condition in the data set; there are three different weather conditions (dry, rainy and stormy days), so $k=1,2,3$; $V$ is a decision function: if the p-th sample belongs to the k-th set, $V$ equals 1, otherwise $V$ equals 0;
Step 1.3: sort the sewage quality variables in descending order of DI value and set the threshold $\alpha$ to 3; variables whose DI value is higher than $\alpha$ are selected as characteristic variables for density clustering;
In this example, 3 variables were obtained as characteristic variables for clustering the water quality samples: (1) influent flow rate; (2) influent Snh; (3) influent TSS;
Step 1.4: the local density of each sample is calculated as follows:
$$\rho_p=\sum_{i\neq p}F(d_{p,i}-d_c) \tag{4}$$
where $\rho_p$ is the local density of the p-th sample; $d_{p,i}$ is the Euclidean distance between the characteristic variables of the p-th sample and the i-th sample; $d_c$ is the cut-off distance, set to 1; $F$ is an indicator function: when $d_{p,i}-d_c<0$, $F$ equals 1, otherwise $F$ equals 0;
Step 1.5: the relative distance of each sample is calculated as follows:
$$\delta_p=\min_{i:\,\rho_i>\rho_p}d_{p,i} \tag{5}$$
where $\delta_p$ is the relative distance of the p-th sample point, defined as the minimum distance between the p-th sample and any sample whose local density is higher than its own; for the sample with the highest local density, the relative distance equals the maximum distance between it and the other samples;
Step 1.6: determining the cluster centers; the p-th sample is taken as a cluster center when it meets the following condition:
$$\gamma_p>10\cdot\mathrm{mean}(\gamma) \tag{6}$$
where $\gamma_p$ is the cluster-center decision value of the p-th sample, $\gamma$ is the vector of the decision values of all samples, and mean takes the average of a vector; $\gamma_p$ is calculated as follows:
$$\gamma_p=\delta_p\rho_p \tag{7}$$
where $\rho_p$ is the local density of the p-th sample and $\delta_p$ is the relative distance of the p-th sample;
Step 1.7: traverse all samples according to the steps above and denote the number of sample points determined to be cluster centers as Z, so that there are Z clusters in total; each remaining sample is then assigned to the cluster of its nearest neighbor with higher local density, which completes the sample clustering;
In this embodiment, two sample points in total satisfy the above condition, so Z is 2 and the data are divided into two classes;
Step 2: determining the auxiliary variables of the effluent BOD by the mutual information method; using the normalized sample data obtained in step 1, the mutual information between each input variable and the output variable is obtained;
Step 2.1: the entropy of each input variable is calculated as follows:
where $f(X_{p,m})$ is the probability density function of the m-th input variable;
Step 2.2: the joint entropy of each input variable and the output variable is calculated as follows:
where $f(X_{p,m},O_p)$ is the joint probability density function of the m-th input variable and the output variable;
Step 2.3: the mutual information between each input variable and the output variable is calculated as follows:
$$MI(X_m,O)=H(X_m)+H(O)-H(X_m,O) \tag{11}$$
Sort the sewage quality variables by MI value and set the threshold $\beta$ to 6.5; variables whose MI value is higher than $\beta$ are taken as auxiliary variables for predicting the effluent BOD, and the number of selected auxiliary variables is denoted $N$;
In this example, 6 variables were obtained as auxiliary variables and used as the sub-network inputs of the modularized network: (1) influent flow rate; (2) influent Sno concentration; (3) influent Snh; (4) influent COD; (5) effluent COD; (6) influent TSS;
Step 3: designing the structure of the modularized neural network prediction model for the effluent BOD;
Step 3.1: from the sample data normalized in step 1, extract the auxiliary variables obtained in step 2 as the input of the modularized network; the normalized effluent BOD concentration is the output variable of the modularized network;
Step 3.2: designing the network structure of the modularized neural network: the modularized neural network consists of Z sub-networks, where Z is the number of clusters obtained by the density-based sample clustering, that is, the Z clusters correspond to the Z sub-networks;
Step 3.3: designing the sub-network structure of the modularized neural network: each sub-network is a self-organizing RBF neural network optimized by an error correction algorithm and has three layers: an input layer, a hidden layer and an output layer. The topology of the self-organizing RBF neural network is N-H-1: the input layer contains N neurons, corresponding to the N normalized auxiliary variables extracted in step 3.1; the hidden layer contains H neurons; and the output layer contains 1 neuron, corresponding to the output variable;
Step 3.4: for the p-th training sample, the neural network input is $x_p=[x_{p,1},x_{p,2},\ldots,x_{p,N}]$, $p=1,2,\ldots,P$, where $x_{p,n}$ is the value of the n-th auxiliary variable of the p-th sample; the output of the output-layer neuron is then:
$$y_p=\sum_{h=1}^{H}w_h\varphi_h(x_p) \tag{12}$$
where $w_h$ is the connection weight between the h-th hidden-layer neuron and the output-layer neuron, and $\varphi_h$ is the activation function of the h-th hidden-layer neuron of the RBF neural network, defined as in formula (13):
where $c_h$ and $\sigma_h$ are the center and the width of the h-th hidden-layer neuron, respectively;
Step 3.5: the root mean square error function is selected as the performance index, defined as follows:
$$RMSE=\sqrt{\frac{1}{P}\sum_{p=1}^{P}(O_p-y_p)^2} \tag{14}$$
where $O_p$ is the desired output for the p-th sample, $y_p$ is the output of the output-layer neuron for the p-th sample, and $P$ is the number of training samples;
step 3.6: set the number H of hidden-layer neurons of the neural network to 0, and calculate the network output error of the current p-th sample:
$e_p = y_p - O_p$ (15)
where $p = 1, 2, \ldots, P$; over all training samples, search for the training sample with the largest error, as shown in formula (16):
where $e = [e_1, e_2, \ldots, e_P]^T$; add a new RBF neuron, so that the number of neurons becomes H = H + 1;
set the initial parameters of the new hidden-layer neuron according to formulas (17)-(19);
$c_{H+1} = X_{p_{max}}$ (17)
$\sigma_{H+1} = 1$ (18)
$w_{H+1} = 1$ (19)
where $c_{H+1}$ and $\sigma_{H+1}$ are the center and the width of the newly added hidden-layer neuron, respectively, $w_{H+1}$ is the connection weight between that neuron and the output-layer neuron, and $X_{p_{max}}$ is the $p_{max}$-th input sample;
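The neuron-growing rule of step 3.6 can be sketched as follows. Formula (16) is not reproduced in this text, so selecting the sample with the largest absolute error is an assumption; the function name `grow_neuron` and the list-based parameter storage are hypothetical.

```python
def grow_neuron(inputs, errors, centers, widths, weights):
    """Append one RBF neuron centred on the worst-predicted training sample.

    Returns the index p_max of the sample chosen as the new center.
    """
    # Assumption: "largest error" means largest absolute error.
    p_max = max(range(len(errors)), key=lambda p: abs(errors[p]))
    centers.append(list(inputs[p_max]))  # formula (17): center = X_pmax
    widths.append(1.0)                   # formula (18): width = 1
    weights.append(1.0)                  # formula (19): weight = 1
    return p_max
```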
step 3.7: under the current network structure, the vector $\Delta$ contains all parameters to be updated, namely
The update rule is as follows:
$\Delta_{t+1} = \Delta_t - (Q_t + \mu_t I)^{-1} g_t$ (21)
where t is the iteration step, Q is the quasi-Hessian matrix, g is the gradient vector, I is the identity matrix, and $\mu$ is the learning-rate parameter, set within [0.001, 0.1]; different settings within this interval affect only the convergence speed of the network and do not affect the result; the quasi-Hessian matrix Q and the gradient vector g are calculated according to formulas (21) and (22), respectively:
where $e_p$ is the network output error of the p-th sample, calculated according to formula (15), and $j_p$ is the Jacobian row vector of the corresponding sample, defined as follows:
according to formulas (11), (12) and (23), the following is obtained:
the row vectors of the Jacobian matrix are obtained from formulas (24)-(26); after all training samples have been traversed once, the quasi-Hessian matrix Q and the gradient vector g are obtained, and the parameters are then updated according to the parameter update formula (21);
during training, when $RMSE_{t+1} \le RMSE_t$, set $\mu_{t+1} = \mu_t / 10$ and keep the current parameters of the neural network; otherwise set $\mu_{t+1} = \mu_t \times 10$, restore the neural network parameters to their values before the adjustment, and update the network parameters based on the current $\mu$; set the maximum number of iteration steps $t_{max} \in [100, 500]$ and the expected error $RMSE_d \in (0, 0.02]$; the parameter learning process is iterated continuously, and training of the current network stops when the iteration step $t = t_{max}$ or the current training RMSE is less than or equal to $RMSE_d$;
in this embodiment, the learning rate $\mu$ is set to 0.01, $t_{max}$ is set to 100, and the expected $RMSE_d$ is set to 0.02;
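The parameter-learning loop of step 3.7 can be sketched as below, under stated assumptions. The formulas for Q and g are not reproduced in this text, so the usual sums Q = sum of j_p^T j_p and g = sum of j_p^T e_p are assumed. The callables `error_fn` and `jacobian_fn` and the helper `solve` are hypothetical; a real sub-network would supply the errors and Jacobian rows of its centers, widths and weights.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def lm_train(params, error_fn, jacobian_fn, mu=0.01, t_max=100, rmse_d=0.02):
    """error_fn(params) -> list of e_p; jacobian_fn(params) -> list of rows j_p."""
    def rmse(errors):
        return math.sqrt(sum(e * e for e in errors) / len(errors))

    errors = error_fn(params)
    current = rmse(errors)
    n = len(params)
    for _ in range(t_max):
        if current <= rmse_d:
            break
        rows = jacobian_fn(params)
        # Assumed accumulation: Q = sum_p j_p^T j_p, g = sum_p j_p^T e_p.
        q = [[sum(j[a] * j[b] for j in rows) for b in range(n)] for a in range(n)]
        g = [sum(j[a] * e for j, e in zip(rows, errors)) for a in range(n)]
        for a in range(n):
            q[a][a] += mu  # Q + mu * I
        step = solve(q, g)
        trial = [p - d for p, d in zip(params, step)]  # update rule (21)
        trial_errors = error_fn(trial)
        trial_rmse = rmse(trial_errors)
        if trial_rmse <= current:
            # Improvement: keep the new parameters and relax the damping.
            params, errors, current = trial, trial_errors, trial_rmse
            mu /= 10
        else:
            # No improvement: discard the trial parameters and damp harder.
            mu *= 10
    return params, current
```

The defaults mirror the embodiment values (mu = 0.01, t_max = 100, RMSE_d = 0.02). A rejected step simply leaves `params` untouched, which plays the role of restoring the parameters to their values before the adjustment.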
step 4: perform abnormality detection on the effluent BOD sensor; the abnormality scenarios considered are mutation (abrupt-change) abnormality and drift abnormality of the sensor;
step 4.1: mutation abnormality detection of the effluent BOD sensor: the sensor is judged to have a BOD mutation abnormality at the current moment when the p-th test sample satisfies the following condition:
$e_p > \xi$ (28)
where $\xi$ is the mutation abnormality threshold of the effluent BOD sensor, calculated as follows:
$\xi = 3 \times \max(e)$ (29)
where e is the vector of final training errors of the modular network, and max takes the maximum value of the vector;
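The mutation test of step 4.1 can be sketched as follows. The errors are compared exactly as supplied, as in formulas (28) and (29); whether signed or absolute errors should be passed in is not stated in the source, so that choice is left to the caller. The function names are hypothetical.

```python
def mutation_threshold(train_errors):
    """Formula (29): three times the largest final training error."""
    return 3 * max(train_errors)


def detect_mutations(test_errors, train_errors):
    """Indices of test samples whose error exceeds the threshold, formula (28)."""
    xi = mutation_threshold(train_errors)
    return [p for p, e in enumerate(test_errors) if e > xi]
```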
step 4.2: drift abnormality detection of the effluent BOD sensor: when the sensor drifts, its reading rises steadily, so the sliding average of the error between the soft measurement of the modular neural network and the effluent BOD sensor grows larger and larger; the BOD sensor is judged to have a drift abnormality when the p-th sample satisfies the following condition:
$1.3 \times \max(se_{train}) < se_p$ (30)
where $se_{train}$ is the vector of sliding averages of the training-sample errors, and max takes the maximum value of the vector; $se_p$ is the sliding average of the errors of the p-th test sample, calculated as follows:
$se_p = (e_p + e_{p-1} + \cdots + e_{p-z}) / z$ (31)
where z is the sliding distance, and z is set to 10.
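The drift test of step 4.2 can be sketched as follows. The sketch follows formula (31) literally as printed, summing the errors from e_(p-z) through e_p and dividing by z, and it only evaluates samples with at least z predecessors; both points are interpretations of the text, and the function names are hypothetical.

```python
def sliding_means(errors, z=10):
    """se_p for every index p >= z, following formula (31) as printed."""
    return [sum(errors[p - z:p + 1]) / z for p in range(z, len(errors))]


def detect_drift(test_errors, train_errors, z=10, factor=1.3):
    """Indices p of test samples satisfying factor * max(se_train) < se_p."""
    limit = factor * max(sliding_means(train_errors, z))
    test_means = sliding_means(test_errors, z)
    return [p for p, se in zip(range(z, len(test_errors)), test_means) if se > limit]
```

The training error sequence must contain more than z values; otherwise no sliding average exists and `max` raises a `ValueError`.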
In this embodiment, the RMSE of sub-network 1 and sub-network 2 of the modular neural network during training is shown in Fig. 2 (X axis: iteration steps; Y axis: training RMSE; the solid line is the RMSE of the sub-network during training). Fig. 3 shows the detection of abnormalities of the effluent BOD sensor (X axis: number of test samples; Y axis: effluent BOD concentration, in mg/L; the solid line is the predicted effluent BOD concentration, the dotted line is the concentration output by the effluent BOD sensor, and the points mark the samples at which a BOD sensor abnormality was detected).

Claims (1)

1. A method for detecting abnormality of an effluent BOD sensor based on a modular neural network, characterized by comprising the following steps:
step 1: classify the water quality variables;
collect actual water quality variable data of a sewage treatment plant, and record the effluent BOD concentration $o = [o_1, o_2, \ldots, o_P]$ as the output variable and $x = [x_1, x_2, \ldots, x_P]$ as the set of input variables, consisting of P samples in total collected under a variety of weather conditions, where $x_p = [x_{p,1}, x_{p,2}, \ldots, x_{p,M}]$, $p = 1, 2, \ldots, P$, and M is the number of water quality variables;
step 1.1: data preprocessing; normalize each variable to the range [0,1], with the calculation formula as follows:
where x is the set of input samples before normalization, $x_p$ is the p-th input sample, $X_p$ is the normalized p-th sample, $o_p$ is the target effluent BOD concentration corresponding to the p-th input sample, and $O_p$ is the normalized $o_p$; mean, max and min denote taking the mean, maximum and minimum of a variable;
step 1.2: calculate the characteristic index DI of each input variable in order to select the characteristic variables, with the calculation formula as follows:
where $X_k$ is the set of data from the k-th weather condition in the dataset; the data cover three different weather conditions (dry weather, rainy weather and storm weather), i.e. k = 1, 2, 3; V is a judging function: V equals 1 if the p-th sample belongs to the k-th set, and 0 otherwise;
step 1.3: sort the sewage quality variables in descending order of DI value and set the threshold $\alpha$ to 3; variables whose DI value is higher than the threshold $\alpha$ are selected as characteristic variables for density clustering;
step 1.4: calculate the local density of each sample as follows:
where $\rho_p$ is the local density of the p-th sample; $d_{p,i}$ is the Euclidean distance between the characteristic variables of the p-th sample and the i-th sample; $d_c$ is the cutoff distance, set to 1; F is an indicator function: F equals 1 when $d_{p,i} - d_c < 0$, and 0 otherwise;
step 1.5: calculate the relative distance of each sample as follows:
where p denotes the p-th sample point, whose relative distance is defined as the minimum distance between the p-th sample and any sample with a higher local density; for the sample with the highest local density, the relative distance equals the maximum of its distances to the other samples;
step 1.6: determine the cluster center points; the p-th sample is taken as a cluster center when it satisfies the following condition:
$\gamma_p > 10 \times \mathrm{mean}(\gamma)$ (6)
where $\gamma_p$ is the cluster-center decision value of the p-th sample, $\gamma$ is the vector of the cluster-center decision values of all samples, and mean takes the average of the vector; $\gamma_p$ is calculated as follows:
$\gamma_p = \delta_p \rho_p$ (7)
where $\rho_p$ is the local density of the p-th sample and $\delta_p$ is the relative distance of the p-th sample;
step 1.7: traverse all samples according to the above steps and denote the number of sample points determined to be cluster centers as Z, so that there are Z clusters in total; then assign each remaining sample to the cluster of its nearest-neighbor sample with a higher local density, completing the sample clustering;
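Steps 1.4 to 1.7 can be sketched as the following density-based clustering routine. Several details are assumptions because the corresponding formulas are not reproduced in this text: a sample is not counted in its own local density, ties in density are broken by sample order, and samples that cannot be traced back to any center keep the label -1. The function name `density_peaks` and the `factor` parameter (10 in formula (6)) are hypothetical.

```python
import math


def density_peaks(points, d_c=1.0, factor=10.0):
    """Return (center indices, cluster label per sample)."""
    n = len(points)
    dist = [[math.dist(a, b) for b in points] for a in points]
    # Step 1.4: local density = number of other samples closer than d_c.
    rho = [sum(1 for i in range(n) if i != p and dist[p][i] < d_c) for p in range(n)]
    order = sorted(range(n), key=lambda p: -rho[p])  # density descending
    # Step 1.5: relative distance to the nearest sample of higher density.
    delta = [0.0] * n
    parent = [-1] * n
    for rank, p in enumerate(order):
        if rank == 0:
            delta[p] = max(dist[p])
        else:
            q = min(order[:rank], key=lambda i: dist[p][i])
            delta[p], parent[p] = dist[p][q], q
    # Step 1.6: centers are samples whose gamma exceeds factor * mean(gamma).
    gamma = [r * d for r, d in zip(rho, delta)]
    mean_gamma = sum(gamma) / n
    centers = [p for p in range(n) if gamma[p] > factor * mean_gamma]
    # Step 1.7: remaining samples inherit the label of their denser neighbor.
    labels = [-1] * n
    for z, p in enumerate(centers):
        labels[p] = z
    for p in order:
        if labels[p] == -1 and parent[p] != -1:
            labels[p] = labels[parent[p]]
    return centers, labels
```

Processing samples in descending density order guarantees that a sample's denser neighbor has already been labeled when the sample itself is assigned.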
step 2: determine the auxiliary variables of the effluent BOD by the mutual information method, computing the mutual information values between the input variables and the output variable from the normalized sample data obtained in step 1;
step 2.1: calculate the entropy of each input variable and of the output variable, with the calculation formula as follows:
where $f(X_{p,m})$ is the probability density function of the m-th input variable;
step 2.2: calculate the joint probability density of each input variable and the output variable as follows:
where $f(X_{p,m}, O_p)$ is the joint probability density function between the m-th input variable and the output variable;
step 2.3: calculate the mutual information value between each input variable and the output variable, with the calculation formula as follows:
$MI(X_m, O) = H(X_m) + H(O) - H(X_m, O)$ (11)
sort the sewage quality variables by MI value and set the threshold $\beta$ to 6.5; variables whose MI value is higher than the threshold $\beta$ are taken as auxiliary variables for predicting the effluent BOD, and the number of selected auxiliary variables is denoted N;
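Formula (11) can be illustrated with the sketch below. The source does not say how the probability densities are estimated, so equal-width histogram bins on the normalized range [0,1] and natural logarithms are assumptions here, and the function names and the `bins` parameter are hypothetical.

```python
import math
from collections import Counter


def _bin(value, bins):
    """Map a value in [0, 1] to a histogram bin index."""
    return min(int(value * bins), bins - 1)


def entropy(counts, total):
    """Shannon entropy (in nats) of a histogram given as bin counts."""
    return -sum((c / total) * math.log(c / total) for c in counts if c)


def mutual_information(x, o, bins=10):
    """MI(X, O) = H(X) + H(O) - H(X, O), estimated from histograms."""
    total = len(x)
    bx = [_bin(v, bins) for v in x]
    bo = [_bin(v, bins) for v in o]
    h_x = entropy(Counter(bx).values(), total)
    h_o = entropy(Counter(bo).values(), total)
    h_xo = entropy(Counter(zip(bx, bo)).values(), total)
    return h_x + h_o - h_xo
```

Values estimated this way depend on the bin count and the logarithm base, so they are not directly comparable with the threshold of 6.5 unless the same estimator is used.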
step 3: design the structure of the modular neural network prediction model for the effluent BOD;
step 3.1: from the sample data normalized in step 1, extract the auxiliary variables obtained in step 2 as the input of the modular network; the normalized effluent BOD concentration is the output variable of the modular network;
step 3.2: design the network structure of the modular neural network: the modular neural network consists of Z sub-networks, where Z is the number of clusters obtained by the density-based sample clustering, that is, the Z clusters correspond to the Z sub-networks;
step 3.3: design the sub-network structure of the modular neural network: each sub-network is a self-organizing RBF neural network optimized by an error-correction algorithm and has three layers, namely an input layer, a hidden layer and an output layer; its topology is N-H-1, that is, the input layer contains N neurons, one for each of the N normalized auxiliary variables extracted in step 3.1, the hidden layer contains H neurons, and the output layer contains 1 neuron, which corresponds to the output variable;
step 3.4: for the p-th training sample, the neural network input is $x_p = [x_{p,1}, x_{p,2}, \ldots, x_{p,N}]$, $p = 1, 2, \ldots, P$, where $x_{p,n}$ is the value of the n-th auxiliary variable of the p-th sample; the output of the output-layer neuron is then:
$y_p = \sum_{h=1}^{H} w_h \phi_h(x_p)$ (12)
where $w_h$ is the connection weight between the h-th hidden-layer neuron and the output-layer neuron, and $\phi_h$ is the activation function of the h-th hidden-layer neuron of the RBF neural network, defined in formula (13):
where $c_h$ and $\sigma_h$ are the center and the width of the h-th hidden-layer neuron, respectively;
step 3.5: the root mean square error (RMSE) function is selected as the performance index, defined as follows:
$RMSE = \sqrt{\frac{1}{P} \sum_{p=1}^{P} (O_p - y_p)^2}$ (14)
where $O_p$ is the desired output of the p-th sample, $y_p$ is the output of the network's output-layer neuron for the p-th sample, and P is the number of training samples;
step 3.6: set the number H of hidden-layer neurons of the neural network to 0, and calculate the network output error of the current p-th sample:
$e_p = y_p - O_p$ (15)
where $p = 1, 2, \ldots, P$; over all training samples, search for the training sample with the largest error, as shown in formula (16):
where $e = [e_1, e_2, \ldots, e_P]^T$; add a new RBF neuron, so that the number of neurons becomes H = H + 1;
set the initial parameters of the new hidden-layer neuron according to formulas (17)-(19);
$c_{H+1} = X_{p_{max}}$ (17)
$\sigma_{H+1} = 1$ (18)
$w_{H+1} = 1$ (19)
where $c_{H+1}$ and $\sigma_{H+1}$ are the center and the width of the newly added hidden-layer neuron, respectively, $w_{H+1}$ is the connection weight between that neuron and the output-layer neuron, and $X_{p_{max}}$ is the $p_{max}$-th input sample;
step 3.7: under the current network structure, the vector $\Delta$ contains all parameters to be updated, namely
The update rule is as follows:
$\Delta_{t+1} = \Delta_t - (Q_t + \mu_t I)^{-1} g_t$ (21)
where t is the iteration step, Q is the quasi-Hessian matrix, g is the gradient vector, I is the identity matrix, and $\mu$ is the learning-rate parameter, set within [0.001, 0.1]; different settings within this interval affect only the convergence speed of the network and do not affect the result; the quasi-Hessian matrix Q and the gradient vector g are calculated according to formulas (21) and (22), respectively:
where $e_p$ is the network output error of the p-th sample, calculated according to formula (15), and $j_p$ is the Jacobian row vector of the corresponding sample, defined as follows:
according to formulas (11), (12) and (23), the following is obtained:
the row vectors of the Jacobian matrix are obtained from formulas (24)-(26); after all training samples have been traversed once, the quasi-Hessian matrix Q and the gradient vector g are obtained, and the parameters are then updated according to the parameter update formula (21);
during training, when $RMSE_{t+1} \le RMSE_t$, set $\mu_{t+1} = \mu_t / 10$ and keep the current parameters of the neural network; otherwise set $\mu_{t+1} = \mu_t \times 10$, restore the neural network parameters to their values before the adjustment, and update the network parameters based on the current $\mu$; set the maximum number of iteration steps $t_{max} \in [100, 500]$ and the expected error $RMSE_d \in (0, 0.02]$; the parameter learning process is iterated continuously, and training of the current network stops when the iteration step $t = t_{max}$ or the current training RMSE is less than or equal to $RMSE_d$;
step 4: perform abnormality detection on the effluent BOD sensor;
step 4.1: mutation abnormality detection of the effluent BOD sensor: the sensor is judged to have a BOD mutation abnormality at the current moment when the p-th test sample satisfies the following condition:
$e_p > \xi$ (28)
where $\xi$ is the mutation abnormality threshold of the effluent BOD sensor, calculated as follows:
$\xi = 3 \times \max(e)$ (29)
where e is the vector of final training errors of the modular network, and max takes the maximum value of the vector;
step 4.2: drift abnormality detection of the effluent BOD sensor: when the sensor drifts, its reading rises steadily, so the sliding average of the error between the soft measurement of the modular neural network and the effluent BOD sensor grows larger and larger; the BOD sensor is judged to have a drift abnormality when the p-th sample satisfies the following condition:
$1.3 \times \max(se_{train}) < se_p$ (30)
where $se_{train}$ is the vector of sliding averages of the training-sample errors, and max takes the maximum value of the vector; $se_p$ is the sliding average of the errors of the p-th test sample, calculated as follows:
$se_p = (e_p + e_{p-1} + \cdots + e_{p-z}) / z$ (31)
where z is the sliding distance, and z is set to 10.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110185682.8A CN112819087B (en) 2021-02-11 2021-02-11 Method for detecting abnormality of BOD sensor of outlet water based on modularized neural network

Publications (2)

Publication Number Publication Date
CN112819087A (en) 2021-05-18
CN112819087B (en) 2024-03-15

Family

ID=75865301

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110185682.8A Active CN112819087B (en) 2021-02-11 2021-02-11 Method for detecting abnormality of BOD sensor of outlet water based on modularized neural network

Country Status (1)

Country Link
CN (1) CN112819087B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116953488B (en) * 2023-09-19 2023-12-12 深圳市东陆科技有限公司 Monitoring method for integrated photoelectric chip
CN117346829B (en) * 2023-12-06 2024-02-23 科瑞工业自动化系统(苏州)有限公司 Underwater sensor detection and correction method, detection device and control platform

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106991493A (en) * 2017-03-17 2017-07-28 浙江工商大学 Sewage disposal water outlet parameter prediction method based on Grey production fuction
CN108469507A (en) * 2018-03-13 2018-08-31 北京工业大学 A kind of water outlet BOD flexible measurement methods based on Self organizing RBF Neural Network
CN109978024A (en) * 2019-03-11 2019-07-05 北京工业大学 A kind of water outlet BOD prediction technique based on interconnection module neural network
CN110929809A (en) * 2019-12-14 2020-03-27 北京工业大学 Soft measurement method for key water quality index of sewage by using characteristic self-enhanced circulating neural network
CN111369078A (en) * 2020-02-14 2020-07-03 迈拓仪表股份有限公司 Water supply quality prediction method based on long-term and short-term memory neural network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105550744A (en) * 2015-12-06 2016-05-04 北京工业大学 Nerve network clustering method based on iteration

Also Published As

Publication number Publication date
CN112819087A (en) 2021-05-18

Similar Documents

Publication Publication Date Title
CN108898215B (en) Intelligent sludge bulking identification method based on two-type fuzzy neural network
CN111291937A (en) Method for predicting quality of treated sewage based on combination of support vector classification and GRU neural network
CN108469507B (en) Effluent BOD soft measurement method based on self-organizing RBF neural network
US20170185892A1 (en) Intelligent detection method for Biochemical Oxygen Demand based on a Self-organizing Recurrent RBF Neural Network
US20180029900A1 (en) A Method for Effluent Total Nitrogen-based on a Recurrent Self-organizing RBF Neural Network
CN110070060B (en) Fault diagnosis method for bearing equipment
CN106022954B (en) Multiple BP neural network load prediction method based on grey correlation degree
CN111105332A (en) Highway intelligent pre-maintenance method and system based on artificial neural network
CN103606006B (en) Sludge volume index (SVI) soft measuring method based on self-organized T-S fuzzy nerve network
CN112819087B (en) Method for detecting abnormality of BOD sensor of outlet water based on modularized neural network
CN110020712B (en) Optimized particle swarm BP network prediction method and system based on clustering
CN108897975A (en) Coalbed gas logging air content prediction technique based on deepness belief network
CN101404071A (en) Electronic circuit fault diagnosis neural network method based on grouping particle swarm algorithm
CN114037163A (en) Sewage treatment effluent quality early warning method based on dynamic weight PSO (particle swarm optimization) optimization BP (Back propagation) neural network
CN110598902A (en) Water quality prediction method based on combination of support vector machine and KNN
CN115374995A (en) Distributed photovoltaic and small wind power station power prediction method
CN111723949A (en) Porosity prediction method based on selective ensemble learning
CN112765902A (en) RBF neural network soft measurement modeling method based on TentFWA-GD and application thereof
CN116109039A (en) Data-driven anomaly detection and early warning system
CN114707692A (en) Wetland effluent ammonia nitrogen concentration prediction method and system based on hybrid neural network
CN117371616A (en) Soil frost heaving rate prediction method for optimizing generalized regression neural network
CN110542748B (en) Knowledge-based robust effluent ammonia nitrogen soft measurement method
CN113159395A (en) Deep learning-based sewage treatment plant water inflow prediction method and system
CN111863153A (en) Method for predicting total amount of suspended solids in wastewater based on data mining
CN117350146A (en) GA-BP neural network-based drainage pipe network health evaluation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant