CN110232256B - Sewage treatment process monitoring method based on KPLS (kernel partial least squares) and RWFCM - Google Patents

Info

Publication number
CN110232256B
Authority
CN
China
Prior art keywords
matrix
sewage treatment
treatment process
sample
variable
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910573311.XA
Other languages
Chinese (zh)
Other versions
CN110232256A (en)
Inventor
周平
张瑞垚
王宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University China
Original Assignee
Northeastern University China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeastern University China
Priority to CN201910573311.XA
Publication of CN110232256A
Application granted
Publication of CN110232256B
Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C02 TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F9/00 Multistage treatment of water, waste water or sewage
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • C CHEMISTRY; METALLURGY
    • C02 TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F1/00 Treatment of water, waste water, or sewage
    • C02F1/52 Treatment of water, waste water, or sewage by flocculation or precipitation of suspended impurities
    • C CHEMISTRY; METALLURGY
    • C02 TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F1/00 Treatment of water, waste water, or sewage
    • C02F2001/007 Processes including a sedimentation step
    • C CHEMISTRY; METALLURGY
    • C02 TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F2101/00 Nature of the contaminant
    • C02F2101/10 Inorganic compounds
    • C02F2101/16 Nitrogen compounds, e.g. ammonia
    • C CHEMISTRY; METALLURGY
    • C02 TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F2101/00 Nature of the contaminant
    • C02F2101/30 Organic compounds
    • C CHEMISTRY; METALLURGY
    • C02 TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F2209/00 Controlling or monitoring parameters in water treatment
    • C02F2209/08 Chemical Oxygen Demand [COD]; Biological Oxygen Demand [BOD]
    • C CHEMISTRY; METALLURGY
    • C02 TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F2209/00 Controlling or monitoring parameters in water treatment
    • C02F2209/10 Solids, e.g. total solids [TS], total suspended solids [TSS] or volatile solids [VS]
    • C CHEMISTRY; METALLURGY
    • C02 TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F2209/00 Controlling or monitoring parameters in water treatment
    • C02F2209/14 NH3-N
    • C CHEMISTRY; METALLURGY
    • C02 TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F2209/00 Controlling or monitoring parameters in water treatment
    • C02F2209/22 O2
    • C CHEMISTRY; METALLURGY
    • C02 TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F TREATMENT OF WATER, WASTE WATER, SEWAGE, OR SLUDGE
    • C02F3/00 Biological treatment of water, waste water, or sewage
    • C02F3/30 Aerobic and anaerobic processes
    • C02F3/302 Nitrification and denitrification treatment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00 Details relating to CAD techniques
    • G06F2111/04 Constraint-based CAD
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00 Details relating to CAD techniques
    • G06F2111/10 Numerical modelling
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A20/00 Water conservation; Efficient water supply; Efficient water use
    • Y02A20/152 Water filtration
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02W CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO WASTEWATER TREATMENT OR WASTE MANAGEMENT
    • Y02W10/00 Technologies for wastewater treatment
    • Y02W10/10 Biological treatment of water, waste water, or sewage

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Water Supply & Treatment (AREA)
  • Environmental & Geological Engineering (AREA)
  • Hydrology & Water Resources (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Activated Sludge Processes (AREA)

Abstract

The invention relates to the technical field of sewage treatment quality monitoring and provides a sewage treatment process monitoring method based on KPLS and RWFCM. The method first collects data samples of the sewage treatment process under normal and abnormal working conditions, takes the data of the sewage treatment operating variables and of the effluent quality variables as the input and output data matrices respectively, and standardizes both matrices. A KPLS model is then constructed and the score matrix is solved. The score matrix is clustered with the RWFCM algorithm to obtain a membership matrix, and abnormal working conditions in the sewage treatment process are monitored according to the membership matrix. Finally, a linear regression model between the membership matrix and the sample variables is established, the variable contribution matrix is solved, and abnormal working conditions in the sewage treatment process are identified according to the variable contribution matrix. The invention can reduce the dimension of high-dimensional data and handle nonlinear data, is insensitive to outliers, and can improve the timeliness and accuracy of monitoring and identifying the sewage treatment process.

Description

Sewage treatment process monitoring method based on KPLS (kernel partial least squares) and RWFCM
Technical Field
The invention relates to the technical field of sewage treatment quality monitoring, in particular to a sewage treatment process monitoring method based on KPLS and RWFCM.
Background
With the acceleration of urbanization and industrialization in China, society's demand for fresh water resources is increasing day by day, and the construction of municipal sewage treatment facilities needs to be accelerated to improve municipal sewage treatment capacity. The activated sludge process is currently the main method for treating urban sewage. Sewage purification by activated sludge mainly comprises three processes: initial adsorption, microbial metabolism, and floc formation and sedimentation. In essence, the microbial community in the activated sludge adsorbs, decomposes and oxidizes the biodegradable organic matter in the sewage through a series of biochemical reactions, so that this matter is separated from the sewage and the sewage is purified.
At present, biochemical oxygen demand (BOD), chemical oxygen demand (COD), suspended solids (SS), ammonia nitrogen (NH3-N) and total phosphorus (TP) are generally adopted as sewage discharge indexes. In the sewage treatment process, parameters such as influent flow, influent composition, pollutant concentration and weather changes are passively accepted, and the life activities of the microorganisms are influenced by many factors such as dissolved oxygen concentration, microbial population and the pH value of the sewage, so it is very difficult to keep an urban sewage treatment plant operating stably over the long term. A failure of the sewage treatment plant easily causes the effluent quality to fall below the standard, increases operating cost and causes environmental pollution. If an abnormal working condition of the sewage treatment process is not detected in time, no correct judgment can be made and no effective measures can be taken in time for adjustment and correction, which can cause irreversible losses to the sewage treatment process. It is therefore especially important to monitor the sewage treatment process so that operators can accurately judge abnormal working conditions and take timely and accurate measures, ensuring safe, stable and smooth operation of sewage treatment and the quality of the effluent.
Data-driven process monitoring methods are widely applied. Among them, multivariate statistical process monitoring has become one of the research hotspots in the process monitoring field and has promoted research on effluent quality monitoring in sewage treatment. However, the sewage treatment process is a complex nonlinear process and historical fault data are incomplete, so identifying fault variables remains a difficult problem in monitoring the effluent quality of sewage treatment.
In recent years, existing sewage treatment process monitoring methods have adopted data mining, mainly because large amounts of data are widely available and urgently need to be converted into useful information and knowledge. Since sewage treatment process data carry no class labels and the occurrence of sewage treatment faults has little correlation with time, classification or sequential pattern mining is not suitable. Cluster analysis is an unsupervised classification technique in data mining that is well suited to analyzing data with little prior knowledge, so it is widely applied to sewage treatment process monitoring.
The fuzzy c-means (FCM) clustering algorithm is one of the classical clustering algorithms. FCM gives the degree of uncertainty with which a sample belongs to each category and thereby establishes an uncertainty description of the sample with respect to the categories, which is more consistent with the objective world. However, sewage treatment process data are high-dimensional and nonlinear and contain outliers. The traditional FCM algorithm cannot handle high-dimensional, nonlinear data and is very sensitive to outliers, which increases the difficulty of process monitoring, reduces the reliability of fault detection, greatly affects the quality of the effluent, and can cause economic loss and even accidents. In addition, the number of clusters of the FCM algorithm has to be preset manually, which is a great limitation in practical applications.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a KPLS and RWFCM-based sewage treatment process monitoring method, which can reduce the dimension of high-dimensional data and process nonlinear data, is insensitive to outliers, can accurately and conveniently determine the clustering number, and improves the timeliness and accuracy of sewage treatment process monitoring and identification.
The technical scheme of the invention is as follows:
A sewage treatment process monitoring method based on KPLS and RWFCM, characterized by comprising the following steps:
step 1: collecting data samples of the sewage treatment process under normal working conditions and data samples containing abnormal working conditions, wherein each data sample comprises $m_1$ sewage treatment operating variables and $m_2$ effluent quality variables; placing the normal-condition data samples before the abnormal-condition data samples in time order to form a mixed data sample set; taking the data of the $m_1$ sewage treatment operating variables in the mixed data sample set as an input data matrix X, and the data of the $m_2$ effluent quality variables in the mixed data sample set as an output data matrix Y;
step 2: preprocessing the input data matrix X and the output data matrix Y; the preprocessing comprises calculating the mean and the standard deviation of every variable in X and Y, and standardizing X and Y to zero mean and unit standard deviation;
step 3: constructing a KPLS (kernel partial least squares) model for monitoring the sewage treatment process, mapping each input sample x in the input data matrix X to a high-dimensional feature space F: $x \to \Phi(x) \in F$, introducing a Gaussian kernel function to obtain the Gram matrix K of the input data matrix X, and centering the Gram matrix K;
step 4: determining the number of principal components by cross-validation, and solving the score matrix T;
step 5: clustering the score matrix T with the RWFCM algorithm to obtain a membership matrix U, and monitoring abnormal working conditions of the sewage treatment process according to the membership matrix U: if the membership degree of the sample at a certain moment to the cluster center of the normal-condition samples is less than a threshold μ, the sewage treatment process is abnormal at that sample;
step 6: establishing a linear regression model between the membership matrix U and the variables in the input data matrix X, solving a variable contribution matrix N by the Lagrange multiplier method, and identifying abnormal working conditions of the sewage treatment process according to the variable contribution matrix N: if the maximum of the contributions $\{\eta_{a1}, \ldots, \eta_{ac}\}$ of the a-th variable to all clusters is $\eta_{ag}$, the a-th variable is a fault variable related to the g-th abnormal working condition; wherein c is the number of clusters, $g \in \{1, \ldots, c-1\}$, and the c-th cluster is the cluster of the normal-condition samples.
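The standardization in step 2 can be sketched as follows. This is a minimal illustration on synthetic data, not the patented implementation; the function name `standardize`, the use of the sample standard deviation (`ddof=1`) and the guard for constant columns are assumptions that the text does not specify.

```python
import numpy as np

def standardize(M):
    """Scale every column of M to zero mean and unit standard deviation."""
    mean = M.mean(axis=0)
    std = M.std(axis=0, ddof=1)   # assumption: sample standard deviation
    std[std == 0] = 1.0           # assumption: leave constant columns unscaled
    return (M - mean) / std, mean, std

# Synthetic stand-ins for the input matrix X and the output matrix Y.
rng = np.random.default_rng(0)
X = rng.normal(5.0, 2.0, size=(50, 3))
Y = rng.normal(1.0, 0.5, size=(50, 2))

Xs, x_mean, x_std = standardize(X)
Ys, y_mean, y_std = standardize(Y)
```

The stored means and standard deviations would be reused to scale any new sample in the same way before it is passed to the later steps.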
The sewage treatment process adopts the activated sludge method: after primary treatment, the raw sewage enters the biochemical tank section for biological denitrification; one part is then returned through the internal recirculation for further denitrification, and the other part enters the secondary sedimentation tank for sedimentation. The biochemical tank section comprises biochemical tanks $l \in \{1,2,3,4,5\}$, wherein tanks $l_1 \in \{1,2\}$ form the anoxic zone, which mainly completes the denitrification reaction, and tanks $l_2 \in \{3,4,5\}$ form the aerobic zone, which mainly completes the nitrification reaction. In step 1, the $m_1$ sewage treatment operating variables comprise the influent flow, the influent ammonia nitrogen, the active heterotrophic biomass in tanks $l \in \{1,2,3,4,5\}$, the readily biodegradable organic substrate in tanks $l \in \{1,2,3,4,5\}$, the alkalinity in tanks $l \in \{1,2,3,4,5\}$, the nitrate nitrogen in tanks $l_1 \in \{1,2\}$, the active autotrophic biomass in tanks $l_2 \in \{3,4,5\}$, the ammonia nitrogen in tanks $l_2 \in \{3,4,5\}$, and the dissolved oxygen in tanks $l_2 \in \{3,4,5\}$; the $m_2$ effluent quality variables comprise the biochemical oxygen demand, chemical oxygen demand, suspended solids and ammonia nitrogen of the effluent.
Step 3 comprises the following steps:
step 3.1: constructing the KPLS model for monitoring the sewage treatment process as
$\Phi = T P_1' + \Phi_r$
$Y = T Q' + Y_r$
step 3.2: mapping each input sample x in the input data matrix X to a high-dimensional feature space F: $x \to \Phi(x) \in F$, introducing a Gaussian kernel function to obtain the Gram matrix K of the input data matrix X, centering the Gram matrix K, and converting the KPLS model into the kernel-based model
$K = T P_2' + E$
$Y = T Q' + Y_r$
wherein the element in the i-th row and j-th column of the Gram matrix K is $K_{ij} = k(x_i, x_j) = \langle \Phi(x_i), \Phi(x_j) \rangle$; $x_i$ and $x_j$ are the i-th and j-th input samples in the input data matrix X; $k(x_i, x_j)$ is the Gaussian kernel function; $i, j \in \{1, 2, \ldots, n\}$; n is the number of samples in the input data matrix X; T is the score matrix of the high-dimensional data $\Phi = \{\Phi(x_i) \mid i \in \{1, 2, \ldots, n\}\}$, with $T = [t_1, \ldots, t_A]$; A is the number of principal components; $P_1 = [p_{11}, \ldots, p_{1A}]$, $P_2 = [p_{21}, \ldots, p_{2A}]$ and $Q = [q_1, \ldots, q_A]$ are the loading matrices of the matrix Φ, the Gram matrix K and the output data matrix Y respectively; and $\Phi_r$, E and $Y_r$ are the modeling residuals of the matrix Φ, the Gram matrix K and the output data matrix Y respectively.
In step 4, determining the number of principal components A by cross-validation and solving the score matrix T comprises the following steps:
step 4.1: let u be any column of the output data matrix Y;
step 4.2: calculate the score vector: $t = K u$;
step 4.3: normalize the score vector t: $\|t\| \to 1$;
step 4.4: regress each column of the output data matrix Y on the score vector t: $q = Y' t$;
step 4.5: calculate the new score of the output data matrix Y: $u = Y q$;
step 4.6: normalize the vector u: $\|u\| \to 1$;
step 4.7: judge whether u has converged: if yes, go to step 4.8; if not, return to step 4.2;
step 4.8: update the matrices: $K = (I - t t') K (I - t t')$, $Y = Y - t q'$; repeat steps 4.2 to 4.7 to calculate the next score vector until all A score vectors have been extracted; wherein I is the identity matrix.
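Steps 4.1 to 4.8 can be sketched as the following iteration. It is a minimal illustration only: the function name `kpls_scores`, the tolerance `tol`, the iteration cap `max_iter`, the choice of the first column of Y as the starting vector and the synthetic data are all assumptions, and the number of components is passed in directly instead of being chosen by cross-validation.

```python
import numpy as np

def kpls_scores(K, Y, n_components, tol=1e-10, max_iter=500):
    """Extract n_components score vectors from a centered Gram matrix K."""
    K, Y = K.copy(), Y.copy()
    n = K.shape[0]
    I = np.eye(n)
    T = np.zeros((n, n_components))
    for a in range(n_components):
        u = Y[:, 0].copy()                        # step 4.1 (assumed: first column)
        for _ in range(max_iter):
            t = K @ u                             # step 4.2
            t /= np.linalg.norm(t)                # step 4.3
            q = Y.T @ t                           # step 4.4
            u_new = Y @ q                         # step 4.5
            u_new /= np.linalg.norm(u_new)        # step 4.6
            converged = np.linalg.norm(u_new - u) < tol   # step 4.7
            u = u_new
            if converged:
                break
        T[:, a] = t
        D = I - np.outer(t, t)                    # step 4.8: deflate K and Y
        K = D @ K @ D
        Y = Y - np.outer(t, q)
    return T

# Synthetic demonstration with a centered Gaussian Gram matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
Y = np.column_stack([np.sin(X[:, 0]) + X[:, 1] ** 2, X[:, 2] * X[:, 3]])
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
K = np.exp(-d2 / (2.0 * 2.0 ** 2))               # assumed kernel width of 2.0
C = np.eye(40) - np.ones((40, 40)) / 40
T = kpls_scores(C @ K @ C, Y, n_components=3)
```

Because each deflation projects K onto the orthogonal complement of the score vector just extracted, the columns of T come out mutually orthogonal with unit norm.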
In step 3, the centered Gram matrix is
$\tilde{K} = \left(E_n - \frac{1}{n} 1_n 1_n'\right) K \left(E_n - \frac{1}{n} 1_n 1_n'\right)$
wherein $E_n$ is the n × n identity matrix, $1_n$ is the n-dimensional all-ones column vector, and $1_n'$ is the transpose of $1_n$.
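The Gram matrix and its centering can be sketched as follows. The sketch assumes the common Gaussian kernel parameterization $k(x_i, x_j) = \exp(-\|x_i - x_j\|^2 / (2\sigma^2))$ and the usual double-centering form built from the identity matrix $E_n$ and the all-ones vector $1_n$; the function names, the width `sigma` and the synthetic data are illustrative assumptions.

```python
import numpy as np

def gaussian_gram(X, sigma):
    """Gram matrix with K[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    d2 = np.maximum(d2, 0.0)      # clip small negative round-off values
    return np.exp(-d2 / (2.0 * sigma ** 2))

def center_gram(K):
    """Double-center K: (E_n - 1_n 1_n'/n) K (E_n - 1_n 1_n'/n)."""
    n = K.shape[0]
    C = np.eye(n) - np.ones((n, n)) / n
    return C @ K @ C

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))      # synthetic standardized input samples
K = gaussian_gram(X, sigma=1.5)
Kc = center_gram(K)
```

Centering in this way corresponds to subtracting the feature-space mean from every mapped sample without ever computing the mapping explicitly.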
Step 5 comprises the following steps:
step 5.1: clustering the score matrix T with the RWFCM algorithm, the RWFCM objective function being constructed as
Figure GDA0002146462390000042
wherein the i-th row vector of the score matrix T is the A-dimensional new sample obtained by dimensionality reduction of the $m_1$-dimensional input sample $x_i$; $u_{ij}$ is the membership degree of this i-th sample to the j-th cluster center $v_j$; $s_{ij}$ is the likelihood that the i-th sample belongs to the j-th cluster; the membership matrix is $U = (u_{ij})_{n \times c}$ and the cluster center matrix is $V = (v_j)_{c \times A}$; c is the number of clusters; $m \in [1, +\infty)$ is the fuzzy index; the distance term (Figure GDA0002146462390000047) is the Mahalanobis distance between the i-th sample and the j-th cluster center $v_j$; $S_j$ is the fuzzified covariance matrix, which is positive definite; the term shown in Figure GDA0002146462390000049 is the penalty term, in which $\eta_j$ is the penalty factor and p is the probability index; the c cluster centers obtained by clustering the score matrix T comprise the cluster center of the normal-condition samples and the cluster centers of c-1 classes of abnormal-condition samples;
step 5.2: solving the membership matrix U:
step 5.2.1: initialize the RWFCM algorithm parameters: determine the cluster number c, set the fuzzy index m and the probability index p, set the algorithm termination limit ε and the maximum number of iterations count, initialize the iteration counter k = 1, and randomly initialize the membership matrix $U^{(k)} = (u_{ij}^{(k)})_{n \times c}$, the cluster center matrix $V^{(k)} = (v_j^{(k)})_{c \times A}$ and the set of fuzzified covariance matrices $S^{(k)} = (S_j^{(k)})_{n \times n \times c}$;
step 5.2.2: substitute $u_{ij}^{(k)}$, $v_j^{(k)}$ and $S_j^{(k)}$ into the formula
Figure GDA0002146462390000051
to calculate the likelihood matrix of the (k+1)-th iteration, $B^{(k+1)} = (s_{ij}^{(k+1)})_{n \times c}$;
step 5.2.3: substitute $s_{ij}^{(k+1)}$, $v_j^{(k)}$ and $S_j^{(k)}$ into the formula
Figure GDA0002146462390000052
to calculate the membership matrix of the (k+1)-th iteration, $U^{(k+1)} = (u_{ij}^{(k+1)})_{n \times c}$;
step 5.2.4: substitute $u_{ij}^{(k+1)}$ and $s_{ij}^{(k+1)}$ into the formula
Figure GDA0002146462390000053
to calculate the cluster center matrix of the (k+1)-th iteration, $V^{(k+1)} = (v_j^{(k+1)})_{c \times A}$;
step 5.2.5: substitute $u_{ij}^{(k+1)}$, $s_{ij}^{(k+1)}$ and $v_j^{(k+1)}$ into the formula
Figure GDA0002146462390000054
to calculate the set of fuzzified covariance matrices of the (k+1)-th iteration, $S^{(k+1)} = (S_j^{(k+1)})_{n \times n \times c}$; wherein $\gamma_j$ is a Lagrange multiplier;
step 5.2.6: if $\|U^{(k+1)} - U^{(k)}\| < \varepsilon$ or the iteration counter k > count, stop iterating to obtain the final membership matrix U and go to step 5.3; otherwise, let k = k + 1 and return to step 5.2.2;
step 5.3: monitor abnormal working conditions in the sewage treatment process according to the membership matrix U: if the membership degree of the i-th sample (the i-th row of the score matrix T) to the cluster center of the normal-condition samples is less than μ, the sewage treatment process is abnormal at the i-th sample; if the sewage treatment process is abnormal, go to step 6; if the sewage treatment process is not abnormal, the KPLS and RWFCM-based sewage treatment process monitoring method ends.
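The monitoring rule of step 5.3 can be sketched as follows. Because the RWFCM update formulas are only available as images here, the sketch swaps in the standard fuzzy c-means (FCM) updates with Euclidean distance as a stand-in; it is not the RWFCM algorithm. The function names, the threshold value `mu=0.5`, the rule for picking the normal cluster (highest mean membership over the leading normal-condition samples) and the synthetic scores are all assumptions.

```python
import numpy as np

def fcm(T, c, m=2.0, eps=1e-6, count=300, seed=0):
    """Standard FCM (stand-in for RWFCM): returns memberships U and centers V."""
    rng = np.random.default_rng(seed)
    U = rng.random((T.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)             # rows sum to one
    for _ in range(count):
        W = U ** m
        V = (W.T @ T) / W.sum(axis=0)[:, None]    # weighted cluster centers
        d2 = ((T[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)                # avoid division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        done = np.linalg.norm(U_new - U) < eps    # same stopping test as step 5.2.6
        U = U_new
        if done:
            break
    return U, V

def detect_abnormal(U, n_normal, mu=0.5):
    """Flag samples whose membership to the normal cluster is below mu."""
    normal = int(U[:n_normal].mean(axis=0).argmax())
    return U[:, normal] < mu, normal

# Synthetic scores: 30 normal-condition samples followed by 30 abnormal ones.
rng = np.random.default_rng(1)
T = np.vstack([rng.normal(0.0, 0.3, size=(30, 2)),
               rng.normal(5.0, 0.3, size=(30, 2))])
U, V = fcm(T, c=2)
flags, normal = detect_abnormal(U, n_normal=30)
```

Placing the normal-condition samples first, as step 1 prescribes, is what allows the normal cluster to be located from the leading rows of U in this sketch.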
In step 5.2.1, determining the cluster number c comprises:
calculating the point density value $D_i$ of each input sample $x_i$ in the input data matrix X as
Figure GDA0002146462390000061
wherein
Figure GDA0002146462390000062
$r_d$ is the effective radius of the neighborhood density,
Figure GDA0002146462390000063
calculating the construction function S(j) as
Figure GDA0002146462390000064
and plotting the construction function S(j), the number of slopes in the plot of S(j) being taken as the cluster number c.
Step 6 comprises the following steps:
step 6.1: establishing the linear regression model between the membership matrix U and the variables in the input data matrix X as
Figure GDA0002146462390000065
wherein $N_0 = (\eta_{0j})_{1 \times c}$ is the constant term; $\varepsilon_{ij}$ is the error term, for which it is assumed that $E(\varepsilon_{ij}) = 0$ and $\mathrm{Var}(\varepsilon_{ij}) = \delta^2$, with δ a constant; $x_{ia}$ is the value of the a-th variable of the i-th input sample $x_i$ in the input data matrix X; the matrix N (Figure GDA0002146462390000066) is the variable contribution matrix; and $\eta_{aj}$ is a regression coefficient representing the contribution of the a-th variable to the j-th cluster;
step 6.2: solving the variable contribution matrix N by the Lagrange multiplier method:
step 6.2.1: initialize the parameters: set the algorithm termination limit τ and the maximum number of iterations T, initialize the iteration counter k = 1, and randomly initialize the variable contribution matrix
Figure GDA0002146462390000067
step 6.2.2: substitute $\eta_{aj}^{(k)}$ into the formula
Figure GDA0002146462390000068
to calculate $N_0^{(k+1)} = (\eta_{0j}^{(k+1)})_{1 \times c}$ of the (k+1)-th iteration;
step 6.2.3: substitute $\eta_{0j}^{(k+1)}$ into the formula
Figure GDA0002146462390000069
to calculate the variable contribution matrix of the (k+1)-th iteration
Figure GDA00021464623900000610
step 6.2.4: if $\|N^{(k+1)} - N^{(k)}\| < \tau$ or the iteration counter k > T, stop iterating and go to step 6.3; otherwise, let k = k + 1 and return to step 6.2.2;
step 6.3: identify abnormal working conditions in the sewage treatment process according to the variable contribution matrix N: if the maximum of the contributions $\{\eta_{a1}, \ldots, \eta_{ac}\}$ of the a-th variable to all clusters is $\eta_{ag}$, the a-th variable is judged to be a fault variable related to the g-th abnormal working condition; wherein $g \in \{1, \ldots, c-1\}$, and the c-th cluster is the cluster of the normal-condition samples.
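The identification rule of step 6.3 can be sketched as follows. The sketch assumes the element-wise model $u_{ij} = \eta_{0j} + \sum_a \eta_{aj} x_{ia} + \varepsilon_{ij}$ and fits it by ordinary least squares, which is a plainly named stand-in for the iterative Lagrange multiplier solution whose update formulas are only available as images. The function names and the synthetic data are assumptions; indices are 0-based, so the normal cluster (the c-th cluster in the text) has index c-1.

```python
import numpy as np

def contribution_matrix(X, U):
    """Least-squares fit of U on X with intercept: returns N0 (c,) and N (m1, c)."""
    n = X.shape[0]
    design = np.hstack([np.ones((n, 1)), X])
    coef, *_ = np.linalg.lstsq(design, U, rcond=None)
    return coef[0], coef[1:]

def fault_variables(N, normal_cluster):
    """Map each abnormal cluster g to the variables whose largest contribution is to g."""
    winners = N.argmax(axis=1)
    return {g: np.flatnonzero(winners == g).tolist()
            for g in range(N.shape[1]) if g != normal_cluster}

# Synthetic example: 3 variables, 3 clusters (the last cluster is the normal one).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
N_true = np.array([[0.3, 0.0, -0.3],
                   [0.0, 0.2, -0.2],
                   [-0.1, -0.1, 0.2]])
N0_true = np.array([0.3, 0.3, 0.4])
U = N0_true + X @ N_true                 # noise-free memberships for illustration

N0, N = contribution_matrix(X, U)
faults = fault_variables(N, normal_cluster=2)
```

In this synthetic case variable 0 contributes most to cluster 0 and variable 1 to cluster 1, so each is attributed to the corresponding abnormal working condition, while variable 2 is associated with the normal cluster and is not reported.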
The beneficial effects of the invention are as follows:
(1) The invention combines the KPLS algorithm with the RWFCM algorithm and constructs a KPLS model and an RWFCM model to describe the normal production process. No prior knowledge of abnormal working conditions in the sewage treatment process is needed, and only normal-condition data are used as labeled data. First, following a data-driven approach, a Gaussian kernel function is used to project the standardized process variables into a high-dimensional feature space, in which a KPLS model for monitoring the sewage treatment process is established. After the number of principal components is determined by cross-validation, the high-dimensional input data are reduced in dimension to obtain the score matrix T, which serves as the input data for cluster analysis in the RWFCM algorithm. This achieves dimension reduction while solving the problems that FCM cannot handle nonlinear data and is sensitive to outliers.
(2) The invention calculates a construction function based on the density function and obtains the cluster number from the construction function, so that the cluster number can be determined accurately and conveniently, overcoming the limitation that the cluster number of the RWFCM algorithm has to be preset manually.
(3) The invention clusters the score matrix T with the RWFCM algorithm to obtain the membership matrix U and monitors abnormal working conditions in the sewage treatment process according to U; the time at which an abnormal working condition occurs can be monitored through the sample memberships, and the number of abnormal working conditions can be identified. The invention also establishes a linear regression model between the membership matrix U and the sewage treatment process variables and solves it by the Lagrange multiplier method to obtain the variable contribution matrix N; abnormal working conditions are identified according to N, and the fault variables related to each abnormal working condition can be identified through the contributions of the variables to the clusters. The invention monitors and identifies abnormal working conditions in the sewage treatment process with high timeliness and accuracy, which helps operators monitor the process, accurately judge fluctuations in effluent quality, and take timely corrective measures, thereby ensuring stable, efficient and safe operation of the sewage plant and the quality of the effluent.
Drawings
FIG. 1 is a flow chart of a KPLS and RWFCM based wastewater treatment process monitoring method of the present invention;
FIG. 2 is a schematic diagram illustrating the membership of a monitoring sample to a clustering center of a normal condition sample according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating the membership of a monitoring sample to the clustering center of a type 1 abnormal condition sample according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating the membership of a monitoring sample to a cluster center of a sample under a type 2 abnormal condition in an embodiment of the present invention;
FIG. 5 is a diagram illustrating a clustering effect of the score matrix T according to an embodiment of the present invention;
FIG. 6 is a schematic diagram illustrating the contribution of each sample variable to each cluster in an embodiment of the present invention.
Detailed Description
The invention will be further described with reference to the accompanying drawings and specific embodiments.
FIG. 1 shows a flow chart of the KPLS and RWFCM-based sewage treatment process monitoring method of the present invention. The method comprises the following steps:
step 1: collecting data samples of the sewage treatment process under normal working conditions and data samples containing abnormal working conditions, wherein each data sample comprises $m_1$ sewage treatment operating variables and $m_2$ effluent quality variables; placing the normal-condition data samples before the abnormal-condition data samples in time order to form a mixed data sample set; taking the data of the $m_1$ sewage treatment operating variables in the mixed data sample set as the input data matrix X, and the data of the $m_2$ effluent quality variables in the mixed data sample set as the output data matrix Y.
In this embodiment, the sewage treatment process employs an activated sludge process. The activated sludge process flow is generally divided into primary treatment, secondary treatment and tertiary treatment according to the treatment degree. The raw sewage is treated in the first stage and then enters the biochemical tank for biological denitrification, one part of the raw sewage is subjected to denitrification again through internal circulation reflux, and the other part of the raw sewage enters the secondary sedimentation tank for sedimentation. The biochemical pool is used for performing biochemical reactionThe most important place for sewage treatment is the process and the purification. The biochemical pool part comprises a biochemical pool l belonging to {1,2,3,4,5}, wherein the biochemical pool l belongs to 1 Belongs to {1,2} as an anoxic zone for mainly completing the denitrification reaction process, i.e. a biochemical pool 2 The epsilon {3,4,5} is an aerobic zone which mainly completes the nitration reaction process; in the step 1, as shown in the following Table 1, the m 1 The operation variables of the sewage treatment comprise inflow, inflow ammonia nitrogen, the biomass of active heterotrophic bacteria in a biochemical pond I E {1,2,3,4,5}, the biomass of easily biodegradable organic matters in the biochemical pond I E {1,2,3,4,5}, the alkalinity in the biochemical pond I E {1,2,3,4,5}, and the biochemical pond I 1 Amount of nitrol in E {1,2}, biochemical pool l 2 Active autotrophic bacteria biomass in epsilon {3,4,5}, biochemical pond l 2 Amount of ammonia nitrogen in E {3,4,5}, biochemical pool l 2 The dissolved oxygen content in the epsilon {3,4,5 }; m is said 2 The quality variables of the effluent comprise biochemical oxygen demand, chemical oxygen demand, suspended matters and ammonia nitrogen amount of the effluent.
The abnormal conditions are known to those skilled in the art as sludge bulking, foaming, scum, toxic shock, heavy rain, etc. In this embodiment, 100 sewage treatment process data samples under normal conditions and 1300 sewage treatment process data samples under abnormal conditions including rainstorm weather and toxic impact are collected to form a mixed data sample set including 1400 samples. Collecting m mixed data samples 1 Taking data of =28 sewage treatment operation variables as input data matrix X ∈ R 1400×28 Set m of mixed data samples 2 Taking the data of the =4 water outlet quality variables as an output data matrix Y ∈ R 1400×4
TABLE 1
Figure GDA0002146462390000091
Step 2: preprocessing an input data matrix X and an output data matrix Y; the preprocessing comprises the steps of calculating the mean value and the standard deviation of each variable in the input data matrix X and the output data matrix Y, and normalizing the input data matrix X and the output data matrix Y into data with zero mean value and unit standard deviation.
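The preprocessing of step 2 can be sketched as follows. This is a minimal illustration, assuming NumPy; the function name `standardize`, the use of the sample standard deviation (`ddof=1`), and the random placeholder data are assumptions made for illustration and are not part of the original disclosure.

```python
import numpy as np

def standardize(M):
    """Scale each column of M to zero mean and unit standard deviation."""
    M = np.asarray(M, dtype=float)
    mean = M.mean(axis=0)
    std = M.std(axis=0, ddof=1)            # sample standard deviation (assumption)
    std = np.where(std == 0.0, 1.0, std)   # guard against constant columns
    return (M - mean) / std, mean, std

# Placeholder data with the shapes used in the embodiment (not plant data).
rng = np.random.default_rng(0)
X = rng.normal(5.0, 2.0, size=(1400, 28))
Y = rng.normal(1.0, 0.5, size=(1400, 4))

Xs, x_mean, x_std = standardize(X)
Ys, y_mean, y_std = standardize(Y)
```

The stored means and standard deviations would be reused to scale any later samples in the same way.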
Step 3: Construct a KPLS model for monitoring the sewage treatment process: map each input sample x in the input data matrix X to a high-dimensional feature space F, x → Φ(x) ∈ F, introduce a Gaussian kernel function to obtain the Gram matrix K of the input data matrix X, and center the Gram matrix K.
Step 3 comprises the following steps:
Step 3.1: The KPLS model for monitoring the sewage treatment process is constructed as
Φ = T P_1' + Φ_r
Y = T Q' + Y_r
Step 3.2: Map each input sample x in the input data matrix X to the high-dimensional feature space F, x → Φ(x) ∈ F, introduce a Gaussian kernel function to obtain the Gram matrix K of the input data matrix X, and center the Gram matrix K, so that the KPLS model is converted into
K = T P_2' + E
Y = T Q' + Y_r
where the element in the i-th row and j-th column of the Gram matrix K is K_ij = k(x_i, x_j) = <Φ(x_i), Φ(x_j)>; x_i and x_j are the i-th and j-th input samples in the input data matrix X, respectively; k(x_i, x_j) is the Gaussian kernel function; i, j ∈ {1, 2, ..., n}, and n is the number of samples in the input data matrix X; T = [t_1, ..., t_A] is the score matrix of the high-dimensional data Φ = {Φ(x_i) | i ∈ {1, 2, ..., n}}; A is the number of principal components; P_1 = [p_11, ..., p_1A], P_2 = [p_21, ..., p_2A], and Q = [q_1, ..., q_A] are the loading matrices of the matrix Φ, the Gram matrix K, and the output data matrix Y, respectively; and Φ_r, E, and Y_r are the modeling residuals of the matrix Φ, the Gram matrix K, and the output data matrix Y, respectively.
In step 3, the centered Gram matrix is
Figure GDA0002146462390000101
where E_n is the n × n identity matrix, 1_n is the n-dimensional all-ones column vector, and 1_n' is the transpose of 1_n.
The KPLS model is constructed with the nonlinear iterative partial least squares algorithm; KPLS stands for kernel projection to latent structures (kernel partial least squares). In this embodiment, the Gaussian kernel function is
Figure GDA0002146462390000102
where c_1 is the width parameter of the Gaussian kernel function, set empirically to 5 m_1, i.e., c_1 = 5 × m_1 = 140.
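The Gram matrix construction and centering of step 3 can be sketched as follows. The kernel and centering formulas appear only as figures in this text, so the sketch assumes the common forms k(x_i, x_j) = exp(-||x_i - x_j||^2 / c_1) and (E_n - 1_n 1_n'/n) K (E_n - 1_n 1_n'/n); both forms, the function names, and the placeholder data are assumptions for illustration.

```python
import numpy as np

def gaussian_gram(X, c1):
    """Gram matrix with K_ij = exp(-||x_i - x_j||^2 / c1) (assumed kernel form)."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)               # remove tiny negative rounding errors
    return np.exp(-d2 / c1)

def center_gram(K):
    """Center K in feature space using (E_n - 1_n 1_n'/n) on both sides."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return J @ K @ J

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 28))              # placeholder standardized inputs
c1 = 5 * X.shape[1]                        # c_1 = 5 * m_1 = 140 as in the embodiment
K = gaussian_gram(X, c1)
Kc = center_gram(K)
```

After centering, every row and column of the Gram matrix sums to approximately zero, which corresponds to zero-mean mapped samples in the feature space.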
Step 4: Determine the number of principal components by cross-validation and solve for the score matrix T.
In step 4, determining the number of principal components A by cross-validation and solving for the score matrix T comprises the following steps:
Step 4.1: Let u be any column of the output data matrix Y;
Step 4.2: Compute the score vector: t = Ku;
Step 4.3: Normalize the score vector t: ||t|| → 1;
Step 4.4: Regress each column of the output data matrix Y on the score vector t: q = Y't;
Step 4.5: Compute the new score of the output data matrix Y: u = Yq;
Step 4.6: Normalize the vector u: ||u|| → 1;
Step 4.7: Check whether u has converged: if so, go to step 4.8; if not, return to step 4.2;
Step 4.8: Update the matrices: K = (I - tt')K(I - tt'), Y = Y - tq'; repeat steps 4.2 to 4.7 to compute the next score vector until all A score vectors have been extracted; here I is the identity matrix.
In this embodiment, cross-validation gives the number of principal components A = 3.
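Steps 4.1 to 4.8 can be sketched as the loop below. This is a minimal illustration, assuming NumPy and a Gram matrix that has already been centered; the function name `kpls_scores`, the tolerance, the iteration cap, and the placeholder data are hypothetical. The number of principal components A is passed in directly, and the cross-validation used to choose it is not shown.

```python
import numpy as np

def kpls_scores(K, Y, A, tol=1e-10, max_iter=500):
    """Extract A score vectors from a centered Gram matrix K and output matrix Y."""
    K = np.array(K, dtype=float)
    Y = np.array(Y, dtype=float)
    n = K.shape[0]
    T = np.zeros((n, A))
    for a in range(A):
        u = Y[:, 0].copy()                        # step 4.1: any column of Y
        for _ in range(max_iter):
            t = K @ u                             # step 4.2
            t /= np.linalg.norm(t)                # step 4.3
            q = Y.T @ t                           # step 4.4
            u_new = Y @ q                         # step 4.5
            u_new /= np.linalg.norm(u_new)        # step 4.6
            converged = np.linalg.norm(u_new - u) < tol   # step 4.7
            u = u_new
            if converged:
                break
        T[:, a] = t
        P = np.eye(n) - np.outer(t, t)            # step 4.8: deflate K and Y
        K = P @ K @ P
        Y = Y - np.outer(t, q)
    return T

# Placeholder data (not plant data); kernel form is an assumption.
rng = np.random.default_rng(0)
n = 60
X = rng.normal(size=(n, 5))
Y = X @ rng.normal(size=(5, 4)) + 0.1 * rng.normal(size=(n, 4))
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
sq = np.sum(X ** 2, axis=1)
K = np.exp(-(sq[:, None] + sq[None, :] - 2.0 * X @ X.T) / (5 * X.shape[1]))
J = np.eye(n) - np.ones((n, n)) / n
T = kpls_scores(J @ K @ J, Y, A=3)
```

Because K is deflated on both sides after each component, the extracted score vectors are mutually orthogonal and have unit norm.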
Step 5: Cluster the score matrix T with the RWFCM algorithm to obtain the membership matrix U, and monitor the sewage treatment process for abnormal conditions according to the membership matrix U: if the membership of the sample at a given time to the cluster center of the normal-condition samples is less than μ, the sewage treatment process is abnormal at that sample.
Step 5 comprises the following steps:
Step 5.1: Cluster the score matrix T with the RWFCM algorithm, constructing the RWFCM objective function as
Figure GDA0002146462390000111
where
Figure GDA0002146462390000112
is the i-th row vector of the score matrix T;
Figure GDA0002146462390000113
is the new A-dimensional sample, after dimensionality reduction, corresponding to the m_1-dimensional input sample x_i; u_ij is the membership of sample
Figure GDA0002146462390000114
to the j-th cluster center v_j; s_ij is the likelihood that sample
Figure GDA0002146462390000115
belongs to the j-th cluster; the membership matrix is U = (u_ij)_{n×c}, the cluster center matrix is V = (v_j)_{c×A}, and c is the number of clusters; m ∈ [1, +∞) is the fuzzy index;
Figure GDA0002146462390000116
is the Mahalanobis distance between sample
Figure GDA0002146462390000117
and the j-th cluster center v_j; S_j is the fuzzy covariance matrix, which is positive definite;
Figure GDA0002146462390000118
is the penalty term, η_j is the penalty factor, and p is the probability index; the c cluster centers obtained by clustering the score matrix T comprise the cluster center of the normal-condition samples and the cluster centers of the c-1 classes of abnormal-condition samples.
Step 5.2: Solve for the membership matrix U:
By introducing Lagrange multipliers λ and γ, the following function is constructed:
Figure GDA0002146462390000121
Taking the partial derivatives of the function L with respect to s_ij, u_ij, v_j, and S_j gives
Figure GDA0002146462390000122
Figure GDA0002146462390000123
Figure GDA0002146462390000124
Figure GDA0002146462390000125
From
Figure GDA0002146462390000126
we obtain
Figure GDA0002146462390000127
Substituting
Figure GDA0002146462390000128
into the above formula gives
Figure GDA0002146462390000129
Figure GDA00021464623900001210
Figure GDA00021464623900001211
Since S_j^(-1) + (S_j^(-1))^T is invertible, we obtain
Figure GDA0002146462390000131
Figure GDA0002146462390000132
Figure GDA0002146462390000133
Figure GDA0002146462390000134
Figure GDA0002146462390000135
Step 5.2.1: Initialize the RWFCM algorithm parameters: determine the number of clusters c; set the fuzzy index m and the probability index p; set the algorithm termination threshold ε and the maximum number of iterations count; initialize the iteration counter k = 1; and randomly initialize the membership matrix U^(k) = (u_ij^(k))_{n×c}, the cluster center matrix V^(k) = (v_j^(k))_{c×A}, and the set of fuzzy covariance matrices S^(k) = (S_j^(k))_{n×n×c}.
Step 5.2.2: Substitute u_ij^(k), v_j^(k), and S_j^(k) into the formula
Figure GDA0002146462390000136
to compute the likelihood matrix of the (k+1)-th iteration, B^(k+1) = (s_ij^(k+1))_{n×c}.
Step 5.2.3: Substitute s_ij^(k+1), v_j^(k), and S_j^(k) into the formula
Figure GDA0002146462390000137
to compute the membership matrix of the (k+1)-th iteration, U^(k+1) = (u_ij^(k+1))_{n×c}.
Step 5.2.4: Substitute u_ij^(k+1) and s_ij^(k+1) into the formula
Figure GDA0002146462390000138
to compute the cluster center matrix of the (k+1)-th iteration, V^(k+1) = (v_j^(k+1))_{c×A}.
Step 5.2.5: Substitute u_ij^(k+1), s_ij^(k+1), and v_j^(k+1) into the formula
Figure GDA0002146462390000141
to compute the set of fuzzy covariance matrices of the (k+1)-th iteration, S^(k+1) = (S_j^(k+1))_{n×n×c}, where γ_j is a Lagrange multiplier;
Step 5.2.6: If ||U^(k+1) - U^(k)|| < ε or the iteration count k > count, stop iterating to obtain the final membership matrix U and go to step 5.3; otherwise, set k = k + 1 and return to step 5.2.2;
Step 5.3: Monitor the sewage treatment process for abnormal conditions according to the membership matrix U: if the membership of the i-th sample
Figure GDA0002146462390000142
to the cluster center of the normal-condition samples is less than μ, the sewage treatment process is abnormal at the i-th sample. If the sewage treatment process is abnormal, go to step 6; if no abnormality occurs in the sewage treatment process, the KPLS and RWFCM-based sewage treatment process monitoring method ends.
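The decision rule of step 5.3 can be sketched as a simple threshold on one column of the membership matrix. This is a minimal illustration, assuming NumPy; the function name `flag_abnormal`, the zero-based column index of the normal-condition cluster, and the small membership matrix are hypothetical.

```python
import numpy as np

def flag_abnormal(U, normal_cluster, mu):
    """Return indices of samples whose membership to the normal cluster is below mu."""
    U = np.asarray(U, dtype=float)
    return np.flatnonzero(U[:, normal_cluster] < mu)

# Hypothetical memberships for 4 samples and c = 3 clusters;
# the last column is taken to be the normal-condition cluster.
U = np.array([[0.1, 0.1, 0.8],
              [0.7, 0.1, 0.2],
              [0.2, 0.3, 0.5],
              [0.1, 0.6, 0.3]])
abnormal = flag_abnormal(U, normal_cluster=2, mu=0.5)
```

Only memberships strictly below μ are flagged, following the "less than μ" wording of the step.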
In step 5.2.1, determining the number of clusters c comprises:
computing the point density value D_i of each input sample x_i in the input data matrix X as
Figure GDA0002146462390000143
where
Figure GDA0002146462390000144
r_d is the effective radius of the neighborhood density,
Figure GDA0002146462390000145
computing the construction function S(j) as
Figure GDA0002146462390000146
and plotting the construction function S(j) and taking the number of distinct slopes of its graph as the number of clusters c.
The slope of the construction function S(j) reflects the point density of the sample data; in practical terms, S(j) has the same slope over data of the same class. In this embodiment, the graph of the construction function S(j) has 3 distinct slopes, from which it can be determined that the mixed data sample set falls into 3 classes: class 1 is the type 1 abnormal-condition sample class, class 2 is the type 2 abnormal-condition sample class, and class 3 is the normal-condition sample class, so the number of clusters is c = 3.
The fuzzy index m influences the degree of fuzziness of the membership matrix. In this embodiment, the fuzzy index is set to m = 2.4, which gives the best algorithm performance. RWFCM stands for robust weighted fuzzy c-means clustering.
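The effect of the fuzzy index on the memberships can be illustrated with the classical fuzzy c-means membership formula u_ij = 1 / Σ_k (d_ij / d_ik)^(2/(m-1)). This is the standard FCM rule, used here only as a stand-in: the RWFCM update of this method is given in the figures above and additionally involves the likelihoods s_ij, the Mahalanobis distance, and the penalty term. The function name and the distances below are hypothetical.

```python
import numpy as np

def fcm_membership(D, m):
    """Classical FCM memberships from an (n, c) matrix of sample-to-center distances."""
    D = np.maximum(np.asarray(D, dtype=float), 1e-12)   # avoid division by zero
    ratio = (D[:, :, None] / D[:, None, :]) ** (2.0 / (m - 1.0))
    return 1.0 / ratio.sum(axis=2)

# One sample at distances 1, 2 and 4 from three cluster centers.
D = np.array([[1.0, 2.0, 4.0]])
u_sharp = fcm_membership(D, m=1.5)   # small m: memberships close to crisp
u_fuzzy = fcm_membership(D, m=2.4)   # m used in the embodiment: softer memberships
```

With the same distances, the larger fuzzy index spreads the membership more evenly across the clusters, while each row still sums to 1.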
In this embodiment, the memberships u_i3, u_i1, and u_i2 of the monitored sample
Figure GDA0002146462390000147
to the cluster center v_3 of the normal-condition samples, the cluster center v_1 of the type 1 abnormal-condition samples, and the cluster center v_2 of the type 2 abnormal-condition samples are shown in FIGS. 2, 3, and 4, respectively, and the clustering result of the score matrix T is shown in FIG. 5. The threshold is set to μ = 0.5. As can be seen from FIGS. 2, 3, and 4, at the 200th and 800th samples, the membership of the sample
Figure GDA0002146462390000151
to the cluster center of the normal-condition samples is less than 0.5, so the sewage treatment process is judged to be abnormal at the 200th and 800th samples. The monitoring method can therefore detect the occurrence of abnormal conditions in the sewage treatment process in a timely manner.
Step 6: Establish a linear regression model between the membership matrix U and the variables in the input data matrix X, solve for the variable contribution matrix N by the Lagrange multiplier method, and identify the abnormal condition of the sewage treatment process according to the variable contribution matrix N: if the maximum among the contributions {η_a1, ..., η_ac} of the a-th variable to all clusters is η_ag, the a-th variable is judged to be a fault variable associated with the g-th abnormal condition; here c is the number of clusters, g ∈ {1, ..., c-1}, and the c-th cluster is the cluster of normal-condition samples.
Step 6 comprises the following steps:
Step 6.1: The linear regression model between the membership matrix U and the variables in the input data matrix X is established as
Figure GDA0002146462390000152
where N_0 = (η_0j)_{1×c} is the constant term and ε_ij is the error term, for which it is assumed that E(ε_ij) = 0 and Var(ε_ij) = δ^2, with δ a constant; x_ia is the value of the a-th variable of the i-th input sample x_i in the input data matrix X;
Figure GDA0002146462390000153
is the variable contribution matrix; and η_aj is a regression coefficient, namely the contribution of the a-th variable to the j-th cluster.
Step 6.2: Solve for the variable contribution matrix N by the Lagrange multiplier method:
To solve for η_aj and η_0j, a loss function is introduced as
Figure GDA0002146462390000154
Since η_aj represents the degree to which the a-th variable explains the j-th cluster, a constraint is imposed on the loss function:
Figure GDA0002146462390000155
To solve for the variable contribution matrix N by the Lagrange multiplier method, a Lagrange multiplier ζ is introduced and the objective function is constructed as
Figure GDA0002146462390000156
Taking the partial derivatives of the objective function L with respect to η_0j and η_aj gives
Figure GDA0002146462390000161
Figure GDA0002146462390000162
Figure GDA0002146462390000163
Figure GDA0002146462390000164
From
Figure GDA0002146462390000165
substituting
Figure GDA0002146462390000166
into the above formula gives
Figure GDA0002146462390000167
Figure GDA0002146462390000168
Step 6.2.1: Initialize the parameters: set the algorithm termination threshold τ and the maximum number of iterations T; initialize the iteration counter k = 1; and randomly initialize the variable contribution matrix
Figure GDA0002146462390000169
Step 6.2.2: Substitute η_aj^(k) into the formula
Figure GDA00021464623900001610
to compute N_0^(k+1) = (η_0j^(k+1))_{1×c} for the (k+1)-th iteration.
Step 6.2.3: Substitute η_0j^(k+1) into the formula
Figure GDA00021464623900001611
to compute the variable contribution matrix of the (k+1)-th iteration,
Figure GDA00021464623900001612
Step 6.2.4: If ||N^(k+1) - N^(k)|| < τ or the iteration count k > T, stop iterating and go to step 6.3; otherwise, set k = k + 1 and return to step 6.2.2;
Step 6.3: Identify the abnormal condition of the sewage treatment process according to the variable contribution matrix N: if the maximum among the contributions {η_a1, ..., η_ac} of the a-th variable to all clusters is η_ag, the a-th variable is judged to be a fault variable associated with the g-th abnormal condition; here g ∈ {1, ..., c-1}, and the c-th cluster is the cluster of normal-condition samples.
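The identification rule of step 6.3 can be sketched as a row-wise maximum over the variable contribution matrix. In the sketch below, the coefficients of the regression model u_ij = η_0j + Σ_a η_aj x_ia + ε_ij are fitted by ordinary least squares, which is a plainly different, simpler technique than the constrained Lagrange multiplier iteration of step 6.2 (whose formulas are given only in the figures). The function names, the zero-based indexing, and the data are hypothetical.

```python
import numpy as np

def fit_contributions_ols(X, U):
    """Stand-in fit: ordinary least squares for u_ij = eta_0j + sum_a eta_aj * x_ia."""
    X = np.asarray(X, dtype=float)
    U = np.asarray(U, dtype=float)
    design = np.hstack([np.ones((X.shape[0], 1)), X])
    coef, *_ = np.linalg.lstsq(design, U, rcond=None)
    return coef[0], coef[1:]               # N_0 with shape (c,), N with shape (m_1, c)

def identify_fault_variables(N):
    """Map each abnormal cluster g to the variables whose largest contribution is eta_ag."""
    N = np.asarray(N, dtype=float)
    c = N.shape[1]                         # the last cluster is the normal-condition cluster
    g = np.argmax(N, axis=1)
    return {j: np.flatnonzero(g == j).tolist() for j in range(c - 1)}

# Hypothetical contribution matrix: 4 variables, c = 3 clusters.
N = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.7, 0.1],
              [0.1, 0.2, 0.7],
              [0.6, 0.3, 0.1]])
faults = identify_fault_variables(N)

# Check the stand-in fit on noise-free synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 4))
N0_true = np.array([0.3, 0.3, 0.4])
U = N0_true + X @ N
N0_hat, N_hat = fit_contributions_ols(X, U)
```

Variables whose largest contribution falls on the normal-condition cluster (variable 2 in this example) are not assigned to any abnormal condition.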
In this embodiment, the contributions of the 28 sample variables to the 3 clusters are shown in FIG. 6. From FIG. 6 it can be identified that the fault variables associated with the type 1 abnormal condition are the active heterotrophic biomass in biochemical tanks 1-5, the readily biodegradable organic substrate, the active autotrophic biomass in biochemical tanks 3-5, and the nitrate nitrogen in biochemical tank 1; the fault variables associated with the type 2 abnormal condition are the influent flow rate, the influent ammonia nitrogen, the alkalinity in biochemical tanks 1-5, the nitrate nitrogen in biochemical tank 2, the dissolved oxygen in biochemical tanks 3-5, and the ammonia nitrogen in biochemical tanks 3-5. The method can therefore identify the fault variables of each abnormal condition in a timely manner.
It is to be understood that the above-described embodiments are only some of the embodiments of the present invention, and not all of the embodiments. The above examples are only for explaining the present invention and do not constitute a limitation to the scope of protection of the present invention. All other embodiments, which can be derived by those skilled in the art from the above-described embodiments without any creative effort, namely all modifications, equivalents, improvements and the like made within the spirit and principle of the present application, fall within the scope of the present invention as claimed.

Claims (8)

1. A KPLS and RWFCM based sewage treatment process monitoring method is characterized by comprising the following steps:
Step 1: Collect sewage treatment process data samples under normal working conditions and sewage treatment process data samples containing abnormal working conditions, respectively, where each sewage treatment process data sample comprises m_1 sewage treatment operating variables and m_2 effluent quality variables; place the normal-condition samples before the samples containing abnormal conditions in time order to form a mixed data sample set; take the data of the m_1 sewage treatment operating variables in the mixed data sample set as the input data matrix X, and take the data of the m_2 effluent quality variables in the mixed data sample set as the output data matrix Y;
Step 2: Preprocess the input data matrix X and the output data matrix Y; the preprocessing comprises calculating the mean and standard deviation of each variable in the input data matrix X and the output data matrix Y, and normalizing the input data matrix X and the output data matrix Y to zero mean and unit standard deviation;
Step 3: Construct a KPLS (kernel partial least squares) model for monitoring the sewage treatment process: map each input sample x in the input data matrix X to a high-dimensional feature space F, x → Φ(x) ∈ F, introduce a Gaussian kernel function to obtain the Gram matrix K of the input data matrix X, and center the Gram matrix K;
Step 4: Determine the number of principal components by cross-validation and solve for the score matrix T;
Step 5: Cluster the score matrix T with the RWFCM algorithm to obtain the membership matrix U, and monitor the sewage treatment process for abnormal conditions according to the membership matrix U: if the membership of the sample at a given time to the cluster center of the normal-condition samples is less than μ, the sewage treatment process is abnormal at that sample;
Step 6: Establish a linear regression model between the membership matrix U and the variables in the input data matrix X, solve for the variable contribution matrix N by the Lagrange multiplier method, and identify the abnormal condition of the sewage treatment process according to the variable contribution matrix N: if the maximum among the contributions {η_a1, ..., η_ac} of the a-th variable to all clusters is η_ag, the a-th variable is a fault variable associated with the g-th abnormal condition; wherein c is the number of clusters, g ∈ {1, ..., c-1}, and the c-th cluster is the cluster of normal-condition samples.
2. The KPLS and RWFCM based sewage treatment process monitoring method according to claim 1, wherein the sewage treatment process adopts an activated sludge process; after primary treatment, the raw sewage enters a biochemical tank section for biological denitrification, after which one part is returned through internal recirculation for further denitrification and the other part enters a secondary sedimentation tank for settling; the biochemical tank section comprises biochemical tanks l ∈ {1,2,3,4,5}, wherein biochemical tanks l_1 ∈ {1,2} form an anoxic zone, which mainly carries out the denitrification reaction, and biochemical tanks l_2 ∈ {3,4,5} form an aerobic zone, which mainly carries out the nitrification reaction; in step 1, the m_1 sewage treatment operating variables comprise the influent flow rate, the influent ammonia nitrogen, the active heterotrophic biomass in biochemical tanks l ∈ {1,2,3,4,5}, the readily biodegradable organic substrate in biochemical tanks l ∈ {1,2,3,4,5}, the alkalinity in biochemical tanks l ∈ {1,2,3,4,5}, the nitrate nitrogen in biochemical tanks l_1 ∈ {1,2}, the active autotrophic biomass in biochemical tanks l_2 ∈ {3,4,5}, the ammonia nitrogen in biochemical tanks l_2 ∈ {3,4,5}, and the dissolved oxygen in biochemical tanks l_2 ∈ {3,4,5}; the m_2 effluent quality variables comprise the biochemical oxygen demand, chemical oxygen demand, suspended solids, and ammonia nitrogen of the effluent.
3. A KPLS and RWFCM based sewage treatment process monitoring method according to claim 1 or 2, characterised in that said step 3 comprises the steps of:
Step 3.1: The KPLS model for monitoring the sewage treatment process is constructed as
Φ = T P_1' + Φ_r
Y = T Q' + Y_r
Step 3.2: Map each input sample x in the input data matrix X to the high-dimensional feature space F, x → Φ(x) ∈ F, introduce a Gaussian kernel function to obtain the Gram matrix K of the input data matrix X, and center the Gram matrix K, so that the KPLS model is converted into
K = T P_2' + E
Y = T Q' + Y_r
wherein the element in the i-th row and j-th column of the Gram matrix K is K_ij = k(x_i, x_j) = <Φ(x_i), Φ(x_j)>; x_i and x_j are the i-th and j-th input samples in the input data matrix X, respectively; k(x_i, x_j) is the Gaussian kernel function; i, j ∈ {1, 2, ..., n}, and n is the number of samples in the input data matrix X; T = [t_1, ..., t_A] is the score matrix of the high-dimensional data Φ = {Φ(x_i) | i ∈ {1, 2, ..., n}}; A is the number of principal components; P_1 = [p_11, ..., p_1A], P_2 = [p_21, ..., p_2A], and Q = [q_1, ..., q_A] are the loading matrices of the matrix Φ, the Gram matrix K, and the output data matrix Y, respectively; and Φ_r, E, and Y_r are the modeling residuals of the matrix Φ, the Gram matrix K, and the output data matrix Y, respectively.
4. The KPLS and RWFCM based sewage treatment process monitoring method according to claim 3, wherein in step 4, determining the number of principal components A by cross-validation and solving for the score matrix T comprises the following steps:
Step 4.1: Let u be any column of the output data matrix Y;
Step 4.2: Compute the score vector: t = Ku;
Step 4.3: Normalize the score vector t: ||t|| → 1;
Step 4.4: Regress each column of the output data matrix Y on the score vector t: q = Y't;
Step 4.5: Compute the new score of the output data matrix Y: u = Yq;
Step 4.6: Normalize the vector u: ||u|| → 1;
Step 4.7: Check whether u has converged: if so, go to step 4.8; if not, return to step 4.2;
Step 4.8: Update the matrices: K = (I - tt')K(I - tt'), Y = Y - tq'; repeat steps 4.2 to 4.7 to compute the next score vector until all A score vectors have been extracted; wherein I is the identity matrix.
5. The KPLS and RWFCM-based sewage treatment process monitoring method according to claim 3, wherein in step 3, the centered Gram matrix is
Figure FDA0002111429540000031
wherein E_n is the n × n identity matrix, 1_n is the n-dimensional all-ones column vector, and 1_n' is the transpose of 1_n.
6. The KPLS and RWFCM based sewage treatment process monitoring method according to claim 4, wherein said step 5 comprises the steps of:
Step 5.1: Cluster the score matrix T with the RWFCM algorithm, constructing the RWFCM objective function as
Figure FDA0002111429540000032
wherein
Figure FDA0002111429540000033
is the i-th row vector of the score matrix T;
Figure FDA0002111429540000034
is the new A-dimensional sample, after dimensionality reduction, corresponding to the m_1-dimensional input sample x_i; u_ij is the membership of sample
Figure FDA0002111429540000035
to the j-th cluster center v_j; s_ij is the likelihood that sample
Figure FDA0002111429540000036
belongs to the j-th cluster; the membership matrix is U = (u_ij)_{n×c}, the cluster center matrix is V = (v_j)_{c×A}, and c is the number of clusters; m ∈ [1, +∞) is the fuzzy index;
Figure FDA0002111429540000037
is the Mahalanobis distance between sample
Figure FDA0002111429540000038
and the j-th cluster center v_j; S_j is the fuzzy covariance matrix, which is positive definite;
Figure FDA0002111429540000039
is the penalty term, η_j is the penalty factor, and p is the probability index; the c cluster centers obtained by clustering the score matrix T comprise the cluster center of the normal-condition samples and the cluster centers of the c-1 classes of abnormal-condition samples;
Step 5.2: Solve for the membership matrix U:
Step 5.2.1: Initialize the RWFCM algorithm parameters: determine the number of clusters c; set the fuzzy index m and the probability index p; set the algorithm termination threshold ε and the maximum number of iterations count; initialize the iteration counter k = 1; and randomly initialize the membership matrix U^(k) = (u_ij^(k))_{n×c}, the cluster center matrix V^(k) = (v_j^(k))_{c×A}, and the set of fuzzy covariance matrices S^(k) = (S_j^(k))_{n×n×c};
Step 5.2.2: Substitute u_ij^(k), v_j^(k), and S_j^(k) into the formula
Figure FDA0002111429540000041
to compute the likelihood matrix of the (k+1)-th iteration, B^(k+1) = (s_ij^(k+1))_{n×c};
Step 5.2.3: Substitute s_ij^(k+1), v_j^(k), and S_j^(k) into the formula
Figure FDA0002111429540000042
to compute the membership matrix of the (k+1)-th iteration, U^(k+1) = (u_ij^(k+1))_{n×c};
Step 5.2.4: Substitute u_ij^(k+1) and s_ij^(k+1) into the formula
Figure FDA0002111429540000043
to compute the cluster center matrix of the (k+1)-th iteration, V^(k+1) = (v_j^(k+1))_{c×A};
Step 5.2.5: Substitute u_ij^(k+1), s_ij^(k+1), and v_j^(k+1) into the formula
Figure FDA0002111429540000044
to compute the set of fuzzy covariance matrices of the (k+1)-th iteration, S^(k+1) = (S_j^(k+1))_{n×n×c}; wherein γ_j is a Lagrange multiplier;
Step 5.2.6: If ||U^(k+1) - U^(k)|| < ε or the iteration count k > count, stop iterating to obtain the final membership matrix U and go to step 5.3; otherwise, set k = k + 1 and return to step 5.2.2;
Step 5.3: Monitor the sewage treatment process for abnormal conditions according to the membership matrix U: if the membership of the i-th sample
Figure FDA0002111429540000045
to the cluster center of the normal-condition samples is less than μ, the sewage treatment process is abnormal at the i-th sample; if the sewage treatment process is abnormal, go to step 6; if no abnormality occurs in the sewage treatment process, the KPLS and RWFCM based sewage treatment process monitoring method ends.
7. The KPLS and RWFCM based sewage treatment process monitoring method according to claim 6, wherein in step 5.2.1, determining the number of clusters c comprises:
computing the point density value D_i of each input sample x_i in the input data matrix X as
Figure FDA0002111429540000051
wherein
Figure FDA0002111429540000052
r_d is the effective radius of the neighborhood density,
Figure FDA0002111429540000053
computing the construction function S(j) as
Figure FDA0002111429540000054
and plotting the construction function S(j) and taking the number of distinct slopes of its graph as the number of clusters c.
8. The KPLS and RWFCM-based sewage treatment process monitoring method according to claim 6, wherein the step 6 comprises the steps of:
Step 6.1: The linear regression model between the membership matrix U and the variables in the input data matrix X is established as
Figure FDA0002111429540000055
wherein N_0 = (η_0j)_{1×c} is the constant term and ε_ij is the error term, for which it is assumed that E(ε_ij) = 0 and Var(ε_ij) = δ^2, with δ a constant; x_ia is the value of the a-th variable of the i-th input sample x_i in the input data matrix X;
Figure FDA0002111429540000056
is the variable contribution matrix; and η_aj is a regression coefficient, namely the contribution of the a-th variable to the j-th cluster;
Step 6.2: Solve for the variable contribution matrix N by the Lagrange multiplier method:
Step 6.2.1: Initialize the parameters: set the algorithm termination threshold τ and the maximum number of iterations T; initialize the iteration counter k = 1; and randomly initialize the variable contribution matrix
Figure FDA0002111429540000057
Step 6.2.2: Substitute η_aj^(k) into the formula
Figure FDA0002111429540000058
to compute N_0^(k+1) = (η_0j^(k+1))_{1×c} for the (k+1)-th iteration;
Step 6.2.3: Substitute η_0j^(k+1) into the formula
Figure FDA0002111429540000059
to compute the variable contribution matrix of the (k+1)-th iteration,
Figure FDA00021114295400000510
Step 6.2.4: If ||N^(k+1) - N^(k)|| < τ or the iteration count k > T, stop iterating and go to step 6.3; otherwise, set k = k + 1 and return to step 6.2.2;
Step 6.3: Identify the abnormal condition of the sewage treatment process according to the variable contribution matrix N: if the maximum among the contributions {η_a1, ..., η_ac} of the a-th variable to all clusters is η_ag, the a-th variable is a fault variable associated with the g-th abnormal condition; wherein g ∈ {1, ..., c-1}, and the c-th cluster is the cluster of normal-condition samples.
CN201910573311.XA 2019-06-28 2019-06-28 KPLS (kernel principal component system) and RWFCM (wireless remote control unit) -based sewage treatment process monitoring method Active CN110232256B (en)

Publications (2)

Publication Number Publication Date
CN110232256A CN110232256A (en) 2019-09-13
CN110232256B CN110232256B (en) 2022-11-15

Family

ID=67856526

Country Status (1)

Country Link
CN (1) CN110232256B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant