CN110232062B - Sewage treatment process monitoring method based on KPLS (kernel partial least squares) and FCM (fuzzy c-means)
- Publication number
- CN110232062B (application number CN201910572930.7A)
- Authority
- CN
- China
- Prior art keywords
- matrix
- sewage treatment
- treatment process
- sample
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Activated Sludge Processes (AREA)
Abstract
The invention relates to the technical field of sewage treatment quality monitoring and provides a sewage treatment process monitoring method based on KPLS and FCM. The method first collects data samples of the sewage treatment process under normal and abnormal working conditions, takes the data of the sewage treatment operation variables and of the effluent quality variables as the input and output data matrices respectively, and standardizes both matrices. It then constructs a KPLS model: the input samples are mapped to a high-dimensional feature space, a Gaussian kernel function is introduced to obtain the Gram matrix K, and the score matrix is solved. Next, the point density values of the input samples are calculated, a construction function is computed and plotted, and the number of clusters is determined from the plot. Finally, the score matrix is clustered with the FCM algorithm to obtain a membership matrix, and abnormal working conditions in the sewage treatment process are monitored according to the membership matrix. The invention can reduce the dimension of high-dimensional data, handle nonlinear data, determine the number of clusters accurately and conveniently, and improve the timeliness and accuracy of monitoring.
Description
Technical Field
The invention relates to the technical field of sewage treatment quality monitoring, in particular to a KPLS and FCM based sewage treatment process monitoring method.
Background
With the acceleration of urbanization and industrialization in China, society's demand for fresh water resources is increasing day by day, and the construction of urban domestic sewage treatment facilities needs to be accelerated to improve urban sewage treatment capacity. The activated sludge process is currently the main method for treating urban sewage. Sewage purification by activated sludge mainly comprises three processes: initial adsorption, microbial metabolism, and floc formation and sedimentation. In essence, the microbial community in the activated sludge adsorbs, decomposes and oxidizes the biodegradable organic matter in the sewage through a series of biochemical reactions, so that this matter is separated from the sewage and the sewage is purified.
At present, biochemical oxygen demand (BOD), chemical oxygen demand (COD), suspended solids (SS), ammonia nitrogen (NH) and total phosphorus (TP) are generally adopted as sewage discharge indexes. In the sewage treatment process, parameters such as influent flow, influent composition, pollutant concentration and weather changes are passively accepted, and the life activities of the microorganisms are influenced by factors such as dissolved oxygen concentration, microbial population and the pH value of the sewage, so it is very difficult to keep an urban sewage treatment plant running stably over the long term. A fault in a sewage treatment plant easily causes the effluent quality to fall below the standard, increases operating costs and causes environmental pollution. If an abnormal working condition of the sewage treatment process is not detected in time, a correct judgment cannot be made and no effective adjustment or correction can be taken in time, which can cause irreversible losses. It is therefore especially important that operators monitor the sewage treatment process, judge abnormal working conditions accurately and take timely and correct measures, so as to ensure safe, stable and smooth operation and guarantee the effluent quality.
In recent years, existing sewage treatment process monitoring methods have adopted data mining, mainly because large amounts of data are available and urgently need to be converted into useful information and knowledge. Since sewage treatment process data carry no class labels and the occurrence of sewage treatment faults has little correlation with time, classification and sequential pattern mining are not suitable. Cluster analysis is an unsupervised classification technique that works well on data with little prior knowledge, so it is widely applied to sewage treatment process monitoring.
The fuzzy c-means (FCM) clustering algorithm is one of the classical clustering algorithms. FCM gives the degree of uncertainty with which each sample belongs to each category, and this uncertainty description is more consistent with the objective world. However, sewage treatment process data are high-dimensional and nonlinear, and the traditional FCM algorithm cannot handle such data, which makes process monitoring more difficult, reduces the reliability of fault detection, greatly affects the effluent quality, and can cause economic losses or even accidents. In addition, the number of clusters of the FCM algorithm has to be preset manually, which is a great limitation in practical applications.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides the KPLS and FCM-based sewage treatment process monitoring method, which can reduce the dimension of high-dimensional data, process nonlinear data, accurately and conveniently determine the clustering number and improve the timeliness and the accuracy of sewage treatment process monitoring.
The technical scheme of the invention is as follows:
A KPLS and FCM based sewage treatment process monitoring method is characterized by comprising the following steps:
Step 1: collect data samples of the sewage treatment process under normal working conditions and data samples containing abnormal working conditions, where each data sample comprises $m_1$ sewage treatment operation variables and $m_2$ effluent quality variables; place the normal-condition samples before the samples containing abnormal conditions in time order to form a mixed data sample set; take the data of the $m_1$ sewage treatment operation variables in the mixed data sample set as the input data matrix X, and the data of the $m_2$ effluent quality variables as the output data matrix Y;
Step 2: preprocess the input data matrix X and the output data matrix Y; the preprocessing comprises calculating the mean and standard deviation of each variable in X and Y and normalizing X and Y to zero mean and unit standard deviation;
Step 3: construct a KPLS model for monitoring the sewage treatment process: map each input sample x in the input data matrix X to a high-dimensional feature space F, $x \to \Phi(x) \in F$, introduce a Gaussian kernel function to obtain the Gram matrix K of X, and center K;
Step 4: determine the number of principal components by cross-validation and solve the score matrix T;
Step 5: calculate the point density value $D_i$ of each input sample $x_i$ in the input data matrix X, calculate the construction function S(j), plot S(j), and determine the number of clusters c from the number of distinct slopes in the plot;
Step 6: with c as the number of clusters, cluster the score matrix T with the FCM algorithm to obtain the membership matrix U, and monitor abnormal working conditions in the sewage treatment process according to U: if the membership degree of the sample at a given time to the cluster center of the normal-condition samples is less than μ, the sewage treatment process is abnormal at that sample.
The sewage treatment process adopts the activated sludge process. After primary treatment, the raw sewage enters the biochemical tank section for biological denitrification; one part is then returned through internal recirculation for further denitrification, and the other part enters the secondary sedimentation tank for sedimentation. The biochemical tank section comprises biochemical tanks $l \in \{1,2,3,4,5\}$, where tanks $l_1 \in \{1,2\}$ form the anoxic zone, which mainly carries out the denitrification reaction, and tanks $l_2 \in \{3,4,5\}$ form the aerobic zone, which mainly carries out the nitrification reaction. In step 1, the $m_1$ sewage treatment operation variables comprise the influent flow, the influent ammonia nitrogen, the active heterotrophic biomass in tanks $l \in \{1,2,3,4,5\}$, the readily biodegradable organic matter in tanks $l \in \{1,2,3,4,5\}$, the nitrate nitrogen in tanks $l_1 \in \{1,2\}$, the active autotrophic biomass in tanks $l_2 \in \{3,4,5\}$, and the ammonia nitrogen in tanks $l_2 \in \{3,4,5\}$; the $m_2$ effluent quality variables comprise the biochemical oxygen demand, chemical oxygen demand, suspended solids and ammonia nitrogen of the effluent.
Step 3 comprises the following steps:
Step 3.1: the KPLS model for monitoring the sewage treatment process is constructed as
$$\Phi = T P_1' + \Phi_r$$
$$Y = T Q' + Y_r$$
Step 3.2: each input sample x in the input data matrix X is mapped to a high-dimensional feature space F, $x \to \Phi(x) \in F$; a Gaussian kernel function is introduced to obtain the Gram matrix K of X, K is centered, and the KPLS model is converted into
$$K = T P_2' + E$$
$$Y = T Q' + Y_r$$
where the element in row i and column j of the Gram matrix K is $K_{ij} = k(x_i, x_j) = \langle \Phi(x_i), \Phi(x_j) \rangle$; $x_i$ and $x_j$ are the i-th and j-th input samples in X; $k(x_i, x_j)$ is the Gaussian kernel function; $i, j \in \{1, 2, \ldots, n\}$, and n is the number of samples in X. T is the score matrix of the high-dimensional data $\Phi = \{\Phi(x_i) \mid i \in \{1, 2, \ldots, n\}\}$, $T = [t_1, \ldots, t_A]$, and A is the number of principal components. $P_1 = [p_{11}, \ldots, p_{1A}]$, $P_2 = [p_{21}, \ldots, p_{2A}]$ and $Q = [q_1, \ldots, q_A]$ are the loading matrices of the matrix Φ, the Gram matrix K and the output data matrix Y, respectively, and $\Phi_r$, E and $Y_r$ are the corresponding modeling residuals.
In step 4, the number of principal components A is determined by cross-validation and the score matrix T is solved as follows:
Step 4.1: let u be any column of the output data matrix Y;
Step 4.2: calculate the score vector: $t = K u$;
Step 4.3: normalize the score vector t: $\|t\| \to 1$;
Step 4.4: regress each column of the output data matrix Y on the score vector t: $q = Y' t$;
Step 4.5: calculate a new score vector for the output data matrix Y: $u = Y q$;
Step 4.6: normalize the vector u: $\|u\| \to 1$;
Step 4.7: check whether u has converged: if yes, go to step 4.8; if not, return to step 4.2;
Step 4.8: update the matrices: $K = (I - t t') K (I - t t')$ and $Y = Y - t q'$, where I is the identity matrix; then repeat steps 4.2 to 4.7 to calculate the next score vector, until A score vectors have been extracted.
In step 3, the centered Gram matrix is $\tilde{K} = \left(E_n - \frac{1}{n} 1_n 1_n'\right) K \left(E_n - \frac{1}{n} 1_n 1_n'\right)$, where $E_n$ is the n × n identity matrix, $1_n$ is the n-dimensional column vector of all ones, and $1_n'$ is the transpose of $1_n$.
Step 5 comprises the following steps:
Step 5.1: calculate the point density value $D_i$ of each input sample $x_i$ in the input data matrix X, where $r_d$ is the effective radius of the neighborhood density (the defining equation is not reproduced in this text);
Step 5.2: calculate the construction function S(j) from the point density values (the defining equation is not reproduced in this text);
Step 5.3: plot the construction function S(j) and take the number of distinct slopes in the plot as the number of clusters c.
Step 6 comprises the following steps:
Step 6.1: with c as the number of clusters, cluster the score matrix T with the FCM algorithm, constructing the FCM objective function
$$J(U, V) = \sum_{i=1}^{n} \sum_{j=1}^{c} u_{ij}^{m} d_{ij}^{2}$$
where $\tilde{t}_i$ is the i-th row vector of the score matrix T, i.e. the new A-dimensional sample obtained by reducing the $m_1$-dimensional input sample $x_i$; $u_{ij}$ is the membership degree of sample $\tilde{t}_i$ to the j-th cluster center $v_j$; the membership matrix is $U = (u_{ij})_{n \times c}$ and the cluster center matrix is $V = (v_j)_{c \times A}$; $m \in [1, +\infty)$ is the fuzzy index; and $d_{ij} = \|\tilde{t}_i - v_j\|$ is the Euclidean distance between sample $\tilde{t}_i$ and the j-th cluster center $v_j$. The c cluster centers obtained by clustering T comprise one cluster center of the normal-condition samples and c−1 cluster centers of abnormal-condition samples;
Step 6.2: solve the membership matrix U:
Step 6.2.1: initialize the FCM parameters: set the fuzzy index m, the termination tolerance ε and the maximum number of iterations count; set the iteration counter k = 1; randomly initialize the membership matrix $U^{(k)} = (u_{ij}^{(k)})_{n \times c}$ and the cluster center matrix $V^{(k)} = (v_j^{(k)})_{c \times A}$;
Step 6.2.2: substitute $v_j^{(k)}$ into
$$u_{ij}^{(k+1)} = \left[ \sum_{r=1}^{c} \left( \frac{\|\tilde{t}_i - v_j^{(k)}\|}{\|\tilde{t}_i - v_r^{(k)}\|} \right)^{\frac{2}{m-1}} \right]^{-1}$$
to calculate the membership matrix of iteration k+1, $U^{(k+1)} = (u_{ij}^{(k+1)})_{n \times c}$;
Step 6.2.3: substitute $u_{ij}^{(k+1)}$ into
$$v_j^{(k+1)} = \frac{\sum_{i=1}^{n} \left(u_{ij}^{(k+1)}\right)^{m} \tilde{t}_i}{\sum_{i=1}^{n} \left(u_{ij}^{(k+1)}\right)^{m}}$$
to calculate the cluster center matrix of iteration k+1, $V^{(k+1)} = (v_j^{(k+1)})_{c \times A}$;
Step 6.2.4: if $\|U^{(k+1)} - U^{(k)}\| < \varepsilon$ or the iteration counter k > count, stop iterating to obtain the final membership matrix U and go to step 6.3; otherwise, set k = k + 1 and return to step 6.2.2;
Step 6.3: monitor abnormal working conditions in the sewage treatment process according to the membership matrix U: if the membership degree of the i-th sample $\tilde{t}_i$ to the cluster center of the normal-condition samples is less than μ, the sewage treatment process is abnormal at the i-th sample.
The invention has the beneficial effects that:
(1) The invention combines the KPLS algorithm with the FCM algorithm and constructs a KPLS model and an FCM model to describe the normal production process. No prior knowledge of abnormal working conditions in the sewage treatment process is needed; only normal-condition data are used as labeled data. Following a data-driven approach, the standardized process variables are first projected into a high-dimensional feature space with a Gaussian kernel function, and the KPLS model for monitoring the sewage treatment process is established in that space. After the number of principal components is determined by cross-validation, the high-dimensional input data are reduced in dimension to obtain the score matrix T, which serves as the input of the FCM cluster analysis. This achieves dimension reduction and at the same time overcomes the inability of FCM to handle nonlinear data.
(2) The invention calculates the construction function from the density function and determines the number of clusters from the construction function, so the number of clusters is determined accurately and conveniently, which overcomes the limitation that the number of clusters of the FCM algorithm has to be preset manually.
(3) The invention clusters the score matrix T with the FCM algorithm to obtain the membership matrix U and monitors abnormal working conditions in the sewage treatment process according to U. The time at which an abnormal condition occurs is detected from the sample memberships, and the number of abnormal conditions is identified at the same time. Monitoring is timely and accurate, which helps operators monitor the sewage treatment process, judge fluctuations of the effluent quality accurately and take corrective measures in time, ensuring stable, efficient and safe operation of the sewage plant and guaranteeing the effluent quality.
Drawings
FIG. 1 is a flow chart of the KPLS and FCM based sewage treatment process monitoring method of the present invention;
FIG. 2 is a schematic diagram of the construction function in an embodiment of the present invention;
FIG. 3 is a schematic diagram of the membership degree of the monitored samples to the cluster center of the normal-condition samples in an embodiment of the present invention;
FIG. 4 is a schematic diagram of the membership degree of the monitored samples to the cluster center of the abnormal-condition samples in an embodiment of the present invention;
FIG. 5 is a schematic diagram of the clustering result of the score matrix T in an embodiment of the present invention.
Detailed Description
The invention will be further described with reference to the accompanying drawings and specific embodiments.
Fig. 1 shows a flow chart of the KPLS and FCM-based sewage treatment process monitoring method according to the present invention. The KPLS and FCM-based sewage treatment process monitoring method is characterized by comprising the following steps of:
Step 1: collect data samples of the sewage treatment process under normal working conditions and data samples containing abnormal working conditions, where each data sample comprises $m_1$ sewage treatment operation variables and $m_2$ effluent quality variables; place the normal-condition samples before the samples containing abnormal conditions in time order to form a mixed data sample set; take the data of the $m_1$ sewage treatment operation variables in the mixed data sample set as the input data matrix X, and the data of the $m_2$ effluent quality variables as the output data matrix Y.
In this embodiment, the sewage treatment process adopts the activated sludge process. According to the degree of treatment, the activated sludge process flow is generally divided into primary, secondary and tertiary treatment. After primary treatment, the raw sewage enters the biochemical tanks for biological denitrification; one part is returned through internal recirculation for further denitrification, and the other part enters the secondary sedimentation tank for sedimentation. The biochemical tanks are the most important place where the biochemical reactions take place and the sewage is purified. The biochemical tank section comprises biochemical tanks $l \in \{1,2,3,4,5\}$, where tanks $l_1 \in \{1,2\}$ form the anoxic zone, which mainly carries out the denitrification reaction, and tanks $l_2 \in \{3,4,5\}$ form the aerobic zone, which mainly carries out the nitrification reaction. In step 1, the $m_1$ sewage treatment operation variables comprise the influent flow, the influent ammonia nitrogen, the active heterotrophic biomass in tanks $l \in \{1,2,3,4,5\}$, the readily biodegradable organic matter in tanks $l \in \{1,2,3,4,5\}$, the nitrate nitrogen in tanks $l_1 \in \{1,2\}$, the active autotrophic biomass in tanks $l_2 \in \{3,4,5\}$, and the ammonia nitrogen in tanks $l_2 \in \{3,4,5\}$; the $m_2$ effluent quality variables comprise the biochemical oxygen demand, chemical oxygen demand, suspended solids and ammonia nitrogen of the effluent.
The abnormal working conditions include sludge bulking, foaming, scum, toxic shock, rainstorm weather and the like, as is well known to those skilled in the art. In this embodiment, 200 sewage treatment process data samples under normal working conditions and 800 data samples containing the abnormal working condition of rainstorm weather are collected to form a mixed data sample set of 1000 samples. The data of the $m_1 = 20$ sewage treatment operation variables in the mixed data sample set are taken as the input data matrix $X \in \mathbb{R}^{1000 \times 20}$, and the data of the $m_2 = 4$ effluent quality variables are taken as the output data matrix $Y \in \mathbb{R}^{1000 \times 4}$.
Step 2: preprocessing an input data matrix X and an output data matrix Y; the preprocessing comprises the steps of calculating the mean value and the standard deviation of each variable in the input data matrix X and the output data matrix Y, and normalizing the input data matrix X and the output data matrix Y into data with zero mean value and unit standard deviation.
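The standardization in step 2 can be sketched as follows. This is a minimal, dependency-free Python illustration, not the patented implementation; the function name `standardize` and the use of the sample standard deviation (n − 1 denominator) are assumptions, since the text does not say which estimator is used.

```python
import math

def standardize(matrix):
    """Scale every column of a row-major matrix to zero mean and unit standard deviation."""
    n = len(matrix)
    columns = list(zip(*matrix))
    means = [sum(col) / n for col in columns]
    # Assumption: sample standard deviation (n - 1 denominator).
    stds = [math.sqrt(sum((v - mu) ** 2 for v in col) / (n - 1))
            for col, mu in zip(columns, means)]
    # A constant column has zero spread; leave it centered instead of dividing by zero.
    stds = [sd if sd > 0 else 1.0 for sd in stds]
    scaled = [[(v - mu) / sd for v, mu, sd in zip(row, means, stds)]
              for row in matrix]
    return scaled, means, stds

# Example with three samples and two variables.
Z, means, stds = standardize([[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]])
```

The same function would be applied separately to the input matrix X and the output matrix Y.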
Step 3: construct a KPLS model for monitoring the sewage treatment process: map each input sample x in the input data matrix X to a high-dimensional feature space F, $x \to \Phi(x) \in F$, introduce a Gaussian kernel function to obtain the Gram matrix K of X, and center K.
Step 3 comprises the following steps:
Step 3.1: the KPLS model for monitoring the sewage treatment process is constructed as
$$\Phi = T P_1' + \Phi_r$$
$$Y = T Q' + Y_r$$
Step 3.2: each input sample x in the input data matrix X is mapped to a high-dimensional feature space F, $x \to \Phi(x) \in F$; a Gaussian kernel function is introduced to obtain the Gram matrix K of X, K is centered, and the KPLS model is converted into
$$K = T P_2' + E$$
$$Y = T Q' + Y_r$$
where the element in row i and column j of the Gram matrix K is $K_{ij} = k(x_i, x_j) = \langle \Phi(x_i), \Phi(x_j) \rangle$; $x_i$ and $x_j$ are the i-th and j-th input samples in X; $k(x_i, x_j)$ is the Gaussian kernel function; $i, j \in \{1, 2, \ldots, n\}$, and n is the number of samples in X. T is the score matrix of the high-dimensional data $\Phi = \{\Phi(x_i) \mid i \in \{1, 2, \ldots, n\}\}$, $T = [t_1, \ldots, t_A]$, and A is the number of principal components. $P_1 = [p_{11}, \ldots, p_{1A}]$, $P_2 = [p_{21}, \ldots, p_{2A}]$ and $Q = [q_1, \ldots, q_A]$ are the loading matrices of the matrix Φ, the Gram matrix K and the output data matrix Y, respectively, and $\Phi_r$, E and $Y_r$ are the corresponding modeling residuals.
In step 3, the centered Gram matrix is $\tilde{K} = \left(E_n - \frac{1}{n} 1_n 1_n'\right) K \left(E_n - \frac{1}{n} 1_n 1_n'\right)$, where $E_n$ is the n × n identity matrix, $1_n$ is the n-dimensional column vector of all ones, and $1_n'$ is the transpose of $1_n$.
The KPLS model is constructed with the nonlinear iterative partial least squares algorithm; KPLS stands for kernel projection to latent structures. In this embodiment, the Gaussian kernel function is $k(x_i, x_j) = \exp\left(-\|x_i - x_j\|^2 / c_1\right)$, where $c_1$ is the Gaussian kernel width parameter. The value of $c_1$ is determined empirically as $5 m_1$, i.e. $c_1 = 5 m_1 = 100$.
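The kernel step can be illustrated with a short Python sketch. It assumes the kernel form exp(−‖x_i − x_j‖²/c_1) and centers K by subtracting row means and column means and adding back the grand mean, which is algebraically the same as multiplying K on both sides by (E_n − 1_n 1_n'/n). The function names `gaussian_gram` and `center_gram` are illustrative only.

```python
import math

def gaussian_gram(X, width):
    """Gram matrix with K[i][j] = exp(-||x_i - x_j||^2 / width)."""
    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [[math.exp(-sqdist(a, b) / width) for b in X] for a in X]

def center_gram(K):
    """Double-center K: subtract row and column means, add back the grand mean."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - col_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]

# Toy data with m1 = 2 variables, so the empirical width is 5 * m1 = 10.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = gaussian_gram(X, width=5 * len(X[0]))
K_centered = center_gram(K)
```

After centering, every row and column of the Gram matrix sums to zero, which corresponds to mean-centered data in the feature space F.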
Step 4: determine the number of principal components by cross-validation and solve the score matrix T.
In step 4, the number of principal components A is determined by cross-validation and the score matrix T is solved as follows:
Step 4.1: let u be any column of the output data matrix Y;
Step 4.2: calculate the score vector: $t = K u$;
Step 4.3: normalize the score vector t: $\|t\| \to 1$;
Step 4.4: regress each column of the output data matrix Y on the score vector t: $q = Y' t$;
Step 4.5: calculate a new score vector for the output data matrix Y: $u = Y q$;
Step 4.6: normalize the vector u: $\|u\| \to 1$;
Step 4.7: check whether u has converged: if yes, go to step 4.8; if not, return to step 4.2;
Step 4.8: update the matrices: $K = (I - t t') K (I - t t')$ and $Y = Y - t q'$, where I is the identity matrix; then repeat steps 4.2 to 4.7 to calculate the next score vector, until A score vectors have been extracted.
In this embodiment, cross-validation gives A = 3 principal components.
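Steps 4.1 to 4.8 can be sketched in plain Python as below. The helper names, the convergence tolerance `tol`, the iteration cap `max_iter` and the choice of the first column of Y as the starting vector are assumptions made for illustration; the cross-validation that selects A is not shown, and K is expected to be centered already.

```python
import math

def matvec(M, v):
    """Multiply a row-major matrix by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def unit(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

def kpls_scores(K, Y, n_components, tol=1e-10, max_iter=500):
    """Return the n x A score matrix T from a centered Gram matrix K and outputs Y."""
    n, p = len(Y), len(Y[0])
    K = [row[:] for row in K]          # work on copies; both matrices are deflated
    Y = [row[:] for row in Y]
    columns = []
    for _ in range(n_components):
        u = [row[0] for row in Y]                                   # step 4.1
        for _ in range(max_iter):
            t = unit(matvec(K, u))                                  # steps 4.2-4.3
            q = [sum(Y[i][j] * t[i] for i in range(n)) for j in range(p)]  # step 4.4
            u_new = unit(matvec(Y, q))                              # steps 4.5-4.6
            done = max(abs(a - b) for a, b in zip(u_new, u)) < tol  # step 4.7
            u = u_new
            if done:
                break
        # Step 4.8: K <- (I - tt')K(I - tt') and Y <- Y - tq'
        Kt = matvec(K, t)
        tK = [sum(t[i] * K[i][j] for i in range(n)) for j in range(n)]
        tKt = sum(a * b for a, b in zip(t, Kt))
        K = [[K[i][j] - t[i] * tK[j] - Kt[i] * t[j] + t[i] * tKt * t[j]
              for j in range(n)] for i in range(n)]
        Y = [[Y[i][j] - t[i] * q[j] for j in range(p)] for i in range(n)]
        columns.append(t)
    return [list(row) for row in zip(*columns)]   # rows are samples, columns t_1..t_A
```

Because K is deflated with the projector (I − tt'), each new score vector is orthogonal to the ones already extracted. Each row of the returned matrix is the reduced A-dimensional sample that is later passed to the clustering step.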
Step 5: calculate the point density value $D_i$ of each input sample $x_i$ in the input data matrix X, calculate the construction function S(j), plot S(j), and determine the number of clusters c from the number of distinct slopes in the plot.
Step 5 comprises the following steps:
Step 5.1: calculate the point density value $D_i$ of each input sample $x_i$ in the input data matrix X, where $r_d$ is the effective radius of the neighborhood density (the defining equation is not reproduced in this text);
Step 5.2: calculate the construction function S(j) from the point density values (the defining equation is not reproduced in this text);
Step 5.3: plot the construction function S(j) and take the number of distinct slopes in the plot as the number of clusters c.
As shown in FIG. 2, the slope of the construction function S(j) reflects the point density values of the sample data; in practical terms, S(j) has the same slope over data of the same class. FIG. 2 shows distinct transitions around samples 500 and 700, so the plot can be roughly divided into two parts, (0,500) ∪ (700,1000) and (500,700). Of the 1000 test samples, the first 200 are normal-condition data and the last 800 are data containing abnormal conditions. From this analysis, the mixed data sample set is judged to fall into two classes, class 1 being the normal-condition samples and class 2 the abnormal-condition samples, so the number of clusters is determined to be c = 2.
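The equations for $D_i$ and S(j) are not available in this text, so the following Python sketch is hypothetical. It assumes a Gaussian neighborhood density of the kind used in subtractive clustering, with $r_d$ entering as $(r_d/2)^2$, and it assumes that S(j) is the running sum of the densities in sample order, which would make the slope of S(j) equal to the local density as the paragraph above describes. Neither assumption is confirmed by the source, and the function names are illustrative.

```python
import math
from itertools import accumulate

def point_densities(X, radius):
    """Assumed density: D_i = sum_j exp(-||x_i - x_j||^2 / (radius / 2)^2)."""
    scale = (radius / 2.0) ** 2
    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [sum(math.exp(-sqdist(a, b) / scale) for b in X) for a in X]

def construction_function(densities):
    """Assumed construction function: S(j) = D_1 + ... + D_j in sample order."""
    return list(accumulate(densities))

# Three tightly packed samples followed by two isolated ones.
X = [[0.0], [0.1], [0.2], [5.0], [9.0]]
D = point_densities(X, radius=1.0)
S = construction_function(D)
```

Under these assumptions, a run of samples from the same operating condition has similar densities and therefore a nearly constant slope in S(j); a change of slope marks a change of condition, and counting the distinct slopes by eye gives c.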
Step 6: with c as the number of clusters, cluster the score matrix T with the FCM algorithm to obtain the membership matrix U, and monitor abnormal working conditions in the sewage treatment process according to U: if the membership degree of the sample at a given time to the cluster center of the normal-condition samples is less than μ, the sewage treatment process is abnormal at that sample.
Step 6 comprises the following steps:
Step 6.1: with c as the number of clusters, cluster the score matrix T with the FCM algorithm, constructing the FCM objective function
$$J(U, V) = \sum_{i=1}^{n} \sum_{j=1}^{c} u_{ij}^{m} d_{ij}^{2}$$
where $\tilde{t}_i$ is the i-th row vector of the score matrix T, i.e. the new A-dimensional sample obtained by reducing the $m_1$-dimensional input sample $x_i$; $u_{ij}$ is the membership degree of sample $\tilde{t}_i$ to the j-th cluster center $v_j$; the membership matrix is $U = (u_{ij})_{n \times c}$ and the cluster center matrix is $V = (v_j)_{c \times A}$; $m \in [1, +\infty)$ is the fuzzy index; and $d_{ij} = \|\tilde{t}_i - v_j\|$ is the Euclidean distance between sample $\tilde{t}_i$ and the j-th cluster center $v_j$. The c cluster centers obtained by clustering T comprise one cluster center of the normal-condition samples and c−1 cluster centers of abnormal-condition samples;
Step 6.2: solve the membership matrix U:
Step 6.2.1: initialize the FCM parameters: set the fuzzy index m, the termination tolerance ε and the maximum number of iterations count; set the iteration counter k = 1; randomly initialize the membership matrix $U^{(k)} = (u_{ij}^{(k)})_{n \times c}$ and the cluster center matrix $V^{(k)} = (v_j^{(k)})_{c \times A}$;
Step 6.2.2: substitute $v_j^{(k)}$ into
$$u_{ij}^{(k+1)} = \left[ \sum_{r=1}^{c} \left( \frac{\|\tilde{t}_i - v_j^{(k)}\|}{\|\tilde{t}_i - v_r^{(k)}\|} \right)^{\frac{2}{m-1}} \right]^{-1}$$
to calculate the membership matrix of iteration k+1, $U^{(k+1)} = (u_{ij}^{(k+1)})_{n \times c}$;
Step 6.2.3: substitute $u_{ij}^{(k+1)}$ into
$$v_j^{(k+1)} = \frac{\sum_{i=1}^{n} \left(u_{ij}^{(k+1)}\right)^{m} \tilde{t}_i}{\sum_{i=1}^{n} \left(u_{ij}^{(k+1)}\right)^{m}}$$
to calculate the cluster center matrix of iteration k+1, $V^{(k+1)} = (v_j^{(k+1)})_{c \times A}$;
Step 6.2.4: if $\|U^{(k+1)} - U^{(k)}\| < \varepsilon$ or the iteration counter k > count, stop iterating to obtain the final membership matrix U and go to step 6.3; otherwise, set k = k + 1 and return to step 6.2.2;
Step 6.3: monitor abnormal working conditions in the sewage treatment process according to the membership matrix U: if the membership degree of the i-th sample $\tilde{t}_i$ to the cluster center of the normal-condition samples is less than μ, the sewage treatment process is abnormal at the i-th sample.
Wherein the fuzzy index m influences the fuzzy degree of the membership degree matrix. In this embodiment, the effect of the algorithm can be optimized by setting the fuzzy index m to 2.4.
In this embodiment, the membership degrees $u_{i1}$ and $u_{i2}$ of the monitored sample $\tilde{t}_i$ (the i-th row of the score matrix T) to the cluster center $v_1$ of the normal-condition samples and to the cluster center $v_2$ of the abnormal-condition samples are shown in FIG. 3 and FIG. 4, respectively, and the clustering result of the score matrix T is shown in FIG. 5. Set μ = 0.5. As can be seen from FIG. 3 and FIG. 4, for the 500th to 700th samples the membership degree to the cluster center of the normal-condition samples is less than 0.5, so the sewage treatment process is judged to be abnormal at these samples. The monitoring method can therefore detect the occurrence of abnormal working conditions in the sewage treatment process in time.
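The thresholding rule of step 6.3 can be sketched as follows. The text does not say how the normal-condition cluster is identified among the c clusters; the sketch assumes it is the cluster with the largest average membership over the leading samples that are known to be normal (the first 200 in this embodiment). Both function names are illustrative.

```python
def normal_cluster_index(U, n_normal):
    """Assumed rule: the normal cluster has the largest mean membership
    over the first n_normal samples, which are known to be normal."""
    c = len(U[0])
    means = [sum(row[j] for row in U[:n_normal]) / n_normal for j in range(c)]
    return means.index(max(means))

def flag_abnormal(U, normal_cluster, mu=0.5):
    """Return the indices of samples whose membership to the normal cluster is below mu."""
    return [i for i, row in enumerate(U) if row[normal_cluster] < mu]

# Toy membership matrix: the first two samples are known to be normal.
U = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4], [0.2, 0.8]]
normal = normal_cluster_index(U, n_normal=2)
abnormal_samples = flag_abnormal(U, normal, mu=0.5)
```

In the toy matrix, the samples with indices 2 and 4 fall below the threshold and would be reported as abnormal.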
It is to be understood that the above-described embodiments are only a few embodiments of the present invention, and not all embodiments. The above examples are only for explaining the present invention and do not constitute a limitation to the scope of protection of the present invention. All other embodiments, which can be derived by those skilled in the art from the above-described embodiments without any creative effort, namely all modifications, equivalents, improvements and the like made within the spirit and principle of the present application, fall within the protection scope of the present invention claimed.
Claims (6)
1. A KPLS and FCM based sewage treatment process monitoring method is characterized by comprising the following steps:
step 1: separately collecting sewage treatment process data samples under normal working conditions and sewage treatment process data samples containing abnormal working conditions, each data sample comprising $m_1$ sewage treatment operating variables and $m_2$ effluent quality variables; placing the normal-condition data samples before the abnormal-condition data samples in time order to form a mixed data sample set; taking the data of the $m_1$ sewage treatment operating variables in the mixed data sample set as the input data matrix X, and the data of the $m_2$ effluent quality variables in the mixed data sample set as the output data matrix Y;
step 2: preprocessing an input data matrix X and an output data matrix Y; the preprocessing comprises the steps of calculating the mean value and the standard deviation of each variable in an input data matrix X and an output data matrix Y, and normalizing the input data matrix X and the output data matrix Y into data with zero mean value and unit standard deviation;
step 3: constructing a KPLS model for monitoring the sewage treatment process: mapping each input sample x in the input data matrix X to a high-dimensional feature space F, $x \to \Phi(x) \in F$, introducing a Gaussian kernel function to obtain the Gram matrix K of the input data matrix X, and centering the Gram matrix K;
step 4: determining the number of principal components by cross-validation, and solving for the score matrix T;
step 5: computing the point density value $D_i$ of each input sample $x_i$ in the input data matrix X, computing the structure function S(j), plotting S(j), and determining the cluster number c from the number of slopes in the plot of S(j);
the step 5 comprises the following steps:
step 5.1: computing the point density value $D_i$ of each input sample $x_i$ in the input data matrix X [equation image not extracted], where $r_d$ is the effective radius of the neighborhood density;
step 5.3: plotting the structure function S(j), and taking the number of slopes in the plot of S(j) as the cluster number c;
step 6: taking c as the cluster number, clustering the score matrix T with the FCM algorithm to obtain the membership matrix U, and monitoring the sewage treatment process for abnormal working conditions according to the membership matrix U: if the membership degree of the sample at a given moment to the clustering center of the normal working condition samples is less than μ, the sewage treatment process is abnormal at that sample.
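The preprocessing of step 2 can be illustrated with a minimal sketch in plain Python. The function name `zscore_columns`, the toy data, the use of the sample standard deviation (n - 1 denominator) and the guard for constant columns are all assumptions of this sketch; the claim only requires zero mean and unit standard deviation per variable.

```python
import statistics

def zscore_columns(rows):
    """Scale every column of a row-major matrix to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    # Sample standard deviation (n - 1 denominator) is an assumption of this sketch;
    # a constant column (standard deviation 0) is left unscaled.
    stds = [statistics.stdev(col) or 1.0 for col in columns]
    scaled = [[(value - mu) / sd for value, mu, sd in zip(row, means, stds)]
              for row in rows]
    return scaled, means, stds

# Hypothetical toy data: 4 samples, 2 variables.
X = [[1.0, 10.0], [2.0, 20.0], [3.0, 60.0], [4.0, 30.0]]
X_scaled, X_mean, X_std = zscore_columns(X)
```

The means and standard deviations are returned alongside the scaled data so that later samples could be scaled with the same statistics.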
2. The KPLS and FCM based sewage treatment process monitoring method of claim 1, wherein the sewage treatment process is an activated sludge process: the raw sewage undergoes primary treatment and then enters the biochemical tanks; after biological denitrification, one part enters the secondary sedimentation tank through internal circulation reflux to be settled again; the biochemical tank section comprises biochemical tanks $l \in \{1,2,3,4,5\}$, wherein biochemical tanks $l_1 \in \{1,2\}$ form an anoxic zone that mainly carries out the denitrification reaction, and biochemical tanks $l_2 \in \{3,4,5\}$ form an aerobic zone that mainly carries out the nitrification reaction; in step 1, the $m_1$ sewage treatment operating variables comprise the influent flow, the influent ammonia nitrogen, the active heterotrophic biomass in biochemical tanks $l \in \{1,2,3,4,5\}$, the readily biodegradable organic matter in biochemical tanks $l \in \{1,2,3,4,5\}$, the nitrate nitrogen in biochemical tanks $l_1 \in \{1,2\}$, the active autotrophic biomass in biochemical tanks $l_2 \in \{3,4,5\}$, and the ammonia nitrogen in biochemical tanks $l_2 \in \{3,4,5\}$; the $m_2$ effluent quality variables comprise the biochemical oxygen demand, chemical oxygen demand, suspended solids and ammonia nitrogen of the effluent.
3. A KPLS and FCM based sewage treatment process monitoring method according to claim 1 or 2, wherein said step 3 comprises the steps of:
step 3.1: the KPLS model for monitoring the sewage treatment process is constructed as
$\Phi = T P_1' + \Phi_r$
$Y = T Q' + Y_r$
step 3.2: mapping each input sample x in the input data matrix X to a high-dimensional feature space F, $x \to \Phi(x) \in F$, introducing a Gaussian kernel function to obtain the Gram matrix K of the input data matrix X, and centering the Gram matrix K, so that the KPLS model is converted into
$K = T P_2' + E$
$Y = T Q' + Y_r$
where the element in the i-th row and j-th column of the Gram matrix K is $K_{ij} = k(x_i, x_j) = \langle \Phi(x_i), \Phi(x_j) \rangle$; $x_i$ and $x_j$ are the i-th and j-th input samples in the input data matrix X; $k(x_i, x_j)$ is the Gaussian kernel function; $i, j \in \{1, 2, \ldots, n\}$, and n is the number of samples in the input data matrix X; T is the score matrix of the high-dimensional data $\Phi = \{\Phi(x_i) \mid i \in \{1, 2, \ldots, n\}\}$, $T = [t_1, \ldots, t_A]$, and A is the number of principal components; $P_1 = [p_{11}, \ldots, p_{1A}]$, $P_2 = [p_{21}, \ldots, p_{2A}]$ and $Q = [q_1, \ldots, q_A]$ are the loading matrices of the matrix Φ, the Gram matrix K and the output data matrix Y, respectively; $\Phi_r$, E and $Y_r$ are the modeling residuals of the matrix Φ, the Gram matrix K and the output data matrix Y, respectively.
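A minimal plain-Python sketch of step 3.2 follows. The kernel parameterisation $k(x_i, x_j) = \exp(-\|x_i - x_j\|^2 / \text{width})$, the value of `width`, the toy samples and the function names are assumptions of this sketch, since the claim does not fix them. The centering shown is the usual feature-space centering $K_c = (I - \tfrac{1}{n}\mathbf{1}\mathbf{1}')K(I - \tfrac{1}{n}\mathbf{1}\mathbf{1}')$, which is assumed to be what "centering the Gram matrix" refers to.

```python
import math

def gaussian_gram(samples, width):
    """Gram matrix with K[i][j] = exp(-||x_i - x_j||^2 / width) (assumed kernel form)."""
    n = len(samples)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(samples[i], samples[j]))
            K[i][j] = math.exp(-sq_dist / width)
    return K

def center_gram(K):
    """Centre K in feature space: K_c = (I - 11'/n) K (I - 11'/n)."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_means) / n
    return [[K[i][j] - row_means[i] - col_means[j] + total_mean
             for j in range(n)] for i in range(n)]

# Hypothetical toy input samples (already normalised), width chosen arbitrarily.
X = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [0.5, 0.5]]
K = gaussian_gram(X, width=2.0)
K_centered = center_gram(K)
```

After centering, every row and column of `K_centered` sums to zero, which corresponds to the mapped samples Φ(x) having zero mean in the feature space F.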
4. The KPLS and FCM based sewage treatment process monitoring method of claim 3, wherein in step 4 the number of principal components A is determined by cross-validation and the score matrix T is solved for, comprising the following steps:
step 4.1: letting u be any column of the output data matrix Y;
step 4.2: computing the score vector: $t = Ku$;
step 4.3: normalizing the score vector t: $\|t\| \to 1$;
step 4.4: regressing each column of the output data matrix Y on the score vector t: $q = Y't$;
step 4.5: computing a new score for the output data matrix Y: $u = Yq$;
step 4.6: normalizing the vector u: $\|u\| \to 1$;
step 4.7: checking whether u has converged: if so, going to step 4.8; if not, returning to step 4.2;
step 4.8: updating the matrices: $K = (I - tt')K(I - tt')$ and $Y = Y - tq'$, then repeating steps 4.2 to 4.7 to compute the next score vector, until A score vectors have been extracted; where I is the identity matrix.
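Steps 4.1 to 4.8 can be sketched in plain Python as below. Several things here are assumptions of the sketch rather than part of the claim: the helper names, the convergence tolerance `tol`, the iteration cap `max_iter`, taking the first column of Y as the initial u, and fixing A in advance instead of selecting it by cross-validation. To keep the example short, a linear kernel on centred toy data stands in for the centred Gaussian Gram matrix.

```python
import math

def mat_vec(M, v):
    """Return M v for a row-major matrix M."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def mat_t_vec(M, v):
    """Return M' v for a row-major matrix M."""
    return [sum(M[i][j] * v[i] for i in range(len(M))) for j in range(len(M[0]))]

def unit(v):
    """Scale v to unit Euclidean norm."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

def kpls_scores(K, Y, A, tol=1e-10, max_iter=500):
    """Extract A score vectors from a centred Gram matrix K and output matrix Y."""
    K = [row[:] for row in K]
    Y = [row[:] for row in Y]
    n = len(K)
    columns = []
    for _ in range(A):
        u = [row[0] for row in Y]                # step 4.1: first column of Y (assumed choice)
        for _ in range(max_iter):
            t = unit(mat_vec(K, u))              # steps 4.2-4.3: t = Ku, ||t|| -> 1
            q = mat_t_vec(Y, t)                  # step 4.4: q = Y't
            u_next = unit(mat_vec(Y, q))         # steps 4.5-4.6: u = Yq, ||u|| -> 1
            done = max(abs(a - b) for a, b in zip(u_next, u)) < tol  # step 4.7
            u = u_next
            if done:
                break
        # Step 4.8: K = (I - tt')K(I - tt') expanded term by term, and Y = Y - tq'.
        Kt = mat_vec(K, t)
        tK = mat_t_vec(K, t)
        tKt = sum(a * b for a, b in zip(t, Kt))
        K = [[K[i][j] - t[i] * tK[j] - Kt[i] * t[j] + t[i] * tKt * t[j]
              for j in range(n)] for i in range(n)]
        Y = [[Y[i][k] - t[i] * q[k] for k in range(len(q))] for i in range(n)]
        columns.append(t)
    return [list(row) for row in zip(*columns)]  # one row of T per sample

# Hypothetical centred toy data; the linear kernel XX' is a stand-in Gram matrix.
X = [[-1.5, 0.2], [-0.5, -0.4], [0.5, 0.6], [1.5, -0.4]]
K = [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]
Y = [[-1.0], [-0.2], [0.4], [0.8]]
T = kpls_scores(K, Y, A=2)
```

Because each deflation removes the direction t from K, the extracted score vectors are mutually orthogonal and of unit length, so the rows of T can be passed directly to the clustering of step 6.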
6. A KPLS and FCM based sewage treatment process monitoring method according to claim 4, wherein said step 6 comprises the steps of:
step 6.1: taking c as the cluster number, clustering the score matrix T with the FCM algorithm and constructing the FCM objective function,
where the i-th sample is the i-th row vector of the score matrix T, that is, the reduced A-dimensional sample corresponding to the $m_1$-dimensional input sample $x_i$; $u_{ij}$ is the membership degree of the i-th sample to the j-th clustering center $v_j$; the membership matrix is $U = (u_{ij})_{n \times c}$ and the clustering center matrix is $V = (v_j)_{c \times A}$; $m \in [1, +\infty)$ is the fuzzy index; the distance between the i-th sample and the j-th clustering center $v_j$ is the Euclidean distance; the c clustering centers obtained by clustering the score matrix T comprise one clustering center of normal working condition samples and c − 1 clustering centers of abnormal working condition samples;
step 6.2: solving for the membership matrix U:
step 6.2.1: initializing the FCM algorithm parameters: setting the fuzzy index m, the termination threshold ε and the maximum number of iterations count, setting the iteration counter k = 1, randomly initializing the membership matrix $U^{(k)} = (u_{ij}^{(k)})_{n \times c}$, and randomly initializing the clustering center matrix $V^{(k)} = (v_j^{(k)})_{c \times A}$;
step 6.2.2: substituting $v_j^{(k)}$ into the membership update formula to compute the membership matrix of the (k+1)-th iteration, $U^{(k+1)} = (u_{ij}^{(k+1)})_{n \times c}$;
step 6.2.3: substituting $u_{ij}^{(k+1)}$ into the clustering center update formula to compute the clustering center matrix of the (k+1)-th iteration, $V^{(k+1)} = (v_j^{(k+1)})_{c \times A}$;
step 6.2.4: if $\|U^{(k+1)} - U^{(k)}\| < \varepsilon$ or the iteration count k exceeds count, stopping the iteration to obtain the final membership matrix U, and going to step 6.3; otherwise, setting k = k + 1 and returning to step 6.2.2;
step 6.3: monitoring abnormal working conditions in the sewage treatment process according to the membership matrix U: if the membership degree of the i-th sample to the clustering center of the normal working condition samples is less than μ, the sewage treatment process is abnormal at the i-th sample.
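Steps 6.2 and 6.3 can be sketched in plain Python as follows. The sketch assumes the standard FCM updates, $u_{ij} = 1 / \sum_{l=1}^{c} (d_{ij}/d_{il})^{2/(m-1)}$ and $v_j = \sum_i u_{ij}^m s_i / \sum_i u_{ij}^m$, with $s_i$ the i-th row of T and $d_{ij}$ its Euclidean distance to $v_j$; the update formulas themselves are not reproduced in the text above. Further assumptions: the function name `fcm`, the toy score rows, evenly spaced rows of T as initial centers (in place of the random initialization, so the result is reproducible), the maximum absolute change as the norm in step 6.2.4, and identifying the normal cluster as the one the first sample belongs to, since normal-condition samples come first in time. The values m = 2.4 and μ = 0.5 follow the embodiment.

```python
import math

def fcm(samples, c, m=2.4, eps=1e-6, count=200):
    """Fuzzy C-means on row vectors; returns (membership matrix U, centers V)."""
    n, dim = len(samples), len(samples[0])
    # Step 6.2.1 (simplified): evenly spaced samples as initial centers (assumption).
    step = (n - 1) / max(c - 1, 1)
    centers = [list(samples[round(j * step)]) for j in range(c)]
    U = [[1.0 / c] * c for _ in range(n)]
    power = 2.0 / (m - 1.0)
    for _ in range(count):
        # Step 6.2.2: membership update from the current centers.
        U_next = []
        for s in samples:
            d = [max(math.dist(s, v), 1e-12) for v in centers]  # avoid division by zero
            U_next.append([1.0 / sum((d[j] / d[l]) ** power for l in range(c))
                           for j in range(c)])
        # Step 6.2.3: center update from the new memberships.
        centers = []
        for j in range(c):
            w = [U_next[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append([sum(w[i] * samples[i][a] for i in range(n)) / total
                            for a in range(dim)])
        # Step 6.2.4: stop when the memberships barely change (max-abs norm assumed).
        change = max(abs(U_next[i][j] - U[i][j]) for i in range(n) for j in range(c))
        U = U_next
        if change < eps:
            break
    return U, centers

# Hypothetical score rows: four normal-like samples followed by three abnormal-like ones.
T = [[0.0, 0.1], [0.1, 0.0], [-0.1, 0.05], [0.05, -0.1],
     [3.0, 3.1], [3.1, 2.9], [2.9, 3.0]]
U, V = fcm(T, c=2)

# Step 6.3: flag samples whose membership to the normal cluster is below mu.
normal = max(range(2), key=lambda j: U[0][j])  # cluster of the first (normal) sample
mu = 0.5
abnormal = [i for i, row in enumerate(U) if row[normal] < mu]
```

On this toy data the last three samples are flagged, which mirrors how the embodiment reads abnormal samples off the membership curves with μ = 0.5.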
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910572930.7A CN110232062B (en) | 2019-06-28 | 2019-06-28 | KPLS (kernel partial least squares) and FCM (fuzzy C-means) based sewage treatment process monitoring method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110232062A (en) | 2019-09-13 |
CN110232062B (en) | 2021-04-02 |
Family
ID=67856615
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910572930.7A Active CN110232062B (en) | KPLS (kernel partial least squares) and FCM (fuzzy C-means) based sewage treatment process monitoring method | 2019-06-28 | 2019-06-28 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110232062B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110928187B (en) * | 2019-12-03 | 2021-02-26 | 北京工业大学 | Sewage treatment process fault monitoring method based on fuzzy width self-adaptive learning model |
CN111233118A (en) * | 2020-03-19 | 2020-06-05 | 中冶赛迪工程技术股份有限公司 | Intelligent control system and control method for high-density sedimentation tank |
CN113222324B (en) * | 2021-03-13 | 2023-04-07 | 宁波大学科学技术学院 | Sewage quality monitoring method based on PLS-PSO-RBF neural network model |
CN114527249B (en) * | 2022-01-17 | 2024-03-19 | 南方海洋科学与工程广东省实验室(广州) | Quality control method and system for water quality monitoring data |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104299169A (en) * | 2014-09-26 | 2015-01-21 | 华中科技大学 | Online sewage disposal system information safety risk analysis method and system |
EP3065076A1 (en) * | 2015-03-04 | 2016-09-07 | Secure-Nok AS | System and method for responding to a cyber-attack-related incident against an industrial control system |
CN107463093A (en) * | 2017-07-13 | 2017-12-12 | 东北大学 | A kind of blast-melted quality monitoring method based on KPLS robust reconstructed errors |
Non-Patent Citations (2)
Title |
---|
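Adaptive Fuzzy C-Means clustering in process monitoring; Pekka Teppola et al.; Chemometrics and Intelligent Laboratory Systems; 1999-12-31; full text * |
A review of data-driven fault diagnosis and prediction methods for wastewater treatment; Huang Daoping et al.; Journal of South China University of Technology (Natural Science Edition); 2015-03-31; full text * |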
Also Published As
Publication number | Publication date |
---|---|
CN110232062A (en) | 2019-09-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110232062B (en) | KPLS (kernel partial least squares) and FCM (fuzzy C-means) based sewage treatment process monitoring method | |
CN110232256B (en) | KPLS and RWFCM based sewage treatment process monitoring method | |
US10570024B2 (en) | Method for effluent total nitrogen-based on a recurrent self-organizing RBF neural network | |
Farhi et al. | Prediction of wastewater treatment quality using LSTM neural network | |
CN111126870B (en) | Sewage treatment process abnormal condition detection method by utilizing integrated principal component analysis | |
CN111160776A (en) | Method for detecting abnormal working condition in sewage treatment process by utilizing block principal component analysis | |
CN112417765B (en) | Sewage treatment process fault detection method based on improved teacher-student network model | |
CN106600509B (en) | Method for analyzing and judging water exchange and pollution discharge behaviors of enterprise based on basic data | |
CN109064048B (en) | Wastewater discharge source rapid investigation method and system based on wastewater treatment process analysis | |
Ba-Alawi et al. | Missing data imputation and sensor self-validation towards a sustainable operation of wastewater treatment plants via deep variational residual autoencoders | |
Liu et al. | Modeling of wastewater treatment processes using dynamic Bayesian networks based on fuzzy PLS | |
CN108088974A (en) | A kind of flexible measurement method of anaerobism while denitrification methane phase process water outlet nitrate nitrogen | |
Pisa et al. | A recurrent neural network for wastewater treatment plant effuents' prediction | |
Zhong et al. | Water quality prediction of MBR based on machine learning: A novel dataset contribution analysis method | |
US20210355007A1 (en) | System and method for predicting a parameter associated with a wastewater treatment process | |
Mbamba et al. | Optimization of deep learning models for forecasting performance in the water industry using genetic algorithms | |
WO2014157750A1 (en) | Apparatus and method for providing causative factors for state of quality of effluent water from sewage treatment plant | |
CN201330211Y (en) | Working parameter self-optimizing simulation system for sewage treatment plant | |
CN116046048A (en) | Sewage treatment sensor fault diagnosis method based on data driving | |
CN116048024A (en) | Distributed typical correlation analysis process monitoring method and device | |
Huang et al. | Improving nitrogen removal using a fuzzy neural network-based control system in the anoxic/oxic process | |
CN114565154A (en) | Prediction method and optimization algorithm for carbon source adding amount of biochemical section of leachate | |
Kini et al. | Enhanced data-driven monitoring of wastewater treatment plants using the Kolmogorov–Smirnov test | |
CN116110516B (en) | Method and device for identifying abnormal working conditions in sewage treatment process | |
US20240083789A1 (en) | Reconstruction method and system for wastewater biological treatment process based on machine learning system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||