CN110232062B - KPLS (kernel partial least squares) and FCM (fuzzy c-means) based sewage treatment process monitoring method - Google Patents

Publication number
CN110232062B
Authority
CN
China
Prior art keywords
matrix
sewage treatment
treatment process
sample
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910572930.7A
Other languages
Chinese (zh)
Other versions
CN110232062A (en)
Inventor
周平
张瑞垚
王宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University China
Original Assignee
Northeastern University China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeastern University China
Priority to CN201910572930.7A
Publication of CN110232062A
Application granted
Publication of CN110232062B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Activated Sludge Processes (AREA)

Abstract

The invention relates to the technical field of sewage treatment quality monitoring and provides a sewage treatment process monitoring method based on KPLS (kernel partial least squares) and FCM (fuzzy c-means). The method first collects sewage treatment process data samples under normal and abnormal working conditions, takes the data of the sewage treatment operating variables and of the effluent quality variables as the input and output data matrices respectively, and standardizes both matrices. It then constructs a KPLS model: input samples are mapped to a high-dimensional feature space, a Gaussian kernel function is introduced to obtain the Gram matrix K, and the score matrix is solved. Next, the point density values of the input samples are calculated, a constructed function is computed and plotted, and the number of clusters is determined from the plot. Finally, the score matrix is clustered with the FCM algorithm to obtain a membership matrix, and abnormal working conditions in the sewage treatment process are monitored according to the membership matrix. The invention can reduce the dimension of high-dimensional data, handle nonlinear data, determine the number of clusters accurately and conveniently, and improve the timeliness and accuracy of monitoring.

Description

KPLS (kernel partial least squares) and FCM (fuzzy c-means) based sewage treatment process monitoring method
Technical Field
The invention relates to the technical field of sewage treatment quality monitoring, in particular to a KPLS and FCM based sewage treatment process monitoring method.
Background
With urbanization and industrialization accelerating in China, society's demand for fresh water grows day by day, and the construction of urban domestic sewage treatment facilities must be sped up to raise urban sewage treatment capacity. The activated sludge process is currently the main method for treating urban sewage. Sewage purification by activated sludge consists of three stages: initial adsorption, microbial metabolism, and floc formation and settling. In essence, the microbial community in the activated sludge adsorbs, decomposes and oxidizes the biodegradable organic matter in the sewage through a series of biochemical reactions, so that this matter is separated from the sewage and the sewage is purified.
At present, biochemical oxygen demand (BOD), chemical oxygen demand (COD), suspended solids (SS), ammonia nitrogen (NH) and total phosphorus (TP) are generally used as sewage discharge indexes. In the sewage treatment process, parameters such as influent flow, influent composition, pollutant concentration and weather changes have to be accepted passively, and microbial activity is affected by many factors such as dissolved oxygen concentration, microbial population and the pH of the sewage, so keeping an urban sewage treatment plant in stable operation over the long term is very difficult. A fault in the plant easily causes the effluent quality to fall below standard, raises operating costs and pollutes the environment. If an abnormal working condition in the sewage treatment process is not detected in time, no correct judgment can be made and no effective corrective measures can be taken in time, which may cause irreversible losses. It is therefore especially important that operators, by monitoring the sewage treatment process, can identify abnormal working conditions accurately and take measures promptly, so that safe, stable and smooth operation and the effluent quality are ensured.
In recent years, existing sewage treatment process monitoring methods have adopted data mining, mainly because large amounts of data are available and urgently need to be converted into useful information and knowledge. Because sewage treatment process data carry no class labels and the occurrence of sewage treatment faults has little correlation with time, classification or sequential pattern mining is not suitable. Cluster analysis is an unsupervised classification technique well suited to analyzing data with little prior knowledge, and it is therefore widely applied to sewage process monitoring.
The fuzzy c-means (FCM) clustering algorithm is one of the classical clustering algorithms. FCM gives the degree to which each sample belongs to each class, establishing an uncertain description of class membership that agrees better with the real world. However, sewage treatment process data are high-dimensional and nonlinear, and the traditional FCM algorithm cannot handle such data; this makes process monitoring harder and fault detection less reliable, seriously affects the effluent quality, and can lead to economic losses or even accidents. In addition, the number of clusters in the FCM algorithm must be preset manually, which is a major limitation in practice.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides the KPLS and FCM-based sewage treatment process monitoring method, which can reduce the dimension of high-dimensional data, process nonlinear data, accurately and conveniently determine the clustering number and improve the timeliness and the accuracy of sewage treatment process monitoring.
The technical scheme of the invention is as follows:
A KPLS and FCM based sewage treatment process monitoring method, characterized by comprising the following steps:
Step 1: collect sewage treatment process data samples under normal working conditions and sewage treatment process data samples containing abnormal working conditions, where each data sample comprises $m_1$ sewage treatment operating variables and $m_2$ effluent quality variables; place the normal-condition samples ahead of the samples containing abnormal conditions in time order to form a mixed data sample set; take the data of the $m_1$ operating variables in the mixed data sample set as the input data matrix X and the data of the $m_2$ effluent quality variables as the output data matrix Y;
Step 2: preprocess the input data matrix X and the output data matrix Y; the preprocessing consists of computing the mean and standard deviation of each variable in X and Y and standardizing X and Y to zero mean and unit standard deviation;
Step 3: construct a KPLS model for monitoring the sewage treatment process: map each input sample x in the input data matrix X to a high-dimensional feature space F, $x \to \Phi(x) \in F$, introduce a Gaussian kernel function to obtain the Gram matrix K of X, and center the Gram matrix K;
Step 4: determine the number of principal components by cross-validation and solve the score matrix T;
Step 5: compute the point density value $D_i$ of each input sample $x_i$ in the input data matrix X, compute the constructed function S(j), plot S(j), and determine the number of clusters c from the number of distinct slopes in the plot of S(j);
Step 6: with c as the number of clusters, cluster the score matrix T with the FCM algorithm to obtain the membership matrix U, and monitor the sewage treatment process for abnormal working conditions according to U: if the membership of the sample at some time in the cluster center of the normal-condition samples is less than μ, the sewage treatment process is abnormal at that sample.
The sewage treatment process uses the activated sludge method. After primary treatment the raw sewage enters the biochemical tank section for biological denitrification; one part is then returned by internal circulation for further denitrification, and the other part enters the secondary sedimentation tank for settling. The biochemical tank section comprises biochemical tanks $l \in \{1,2,3,4,5\}$, where tanks $l_1 \in \{1,2\}$ form the anoxic zone, in which the denitrification reaction mainly takes place, and tanks $l_2 \in \{3,4,5\}$ form the aerobic zone, in which the nitrification reaction mainly takes place. In step 1, the $m_1$ sewage treatment operating variables comprise the influent flow rate, the influent ammonia nitrogen, the active heterotrophic biomass in tanks $l \in \{1,2,3,4,5\}$, the readily biodegradable organic matter in tanks $l \in \{1,2,3,4,5\}$, the nitrate nitrogen in tanks $l_1 \in \{1,2\}$, the active autotrophic biomass in tanks $l_2 \in \{3,4,5\}$ and the ammonia nitrogen in tanks $l_2 \in \{3,4,5\}$; the $m_2$ effluent quality variables comprise the biochemical oxygen demand, chemical oxygen demand, suspended solids and ammonia nitrogen of the effluent.
Step 3 comprises the following steps:
Step 3.1: the KPLS model for monitoring the sewage treatment process is constructed as
$$\Phi = T P_1' + \Phi_r$$
$$Y = T Q' + Y_r$$
Step 3.2: map each input sample x in the input data matrix X to a high-dimensional feature space F, $x \to \Phi(x) \in F$; introduce a Gaussian kernel function to obtain the Gram matrix K of X and center K; the KPLS model is thereby converted into
$$K = T P_2' + E$$
$$Y = T Q' + Y_r$$
where the element in row i, column j of the Gram matrix K is $K_{ij} = k(x_i, x_j) = \langle \Phi(x_i), \Phi(x_j) \rangle$; $x_i$ and $x_j$ are the i-th and j-th input samples in X; $k(x_i, x_j)$ is the Gaussian kernel function; $i, j \in \{1, 2, \ldots, n\}$, and n is the number of samples in X; T is the score matrix of the high-dimensional data $\Phi = \{\Phi(x_i), i \in \{1, 2, \ldots, n\}\}$, $T = [t_1, \ldots, t_A]$, and A is the number of principal components; $P_1 = [p_{11}, \ldots, p_{1A}]$, $P_2 = [p_{21}, \ldots, p_{2A}]$ and $Q = [q_1, \ldots, q_A]$ are the loading matrices of Φ, K and Y respectively; and $\Phi_r$, E and $Y_r$ are the modeling residuals of Φ, K and Y respectively.
In step 4, the number of principal components A is determined by cross-validation and the score matrix T is solved as follows:
Step 4.1: let u be any column of the output data matrix Y;
Step 4.2: compute the score vector: $t = Ku$;
Step 4.3: normalize the score vector t: $\|t\| \to 1$;
Step 4.4: regress each column of the output data matrix Y on the score vector t: $q = Y't$;
Step 4.5: compute the new score vector of the output data matrix Y: $u = Yq$;
Step 4.6: normalize u: $\|u\| \to 1$;
Step 4.7: check whether u has converged: if so, go to step 4.8; otherwise return to step 4.2;
Step 4.8: update the matrices, $K = (I - tt')K(I - tt')$ and $Y = Y - tq'$, where I is the identity matrix, then repeat steps 4.2 to 4.7 to compute the next score vector, until A score vectors have been extracted.
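As a short worked check on step 4.8 (an explanatory note, not part of the claimed method): because step 4.3 normalizes t and step 4.4 sets $q = Y't$, the update of Y applies the same projector that is applied to K, and the updated K maps every vector to something orthogonal to t. Each later score vector is therefore orthogonal to those already extracted.

```latex
\begin{aligned}
Y - t q' &= Y - t\,(Y' t)' = Y - t t' Y = (I - t t')\, Y, \\
(I - t t')\, t &= t - t\,(t' t) = t - t = 0 \qquad (\|t\| = 1), \\
t' \big[ (I - t t')\, K\, (I - t t') \big] u &= 0 \quad \text{for every } u .
\end{aligned}
```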
In step 3, the centered Gram matrix is
Figure GDA0002146462530000041
where $E_n$ is the n × n identity matrix, $1_n$ is the n-dimensional all-ones column vector, and $1_n'$ is the transpose of $1_n$.
Step 5 comprises the following steps:
Step 5.1: compute the point density value $D_i$ of each input sample $x_i$ in the input data matrix X as
Figure GDA0002146462530000042
where
Figure GDA00021464625300000411
$r_d$ is the effective radius of the neighborhood density,
Figure GDA0002146462530000043
Step 5.2: compute the constructed function S(j) as
Figure GDA0002146462530000044
Step 5.3: plot the constructed function S(j) and take the number of distinct slopes in the plot as the number of clusters c.
Step 6 comprises the following steps:
Step 6.1: with c as the number of clusters, cluster the score matrix T with the FCM algorithm, constructing the FCM objective function
Figure GDA0002146462530000045
where
Figure GDA0002146462530000046
is the i-th row vector of the score matrix T, that is,
Figure GDA0002146462530000047
is the new A-dimensional sample obtained by reducing the dimension of the $m_1$-dimensional input sample $x_i$; $u_{ij}$ is the membership of sample
Figure GDA0002146462530000048
in the j-th cluster center $v_j$,
Figure GDA0002146462530000049
Figure GDA00021464625300000410
the membership matrix is $U = (u_{ij})_{n \times c}$ and the cluster center matrix is $V = (v_j)_{c \times A}$; $m \in [1, +\infty)$ is the fuzzy index;
Figure GDA0002146462530000051
is the Euclidean distance between sample
Figure GDA0002146462530000052
and the j-th cluster center $v_j$; the c cluster centers obtained by clustering the score matrix T comprise the cluster center of the normal-condition samples and the cluster centers of c−1 classes of abnormal-condition samples;
Step 6.2: solve for the membership matrix U:
Step 6.2.1: initialize the FCM algorithm parameters: set the fuzzy index m, the termination threshold ε and the maximum number of iterations count; set the iteration counter k = 1; randomly initialize the membership matrix $U^{(k)} = (u_{ij}^{(k)})_{n \times c}$ and the cluster center matrix $V^{(k)} = (v_j^{(k)})_{c \times A}$;
Step 6.2.2: substitute $v_j^{(k)}$ into the formula
Figure GDA0002146462530000053
to compute the membership matrix of the (k+1)-th iteration, $U^{(k+1)} = (u_{ij}^{(k+1)})_{n \times c}$;
Step 6.2.3: substitute $u_{ij}^{(k+1)}$ into the formula
Figure GDA0002146462530000054
to compute the cluster center matrix of the (k+1)-th iteration, $V^{(k+1)} = (v_j^{(k+1)})_{c \times A}$;
Step 6.2.4: if $\|U^{(k+1)} - U^{(k)}\| < \varepsilon$ or the iteration counter k > count, stop iterating, take the final membership matrix U and go to step 6.3; otherwise set k = k + 1 and return to step 6.2.2;
Step 6.3: monitor abnormal working conditions in the sewage treatment process according to the membership matrix U: if the membership of the i-th sample
Figure GDA0002146462530000055
in the cluster center of the normal-condition samples is less than μ, the sewage treatment process is abnormal at the i-th sample.
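The formula images for step 6 are not reproduced in this text. For orientation, the textbook fuzzy c-means formulation that matches the surrounding definitions is sketched below. The symbol $\tilde{t}_i$ for the i-th row of T is an assumption, and the patent's own images may differ in notation or detail.

```latex
\begin{aligned}
J(U, V) &= \sum_{i=1}^{n} \sum_{j=1}^{c} u_{ij}^{m}\, d_{ij}^{2},
\qquad d_{ij} = \|\tilde{t}_i - v_j\|, \\
\text{s.t.}\quad & \sum_{j=1}^{c} u_{ij} = 1, \qquad u_{ij} \in [0, 1], \\
u_{ij}^{(k+1)} &= \left[ \sum_{r=1}^{c} \left( \frac{d_{ij}^{(k)}}{d_{ir}^{(k)}} \right)^{\frac{2}{m-1}} \right]^{-1}, \\
v_j^{(k+1)} &= \frac{\sum_{i=1}^{n} \big( u_{ij}^{(k+1)} \big)^{m}\, \tilde{t}_i}{\sum_{i=1}^{n} \big( u_{ij}^{(k+1)} \big)^{m}} .
\end{aligned}
```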
The invention has the beneficial effects that:
(1) The invention combines the KPLS algorithm with the FCM algorithm and constructs a KPLS model and an FCM model to describe the normal production process; no prior knowledge of abnormal working conditions in the sewage treatment process is needed, and only normal-condition data are used as labeled data. First, following a data-driven approach, a Gaussian kernel function is used to project the standardized process variables into a high-dimensional feature space, where a KPLS model for monitoring the sewage treatment process is established. After the number of principal components is determined by cross-validation, the high-dimensional input data are reduced in dimension to obtain the score matrix T, which serves as the input to FCM cluster analysis. This achieves dimension reduction and at the same time overcomes the limitation that FCM cannot handle nonlinear data.
(2) The invention computes the constructed function from the density function and obtains the number of clusters from the constructed function, so the number of clusters is determined accurately and conveniently, overcoming the limitation that the number of clusters in the FCM algorithm must be preset manually.
(3) The invention clusters the score matrix T with the FCM algorithm to obtain the membership matrix U and monitors abnormal working conditions in the sewage treatment process according to U. Sample memberships reveal when an abnormal working condition occurs, and the number of abnormal working conditions can be identified at the same time, so monitoring is timely and accurate. This helps operators monitor the sewage treatment process, judge fluctuations in effluent quality accurately and take corrective measures in time, ensuring stable, efficient and safe operation of the sewage plant and the effluent quality.
Drawings
FIG. 1 is a flow chart of the KPLS and FCM based sewage treatment process monitoring method of the present invention;
FIG. 2 is a plot of the constructed function in an embodiment of the invention;
FIG. 3 shows the membership of the monitored samples in the cluster center of the normal-condition samples in an embodiment of the invention;
FIG. 4 shows the membership of the monitored samples in the cluster center of the abnormal-condition samples in an embodiment of the invention;
FIG. 5 shows the clustering result for the score matrix T in an embodiment of the invention.
Detailed Description
The invention will be further described with reference to the accompanying drawings and specific embodiments.
FIG. 1 shows a flow chart of the KPLS and FCM based sewage treatment process monitoring method of the present invention. The method comprises the following steps:
Step 1: collect sewage treatment process data samples under normal working conditions and sewage treatment process data samples containing abnormal working conditions, where each data sample comprises $m_1$ sewage treatment operating variables and $m_2$ effluent quality variables; place the normal-condition samples ahead of the samples containing abnormal conditions in time order to form a mixed data sample set; take the data of the $m_1$ operating variables in the mixed data sample set as the input data matrix X and the data of the $m_2$ effluent quality variables as the output data matrix Y.
In this embodiment, the sewage treatment process uses the activated sludge method. The activated sludge process flow is generally divided, by degree of treatment, into primary, secondary and tertiary treatment. After primary treatment the raw sewage enters the biochemical tank section for biological denitrification; one part is then returned by internal circulation for further denitrification, and the other part enters the secondary sedimentation tank for settling. The biochemical tank is the most important place where the biochemical reactions are completed and the sewage is purified. The biochemical tank section comprises biochemical tanks $l \in \{1,2,3,4,5\}$, where tanks $l_1 \in \{1,2\}$ form the anoxic zone, in which the denitrification reaction mainly takes place, and tanks $l_2 \in \{3,4,5\}$ form the aerobic zone, in which the nitrification reaction mainly takes place. In step 1, the $m_1$ sewage treatment operating variables comprise the influent flow rate, the influent ammonia nitrogen, the active heterotrophic biomass in tanks $l \in \{1,2,3,4,5\}$, the readily biodegradable organic matter in tanks $l \in \{1,2,3,4,5\}$, the nitrate nitrogen in tanks $l_1 \in \{1,2\}$, the active autotrophic biomass in tanks $l_2 \in \{3,4,5\}$ and the ammonia nitrogen in tanks $l_2 \in \{3,4,5\}$; the $m_2$ effluent quality variables comprise the biochemical oxygen demand, chemical oxygen demand, suspended solids and ammonia nitrogen of the effluent.
The abnormal working conditions include sludge bulking, foaming, scum, toxic shock, rainstorm weather and the like, as is well known to those skilled in the art. In this embodiment, 200 sewage treatment process data samples under normal working conditions and 800 samples containing the rainstorm-weather abnormal working condition are collected, forming a mixed data sample set of 1000 samples. The data of the $m_1 = 20$ sewage treatment operating variables in the mixed data sample set are taken as the input data matrix $X \in R^{1000 \times 20}$, and the data of the $m_2 = 4$ effluent quality variables as the output data matrix $Y \in R^{1000 \times 4}$.
Step 2: preprocess the input data matrix X and the output data matrix Y; the preprocessing consists of computing the mean and standard deviation of each variable in X and Y and standardizing X and Y to zero mean and unit standard deviation.
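The standardization in step 2 can be sketched as follows. This is a minimal illustration on synthetic data with the embodiment's shapes (1000 × 20 and 1000 × 4). The function name `standardize`, the use of the sample standard deviation (`ddof=1`) and the guard for constant columns are assumptions, not details given in the patent.

```python
import numpy as np

def standardize(M):
    """Scale every column of M to zero mean and unit standard deviation."""
    mean = M.mean(axis=0)
    std = M.std(axis=0, ddof=1)          # assumption: sample standard deviation
    std = np.where(std == 0, 1.0, std)   # assumption: leave constant columns unscaled
    return (M - mean) / std, mean, std

# Synthetic stand-ins for the input and output data matrices (not plant data).
rng = np.random.default_rng(0)
X = rng.normal(5.0, 2.0, size=(1000, 20))
Y = rng.normal(1.0, 0.5, size=(1000, 4))

Xs, x_mean, x_std = standardize(X)
Ys, y_mean, y_std = standardize(Y)
```

Returning the mean and standard deviation alongside the scaled matrix makes it possible to apply the same scaling to samples collected later.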
Step 3: construct a KPLS model for monitoring the sewage treatment process: map each input sample x in the input data matrix X to a high-dimensional feature space F, $x \to \Phi(x) \in F$, introduce a Gaussian kernel function to obtain the Gram matrix K of X, and center the Gram matrix K.
Step 3 comprises the following steps:
Step 3.1: the KPLS model for monitoring the sewage treatment process is constructed as
$$\Phi = T P_1' + \Phi_r$$
$$Y = T Q' + Y_r$$
Step 3.2: map each input sample x in the input data matrix X to a high-dimensional feature space F, $x \to \Phi(x) \in F$; introduce a Gaussian kernel function to obtain the Gram matrix K of X and center K; the KPLS model is thereby converted into
$$K = T P_2' + E$$
$$Y = T Q' + Y_r$$
where the element in row i, column j of the Gram matrix K is $K_{ij} = k(x_i, x_j) = \langle \Phi(x_i), \Phi(x_j) \rangle$; $x_i$ and $x_j$ are the i-th and j-th input samples in X; $k(x_i, x_j)$ is the Gaussian kernel function; $i, j \in \{1, 2, \ldots, n\}$, and n is the number of samples in X; T is the score matrix of the high-dimensional data $\Phi = \{\Phi(x_i), i \in \{1, 2, \ldots, n\}\}$, $T = [t_1, \ldots, t_A]$, and A is the number of principal components; $P_1 = [p_{11}, \ldots, p_{1A}]$, $P_2 = [p_{21}, \ldots, p_{2A}]$ and $Q = [q_1, \ldots, q_A]$ are the loading matrices of Φ, K and Y respectively; and $\Phi_r$, E and $Y_r$ are the modeling residuals of Φ, K and Y respectively.
In step 3, the centered Gram matrix is
Figure GDA0002146462530000081
where $E_n$ is the n × n identity matrix, $1_n$ is the n-dimensional all-ones column vector, and $1_n'$ is the transpose of $1_n$.
The KPLS (kernel projection to latent structures) model is constructed with the nonlinear iterative partial least squares algorithm. In this embodiment, the Gaussian kernel function is
Figure GDA0002146462530000082
where $c_1$ is the Gaussian kernel width parameter, set empirically to $5m_1$, i.e. $c_1 = 5 m_1 = 100$.
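A hedged sketch of the Gram matrix computation and centering follows. The kernel formula image is not reproduced in this text, so the form $k(x_i, x_j) = \exp(-\|x_i - x_j\|^2 / c_1)$ used here is an assumption (a common convention for a width parameter $c_1$). The centering uses the standard expression built from the identity matrix and the all-ones vector. The function names and the random input are illustrative only.

```python
import numpy as np

def gaussian_gram(X, width):
    """Gram matrix with K[i, j] = exp(-||x_i - x_j||^2 / width) (assumed kernel form)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared pairwise distances
    d2 = np.maximum(d2, 0.0)                         # clip tiny negative rounding errors
    return np.exp(-d2 / width)

def center_gram(K):
    """Standard kernel centering: (E_n - 1 1'/n) K (E_n - 1 1'/n)."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return J @ K @ J

# Illustrative standardized input with m1 = 20 variables (random, not plant data).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
c1 = 5 * X.shape[1]            # c1 = 5 * m1 = 100, as in the embodiment
K = gaussian_gram(X, c1)
Kc = center_gram(K)
```

After centering, every row and column of `Kc` sums to zero, which corresponds to the mapped samples having zero mean in the feature space.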
Step 4: determine the number of principal components by cross-validation and solve the score matrix T.
In step 4, the number of principal components A is determined by cross-validation and the score matrix T is solved as follows:
Step 4.1: let u be any column of the output data matrix Y;
Step 4.2: compute the score vector: $t = Ku$;
Step 4.3: normalize the score vector t: $\|t\| \to 1$;
Step 4.4: regress each column of the output data matrix Y on the score vector t: $q = Y't$;
Step 4.5: compute the new score vector of the output data matrix Y: $u = Yq$;
Step 4.6: normalize u: $\|u\| \to 1$;
Step 4.7: check whether u has converged: if so, go to step 4.8; otherwise return to step 4.2;
Step 4.8: update the matrices, $K = (I - tt')K(I - tt')$ and $Y = Y - tq'$, where I is the identity matrix, then repeat steps 4.2 to 4.7 to compute the next score vector, until A score vectors have been extracted.
In this embodiment, cross-validation gives A = 3 principal components.
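Steps 4.1 to 4.8 can be sketched as the loop below. This is an illustrative reading of the listed steps, not the patented implementation. The convergence tolerance `tol`, the iteration cap `max_iter`, the choice of the first column of Y as the starting u, the toy data and the assumed kernel form are all hypothetical. Cross-validation for choosing A is not shown; A = 3 is simply passed in.

```python
import numpy as np

def kpls_scores(K, Y, n_components, tol=1e-10, max_iter=500):
    """Extract score vectors from a centered Gram matrix K and scaled outputs Y."""
    K, Y = K.copy(), Y.copy()
    n = K.shape[0]
    T = np.zeros((n, n_components))
    for a in range(n_components):
        u = Y[:, 0] / np.linalg.norm(Y[:, 0])            # step 4.1 (first column chosen)
        for _ in range(max_iter):
            t = K @ u                                    # step 4.2
            t /= np.linalg.norm(t)                       # step 4.3
            q = Y.T @ t                                  # step 4.4
            u_new = Y @ q                                # step 4.5
            u_new /= np.linalg.norm(u_new)               # step 4.6
            converged = np.linalg.norm(u_new - u) < tol  # step 4.7
            u = u_new
            if converged:
                break
        T[:, a] = t
        D = np.eye(n) - np.outer(t, t)                   # step 4.8: deflate K and Y
        K = D @ K @ D
        Y = Y - np.outer(t, q)
    return T

# Toy data (hypothetical): 60 samples, 5 inputs, 4 outputs.
rng = np.random.default_rng(1)
X = rng.standard_normal((60, 5))
Y = np.column_stack([np.sin(X[:, 0]) + X[:, 1], X[:, 2] ** 2 - X[:, 3],
                     X[:, 4], X[:, 0] * X[:, 1]])
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
sq = np.sum(X ** 2, axis=1)
K = np.exp(-(sq[:, None] + sq[None, :] - 2.0 * X @ X.T) / (5 * X.shape[1]))
J = np.eye(60) - np.ones((60, 60)) / 60
T = kpls_scores(J @ K @ J, Y, n_components=3)
```

Because K is deflated with the normalized t after each component, the columns of `T` come out mutually orthogonal with unit length.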
Step 5: compute the point density value $D_i$ of each input sample $x_i$ in the input data matrix X, compute the constructed function S(j), plot S(j), and determine the number of clusters c from the number of distinct slopes in the plot of S(j).
Step 5 comprises the following steps:
Step 5.1: compute the point density value $D_i$ of each input sample $x_i$ in the input data matrix X as
Figure GDA0002146462530000083
where
Figure GDA0002146462530000091
$r_d$ is the effective radius of the neighborhood density,
Figure GDA0002146462530000092
Step 5.2: compute the constructed function S(j) as
Figure GDA0002146462530000093
Step 5.3: plot the constructed function S(j) and take the number of distinct slopes in the plot as the number of clusters c.
As shown in FIG. 2, the slope of the constructed function S(j) reflects the point density values of the sample data; in practical terms, S(j) has the same slope over data of the same class. FIG. 2 shows distinct transitions around 200 and 700, so the plot can be roughly divided into two parts, namely (0, 500) ∪ (700, 1000) and (500, 700). Of the 1000 test samples, the first 200 are normal-condition data and the last 800 are data containing abnormal conditions. From this analysis the mixed data sample set can be judged to fall into two classes: class 1 is the normal-condition sample class and class 2 is the abnormal-condition sample class, so the number of clusters is determined as c = 2.
Step 6: with c as the number of clusters, cluster the score matrix T with the FCM algorithm to obtain the membership matrix U, and monitor the sewage treatment process for abnormal working conditions according to U: if the membership of the sample at some time in the cluster center of the normal-condition samples is less than μ, the sewage treatment process is abnormal at that sample.
Step 6 comprises the following steps:
Step 6.1: with c as the number of clusters, cluster the score matrix T with the FCM algorithm, constructing the FCM objective function
Figure GDA0002146462530000094
where
Figure GDA0002146462530000095
is the i-th row vector of the score matrix T, that is,
Figure GDA0002146462530000096
is the new A-dimensional sample obtained by reducing the dimension of the $m_1$-dimensional input sample $x_i$; $u_{ij}$ is the membership of sample
Figure GDA0002146462530000097
in the j-th cluster center $v_j$,
Figure GDA0002146462530000098
Figure GDA0002146462530000099
the membership matrix is $U = (u_{ij})_{n \times c}$ and the cluster center matrix is $V = (v_j)_{c \times A}$; $m \in [1, +\infty)$ is the fuzzy index;
Figure GDA00021464625300000910
is the Euclidean distance between sample
Figure GDA00021464625300000911
and the j-th cluster center $v_j$; the c cluster centers obtained by clustering the score matrix T comprise the cluster center of the normal-condition samples and the cluster centers of c−1 classes of abnormal-condition samples;
Step 6.2: solve for the membership matrix U:
Step 6.2.1: initialize the FCM algorithm parameters: set the fuzzy index m, the termination threshold ε and the maximum number of iterations count; set the iteration counter k = 1; randomly initialize the membership matrix $U^{(k)} = (u_{ij}^{(k)})_{n \times c}$ and the cluster center matrix $V^{(k)} = (v_j^{(k)})_{c \times A}$;
Step 6.2.2: substitute $v_j^{(k)}$ into the formula
Figure GDA0002146462530000101
to compute the membership matrix of the (k+1)-th iteration, $U^{(k+1)} = (u_{ij}^{(k+1)})_{n \times c}$;
Step 6.2.3: substitute $u_{ij}^{(k+1)}$ into the formula
Figure GDA0002146462530000102
to compute the cluster center matrix of the (k+1)-th iteration, $V^{(k+1)} = (v_j^{(k+1)})_{c \times A}$;
Step 6.2.4: if $\|U^{(k+1)} - U^{(k)}\| < \varepsilon$ or the iteration counter k > count, stop iterating, take the final membership matrix U and go to step 6.3; otherwise set k = k + 1 and return to step 6.2.2;
step 6.3: monitoring abnormal working conditions in the sewage treatment process according to the membership matrix U: if the ith sample
Figure GDA0002146462530000103
And if the membership degree of the clustering center of the normal working condition sample is less than mu, the sewage treatment process is abnormal at the ith sample.
The fuzzy index m influences the degree of fuzziness of the membership matrix. In this embodiment, the algorithm performs best with the fuzzy index m set to 2.4.
In this embodiment, the membership degrees $u_{i1}$ and $u_{i2}$ of the monitored sample
Figure GDA0002146462530000104
to the cluster center $v_1$ of the normal working condition samples and the cluster center $v_2$ of the abnormal working condition samples are shown in Figs. 3 and 4, respectively, and the clustering result of the score matrix T is shown in Fig. 5. μ is set to 0.5. As can be seen from Figs. 3 and 4, at the 500th and 700th samples, the membership degree of the sample
Figure GDA0002146462530000105
to the cluster center of the normal working condition samples is less than 0.5, so the sewage treatment process is judged to be abnormal at the 700th sample. The monitoring method can therefore detect the occurrence of abnormal working conditions in the sewage treatment process in a timely manner.
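Steps 6.2.1 to 6.3 can be sketched as follows. This is a minimal illustration and not the patented implementation: the update formulas are given only as images in this text, so the sketch assumes the textbook fuzzy C-means updates, $u_{ij} = 1 / \sum_{l} (d_{ij}/d_{il})^{2/(m-1)}$ and $v_j = \sum_i u_{ij}^m t_i / \sum_i u_{ij}^m$. The function name `fcm`, the synthetic score matrix, the choice of initial centers from rows of T, and the rule for identifying the normal-condition cluster from the first samples are all assumptions made for the example.

```python
import numpy as np

def fcm(T, c, m=2.4, eps=1e-5, count=300, seed=0):
    """Fuzzy C-means on the rows of T (n x A); returns (U, V)."""
    rng = np.random.default_rng(seed)
    n = T.shape[0]
    # Step 6.2.1: random memberships (rows sum to 1) and random centers.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    V = T[rng.choice(n, size=c, replace=False)]  # assumption: pick c rows of T
    for k in range(1, count + 1):
        # Squared Euclidean distances d_ij^2 between sample i and center j.
        d2 = ((T[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # Step 6.2.2: membership update (assumed textbook form).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        # Step 6.2.3: center update (assumed textbook form).
        W = U_new ** m
        V = (W.T @ T) / W.sum(axis=0)[:, None]
        # Step 6.2.4: stop when the membership matrix no longer changes.
        done = np.linalg.norm(U_new - U) < eps
        U = U_new
        if done:
            break
    return U, V

# Synthetic 2-D scores: normal samples with an abnormal block at rows 40..59.
rng = np.random.default_rng(1)
normal = rng.normal(0.0, 0.3, size=(60, 2))
abnormal = rng.normal(3.0, 0.3, size=(20, 2))
T = np.vstack([normal[:40], abnormal, normal[40:]])

U, V = fcm(T, c=2)
# Assumption: the first samples are known to be normal, so the cluster they
# mostly belong to is treated as the normal-condition cluster.
normal_cluster = int(np.argmax(U[:10].mean(axis=0)))
mu = 0.5
# Step 6.3: flag samples whose membership to the normal cluster is below mu.
alarms = np.flatnonzero(U[:, normal_cluster] < mu)
print(alarms)
```

The threshold test on a single column of U is the whole monitoring rule of step 6.3; everything else in the sketch only produces that column.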
It is to be understood that the above-described embodiments are only a few embodiments of the present invention, and not all embodiments. The above examples are only for explaining the present invention and do not constitute a limitation to the scope of protection of the present invention. All other embodiments, which can be derived by those skilled in the art from the above-described embodiments without any creative effort, namely all modifications, equivalents, improvements and the like made within the spirit and principle of the present application, fall within the protection scope of the present invention claimed.

Claims (6)

1. A KPLS and FCM based sewage treatment process monitoring method, characterized by comprising the following steps:
step 1: respectively collecting sewage treatment process data samples under normal working conditions and sewage treatment process data samples containing abnormal working conditions, wherein the sewage treatment process data samples comprise $m_1$ sewage treatment operating variables and $m_2$ effluent quality variables; placing, in time order, the sewage treatment process data samples under normal working conditions before the sewage treatment process data samples containing abnormal working conditions to form a mixed data sample set; taking the data of the $m_1$ sewage treatment operating variables in the mixed data sample set as the input data matrix X, and taking the data of the $m_2$ effluent quality variables in the mixed data sample set as the output data matrix Y;
step 2: preprocessing the input data matrix X and the output data matrix Y; the preprocessing comprises calculating the mean and standard deviation of each variable in the input data matrix X and the output data matrix Y, and normalizing X and Y to data with zero mean and unit standard deviation;
step 3: constructing a KPLS model for monitoring the sewage treatment process, mapping the input samples x in the input data matrix X to a high-dimensional feature space F: $x \to \Phi(x) \in F$, introducing a Gaussian kernel function to obtain the Gram matrix K of the input data matrix X, and centering the Gram matrix K;
step 4: determining the number of principal components by cross-validation, and solving for the score matrix T;
step 5: calculating the point density value $D_i$ of each input sample $x_i$ in the input data matrix X, calculating the structural function S(j), plotting the structural function S(j), and determining the cluster number c from the number of slopes of the plot of S(j);
step 5 comprises the following steps:
step 5.1: calculating the point density value $D_i$ of the input sample $x_i$ in the input data matrix X as
Figure FDA0002935515130000011
wherein
Figure FDA0002935515130000012
$r_d$ is the effective radius of the neighborhood density,
Figure FDA0002935515130000013
step 5.2: calculating the structural function S(j) as
Figure FDA0002935515130000014
step 5.3: plotting the structural function S(j), and taking the number of slopes of the plot of S(j) as the cluster number c;
step 6: taking c as the cluster number, clustering the score matrix T based on the FCM algorithm to obtain a membership matrix U, and monitoring the sewage treatment process for abnormal working conditions according to the membership matrix U: if the membership degree of the sample at a certain moment to the cluster center of the normal working condition samples is less than μ, the sewage treatment process is abnormal at that sample.
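The preprocessing of step 2 can be illustrated with a short sketch. The function name `zscore`, the use of the sample standard deviation (`ddof=1`), and the random stand-in data are assumptions; the claim itself only requires zero mean and unit standard deviation for each variable.

```python
import numpy as np

def zscore(M):
    """Scale each column of M to zero mean and unit standard deviation."""
    mean = M.mean(axis=0)
    std = M.std(axis=0, ddof=1)  # assumption: sample standard deviation
    return (M - mean) / std, mean, std

# Hypothetical mixed data sample set: 100 samples, 5 operating variables.
rng = np.random.default_rng(0)
X = rng.normal(loc=10.0, scale=3.0, size=(100, 5))
Xs, x_mean, x_std = zscore(X)
```

Returning the mean and standard deviation alongside the scaled matrix allows later samples to be scaled with the same statistics.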
2. The KPLS and FCM based sewage treatment process monitoring method according to claim 1, wherein the sewage treatment process is an activated sludge process: the raw sewage undergoes primary treatment and then enters the biochemical tanks; after biological denitrification, part of it passes through the internal recirculation return and enters the secondary sedimentation tank to be settled again; the biochemical tank section comprises biochemical tanks $l \in \{1,2,3,4,5\}$, wherein biochemical tanks $l_1 \in \{1,2\}$ are anoxic zones that mainly carry out the denitrification reaction, and biochemical tanks $l_2 \in \{3,4,5\}$ are aerobic zones that mainly carry out the nitrification reaction; in step 1, the $m_1$ sewage treatment operating variables comprise the influent flow, the influent ammonia nitrogen, the active heterotrophic biomass in biochemical tanks $l \in \{1,2,3,4,5\}$, the readily biodegradable organic matter in biochemical tanks $l \in \{1,2,3,4,5\}$, the nitrate nitrogen in biochemical tanks $l_1 \in \{1,2\}$, the active autotrophic biomass in biochemical tanks $l_2 \in \{3,4,5\}$, and the ammonia nitrogen in biochemical tanks $l_2 \in \{3,4,5\}$; the $m_2$ effluent quality variables comprise the biochemical oxygen demand, chemical oxygen demand, suspended solids and ammonia nitrogen of the effluent.
3. The KPLS and FCM based sewage treatment process monitoring method according to claim 1 or 2, wherein step 3 comprises the following steps:
step 3.1: constructing the KPLS model for monitoring the sewage treatment process as
$\Phi = T P_1' + \Phi_r$
$Y = T Q' + Y_r$
step 3.2: mapping the input samples x in the input data matrix X to a high-dimensional feature space F: $x \to \Phi(x) \in F$, introducing a Gaussian kernel function to obtain the Gram matrix K of the input data matrix X, centering the Gram matrix K, and converting the KPLS model into
$K = T P_2' + E$
$Y = T Q' + Y_r$
wherein the element in the i-th row and j-th column of the Gram matrix K is $K_{ij} = k(x_i, x_j) = \langle \Phi(x_i), \Phi(x_j) \rangle$; $x_i$ and $x_j$ are the i-th and j-th input samples in the input data matrix X, respectively; $k(x_i, x_j)$ is the Gaussian kernel function; $i, j \in \{1, 2, \ldots, n\}$, and n is the number of samples in the input data matrix X; T is the score matrix of the high-dimensional data $\Phi = \{\Phi(x_i), i \in \{1, 2, \ldots, n\}\}$, $T = [t_1, \ldots, t_A]$, and A is the number of principal components; $P_1 = [p_{11}, \ldots, p_{1A}]$, $P_2 = [p_{21}, \ldots, p_{2A}]$ and $Q = [q_1, \ldots, q_A]$ are the loading matrices of the matrix Φ, the Gram matrix K and the output data matrix Y, respectively; and $\Phi_r$, E and $Y_r$ are the modeling residuals of the matrix Φ, the Gram matrix K and the output data matrix Y, respectively.
4. The KPLS and FCM based sewage treatment process monitoring method according to claim 3, wherein in step 4 the number of principal components A is determined by cross-validation and the score matrix T is solved for, comprising the following steps:
step 4.1: letting u be any column of the output data matrix Y;
step 4.2: calculating the score vector: $t = Ku$;
step 4.3: normalizing the score vector t: $\|t\| \to 1$;
step 4.4: regressing each column of the output data matrix Y on the score vector t: $q = Y't$;
step 4.5: calculating the new score vector of the output data matrix Y: $u = Yq$;
step 4.6: normalizing the vector u: $\|u\| \to 1$;
step 4.7: judging whether u has converged: if so, going to step 4.8; if not, returning to step 4.2;
step 4.8: updating the matrices: steps 4.2 to 4.7 are repeated until one score vector has been calculated; the matrices are then updated as $K = (I - tt')K(I - tt')$ and $Y = Y - tq'$, and the next score vector is calculated, until A score vectors have been extracted; wherein I is the identity matrix.
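Steps 4.1 to 4.8 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the number of principal components A is passed in directly rather than chosen by cross-validation, and the function name `kpls_scores`, the convergence tolerance, the kernel width, and the synthetic data are all hypothetical.

```python
import numpy as np

def kpls_scores(K, Y, A, tol=1e-10, max_iter=500):
    """Extract A score vectors from a centered Gram matrix K and outputs Y."""
    K = K.copy()
    Y = Y.copy()
    n = K.shape[0]
    I = np.eye(n)
    T = np.zeros((n, A))
    for a in range(A):
        u = Y[:, [0]]                        # step 4.1: any column of Y
        for _ in range(max_iter):
            t = K @ u                        # step 4.2: t = K u
            t /= np.linalg.norm(t)           # step 4.3: ||t|| -> 1
            q = Y.T @ t                      # step 4.4: q = Y' t
            u_new = Y @ q                    # step 4.5: u = Y q
            u_new /= np.linalg.norm(u_new)   # step 4.6: ||u|| -> 1
            converged = np.linalg.norm(u_new - u) < tol  # step 4.7
            u = u_new
            if converged:
                break
        T[:, [a]] = t
        D = I - t @ t.T
        K = D @ K @ D                        # step 4.8: deflate K
        Y = Y - t @ q.T                      # step 4.8: deflate Y
    return T

# Synthetic data standing in for normalized process data (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
Y = X[:, :2] + 0.1 * rng.normal(size=(30, 2))
Y = Y - Y.mean(axis=0)

# Gaussian Gram matrix with an assumed width of 2.0, then centered.
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
K = np.exp(-sq / (2.0 * 2.0 ** 2))
H = np.eye(30) - np.ones((30, 30)) / 30
T = kpls_scores(H @ K @ H, Y, A=2)
```

Because K is deflated on both sides by $(I - tt')$, each new score vector is orthogonal to the previous ones, which is why the columns of T come out orthonormal.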
5. The KPLS and FCM based sewage treatment process monitoring method according to claim 3, wherein in step 3 the centered Gram matrix is
Figure FDA0002935515130000031
wherein $E_n$ is the n × n identity matrix, $1_n$ is the n-dimensional all-ones column vector, and $1_n'$ is the transpose of $1_n$.
6. The KPLS and FCM based sewage treatment process monitoring method according to claim 4, wherein step 6 comprises the following steps:
step 6.1: taking c as the cluster number, clustering the score matrix T based on the FCM algorithm, and constructing the FCM objective function
Figure FDA0002935515130000032
wherein
Figure FDA0002935515130000033
is the i-th row vector of the score matrix T,
Figure FDA0002935515130000034
is the new A-dimensional sample obtained by dimension reduction of the $m_1$-dimensional input sample $x_i$, and $u_{ij}$ is the membership degree of the sample
Figure FDA0002935515130000035
to the j-th cluster center $v_j$,
Figure FDA0002935515130000036
Figure FDA0002935515130000037
the membership matrix is $U = (u_{ij})_{n \times c}$ and the cluster center matrix is $V = (v_j)_{c \times A}$; $m \in [1, +\infty)$ is the fuzzy index;
Figure FDA0002935515130000041
is the Euclidean distance between the sample
Figure FDA0002935515130000042
and the j-th cluster center $v_j$; the c cluster centers obtained by clustering the score matrix T comprise the cluster center of the normal working condition samples and $c-1$ cluster centers of abnormal working condition samples;
step 6.2: solving for the membership matrix U:
step 6.2.1: initializing the FCM algorithm parameters: setting the fuzzy index m, the termination tolerance ε and the maximum number of iterations count; setting the initial iteration number k = 1; randomly initializing the membership matrix $U^{(k)} = (u_{ij}^{(k)})_{n \times c}$ and the cluster center matrix $V^{(k)} = (v_j^{(k)})_{c \times A}$;
step 6.2.2: substituting $v_j^{(k)}$ into the formula
Figure FDA0002935515130000043
to calculate the membership matrix of the (k+1)-th iteration, $U^{(k+1)} = (u_{ij}^{(k+1)})_{n \times c}$;
step 6.2.3: substituting $u_{ij}^{(k+1)}$ into the formula
Figure FDA0002935515130000044
to calculate the cluster center matrix of the (k+1)-th iteration, $V^{(k+1)} = (v_j^{(k+1)})_{c \times A}$;
step 6.2.4: if $\|U^{(k+1)} - U^{(k)}\| < \varepsilon$ or the iteration number k > count, stopping the iteration to obtain the final membership matrix U and proceeding to step 6.3; otherwise, setting k = k + 1 and returning to step 6.2.2;
step 6.3: monitoring the sewage treatment process for abnormal working conditions according to the membership matrix U: if the membership degree of the i-th sample
Figure FDA0002935515130000045
to the cluster center of the normal working condition samples is less than μ, the sewage treatment process is abnormal at the i-th sample.
Application number: CN201910572930.7A
Priority date: 2019-06-28
Filing date: 2019-06-28
Title: KPLS (kernel partial least squares) and FCM (fuzzy C-means) based sewage treatment process monitoring method
Status: Active
Publication: CN110232062B (en)

Publications (2)

Publication Number, Publication Date
CN110232062A, 2019-09-13
CN110232062B (en), 2021-04-02

Family ID: 67856615

Country Status (1)

Country: CN; Link: CN110232062B (en)

Legal Events

Code, Title
PB01, Publication
SE01, Entry into force of request for substantive examination
GR01, Patent grant