CN112183676A - Water quality soft measurement method based on mixed dimensionality reduction and kernel function extreme learning machine - Google Patents
- Publication number
- CN112183676A (application CN202011249249.8A)
- Authority
- CN
- China
- Prior art keywords
- data
- kernel function
- learning machine
- sample
- components
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/18—Water
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a water quality soft measurement method based on hybrid dimensionality reduction and a kernel function extreme learning machine. The method preprocesses the data and reduces its dimensionality using two indices: mutual information, which measures the nonlinear correlation between two components in sewage treatment, and the Pearson coefficient, which measures their linear correlation. It exploits the very fast learning speed and high estimation accuracy of the extreme learning machine. Because a kernel function is used, neither an explicit mapping function nor the number of hidden layer neurons needs to be specified, which saves the time otherwise spent optimizing the number of neurons and improves estimation performance. Sampling several times and averaging the results further lowers the demands the algorithm places on computing equipment and effectively reduces computational complexity while preserving performance. The method can quickly and effectively estimate the ammonia nitrogen ion concentration and avoids the effect on effluent quality of limited sensor performance and of the characteristics of sewage treatment, thereby improving the efficiency of the sewage treatment process and the effluent quality.
Description
Technical Field
The invention relates to the field of control science and engineering and environmental science and engineering, in particular to a water quality soft measurement method based on a hybrid dimensionality reduction and kernel function extreme learning machine.
Background
Water is one of the essential elements of life. Worldwide water reserves are abundant, about $1.36\times10^{18}\ \mathrm{m^3}$, but fresh water accounts for only 2.53% of them, mainly as deep water and glaciers, and the fresh water of lakes and rivers accounts for only 0.3% of the fresh water resource, so the fresh water available to human beings is very limited. At the same time, human development and use of water resources has been unreasonable, so that water is heavily wasted and polluted and the available water resource shrinks further. Polluted water damages the environment, causing vegetation decline, animal deaths and the like, and it also harms human society, endangering the health and even the lives of the people who use it.
Treating water pollution is therefore imperative, and sewage treatment is a very effective means of doing so. The various kinds of sewage generated by human society are purified and, once the water meets the standard, it is either discharged into rivers and lakes or put to harmless use. Harmless use means, for example, that factories use the treated water for production, and cities use it for purposes unrelated to food and drink, such as urban greening and road cleaning. In some countries or cities with extreme water shortages, such as Singapore and Windhoek, the capital of Namibia, fresh water purified by advanced processes is used as urban drinking water. Sewage treatment therefore not only reduces the damage that sewage does to the environment and to human society, but also relieves the shortage of water resources as human society develops.
For these reasons, sewage treatment plants are widely built in cities and factories, and even some rural areas have sewage treatment plants of a certain scale to treat the sewage generated by residents' daily life and by agricultural production. However, because of the limits of sensor technology and function, key components of the sewage cannot be measured directly or can be measured only with poor timeliness. Sewage treatment is also a system with large delays, so rapid feedback and adjustment are not possible, which affects effluent quality. To detect effluent quality quickly and effectively, the invention provides a soft measurement method that characterizes effluent quality indirectly from sewage quality indices that are easy to measure. This enables rapid detection and adjustment of water quality and solves the problem of untimely water quality detection caused by sensor performance that falls short of practical requirements and by the characteristics of the sewage treatment reaction.
Disclosure of Invention
To rapidly estimate components of the treated sewage that are difficult to measure, so that operators can adjust control strategies in time, the invention provides a water quality soft measurement method based on hybrid dimensionality reduction and a kernel function extreme learning machine.
The purpose of the invention is realized by the following technical scheme: a water quality soft measurement method based on a mixed dimensionality reduction and kernel function extreme learning machine comprises the following steps:
(1) Obtain $N_0$ groups of sample data $\{(X_i, T_i)\}_{i=1}^{N_0}$ from the sewage treatment process. Each input vector $X_i$ characterizes a number of sewage quality components, and the corresponding expected output $T_i$ characterizes the ammonia nitrogen ion concentration in the effluent.
(2) Compress the sample data by sampling, specifically: randomly select an integer initial value $a$ in $[1,10]$ and, starting from $a$, take one data point every 10 points to obtain one batch of data compressed tenfold. Repeat the sampling, resetting the initial value $a$ each time, to obtain $p$ batches of sample data.
(3) Remove the dimensions (units) from each batch of sample data separately: normalize the data of different dimensions to $[-1,1]$ by min-max normalization to obtain the normalized sample data $X_n$.
(4) Introduce two indices, mutual information and the Pearson coefficient, and calculate the correlation between each component of the detection sample $X_n$ obtained from sewage treatment and the soft measurement target, the ammonia nitrogen ion concentration $T$. According to the strength of the correlation, select the strongly correlated components and eliminate the weakly correlated or uncorrelated ones, thereby reducing the dimensionality of the detection sample data. The specific steps are as follows:
Step 1: Select one component of $X_n$, denoted $A$, and calculate its mutual information value and Pearson coefficient value with the soft measurement target ammonia nitrogen ion concentration $T$.
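Step 2: Calculate the mutual information value $MI(A,T)$:
$$MI(A,T)=\sum_{i=1}^{mm}\sum_{j=1}^{nn}P(A_i,T_j)\log\frac{P(A_i,T_j)}{P(A_i)\,P(T_j)}$$
where $P(A_i)$ and $P(T_j)$ are the probability distributions of the variables $A_i$ and $T_j$, $P(A_i,T_j)$ is the joint probability distribution of $A_i$ and $T_j$, and $mm$ and $nn$ are the numbers of data categories in $A$ and $T$, respectively.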
For the variable $A$, calculate its mean $\bar{A}$; each $A_i$ less than $\bar{A}$ is set to 0, and each $A_i$ greater than or equal to $\bar{A}$ is set to 1. The variable $T$ is processed in the same way. Then count the cases: the number of cases with $A_i=0,\ T_j=0$ is $Z_0$; with $A_i=0,\ T_j=1$ it is $Z_1$; with $A_i=1,\ T_j=0$ it is $Z_2$; and with $A_i=1,\ T_j=1$ it is $Z_3$. The total number of cases is $L=Z_0+Z_1+Z_2+Z_3$. The probability distributions are then:
$$P(A=0)=\frac{Z_0+Z_1}{L},\quad P(A=1)=\frac{Z_2+Z_3}{L},\quad P(T=0)=\frac{Z_0+Z_2}{L},\quad P(T=1)=\frac{Z_1+Z_3}{L}$$
The joint probability distribution is:
$$P(A=0,T=0)=\frac{Z_0}{L},\quad P(A=0,T=1)=\frac{Z_1}{L},\quad P(A=1,T=0)=\frac{Z_2}{L},\quad P(A=1,T=1)=\frac{Z_3}{L}$$
The mutual information value of the two components is then calculated from the definition of $MI(A,T)$.
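Step 3: Calculate the Pearson coefficient $r$:
$$r=\frac{1}{N-1}\sum_{k=1}^{N}\left(\frac{A_k-\bar{A}}{\sigma_A}\right)\left(\frac{T_k-\bar{T}}{\sigma_T}\right)$$
where $\bar{A}$ is the mean of sample $A$, $\sigma_A$ is the standard deviation of sample $A$, $\bar{T}$ is the mean of sample $T$, $\sigma_T$ is the standard deviation of sample $T$, $A_k$ is the $k$-th data point of sample $A$, and $T_k$ is the $k$-th data point of sample $T$.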
Step 4: After the mutual information and Pearson coefficient values of one component with $T$ have been calculated, move on to the next component, until the values for all components have been calculated. Select the strongly correlated components and reconstruct the detection data $X'\in R^{N\times q}$, where $q$ is the number of component types after dimensionality reduction.
(5) Construct the kernel function based extreme learning machine, with $q$ nodes in the input layer and 1 node in the output layer. Its expression is:
$$f(X')=T\left(\frac{I}{C}+\Omega_{ELM}\right)^{-1}\begin{bmatrix}K(X'_1,X')\\ \vdots\\ K(X'_N,X')\end{bmatrix}$$
where $f(X')$ is the neural network output, $T$ is the target output of the training data, $I$ is an identity matrix of the same size as $\Omega_{ELM}$, $C$ is a constant, $K(X'_i,X'_j)$ is the kernel function, and $\Omega_{ELM}$ is the kernel matrix, which has the following specific form:
$$\Omega_{ELM}=H^{T}H,\qquad \Omega_{ELM\,i,j}=h(X'_i)\cdot h(X'_j)=K(X'_i,X'_j)$$
$$H=\begin{bmatrix}G(a_1,b_1,X'_1)&\cdots&G(a_1,b_1,X'_N)\\ \vdots&&\vdots\\ G(a_L,b_L,X'_1)&\cdots&G(a_L,b_L,X'_N)\end{bmatrix}$$
where $G(\cdot)$ is the activation function of the neural network, $a_l$ and $b_l$ ($l=1,2,\ldots,L$) are the weights and biases from the input layer to the hidden layer, $L$ is the number of hidden layer nodes, $X'$ denotes the $N$ groups of neural network input data, that is, the sewage treatment detection data after dimensionality reduction, each group having $q$ feature values corresponding to the number of input layer nodes, $h(X'_j)$ is the $j$-th column of $H$, and $H$ is the output of the hidden layer.
(6) Train an extreme learning machine on each of the $p$ batches of sample data and average the results to obtain the final soft measurement result.
Further, in step (1), $N_0$ groups of sample data $\{(X_i, T_i)\}_{i=1}^{N_0}$ are obtained from the sewage treatment process, where each input vector has the specific form $X_i=[S_{I,i},S_{S,i},X_{I,i},X_{S,i},X_{BH,i},X_{BA,i},X_{P,i},S_{NO,i},S_{O,i},S_{ND,i},X_{ND,i}]^T$, whose entries respectively represent 11 components of the sewage: soluble inert organic matter, readily biodegradable substrate, insoluble inert organic matter, slowly biodegradable substrate, active heterotrophic organisms, active autotrophic organisms, insoluble products of biomass decay, nitrate and nitrite, dissolved oxygen, soluble degradable organic nitrogen, and insoluble degradable organic nitrogen.
Further, in step (4), the correlation between each component of the detection sample $X_n$ obtained from sewage treatment and the soft measurement target ammonia nitrogen ion concentration $T$ is calculated. Mutual information characterizes the nonlinear correlation between components, and the Pearson coefficient characterizes the linear correlation between components in sewage treatment. According to the strength of the correlation, strongly correlated components are selected separately by mutual information and by the Pearson coefficient, and the union of the two selections is taken as the set of strongly correlated components; weakly correlated or uncorrelated components are removed, which reduces the dimensionality of the detection sample data.
Further, in step (5), the kernel function $K(X'_i,X'_j)$ can take various forms:
Radial basis function (RBF) kernel: $K(X'_i,X'_j)=\exp(-\gamma\|X'_i-X'_j\|^2)$
where, in the various kernel functions, $X'_i$ and $X'_j$ are the $i$-th and $j$-th groups of sample data, and $a$, $c$, $p$, $\gamma$ and $\sigma$ are preset constants.
The kernel matrix $\Omega_{ELM}\in R^{N\times N}$ depends only on the input data $X'_i$ and the number of training samples. The kernel function $K(X'_i,X'_j)$ converts the input data $(X'_i,X'_j)$ in the low-dimensional space into the inner product $h(X'_i)\cdot h(X'_j)$ in the high-dimensional feature space. The method only needs the kernel function to be chosen in advance; the mapping function need not be defined explicitly and the number of hidden layer neurons need not be set, which saves the time spent optimizing the number of neurons and mitigates the loss of generalization and stability caused by the random assignment of hidden layer neurons in the traditional approach.
The beneficial effects of the invention are as follows. The method is applied to sewage treatment. Considering that the sewage treatment process involves many component types and a large volume of data, interval grouping is applied first, which preserves the performance of the soft measurement model while effectively reducing the amount of computation per run, lowering the hardware requirements, and reducing the computational complexity several-fold. Mutual information and the Pearson coefficient are then used to remove irrelevant data, which further reduces the amount and complexity of computation and strengthens the correlation between the sample components and the soft measurement target. Finally, estimation with the kernel function based extreme learning machine effectively improves soft measurement performance for the ammonia nitrogen ion concentration.
Drawings
FIG. 1 is a schematic view of the structure of the water quality soft measurement of the present invention;
FIG. 2 is a flow chart of the water quality soft measurement method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described here serve only to illustrate the invention and are not intended to limit its scope.
Because some components in the sewage treatment are difficult to detect or need to be detected through long chemical experiments, real-time detection cannot be realized, and in order to provide reasonable guidance suggestions for adjusting the sewage treatment, the parameters of the components need to be quickly estimated. The invention provides a water quality soft measurement method based on a mixed dimensionality reduction and kernel function extreme learning machine, which integrates algorithms such as mutual information, Pearson coefficients, a kernel function method, an extreme learning machine and interval sampling and the like, and provides convenience for soft measurement of water quality components in sewage treatment; as shown in fig. 1 and 2, the method comprises the following steps:
(1) Obtain $N_0$ groups of sample data $\{(X_i, T_i)\}_{i=1}^{N_0}$ from the sewage treatment process, where each input vector has the specific form $X_i=[S_{I,i},S_{S,i},X_{I,i},X_{S,i},X_{BH,i},X_{BA,i},X_{P,i},S_{NO,i},S_{O,i},S_{ND,i},X_{ND,i}]^T$, whose entries respectively represent 11 components of the sewage: soluble inert organic matter, readily biodegradable substrate, insoluble inert organic matter, slowly biodegradable substrate, active heterotrophic organisms, active autotrophic organisms, insoluble products of biomass decay, nitrate and nitrite, dissolved oxygen, soluble degradable organic nitrogen, and insoluble degradable organic nitrogen. The corresponding expected output is $T_i$ ($S_{NH,i}$), the ammonia nitrogen ion concentration in the effluent, a component that effectively characterizes effluent quality.
(2) Kernel function based neural network training maps low-dimensional data into a high-dimensional feature space for calculation, and computing the joint probability distribution in the mutual information also adds computational complexity. The sewage treatment process generates a large amount of data, which would greatly increase the computational load and complexity of the algorithm and run counter to the goal of practical application. To reduce computational complexity without affecting algorithm performance, the method samples multiple times. Given the original sample data $\{(X_i, T_i)\}_{i=1}^{N_0}$, compressed sample data are obtained by sampling: randomly select an integer initial value $a$ in $[1,10]$ and, starting from $a$, take one data point every 10 points to obtain one batch of data compressed tenfold. Repeat the sampling, resetting the initial value $a$ each time, to obtain $p$ batches of sample data, and apply the following processing and training to each of the $p$ batches.
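The interval sampling of step (2) can be sketched in Python as follows. This is a minimal illustration assuming the samples are held in NumPy arrays; the function name `interval_batches`, the `seed` argument and the toy data are hypothetical and do not come from the patent.

```python
import numpy as np

def interval_batches(X, T, p, step=10, seed=0):
    """Return p batches, each keeping every `step`-th sample from a random start."""
    rng = np.random.default_rng(seed)
    batches = []
    for _ in range(p):
        a = int(rng.integers(1, step + 1))    # 1-based start drawn from [1, step]
        idx = np.arange(a - 1, len(X), step)  # convert to 0-based indices
        batches.append((X[idx], T[idx]))
    return batches

# Toy data (hypothetical): 100 samples with 2 components each
X = np.arange(200, dtype=float).reshape(100, 2)
T = np.arange(100, dtype=float)
batches = interval_batches(X, T, p=3)
```

Each batch holds one tenth of the original samples, so every later step runs on roughly a tenth of the data per batch.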
(3) Because the dimensions of different components in sewage treatment are different and the scale difference between numerical values is huge, in order to eliminate the influence caused by the dimensions, the data of different dimensions are normalized to between [ -1,1] by a minimum maximum value normalization method aiming at each batch of sample data respectively, so that the influence of the dimensions on soft measurement is eliminated. The concrete form is as follows:
$$X_n=2\,\frac{X-X_{min}}{X_{max}-X_{min}}-1$$
where $X$ is the compressed sewage treatment sample data, $X_{min}$ is the minimum value of $X$, $X_{max}$ is the maximum value of $X$, and $X_n$ is the normalized sample data, specifically $X_n=[S_{In},S_{Sn},X_{In},X_{Sn},X_{BHn},X_{BAn},X_{Pn},S_{NOn},S_{On},S_{NDn},X_{NDn}]^T$.
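A minimal sketch of min-max normalization to $[-1,1]$, applied column by column so that each component is scaled by its own minimum and maximum. The function name `minmax_scale` and the guard for constant columns are assumptions added for illustration.

```python
import numpy as np

def minmax_scale(X):
    """Scale each column of X linearly to [-1, 1]."""
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    # Assumption: a constant column has zero range; use 1.0 so it maps to -1
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return 2.0 * (X - x_min) / span - 1.0

# Toy data (hypothetical): 3 samples, 2 components on different scales
X = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 40.0]])
Xn = minmax_scale(X)
```

After scaling, both columns span exactly $[-1,1]$ regardless of their original units.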
(4) Not all of the sample data $X_n$ are strongly correlated with the soft measurement target ammonia nitrogen ion concentration $T$ ($S_{NH}$), and introducing unnecessary variables increases the amount of computation. After the data are normalized, two indices, mutual information and the Pearson coefficient, are therefore introduced, and the correlation between each component of the detection sample $X_n$ and the soft measurement target ammonia nitrogen ion concentration $T$ ($S_{NH}$) is calculated. Mutual information characterizes the nonlinear correlation between components, and the Pearson coefficient characterizes the linear correlation between components in sewage treatment. According to the strength of the correlation, strongly correlated components are selected separately by mutual information and by the Pearson coefficient, and the union of the two selections is taken as the set of strongly correlated components; weakly correlated or uncorrelated components are removed. This reduces the dimensionality of the detection sample data and lowers the computational complexity and amount of computation. The specific steps are as follows:
Step 1: Select one of the 11 components of $X_n=[S_{In},S_{Sn},X_{In},X_{Sn},X_{BHn},X_{BAn},X_{Pn},S_{NOn},S_{On},S_{NDn},X_{NDn}]^T$ and calculate its mutual information value and Pearson coefficient value with the soft measurement target ammonia nitrogen ion concentration $T$ ($S_{NH}$). For simplicity, the selected component is denoted $A$, and the notation for the target variable is unchanged.
Step 2: Calculate the mutual information value $MI(A,T)$:
$$MI(A,T)=\sum_{i=1}^{mm}\sum_{j=1}^{nn}P(A_i,T_j)\log\frac{P(A_i,T_j)}{P(A_i)\,P(T_j)}$$
where $P(A_i)$ and $P(T_j)$ are the probability distributions of the variables $A_i$ and $T_j$, $P(A_i,T_j)$ is the joint probability distribution of $A_i$ and $T_j$, and $mm$ and $nn$ are the numbers of data categories in $A$ and $T$, respectively (data with the same normalized concentration value are counted as one category). The larger $MI(A,T)$ is, the more closely the variable $A$ is related to $T$; the smaller it is, the weaker the relation. If $MI(A,T)$ is zero, the two variables are completely independent.
In practice, however, each component in sewage treatment takes a great many distinct values, which greatly increases the difficulty of the calculation. To simplify it, calculate the mean $\bar{A}$ of the variable $A$ (the mean of all data in $A$); each $A_i$ less than $\bar{A}$ is set to 0, and each $A_i$ greater than or equal to $\bar{A}$ is set to 1. The variable $T$ is processed in the same way. Then count the cases: the number of cases with $A_i=0,\ T_j=0$ is $Z_0$; with $A_i=0,\ T_j=1$ it is $Z_1$; with $A_i=1,\ T_j=0$ it is $Z_2$; and with $A_i=1,\ T_j=1$ it is $Z_3$.
The total number of cases is: $L=Z_0+Z_1+Z_2+Z_3$
The probability distributions are then:
$$P(A=0)=\frac{Z_0+Z_1}{L},\quad P(A=1)=\frac{Z_2+Z_3}{L},\quad P(T=0)=\frac{Z_0+Z_2}{L},\quad P(T=1)=\frac{Z_1+Z_3}{L}$$
and the joint probability distribution is:
$$P(A=0,T=0)=\frac{Z_0}{L},\quad P(A=0,T=1)=\frac{Z_1}{L},\quad P(A=1,T=0)=\frac{Z_2}{L},\quad P(A=1,T=1)=\frac{Z_3}{L}$$
The mutual information value of the two components can then be calculated from the definition of $MI(A,T)$.
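The simplified mutual information calculation can be sketched as follows: each variable is binarized at its mean, the four cases $Z_0$ to $Z_3$ are counted, and the definition of $MI(A,T)$ is applied to the resulting 2×2 table. The function name is hypothetical, and the natural logarithm is an assumption, since the text does not state the logarithm base.

```python
import numpy as np

def binarized_mutual_information(a, t):
    """Mutual information of two variables after thresholding each at its mean."""
    a_bin = (a >= a.mean()).astype(int)  # 0 below the mean, 1 otherwise
    t_bin = (t >= t.mean()).astype(int)
    joint = np.zeros((2, 2))
    np.add.at(joint, (a_bin, t_bin), 1)  # counts Z0..Z3 laid out as joint[A, T]
    joint /= joint.sum()                 # divide by L to get joint probabilities
    p_a = joint.sum(axis=1)              # marginal distribution of A
    p_t = joint.sum(axis=0)              # marginal distribution of T
    mi = 0.0
    for i in range(2):
        for j in range(2):
            if joint[i, j] > 0:          # empty cells contribute nothing
                mi += joint[i, j] * np.log(joint[i, j] / (p_a[i] * p_t[j]))
    return mi

# Toy check (hypothetical data): a variable compared with itself gives log(2)
a = np.array([0.1, 0.2, 0.8, 0.9])
mi_self = binarized_mutual_information(a, a)
```

Binarizing reduces the joint distribution to four cells, which is what keeps the calculation cheap for components with many distinct values.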
Step 3: Calculate the Pearson coefficient $r$:
$$r=\frac{1}{N-1}\sum_{k=1}^{N}\left(\frac{A_k-\bar{A}}{\sigma_A}\right)\left(\frac{T_k-\bar{T}}{\sigma_T}\right)$$
where $\bar{A}$ is the mean of sample $A$, $\sigma_A$ is the standard deviation of sample $A$, $\bar{T}$ is the mean of sample $T$, $\sigma_T$ is the standard deviation of sample $T$, $A_k$ is the $k$-th data point of sample $A$, and $T_k$ is the $k$-th data point of sample $T$.
The Pearson coefficient ranges from -1 to 1. A value of 1 means that $A$ and $T$ are exactly described by a linear equation: all data points fall on a straight line and $A$ increases as $T$ increases. A value of -1 means that all data points fall on a straight line and $A$ decreases as $T$ increases. A value of 0 means that there is no linear relationship between the two variables.
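A short sketch of the Pearson coefficient computed from standardized deviations. It assumes sample standard deviations (`ddof=1`) paired with a $1/(N-1)$ factor; using population standard deviations with $1/N$ gives the same value. The function name and toy data are hypothetical.

```python
import numpy as np

def pearson_r(a, t):
    """Pearson coefficient as the averaged product of standardized deviations."""
    za = (a - a.mean()) / a.std(ddof=1)  # standardized A (sample std, an assumption)
    zt = (t - t.mean()) / t.std(ddof=1)  # standardized T
    return float(np.sum(za * zt) / (len(a) - 1))

# Toy data (hypothetical)
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
t = np.array([2.0, 1.0, 4.0, 3.0, 6.0])
r = pearson_r(a, t)
r_linear = pearson_r(a, 2.0 * a + 1.0)  # exactly linear, increasing
r_inverse = pearson_r(a, -a)            # exactly linear, decreasing
```

The two limiting cases reproduce the values 1 and -1 described above.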
Step 4: After the mutual information and Pearson coefficient values of one component with the soft measurement target ammonia nitrogen ion concentration $T$ ($S_{NH}$) have been calculated, move on to the next component, until the values for all 11 components have been calculated. Select the strongly correlated components and reconstruct the detection data $X'\in R^{N\times q}$ ($q$ is the number of component types after dimensionality reduction, $q<11$), while the soft measurement target ammonia nitrogen ion concentration $T$ ($S_{NH}$) remains unchanged.
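The selection in Step 4 can be sketched as taking the union of the components that pass a mutual information criterion and those that pass a Pearson criterion. The text does not say how "strongly correlated" is decided, so the two thresholds below, the function name and the score values are purely hypothetical.

```python
import numpy as np

def select_components(mi_scores, r_values, mi_threshold, r_threshold):
    """Indices of components kept by either criterion (union of both selections)."""
    mi_scores = np.asarray(mi_scores)
    r_values = np.asarray(r_values)
    by_mi = set(np.flatnonzero(mi_scores >= mi_threshold).tolist())
    by_r = set(np.flatnonzero(np.abs(r_values) >= r_threshold).tolist())  # sign is ignored
    return sorted(by_mi | by_r)

# Hypothetical scores for 4 components
mi_scores = [0.50, 0.02, 0.10, 0.40]
r_values = [0.30, -0.90, 0.05, 0.80]
kept = select_components(mi_scores, r_values, mi_threshold=0.3, r_threshold=0.7)
# kept is [0, 1, 3]; the reduced data would then be Xn[:, kept]
```

Taking the union keeps a component that is only nonlinearly related to the target (high mutual information, low $|r|$) as well as one that is only linearly related.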
(5) Sewage treatment data have many component types and a large volume; after the repeated data sampling and the preprocessing dimensionality reduction, the computational complexity and amount of computation are greatly reduced. The sewage treatment process is a very complex, strongly nonlinear system that is difficult to model accurately or to express precisely with a mathematical model, so to make the soft measurement more accurate the method introduces a kernel function based extreme learning machine. The extreme learning machine neural network consists of an input layer, a hidden layer and an output layer; according to the characteristics of the sample data, the input layer is set to have $q$ nodes and the output layer 1 node. The extreme learning machine is constructed in the following steps:
Step 1: Determine the number of component types $q$ and the data length $N$ of the input data according to the size of the training sample data set, and form the hidden layer output:
$$X'=[X'_1,X'_2,\ldots,X'_N]$$
$$H=[h(X'_1),\ldots,h(X'_N)]=\begin{bmatrix}G(a_1,b_1,X'_1)&\cdots&G(a_1,b_1,X'_N)\\ \vdots&&\vdots\\ G(a_L,b_L,X'_1)&\cdots&G(a_L,b_L,X'_N)\end{bmatrix}$$
where $G(\cdot)$ is the activation function of the neural network, $a_l$ and $b_l$ ($l=1,2,\ldots,L$) are the weights and biases from the input layer to the hidden layer, $L$ is the number of hidden layer nodes, $X'$ denotes the $N$ groups of neural network input data, that is, the sewage treatment detection data after dimensionality reduction, each group having $q$ feature values corresponding to the number of input layer nodes, and $H$ is the output of the hidden layer.
Step 2: Take the ammonia nitrogen ion concentration $S_{NH}$ of the sewage effluent as the target history data $T$:
$$T=[T_1,T_2,\ldots,T_N]$$
where $T_j$ ($j=1,2,\ldots,N$) is the output of the $j$-th group of target history data.
Step 3: Construct the network from the hidden layer to the output layer. The output layer uses the purelin (linear) function, which gives
$$T_j=\sum_{l=1}^{L}w_l\,G(a_l,b_l,X'_j),\qquad j=1,2,\ldots,N$$
Written in matrix form:
$$T=\beta H$$
where $w_l$ is the weight from the hidden layer to the output layer, with vector form $\beta=[w_1,\ldots,w_L]\in R^{L}$, and $G(a_l,b_l,X')$ is both the hidden layer output and the output layer input, with matrix form $H\in R^{L\times N}$.
Step 4: Given the results of Step 1 and Step 2, solve the equation of Step 3 by the generalized inverse method to obtain the weight vector from the hidden layer to the output layer:
$$\beta=T H^{T}\left(\frac{I_L}{C}+HH^{T}\right)^{-1}$$
where $I_L$ is the identity matrix of dimension $L$ and $C$ is a constant.
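Steps 1 to 4 describe a basic extreme learning machine: random hidden layer parameters, a linear output layer, and output weights obtained from a regularized generalized inverse. A minimal sketch follows, using the convention $H\in R^{L\times N}$ and $T=\beta H$. The sigmoid activation, the uniform initialization range, the function names and the toy data are assumptions for illustration only.

```python
import numpy as np

def hidden_output(X, a, b):
    """Hidden layer output H (L x N), assuming a sigmoid activation G."""
    return 1.0 / (1.0 + np.exp(-(a @ X.T + b)))

def train_elm(X, T, L, C, seed=0):
    """Random hidden layer plus regularized least-squares output weights."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(-1.0, 1.0, size=(L, X.shape[1]))  # random input weights a_l
    b = rng.uniform(-1.0, 1.0, size=(L, 1))           # random biases b_l
    H = hidden_output(X, a, b)
    # beta = T H^T (I_L / C + H H^T)^-1; the matrix is symmetric,
    # so the same beta is obtained by solving a linear system
    beta = np.linalg.solve(np.eye(L) / C + H @ H.T, H @ T)
    return a, b, beta

# Toy data (hypothetical): one input component, smooth target
X = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
T = np.sin(3.0 * X[:, 0])
a, b, beta = train_elm(X, T, L=20, C=100.0)
prediction = beta @ hidden_output(X, a, b)            # f(X') = beta H
```

Only `beta` is learned; `a` and `b` stay at their random values, which is the source of the run-to-run variability that the kernel form avoids.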
When the specific form of the hidden layer feature mapping $h(X')$ in Step 1 is unknown, a kernel function is introduced to measure the similarity between samples, and the kernel matrix of the extreme learning machine can be defined according to the Mercer condition, in the following specific form:
$$\Omega_{ELM}=H^{T}H,\qquad \Omega_{ELM\,i,j}=h(X'_i)\cdot h(X'_j)=K(X'_i,X'_j)$$
where $\Omega_{ELM}$ is the kernel matrix. The kernel function $K(X'_i,X'_j)$ can take various forms; common types include:
Radial basis function (RBF) kernel: $K(X'_i,X'_j)=\exp(-\gamma\|X'_i-X'_j\|^2)$
where, in the various kernel functions, $X'_i$ and $X'_j$ are the $i$-th and $j$-th groups of sample data, and $a$, $c$, $p$, $\gamma$ and $\sigma$ are preset constants. Step 4 then becomes:
$$f(X')=T\left(\frac{I}{C}+\Omega_{ELM}\right)^{-1}\begin{bmatrix}K(X'_1,X')\\ \vdots\\ K(X'_N,X')\end{bmatrix}$$
where $f(X')$ is the output of the neural network and $I$ is the identity matrix of dimension $N$.
Thus the kernel matrix $\Omega_{ELM} \in R^{N \times N}$ depends only on the input data $X'_i$ and the number of training samples. The kernel function $K(X'_i, X'_j)$ converts the input data $(X'_i, X'_j)$ in the low-dimensional space into the inner product $h(X'_i) \cdot h(X'_j)$ in the high-dimensional feature space. The method only requires choosing a kernel function in advance; it needs neither an explicitly defined mapping function nor a preset number of hidden-layer neurons. This saves the time otherwise spent optimizing the number of neurons and mitigates the loss of generalization and stability caused by the random assignment of hidden-layer neuron parameters in the conventional extreme learning machine.
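The kernel-based training and prediction described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes the usual kernel extreme learning machine form in which the coefficients solve $(I/C + \Omega_{ELM})\,\alpha = T$ with an $N \times N$ identity, and the function names (`rbf_kernel`, `solve_linear`, `kelm_fit`, `kelm_predict`), the toy data, and the values of `C` and `gamma` are all hypothetical.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel exp(-gamma * ||x - y||^2) for two equal-length sequences."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def kelm_fit(X, T, C=1e6, gamma=1.0):
    """Return alpha solving (I/C + Omega) alpha = T, Omega[i][j] = K(X[i], X[j])."""
    n = len(X)
    A = [[rbf_kernel(X[i], X[j], gamma) + (1.0 / C if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve_linear(A, T)

def kelm_predict(x, X, alpha, gamma=1.0):
    """Model output f(x) = sum_i K(x, X[i]) * alpha[i]."""
    return sum(rbf_kernel(x, xi, gamma) * a for xi, a in zip(X, alpha))

# Toy data (invented for illustration): 4 samples with q = 2 reduced features.
X_train = [[-1.0, 0.0], [0.0, 0.5], [0.5, -0.5], [1.0, 1.0]]
T_train = [0.2, 0.4, 0.1, 0.9]
alpha = kelm_fit(X_train, T_train)
fitted = [kelm_predict(x, X_train, alpha) for x in X_train]
```

No hidden-layer size appears anywhere in the sketch: only the kernel and the constant `C` have to be chosen, which is the property the paragraph above emphasizes. With a large `C` the fitted values reproduce the training targets closely; a smaller `C` applies stronger regularization.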
(6) Train the extreme learning machine separately on each of the p batches of sample data, and average the results to obtain the final soft measurement result, namely the concentration of ammonia nitrogen ions in the effluent.
The foregoing is only a preferred embodiment of the present invention. Although the invention has been disclosed by way of the preferred embodiment, the embodiment is not intended to limit the invention. Using the methods and technical content disclosed above, those skilled in the art can make many possible variations and modifications to the technical solution, or adapt it into equivalent embodiments, without departing from its scope. Therefore, any simple modification, equivalent change or refinement made to the above embodiment according to the technical essence of the present invention, provided it does not depart from the content of the technical solution, still falls within the scope of protection of the technical solution of the present invention.
Claims (6)
1. A water quality soft measurement method based on mixed dimensionality reduction and a kernel function extreme learning machine, characterized by comprising the following steps:
(1) Obtain $N_0$ groups of sample data from a wastewater treatment process, where each input vector $X_i$ characterizes a number of wastewater quality components and the corresponding expected output $T_i$ characterizes the concentration of ammonia nitrogen ions in the effluent.
(2) Compress the sample data by sampling, specifically: randomly select an integer initial value a in [1, 10] and, starting from a, take one data point every 10 points, which yields one batch of data compressed tenfold; repeat the sampling, resetting the initial value a each time, to obtain p batches of sample data.
(3) Nondimensionalize each batch of sample data: normalize the data of different dimensions to [-1, 1] by min-max normalization to obtain the normalized sample data $X_n$.
(4) Introduce two indexes, mutual information and the Pearson coefficient. For the detection samples $X_n$ obtained from sewage treatment, compute the correlation of each component with the soft measurement target, the ammonia nitrogen ion concentration $T$, under each index; select the strongly correlated components according to correlation strength and eliminate the weakly correlated or uncorrelated components, thereby reducing the dimension of the detection sample data. The specific steps are as follows:
Step 1: Select one component of $X_n$, denoted $A$, and calculate its mutual information value and Pearson coefficient value with the soft measurement target, the ammonia nitrogen ion concentration $T$.
Step 2: Calculate the mutual information value $MI(A, T)$:
$$MI(A, T) = \sum_{i=1}^{mm} \sum_{j=1}^{nn} P(A_i, T_j) \log \frac{P(A_i, T_j)}{P(A_i)\,P(T_j)}$$
where $P(A_i)$ and $P(T_j)$ are the probability distributions of the variables $A_i$ and $T_j$, $P(A_i, T_j)$ is the joint probability distribution of $A_i$ and $T_j$, and $mm$ and $nn$ are the numbers of data types in $A$ and $T$, respectively.
For the variable $A$, calculate its mean $\bar{A}$; each $A_i$ less than the mean $\bar{A}$ is set to 0, and each $A_i$ greater than or equal to the mean $\bar{A}$ is set to 1. The variable $T$ is processed in the same way. Then: the number of cases with $A_i = 0, T_j = 0$ is $Z_0$; the number of cases with $A_i = 0, T_j = 1$ is $Z_1$; the number of cases with $A_i = 1, T_j = 0$ is $Z_2$; and the number of cases with $A_i = 1, T_j = 1$ is $Z_3$. Let the total number of cases be $L = Z_0 + Z_1 + Z_2 + Z_3$. The following probability distributions are then obtained:
$$P(A = 0) = \frac{Z_0 + Z_1}{L}, \quad P(A = 1) = \frac{Z_2 + Z_3}{L}, \quad P(T = 0) = \frac{Z_0 + Z_2}{L}, \quad P(T = 1) = \frac{Z_1 + Z_3}{L}$$
The joint probability distribution is calculated in the same way, and the mutual information value of the two components then follows from the defining formula of $MI(A, T)$.
Step 3: Calculate the Pearson coefficient $r$:
where $\bar{A}$ is the mean of sample $A$, $\sigma_A$ is the standard deviation of sample $A$, $\bar{T}$ is the mean of sample $T$, $\sigma_T$ is the standard deviation of sample $T$, $A_k$ is the $k$-th data point of sample $A$, and $T_k$ is the $k$-th data point of sample $T$.
Step 4: After the mutual information and Pearson coefficient values between one component and $T$ have been calculated, proceed to the next component until the values have been calculated for all components. Then select the strongly correlated components and reconstruct the detection data $X' \in R^{N \times q}$, where $q$ is the number of component categories after dimensionality reduction.
(5) Construct the extreme learning machine based on the kernel function, whose input layer has q nodes and whose output layer has 1 node, with the following expression:
where $f(X')$ is the neural network output, $T$ is the target output of the training data, $I_L$ is the identity matrix, $C$ is a constant, $K(X'_i, X'_j)$ is the kernel function, and $\Omega_{ELM}$ is the kernel matrix, whose specific form is:
where $G(\cdot)$ is the activation function of the neural network; $a_l$ and $b_l$ ($l = 1, 2, \dots, L$) are the weights and biases from the input layer to the hidden layer; $L$ is the number of hidden-layer nodes; $X'$ denotes the $N$ groups of neural network input data, i.e. the sewage treatment detection data after dimensionality reduction, each group having $q$ feature values, which equals the number of input-layer nodes; and $H$ is the output of the hidden layer.
(6) Train the extreme learning machine separately on each of the p batches of sample data, and average the results to obtain the final soft measurement result.
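Steps (2) and (6) of claim 1 can be sketched as follows. This is an illustrative sketch only: the function names (`interval_sample`, `make_batches`, `average_predictions`), the fixed random seed, the stand-in data and the placeholder model outputs are assumptions introduced here for demonstration.

```python
import random

def interval_sample(data, rng, step=10):
    """Keep every `step`-th point, starting from a random 1-based index a in [1, step]."""
    a = rng.randint(1, step)  # randint includes both endpoints
    return data[a - 1::step]

def make_batches(data, p, step=10, seed=0):
    """Repeat the interval sampling p times, drawing a fresh start value each time."""
    rng = random.Random(seed)
    return [interval_sample(data, rng, step) for _ in range(p)]

def average_predictions(per_batch_predictions):
    """Average the outputs of the p per-batch models into one soft measurement value."""
    return sum(per_batch_predictions) / len(per_batch_predictions)

samples = list(range(1, 101))                  # stand-in for 100 time-ordered samples
batches = make_batches(samples, p=5)           # 5 batches, each compressed tenfold
final = average_predictions([1.0, 2.0, 3.0])   # placeholder per-batch model outputs
```

Each batch keeps one point in ten, so the p models are trained on smaller, partly different subsets of the same record, and averaging their outputs gives the final value.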
2. The water quality soft measurement method based on mixed dimensionality reduction and a kernel function extreme learning machine according to claim 1, characterized in that in step (1), $N_0$ groups of sample data are obtained from the sewage treatment process, where each input vector has the specific form $X_i = [S_{I,i}, S_{S,i}, X_{I,i}, X_{S,i}, X_{BH,i}, X_{BA,i}, X_{P,i}, S_{NO,i}, S_{O,i}, S_{ND,i}, X_{ND,i}]^T$, whose 11 components respectively represent soluble inert organic matter, readily biodegradable substrate, insoluble inert organic matter, slowly biodegradable substrate, active heterotrophic biomass, active autotrophic biomass, insoluble products of biomass decay, nitrate and nitrite, ammonium ions, soluble degradable organic nitrogen, and insoluble degradable organic nitrogen in the sewage.
3. The water quality soft measurement method based on mixed dimensionality reduction and a kernel function extreme learning machine according to claim 1, characterized in that in step (3), the data normalization formula is as follows:
$$X_n = \frac{2\,(X - X_{min})}{X_{max} - X_{min}} - 1$$
where $X$ is the compressed sample data from sewage treatment, $X_{min}$ is the minimum of $X$, $X_{max}$ is the maximum of $X$, and $X_n$ is the normalized sample data.
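A minimal sketch of min-max normalization to [-1, 1], applied independently to each component (column) of a batch, might look like this. The function names and the sample batch are hypothetical, and mapping a constant column to 0 is an assumption introduced here to avoid division by zero.

```python
def minmax_normalize(values):
    """Map a sequence linearly onto [-1, 1] using its own minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no scale information, so map it to 0.
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

def normalize_columns(rows):
    """Normalize each component (column) of a list of sample rows independently."""
    columns = [minmax_normalize(col) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]

# Two components on very different scales (values invented for illustration).
batch = [[2.0, 100.0], [4.0, 300.0], [6.0, 200.0]]
normalized = normalize_columns(batch)
```

After this step both components lie in the same range, so neither dominates the distance computations inside a kernel function merely because of its units.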
4. The water quality soft measurement method based on mixed dimensionality reduction and a kernel function extreme learning machine according to claim 1, characterized in that in step (4), for the detection samples $X_n$ obtained from sewage treatment, correlations with the soft measurement target, the ammonia nitrogen ion concentration $T$, are calculated under each index, where mutual information characterizes the nonlinear correlation between different components and the Pearson coefficient characterizes the linear correlation between different components in sewage treatment; strongly correlated components are selected separately by mutual information and by the Pearson coefficient according to correlation strength, and the union of the two selections gives the strongly correlated components, while the weakly correlated or uncorrelated components are removed, thereby reducing the dimension of the detection sample data.
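The two-index selection with a union can be sketched as below. This is a hedged illustration: the mean-threshold binarization follows the description, but the natural logarithm, the population form of the Pearson coefficient, the thresholds `mi_min` and `r_min`, the function names and the toy data are all assumptions introduced here, since the text does not fix them.

```python
import math

def binarize(values):
    """Set entries below the mean to 0 and entries at or above the mean to 1."""
    mean = sum(values) / len(values)
    return [0 if v < mean else 1 for v in values]

def mutual_information(a, t):
    """Mutual information (natural log) of two sequences after binarization."""
    a_bin, t_bin = binarize(a), binarize(t)
    n = len(a_bin)
    mi = 0.0
    for i in (0, 1):
        for j in (0, 1):
            p_ij = sum(1 for x, y in zip(a_bin, t_bin) if x == i and y == j) / n
            if p_ij > 0:  # a zero joint probability contributes nothing
                p_i = a_bin.count(i) / n
                p_j = t_bin.count(j) / n
                mi += p_ij * math.log(p_ij / (p_i * p_j))
    return mi

def pearson(a, t):
    """Pearson correlation coefficient; assumes neither sequence is constant."""
    n = len(a)
    mean_a, mean_t = sum(a) / n, sum(t) / n
    cov = sum((x - mean_a) * (y - mean_t) for x, y in zip(a, t))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_t = sum((y - mean_t) ** 2 for y in t)
    return cov / math.sqrt(var_a * var_t)

def select_components(columns, target, mi_min, r_min):
    """Return the union of component indices that pass either threshold."""
    by_mi = {k for k, c in enumerate(columns) if mutual_information(c, target) >= mi_min}
    by_r = {k for k, c in enumerate(columns) if abs(pearson(c, target)) >= r_min}
    return sorted(by_mi | by_r)

target = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
columns = [
    [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],  # linear in the target
    [3.0, 1.0, 3.0, 1.0, 3.0, 1.0],    # alternating, only weakly related
]
selected = select_components(columns, target, mi_min=0.3, r_min=0.8)
```

Taking the union rather than the intersection keeps a component that is strongly related to the target under either index, so a purely nonlinear dependence is not discarded just because its linear correlation is low.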
5. The water quality soft measurement method based on mixed dimensionality reduction and a kernel function extreme learning machine according to claim 1, characterized in that in step (5), the kernel function $K(X'_i, X'_j)$ can take various forms:
Linear kernel function: $K(X'_i, X'_j) = X'^{T}_i X'_j + c$
Polynomial kernel function: $K(X'_i, X'_j) = (a X'^{T}_i X'_j + c)^{p}$
Radial basis function (RBF) kernel function: $K(X'_i, X'_j) = \exp(-\gamma \|X'_i - X'_j\|^2)$
where $X'_i$ and $X'_j$ in the kernel functions denote the $i$-th and $j$-th groups of sample data, and $a$, $c$, $p$, $\gamma$ and $\sigma$ are preset constants.
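The three kernel functions and the resulting kernel matrix can be written out as a short sketch. The default constants, the helper names (`dot`, `kernel_matrix`) and the sample points are illustrative assumptions, not values taken from the claims.

```python
import math

def dot(x, y):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y, c=0.0):
    """K(x, y) = x^T y + c"""
    return dot(x, y) + c

def polynomial_kernel(x, y, a=1.0, c=1.0, p=2):
    """K(x, y) = (a * x^T y + c)^p"""
    return (a * dot(x, y) + c) ** p

def rbf_kernel(x, y, gamma=0.5):
    """K(x, y) = exp(-gamma * ||x - y||^2)"""
    return math.exp(-gamma * sum((u - v) ** 2 for u, v in zip(x, y)))

def kernel_matrix(samples, kernel):
    """Omega[i][j] = K(samples[i], samples[j]); symmetric for these kernels."""
    return [[kernel(xi, xj) for xj in samples] for xi in samples]

samples = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
omega = kernel_matrix(samples, rbf_kernel)
```

Whatever kernel is chosen, the matrix has one row and one column per training sample, so its size depends on the number of samples and not on the dimension of the implicit feature space.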
6. The water quality soft measurement method based on mixed dimensionality reduction and a kernel function extreme learning machine according to claim 1, characterized in that in step (5), the kernel matrix $\Omega_{ELM} \in R^{N \times N}$ depends only on the input data $X'_i$ and the number of training samples; the kernel function $K(X'_i, X'_j)$ converts the input data $(X'_i, X'_j)$ in the low-dimensional space into the inner product $h(X'_i) \cdot h(X'_j)$ in the high-dimensional feature space and is independent of the dimension of the feature space, so that the curse of dimensionality can be effectively avoided.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011249249.8A CN112183676A (en) | 2020-11-10 | 2020-11-10 | Water quality soft measurement method based on mixed dimensionality reduction and kernel function extreme learning machine |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011249249.8A CN112183676A (en) | 2020-11-10 | 2020-11-10 | Water quality soft measurement method based on mixed dimensionality reduction and kernel function extreme learning machine |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112183676A (en) | 2021-01-05 |
Family
ID=73918141
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011249249.8A Pending CN112183676A (en) | 2020-11-10 | 2020-11-10 | Water quality soft measurement method based on mixed dimensionality reduction and kernel function extreme learning machine |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112183676A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210105 |