CN112183676A - Water quality soft measurement method based on mixed dimensionality reduction and kernel function extreme learning machine - Google Patents

Info

Publication number
CN112183676A
Authority
CN
China
Prior art keywords
data
kernel function
learning machine
sample
components
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011249249.8A
Other languages
Chinese (zh)
Inventor
杨秦敏
曹伟伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Application filed by Zhejiang University ZJU
Priority to CN202011249249.8A
Publication of CN112183676A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213: Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00: Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/18: Water
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Abstract

The invention discloses a water quality soft measurement method based on hybrid dimensionality reduction and a kernel function extreme learning machine. Mutual information measures the nonlinear correlation between two components in sewage treatment, and the Pearson coefficient measures their linear correlation; both are used to preprocess the data and reduce its dimensionality. The extreme learning machine learns very quickly and estimates with high precision, and the kernel function requires neither an explicitly defined mapping function nor a chosen number of hidden layer neurons, which saves the time otherwise spent optimizing the neuron count and improves estimation performance. Repeated sampling followed by averaging further lowers the demands the algorithm places on computing equipment and reduces computational complexity while preserving performance. The method estimates the concentration of ammonia nitrogen ions quickly and effectively, and avoids the degradation of effluent quality caused by sensor limitations and by the characteristics of sewage treatment, thereby improving the efficiency of the sewage treatment process and the effluent quality.

Description

Water quality soft measurement method based on mixed dimensionality reduction and kernel function extreme learning machine
Technical Field
The invention relates to the fields of control science and engineering and of environmental science and engineering, and in particular to a water quality soft measurement method based on hybrid dimensionality reduction and a kernel function extreme learning machine.
Background
Water is essential to life. The world's total water reserves are abundant, about $1.36\times10^{18}\,\mathrm{m}^3$, but fresh water accounts for only 2.53% of this, most of it held in deep groundwater and glaciers; the fresh water in lakes and rivers makes up only 0.3% of the fresh water resource, so the fresh water available to humans is very limited. At the same time, unreasonable development and use of water resources has caused serious waste and pollution, further shrinking the usable supply. Polluted water damages the environment, degrades vegetation, and kills animals, and it also harms human society, endangering the health and even the lives of the people who use it.
Treating water pollution is therefore imperative, and sewage treatment is a very effective means of doing so. The various kinds of sewage generated by human society are purified and, once the water meets the standard, it is discharged into rivers and lakes or reused harmlessly. Harmless reuse means that factories use treated water that meets the standard for production operations, and cities use it for purposes unrelated to food and drink, such as urban greening and road cleaning. In some countries or cities with extreme water shortages, such as Singapore and Windhoek, the capital of Namibia, water purified by advanced processes is used as urban drinking water. Sewage treatment therefore not only reduces the damage that human-produced sewage does to the environment and to human society, but also relieves the water shortages that arise as human society develops.
Because of these benefits, sewage treatment plants are widely built in cities and factories, and even some rural areas have built sewage treatment plants of a certain scale to treat the sewage from residents' daily life and agricultural production. However, limited by sensor technology and capability, key components in the sewage either cannot be measured directly or can only be measured with poor timeliness. Sewage treatment is also a system with large time delays, so rapid feedback and adjustment are not possible, and effluent quality suffers. To detect effluent quality quickly and effectively, the invention provides a soft measurement method that characterizes effluent quality indirectly from sewage quality indexes that are easy to measure. This enables rapid detection and adjustment of water quality and solves the problem of untimely water quality detection caused by sensor performance that falls short of practical requirements and by the characteristics of the sewage treatment reactions.
Disclosure of Invention
To rapidly estimate components of the water quality in sewage treatment that are difficult to measure, and so help operators adjust control strategies in time, the invention provides a water quality soft measurement method based on mixed dimensionality reduction and a kernel function extreme learning machine.
The purpose of the invention is achieved by the following technical scheme: a water quality soft measurement method based on mixed dimensionality reduction and a kernel function extreme learning machine, comprising the following steps:
(1) Obtain $N_0$ groups of sample data $\{(X_i, T_i)\}_{i=1}^{N_0}$ from the wastewater treatment process. Each input vector $X_i$ characterizes a plurality of wastewater quality components, and the corresponding expected output $T_i$ characterizes the concentration of ammonia nitrogen ions in the effluent.
(2) Compress the sample data by sampling, specifically: randomly select an integer initial value $a$ in $[1, 10]$ and, starting from it, take one data point every 10 points to obtain one batch of tenfold-compressed data $\{(X_j, T_j)\}_{j=1}^{N}$, where $N$ is the number of groups in the batch. Repeat the sampling, resetting the initial value $a$ each time, to obtain $p$ batches of sample data.
(3) Remove the dimensional units from each batch of sample data separately: normalize the data of different dimensions to $[-1, 1]$ by min-max normalization to obtain the normalized sample data $X_n$.
(4) Introduce two indexes, mutual information and the Pearson coefficient, and compute the correlation between each component of the detection sample $X_n$ obtained from sewage treatment and the soft measurement target, the ammonia nitrogen ion concentration $T$. Select the strongly correlated components according to the strength of the correlation and eliminate the weakly correlated or uncorrelated components, thereby reducing the dimensionality of the detection sample data. The specific steps are as follows:
Step 1: Select one component of $X_n$, denoted $A$, and calculate its mutual information value and Pearson coefficient value with the soft measurement target, the ammonia nitrogen ion concentration $T$.
Step 2: Calculate the mutual information value $MI(A, T)$:
$$MI(A,T)=\sum_{i=1}^{mm}\sum_{j=1}^{nn}P(A_i,T_j)\log\frac{P(A_i,T_j)}{P(A_i)\,P(T_j)}$$
where $P(A_i)$ and $P(T_j)$ are the probability distributions of the variables $A_i$ and $T_j$ respectively, $P(A_i,T_j)$ is the joint probability distribution of $A_i$ and $T_j$, and $mm$ and $nn$ are the numbers of data categories in $A$ and $T$ respectively.
For the variable $A$, calculate its mean value $\bar{A}$; set each $A_i$ less than $\bar{A}$ to 0 and each $A_i$ greater than or equal to $\bar{A}$ to 1. Process the variable $T$ in the same way. Then count the cases: let $Z_0$ be the number of cases with $A_i=0$, $T_j=0$; $Z_1$ the number with $A_i=0$, $T_j=1$; $Z_2$ the number with $A_i=1$, $T_j=0$; and $Z_3$ the number with $A_i=1$, $T_j=1$. The total number of cases is $L=Z_0+Z_1+Z_2+Z_3$. The probability distributions are then:
$$P(A=0)=\frac{Z_0+Z_1}{L},\qquad P(A=1)=\frac{Z_2+Z_3}{L}$$
$$P(T=0)=\frac{Z_0+Z_2}{L},\qquad P(T=1)=\frac{Z_1+Z_3}{L}$$
The joint probability distribution is calculated from the same counts, and the mutual information value of the two components follows from the definition of $MI(A,T)$.
Step 3: Calculate the Pearson coefficient $r$:
$$r=\frac{1}{N-1}\sum_{k=1}^{N}\left(\frac{A_k-\bar{A}}{\sigma_A}\right)\left(\frac{T_k-\bar{T}}{\sigma_T}\right)$$
where $\bar{A}$ is the mean of sample $A$, $\sigma_A$ is the (sample) standard deviation of sample $A$, $\bar{T}$ is the mean of sample $T$, $\sigma_T$ is the (sample) standard deviation of sample $T$, $A_k$ is the $k$-th data point of sample $A$, $T_k$ is the $k$-th data point of sample $T$, and $N$ is the number of data points.
Step 4: After the mutual information and Pearson coefficient values of one component with $T$ have been calculated, calculate those of the next component, until the values for all components have been calculated. Select the strongly correlated components and reconstruct the detection data $X'\in\mathbb{R}^{N\times q}$, where $q$ is the number of component types after dimensionality reduction.
(5) Construct the extreme learning machine based on a kernel function. Its input layer has $q$ nodes and its output layer has 1 node, and its expression is:
$$f(X')=T\left(\frac{I}{C}+\Omega_{ELM}\right)^{-1}\begin{bmatrix}K(X'_1,X')\\ \vdots\\ K(X'_N,X')\end{bmatrix}$$
where $f(X')$ is the neural network output, $T$ is the target output of the training data, $I$ is an identity matrix, $C$ is a constant, $K(X'_i,X'_j)$ is the kernel function, and $\Omega_{ELM}$ is the kernel matrix, whose specific form is:
$$\Omega_{ELM}=H^{\top}H,\qquad \Omega_{ELM\,i,j}=h(X'_i)\cdot h(X'_j)=K(X'_i,X'_j)$$
$$H=\begin{bmatrix}G(a_1,b_1,X'_1)&\cdots&G(a_1,b_1,X'_N)\\ \vdots&\ddots&\vdots\\ G(a_L,b_L,X'_1)&\cdots&G(a_L,b_L,X'_N)\end{bmatrix}$$
where $G(\cdot)$ is the activation function of the neural network; $a_l$, $b_l$ ($l=1,2,\ldots,L$) are the weights and biases from the input layer to the hidden layer; $L$ is the number of hidden layer nodes; $X'$ denotes the $N$ groups of neural network input data, namely the sewage treatment detection data after dimensionality reduction, each group having $q$ feature values, which matches the number of input layer nodes; and $H$ is the hidden layer output of the neural network.
(6) Train the extreme learning machine separately on each of the $p$ batches of sample data and average the results to obtain the final soft measurement result.
Further, in step (1), $N_0$ groups of sample data $\{(X_i,T_i)\}_{i=1}^{N_0}$ are obtained from the sewage treatment process, where each input vector has the specific form $X_i=[S_{I,i},S_{S,i},X_{I,i},X_{S,i},X_{BH,i},X_{BA,i},X_{P,i},S_{NO,i},S_{O,i},S_{ND,i},X_{ND,i}]^{\top}$, whose entries represent 11 components of the sewage: soluble inert organic matter, readily biodegradable substrate, insoluble inert organic matter, slowly biodegradable substrate, active heterotrophic biomass, active autotrophic biomass, insoluble products of biomass decay, nitrate and nitrite, dissolved oxygen, soluble degradable organic nitrogen, and insoluble degradable organic nitrogen.
Further, in step (4), the correlation between each component of the detection sample $X_n$ obtained from sewage treatment and the soft measurement target, the ammonia nitrogen ion concentration $T$, is calculated. Mutual information characterizes the nonlinear correlation between different components, and the Pearson coefficient characterizes the linear correlation between different components in sewage treatment. Strongly correlated components are selected separately by mutual information and by the Pearson coefficient according to the strength of the correlation, and the union of the two selections gives the final set of strongly correlated components. Weakly correlated or uncorrelated components are removed, which reduces the dimensionality of the detection sample data.
Further, in step (5), the kernel function $K(X'_i,X'_j)$ can take various forms:
Linear kernel function: $K(X'_i,X'_j)={X'_i}^{\top}X'_j$
Polynomial kernel function: $K(X'_i,X'_j)=\left(a\,{X'_i}^{\top}X'_j+c\right)^{p}$
Radial basis function (RBF) kernel: $K(X'_i,X'_j)=\exp\left(-\gamma\|X'_i-X'_j\|^{2}\right)$
Gaussian kernel function: $K(X'_i,X'_j)=\exp\left(-\dfrac{\|X'_i-X'_j\|^{2}}{2\sigma^{2}}\right)$
where $X'_i$ and $X'_j$ in each kernel function are the $i$-th and $j$-th groups of sample data respectively, and $a$, $c$, $p$, $\gamma$ and $\sigma$ are preset constants.
The kernel matrix $\Omega_{ELM}\in\mathbb{R}^{N\times N}$ depends only on the input data $X'_i$ and the number of training samples. The kernel function $K(X'_i,X'_j)$ converts the input data $(X'_i,X'_j)$ in the low-dimensional space into the inner product $h(X'_i)\cdot h(X'_j)$ in the high-dimensional feature space. The method only requires the kernel function to be selected in advance; the mapping function need not be defined explicitly and the number of hidden layer neurons need not be set. This saves the time spent optimizing the neuron count and mitigates the loss of generalization and stability caused by the random assignment of hidden layer neurons in the traditional extreme learning machine.
The invention has the following beneficial effects. The method is applied to sewage treatment. Given the many component types and the large data volume in the sewage treatment process, interval sampling into batches is applied first; this preserves the performance of the soft measurement model while effectively reducing the amount of computation in a single run, lowering the hardware requirements, and reducing the computational complexity severalfold. Mutual information and the Pearson coefficient then remove irrelevant data, which further reduces the amount and complexity of computation and strengthens the correlation between the sample components and the soft measurement target. Finally, estimation with the kernel function extreme learning machine effectively improves soft measurement performance for the ammonia nitrogen ion concentration.
Drawings
FIG. 1 is a schematic view of the structure of the water quality soft measurement of the present invention;
FIG. 2 is a flow chart of the water quality soft measurement method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and examples. It should be understood that the detailed description and specific examples are intended for purposes of illustration only and are not intended to limit the scope of the invention.
Because some components in sewage treatment are difficult to detect, or can only be detected through lengthy chemical experiments, real-time detection is not possible; to provide reasonable guidance for adjusting the sewage treatment, the parameters of these components need to be estimated quickly. The invention provides a water quality soft measurement method based on mixed dimensionality reduction and a kernel function extreme learning machine. It integrates mutual information, the Pearson coefficient, the kernel function method, the extreme learning machine, and interval sampling, and facilitates soft measurement of water quality components in sewage treatment. As shown in FIG. 1 and FIG. 2, the method comprises the following steps:
(1) Obtain $N_0$ groups of sample data $\{(X_i,T_i)\}_{i=1}^{N_0}$ from the sewage treatment process, where each input vector has the specific form $X_i=[S_{I,i},S_{S,i},X_{I,i},X_{S,i},X_{BH,i},X_{BA,i},X_{P,i},S_{NO,i},S_{O,i},S_{ND,i},X_{ND,i}]^{\top}$, whose entries represent 11 components of the sewage: soluble inert organic matter, readily biodegradable substrate, insoluble inert organic matter, slowly biodegradable substrate, active heterotrophic biomass, active autotrophic biomass, insoluble products of biomass decay, nitrate and nitrite, dissolved oxygen, soluble degradable organic nitrogen, and insoluble degradable organic nitrogen. The corresponding expected output is $T_i$ ($S_{NH,i}$), the concentration of ammonia nitrogen ions in the effluent, a component that effectively characterizes the effluent quality.
(2) Kernel-based neural network training lifts low-dimensional data into a high-dimensional feature space for computation, and calculating the joint probability distribution in mutual information also adds computational complexity. Because the sewage treatment process generates a large amount of data, the computational load and complexity of the algorithm would grow greatly, which defeats the purpose of practical application. To reduce the computational complexity without affecting the performance of the algorithm, the method samples multiple times. Given the original sample data $\{(X_i,T_i)\}_{i=1}^{N_0}$, compressed sample data are obtained by sampling: randomly select an integer initial value $a$ in $[1,10]$ and, starting from it, take one data point every 10 points to obtain one batch of tenfold-compressed data $\{(X_j,T_j)\}_{j=1}^{N}$, where $N$ is the number of groups in the batch. Repeat the sampling, resetting the initial value $a$ each time, to obtain $p$ batches of sample data; each of the $p$ batches then undergoes the following processing and training separately.
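The interval sampling of step (2) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `interval_sample` and `make_batches`, the use of NumPy, and the seeded random generator are all introduced here for the example.

```python
import numpy as np

def interval_sample(X, T, step=10, rng=None):
    """Return one compressed batch: every `step`-th row from a random start.

    The start value a is drawn from [1, step] (1-based, as in the text) and
    converted to a 0-based row index. All names here are illustrative.
    """
    rng = np.random.default_rng() if rng is None else rng
    a = int(rng.integers(1, step + 1))        # integer initial value in [1, step]
    idx = np.arange(a - 1, len(X), step)      # rows a, a+step, a+2*step, ... (1-based)
    return X[idx], T[idx], a

def make_batches(X, T, p, step=10, seed=0):
    """Repeat the sampling p times, redrawing the initial value each time."""
    rng = np.random.default_rng(seed)
    return [interval_sample(X, T, step, rng) for _ in range(p)]
```

Each batch keeps roughly one tenth of the original rows, and different initial values select different, interleaved subsets of the same record.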
(3) Because different components in sewage treatment have different dimensional units and their numerical scales differ greatly, the data of different dimensions in each batch of sample data are normalized to $[-1,1]$ by min-max normalization, which eliminates the influence of the units on the soft measurement. The specific form is:
$$X_n=2\,\frac{X-X_{\min}}{X_{\max}-X_{\min}}-1$$
where $X$ is the compressed sample data from sewage treatment, $X_{\min}$ is the minimum value of $X$, $X_{\max}$ is the maximum value of $X$, and $X_n$ is the normalized sample data, specifically $X_n=[S_{In},S_{Sn},X_{In},X_{Sn},X_{BHn},X_{BAn},X_{Pn},S_{NOn},S_{On},S_{NDn},X_{NDn}]^{\top}$.
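As a small sketch of the min-max normalization to $[-1,1]$: the code below assumes the minimum and maximum are taken per component (per column), which the text implies but does not state outright. The function name `minmax_scale` and the handling of constant columns are assumptions made for this example.

```python
import numpy as np

def minmax_scale(X):
    """Scale each column of X linearly so its minimum maps to -1 and maximum to 1."""
    X = np.asarray(X, dtype=float)
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    # Assumption: a constant column has zero range; use 1.0 to avoid dividing
    # by zero, which maps every entry of that column to -1.
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return 2.0 * (X - x_min) / span - 1.0
```

In use, the minimum and maximum found on the training batch would normally be stored and reused for new measurements, so that training and estimation share one scale.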
(4) Not every component of the sample data $X_n$ is strongly correlated with the soft measurement target, the ammonia nitrogen ion concentration $T$ ($S_{NH}$), and introducing unnecessary variables increases the computational load of sewage treatment. Therefore, after the data are normalized, two indexes, mutual information and the Pearson coefficient, are introduced, and the correlation between each component of the detection sample $X_n$ obtained from sewage treatment and the target $T$ ($S_{NH}$) is calculated. Mutual information characterizes the nonlinear correlation between different components, and the Pearson coefficient characterizes the linear correlation between different components in sewage treatment. Strongly correlated components are selected separately by mutual information and by the Pearson coefficient according to the strength of the correlation, and the union of the two selections gives the final set of strongly correlated components. Weakly correlated or uncorrelated components are removed, which reduces the dimensionality of the detection sample data and lowers the computational complexity and load. The specific operation steps are as follows:
Step 1: Select one of the 11 components of $X_n=[S_{In},S_{Sn},X_{In},X_{Sn},X_{BHn},X_{BAn},X_{Pn},S_{NOn},S_{On},S_{NDn},X_{NDn}]^{\top}$ and calculate its mutual information value and Pearson coefficient value with the soft measurement target, the ammonia nitrogen ion concentration $T$ ($S_{NH}$). For simplicity of presentation, the selected component is denoted $A$, and the notation for the target variable $T$ is unchanged.
Step 2: Calculate the mutual information value $MI(A,T)$:
$$MI(A,T)=\sum_{i=1}^{mm}\sum_{j=1}^{nn}P(A_i,T_j)\log\frac{P(A_i,T_j)}{P(A_i)\,P(T_j)}$$
where $P(A_i)$ and $P(T_j)$ are the probability distributions of the variables $A_i$ and $T_j$ respectively, $P(A_i,T_j)$ is the joint probability distribution of $A_i$ and $T_j$, and $mm$ and $nn$ are the numbers of data categories in $A$ and $T$ respectively (data with the same normalized concentration value are grouped into one category). The larger $MI(A,T)$ is, the more closely $A$ is related to $T$; the smaller it is, the weaker the relation. If $MI(A,T)$ is zero, the two variables are completely independent.
In actual operation, however, each component in sewage treatment has a very large number of data categories, which greatly increases the difficulty of calculation. To simplify the computation, the mean value $\bar{A}$ of the variable $A$ (the mean of all data in $A$) is calculated; each $A_i$ less than $\bar{A}$ is set to 0 and each $A_i$ greater than or equal to $\bar{A}$ is set to 1. The variable $T$ is processed in the same way. The cases are then counted: $Z_0$ is the number of cases with $A_i=0$, $T_j=0$; $Z_1$ the number with $A_i=0$, $T_j=1$; $Z_2$ the number with $A_i=1$, $T_j=0$; and $Z_3$ the number with $A_i=1$, $T_j=1$.
The total number of cases is: $L=Z_0+Z_1+Z_2+Z_3$
The probability distributions are then:
$$P(A=0)=\frac{Z_0+Z_1}{L},\qquad P(A=1)=\frac{Z_2+Z_3}{L}$$
$$P(T=0)=\frac{Z_0+Z_2}{L},\qquad P(T=1)=\frac{Z_1+Z_3}{L}$$
and the joint probability distribution is:
$$P(A=0,T=0)=\frac{Z_0}{L},\quad P(A=0,T=1)=\frac{Z_1}{L},\quad P(A=1,T=0)=\frac{Z_2}{L},\quad P(A=1,T=1)=\frac{Z_3}{L}$$
The mutual information value of the two components can then be calculated from the definition of $MI(A,T)$.
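The mean-threshold simplification of the mutual information can be sketched as follows. Several points are assumptions of this example rather than statements of the source: samples of $A$ and $T$ are paired by index, the natural logarithm is used (the text does not give a base), terms with zero joint probability contribute zero, and the function name `binary_mutual_information` is hypothetical.

```python
import numpy as np

def binary_mutual_information(A, T):
    """Mutual information of A and T after thresholding each at its own mean."""
    A = np.asarray(A, dtype=float)
    T = np.asarray(T, dtype=float)
    a = (A >= A.mean()).astype(int)   # 0 below the mean, 1 at or above it
    t = (T >= T.mean()).astype(int)
    # Z[i, j] counts paired samples with a == i and t == j,
    # i.e. Z0, Z1, Z2, Z3 of the text laid out as a 2x2 table.
    Z = np.zeros((2, 2))
    np.add.at(Z, (a, t), 1)
    joint = Z / Z.sum()               # joint distribution, Z / L
    p_a = joint.sum(axis=1)           # marginal distribution of A
    p_t = joint.sum(axis=0)           # marginal distribution of T
    mi = 0.0
    for i in range(2):
        for j in range(2):
            if joint[i, j] > 0:       # convention: 0 * log(0) = 0
                mi += joint[i, j] * np.log(joint[i, j] / (p_a[i] * p_t[j]))
    return float(mi)
```

With only two categories per variable the value is bounded by $\log 2$, which is reached when the two thresholded variables always agree (or always disagree) and both categories are equally frequent.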
Step 3: Calculate the Pearson coefficient $r$:
$$r=\frac{1}{N-1}\sum_{k=1}^{N}\left(\frac{A_k-\bar{A}}{\sigma_A}\right)\left(\frac{T_k-\bar{T}}{\sigma_T}\right)$$
where $\bar{A}$ is the mean of sample $A$, $\sigma_A$ is the (sample) standard deviation of sample $A$, $\bar{T}$ is the mean of sample $T$, $\sigma_T$ is the (sample) standard deviation of sample $T$, $A_k$ is the $k$-th data point of sample $A$, $T_k$ is the $k$-th data point of sample $T$, and $N$ is the number of data points.
The Pearson coefficient ranges from -1 to 1. A value of 1 means that $A$ and $T$ are described exactly by a linear equation: all data points fall on a straight line and $A$ increases as $T$ increases. A value of -1 means that all data points fall on a straight line and $A$ decreases as $T$ increases. A value of 0 means that there is no linear relationship between the two variables.
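A direct transcription of the Pearson coefficient is shown below as a sketch. It assumes the sample standard deviation (divisor $N-1$), which pairs with the $1/(N-1)$ factor; the function name `pearson_r` is introduced here for illustration.

```python
import numpy as np

def pearson_r(A, T):
    """Pearson correlation coefficient between two equally long samples."""
    A = np.asarray(A, dtype=float)
    T = np.asarray(T, dtype=float)
    n = len(A)
    sigma_a = A.std(ddof=1)   # sample standard deviation (assumed convention)
    sigma_t = T.std(ddof=1)
    cov_sum = np.sum((A - A.mean()) * (T - T.mean()))
    return float(cov_sum / ((n - 1) * sigma_a * sigma_t))
```

The result agrees with `np.corrcoef(A, T)[0, 1]`, which can serve as a cross-check.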
Step 4: After the mutual information and Pearson coefficient values of one component with the soft measurement target, the ammonia nitrogen ion concentration $T$ ($S_{NH}$), have been calculated, calculate those of the next component, until the values for all 11 components have been calculated. Select the strongly correlated components and reconstruct the detection data $X'\in\mathbb{R}^{N\times q}$ ($q$ is the number of component types after dimensionality reduction, $q<11$); the soft measurement target $T$ ($S_{NH}$) is unchanged.
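The union-based selection can be sketched as below. The text does not say how "strongly correlated" is decided, so the two thresholds `mi_threshold` and `r_threshold` are hypothetical parameters, as is the use of the absolute value of the Pearson coefficient so that strong negative correlations also count. The function name `select_components` is likewise an assumption.

```python
import numpy as np

def select_components(mi_scores, pearson_scores, mi_threshold, r_threshold):
    """Return indices of components that are strong by MI or by |Pearson r|."""
    mi_scores = np.asarray(mi_scores, dtype=float)
    pearson_scores = np.asarray(pearson_scores, dtype=float)
    strong_mi = mi_scores >= mi_threshold               # nonlinear criterion
    strong_r = np.abs(pearson_scores) >= r_threshold    # linear criterion
    return np.flatnonzero(strong_mi | strong_r)         # union of both selections
```

Given the returned index array `keep`, the reduced data would be `X_n[:, keep]`, with $q$ equal to `len(keep)`. Taking the union rather than the intersection keeps a component that is strong under either criterion, so a purely nonlinear dependence is not discarded merely because its linear correlation is weak.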
(5) Sewage treatment data have many component types and a large volume; after repeated data sampling and dimensionality-reducing preprocessing, the computational complexity and load are greatly reduced. The sewage treatment process is a very complex, strongly nonlinear system that is difficult to describe accurately with a mathematical model, so to make the soft measurement more accurate the method introduces an extreme learning machine based on a kernel function. The neural network of the extreme learning machine consists of an input layer, a hidden layer, and an output layer; according to the characteristics of the sample data, the input layer is set to have $q$ nodes and the output layer 1 node. The extreme learning machine proceeds as follows:
Step 1: Determine the number of component types $q$ and the data length $N$ of the input data from the size of the training sample data set, and form the hidden layer output:
$$H=\begin{bmatrix}G(a_1,b_1,X'_1)&\cdots&G(a_1,b_1,X'_N)\\ \vdots&\ddots&\vdots\\ G(a_L,b_L,X'_1)&\cdots&G(a_L,b_L,X'_N)\end{bmatrix}$$
where $G(\cdot)$ is the activation function of the neural network; $a_l$, $b_l$ ($l=1,2,\ldots,L$) are the weights and biases from the input layer to the hidden layer; $L$ is the number of hidden layer nodes; $X'$ denotes the $N$ groups of neural network input data, namely the sewage treatment detection data after dimensionality reduction, each group having $q$ feature values, which matches the number of input layer nodes; and $H$ is the hidden layer output of the neural network.
Step 2: Take the ammonia nitrogen ion concentration $S_{NH}$ in the sewage effluent as the target history data $T$:
$$T=[T_1,T_2,\ldots,T_N]$$
where $T_j$ ($j=1,2,\ldots,N$) is the output of the $j$-th group of target history data.
Step 3: Construct the network from the hidden layer to the output layer. The output layer uses the Purelin (linear) function, so that
$$T_j=\sum_{l=1}^{L}w_l\,G(a_l,b_l,X'_j),\qquad j=1,2,\ldots,N$$
Written in matrix form:
$$T=\beta H$$
where $w_l$ is the weight from the $l$-th hidden node to the output layer, collected in the vector $\beta\in\mathbb{R}^{L}$; $G(a_l,b_l,X')$ is the hidden layer output and the output layer input, with matrix form $H\in\mathbb{R}^{L\times N}$.
Step 4: Given Step 1 and Step 2, solve Step 3 by the generalized inverse method to obtain the weight vector from the hidden layer to the output layer:
$$\beta=TH^{\top}\left(\frac{I_L}{C}+HH^{\top}\right)^{-1}$$
where $I_L$ is the identity matrix of dimension $L$ and $C$ is a constant.
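Steps 1 to 4 describe an extreme learning machine with an explicit hidden layer, which can be sketched as follows before the kernel is introduced. The sketch assumes a sigmoid activation for $G$, hidden weights and biases drawn uniformly from $[-1,1]$, and the row-vector convention $T=\beta H$ with $H$ of shape $(L, N)$; the function names `elm_fit` and `elm_predict` and the default values are illustrative only.

```python
import numpy as np

def _hidden(X, a, b):
    """Hidden layer output H of shape (L, N); sigmoid G is an assumption."""
    return 1.0 / (1.0 + np.exp(-(a @ X.T + b)))

def elm_fit(X, T, n_hidden=20, C=100.0, seed=0):
    """Fit output weights beta with random, fixed hidden-layer parameters."""
    rng = np.random.default_rng(seed)
    q = X.shape[1]
    a = rng.uniform(-1.0, 1.0, size=(n_hidden, q))   # input-to-hidden weights a_l
    b = rng.uniform(-1.0, 1.0, size=(n_hidden, 1))   # hidden biases b_l
    H = _hidden(X, a, b)
    # beta = T H^T (I_L / C + H H^T)^{-1}; because the bracketed matrix is
    # symmetric, solving M x = H T gives the same vector without an inverse.
    M = np.eye(n_hidden) / C + H @ H.T
    beta = np.linalg.solve(M, H @ T)
    return a, b, beta

def elm_predict(X, a, b, beta):
    """Network output beta * H for the rows of X."""
    return beta @ _hidden(X, a, b)
```

Only `beta` is learned; `a` and `b` stay at their random values, which is why the result depends on the seed and on the chosen number of hidden nodes.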
When the specific form of the hidden layer feature mapping $h(X')$ in Step 1 is unknown, a kernel function is introduced to measure the similarity between samples, and the kernel matrix of the extreme learning machine can be defined according to the Mercer condition in the following form:
$$\Omega_{ELM}=H^{\top}H,\qquad \Omega_{ELM\,i,j}=h(X'_i)\cdot h(X'_j)=K(X'_i,X'_j)$$
where $\Omega_{ELM}$ is the kernel matrix and the kernel function $K(X'_i,X'_j)$ can take various forms; common types are:
Linear kernel function: $K(X'_i,X'_j)={X'_i}^{\top}X'_j$
Polynomial kernel function: $K(X'_i,X'_j)=\left(a\,{X'_i}^{\top}X'_j+c\right)^{p}$
Radial basis function (RBF) kernel: $K(X'_i,X'_j)=\exp\left(-\gamma\|X'_i-X'_j\|^{2}\right)$
Gaussian kernel function: $K(X'_i,X'_j)=\exp\left(-\dfrac{\|X'_i-X'_j\|^{2}}{2\sigma^{2}}\right)$
where $X'_i$ and $X'_j$ in each kernel function are the $i$-th and $j$-th groups of sample data respectively, and $a$, $c$, $p$, $\gamma$ and $\sigma$ are preset constants. Step 4 then becomes:
$$f(X')=T\left(\frac{I}{C}+\Omega_{ELM}\right)^{-1}\begin{bmatrix}K(X'_1,X')\\ \vdots\\ K(X'_N,X')\end{bmatrix}$$
where $f(X')$ is the output of the neural network and $I$ is the $N\times N$ identity matrix.
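The passage from the hidden-layer solution to the kernel form can be checked with a short derivation. The sketch below assumes the row-vector convention $T=\beta H$ with $H\in\mathbb{R}^{L\times N}$ whose $j$-th column is $h(X'_j)$; the subscripts on the identity matrices are added here only to make the dimensions explicit.

```latex
\begin{aligned}
% Push-through identity: (I_N/C + H^T H) H^T = H^T (I_L/C + H H^T)
\beta &= T H^{\top}\Big(\tfrac{I_L}{C} + H H^{\top}\Big)^{-1}
       = T\Big(\tfrac{I_N}{C} + H^{\top} H\Big)^{-1} H^{\top}, \\
% Substitute into the network output f(X') = beta h(X')
f(X') &= \beta\, h(X')
       = T\Big(\tfrac{I_N}{C} + \Omega_{ELM}\Big)^{-1} H^{\top} h(X'),
       \qquad \Omega_{ELM} = H^{\top} H, \\
% Every remaining inner product is a kernel evaluation
H^{\top} h(X') &= \big[\,K(X'_1, X'),\ \dots,\ K(X'_N, X')\,\big]^{\top}.
\end{aligned}
```

After the substitution, $h(\cdot)$ appears only inside inner products, so the kernel function can replace it everywhere and neither the mapping nor $L$ needs to be specified.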
Thus the kernel matrix $\Omega_{ELM}\in\mathbb{R}^{N\times N}$ depends only on the input data $X'_i$ and the number of training samples. The kernel function $K(X'_i,X'_j)$ converts the input data $(X'_i,X'_j)$ in the low-dimensional space into the inner product $h(X'_i)\cdot h(X'_j)$ in the high-dimensional feature space. The method only requires the kernel function to be selected in advance; the mapping function need not be defined explicitly and the number of hidden layer neurons need not be set. This saves the time spent optimizing the neuron count and mitigates the loss of generalization and stability caused by the random assignment of hidden layer neurons in the traditional extreme learning machine.
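A kernel extreme learning machine of this kind reduces to one linear solve for training and one kernel evaluation for prediction. The sketch below assumes the RBF kernel $\exp(-\gamma\|x-y\|^2)$; the function names `rbf_kernel`, `kelm_fit` and `kelm_predict` and the default values of `C` and `gamma` are assumptions made for this example.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Kernel matrix with entries exp(-gamma * ||A_i - B_j||^2)."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))   # clip tiny negative round-off

def kelm_fit(X, T, C=100.0, gamma=1.0):
    """Solve (I / C + Omega) alpha = T, where Omega is the N x N kernel matrix."""
    omega = rbf_kernel(X, X, gamma)
    return np.linalg.solve(np.eye(len(X)) / C + omega, T)

def kelm_predict(X_new, X_train, alpha, gamma=1.0):
    """Output f(x) = [K(x, X_1), ..., K(x, X_N)] @ alpha for each row of X_new."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha
```

No hidden weights are drawn, so the fit is deterministic for given `C` and `gamma`. The cost of the solve grows with the cube of the number of training samples, which is one reason compressing each batch by sampling helps.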
(6) Train the extreme learning machine separately on each of the $p$ batches of sample data and average the results to obtain the final soft measurement result, namely the concentration of ammonia nitrogen ions in the effluent.
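The batch-and-average scheme of step (6) can be sketched end to end as follows. To stay short, the example omits the normalization and component selection steps and assumes an RBF kernel; the function names `rbf` and `averaged_soft_measurement` and all default values are assumptions, not part of the source.

```python
import numpy as np

def rbf(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def averaged_soft_measurement(X, T, X_query, p=5, step=10, C=100.0, gamma=1.0, seed=0):
    """Train one kernel ELM per sampled batch and average the p predictions."""
    rng = np.random.default_rng(seed)
    predictions = []
    for _ in range(p):
        a = int(rng.integers(1, step + 1))       # new initial value for each batch
        idx = np.arange(a - 1, len(X), step)     # every step-th sample from a
        Xb, Tb = X[idx], T[idx]
        alpha = np.linalg.solve(np.eye(len(Xb)) / C + rbf(Xb, Xb, gamma), Tb)
        predictions.append(rbf(X_query, Xb, gamma) @ alpha)
    return np.mean(predictions, axis=0)          # final soft measurement
```

Each model sees only about one tenth of the data, so every linear solve is far smaller than one on the full record, while averaging over the $p$ batches draws on more of the data than any single batch does.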
The foregoing is only a preferred embodiment of the present invention; although the invention has been disclosed through preferred embodiments, they are not intended to limit it. Those skilled in the art can, using the methods and techniques disclosed above, make many possible variations and modifications to the technical solution, or modify it into equivalent embodiments, without departing from its scope. Therefore, any simple modification, equivalent change, or adaptation made to the above embodiments according to the technical essence of the present invention, without departing from the content of its technical solution, still falls within the scope of protection of the technical solution of the present invention.

Claims (6)

1. A water quality soft measurement method based on mixed dimensionality reduction and a kernel function extreme learning machine, characterized by comprising the following steps:
(1) Obtain N_0 groups of sample data from a wastewater treatment process:
Figure FDA0002771055960000011
Each input vector X_i characterizes a plurality of wastewater quality components, and the corresponding expected output T_i characterizes the ammonia nitrogen ion concentration of the effluent.
(2) Compress the sample data by sampling, specifically: randomly select an integer initial value a in [1, 10] and, starting from a, take one data point every 10 points to obtain one batch of data compressed tenfold:
Figure FDA0002771055960000012
Repeat the sampling, resetting the initial value a each time, to obtain p batches of sample data.
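Step (2) can be sketched as below. This is a minimal illustration assuming NumPy arrays with one sample per row; the function name `sample_batches`, the `seed` argument and the synthetic data are assumptions, not part of the patent.

```python
import numpy as np

def sample_batches(X, T, p, step=10, seed=0):
    """Return p batches, each keeping every `step`-th row from a random start a in [1, step]."""
    rng = np.random.default_rng(seed)
    batches = []
    for _ in range(p):
        a = int(rng.integers(1, step + 1))        # 1-based start point, reset for every batch
        idx = np.arange(a - 1, X.shape[0], step)  # convert to 0-based row indices
        batches.append((X[idx], T[idx]))
    return batches

# Synthetic data: 100 samples with 2 components each
X = np.arange(200, dtype=float).reshape(100, 2)
T = np.arange(100, dtype=float)
batches = sample_batches(X, T, p=5)
```

With 100 samples and a step of 10, every batch holds 10 rows regardless of the start value, and inputs stay aligned with their targets because the same indices are applied to both arrays.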
(3) Make each batch of sample data dimensionless: normalize the data of different dimensions to [-1, 1] by min-max normalization to obtain the normalized sample data X_n.
(4) Introduce two indices, mutual information and the Pearson coefficient. For each component of the detection sample X_n obtained from sewage treatment, calculate its correlation with the soft measurement target, the ammonia nitrogen ion concentration T. According to the strength of the correlations, select the strongly correlated components and eliminate the weakly correlated or uncorrelated ones, thereby reducing the dimension of the detection sample data. The specific steps are as follows:
Step 1: Select one component of X_n, denoted A, and calculate its mutual information value and Pearson coefficient value with the soft measurement target, the ammonia nitrogen ion concentration T.
Step 2: Calculate the mutual information value MI(A, T):
Figure FDA0002771055960000013
where P(A_i) and P(T_j) denote the probability distributions of the variables A_i and T_j respectively, P(A_i, T_j) denotes the joint probability distribution of A_i and T_j, and mm and nn characterize the data categories in A and T respectively.
For the variable A, calculate its mean Ā. Every A_i less than the mean Ā is set to 0, and every A_i greater than or equal to the mean Ā is set to 1. The same processing is applied to the variable T. Then: the number of cases with A_i = 0, T_j = 0 is Z_0; the number of cases with A_i = 0, T_j = 1 is Z_1; the number of cases with A_i = 1, T_j = 0 is Z_2; the number of cases with A_i = 1, T_j = 1 is Z_3. Let the total number of cases be L = Z_0 + Z_1 + Z_2 + Z_3. The following probability distributions then hold:
Figure FDA0002771055960000017
Figure FDA0002771055960000021
The joint probability distribution is calculated in the same way, and the mutual information value of the two components is then calculated from the definition of MI(A, T).
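Step 2 can be sketched as follows. The MI formula itself appears only as an image in this text, so the sketch assumes the standard definition MI = Σ P(a, t) · log(P(a, t) / (P(a) P(t))) with the natural logarithm, and assumes that A and T are paired sample by sample. The function name `binarized_mutual_information` is illustrative.

```python
import numpy as np

def binarized_mutual_information(A, T):
    """Mutual information of A and T after thresholding each at its own mean."""
    A = np.asarray(A, dtype=float)
    T = np.asarray(T, dtype=float)
    a = (A >= A.mean()).astype(int)   # 0 below the mean, 1 otherwise
    t = (T >= T.mean()).astype(int)
    # counts[i, j] = number of samples with a == i and t == j, i.e. [[Z0, Z1], [Z2, Z3]]
    counts = np.zeros((2, 2))
    np.add.at(counts, (a, t), 1)
    joint = counts / counts.sum()     # divide by L = Z0 + Z1 + Z2 + Z3
    p_a = joint.sum(axis=1)           # marginal distribution of binarized A
    p_t = joint.sum(axis=0)           # marginal distribution of binarized T
    mi = 0.0
    for i in range(2):
        for j in range(2):
            if joint[i, j] > 0:       # empty cells contribute nothing
                mi += joint[i, j] * np.log(joint[i, j] / (p_a[i] * p_t[j]))
    return mi

mi_dependent = binarized_mutual_information([1, 2, 3, 4], [10, 20, 30, 40])
mi_independent = binarized_mutual_information([1, 2, 3, 4], [1, 4, 1, 4])
```

In the first call the binarized variables coincide, so the result is log 2; in the second they are independent, so the result is 0.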
Step 3: Calculate the Pearson coefficient r:
Figure FDA0002771055960000022
where Ā is the mean of sample A, σ_A is the standard deviation of sample A, T̄ is the mean of sample T, σ_T is the standard deviation of sample T, A_k is the k-th data point of sample A, and T_k is the k-th data point of sample T.
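Step 3 can be sketched as below. The formula for r is given only as an image in this text; the sketch assumes the standard Pearson definition built from the quantities just listed (means, standard deviations, and the paired data points A_k and T_k). The function name `pearson_r` is illustrative.

```python
import numpy as np

def pearson_r(A, T):
    """Pearson coefficient: mean of centred products divided by sigma_A * sigma_T."""
    A = np.asarray(A, dtype=float)
    T = np.asarray(T, dtype=float)
    dA = A - A.mean()
    dT = T - T.mean()
    # Population standard deviations are used; the 1/N factors are consistent
    # between the covariance and the sigmas, so r does not depend on that choice
    return float(np.mean(dA * dT) / (A.std() * T.std()))

r_up = pearson_r([1, 2, 3], [2, 4, 6])     # perfectly increasing linear relation
r_down = pearson_r([1, 2, 3], [3, 2, 1])   # perfectly decreasing linear relation
```

Because r captures only linear dependence, a component can have r near 0 and still carry information about T, which is why the method pairs it with mutual information.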
Step 4: After the mutual information and Pearson coefficient values of one component with T have been calculated, proceed to the next component until the values for all components have been calculated. Select the strongly correlated components and reconstruct the detection data X' ∈ R^(N×q), where q is the number of components after dimensionality reduction.
(5) Construct the kernel-function-based extreme learning machine, whose input layer has q nodes and whose output layer has 1 node, expressed as follows:
Figure FDA0002771055960000025
where f(X') is the neural network output, T is the target output of the training data, I_L is the identity matrix, C is a constant, K(X'_i, X'_j) is the kernel function, and Ω_ELM is the kernel matrix, which has the following specific form:
Figure FDA0002771055960000026
Figure FDA0002771055960000027
where G(·) is the activation function of the neural network; a_l and b_l (l = 1, 2, ..., L) are the weights and biases from the input layer to the hidden layer respectively; L is the number of hidden-layer nodes of the neural network; X' denotes the N groups of neural network input data, namely the data obtained after dimensionality reduction of the sewage treatment detection data, each group having q feature values, which corresponds to the number of input-layer nodes of the neural network; and H is the hidden-layer output of the neural network.
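Step (5) can be sketched as follows. The expressions for f(X') and Ω_ELM are given only as images in this text, so the sketch assumes the standard kernel extreme learning machine closed form, f(x) = [K(x, X'_1), ..., K(x, X'_N)] (I/C + Ω_ELM)^(-1) T with Ω_ELM[i, j] = K(X'_i, X'_j), and uses the radial basis kernel. The class name `KernelELM`, the parameter values and the synthetic data are assumptions.

```python
import numpy as np

def rbf_kernel(X1, X2, gamma=1.0):
    """K(x, y) = exp(-gamma * ||x - y||^2) for all row pairs of X1 and X2."""
    sq = (np.sum(X1 ** 2, axis=1)[:, None]
          + np.sum(X2 ** 2, axis=1)[None, :]
          - 2.0 * X1 @ X2.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clip tiny negatives from round-off

class KernelELM:
    """Kernel extreme learning machine in its standard closed form (assumed)."""

    def __init__(self, C=100.0, gamma=1.0):
        self.C = C
        self.gamma = gamma

    def fit(self, X, T):
        self.X_train = np.asarray(X, dtype=float)
        T = np.asarray(T, dtype=float)
        omega = rbf_kernel(self.X_train, self.X_train, self.gamma)  # N x N kernel matrix
        n = omega.shape[0]
        # Solve (I/C + Omega) beta = T rather than forming the inverse explicitly
        self.beta = np.linalg.solve(np.eye(n) / self.C + omega, T)
        return self

    def predict(self, X):
        k = rbf_kernel(np.asarray(X, dtype=float), self.X_train, self.gamma)
        return k @ self.beta

# Synthetic stand-in for the reduced data X' (N = 80, q = 3)
rng = np.random.default_rng(0)
X_demo = rng.uniform(-1, 1, size=(80, 3))
T_demo = np.sin(X_demo[:, 0]) + 0.5 * X_demo[:, 1]
model = KernelELM(C=1e4, gamma=1.0).fit(X_demo, T_demo)
train_rmse = float(np.sqrt(np.mean((model.predict(X_demo) - T_demo) ** 2)))
```

No hidden-layer size, weights a_l or biases b_l appear anywhere in the sketch: only the kernel and the constant C have to be chosen.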
(6) Train the extreme learning machine separately on each of the p batches of sample data, and average the results to obtain the final soft measurement result.
2. The water quality soft measurement method based on mixed dimensionality reduction and a kernel function extreme learning machine according to claim 1, wherein in step (1), N_0 groups of sample data are obtained from a sewage treatment process:
Figure FDA0002771055960000031
where each input vector has the specific form X_i = [S_{I,i}, S_{S,i}, X_{I,i}, X_{S,i}, X_{BH,i}, X_{BA,i}, X_{P,i}, S_{NO,i}, S_{O,i}, S_{ND,i}, X_{ND,i}]^T, whose 11 components respectively represent the following in the sewage: soluble inert organic matter, readily biodegradable substrate, insoluble inert organic matter, slowly biodegradable substrate, active heterotrophic biomass, active autotrophic biomass, insoluble products of biomass decay, nitrate and nitrite, ammonium ions, soluble degradable organic nitrogen, and insoluble degradable organic nitrogen.
3. The water quality soft measurement method based on mixed dimensionality reduction and a kernel function extreme learning machine according to claim 1, wherein in step (3), the data normalization formula is as follows:
Figure FDA0002771055960000032
where X is the compressed sewage treatment sample data, X_min is the minimum of X, X_max is the maximum of X, and X_n is the normalized sample data.
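The normalization can be sketched as below. The formula is given only as an image in this text; the sketch assumes the usual min-max mapping onto [-1, 1], X_n = 2 (X - X_min) / (X_max - X_min) - 1, applied to each component (column) separately. The function name `minmax_normalize` and the handling of constant columns are assumptions.

```python
import numpy as np

def minmax_normalize(X):
    """Scale each column of X linearly so that its minimum maps to -1 and its maximum to 1."""
    X = np.asarray(X, dtype=float)
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    # Guard against constant columns (which would divide by zero); they map to -1
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return 2.0 * (X - x_min) / span - 1.0

X_raw = np.array([[0.0, 10.0],
                  [5.0, 20.0],
                  [10.0, 30.0]])
X_n = minmax_normalize(X_raw)
```

After scaling, components recorded in different units share the same range, so none of them dominates the kernel distance merely because of its magnitude.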
4. The water quality soft measurement method based on mixed dimensionality reduction and a kernel function extreme learning machine according to claim 1, wherein in step (4), the correlation between each component of the detection sample X_n obtained from sewage treatment and the soft measurement target, the ammonia nitrogen ion concentration T, is calculated; mutual information characterizes the nonlinear correlation between different components, and the Pearson coefficient characterizes the linear correlation between different components in sewage treatment; according to the strength of the correlations, strongly correlated components are selected separately by mutual information and by the Pearson coefficient, and the union of the two selections gives the strongly correlated components; weakly correlated or uncorrelated components are removed, thereby reducing the dimension of the detection sample data.
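The union-based selection can be sketched as follows. The patent does not state how "strongly correlated" is decided, so the two thresholds, the example scores and the function name `select_components` are hypothetical.

```python
import numpy as np

def select_components(mi_scores, pearson_scores, mi_threshold, r_threshold):
    """Return the sorted union of components judged strong by either index."""
    mi_scores = np.asarray(mi_scores, dtype=float)
    pearson_scores = np.asarray(pearson_scores, dtype=float)
    strong_mi = set(np.flatnonzero(mi_scores >= mi_threshold).tolist())
    # Use |r| so that strong negative linear correlation also counts
    strong_r = set(np.flatnonzero(np.abs(pearson_scores) >= r_threshold).tolist())
    return sorted(strong_mi | strong_r)

# Hypothetical scores for four components
mi_scores = [0.30, 0.01, 0.02, 0.25]
pearson_scores = [0.90, -0.80, 0.10, 0.05]
selected = select_components(mi_scores, pearson_scores, mi_threshold=0.2, r_threshold=0.7)

# Reduce the data to the selected columns: X' has q = len(selected) components
X_n = np.arange(20, dtype=float).reshape(5, 4)
X_reduced = X_n[:, selected]
```

Component 1 is kept through its Pearson coefficient alone and component 3 through its mutual information alone; taking the union retains both kinds of dependence.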
5. The water quality soft measurement method based on mixed dimensionality reduction and a kernel function extreme learning machine according to claim 1, wherein in step (5), the kernel function K(X'_i, X'_j) can take any of several forms:
Linear kernel function: K(X'_i, X'_j) = X'_i^T X'_j + c
Polynomial kernel function: K(X'_i, X'_j) = (a X'_i^T X'_j + c)^p
Radial basis kernel function: K(X'_i, X'_j) = exp(-γ ||X'_i - X'_j||^2)
Gaussian kernel function:
Figure FDA0002771055960000033
where, in each kernel function, X'_i and X'_j denote the i-th and j-th groups of sample data respectively, and a, c, p, γ and σ are preset constants.
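The four kernel forms can be sketched as below. The linear, polynomial and radial basis kernels follow the formulas listed in the claim; the Gaussian kernel is given only as an image in this text, so the sketch assumes the standard form exp(-||x - y||^2 / (2σ^2)). The default parameter values are arbitrary.

```python
import numpy as np

def linear_kernel(x, y, c=0.0):
    return float(x @ y + c)

def polynomial_kernel(x, y, a=1.0, c=1.0, p=2):
    return float((a * (x @ y) + c) ** p)

def rbf_kernel(x, y, gamma=1.0):
    return float(np.exp(-gamma * np.sum((x - y) ** 2)))

def gaussian_kernel(x, y, sigma=1.0):
    # Assumed standard form; not reproduced from the patent's formula image
    return float(np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2)))

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])
k_lin = linear_kernel(x, y, c=1.0)
k_poly = polynomial_kernel(x, y, a=1.0, c=1.0, p=2)
k_rbf = rbf_kernel(x, y, gamma=0.5)
k_gauss = gaussian_kernel(x, y, sigma=1.0)
```

Under the assumed Gaussian form, the radial basis and Gaussian kernels coincide when γ = 1 / (2σ^2), as `k_rbf` and `k_gauss` show here.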
6. The water quality soft measurement method based on mixed dimensionality reduction and a kernel function extreme learning machine according to claim 1, wherein in step (5), the kernel matrix Ω_ELM ∈ R^(N×N) depends only on the input data X'_i and the number of training samples; the kernel function K(X'_i, X'_j) converts the input data (X'_i, X'_j) in the low-dimensional space into the inner product h(X'_i)·h(X'_j) in the high-dimensional feature space, independently of the dimension of the feature space, so that the curse of dimensionality is effectively avoided.
CN202011249249.8A 2020-11-10 2020-11-10 Water quality soft measurement method based on mixed dimensionality reduction and kernel function extreme learning machine Pending CN112183676A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011249249.8A CN112183676A (en) 2020-11-10 2020-11-10 Water quality soft measurement method based on mixed dimensionality reduction and kernel function extreme learning machine

Publications (1)

Publication Number Publication Date
CN112183676A (en) 2021-01-05

Family

ID=73918141

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011249249.8A Pending CN112183676A (en) 2020-11-10 2020-11-10 Water quality soft measurement method based on mixed dimensionality reduction and kernel function extreme learning machine

Country Status (1)

Country Link
CN (1) CN112183676A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210105
