CN115829157A - Chemical water quality index prediction method based on variational mode decomposition and the Autoformer model
- Publication number
- CN115829157A; CN202211697456.9A
- Authority
- CN
- China
- Prior art keywords
- water quality
- decomposition
- modal
- information
- auto
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A20/00—Water conservation; Efficient water supply; Efficient water use
- Y02A20/152—Water filtration
Landscapes
- Activated Sludge Processes (AREA)
Abstract
A chemical water quality index prediction method based on variational mode decomposition (VMD) and the Autoformer model. The method preprocesses a water quality parameter data set and the historical information of related factors, finding smoothing critical points and handling missing values; decomposes the water quality feature parameters using variational mode decomposition and then models the water quality parameter data set; screens features of the related-factor historical information data set through a feature engineering model, with calculations based on weather, season, and flow, for use in understanding and analyzing chemical water quality; progressively extracts trend terms and periodic terms from the hidden variables through a series decomposition unit; and performs time-delay information aggregation and period-dependency discovery on similar subsequences of different periods. By extracting time series features, decomposing the feature information with variational mode decomposition, and calculating feature importance, the method combines historical feature information with mode information in chemical water quality time series prediction, so that the prediction results are more accurate and reasonable.
Description
Technical Field
The invention relates to the technical field of chemical water quality index prediction, and in particular to a chemical water quality index prediction method based on variational mode decomposition and the Autoformer model.
Background
The water quality index serves as a concrete measure of the degree of water pollution, and chemical plant discharge index data are collected automatically and in real time by the sewage treatment station. At present, the commonly used mathematical models of water quality fall into two main categories: mechanistic water quality models and data-driven water quality models. A mechanistic model built on a theoretical basis requires a large amount of suitable historical hydrological and water quality data to calibrate its parameters; moreover, when many factors influence a given water quality index, the mechanism becomes very complex, so the model is difficult to establish and the relevant parameters are difficult to obtain. In recent years, with the rapid development of big data research, data-driven models that use grey-box or black-box equations fitted to sample data have been widely applied in many disciplines, including river water quality prediction and early warning.
Traditional water quality index prediction methods generally rely on a fixed model. The prediction accuracy of statistical methods is limited, and they lack analysis of the nonlinear characteristics of the water environment. In addition, because the water environment is complex, water quality index time series are noisy, so traditional models struggle to predict water environment indices effectively under complex water environment conditions.
Disclosure of Invention
To address the above technical problems, this technical scheme provides a chemical water quality index prediction method based on variational mode decomposition and the Autoformer model. The method comprehensively analyzes the time series data set and the related historical information data set, screens features, and applies multivariate time series prediction based on variational mode decomposition to predict the chemical water quality index accurately, thereby effectively solving the above problems.
The invention is realized by the following technical scheme:
a chemical water quality index prediction method based on variational mode decomposition and the Autoformer model, which preprocesses a water quality parameter data set and the historical information of related factors, finding smoothing critical points and handling missing values; decomposes the water quality feature parameters using variational mode decomposition and then models the water quality parameter data set; screens features of the related-factor historical information data set through a feature engineering model, with calculations based on weather, season, and flow, for use in understanding and analyzing chemical water quality; progressively extracts trend terms and periodic terms from the hidden variables through a series decomposition unit; and performs time-delay information aggregation and period-dependency discovery on similar subsequences of different periods; the specific steps are as follows:
step 1: perform preliminary data preprocessing on the water quality parameter data set and the historical information of related influence factors: fill missing values, remove highly correlated features, encode features, understand the data distribution and extract effective features through visual analysis, observe the relationship between the water quality parameters and time, and smooth abnormal values; the specific operation is as follows:
step 1.1: define WATER as the water quality parameter data, where id, WATER-time, WATER-PH, WATER-ZN, WATER-P, WATER-NH, WATER-cod, WATER-flow and WATER-temperature denote, respectively, the chemical water quality record number, time information, pH value, total nitrogen content, total phosphorus content, ammonia nitrogen content, chemical oxygen demand, flow, and water temperature; the flow is the instantaneous flow at the time of measurement, and the daytime fluctuation of air temperature is positively correlated with water temperature;
step 1.2: define RELATE as the related-factor information data set, where time, relationship-rain and relationship-temperature denote, respectively, the time (date), the precipitation, and the air temperature of the day, satisfying RELATE = { time, relationship-rain, relationship-temperature };
step 1.3: in the water quality parameter data set WATER, sort by id and time, fill missing values, and aggregate the water quality features by day; where the pH value is 0, set it to 7, and fill the remaining missing values with the mean value of that day;
step 1.4: construct historical information of static features and, using chemical oxygen demand as the reference dimension, judge whether the historical information is correlated with the prediction target; if so, retain the feature, otherwise discard it to reduce interference with the model and reduce dimensionality; encode features that cannot be used directly in calculation;
step 1.5: perform visual analysis on the water quality parameter data, observe the relationship between the parameters and time, smooth abnormal values, and judge the correlation of the historical influence factors;
step 1.6: obtain the final preprocessed water quality parameter information WATER-PRE;
step 2: screen features of the related historical information data set through a feature engineering model, discard historical information features weakly correlated with the prediction target by means of data analysis, decompose the water quality features by variational mode decomposition, perform frequency-domain enhancement, and obtain feature importance for understanding and analyzing the water quality parameters;
step 3: process the different features by series decomposition, decompose out the initial trend term and periodic term, and aggregate time-delay information;
step 4: input the decomposed and aggregated subsequences into the Autoformer model for training;
step 5: accumulate and reconstruct the obtained prediction results to obtain the final prediction result.
Further, the specific operation mode of step 2 is as follows:
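step 2.1: using variational mode decomposition, search for a set of mode components and the center frequency of each mode, thereby partitioning the frequency domain; each mode is smooth after demodulation to baseband, and the corresponding constrained variational problem is:
$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)$$
where $K$ is the number of modes to be decomposed; $\{u_k\}$ and $\{\omega_k\}$ are, respectively, the $k$-th mode component and its center frequency after decomposition; $f(t)$ is the signal being decomposed; $\delta(t)$ is the Dirac function; and $*$ is the convolution operator;
to solve for the $K$ modes, the alternating direction method of multipliers (ADMM) is used as the iterative algorithm, combined with the Parseval/Plancherel Fourier isometry, to optimize each mode component and center frequency by searching for the saddle point of the augmented Lagrangian; the alternately updated expressions for $u_k$, $\omega_k$ and $\lambda$ are:
$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha(\omega - \omega_k)^2}$$
$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \, |\hat{u}_k^{n+1}(\omega)|^2 \, d\omega}{\int_0^{\infty} |\hat{u}_k^{n+1}(\omega)|^2 \, d\omega}$$
$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \gamma \left( \hat{f}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{n+1}(\omega) \right)$$
where $\alpha$ is the quadratic penalty factor; $\gamma$ is the noise tolerance, chosen to meet the fidelity requirement of the signal decomposition; and $\hat{u}_k^{n+1}(\omega)$, $\hat{u}_i(\omega)$, $\hat{f}(\omega)$, $\hat{\lambda}(\omega)$ are the Fourier transforms of $u_k^{n+1}(t)$, $u_i(t)$, $f(t)$, $\lambda(t)$, respectively;
step 2.2: the main iterative solution steps of variational mode decomposition are as follows: first, initialize $\{\hat{u}_k^1\}$, $\{\omega_k^1\}$, $\hat{\lambda}^1$ and the maximum number of iterations $N$; then update $\hat{u}_k$, $\omega_k$ and $\hat{\lambda}$ iteratively; given a convergence tolerance $\epsilon > 0$, if $\sum_{k} \|\hat{u}_k^{n+1} - \hat{u}_k^{n}\|_2^2 / \|\hat{u}_k^{n}\|_2^2 < \epsilon$ is not satisfied, continue iterating on $\hat{u}_k$ and $\omega_k$;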
step 2.3: process the mode data obtained from mode decomposition with a frequency-domain enhancement technique that strengthens the frequency-band signal of each mode, so that the processed data better match the actual chemical water quality;
step 2.4: project the data after mode decomposition into the frequency domain and randomly sample each mode's frequency-domain representation, which greatly reduces the input vector length and hence the computational complexity; random sampling loses part of the signal baseband, but because the high-frequency band of actual production data contains a large amount of noise, sampling the high frequencies relatively sparsely after mode decomposition reduces this loss;
step 2.5: weight each mode's data after random sampling, import and merge the two data sets, build a model on the screened mode features, and perform 7-fold cross-validation to obtain feature importance for use in understanding and analyzing the water quality index.
Further, the specific operation mode of step 3 is as follows:
step 3.1: using the series decomposition unit of the Autoformer, smooth the periodic term and highlight the trend term based on the idea of a moving average:
$$\xi_t = \mathrm{APOOL}(\mathrm{Padding}(\xi)), \qquad \xi_a = \xi - \xi_t$$
where $\xi_t$ and $\xi_a$ denote the extracted trend part and periodic (seasonal) part, respectively, and $\mathrm{APOOL}(\mathrm{Padding}(\cdot))$ denotes the moving average with the padding applied to keep the series length unchanged;
step 3.2: the model as a whole adopts an encoder-decoder structure, in which the encoder input is the information of the past $L$ time steps, $x_{en}$, and the decoder input is produced by the series decomposition unit according to:
$$x_{des} = \mathrm{Concat}(x_{ens}, x_0),$$
$$x_{det} = \mathrm{Concat}(x_{ent}, x_{Mean});$$
here $x_{ens}$ and $x_{ent}$ are the periodic and trend parts obtained by decomposing the latter half of $x_{en}$, of length $L/2$, from which $x_{des}$ and $x_{det}$ are built; the Concat for $x_{des}$ pads with zeros ($x_0$), and the Concat for $x_{det}$ pads with the mean ($x_{Mean}$);
step 3.3: combine the subsequences and establish the preliminary chemical water quality index model.
Further, the specific operation mode of step 4 is as follows:
step 4.1: train the Autoformer model to predict and analyze the chemical water quality index; the model links series using Auto-Correlation attention, which improves on the original self-attention and increases information utilization by attending to similar sub-processes at the same phase of different periods; it mainly comprises period-dependency discovery and time-delay information aggregation, and realizes series-level connection;
step 4.2: for period-dependency discovery, following stochastic process theory, the autocorrelation coefficient of a discrete-time process $\{x_t\}$ is calculated as:
$$R_{xx}(\phi) = \lim_{L \to \infty} \frac{1}{L} \sum_{t=1}^{L} x_t \, x_{t-\phi}$$
where the autocorrelation coefficient $R_{xx}(\phi)$ represents the similarity between the series $\{x_t\}$ and its $\phi$-delayed series $\{x_{t-\phi}\}$, and is used to identify similar periods of candidate length $\phi$;
step 4.3: time-delay information aggregation gathers the information of similar subsequences to link the series; the information is first aligned by the $\mathrm{Roll}()$ operation according to the estimated period lengths and then aggregated:
$$\tau_1, \ldots, \tau_k = \underset{\tau \in \{1, \ldots, L\}}{\arg\mathrm{Topk}} \left( R_{Q,K}(\tau) \right)$$
$$\hat{R}_{Q,K}(\tau_1), \ldots, \hat{R}_{Q,K}(\tau_k) = \mathrm{SoftMax}\left( R_{Q,K}(\tau_1), \ldots, R_{Q,K}(\tau_k) \right)$$
$$\mathrm{AutoCorrelation}(Q, K, V) = \sum_{i=1}^{k} \mathrm{Roll}(V, \tau_i) \, \hat{R}_{Q,K}(\tau_i)$$
where $Q$, $K$, $V$ denote the query, key, and value as in self-attention; to avoid selecting irrelevant or opposite phases, only the $k$ most probable period lengths $\tau_1, \ldots, \tau_k$ are retained;
step 4.4: taking prediction of the next 5 days as an example, the data of the preceding 5 days are decomposed into modes, each decomposed mode is projected onto a different frequency-domain baseband and denoised, and the result is input into the Autoformer model; the target prediction value of each mode is finally output through the series decomposition unit.
Further, the specific operation mode of step 5 is as follows:
step 5.1: using the evaluation function, combine the prediction results of each mode obtained in step 4.4 with different weights to form the prediction result;
step 5.2: reconstruct the modes according to their autocorrelation coefficients to calculate the final prediction result, where $k$ is the number of modes and $\zeta_k$ is the prediction result of the $k$-th mode;
step 5.3: obtain the final water quality prediction result.
Advantageous effects
Compared with the prior art, the chemical water quality index prediction method based on variational mode decomposition and the Autoformer model has the following beneficial effects:
(1) The technical scheme preprocesses the water quality parameter data set and the historical information of related factors, finding smoothing critical points and handling missing values; decomposes the water quality feature parameters using variational mode decomposition and then models the water quality parameter data set; screens features of the related-factor historical information data set through a feature engineering model, with calculations based on weather, season, and flow, for use in understanding and analyzing chemical water quality; progressively extracts trend terms and periodic terms from the hidden variables through a series decomposition unit; and performs time-delay information aggregation and period-dependency discovery on similar subsequences of different periods. The features are decomposed into modes by variational mode decomposition, each mode's frequency-domain representation is enhanced and randomly sampled, and historical feature information is combined in the chemical water quality index prediction, so that data correction is finer-grained.
(2) The prediction method of this technical scheme makes full use of the water quality parameter data set and the historical information of related factors. It decomposes the water quality feature parameters by variational mode decomposition, then models the water quality parameter data set for understanding and analyzing chemical water quality, progressively extracts trend terms and periodic terms from the hidden variables through a series decomposition unit, and performs time-delay information aggregation and period-dependency discovery on similar subsequences of different periods. Combining historical feature information with mode information in the chemical water quality index prediction makes the prediction results more accurate and reasonable. The time series is combined with other related information, and prediction accuracy is improved through feature screening and modeling.
Drawings
FIG. 1 is a schematic overall flow diagram of the present invention.
FIG. 2 is a flow chart of the pretreatment of the water quality parameter data set in the present invention.
FIG. 3 is a diagram of the variational mode decomposition of features and the frequency-domain enhancement in the present invention.
FIG. 4 is an architecture diagram of the Autoformer model in the present invention.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. The described embodiments are only some embodiments of the invention, not all embodiments. Various modifications and improvements of the technical solutions of the present invention may be made by those skilled in the art without departing from the design concept of the present invention, and all of them should fall into the protection scope of the present invention.
Example 1:
As shown in FIGS. 1 to 4, a chemical water quality index prediction method based on variational mode decomposition and the Autoformer model preprocesses a water quality parameter data set and the historical information of related factors, finding smoothing critical points and handling missing values; decomposes the water quality feature parameters using variational mode decomposition and then models the water quality parameter data set; screens features of the related-factor historical information data set through a feature engineering model, with calculations based on weather, season, and flow, for use in understanding and analyzing chemical water quality; progressively extracts trend terms and periodic terms from the hidden variables through a series decomposition unit; and performs time-delay information aggregation and period-dependency discovery on similar subsequences of different periods. The specific steps are as follows:
Step 1: Perform preliminary data preprocessing on the water quality parameter data set and the historical information of related influence factors: fill missing values, remove highly correlated features, encode features, understand the data distribution and extract effective features through visual analysis, observe the relationship between the water quality parameters and time, and smooth abnormal values. The specific operation is as follows:
Step 1.1: Define WATER as the water quality parameter data, where id, WATER-time, WATER-PH, WATER-ZN, WATER-P, WATER-NH, WATER-cod, WATER-flow and WATER-temperature denote, respectively, the chemical water quality record number, time information, pH value, total nitrogen content, total phosphorus content, ammonia nitrogen content, chemical oxygen demand, flow, and water temperature. The flow is the instantaneous flow at the time of measurement, and the daytime fluctuation of air temperature is positively correlated with water temperature.
Step 1.2: Define RELATE as the related-factor information data set, where time, relationship-rain and relationship-temperature denote, respectively, the time (date), the precipitation, and the air temperature of the day, satisfying RELATE = { time, relationship-rain, relationship-temperature }.
Step 1.3: In the water quality parameter data set WATER, sort by id and time, fill missing values, and aggregate the water quality features by day. Where the pH value is 0, set it to 7, and fill the remaining missing values with the mean value of that day.
Step 1.4: Construct historical information of static features and, using chemical oxygen demand as the reference dimension, judge whether the historical information is correlated with the prediction target. If so, retain the feature; otherwise discard it to reduce interference with the model and reduce dimensionality. Encode features that cannot be used directly in calculation.
Step 1.5: Perform visual analysis on the water quality parameter data, observe the relationship between the parameters and time, smooth abnormal values, and judge the correlation of the historical influence factors.
Step 1.6: Obtain the final preprocessed water quality parameter information WATER-PRE.
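The cleaning rules of Step 1.3 can be sketched as follows. This is a minimal, dependency-free illustration rather than the patented implementation: the record layout (dictionaries with `id`, `time`, `ph`, `cod`), the `"YYYY-MM-DD HH:MM"` time format, the use of `None` for a missing value, and the function name `preprocess` are all assumptions introduced for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical subset of the WATER features, used only for illustration.
FEATURES = ("ph", "cod")


def preprocess(records, features=FEATURES):
    """Sort by id and time, set pH 0 to 7, fill gaps with the day's mean, aggregate by day."""
    # Copy each record so the caller's data is left untouched, then sort by (id, time).
    rows = sorted((dict(r) for r in records), key=lambda r: (r["id"], r["time"]))

    groups = defaultdict(list)
    for r in rows:
        if r.get("ph") == 0:          # rule from Step 1.3: a pH of 0 becomes 7
            r["ph"] = 7.0
        day = r["time"][:10]          # assumed "YYYY-MM-DD HH:MM" timestamps
        groups[(r["id"], day)].append(r)

    daily = []
    for (station, day), group in groups.items():
        out = {"id": station, "day": day}
        for name in features:
            present = [g[name] for g in group if g.get(name) is not None]
            if not present:           # nothing measured that day: leave the gap
                out[name] = None
                continue
            day_mean = mean(present)
            # Fill each missing reading with the day's mean, then aggregate by day.
            filled = [g[name] if g.get(name) is not None else day_mean for g in group]
            out[name] = mean(filled)
        daily.append(out)
    return daily


if __name__ == "__main__":
    sample = [
        {"id": 1, "time": "2022-01-01 08:00", "ph": 0, "cod": 10.0},
        {"id": 1, "time": "2022-01-01 16:00", "ph": 8.0, "cod": None},
    ]
    print(preprocess(sample))
```

Because filling with the day's mean happens before the daily aggregation, a missing reading does not shift that day's aggregated value; it only keeps the per-day record count consistent.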
Step 2: Screen features of the related historical information data set through a feature engineering model, discard historical information features weakly correlated with the prediction target by means of data analysis, decompose the water quality features by variational mode decomposition, perform frequency-domain enhancement, and obtain feature importance for understanding and analyzing the water quality parameters. The specific operation is as follows:
Step 2.1: Using variational mode decomposition, search for a set of mode components and the center frequency of each mode, thereby partitioning the frequency domain. Each mode is smooth after demodulation to baseband, and the corresponding constrained variational problem is:
$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)$$
where $K$ is the number of modes to be decomposed; $\{u_k\}$ and $\{\omega_k\}$ are, respectively, the $k$-th mode component and its center frequency after decomposition; $f(t)$ is the signal being decomposed; $\delta(t)$ is the Dirac function; and $*$ is the convolution operator.
To solve for the $K$ modes, the alternating direction method of multipliers (ADMM) is used as the iterative algorithm, combined with the Parseval/Plancherel Fourier isometry, to optimize each mode component and center frequency by searching for the saddle point of the augmented Lagrangian. The alternately updated expressions for $u_k$, $\omega_k$ and $\lambda$ are:
$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha(\omega - \omega_k)^2}$$
$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \, |\hat{u}_k^{n+1}(\omega)|^2 \, d\omega}{\int_0^{\infty} |\hat{u}_k^{n+1}(\omega)|^2 \, d\omega}$$
$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \gamma \left( \hat{f}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{n+1}(\omega) \right)$$
where $\alpha$ is the quadratic penalty factor; $\gamma$ is the noise tolerance, chosen to meet the fidelity requirement of the signal decomposition; and $\hat{u}_k^{n+1}(\omega)$, $\hat{u}_i(\omega)$, $\hat{f}(\omega)$, $\hat{\lambda}(\omega)$ are the Fourier transforms of $u_k^{n+1}(t)$, $u_i(t)$, $f(t)$, $\lambda(t)$, respectively.
Step 2.2: The main iterative solution steps of variational mode decomposition are as follows. First, initialize $\{\hat{u}_k^1\}$, $\{\omega_k^1\}$, $\hat{\lambda}^1$ and the maximum number of iterations $N$. Then update $\hat{u}_k$, $\omega_k$ and $\hat{\lambda}$ iteratively. Given a convergence tolerance $\epsilon > 0$, if $\sum_{k} \|\hat{u}_k^{n+1} - \hat{u}_k^{n}\|_2^2 / \|\hat{u}_k^{n}\|_2^2 < \epsilon$ is not satisfied, continue iterating on $\hat{u}_k$ and $\omega_k$.
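The iteration of Steps 2.1 and 2.2 can be sketched in NumPy as below. This is a simplified illustration of the standard VMD updates under stated assumptions, not the patent's implementation: it omits the mirror extension of the signal that reference implementations usually apply, works with normalised frequencies (cycles per sample), and the function name `vmd`, the default `alpha=2000.0`, and the uniform initialisation of the center frequencies are choices made for the example.

```python
import numpy as np


def vmd(f, K=2, alpha=2000.0, gamma=0.0, eps=1e-7, N=500):
    """Simplified VMD sketch: returns (modes [K, T], center frequencies [K])."""
    f = np.asarray(f, dtype=float)
    T = f.size
    freqs = np.fft.fftfreq(T)                 # normalised frequencies in [-0.5, 0.5)
    pos = freqs >= 0                          # keep the one-sided (analytic) spectrum
    f_hat_plus = np.where(pos, np.fft.fft(f), 0)

    u_hat = np.zeros((K, T), dtype=complex)   # mode spectra, initialised to zero
    omega = 0.5 * np.arange(K) / K            # assumed uniform initial center frequencies
    lam = np.zeros(T, dtype=complex)          # Lagrange multiplier spectrum

    for _ in range(N):
        u_prev = u_hat.copy()
        for k in range(K):
            # Wiener-filter update of mode k from the residual of all other modes.
            residual = f_hat_plus - (u_hat.sum(axis=0) - u_hat[k])
            u_hat[k] = (residual + lam / 2) / (1 + 2 * alpha * (freqs - omega[k]) ** 2)
            # Center frequency = power-weighted mean frequency of the mode.
            power = np.abs(u_hat[k, pos]) ** 2
            omega[k] = np.sum(freqs[pos] * power) / (np.sum(power) + 1e-12)
        # Dual ascent on the multiplier; gamma = 0 disables exact reconstruction.
        lam = lam + gamma * (f_hat_plus - u_hat.sum(axis=0))
        change = np.sum(np.abs(u_hat - u_prev) ** 2) / (np.sum(np.abs(u_prev) ** 2) + 1e-12)
        if change < eps:
            break

    # Real-valued modes from the one-sided spectra (the DC term must not be doubled).
    modes = 2 * np.real(np.fft.ifft(u_hat, axis=1)) - np.real(u_hat[:, :1]) / T
    return modes, omega


if __name__ == "__main__":
    t = np.arange(1000) / 1000.0
    signal = np.cos(2 * np.pi * 5 * t) + 0.5 * np.cos(2 * np.pi * 100 * t)
    modes, omega = vmd(signal, K=2)
    print(omega)  # estimated center frequencies in cycles per sample
```

A larger `alpha` narrows the bandwidth of each mode, and `gamma` plays the role of the noise tolerance in the text: setting it to zero relaxes the reconstruction constraint, which is a common choice for noisy data.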
Step 2.3: Process the mode data obtained from mode decomposition with a frequency-domain enhancement technique that strengthens the frequency-band signal of each mode, so that the processed data better match the actual chemical water quality.
Step 2.4: Project the data after mode decomposition into the frequency domain and randomly sample each mode's frequency-domain representation, which greatly reduces the input vector length and hence the computational complexity. Random sampling loses part of the signal baseband, but because the high-frequency band of actual production data contains a large amount of noise, sampling the high frequencies relatively sparsely after mode decomposition reduces this loss.
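One way to realise the random frequency-domain sampling of Step 2.4 is sketched below. The text does not specify the sampling distribution, so the exponentially decaying probability profile, the `decay` parameter, and the function name `sample_frequencies` are assumptions chosen only to show how high frequencies can be sampled more sparsely than low ones.

```python
import numpy as np


def sample_frequencies(mode, n_keep, rng, decay=4.0):
    """Randomly keep n_keep frequency bins of one mode, favouring low frequencies."""
    spectrum = np.fft.rfft(mode)              # project the mode into the frequency domain
    n_bins = spectrum.size
    # Assumed profile: selection probability decays exponentially with frequency,
    # so the noisy high band is sampled more sparsely than the low band.
    weights = np.exp(-decay * np.arange(n_bins) / n_bins)
    idx = rng.choice(n_bins, size=n_keep, replace=False, p=weights / weights.sum())
    idx = np.sort(idx)
    return idx, spectrum[idx]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mode = np.sin(2 * np.pi * np.arange(128) / 16)
    idx, coeffs = sample_frequencies(mode, n_keep=16, rng=rng)
    print(idx)  # indices of the retained frequency bins
```

The retained coefficients form a much shorter input vector than the original series, which is where the reduction in computational complexity comes from.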
Step 2.5: Weight each mode's data after random sampling, import and merge the two data sets, build a model on the screened mode features, and perform 7-fold cross-validation to obtain feature importance for use in understanding and analyzing the water quality index.
Step 3: Process the different features by series decomposition, decompose out the initial trend term and periodic term, and aggregate time-delay information. The specific operation is as follows:
Step 3.1: Using the series decomposition unit of the Autoformer, smooth the periodic term and highlight the trend term based on the idea of a moving average:
$$\xi_t = \mathrm{APOOL}(\mathrm{Padding}(\xi)), \qquad \xi_a = \xi - \xi_t$$
where $\xi_t$ and $\xi_a$ denote the extracted trend part and periodic (seasonal) part, respectively, and $\mathrm{APOOL}(\mathrm{Padding}(\cdot))$ denotes the moving average with the padding applied to keep the series length unchanged.
Step 3.2: The model as a whole adopts an encoder-decoder structure, in which the encoder input is the information of the past $L$ time steps, $x_{en}$, and the decoder input is produced by the series decomposition unit according to:
$$x_{des} = \mathrm{Concat}(x_{ens}, x_0),$$
$$x_{det} = \mathrm{Concat}(x_{ent}, x_{Mean});$$
Here $x_{ens}$ and $x_{ent}$ are the periodic and trend parts obtained by decomposing the latter half of $x_{en}$, of length $L/2$, from which $x_{des}$ and $x_{det}$ are built. The Concat for $x_{des}$ pads with zeros ($x_0$), and the Concat for $x_{det}$ pads with the mean ($x_{Mean}$).
Step 3.3: Combine the subsequences and establish the preliminary chemical water quality index model.
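The series decomposition of Step 3.1 and the decoder initialisation of Step 3.2 can be sketched for a univariate series as follows. The kernel size, the edge-replication padding, the forecast length `horizon`, and the function names are assumptions made for illustration; the Autoformer itself operates on batched multivariate tensors.

```python
import numpy as np


def series_decomp(x, kernel=5):
    """Moving-average decomposition; returns (periodic part, trend part), same length as x."""
    x = np.asarray(x, dtype=float)
    front = (kernel - 1) // 2
    back = kernel - 1 - front
    # Padding: repeat the edge values so the moving average keeps the series length.
    padded = np.concatenate([np.repeat(x[:1], front), x, np.repeat(x[-1:], back)])
    trend = np.convolve(padded, np.ones(kernel) / kernel, mode="valid")
    return x - trend, trend


def decoder_inputs(x_en, horizon, kernel=5):
    """Build x_des and x_det from the latter half of the encoder input x_en."""
    x_en = np.asarray(x_en, dtype=float)
    x_ens, x_ent = series_decomp(x_en[len(x_en) // 2:], kernel)
    x_des = np.concatenate([x_ens, np.zeros(horizon)])                 # zero padding (x_0)
    x_det = np.concatenate([x_ent, np.full(horizon, x_en.mean())])     # mean padding (x_Mean)
    return x_des, x_det


if __name__ == "__main__":
    series = np.sin(2 * np.pi * np.arange(48) / 12) + 0.05 * np.arange(48)
    x_des, x_det = decoder_inputs(series, horizon=12)
    print(x_des.shape, x_det.shape)
```

The zero and mean placeholders occupy the positions to be forecast, so the decoder refines the periodic and trend parts separately before they are summed.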
And 4, step 4: inputting the decomposed and polymerized subsequence into an auto-former model for training; the specific operation mode is as follows:
step 4.1: training is carried out through an Auto-former model, chemical water quality indexes are predicted and analyzed, the model is linked by adopting Auto-correlation attention, the original self-attention is improved, the information utility is expanded, similar subprocesses among similar phases of a period are concerned, the periodic dependence discovery and the aggregation of time delay information are mainly included, and the linking of a sequence set is realized.
Step 4.2: for periodic dependency finding, by means of random process theory, for discrete time processes { x } t Calculate its autocorrelation coefficient
Wherein the autocorrelation coefficientRepresenting a sequence x t And { x } t-φ Similarity between them, for similar periods
Step 4.3: time-delay information aggregation gathers the information of similar subsequences in order to connect the series: the information is first aligned by the Roll() operation according to the estimated period lengths and then aggregated.
Here the inputs are the query, key and value of self-attention, and the period lengths are chosen so as to avoid selecting irrelevant or opposite phases.
Step 4.4: taking a prediction of the next 5 days as an example, the modes of the preceding 5 days are decomposed, each decomposed mode is projected onto a different frequency-domain baseband and denoised, and the result is input into the auto-former model; the target prediction value of each mode is finally output through the sequence decomposition unit.
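The period-dependency discovery of step 4.2 and the time-delay aggregation of step 4.3 can be illustrated with the following sketch. It is a simplified single-series version based on assumptions, since the formulas are not reproduced in this text: the autocorrelation is taken as the circular average $\frac{1}{L}\sum_t x_t x_{t-\phi}$ computed by FFT, the `top_k` largest lags are treated as the estimated period lengths, and their scores are normalized with a softmax. The function names and the `top_k` parameter are hypothetical.

```python
import numpy as np

def autocorrelation(q, k):
    """Circular correlation of q and k at every lag, computed with the FFT."""
    n = len(q)
    fq = np.fft.rfft(q)
    fk = np.fft.rfft(k)
    # Inverse FFT of F(q) * conj(F(k)) gives sum_t q[t + lag] * k[t] for each lag.
    return np.fft.irfft(fq * np.conj(fk), n=n) / n

def time_delay_aggregation(q, k, v, top_k=2):
    """Align v by the most likely period lengths with a roll, then aggregate."""
    r = autocorrelation(q, k)
    # Keep only the top_k lags with the highest correlation (assumed selection rule).
    lags = np.argsort(r)[-top_k:][::-1]
    # Softmax over the selected correlations gives the aggregation weights.
    w = np.exp(r[lags] - r[lags].max())
    w /= w.sum()
    # Roll() aligns sub-series that sit at the same phase of different periods.
    out = sum(wi * np.roll(v, -lag) for wi, lag in zip(w, lags))
    return out, lags, w
```

For a series with a clean period of 8 samples, the selected lags are multiples of 8, so the rolled copies coincide with the original series and the aggregate reproduces it.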
Step 5: accumulating and reconstructing the obtained prediction results; the specific operations are as follows:
Step 5.1: combining the results obtained in step 4.4 with the evaluation function, weighting the prediction results of the individual modes differently, to form the prediction result.
Step 5.2: reconstructing the modes according to their autocorrelation coefficients to calculate the final prediction result, where k is the number of modes and $\zeta_k$ is the prediction result of the k-th mode.
Step 5.3: obtaining the final water quality prediction result.
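As a rough illustration of step 5, the per-mode predictions can be combined by a weighted sum. The exact reconstruction formula is not reproduced in this text, so the sketch below is only an assumption: the function name `reconstruct` is hypothetical, the weights are taken as given inputs, and plain accumulation (all weights equal to 1) is used when no weights are supplied.

```python
import numpy as np

def reconstruct(mode_predictions, weights=None):
    """Combine per-mode predictions (shape: modes x horizon) into one forecast."""
    preds = np.asarray(mode_predictions, dtype=float)
    if weights is None:
        # Assumed default: plain accumulation of all modes.
        weights = np.ones(preds.shape[0])
    weights = np.asarray(weights, dtype=float)
    # Weighted sum over the mode axis.
    return weights @ preds
```

With three modes predicting `[1, 2]`, `[3, 4]` and `[5, 6]`, plain accumulation gives `[9, 12]`.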
Claims (5)
1. A chemical water quality index prediction method based on variational modal decomposition and an auto-former model, comprising: performing data preprocessing on a water quality parameter data set and on historical information of related factors, finding smoothing critical points and handling missing values; decomposing the water quality characteristic parameters by variational modal decomposition and then modeling the water quality parameter data set; screening features of the related-factor historical information data set through a feature engineering model, with calculations based on weather, season and flow, the features being used for understanding and analyzing the chemical water quality; gradually extracting trend terms and periodic terms from hidden variables through a sequence decomposition unit; and performing time-delay information aggregation and period-dependency discovery on similar subsequences of different periods; the method comprises the following specific steps:
step 1: performing preliminary data preprocessing on the water quality parameter data set and the historical information of related influencing factors, filling missing values, removing highly correlated features and encoding features, understanding the data distribution and extracting effective features through visual analysis, observing the relationship between the water quality parameters and time, and smoothing abnormal values; the specific operations are as follows:
step 1.1: defining WATER as the water quality parameter data, wherein id, WATER-time, WATER-PH, WATER-ZN, WATER-P, WATER-NH, WATER-cod, WATER-flow and WATER-temperature are respectively the record number of the chemical water quality sample, the time information, the pH value, the total nitrogen content, the total phosphorus content, the ammonia nitrogen content, the chemical oxygen demand, the flow and the water temperature, wherein the flow is the instantaneous flow at the time of measurement, and the daytime fluctuation of the air temperature is positively correlated with the water temperature;
step 1.2: defining RELATE as the data set of related-factor information, wherein time, relationship-rain and relationship-temperature are respectively the date, the precipitation and the air temperature of the day, satisfying RELATE = {time, relationship-rain, relationship-temperature};
step 1.3: in the water quality parameter data set WATER, sorting by id and time, filling missing values, and aggregating the water quality features by day, wherein a pH value of 0 is set to 7 and the remaining missing values are filled with the mean value of that day;
step 1.4: establishing historical information of static features and, taking the chemical oxygen demand as the dimension, judging whether each feature is correlated with the prediction target; if so, the feature is retained, otherwise it is discarded to reduce interference with the model and to reduce dimensionality; features that cannot be used directly in calculation are encoded;
step 1.5: performing visual analysis on the water quality parameter data, observing the relationship between the parameters and time, smoothing abnormal values, and judging the correlation of historical influencing factors;
step 1.6: obtaining the finally preprocessed water quality parameter information WATER-PRE;
step 2: screening features of the related historical information data set through a feature engineering model, discarding by data analysis the historical information features that are only weakly correlated with the prediction target, decomposing the water quality features by variational modal decomposition, performing frequency-domain enhancement, and obtaining feature importance for understanding and analyzing the water quality parameters;
step 3: processing the different features by sequence decomposition, decomposing the initial trend term and periodic term, and aggregating time-delay information;
step 4: inputting the decomposed and aggregated subsequences into the auto-former model for training;
step 5: accumulating and reconstructing the obtained prediction results to obtain the final prediction result.
2. The chemical water quality index prediction method based on variational modal decomposition and auto-former model according to claim 1, characterized in that: the specific operation mode of the step 2 is as follows:
step 2.1: by the variational modal decomposition method, searching for a set of modal components and the center frequency of each mode, thereby dividing the frequency domain such that each mode is smooth after being demodulated to baseband;
in the corresponding calculation formula, K is the number of modes to be decomposed; $\{u_k\}$ and $\{\omega_k\}$ are respectively the k-th modal component and its center frequency after decomposition; $\delta(t)$ is the Dirac function; and $*$ is the convolution operator;
to solve for the K modes, the alternating direction method of multipliers (ADMM) is used as an iterative algorithm, combined with the Parseval/Plancherel Fourier isometry, to optimize each modal component and center frequency by searching for the saddle point of the augmented Lagrangian function, alternately updating $u_k$, $\omega_k$ and $\lambda$ in each iteration;
in the update expressions, $\gamma$ is the noise tolerance, which meets the fidelity requirement of the signal decomposition, and $\hat{u}_i(\omega)$, $\hat{f}(\omega)$ and $\hat{\lambda}(\omega)$ are respectively the Fourier transforms of $u_i(t)$, $f(t)$ and $\lambda(t)$;
step 2.2: the main iterative solution steps of the variational modal decomposition are as follows: first, the modal components, the center frequencies, $\lambda^1$ and the maximum number of iterations N are initialized; the modal components and center frequencies are then updated iteratively; for a convergence precision $\epsilon > 0$, if the convergence condition is not satisfied and $n < N$, the iteration over the modal components and $\omega_k$ continues;
step 2.3: processing the decomposed modal data by a frequency-domain enhancement technique to enhance its frequency band signals, so that the processed data better matches the actual chemical water quality;
step 2.4: projecting the decomposed modal data into the frequency-domain space and randomly sampling each modal frequency-domain space, which greatly reduces the length of the input vector and the computational complexity; random sampling loses part of the signal baseband, but because actual production data contains a large amount of noise in the high-frequency domain, the high-frequency sampling after modal decomposition needs to be relatively sparse, which reduces this loss;
step 2.5: weighting each modal data after random sampling, importing and merging the two data sets, modeling on the screened modal features, and performing 7-fold cross-validation to obtain the feature importance for use in understanding and analyzing the water quality indexes.
3. The chemical water quality index prediction method based on variational modal decomposition and auto-former model according to claim 1, characterized in that: the specific operation mode of the step 3 is as follows:
step 3.1: smoothing out the periodic term and highlighting the trend term, based on the idea of the moving average, by the sequence decomposition unit in the auto-former: $\xi_t = \mathrm{APOOL}(\mathrm{Padding}(\xi))$, $\xi_a = \xi - \xi_t$;
wherein $\xi_t$ is the trend term obtained by the moving average and $\xi_a$ is the remaining seasonal (periodic) part, and APOOL(Padding(·)) denotes the moving average, with padding applied to keep the sequence length unchanged;
step 3.2: the model as a whole adopts an encoder-decoder structure, wherein the encoder input $x_{en}$ has time length L; the decoder input must first be processed by the sequence decomposition unit, according to the following formulas:
$x_{des} = \mathrm{Concat}(x_{ens}, x_0)$,
$x_{det} = \mathrm{Concat}(x_{ent}, x_{Mean})$;
the latter half of $x_{en}$, of length L/2, is decomposed into $x_{ens}$ and $x_{ent}$, from which $x_{des}$ and $x_{det}$ are obtained, wherein the Concat for $x_{des}$ pads with zeros and the Concat for $x_{det}$ pads with the mean;
step 3.3: combining the subsequences and preliminarily establishing the chemical water quality index model.
4. The chemical water quality index prediction method based on variational modal decomposition and auto-former model according to claim 1, characterized in that: the specific operation mode of the step 4 is as follows:
step 4.1: training the Auto-former model to predict and analyze the chemical water quality indexes, wherein the model connects series with auto-correlation attention, which improves on the original self-attention and increases information utilization by attending to similar sub-processes at the same phase of different periods, and which consists mainly of period-dependency discovery and time-delay information aggregation, thereby realizing series-level connection;
step 4.2: for period-dependency discovery, following the theory of stochastic processes, calculating the autocorrelation coefficient of the discrete-time process $\{x_t\}$;
wherein the autocorrelation coefficient represents the similarity between the sequence $\{x_t\}$ and its lagged copy $\{x_{t-\phi}\}$, and is used to find similar periods;
step 4.3: time-delay information aggregation gathers the information of similar subsequences in order to connect the series, wherein the information is first aligned by the Roll() operation according to the estimated period lengths and then aggregated;
wherein the inputs are the query, key and value of self-attention, and the period lengths are chosen so as to avoid selecting irrelevant or opposite phases;
step 4.4: taking a prediction of the next 5 days as an example, the modes of the preceding 5 days are decomposed, each decomposed mode is projected onto a different frequency-domain baseband and denoised, and the result is input into the auto-former model; the target prediction value of each mode is finally output through the sequence decomposition unit.
5. The method for predicting the chemical water quality index based on the variational modal decomposition and the auto-former model according to claim 1, which is characterized in that: the specific operation mode of the step 5 is as follows:
step 5.1: combining the results obtained in step 4.4 with the evaluation function, weighting the prediction results of the individual modes differently, to form the prediction result;
step 5.2: reconstructing the modes according to their autocorrelation coefficients to calculate the final prediction result, where k is the number of modes and $\zeta_k$ is the prediction result of the k-th mode;
step 5.3: obtaining the final water quality prediction result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211697456.9A CN115829157A (en) | 2022-12-28 | 2022-12-28 | Chemical water quality index prediction method based on variational modal decomposition and auto former model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115829157A | 2023-03-21 |
Family
ID=85518944
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211697456.9A (Withdrawn; published as CN115829157A) | Chemical water quality index prediction method based on variational modal decomposition and auto former model | 2022-12-28 | 2022-12-28 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115829157A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116776228A (en) * | 2023-08-17 | 2023-09-19 | 合肥工业大学 | Power grid time sequence data decoupling self-supervision pre-training method and system |
CN116776228B (en) * | 2023-08-17 | 2023-10-20 | 合肥工业大学 | Power grid time sequence data decoupling self-supervision pre-training method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WW01 | Invention patent application withdrawn after publication | Application publication date: 20230321 |