CN115829157A - Chemical water quality index prediction method based on variational modal decomposition and auto former model - Google Patents

Info

Publication number
CN115829157A
Authority
CN
China
Prior art keywords
water quality
decomposition
modal
information
auto
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202211697456.9A
Other languages
Chinese (zh)
Inventor
周泓
徐斌
严虹
赵保中
石锐
李国豪
廖阳阳
宗美晨
古小诗
杜萌
刘云飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huaiyin Institute of Technology
Original Assignee
Huaiyin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huaiyin Institute of Technology filed Critical Huaiyin Institute of Technology
Priority to CN202211697456.9A priority Critical patent/CN115829157A/en
Publication of CN115829157A publication Critical patent/CN115829157A/en
Withdrawn legal-status Critical Current

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A20/00: Water conservation; Efficient water supply; Efficient water use
    • Y02A20/152: Water filtration

Landscapes

  • Activated Sludge Processes (AREA)

Abstract

A chemical water quality index prediction method based on variational modal decomposition and an Autoformer model comprises: preprocessing a water quality parameter data set and historical information of related factors, including identifying critical points for smoothing and handling missing values; decomposing the water quality characteristic parameters by variational modal decomposition and then modeling the water quality parameter data set; screening features of the related-factor historical information data set through a feature engineering model, with calculations based on weather, season and flow, for understanding and analyzing chemical water quality; progressively extracting trend terms and period terms from hidden variables through a series decomposition unit; and performing time-delay information aggregation and period-based dependency discovery on similar subsequences of different periods. By extracting time series features, decomposing the feature information with variational modal decomposition and calculating feature importance, the method combines historical feature information with modal information in chemical water quality time series prediction, so that the prediction result is more accurate and reasonable.

Description

Chemical water quality index prediction method based on variational modal decomposition and auto former model
Technical Field
The invention relates to the technical field of chemical water quality index prediction, in particular to a chemical water quality index prediction method based on variational modal decomposition and an auto-former model.
Background
A water quality index serves as a concrete measure of the degree of water pollution, and the pollutant discharge index data of chemical plants are collected automatically and in real time by sewage treatment stations. At present, the commonly used mathematical models of water quality fall into two categories: mechanistic water quality models and data-driven water quality models. When a mechanistic model built on theoretical foundations is applied, a large amount of suitable historical hydrological and water quality data is needed to calibrate the model parameters; moreover, when many factors influence a given water quality index, the mechanism becomes very complex, so that the model is difficult to establish and the relevant parameters are difficult to obtain. In recent years, with the rapid development of big data research, data-driven models based on sample data, using grey-box or black-box equations, have been widely applied in many disciplines, including river water quality prediction and early warning.
Traditional water quality index prediction methods generally use a fixed model; the prediction accuracy of statistical methods is limited, and analysis of the nonlinear characteristics of the water environment is lacking. In addition, because the water environment is complex, the time series of water quality indexes contain considerable noise, so traditional models have difficulty predicting water environment indexes effectively under complex water environment conditions.
Disclosure of Invention
In view of the above technical problems, the present technical solution provides a chemical water quality index prediction method based on variational modal decomposition and an Autoformer model. By comprehensively analyzing the time series data set and the related historical information data set, screening features, and applying a multivariate time series prediction method based on variational modal decomposition, the method achieves accurate prediction of chemical water quality indexes and effectively solves the above problems.
The invention is realized by the following technical scheme:
A chemical water quality index prediction method based on variational modal decomposition and an Autoformer model comprises: preprocessing a water quality parameter data set and historical information of related factors, including identifying critical points for smoothing and handling missing values; decomposing the water quality characteristic parameters by variational modal decomposition and then modeling the water quality parameter data set; screening features of the related-factor historical information data set through a feature engineering model, with calculations based on weather, season and flow, for understanding and analyzing chemical water quality; progressively extracting trend terms and period terms from hidden variables through a series decomposition unit; and performing time-delay information aggregation and period-based dependency discovery on similar subsequences of different periods. The specific steps are as follows:
step 1: performing preliminary preprocessing on the water quality parameter data set and the historical information of related influencing factors: filling missing values, removing highly correlated features, encoding features, understanding the data distribution and extracting effective features through visual analysis, observing the relationship between the water quality parameters and time, and smoothing abnormal values; the specific operations are as follows:
step 1.1: defining WATER as the water quality parameter data, wherein id, WATER-time, WATER-PH, WATER-ZN, WATER-P, WATER-NH, WATER-cod, WATER-flow and WATER-temperature correspond respectively to the chemical water quality record number, time information, pH value, total nitrogen content, total phosphorus content, ammonia nitrogen content, chemical oxygen demand, flow and water temperature, wherein the flow is the instantaneous flow at the time of measurement, and the daytime fluctuation of air temperature is positively correlated with the water temperature;
step 1.2: defining RELATE as the related-factor information data set, wherein time, RELATE-rain and RELATE-temperature correspond respectively to the date, the precipitation of the day and the air temperature of the day, satisfying RELATE = {time, RELATE-rain, RELATE-temperature};
step 1.3: in the water quality parameter data set WATER, sorting by id and time, filling missing values and aggregating the water quality features by day, wherein a pH value of 0 is set to 7 and the remaining missing values are filled with the mean value of the same day;
step 1.4: constructing historical information of static features, and judging, with chemical oxygen demand as the reference dimension, whether each feature is correlated with the prediction target; if so, the feature is retained, otherwise it is discarded to reduce interference with the model and to reduce dimensionality; features that cannot be used directly in calculation are encoded;
step 1.5: performing visual analysis on the water quality parameter data, observing the relationship between the parameters and time, smoothing abnormal values, and judging the correlation of the historical influencing factors;
step 1.6: obtaining the final preprocessed water quality parameter information WATER-PRE;
step 2: screening features of the related historical information data set through a feature engineering model, discarding through data analysis the historical features that are weakly correlated with the prediction target, decomposing the water quality features by variational modal decomposition, performing frequency-domain enhancement, and obtaining feature importance for understanding and analyzing the water quality parameters;
step 3: processing the different features by series decomposition, decomposing the initial trend term and period term, and aggregating time-delay information;
step 4: inputting the decomposed and aggregated subsequences into the Autoformer model for training;
step 5: accumulating and reconstructing the obtained prediction results to obtain the final prediction result.
Further, the specific operation mode of step 2 is as follows:
step 2.1: through variational modal decomposition, a set of modal components and the center frequency of each mode are sought, thereby partitioning the frequency domain, such that each mode is smooth after being demodulated to baseband; the specific calculation formula is as follows:
$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\}$$
$$\text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)$$
where $K$ is the number of modes to be decomposed; $\{u_k\}$ and $\{\omega_k\}$ are respectively the $k$-th modal component and its center frequency after decomposition; $f(t)$ is the signal to be decomposed; $\delta(t)$ is the Dirac function; $*$ is the convolution operator;
to solve for the $K$ modes, the alternating direction method of multipliers (ADMM) is used as an iterative algorithm, combined with the Parseval/Plancherel Fourier isometry, to optimize each modal component and center frequency and to search for the saddle point of the augmented Lagrangian function, with penalty factor $\alpha$; $u_k$, $\omega_k$ and $\lambda$ are updated alternately, and their expressions after each iteration are as follows:
$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \frac{\hat{\lambda}(\omega)}{2}}{1 + 2\alpha(\omega - \omega_k)^2}$$
$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega}{\int_0^{\infty} \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega}$$
$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \gamma \left( \hat{f}(\omega) - \sum_{k} \hat{u}_k^{n+1}(\omega) \right)$$
wherein $\gamma$ is the noise tolerance, which meets the fidelity requirement of the signal decomposition, and $\hat{u}_k^{n+1}(\omega)$, $\hat{u}_i(\omega)$, $\hat{f}(\omega)$ and $\hat{\lambda}(\omega)$ are respectively the Fourier transforms of $u_k^{n+1}(t)$, $u_i(t)$, $f(t)$ and $\lambda(t)$;
step 2.2: the main iterative solving steps of the variational modal decomposition are as follows: first, $\{\hat{u}_k^1\}$, $\{\omega_k^1\}$, $\hat{\lambda}^1$ and the maximum number of iterations $N$ are initialized; $\hat{u}_k$, $\omega_k$ and $\hat{\lambda}$ are then updated iteratively; for a given convergence precision $\epsilon > 0$, if the condition
$$\sum_{k} \frac{\left\| \hat{u}_k^{n+1} - \hat{u}_k^{n} \right\|_2^2}{\left\| \hat{u}_k^{n} \right\|_2^2} < \epsilon$$
is not satisfied and the maximum number of iterations $N$ has not been reached, the iteration on $\hat{u}_k$ and $\omega_k$ continues;
step 2.3: processing the modal data subjected to modal decomposition by a frequency domain enhancement technology, and enhancing a frequency band signal of the modal data, so that the processed data can better accord with the actual chemical water quality;
step 2.4: the data after modal decomposition are projected into the frequency domain, and random sampling is performed in the frequency-domain space of each mode, which greatly shortens the input vector and thereby reduces the computational complexity; random sampling may lose part of the baseband signal, but because actual production data contain a large amount of noise in the high-frequency range, sampling the high-frequency range relatively sparsely after modal decomposition reduces this loss;
step 2.5: weighting each modal data after random sampling, importing and merging two data sets, modeling according to each screened modal characteristic, and performing 7-fold cross validation to obtain characteristic importance and use the characteristic importance in understanding and analyzing water quality indexes.
Further, the specific operation mode of step 3 is as follows:
step 3.1: smoothing the periodic term and highlighting the trend term, based on the idea of the moving average, by the series decomposition unit in the Autoformer:
$$\xi_t = \mathrm{APOOL}(\mathrm{Padding}(\xi)), \qquad \xi_a = \xi - \xi_t$$
wherein $\xi_t$ and $\xi_a$ represent respectively the extracted trend part and the periodic (seasonal) part, and $\mathrm{APOOL}(\mathrm{Padding}(\cdot))$ represents the moving average with the padding operation adopted to keep the sequence length unchanged;
step 3.2: the model as a whole adopts an encoder-decoder structure, wherein the encoder input is the information of the past $L$ time steps, $x_{en}$; the decoder input needs to be processed by the series decomposition unit according to the following formulas:
$$x_{ens}, x_{ent} = \mathrm{SeriesDecomp}\left( x_{en\,\frac{L}{2}:L} \right)$$
$$x_{des} = \mathrm{Concat}(x_{ens}, x_0)$$
$$x_{det} = \mathrm{Concat}(x_{ent}, x_{Mean})$$
the latter half of $x_{en}$, of length $L/2$, is decomposed to obtain $x_{des}$ and $x_{det}$, wherein the Concat of $x_{des}$ is zero filling and the Concat of $x_{det}$ is mean filling;
step 3.3: and combining the subsequences and initially establishing a chemical water quality index model.
Further, the specific operation mode of step 4 is as follows:
step 4.1: training is performed with the Autoformer model to predict and analyze the chemical water quality indexes; the model connects series through an auto-correlation mechanism, which improves on the original self-attention, increases information utilization, and focuses on similar sub-processes at the same phase of different periods; it mainly comprises period-based dependency discovery and time-delay information aggregation, and realizes series-level connection;
step 4.2: for period-based dependency discovery, according to stochastic process theory, the autocorrelation coefficient of a discrete-time process $\{x_t\}$ is calculated as
$$R_{xx}(\varphi) = \lim_{L \to \infty} \frac{1}{L} \sum_{t=1}^{L} x_t x_{t-\varphi}$$
wherein the autocorrelation coefficient $R_{xx}(\varphi)$ represents the similarity between the sequence $\{x_t\}$ and its $\varphi$-lagged sequence $\{x_{t-\varphi}\}$; for similar periods
Figure BDA0004024078260000063
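step 4.3: time-delay information aggregation gathers the information of similar subsequences for series-level connection; the information is first aligned by the Roll() operation according to the estimated cycle lengths and is then aggregated:
$$\varphi_1, \ldots, \varphi_k = \underset{\varphi \in \{1, \ldots, L\}}{\arg\mathrm{Topk}} \; R_{Q,K}(\varphi)$$
$$\hat{R}_{Q,K}(\varphi_1), \ldots, \hat{R}_{Q,K}(\varphi_k) = \mathrm{SoftMax}\left( R_{Q,K}(\varphi_1), \ldots, R_{Q,K}(\varphi_k) \right)$$
$$\mathrm{AutoCorrelation}(Q, K, V) = \sum_{i=1}^{k} \mathrm{Roll}(V, \varphi_i) \, \hat{R}_{Q,K}(\varphi_i)$$
wherein $Q$, $K$ and $V$ denote the query, key and value of self-attention; to avoid picking irrelevant or opposite phases, only the $k$ most likely cycle lengths $\varphi_1, \ldots, \varphi_k$ are selected;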
step 4.4: taking the prediction of the future 5 days as an example, the former 5-day mode is used for decomposition, each decomposed mode is projected to different frequency domain base bands, noise reduction processing is carried out, then an auto former model is input, and each mode target prediction value is finally output through a sequence decomposition unit.
Further, the specific operation mode of step 5 is as follows:
step 5.1: combining the results obtained in the step 4.4 with the evaluation function and the prediction results of each mode according to different weights to finally form prediction results;
step 5.2: the coefficients of each mode according to its autocorrelation
Figure BDA0004024078260000069
Reconstructing, wherein a final prediction result calculation formula is as follows:
Figure BDA00040240782600000610
where k is the number of modes, ζ k Is the prediction result of the kth modality;
step 5.3: and obtaining a final water quality prediction result.
Advantageous effects
Compared with the prior art, the chemical water quality index prediction method based on variational modal decomposition and auto-former model has the following beneficial effects:
(1) The technical scheme mainly comprises the steps of preprocessing data of a water quality parameter data set and related factor historical information, searching a smooth critical point, processing a missing value, decomposing water quality characteristic parameters by using variational modal decomposition, modeling the water quality parameter data set, screening characteristics of the related factor historical information data set through a characteristic engineering model, calculating according to weather seasons and flow, calculating and analyzing the understanding of chemical water quality, gradually extracting trend items and period items from hidden variables through a sequence decomposition unit, and discovering delay information aggregation and cycle dependency of similar subsequences of different periods; the characteristic modes are decomposed through the variation modes, the frequency domains are used for enhancing and randomly sampling the frequency domains of the modes, and historical characteristic information is combined in the chemical water quality index prediction, so that the data correction is more detailed.
(2) According to the prediction method of the technical scheme, a water quality parameter data set, related factor historical information and variational modal decomposition are fully utilized to decompose water quality characteristic parameters, then the water quality parameter data set is modeled and used for understanding and analyzing of chemical water quality, trend items and period items are gradually extracted from hidden variables through a sequence decomposition unit, and time delay information aggregation and cycle dependency discovery are carried out on similar subsequences of different periods. In the chemical water quality index prediction, the historical characteristic information is combined, and meanwhile, the modal information is added, so that the prediction result is more accurate and reasonable. The time series and other related information are combined, and the prediction accuracy is improved through feature screening and modeling.
Drawings
FIG. 1 is a schematic overall flow diagram of the present invention.
FIG. 2 is a flow chart of the pretreatment of the water quality parameter data set in the present invention.
FIG. 3 is a diagram of the variational modal decomposition and frequency-domain enhancement of features in the present invention.
FIG. 4 is an architecture diagram of the Autoformer model in the present invention.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. The described embodiments are only some embodiments of the invention, not all embodiments. Various modifications and improvements of the technical solutions of the present invention may be made by those skilled in the art without departing from the design concept of the present invention, and all of them should fall into the protection scope of the present invention.
Example 1:
As shown in FIGS. 1 to 4, a chemical water quality index prediction method based on variational modal decomposition and an Autoformer model comprises: preprocessing a water quality parameter data set and historical information of related factors, including identifying critical points for smoothing and handling missing values; decomposing the water quality characteristic parameters by variational modal decomposition and then modeling the water quality parameter data set; screening features of the related-factor historical information data set through a feature engineering model, with calculations based on weather, season and flow, for understanding and analyzing chemical water quality; progressively extracting trend terms and period terms from hidden variables through a series decomposition unit; and performing time-delay information aggregation and period-based dependency discovery on similar subsequences of different periods. The specific steps are as follows:
Step 1: performing preliminary preprocessing on the water quality parameter data set and the historical information of related influencing factors: filling missing values, removing highly correlated features, encoding features, understanding the data distribution and extracting effective features through visual analysis, observing the relationship between the water quality parameters and time, and smoothing abnormal values; the specific operations are as follows:
Step 1.1: defining WATER as the water quality parameter data, wherein id, WATER-time, WATER-PH, WATER-ZN, WATER-P, WATER-NH, WATER-cod, WATER-flow and WATER-temperature correspond respectively to the chemical water quality record number, time information, pH value, total nitrogen content, total phosphorus content, ammonia nitrogen content, chemical oxygen demand, flow and water temperature, wherein the flow is the instantaneous flow at the time of measurement, and the daytime fluctuation of air temperature is positively correlated with the water temperature.
Step 1.2: defining RELATE as the related-factor information data set, wherein time, RELATE-rain and RELATE-temperature correspond respectively to the date, the precipitation of the day and the air temperature of the day, satisfying RELATE = {time, RELATE-rain, RELATE-temperature}.
Step 1.3: in the water quality parameter data set WATER, sorting by id and time, filling missing values and aggregating the water quality features by day, wherein a pH value of 0 is set to 7 and the remaining missing values are filled with the mean value of the same day.
Step 1.4: constructing historical information of static features, and judging, with chemical oxygen demand as the reference dimension, whether each feature is correlated with the prediction target; if so, the feature is retained, otherwise it is discarded to reduce interference with the model and to reduce dimensionality; features that cannot be used directly in calculation are encoded.
Step 1.5: performing visual analysis on the water quality parameter data, observing the relationship between the parameters and time, smoothing abnormal values, and judging the correlation of the historical influencing factors.
Step 1.6: obtaining the final preprocessed water quality parameter information WATER-PRE.
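The filling and daily aggregation of Step 1.3 can be sketched as follows. This is only a minimal illustration under stated assumptions: the record layout (dictionaries with `id`, `time`, `ph` and `cod` keys, `None` for a missing value, timestamps as `YYYY-MM-DD HH:MM` strings) and the function name `preprocess` are hypothetical and are not specified in the description.

```python
from collections import defaultdict
from statistics import mean


def preprocess(records, fields=("ph", "cod")):
    """Sort by (id, time), repair pH, fill gaps with the daily mean, aggregate by day.

    `records` is a list of dicts; the key names are illustrative assumptions.
    """
    rows = [dict(r) for r in sorted(records, key=lambda r: (r["id"], r["time"]))]

    # A pH reading of 0 is replaced by the neutral value 7 (Step 1.3).
    for r in rows:
        if r.get("ph") == 0:
            r["ph"] = 7.0

    # Group the readings of each station by calendar day.
    by_day = defaultdict(list)
    for r in rows:
        by_day[(r["id"], r["time"][:10])].append(r)

    daily = []
    for (rid, day), group in by_day.items():
        out = {"id": rid, "day": day}
        for field in fields:
            present = [g[field] for g in group if g[field] is not None]
            day_mean = mean(present) if present else None
            # Fill the remaining missing values with the mean of the same day.
            filled = [g[field] if g[field] is not None else day_mean for g in group]
            # Aggregate the day's readings into a single value.
            out[field] = mean(filled) if day_mean is not None else None
        daily.append(out)
    return daily


sample = [
    {"id": 1, "time": "2022-01-01 08:00", "ph": 0, "cod": 30.0},
    {"id": 1, "time": "2022-01-01 12:00", "ph": 8.0, "cod": None},
    {"id": 1, "time": "2022-01-02 08:00", "ph": 7.2, "cod": 40.0},
]
daily = preprocess(sample)
```

A day whose readings are all missing for a field is left as `None` here; how such days should be handled is not stated in the description.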
Step 2: screening characteristics of the related historical information data set through a characteristic engineering model, discarding historical information characteristics weakly related to a predicted target by utilizing data analysis, decomposing water quality characteristics through a variational mode, performing frequency domain enhancement and obtaining characteristic importance for understanding and analyzing water quality parameters; the specific operation mode is as follows:
Step 2.1: through variational modal decomposition, a set of modal components and the center frequency of each mode are sought, thereby partitioning the frequency domain, such that each mode is smooth after being demodulated to baseband; the specific calculation formula is as follows:
$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\}$$
$$\text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)$$
where $K$ is the number of modes to be decomposed; $\{u_k\}$ and $\{\omega_k\}$ are respectively the $k$-th modal component and its center frequency after decomposition; $f(t)$ is the signal to be decomposed; $\delta(t)$ is the Dirac function; $*$ is the convolution operator.
To solve for the $K$ modes, the alternating direction method of multipliers (ADMM) is used as an iterative algorithm, combined with the Parseval/Plancherel Fourier isometry, to optimize each modal component and center frequency and to search for the saddle point of the augmented Lagrangian function, with penalty factor $\alpha$; $u_k$, $\omega_k$ and $\lambda$ are updated alternately, and their expressions after each iteration are as follows:
$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \frac{\hat{\lambda}(\omega)}{2}}{1 + 2\alpha(\omega - \omega_k)^2}$$
$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega}{\int_0^{\infty} \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega}$$
$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \gamma \left( \hat{f}(\omega) - \sum_{k} \hat{u}_k^{n+1}(\omega) \right)$$
wherein $\gamma$ is the noise tolerance, which meets the fidelity requirement of the signal decomposition, and $\hat{u}_k^{n+1}(\omega)$, $\hat{u}_i(\omega)$, $\hat{f}(\omega)$ and $\hat{\lambda}(\omega)$ are respectively the Fourier transforms of $u_k^{n+1}(t)$, $u_i(t)$, $f(t)$ and $\lambda(t)$.
Step 2.2: the main iterative solving steps of the variational modal decomposition are as follows: first, $\{\hat{u}_k^1\}$, $\{\omega_k^1\}$, $\hat{\lambda}^1$ and the maximum number of iterations $N$ are initialized; $\hat{u}_k$, $\omega_k$ and $\hat{\lambda}$ are then updated iteratively; for a given convergence precision $\epsilon > 0$, if the condition
$$\sum_{k} \frac{\left\| \hat{u}_k^{n+1} - \hat{u}_k^{n} \right\|_2^2}{\left\| \hat{u}_k^{n} \right\|_2^2} < \epsilon$$
is not satisfied and the maximum number of iterations $N$ has not been reached, the iteration on $\hat{u}_k$ and $\omega_k$ continues.
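The iteration of Steps 2.1 and 2.2 can be illustrated with a simplified sketch of the standard variational modal decomposition updates. It is not the applicant's implementation: the function name `vmd`, the default penalty `alpha`, the uniform initialization of the center frequencies and the omission of the boundary mirroring used in reference implementations are all assumptions made for brevity. Frequencies are expressed in cycles per sample.

```python
import numpy as np


def vmd(f, K=2, alpha=2000.0, gamma=0.0, eps=1e-7, max_iter=500):
    """Simplified VMD: returns (modes of shape (K, T), center frequencies)."""
    f = np.asarray(f, dtype=float)
    T = f.size
    freqs = np.fft.fftfreq(T)                      # normalized frequency axis
    pos = freqs >= 0
    f_hat = np.where(pos, np.fft.fft(f), 0.0)      # one-sided spectrum of f
    u_hat = np.zeros((K, T), dtype=complex)        # mode spectra
    omega = 0.5 * np.arange(K) / K                 # initial center frequencies
    lam = np.zeros(T, dtype=complex)               # Lagrange multiplier

    for _ in range(max_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            # Wiener-filter style update of mode k around its center frequency.
            others = u_hat.sum(axis=0) - u_hat[k]
            u_hat[k] = (f_hat - others + lam / 2) / (1 + 2 * alpha * (freqs - omega[k]) ** 2)
            u_hat[k, ~pos] = 0.0
            # Center frequency = power-weighted mean frequency of the mode.
            power = np.abs(u_hat[k]) ** 2
            omega[k] = np.sum(freqs * power) / (np.sum(power) + 1e-12)
        # Dual ascent with step gamma (the noise tolerance).
        lam = lam + gamma * (f_hat - u_hat.sum(axis=0))
        # Relative change of all modes, compared with the precision eps.
        num = np.sum(np.abs(u_hat - u_prev) ** 2, axis=1)
        den = np.sum(np.abs(u_prev) ** 2, axis=1) + 1e-12
        if np.sum(num / den) < eps:
            break

    # Back to real time-domain modes from the one-sided spectra.
    weight = np.where(freqs > 0, 2.0, 0.0)
    weight[0] = 1.0
    modes = np.real(np.fft.ifft(u_hat * weight, axis=1))
    return modes, omega


t = np.arange(400)
signal = np.cos(2 * np.pi * 0.05 * t) + 0.5 * np.cos(2 * np.pi * 0.2 * t)
modes, omega = vmd(signal, K=2)
```

With `gamma=0` the multiplier is inactive, which corresponds to tolerating noise rather than enforcing exact reconstruction; a positive `gamma` enforces the constraint more strictly.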
Step 2.3: and processing the modal data subjected to modal decomposition by a frequency domain enhancement technology to enhance the frequency band signal of the modal data, so that the processed data can better accord with the actual chemical water quality.
Step 2.4: the data after modal decomposition are projected into the frequency domain, and random sampling is performed in the frequency-domain space of each mode, which greatly shortens the input vector and thereby reduces the computational complexity. Random sampling may lose part of the baseband signal, but because actual production data contain a large amount of noise in the high-frequency range, sampling the high-frequency range relatively sparsely after modal decomposition reduces this loss.
Step 2.5: weighting each modal data after random sampling, importing and merging two data sets, modeling according to each screened modal characteristic, and performing 7-fold cross validation to obtain characteristic importance and use the characteristic importance in understanding and analyzing water quality indexes.
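One possible reading of the frequency-domain random sampling in Step 2.4 is sketched below. The description does not give the sampling rule, so the split into a low band and a high band (`low_fraction`), the keep probabilities `p_low` and `p_high`, and both function names are hypothetical choices that only illustrate the idea of sampling the high-frequency range more sparsely.

```python
import numpy as np


def sparse_spectrum(mode, low_fraction=0.25, p_low=1.0, p_high=0.1, seed=0):
    """Randomly keep frequency bins of one mode, sparsely in the high band."""
    rng = np.random.default_rng(seed)
    spectrum = np.fft.rfft(mode)
    n_bins = spectrum.size
    cutoff = max(1, int(low_fraction * n_bins))
    # Keep probability per bin: dense below the cutoff, sparse above it.
    keep_prob = np.where(np.arange(n_bins) < cutoff, p_low, p_high)
    idx = np.flatnonzero(rng.random(n_bins) < keep_prob)
    return idx, spectrum[idx]


def from_sparse_spectrum(idx, coeffs, length):
    """Rebuild a time series from the retained bins (dropped bins are zero)."""
    spectrum = np.zeros(length // 2 + 1, dtype=complex)
    spectrum[idx] = coeffs
    return np.fft.irfft(spectrum, n=length)


t = np.arange(128)
mode = np.cos(2 * np.pi * 3 * t / 128)           # energy only in bin 3
idx, coeffs = sparse_spectrum(mode)
restored = from_sparse_spectrum(idx, coeffs, mode.size)
```

The retained coefficients form a shorter input vector than the original series, which is the complexity reduction the step refers to; a mode whose energy lies in the dense low band is restored almost unchanged.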
Step 3: processing the different features by series decomposition, decomposing the initial trend term and period term, and aggregating time-delay information; the specific operation mode is as follows:
Step 3.1: smoothing the periodic term and highlighting the trend term, based on the idea of the moving average, by the series decomposition unit in the Autoformer:
$$\xi_t = \mathrm{APOOL}(\mathrm{Padding}(\xi)), \qquad \xi_a = \xi - \xi_t$$
wherein $\xi_t$ and $\xi_a$ represent respectively the extracted trend part and the periodic (seasonal) part, and $\mathrm{APOOL}(\mathrm{Padding}(\cdot))$ represents the moving average with the padding operation adopted to keep the sequence length unchanged.
Step 3.2: the model as a whole adopts an encoder-decoder structure, wherein the encoder input is the information of the past $L$ time steps, $x_{en}$. The decoder input needs to be processed by the series decomposition unit according to the following formulas:
$$x_{ens}, x_{ent} = \mathrm{SeriesDecomp}\left( x_{en\,\frac{L}{2}:L} \right)$$
$$x_{des} = \mathrm{Concat}(x_{ens}, x_0)$$
$$x_{det} = \mathrm{Concat}(x_{ent}, x_{Mean})$$
The latter half of $x_{en}$, of length $L/2$, is decomposed to obtain $x_{des}$ and $x_{det}$, wherein the Concat of $x_{des}$ is zero filling and the Concat of $x_{det}$ is mean filling.
Step 3.3: and combining the subsequences and initially establishing a chemical water quality index model.
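The series decomposition of Step 3.1 and the decoder initialization of Step 3.2 can be sketched as below. This is an illustration rather than the patented implementation: the kernel size, the edge-replication padding and the function names `series_decomp` and `decoder_init` are assumptions, and the input is taken to be an array of shape (L, d).

```python
import numpy as np


def series_decomp(x, kernel=5):
    """Split x of shape (L, d) into (periodic part, trend part) by moving average."""
    # Replicate the edge values so that the output keeps length L.
    front = np.repeat(x[:1], (kernel - 1) // 2, axis=0)
    back = np.repeat(x[-1:], kernel // 2, axis=0)
    padded = np.concatenate([front, x, back], axis=0)
    # Moving average through a cumulative sum.
    csum = np.cumsum(np.concatenate([np.zeros((1, x.shape[1])), padded], axis=0), axis=0)
    trend = (csum[kernel:] - csum[:-kernel]) / kernel
    return x - trend, trend


def decoder_init(x_en, horizon, kernel=5):
    """Build x_des and x_det from the latter half of the encoder input."""
    half = x_en[x_en.shape[0] // 2:]
    periodic, trend = series_decomp(half, kernel)
    zeros = np.zeros((horizon, x_en.shape[1]))                       # x_0
    mean = np.repeat(x_en.mean(axis=0, keepdims=True), horizon, axis=0)  # x_Mean
    x_des = np.concatenate([periodic, zeros], axis=0)
    x_det = np.concatenate([trend, mean], axis=0)
    return x_des, x_det


x_en = np.full((8, 1), 3.0)          # a constant series has no periodic part
x_des, x_det = decoder_init(x_en, horizon=4)
```

For the constant example series the periodic initialization is all zeros and the trend initialization equals the constant, which matches the zero filling and mean filling described in Step 3.2.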
Step 4: inputting the decomposed and aggregated subsequences into the Autoformer model for training; the specific operation mode is as follows:
Step 4.1: training is performed with the Autoformer model to predict and analyze the chemical water quality indexes. The model connects series through an auto-correlation mechanism, which improves on the original self-attention, increases information utilization, and focuses on similar sub-processes at the same phase of different periods; it mainly comprises period-based dependency discovery and time-delay information aggregation, and realizes series-level connection.
Step 4.2: for period-based dependency discovery, according to stochastic process theory, the autocorrelation coefficient of a discrete-time process $\{x_t\}$ is calculated as
$$R_{xx}(\varphi) = \lim_{L \to \infty} \frac{1}{L} \sum_{t=1}^{L} x_t x_{t-\varphi}$$
wherein the autocorrelation coefficient $R_{xx}(\varphi)$ represents the similarity between the sequence $\{x_t\}$ and its $\varphi$-lagged sequence $\{x_{t-\varphi}\}$; for similar periods
Figure BDA0004024078260000122
Step 4.3: time-delay information aggregation gathers the information of similar subsequences for series-level connection; the information is first aligned by the Roll() operation according to the estimated cycle lengths and is then aggregated:
$$\varphi_1, \ldots, \varphi_k = \underset{\varphi \in \{1, \ldots, L\}}{\arg\mathrm{Topk}} \; R_{Q,K}(\varphi)$$
$$\hat{R}_{Q,K}(\varphi_1), \ldots, \hat{R}_{Q,K}(\varphi_k) = \mathrm{SoftMax}\left( R_{Q,K}(\varphi_1), \ldots, R_{Q,K}(\varphi_k) \right)$$
$$\mathrm{AutoCorrelation}(Q, K, V) = \sum_{i=1}^{k} \mathrm{Roll}(V, \varphi_i) \, \hat{R}_{Q,K}(\varphi_i)$$
wherein $Q$, $K$ and $V$ denote the query, key and value of self-attention; to avoid picking irrelevant or opposite phases, only the $k$ most likely cycle lengths $\varphi_1, \ldots, \varphi_k$ are selected.
Step 4.4: taking the prediction of the future 5 days as an example, the former 5-day mode is used for decomposition, each decomposed mode is projected to different frequency domain base bands, noise reduction processing is carried out, then the mode is input into an auto-former model, and the target prediction value of each mode is finally output through a sequence decomposition unit.
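The period-based dependency discovery of Step 4.2 and the time-delay aggregation of Step 4.3 can be sketched for a single one-dimensional series as follows. The FFT-based computation of the autocorrelation, the choice of the number of lags as `int(c * log(L))`, and the function names are assumptions in the spirit of the published Autoformer design; the description itself does not fix them.

```python
import math

import numpy as np


def autocorrelation(q, k):
    """Circular R(lag) = (1/L) * sum_t q[t] * k[t - lag], computed with the FFT."""
    L = q.size
    spec = np.fft.rfft(q) * np.conj(np.fft.rfft(k))
    return np.fft.irfft(spec, n=L) / L


def time_delay_aggregation(q, k, v, c=1.0):
    """Aggregate rolled copies of v, weighted by the softmax of the top lags."""
    L = q.size
    r = autocorrelation(q, k)
    top_k = max(1, int(c * math.log(L)))
    lags = np.argsort(r)[::-1][:top_k]            # most likely cycle lengths
    weights = np.exp(r[lags] - r[lags].max())     # numerically stable softmax
    weights /= weights.sum()
    # Roll aligns the sub-series at the same phase of different periods.
    out = sum(w * np.roll(v, -int(lag)) for w, lag in zip(weights, lags))
    return out, lags


t = np.arange(32)
x = np.sin(2 * np.pi * t / 8)                     # period of 8 samples
r = autocorrelation(x, x)
out, lags = time_delay_aggregation(x, x, x)
```

For this exactly periodic example the selected lags are multiples of the period, so the aggregated output reproduces the series; lag 0 is not excluded in this sketch.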
Step 5: accumulating and reconstructing the obtained prediction results; the specific operation mode is as follows:
step 5.1: and (4) combining the results obtained in the step (4.4) with the evaluation function and the prediction results of each mode according to different weights to finally form a prediction result.
Step 5.2: the coefficients of each mode according to its autocorrelation
Figure BDA0004024078260000128
Reconstructing, wherein a final prediction result calculation formula is as follows:
Figure BDA0004024078260000129
where k is the number of modes, ζ k Is the predicted result of the kth modality.
Step 5.3: and obtaining a final water quality prediction result.
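The accumulation of Step 5 can be illustrated by a weighted sum of the per-mode predictions. How the weights are derived from the autocorrelation coefficients is not recoverable from the text, so the weights are passed in as a plain argument here; the function name `reconstruct` and the default of equal unit weights (plain accumulation) are hypothetical.

```python
def reconstruct(mode_predictions, weights=None):
    """Combine per-mode forecasts into one forecast by a weighted sum.

    mode_predictions: list of equally long lists, one forecast per mode.
    weights: one weight per mode; defaults to 1.0 each (plain accumulation).
    """
    if weights is None:
        weights = [1.0] * len(mode_predictions)
    if len(weights) != len(mode_predictions):
        raise ValueError("one weight per mode is required")
    horizon = len(mode_predictions[0])
    return [
        sum(w * pred[step] for w, pred in zip(weights, mode_predictions))
        for step in range(horizon)
    ]


summed = reconstruct([[1, 2], [3, 4]])
averaged = reconstruct([[1, 2], [3, 4]], weights=[0.5, 0.5])
```

With unit weights the result is the sum of the mode forecasts, which mirrors the way the decomposed modes sum back to the original signal.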

Claims (5)

1. A chemical water quality index prediction method based on variational modal decomposition and an Autoformer model, comprising: preprocessing a water quality parameter data set and historical information of related factors, including identifying critical points for smoothing and handling missing values; decomposing the water quality characteristic parameters by variational modal decomposition and then modeling the water quality parameter data set; screening features of the related-factor historical information data set through a feature engineering model, with calculations based on weather, season and flow, for understanding and analyzing chemical water quality; progressively extracting trend terms and period terms from hidden variables through a series decomposition unit; and performing time-delay information aggregation and period-based dependency discovery on similar subsequences of different periods; the specific steps are as follows:
step 1: performing preliminary data preprocessing on the water quality parameter data set and historical information of relevant influence factors, filling missing values, removing highly relevant features and feature codes, understanding data distribution conditions and extracting effective features through visual analysis, observing the relation between water quality parameters and time, and performing smoothing processing on abnormal values; the specific operation mode is as follows:
step 1.1: defining WATER as the water quality parameter data, wherein id, WATER-time, WATER-PH, WATER-ZN, WATER-P, WATER-NH, WATER-cod, WATER-flow and WATER-temperature respectively denote the serial number, time information, pH value, total nitrogen content, total phosphorus content, ammonia nitrogen content, chemical oxygen demand, flow and water temperature of the chemical water quality record, wherein the flow is the instantaneous flow at the time of measurement, and the daytime fluctuation of the air temperature is positively correlated with the water temperature;
step 1.2: defining RELATE as the related-factor information data set, wherein time, relationship-rain and relationship-temperature respectively denote the time (date), the precipitation and the air temperature of that day, satisfying RELATE = {time, relationship-rain, relationship-temperature};
step 1.3: in the water quality parameter data set WATER, sorting by id and time, filling missing values, and aggregating the water quality features by day, wherein a pH value of 0 is set to 7 and the remaining missing values are filled with the mean value of that day;
step 1.4: establishing historical information of static features, and judging, with chemical oxygen demand as the dimension, whether each feature is correlated with the prediction target; if so, retaining the feature, otherwise discarding it to reduce interference with the model and reduce dimensionality; and encoding features that cannot be computed directly;
step 1.5: performing visual analysis on the water quality parameter data, observing the relationship between the parameters and time, smoothing abnormal values, and judging the correlation of historical influence factors;
step 1.6: obtaining the final preprocessed water quality parameter information WATER-PRE;
step 2: screening features of the related historical information data set through the feature engineering model, discarding historical information features weakly correlated with the prediction target by data analysis, decomposing the water quality features by variational modal decomposition, performing frequency domain enhancement, and obtaining feature importance for understanding and analyzing the water quality parameters;
step 3: processing the different features by sequence decomposition, decomposing the initial trend term and periodic term, and aggregating time delay information;
step 4: inputting the decomposed and aggregated subsequences into the Autoformer model for training;
step 5: accumulating and reconstructing the obtained prediction results to obtain the final prediction result.
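The daily aggregation and missing-value rules of step 1.3 can be sketched with pandas as follows. This is a minimal illustration under stated assumptions: the column names follow the identifiers of step 1.1, while the function name `preprocess_water`, the helper column `day`, and the sample values are hypothetical.

```python
import pandas as pd

def preprocess_water(water: pd.DataFrame) -> pd.DataFrame:
    """Sort by id and time, repair pH, fill gaps with the daily mean, aggregate by day."""
    df = water.copy()
    df["WATER-time"] = pd.to_datetime(df["WATER-time"])
    df = df.sort_values(["id", "WATER-time"])
    df["day"] = df["WATER-time"].dt.floor("D")  # hypothetical helper column
    # A recorded pH of 0 is replaced by 7, as stated in step 1.3.
    df.loc[df["WATER-PH"] == 0, "WATER-PH"] = 7.0
    value_cols = [c for c in df.columns if c not in ("id", "WATER-time", "day")]
    # Remaining missing values are filled with that day's mean for the same id.
    daily_mean = df.groupby(["id", "day"])[value_cols].transform("mean")
    df[value_cols] = df[value_cols].fillna(daily_mean)
    # Aggregate the water quality features by day.
    return df.groupby(["id", "day"], as_index=False)[value_cols].mean()

# Made-up sample: two readings on one day, one on the next.
sample = pd.DataFrame({
    "id": [1, 1, 1],
    "WATER-time": ["2022-01-01 08:00", "2022-01-01 16:00", "2022-01-02 08:00"],
    "WATER-PH": [0.0, 8.0, 7.5],
    "WATER-cod": [10.0, float("nan"), 30.0],
})
daily = preprocess_water(sample)
```

Filling before aggregating keeps the two rules separate; a day with no valid reading at all still yields a missing value and would need a further rule that the text does not specify.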
2. The chemical water quality index prediction method based on variational modal decomposition and an Autoformer model according to claim 1, characterized in that the specific operation of step 2 is as follows:
step 2.1: by the variational modal decomposition method, finding a set of modal components and the center frequency of each mode, thereby partitioning the frequency domain, each mode being smooth after demodulation to baseband; the specific calculation formulas are as follows:
Figure FDA0004024078250000021
Figure FDA0004024078250000031
where K is the number of modes to be decomposed; $\{u_k\}$ and $\{\omega_k\}$ are the k-th modal component and its center frequency after decomposition, respectively; $\delta(t)$ is the Dirac function; and $*$ is the convolution operator;
to solve for the K modes, the alternating direction method of multipliers (ADMM) iterative algorithm is combined with the Parseval/Plancherel Fourier isometry to optimize each modal component and center frequency by searching for the saddle point of the augmented Lagrangian function; the alternately optimized iterates of $u_k$, $\omega_k$ and $\lambda$ are computed as follows:
Figure FDA0004024078250000032
Figure FDA0004024078250000033
Figure FDA0004024078250000034
where $\gamma$ is the noise tolerance, which meets the fidelity requirement of signal decomposition, and $\hat{u}_k^{n+1}(\omega)$, $\hat{u}_i(\omega)$, $\hat{f}(\omega)$ and $\hat{\lambda}(\omega)$ are the Fourier transforms of $u_k^{n+1}(t)$, $u_i(t)$, $f(t)$ and $\lambda(t)$, respectively;
step 2.2: the main iterative solution steps of the variational modal decomposition are as follows: first, initializing $\{\hat{u}_k^1\}$, $\{\omega_k^1\}$, $\hat{\lambda}^1$ and the maximum number of iterations N; then updating $\hat{u}_k$ and $\omega_k$ iteratively; for a convergence precision $\varepsilon > 0$, if the criterion $\sum_k \|\hat{u}_k^{n+1} - \hat{u}_k^n\|_2^2 / \|\hat{u}_k^n\|_2^2 < \varepsilon$ is not satisfied and $n < N$, continuing to iterate $\hat{u}_k$ and $\omega_k$;
step 2.3: processing the modally decomposed data by a frequency domain enhancement technique to enhance its band signal, so that the processed data better match the actual chemical water quality;
step 2.4: projecting the modally decomposed data into frequency-domain space and randomly sampling each mode's frequency-domain space, which greatly shortens the input vector and reduces computational complexity; random sampling loses part of the signal baseband, but because actual production data contain a large amount of noise in the high-frequency range, sampling the high frequencies relatively sparsely after modal decomposition reduces this loss;
step 2.5: weighting each modal data after random sampling, importing and merging the two data sets, modeling according to each screened modal feature, and performing 7-fold cross validation to obtain the feature importance for use in understanding and analyzing the water quality indexes.
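The alternating updates of steps 2.1 and 2.2 can be sketched with NumPy as below. This follows the commonly published form of variational modal decomposition rather than the formula images of this document, which are not reproduced in the text; the function name `vmd`, the penalty factor `alpha`, the initial center frequencies, and the omission of mirror extension at the signal boundaries are all assumptions of the sketch.

```python
import numpy as np

def vmd(signal, K=2, alpha=2000.0, gamma=0.0, eps=1e-7, max_iter=500):
    """Return (modes, omega): K real modes and center frequencies in cycles/sample."""
    f = np.asarray(signal, dtype=float)
    T = f.size
    freqs = np.fft.fftfreq(T)
    positive = freqs >= 0
    # One-sided spectrum of the input (negative frequencies set to zero).
    f_hat = np.where(positive, np.fft.fft(f), 0.0)
    u_hat = np.zeros((K, T), dtype=complex)
    omega = 0.5 * np.arange(K) / K          # assumed uniform initialization
    lam_hat = np.zeros(T, dtype=complex)
    for _ in range(max_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            # Wiener-filter update of mode k from the residual of the other modes.
            others = u_hat.sum(axis=0) - u_hat[k]
            u_hat[k] = (f_hat - others + lam_hat / 2) / (1 + 2 * alpha * (freqs - omega[k]) ** 2)
            # Center frequency = center of gravity of the mode's power spectrum.
            power = np.abs(u_hat[k, positive]) ** 2
            omega[k] = np.sum(freqs[positive] * power) / (np.sum(power) + 1e-12)
        # Multiplier update; gamma = 0 disables it (tolerant of noise).
        lam_hat = lam_hat + gamma * (f_hat - u_hat.sum(axis=0))
        # Convergence test: summed relative change of the mode spectra.
        num = np.sum(np.abs(u_hat - u_prev) ** 2, axis=1)
        den = np.sum(np.abs(u_prev) ** 2, axis=1) + 1e-12
        if np.sum(num / den) < eps:
            break
    # Back to the time domain: double positive frequencies, keep DC once.
    scale = np.where(freqs > 0, 2.0, 1.0)
    modes = np.real(np.fft.ifft(u_hat * scale, axis=1))
    return modes, omega

# Two-tone test signal with tones at 0.05 and 0.2 cycles/sample.
t = np.arange(500)
demo = np.cos(2 * np.pi * 0.05 * t) + 0.5 * np.cos(2 * np.pi * 0.2 * t)
modes, omega = vmd(demo, K=2)
```

A larger `alpha` narrows the bandwidth of each mode, and a nonzero `gamma` enforces exact reconstruction of the input at the cost of noise tolerance.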
3. The chemical water quality index prediction method based on variational modal decomposition and an Autoformer model according to claim 1, characterized in that the specific operation of step 3 is as follows:
step 3.1: smoothing the periodic term and highlighting the trend term by the sequence decomposition unit in the Autoformer, based on the idea of the moving average: $\xi_t = \mathrm{AvgPool}(\mathrm{Padding}(\xi))$, $\xi_s = \xi - \xi_t$, where $\xi_t$ and $\xi_s$ represent the extracted trend part and the seasonal (periodic) part, respectively, and $\mathrm{AvgPool}(\mathrm{Padding}(\cdot))$ denotes the moving average with padding applied to keep the sequence length unchanged;
step 3.2: the whole model adopts an encoder-decoder structure, wherein the encoder input is $x_{en}$ of time length L, and the decoder input must first be processed by the sequence decomposition unit according to the following formulas:
$x_{ens}, x_{ent} = \mathrm{SeriesDecomp}(x_{en}[L/2 : L])$,
$x_{des} = \mathrm{Concat}(x_{ens}, x_0)$,
$x_{det} = \mathrm{Concat}(x_{ent}, x_{Mean})$;
where $x_{ens}$ and $x_{ent}$ are obtained by decomposing the latter half of $x_{en}$, of length L/2; the part $x_0$ concatenated into $x_{des}$ is zero filling, and the part $x_{Mean}$ concatenated into $x_{det}$ is mean filling;
step 3.3: combining the subsequences and preliminarily establishing the chemical water quality index model.
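The moving-average decomposition of step 3.1 and the decoder initialization of step 3.2 can be sketched for a single one-dimensional series as follows. The function names, the kernel size, edge-value padding, and the use of the mean of the whole encoder input for the mean filling are assumptions of this sketch.

```python
import numpy as np

def series_decomp(x, kernel=5):
    """Split x into (seasonal, trend) with a length-preserving moving average."""
    x = np.asarray(x, dtype=float)
    front = (kernel - 1) // 2
    back = kernel - 1 - front
    # Pad by repeating the edge values so the averaged output keeps length L.
    padded = np.concatenate([np.full(front, x[0]), x, np.full(back, x[-1])])
    trend = np.convolve(padded, np.ones(kernel) / kernel, mode="valid")
    return x - trend, trend

def decoder_inputs(x_en, horizon, kernel=5):
    """Build x_des and x_det from the latter half of the encoder input."""
    x_en = np.asarray(x_en, dtype=float)
    x_ens, x_ent = series_decomp(x_en[len(x_en) // 2:], kernel)
    x_des = np.concatenate([x_ens, np.zeros(horizon)])              # zero filling
    x_det = np.concatenate([x_ent, np.full(horizon, x_en.mean())])  # mean filling
    return x_des, x_det

# Ramp plus a period-4 oscillation, 16 points, 5-step horizon.
t = np.arange(16)
x_en = 0.5 * t + np.sin(2 * np.pi * t / 4)
seasonal, trend = series_decomp(x_en)
x_des, x_det = decoder_inputs(x_en, horizon=5)
```

By construction the seasonal and trend parts always sum back to the input, so the decomposition itself loses no information.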
4. The chemical water quality index prediction method based on variational modal decomposition and an Autoformer model according to claim 1, characterized in that the specific operation of step 4 is as follows:
step 4.1: training the Autoformer model to predict and analyze the chemical water quality indexes, wherein the model uses an Auto-Correlation mechanism in place of the original self-attention to connect series and improve information utilization, focusing on similar sub-processes at the same phase of different periods; the mechanism mainly comprises period-based dependency discovery and time delay information aggregation, thereby realizing the connection of the sequence set;
step 4.2: for period-based dependency discovery, by stochastic process theory, computing the autocorrelation coefficient of the discrete-time process $\{x_t\}$:
Figure FDA0004024078250000051
where the autocorrelation coefficient
Figure FDA0004024078250000052
represents the similarity between the sequences $\{x_t\}$ and $\{x_{t-\varphi}\}$, for similar periods
Figure FDA0004024078250000053
Step 4.3: aggregation of delay information is to aggregate similar subsequence information for sequence chaining, align information by Roll () operation according to an estimated cycle length, and then aggregate information:
Figure FDA0004024078250000054
Figure FDA0004024078250000055
Figure FDA0004024078250000056
where
Figure FDA0004024078250000057
denote the query, key and value of self-attention; to avoid selecting an irrelevant or opposite phase,
Figure FDA0004024078250000058
a cycle length;
step 4.4: taking prediction of the next 5 days as an example, the data of the preceding 5 days are decomposed into modes, each decomposed mode is projected onto its own frequency-domain baseband and denoised, the modes are then input into the Autoformer model, and the target prediction value of each mode is finally output through the sequence decomposition unit.
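The period discovery of step 4.2 and the Roll-based aggregation of step 4.3 can be sketched for a single one-dimensional series as below. The formula images are not reproduced in the text, so the details here follow the published Autoformer design and are assumptions: the autocorrelation is computed circularly through the FFT, the number of selected delays is the integer part of c times log L, the weights are a softmax over the selected autocorrelation values, and no learned projections are applied. The function names are hypothetical.

```python
import numpy as np

def autocorrelation(q, k):
    """Circular correlation R(tau) = (1/L) * sum_t q[t] * k[t - tau], via the FFT."""
    L = len(q)
    corr = np.fft.irfft(np.fft.rfft(q) * np.conj(np.fft.rfft(k)), n=L)
    return corr / L

def time_delay_aggregation(q, k, v, c=1.0):
    """Aggregate rolled copies of v at the most strongly correlated delays."""
    q, k, v = (np.asarray(a, dtype=float) for a in (q, k, v))
    L = len(q)
    r = autocorrelation(q, k)
    top_k = max(1, int(c * np.log(L)))      # assumed rule for the number of delays
    delays = np.argsort(r)[-top_k:]          # candidate period lengths
    scores = r[delays]
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    # Roll(v, tau): shift left by tau, wrapping the leading elements to the end.
    out = sum(w * np.roll(v, -int(d)) for w, d in zip(weights, delays))
    return out, delays

# A series with period 12 over 96 steps: the selected delays should be multiples of 12.
t = np.arange(96)
x = np.sin(2 * np.pi * t / 12)
aggregated, delays = time_delay_aggregation(x, x, x)
```

For a perfectly periodic input every selected delay is a multiple of the period, so the rolled copies coincide and the aggregate reproduces the input.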
5. The chemical water quality index prediction method based on variational modal decomposition and an Autoformer model according to claim 1, characterized in that the specific operation of step 5 is as follows:
step 5.1: combining the results obtained in step 4.4, using the evaluation function and the prediction results of each mode with different weights, to form the final prediction result;
step 5.2: reconstructing each mode according to its autocorrelation coefficient
Figure FDA0004024078250000059
wherein the final prediction result is calculated as follows:
Figure FDA0004024078250000061
where k is the number of modes and $\zeta_k$ is the prediction result of the k-th mode;
step 5.3: obtaining the final water quality prediction result.
CN202211697456.9A 2022-12-28 2022-12-28 Chemical water quality index prediction method based on variational modal decomposition and auto former model Withdrawn CN115829157A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211697456.9A CN115829157A (en) 2022-12-28 2022-12-28 Chemical water quality index prediction method based on variational modal decomposition and auto former model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211697456.9A CN115829157A (en) 2022-12-28 2022-12-28 Chemical water quality index prediction method based on variational modal decomposition and auto former model

Publications (1)

Publication Number Publication Date
CN115829157A true CN115829157A (en) 2023-03-21

Family

ID=85518944

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211697456.9A Withdrawn CN115829157A (en) 2022-12-28 2022-12-28 Chemical water quality index prediction method based on variational modal decomposition and auto former model

Country Status (1)

Country Link
CN (1) CN115829157A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116776228A (en) * 2023-08-17 2023-09-19 合肥工业大学 Power grid time sequence data decoupling self-supervision pre-training method and system
CN116776228B (en) * 2023-08-17 2023-10-20 合肥工业大学 Power grid time sequence data decoupling self-supervision pre-training method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication (application publication date: 20230321)