CN116050595A - Attention mechanism and decomposition mechanism coupled runoff amount prediction method - Google Patents
- Publication number
- CN116050595A (application CN202211710368.8A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G06Q10/04 — Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G06Q50/26 — Government or public services
- Y02A10/40 — Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping
Abstract
The invention belongs to the field of machine learning, and particularly discloses a runoff prediction method coupling an attention mechanism and a decomposition mechanism, which comprises: normalizing the historically observed runoff data, and inputting the normalized runoff data into a trained runoff prediction model to obtain a runoff prediction sequence. The runoff prediction model comprises a positive standardization module, a time sequence decomposition module, a multi-head self-attention module, a time convolution network module and an inverse standardization module. The invention combines time sequence decomposition, a multi-head self-attention mechanism and a time convolution network: on the basis of the initial prediction of the time convolution network, it couples trend features, periodic features and self-attention features, comprehensively and efficiently mines the strong trend and strong periodicity of runoff data, and restores the output by inverse standardization to enhance the distribution consistency of the data, thereby achieving high-accuracy runoff prediction.
Description
Technical Field
The invention belongs to the field of machine learning, and particularly relates to a runoff prediction method coupling an attention mechanism and a decomposition mechanism.
Background
Runoff is the total flow passing through a river-channel cross section within a specific time range (such as three days, one week, half a month or one quarter). Runoff prediction technology has wide application scenarios in real life. For example, by predicting the runoff of each main stream and tributary in a river basin, important data support can be provided for the overall command of flood and drought control, for irrigation, and for the reasonable dispatching of drinking-water resources.
From many years of observation data, natural river-channel runoff is in essence a time series with strong periodicity. A time series is a data sequence ordered chronologically, reflecting how the observed quantity continuously evolves over time. Current runoff prediction methods generally follow the idea of time-series prediction and can be divided into traditional methods, pattern-recognition-based methods and deep-learning-based methods. Traditional runoff prediction methods build statistical models from the statistical characteristics of historical time series, but suffer from problems such as high parameter sensitivity; pattern-recognition-based methods require expert knowledge to select features and transfer poorly across usage scenarios; deep-learning-based methods suffer from high computational complexity, insufficient utilization of historical data and other problems.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a runoff prediction method coupling an attention mechanism and a decomposition mechanism, which aims to mine the strong trend and strong periodic characteristics of runoff observation data at monthly and quarterly scales, integrate a multi-head self-attention mechanism with a decomposition mechanism, and realize high-accuracy runoff prediction.
In order to achieve the above purpose, the specific technical scheme is as follows:
A runoff prediction method coupling an attention mechanism and a decomposition mechanism, comprising: normalizing the historically observed runoff data, and inputting the normalized runoff data into a trained runoff prediction model to obtain a runoff prediction sequence; the runoff prediction model comprises a positive standardization module, a time sequence decomposition module, a multi-head self-attention module, a time convolution network module and an inverse standardization module;
the positive standardization module is used for carrying out positive standardization processing on the runoff sequence samples in the training set to obtain a standardized runoff sequence;
the time sequence decomposition module is used for decomposing the standardized runoff sequence to obtain a trend component and a periodic component;
the multi-head self-attention module is used for obtaining self-attention prediction components;
the time convolution network module is used for obtaining an initial prediction component;
the inverse normalization module is used for performing inverse normalization processing on the coupling prediction sequence obtained by the multi-component feature fusion to obtain a runoff prediction sequence.
Further, the process of training the runoff quantity prediction model comprises the following steps:
s1: all data in the runoff data set are normalized;
s2: dividing the runoff data set into a training set, a verification set and a test set according to the proportion;
s3: inputting the runoff sequence samples in the training set into a positive standardization module for positive standardization processing to obtain a standardized runoff sequence;
s4: inputting the standardized runoff sequence into a time sequence decomposition module for time sequence decomposition, and inputting the decomposition result into a linear layer to obtain a trend component and a periodic component;
s5: inputting the standardized runoff sequence into a multi-head self-attention module to obtain a self-attention prediction component;
s6: inputting the standardized runoff sequence into a time convolution network, and obtaining initial predicted components through a plurality of time convolution modules connected in series;
s7: weighting and adding the trend component, the periodic component, the self-attention prediction component and the initial prediction component to obtain a coupling prediction sequence;
s8: inputting the coupling prediction sequence into an inverse standardization module for inverse standardization processing to obtain a runoff prediction sequence;
s9: calculating a loss function of the runoff prediction model according to the runoff prediction sequence and the runoff observation sequence;
s10: continuously adjusting the learning rate, and dynamically optimizing model parameters according to the learning rate;
s11: and verifying the model obtained by training on a verification set, and completing model training when the loss function is minimum.
Wherein the loss function in step S9 is preferably the mean square error (Mean Square Error, MSE); the strategy for adjusting the learning rate in step S10 is preferably piecewise constant decay, and the model optimizer is preferably the Adam algorithm.
In step S1, the normalization process calculates the mean value and standard deviation of all the data and re-scales each sample in the runoff data set accordingly, so that the data conform to a Gaussian distribution.
Further, in step S3, the positive normalization performed on the runoff sequence samples in the training set can be expressed as the following formula:

x̂_i = (x_i − μ) / σ

wherein,
x_i represents the i-th runoff observation, and μ and σ represent the mean and standard deviation of the sample;
n represents the length of the runoff prediction period;
x̂_i represents the observation value of the runoff at time i after the positive normalization processing.
Further, in step S4, the process of obtaining the trend component and the periodic component includes the following steps:
s4.1: the data complement at two ends of the data are adjusted according to the average kernel size, and one-dimensional average pooling is carried out on the data after complement to obtain an initial trend component;
s4.2: subtracting the initial trend component from the complement data to obtain an initial periodic component;
s4.3: and respectively inputting the initial trend component and the initial periodic component into a linear layer, and unifying the output dimensions to be the same as the target sequence to obtain the trend component and the periodic component.
Further, in step S5, the process of obtaining the self-attention prediction component includes the steps of:
s5.1: inputting the normalized runoff sequence into a linear layer to obtain an initial time sequence component;
s5.2: respectively inputting the initial time sequence components into three different linear layers to obtain a query sub-component, a key value sub-component and a numerical value sub-component;
s5.3: performing self-attention operation on the query sub-component, the key value sub-component and the numerical sub-component to obtain a self-attention prediction initial component;
s5.4: the self-attention prediction initial component is input into a linear layer, and the output dimension is adjusted to obtain the self-attention prediction component.
Still further, the process of deriving the self-attention prediction component can be expressed as the following formulas:

Q = Linear(X)·W_Q, K = Linear(X)·W_K, V = Linear(X)·W_V

F_A = Linear(softmax(Q·K^T / √d_k)·V)

wherein,
X represents the standardized runoff sequence;
Linear represents a linear layer;
W_Q, W_K and W_V respectively represent the weight matrices corresponding to the query, key-value and numerical-value sub-components;
Q, K and V respectively represent the query, key-value and numerical-value sub-components;
K^T represents the transpose of the key-value sub-component;
d_k represents the model dimension;
softmax represents the normalized exponential function;
F_A represents the self-attention prediction component.
Further, in step S6, the process of obtaining the initial predicted component includes the following steps:
s6.1: constructing a time convolution network, wherein the time convolution network comprises six time convolution modules connected in series, and each time convolution module comprises two sub-modules consisting of a one-dimensional expansion convolution layer and a shear layer;
s6.2: in the time convolution module, firstly, one-dimensional expansion convolution is carried out on input sequence data, then redundant data of a header is sheared to ensure one-way transmission of predicted information flow, and then one-dimensional expansion convolution and shearing are carried out again;
s6.3: the output of the time convolution module will be the input of the next stage of time convolution module until six serially connected time convolution blocks are passed.
Still further, the process of obtaining the initial prediction component can be expressed as the following formulas:

F_t = Dropout(ReLU(Chomp(Conv(X))))

F_c = ReLU(Dropout(ReLU(Chomp(Conv(F_t)))) + Conv1D(X))

wherein,
Conv represents a one-dimensional dilated convolution layer;
Chomp represents a shear layer;
ReLU represents the nonlinear activation function;
Dropout represents the random inactivation function;
Conv1D represents the one-dimensional convolution of the residual connection;
F_t represents the data sequence processed by one basic dilated-convolution sub-module;
F_c represents the data sequence processed by one time convolution module;
F_T represents the initial prediction component extracted by the time convolution network, i.e. the output of the sixth serially connected time convolution module.
Further, in step S8, the inverse normalization processing performed on the coupled prediction sequence to obtain the runoff prediction value can be expressed as the following formula:

ŷ = σ·y + μ

wherein,
y represents the coupled prediction sequence;
σ and μ respectively represent the standard deviation and mean used when normalizing the runoff observations;
ŷ represents the runoff prediction sequence.
The invention has the following remarkable effects:
the invention combines time sequence decomposition, a multi-head self-attention mechanism and a time convolution network, couples trend characteristics, periodic characteristics and self-attention characteristics on the basis of initial prediction of the time convolution network, comprehensively and efficiently digs strong trend characteristics and strong periodic characteristics of runoff data, adopts inverse standardized reduction, enhances the distribution consistency of the data, and realizes high-accuracy prediction of the runoff.
Drawings
FIG. 1 is a flow chart of the runoff prediction of a preferred embodiment of the present invention.
Detailed Description
The principles and features of the present invention are described below in connection with examples, which are set forth only to illustrate the present invention and not to limit the scope of the invention.
A runoff prediction method coupling an attention mechanism and a decomposition mechanism comprises: normalizing the historically observed runoff data, and inputting the normalized runoff data into a trained runoff prediction model to obtain a runoff prediction sequence (the flow chart is shown in FIG. 1); the runoff prediction model comprises a positive standardization module, a time sequence decomposition module, a multi-head self-attention module, a time convolution network module and an inverse standardization module;
the positive standardization module is used for carrying out positive standardization processing on the runoff sequence samples in the training set to obtain a standardized runoff sequence;
the time sequence decomposition module is used for decomposing the standardized runoff flow sequence to obtain a trend component and a periodic component;
the multi-head self-attention module is used for obtaining self-attention prediction components;
the time convolution network module is used for obtaining an initial prediction component;
and the inverse normalization module is used for performing inverse normalization processing on the coupling prediction sequence obtained by the multi-component feature fusion to obtain a runoff prediction sequence.
The process for training the runoff quantity prediction model comprises the following steps:
s1: and (5) normalizing all data in the runoff data set. The runoff data set is a runoff data set of hydrological stations along the Yangtze river basin, the normalization processing is to calculate the mean value and standard deviation of all data, and each sample in the runoff data set is assigned again to enable the samples to accord with Gaussian distribution;
s2: dividing the runoff data set into a training set, a verification set and a test set at a ratio of 7:1:2, wherein the training set is used to train the runoff prediction model, the verification set is used to check and iteratively optimize the model during training, and the test set is used to evaluate the prediction performance.
S3: inputting the runoff sequence samples in the training set into a positive standardization module for positive standardization processing to obtain a standardized runoff sequence;
s4: inputting the standardized runoff sequence into a time sequence decomposition module for time sequence decomposition, and inputting the decomposition result into a linear layer to obtain a trend component and a periodic component;
s5: inputting the standardized runoff sequence into a multi-head self-attention module to obtain a self-attention prediction component;
s6: inputting the standardized runoff sequence into a time convolution network, and obtaining initial predicted components through a plurality of time convolution modules connected in series;
s7: weighting and adding the trend component, the periodic component, the self-attention prediction component and the initial prediction component to obtain a coupled prediction sequence, with the weights set at a ratio of 1:1:1:1 to simplify calculation;
s8: inputting the coupling prediction sequence into an inverse standardization module for inverse standardization processing to obtain a runoff prediction sequence;
s9: calculating the mean square error (Mean Square Error, MSE) loss function of the runoff prediction model according to the runoff prediction sequence and the runoff observation sequence;
s10: continuously adjusting the learning rate by adopting a piecewise constant attenuation strategy, and dynamically optimizing model parameters by adopting an Adam algorithm according to the learning rate;
s11: and verifying the model obtained by training on a verification set, and completing model training when the loss function is minimum.
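As a concrete illustration of steps S9 and S10, the sketch below implements the MSE loss and a piecewise-constant learning-rate decay in numpy; the boundary epochs and rate values are illustrative assumptions, since the patent only names the strategy:

```python
import numpy as np

def mse_loss(obs, pred):
    """Mean square error between the runoff observation and prediction sequences (step S9)."""
    obs, pred = np.asarray(obs), np.asarray(pred)
    return float(np.mean((pred - obs) ** 2))

def piecewise_constant_lr(epoch, boundaries=(30, 60), rates=(1e-3, 1e-4, 1e-5)):
    """Piecewise-constant decay (step S10): the learning rate drops at fixed
    epoch boundaries. Boundary epochs and rate values here are assumed."""
    for b, r in zip(boundaries, rates):
        if epoch < b:
            return r
    return rates[-1]
```

In a full implementation, the rate returned for the current epoch would be handed to the Adam optimizer before each parameter update.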
In step S1, normalization processing is to calculate the mean value and standard deviation of all data, and assign values to each sample in the runoff data set again, so as to make it conform to gaussian distribution.
In step S3, the positive normalization performed on the runoff sequence samples in the training set can be expressed as the following formula:

x̂_i = (x_i − μ) / σ

wherein,
x_i represents the i-th runoff observation, and μ and σ represent the mean and standard deviation of the sample;
n represents the length of the runoff prediction period;
x̂_i represents the observation value of the runoff at time i after the positive normalization processing.
In step S4, the process of obtaining the trend component and the periodic component includes the following steps:
s4.1: the data complement at two ends of the data are adjusted according to the average kernel size, and one-dimensional average pooling is carried out on the data after complement to obtain an initial trend component;
s4.2: subtracting the initial trend component from the complement data to obtain an initial periodic component;
s4.3: and respectively inputting the initial trend component and the initial periodic component into a linear layer, and unifying the output dimensions to be the same as the target sequence to obtain the trend component and the periodic component.
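Steps S4.1 and S4.2 can be sketched in numpy as a moving-average decomposition; the kernel size and the edge-padding scheme (repeating the boundary values) are illustrative assumptions, and the linear layers of S4.3 are omitted:

```python
import numpy as np

def series_decompose(x, kernel=5):
    """Pad both ends of the series, take a 1-D average pool for the trend
    component, and subtract it to obtain the periodic component."""
    pad = (kernel - 1) // 2
    padded = np.concatenate([np.full(pad, x[0]), x, np.full(pad, x[-1])])
    trend = np.convolve(padded, np.ones(kernel) / kernel, mode="valid")
    periodic = x - trend
    return trend, periodic
```

By construction the two components sum back to the input series, so no information is lost by the decomposition.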
In step S5, the process of obtaining the self-attention prediction component includes the steps of:
s5.1: inputting the normalized runoff sequence into a linear layer to obtain an initial time sequence component;
s5.2: respectively inputting the initial time sequence components into three different linear layers to obtain a query sub-component, a key value sub-component and a numerical value sub-component;
s5.3: performing self-attention operation on the query sub-component, the key value sub-component and the numerical sub-component to obtain a self-attention prediction initial component;
s5.4: the self-attention prediction initial component is input into a linear layer, and the output dimension is adjusted to obtain the self-attention prediction component.
The process of deriving the self-attention prediction component in S5.1-S5.4 can be expressed as the following formulas:

Q = Linear(X)·W_Q, K = Linear(X)·W_K, V = Linear(X)·W_V

F_A = Linear(softmax(Q·K^T / √d_k)·V)

wherein,
X represents the standardized runoff sequence;
Linear represents a linear layer;
W_Q, W_K and W_V respectively represent the weight matrices corresponding to the query, key-value and numerical-value sub-components;
Q, K and V respectively represent the query, key-value and numerical-value sub-components;
K^T represents the transpose of the key-value sub-component;
d_k represents the model dimension;
softmax represents the normalized exponential function;
F_A represents the self-attention prediction component.
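A single-head numpy sketch of the scaled dot-product self-attention of S5.2-S5.3; the enclosing linear layers of S5.1 and S5.4 and the multi-head splitting are omitted for brevity, and the weight matrices are passed in as plain arrays:

```python
import numpy as np

def softmax(z, axis=-1):
    """Normalized exponential function, stabilized by subtracting the row max."""
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """F_A = softmax(Q.K^T / sqrt(d_k)).V for a single attention head."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d_k = q.shape[-1]
    scores = softmax(q @ k.T / np.sqrt(d_k))
    return scores @ v
```

Each output row is a convex combination of the value rows, weighted by how strongly the corresponding query attends to each key.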
In step S6, the process of obtaining the initial predicted component includes the steps of:
s6.1: constructing a time convolution network, wherein the time convolution network comprises six time convolution modules connected in series, and each time convolution module comprises two sub-modules consisting of a one-dimensional expansion convolution layer and a shear layer;
s6.2: in the time convolution module, firstly, one-dimensional expansion convolution is carried out on input sequence data, then redundant data of a header is sheared to ensure one-way transmission of predicted information flow, and then one-dimensional expansion convolution and shearing are carried out again;
s6.3: the output of each time convolution module serves as the input of the next, until six serially connected time convolution modules have been passed; the dilation rates of the six modules are 1, 2, 4, 8, 16 and 32 in sequence, the numbers of intermediate convolution-layer channels are 32, 64, 128, 64, 32 and N_V in sequence, and the number of tail convolution-layer channels N_V is kept consistent with the number of variables predicted simultaneously.
The process of obtaining the initial prediction component in S6.1-S6.3 can be expressed as the following formulas:

F_t = Dropout(ReLU(Chomp(Conv(X))))

F_c = ReLU(Dropout(ReLU(Chomp(Conv(F_t)))) + Conv1D(X))

wherein,
Conv represents a one-dimensional dilated convolution layer;
Chomp represents a shear layer;
ReLU represents the nonlinear activation function;
Dropout represents the random inactivation function;
Conv1D represents the one-dimensional convolution of the residual connection;
F_t represents the data sequence processed by one basic dilated-convolution sub-module;
F_c represents the data sequence processed by one time convolution module;
F_T represents the initial prediction component extracted by the time convolution network, i.e. the output of the sixth serially connected time convolution module.
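The causal dilated convolutions of the time convolution network can be sketched in numpy as below; this single-channel version keeps the dilation rates 1, 2, 4, 8, 16 and 32 of S6.3 but, as a simplifying assumption, drops the channel widths, Dropout and residual 1x1 convolutions:

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    """One-dimensional dilated convolution with left zero-padding followed by
    a 'chomp': output[i] depends only on x[i], x[i-d], x[i-2d], ... (causal)."""
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([sum(w[j] * xp[i + pad - j * dilation] for j in range(k))
                     for i in range(len(x))])

def tcn_forward(x, weights, dilations=(1, 2, 4, 8, 16, 32)):
    """Six stacked causal dilated convolutions with ReLU activations."""
    out = x
    for w, d in zip(weights, dilations):
        out = np.maximum(causal_dilated_conv(out, w, d), 0.0)  # ReLU
    return out
```

The left padding plus chomp is what guarantees the one-way transmission of the predicted information flow mentioned in S6.2.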
In step S8, the inverse normalization processing performed on the coupled prediction sequence to obtain the runoff prediction value can be expressed as the following formula:

ŷ = σ·y + μ

wherein,
y represents the coupled prediction sequence;
σ and μ respectively represent the standard deviation and mean used when normalizing the runoff observations;
ŷ represents the runoff prediction sequence.
In step S9, the mean square error (Mean Square Error, MSE) loss function of the runoff prediction model is calculated according to the runoff prediction sequence and the runoff observation sequence, with the expression:

MSE = (1/n)·Σ_{i=1}^{n} (ŷ_i − y_i)²

wherein,
n represents the length of the runoff prediction period;
y_i and ŷ_i respectively represent the observed and predicted runoff at time i.
In order to verify the effectiveness of the method, it is compared with the classical time-series prediction methods Long Short-Term Memory (LSTM) and Transformer on the runoff data set of hydrologic stations along the Yangtze River basin, adopting the mean absolute error (Mean Absolute Error, MAE) and the Nash-Sutcliffe efficiency (NSE) as evaluation indexes.
MAE is the average of the absolute differences between each predicted value and the corresponding observed value, and directly reflects the magnitude of the prediction error. Its expression is:

MAE = (1/n)·Σ_{i=1}^{n} |ŷ_i − y_i|

wherein,
n represents the length of the runoff prediction period;
y_i and ŷ_i respectively represent the observed and predicted runoff at time i.
NSE is an index commonly used in hydrology to evaluate the prediction results of hydrologic models, and ranges from minus infinity to 1. An NSE value close to 1 indicates highly reliable prediction; a value close to 0 indicates that the predictions are close to the average level of the observations, i.e. the overall result is credible but individual predictions carry a certain error; a value much smaller than 0 indicates that the prediction results are not credible.
Its expression is:

NSE = 1 − Σ_{i=1}^{n} (y_i − ŷ_i)² / Σ_{i=1}^{n} (y_i − ȳ)²

wherein,
n represents the length of the runoff prediction period;
y_i and ŷ_i respectively represent the observed and predicted runoff at time i, and ȳ represents the mean of the observations.
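The two evaluation indexes can be sketched directly from their definitions:

```python
import numpy as np

def mae(obs, pred):
    """Mean absolute error: average of |prediction - observation|."""
    return float(np.mean(np.abs(np.asarray(pred) - np.asarray(obs))))

def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of residual variance to
    the variance of the observations; 1 is a perfect prediction, 0 matches
    the mean of the observations."""
    obs, pred = np.asarray(obs), np.asarray(pred)
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)
```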
Table 1 shows the MAE, in cubic meters per second, at different prediction periods (three days, one week, half a month, one quarter) for the home dam hydrologic station on the runoff data set of hydrologic stations along the Yangtze River basin. From the data in the table, the MAE values of the method of the present invention are lower than those of the comparison methods in all test periods, indicating that its overall prediction error is smallest and that it shows a greater advantage in long-period prediction.
TABLE 1 mean absolute error contrast at different prediction periods
Table 2 shows the NSE at different prediction periods (three days, one week, half a month, one quarter) for the home dam hydrologic station on the runoff data set of hydrologic stations along the Yangtze River basin. As can be seen from the data in the table, the NSE values of the method are closer to 1 than those of the comparison methods in all test periods, so the reliability of its runoff prediction results is highest, and the method also shows a greater advantage in long-period prediction.
TABLE 2 Nash correlation coefficient comparison at different prediction periods
The foregoing describes preferred embodiments of the invention and is not intended to limit the invention to the precise form disclosed; any modifications, equivalents and alternatives made within the spirit and principles of the invention are intended to be included within the scope of the invention.
Claims (9)
1. A method for predicting a runoff amount by coupling an attention mechanism with a decomposition mechanism, comprising: normalizing the historically observed runoff data, and inputting the normalized runoff data into a trained runoff prediction model to obtain a runoff prediction sequence; the runoff quantity prediction model comprises a positive standardization module, a time sequence decomposition module, a multi-head self-attention module, a time convolution network module and an inverse standardization module;
the positive standardization module is used for carrying out positive standardization processing on the runoff sequence samples in the training set to obtain a standardized runoff sequence;
the time sequence decomposition module is used for decomposing the standardized runoff sequence to obtain a trend component and a periodic component;
the multi-head self-attention module is used for obtaining self-attention prediction components;
the time convolution network module is used for obtaining an initial prediction component;
the inverse normalization module is used for performing inverse normalization processing on the coupling prediction sequence obtained by the multi-component feature fusion to obtain a runoff prediction sequence.
2. The runoff prediction method according to claim 1, characterized in that the process of training the runoff prediction model comprises the following steps:
s1: all data in the runoff data set are normalized;
s2: dividing the runoff data set into a training set, a verification set and a test set according to the proportion;
s3: inputting the runoff sequence samples in the training set into a positive standardization module for positive standardization processing to obtain a standardized runoff sequence;
s4: inputting the standardized runoff sequence into a time sequence decomposition module for time sequence decomposition, and inputting the decomposition result into a linear layer to obtain a trend component and a periodic component;
s5: inputting the standardized runoff sequence into a multi-head self-attention module to obtain a self-attention prediction component;
s6: inputting the standardized runoff sequence into a time convolution network, and obtaining initial predicted components through a plurality of time convolution modules connected in series;
s7: weighting and summing the trend component, the periodic component, the self-attention prediction component and the initial prediction component to obtain a coupled prediction sequence;
s8: inputting the coupled prediction sequence into the inverse normalization module for inverse normalization, to obtain the runoff prediction sequence;
s9: calculating a loss function of the runoff prediction model according to the runoff prediction sequence and the runoff observation sequence;
s10: continuously adjusting the learning rate, and dynamically optimizing model parameters according to the learning rate;
s11: verifying the trained model on the verification set; model training is complete when the loss function reaches its minimum.
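The coupling in step S7 is a weighted sum of the four components. A minimal numpy sketch with illustrative fixed weights (in the actual model these weights would be learned jointly with the rest of the network):

```python
import numpy as np

def couple_components(trend, periodic, attention, initial,
                      weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the four prediction components (step S7).

    The equal weights here are placeholders for illustration only.
    """
    w1, w2, w3, w4 = weights
    return w1 * trend + w2 * periodic + w3 * attention + w4 * initial

# Toy components for a 3-step prediction horizon
t = np.array([1.0, 1.1, 1.2])
p = np.array([0.1, -0.1, 0.1])
a = np.array([1.0, 1.0, 1.3])
i = np.array([0.9, 1.2, 1.2])
coupled = couple_components(t, p, a, i)
```

The coupled sequence still lives on the normalized scale; step S8 maps it back to physical runoff units.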
3. The runoff prediction method according to claim 2, wherein in step S3 the forward normalization of the runoff sequence samples in the training set can be expressed as the following formula:

x̃_t = (x_t − μ) / σ

wherein,

x_t represents the t-th runoff observation, and μ and σ represent the mean and standard deviation of the runoff observation sequence;

n represents the length of the runoff prediction period;
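Assuming the forward normalization of step S3 is the usual z-score transform (consistent with the mean and standard deviation that the inverse-normalization module of claim 9 needs to restore the original scale), a minimal numpy sketch:

```python
import numpy as np

def forward_normalize(x):
    """Z-score normalization of a runoff sequence (step S3).

    Returns the normalized sequence together with the mean and
    standard deviation, which inverse normalization needs later.
    """
    mu = x.mean()
    sigma = x.std()
    return (x - mu) / sigma, mu, sigma

runoff = np.array([120.0, 135.0, 150.0, 128.0, 142.0])  # toy observations
z, mu, sigma = forward_normalize(runoff)
```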
4. The runoff prediction method according to claim 2, wherein in step S4 the process of obtaining the trend component and the periodic component comprises the following steps:
s4.1: padding both ends of the data according to the average-pooling kernel size, and applying one-dimensional average pooling to the padded data to obtain an initial trend component;
s4.2: subtracting the initial trend component from the padded data to obtain an initial periodic component;
s4.3: inputting the initial trend component and the initial periodic component into respective linear layers, and unifying the output dimension with that of the target sequence, to obtain the trend component and the periodic component.
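A numpy sketch of the moving-average decomposition in steps S4.1 and S4.2, assuming edge-replication padding (the claim does not specify how the ends are complemented, so this padding choice is an assumption):

```python
import numpy as np

def decompose(x, kernel_size=3):
    """Moving-average series decomposition (steps S4.1-S4.2).

    Pads both ends by edge replication, applies 1-D average pooling
    to extract an initial trend component, and takes the residual
    as the initial periodic component.
    """
    pad = kernel_size // 2
    padded = np.concatenate([np.repeat(x[0], pad), x, np.repeat(x[-1], pad)])
    # 1-D average pooling as a valid-mode convolution with a uniform kernel
    trend = np.convolve(padded, np.ones(kernel_size) / kernel_size, mode="valid")
    periodic = x - trend
    return trend, periodic

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
trend, periodic = decompose(x, kernel_size=3)
```

The linear layers of step S4.3 (which project both components to the target-sequence length) are omitted here for brevity.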
5. The runoff prediction method according to claim 2, wherein in step S5 the process of obtaining the self-attention prediction component comprises the following steps:
s5.1: inputting the normalized runoff sequence into a linear layer to obtain an initial timing component;
s5.2: inputting the initial timing component into three different linear layers to obtain a query sub-component, a key sub-component and a value sub-component, respectively;
s5.3: performing the self-attention operation on the query, key and value sub-components to obtain an initial self-attention prediction component;
s5.4: inputting the initial self-attention prediction component into a linear layer and adjusting the output dimension to obtain the self-attention prediction component.
6. The runoff prediction method according to claim 5, wherein the process of obtaining the self-attention prediction component can be expressed as the following formulas:

Q = F · W_Q, K = F · W_K, V = F · W_V

F_A = Linear(softmax(Q · K^T / √d_k) · V)

wherein,

Linear represents a linear layer;

F represents the initial timing component;

W_Q, W_K and W_V respectively represent the weight matrices corresponding to the query, key and value sub-components;

Q, K and V respectively represent the query, key and value sub-components;

K^T represents the transpose of the key sub-component;

d_k represents the model dimension;

softmax represents the normalized exponential function;

F_A represents the self-attention prediction component.
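A single-head numpy sketch of the scaled dot-product self-attention operation described in claims 5 and 6; the weight matrices here are random placeholders standing in for the trained linear layers:

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable normalized exponential function."""
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(F, W_Q, W_K, W_V):
    """Scaled dot-product self-attention on an initial timing component F.

    Q, K, V are obtained from F via three weight matrices;
    attention weights are softmax(Q K^T / sqrt(d_k)).
    """
    Q, K, V = F @ W_Q, F @ W_K, F @ W_V
    d_k = K.shape[-1]
    scores = softmax(Q @ K.T / np.sqrt(d_k))
    return scores @ V

rng = np.random.default_rng(0)
F = rng.normal(size=(6, 4))                        # 6 time steps, model dim 4
W_Q, W_K, W_V = [rng.normal(size=(4, 4)) for _ in range(3)]
F_A = self_attention(F, W_Q, W_K, W_V)
```

The final linear layer of step S5.4 (dimension adjustment) and the multi-head splitting are omitted for clarity.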
7. The runoff prediction method according to claim 2, wherein in step S6 the process of obtaining the initial prediction component comprises the following steps:
s6.1: constructing a temporal convolutional network comprising six serially connected temporal convolution modules, each temporal convolution module comprising two sub-modules each consisting of a one-dimensional dilated convolution layer and a cropping (chomp) layer;
s6.2: in each temporal convolution module, first applying a one-dimensional dilated convolution to the input sequence data, then cropping the redundant padded data at the head, and then applying the one-dimensional dilated convolution and the cropping once more;
s6.3: taking the output of each temporal convolution module as the input of the next, until all six serially connected temporal convolution modules have been passed.
8. The runoff prediction method according to claim 7, wherein the process of obtaining the initial prediction component can be expressed as the following formulas:

F_t = Dropout(ReLU(Chomp(Conv(x))))

F_c = ReLU(Dropout(ReLU(Chomp(Conv(F_t)))) + Conv1D(x))

wherein,

x represents the input sequence of a temporal convolution module;

Conv represents a one-dimensional dilated convolution layer;

Chomp represents a cropping layer;

ReLU represents the nonlinear activation function;

Dropout represents the random inactivation function;

Conv1D represents the one-dimensional convolution of the residual connection;

F_t represents the data sequence processed by one basic dilated-convolution sub-module;

F_c represents the data sequence processed by one temporal convolution module;

F_T represents the initial prediction component extracted by the temporal convolutional network, i.e. the output F_c of the sixth serially connected module.
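A minimal single-channel sketch of the causal dilated convolution followed by cropping ("chomp") that each sub-module of claims 7 and 8 performs. The real module additionally applies ReLU and dropout, stacks two such sub-modules, and adds a residual one-dimensional convolution; all of that is omitted here:

```python
import numpy as np

def causal_dilated_conv(x, weights, dilation):
    """One causal dilated 1-D convolution with head padding and crop.

    Pads the head of the sequence with zeros, convolves at the given
    dilation, and keeps only len(x) outputs, so the output at time t
    depends only on inputs at times <= t. Single channel, no bias.
    """
    k = len(weights)
    pad = (k - 1) * dilation
    padded = np.concatenate([np.zeros(pad), x])
    out = np.zeros_like(x)
    for t in range(len(x)):
        for j in range(k):
            # weights[0] taps the current step, weights[j] taps j*dilation back
            out[t] += weights[j] * padded[pad + t - j * dilation]
    return out

x = np.arange(5, dtype=float)                       # [0, 1, 2, 3, 4]
y = causal_dilated_conv(x, weights=np.array([1.0, 1.0]), dilation=2)
# each output is x[t] + x[t-2] (zero before the sequence starts)
```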
9. The runoff prediction method according to claim 2, wherein in step S8 the inverse normalization of the coupled prediction sequence to obtain the runoff prediction value can be expressed as the following formula:

ŷ = y · σ + μ

wherein,

ŷ represents the runoff prediction sequence;

y represents the coupled prediction sequence;

σ represents the standard deviation, and μ the mean, of the runoff observation sequence used in forward normalization.
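Assuming the inverse normalization of step S8 is the inverse of the z-score transform used in forward normalization, a round-trip sketch:

```python
import numpy as np

def inverse_normalize(y, mu, sigma):
    """Restore a normalized prediction sequence to the original
    runoff scale (step S8 / claim 9)."""
    return y * sigma + mu

# Round trip: forward-normalize a toy observation sequence, then invert
runoff = np.array([120.0, 135.0, 150.0, 128.0, 142.0])
mu, sigma = runoff.mean(), runoff.std()
z = (runoff - mu) / sigma
restored = inverse_normalize(z, mu, sigma)
```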
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211710368.8A CN116050595A (en) | 2022-12-29 | 2022-12-29 | Attention mechanism and decomposition mechanism coupled runoff amount prediction method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116050595A true CN116050595A (en) | 2023-05-02 |
Family
ID=86119270
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211710368.8A Withdrawn CN116050595A (en) | 2022-12-29 | 2022-12-29 | Attention mechanism and decomposition mechanism coupled runoff amount prediction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116050595A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117273241A (en) * | 2023-11-17 | 2023-12-22 | 北京京东乾石科技有限公司 | Method and device for processing data |
CN117273241B (en) * | 2023-11-17 | 2024-04-05 | 北京京东乾石科技有限公司 | Method and device for processing data |
CN117522416A (en) * | 2023-12-28 | 2024-02-06 | 北京芯盾时代科技有限公司 | Transaction account identification method and device |
Legal Events
Date | Code | Title | Description
---|---|---|---|
PB01 | Publication | ||
WW01 | Invention patent application withdrawn after publication | Application publication date: 20230502 ||