CN114399101A - TCN-BIGRU-based gas load prediction method and device - Google Patents


Info

Publication number
CN114399101A
CN114399101A (Application No. CN202111658841.8A)
Authority
CN
China
Prior art keywords
data
tcn
bigru
historical
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111658841.8A
Other languages
Chinese (zh)
Inventor
袁烨
承灿赟
金骏阳
朱大令
张永
李泽明
童剑峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi China Resources Gas Co Ltd
HUST Wuxi Research Institute
Original Assignee
Wuxi China Resources Gas Co Ltd
HUST Wuxi Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi China Resources Gas Co Ltd, HUST Wuxi Research Institute filed Critical Wuxi China Resources Gas Co Ltd
Priority to CN202111658841.8A priority Critical patent/CN114399101A/en
Publication of CN114399101A publication Critical patent/CN114399101A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393Score-carding, benchmarking or key performance indicator [KPI] analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply


Abstract

The embodiment of the invention provides a gas load prediction method and device based on TCN-BIGRU. The method comprises the steps of obtaining historical characteristic data; screening historical characteristic data, preprocessing the screened historical characteristic data, and taking the preprocessed historical characteristic data as training data; and constructing a TCN-BIGRU model, inputting training data into the TCN-BIGRU model, training the TCN-BIGRU model, and predicting the gas load of the next day by taking the trained TCN-BIGRU model as a gas load prediction model. In this way, the gas load can be accurately predicted through the TCN-BIGRU model, so that the operation efficiency of a gas company is improved, and the purchase cost is reduced.

Description

TCN-BIGRU-based gas load prediction method and device
Technical Field
The present invention relates generally to the field of gas load prediction, and more particularly, to a TCN-BIGRU based gas load prediction method and apparatus.
Background
With the evolution of energy structures and environmental-protection policies, the demand for natural gas in China is growing rapidly. However, natural gas resources in China are very unevenly distributed, and supply and demand are severely mismatched. During peak periods of gas use such as winter, shortages of natural gas are almost inevitable. Accurately predicting natural gas consumption is therefore of great significance for formulating strategies for buying and selling natural gas.
Natural gas load is affected by weather (e.g., temperature, humidity, atmospheric pressure) and social activity (e.g., economic development, population growth, industrial manufacturing), with holidays and seasons being the most important factors. Natural gas consumption exhibits large fluctuations and strong randomness over time, which makes the prediction task difficult. Conventional time-series analysis methods have been used to predict natural gas consumption, including moving averages, autoregressive integrated moving averages (ARIMA), autoregressive moving averages, Kalman filtering, and wavelet transforms. These methods can capture linear relationships between influencing factors but are weak at describing nonlinear features.
Conventional machine learning methods are also widely used, such as support vector regression, artificial neural networks, Bayesian networks, matrix decomposition, and Gaussian process regression. These methods can extract nonlinear relationships between features and can handle small-sample data. When processing large amounts of data, however, they suffer from the curse of dimensionality and high computational complexity. To address these problems, deep belief networks based on restricted Boltzmann machines, stacked denoising autoencoders, and convolutional neural networks have been proposed, which outperform the above methods. However, they require manual feature extraction, and it is difficult for them to capture the relationship between past and future time points. The recurrent neural network (RNN) was proposed for processing time series, but it struggles to remember long-term information; in addition, when the time interval is long, gradients vanish or explode. Long short-term memory (LSTM) and the gated recurrent unit (GRU) effectively solve this problem: they improve on the basic recurrent neural network by using gating to control the flow of information, efficiently exploring the time series and improving prediction accuracy. However, the gated recurrent unit (GRU) can only extract features in a single direction.
Disclosure of Invention
According to the embodiment of the invention, a gas load prediction scheme based on TCN-BIGRU is provided. According to the scheme, the gas load can be accurately predicted through the TCN-BIGRU model, so that the operation efficiency of a gas company is improved, and the purchasing cost is reduced.
In a first aspect of the invention, a TCN-BIGRU based gas load prediction method is provided. The method comprises the following steps:
acquiring historical characteristic data;
screening the historical characteristic data, preprocessing the screened historical characteristic data, and taking the preprocessed historical characteristic data as training data;
and constructing a TCN-BIGRU model, inputting the training data into the TCN-BIGRU model, training the TCN-BIGRU model, and predicting the gas load of the next day by taking the trained TCN-BIGRU model as a gas load prediction model.
Further, the historical feature data comprises time-series data and non-time-series data; wherein, the time sequence data is the total daily load data in the historical data; the non-time sequence data are holiday data and weather data corresponding to the current day of the historical data.
Further, the screening the historical feature data includes:
screening out historical characteristic data with the Pearson correlation coefficient larger than a threshold value;
wherein the Pearson correlation coefficient is:
ρ(X,Y) = [E(XY) − E(X)E(Y)] / ( √(E(X²) − E²(X)) · √(E(Y²) − E²(Y)) )

where ρ(X,Y) is the Pearson correlation coefficient; X is the weather data for the given day in the historical data; Y is the total daily load in the historical data; and E(·) denotes expectation.
Further, the preprocessing the screened historical feature data includes:
normalizing the total daily load data in the historical data;
carrying out one-hot coding on the holiday data corresponding to the current historical data;
the highest temperature data and the lowest temperature data among the weather data are normally normalized.
Further, the TCN-BIGRU model comprises an input layer, a one-dimensional convolution layer, a causal expansion convolution layer, a BIGRU layer and an output layer which are sequentially arranged;
the input layer is used for filtering the time series data by setting a sliding window and outputting the filtered time series data to the one-dimensional convolutional layer;
the one-dimensional convolutional layer is used for extracting local trend characteristics of the filtered time series data and outputting the local trend characteristics to the causal expansion convolutional layer;
the causal expansion convolutional layer is used for extracting hidden information and long-term time relation in the features and outputting the hidden information and the long-term time relation to the BIGRU layer;
the BIGRU layer learns the output vector of the causal expansion convolution layer by using a forward GRU network structure and a reverse GRU network structure to obtain a bidirectional time sequence characteristic, combines the bidirectional time sequence characteristic with a non-time sequence characteristic and inputs the bidirectional time sequence characteristic into the output layer;
and the output layer selects a full connection layer and is used for outputting the gas load predicted value of the next day according to the combined result of the time sequence characteristic and the non-time sequence characteristic.
Further, the loss function of the TCN-BIGRU model is defined as the mean absolute error (MAE):

MAE = (1/m) · Σ_{i=1}^{m} |y_i − ŷ_i|

where MAE is the mean absolute error; m is the total number of days for which the next-day gas load is predicted; y_i is the actual gas load on day i; and ŷ_i is the predicted gas load on day i.
Further, the time-series data are:

x_1 = [x_{t−T+1}, x_{t−T+2}, ..., x_t]^T

where x_1 is the time-series data; t is the current time; and T is the sliding-window length. The non-time-series data are:

x_2 = [Q_max(s), Q_min(s), I(s), i(s)]

where x_2 is the non-time-series data; Q_max(s) is the forecast maximum temperature for the prediction day; Q_min(s) is the forecast minimum temperature for the prediction day; I(s) is the working-day indicator function, with I(s) = 1 if the prediction day is a working day; and i(s) is the non-working-day indicator function, with i(s) = 0 if the prediction day is a non-working day.
In a second aspect of the invention, a TCN-BIGRU based gas load prediction apparatus is provided. The device includes:
the acquisition module is used for acquiring historical characteristic data;
the preprocessing module is used for screening the historical characteristic data, preprocessing the screened historical characteristic data and taking the preprocessed historical characteristic data as training data;
and the model training module is used for constructing a TCN-BIGRU model, inputting the training data into the TCN-BIGRU model, training the TCN-BIGRU model, and predicting the gas load of the next day by taking the trained TCN-BIGRU model as a gas load prediction model.
In a third aspect of the invention, an electronic device is provided. The electronic device includes at least one processor and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the first aspect of the invention.
In a fourth aspect of the invention, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of the first aspect of the invention.
It should be understood that the statements herein reciting aspects are not intended to limit the critical or essential features of any embodiment of the invention, nor are they intended to limit the scope of the invention. Other features of the present invention will become apparent from the following description.
Drawings
The above and other features, advantages and aspects of embodiments of the present invention will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. In the drawings, like or similar reference characters denote like or similar elements, and wherein:
FIG. 1 shows a flow diagram of a TCN-BIGRU based gas load prediction method according to an embodiment of the invention;
FIG. 2 shows a schematic structural diagram of a TCN-BIGRU model according to an embodiment of the invention;
FIG. 3 shows a block diagram of a TCN-BIGRU based gas load prediction device according to an embodiment of the present invention;
FIG. 4 illustrates a block diagram of an exemplary electronic device capable of implementing embodiments of the present invention;
here, 400 denotes an electronic device, 401 denotes a CPU, 402 denotes a ROM, 403 denotes a RAM, 404 denotes a bus, 405 denotes an I/O interface, 406 denotes an input unit, 407 denotes an output unit, 408 denotes a storage unit, and 409 denotes a communication unit.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
In addition, the term "and/or" herein is only one kind of association relationship describing an associated object, and means that there may be three kinds of relationships, for example, a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
In the present invention, the meteorological, data, and economic features of natural gas consumption are analyzed. The data are processed with a sliding window, intended to capture the actual variation and fluctuation trends of the gas load. Next, a temporal convolutional network (TCN) is used to extract hidden features, with a receptive field whose size is easy to adjust. Finally, a Bi-GRU model is introduced to extract past and future features. The results show that a neural network based on a bidirectional gated recurrent unit and a causal dilated convolution model performs well.
FIG. 1 shows a flow chart of a TCN-BIGRU based gas load prediction method of an embodiment of the present invention.
The method comprises the following steps:
and S101, acquiring historical characteristic data.
The historical feature data includes time series data and non-time series data.
The time series data is daily load total data in the history data. The daily load total data is the total load of each day in the historical data. The time sequence data is data sorted according to the sequence of time.
The non-time sequence data are holiday data and weather data corresponding to the current day of the historical data. The holiday data includes a working day and a non-working day, and the working day and the non-working day are represented by different identifiers, for example, the working day is represented as "0", and the non-working day is represented as "1". The weather data includes a temperature condition, a humidity condition, a weather condition, and the like corresponding to the current day of the calendar history data. Non-time series data is data that is not sorted in chronological order.
S102, screening the historical characteristic data, preprocessing the screened historical characteristic data, and taking the preprocessed historical characteristic data as training data.
Firstly, the screening the historical feature data includes:
screening out historical characteristic data with the Pearson correlation coefficient larger than a threshold value; the threshold value is, for example, 0.5.
The Pearson correlation coefficient is:
ρ(X,Y) = [E(XY) − E(X)E(Y)] / ( √(E(X²) − E²(X)) · √(E(Y²) − E²(Y)) )

where ρ(X,Y) is the Pearson correlation coefficient; X is the weather data for the given day in the historical data; Y is the total daily load in the historical data; and E(·) denotes expectation.
Through the screening, historical daily load total data, holiday data, highest temperature data and lowest temperature data are selected.
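As a concrete illustration, the screening step can be sketched in Python with NumPy. The function names are hypothetical, the use of the absolute correlation (so that strongly negatively correlated features such as temperature also survive) is an assumption, and the 0.5 threshold follows the embodiment above.

```python
import numpy as np

def pearson(x, y):
    # Pearson correlation via expectations, matching the formula above:
    # rho = (E[XY] - E[X]E[Y]) / sqrt((E[X^2] - E[X]^2)(E[Y^2] - E[Y]^2))
    x, y = np.asarray(x, float), np.asarray(y, float)
    num = (x * y).mean() - x.mean() * y.mean()
    den = np.sqrt((x ** 2).mean() - x.mean() ** 2) * np.sqrt((y ** 2).mean() - y.mean() ** 2)
    return num / den

def screen_features(features, load, threshold=0.5):
    # Keep only features whose |correlation| with the daily total load exceeds the threshold.
    return {name: col for name, col in features.items()
            if abs(pearson(col, load)) > threshold}

load = [10, 12, 15, 20, 25]
tmax = [30, 28, 25, 20, 15]   # strongly (negatively) correlated with load
noise = [1, -2, 3, 1, -1]     # essentially uncorrelated
kept = screen_features({"tmax": tmax, "noise": noise}, load)
```

With this toy data, `tmax` is retained and `noise` is screened out.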
Secondly, the preprocessing the screened historical characteristic data comprises the following steps:
normalizing the total daily load data in the historical data;
carrying out one-hot coding on the holiday data corresponding to the current historical data; for example, weekdays and non-weekdays are represented by one-hot codes of "0" and "1".
The maximum temperature data and the minimum temperature data in the weather data are standard (z-score) normalized.
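A minimal sketch of the three preprocessing steps. Min-max scaling for the load is an assumption (the source says only "normalized"); the z-score standardization for the temperatures and the two-value one-hot coding for the holiday flag follow the description above.

```python
import numpy as np

def minmax_scale(x):
    # Assumed normalization of the daily total load: maps the series to [0, 1].
    x = np.asarray(x, float)
    return (x - x.min()) / (x.max() - x.min())

def one_hot_holiday(is_workday):
    # One-hot encoding of the holiday flag: workday -> [1, 0], non-workday -> [0, 1].
    return np.array([[1, 0] if w else [0, 1] for w in is_workday])

def zscore(x):
    # Standard (z-score) normalization of the max/min temperature series.
    x = np.asarray(x, float)
    return (x - x.mean()) / x.std()
```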
S103, building a TCN-BIGRU model, inputting the training data into the TCN-BIGRU model, training the TCN-BIGRU model, and taking the trained TCN-BIGRU model as a gas load prediction model. And predicting the gas load of the next day according to the gas load prediction model.
The TCN-BIGRU model, as shown in fig. 2, includes an input layer, a one-dimensional convolution layer, a causal expansion convolution layer, a BIGRU layer, and an output layer, which are sequentially arranged.
The input layer is used for filtering the time series data by setting a sliding window and outputting the filtered time series data to the one-dimensional convolutional layer.
The input layer of the TCN-BIGRU model uses the preprocessed and feature-filtered data, i.e., the highest temperature, the lowest temperature, the holiday information, and the gas load value of the current time step as inputs. The input includes time-series data and non-time-series data.
In the present embodiment, a sliding window of a time step is set, and the sliding window size is set to 10. The time-series data input at this time are as follows:
The time-series input is:

x_1 = [x_{t−T+1}, x_{t−T+2}, ..., x_t]^T

where x_1 is the time-series data; t is the current time; and T is the sliding-window length. The non-time-series input is:

x_2 = [Q_max(s), Q_min(s), I(s), i(s)]

where x_2 is the non-time-series data; Q_max(s) is the forecast maximum temperature for the prediction day; Q_min(s) is the forecast minimum temperature for the prediction day; I(s) is the working-day indicator function, with I(s) = 1 if the prediction day is a working day; and i(s) is the non-working-day indicator function, with i(s) = 0 if the prediction day is a non-working day.
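The sliding-window construction of x_1 can be sketched as follows. The helper name is hypothetical; T = 10 follows the embodiment, and taking the next-day load as the training target is an assumption consistent with next-day prediction.

```python
import numpy as np

def sliding_windows(series, T=10):
    # Slice the load series into windows x_1 = [x_{t-T+1}, ..., x_t] of length T;
    # the target paired with each window is the next-day load x_{t+1}.
    series = np.asarray(series, float)
    X = np.stack([series[k:k + T] for k in range(len(series) - T)])
    y = series[T:]
    return X, y

X, y = sliding_windows(range(15), T=10)  # 15 days of toy load data -> 5 windows
```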
And the one-dimensional convolutional layer is used for extracting local trend characteristics of the filtered time series data and outputting the local trend characteristics to the causal expansion convolutional layer.
In each one-dimensional convolution layer, a window is used to process a stretch of the time series, and sliding the window captures short-term fluctuation trends. Sequence segments within the window size can be learned to capture local trend features of the time series. After the gas data pass through the one-dimensional convolution, features are extracted; this shortens the one-dimensional time series and improves computational efficiency.
The causal dilated convolution layer is used for extracting hidden information and long-term temporal relationships from the features and outputting them to the BIGRU layer. Specifically, the data output by the one-dimensional convolution layer are fed into the causal dilated convolution layer for feature extraction; this layer effectively extracts features from the input data, including hidden information and long-term temporal relationships, while also reducing the feature dimensionality of the input and improving computational efficiency.
The causal dilated convolution layer, i.e., the temporal convolutional network (TCN), is an algorithm for processing time series. It introduces causal convolution, dilated convolution, and residual modules, which solve the problem of extracting long-term dependencies from a time series. Its structure consists of the following three parts.
First, causal convolution (Causal Convolutions)
For a given input, the output at the current time depends only on the current and past time steps, never on future inputs, which means causal convolution is driven entirely by historical data. On its own, however, this structure cannot see far enough into the past to predict longer time sequences well.
Second, dilated convolution (Dilated Convolutions)
In order to solve the problem that the causal convolution can only receive short-time historical information, the dilation convolution is introduced.
For a one-dimensional time series X = (x_0, x_1, x_2, x_3, ..., x_t, ..., x_T) and a filter f = (f(0), f(1), f(2), ..., f(n−1)), the dilated convolution operation H on the sequence is defined as:

H(t) = Σ_{i=0}^{n−1} f(i) · x_{t − d·i}

where n is the filter size, d is the dilation factor, and f(i) is the i-th filter weight.
By increasing the filter size and the dilation factor, the TCN can extract features better. As the dilation factor grows, the top layer of the TCN can accept a wider range of historical information. After adding dilated convolution, the receptive field for information entering the network is significantly enlarged compared with before.
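The dilated convolution H defined above can be illustrated with a direct NumPy implementation. Zero-padding on the left is an assumption about boundary handling; it preserves causality because the output at time t never reads inputs later than t.

```python
import numpy as np

def causal_dilated_conv(x, f, d=1):
    # H(t) = sum_{i=0}^{n-1} f(i) * x[t - d*i]; terms with t - d*i < 0 are
    # treated as zero (left zero-padding), so the operation stays causal.
    x = np.asarray(x, float)
    out = np.zeros_like(x)
    for t in range(len(x)):
        for i in range(len(f)):
            j = t - d * i
            if j >= 0:
                out[t] += f[i] * x[j]
    return out
```

With filter size n and dilation d, the receptive field of one such layer is (n − 1)·d + 1, which is how stacking layers with growing d widens the history the TCN can see.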
Third, residual module (Residual Blocks)
In addition to adjusting the filter size and the dilation factor, the receptive field of the TCN can be enlarged by increasing the number of hidden layers.
Specifically, the residual module consists of a causal dilated convolution layer, a weight-normalization layer, an activation layer, and a dropout layer. The causal dilated convolution extracts hidden information from the input, i.e., information that cannot be obtained by directly observing the data: by convolving over the features of the input sequence with dilation, the residual module mines features that are not directly observable. WeightNorm constrains the range of the weights to improve training speed. The activation layer uses the rectified linear unit (ReLU), which converges well, and dropout is used to address overfitting of the network. Stacking residual modules yields a deeper causal dilated convolution network that extracts features better, so that each convolution at the output layer can draw more information from the input layer.
A branch of the residual module performs a transformation at the input so that its dimensionality matches that of the main path for the element-wise addition. The output of the h-th residual module is defined as:

H(x) = F(x^(h−1)) + x^(h−1)
x^(h) = δ(H(x))

where F(·) denotes the series of transformation operations within the module, δ(·) is the activation, and x^(h−1) and x^(h) are the module's input and output.
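A minimal sketch of the residual-module output equations, simplifying F(·) to a single causal dilated convolution followed by ReLU. This is an illustrative reduction: the real module also applies weight normalization and dropout, which are omitted here.

```python
import numpy as np

def _causal_dilated_conv(x, f, d):
    # H(t) = sum_i f(i) * x[t - d*i], zero-padded on the left (causal).
    out = np.zeros(len(x))
    for t in range(len(x)):
        out[t] = sum(f[i] * x[t - d * i] for i in range(len(f)) if t - d * i >= 0)
    return out

def residual_block(x, f, d):
    # x_out = delta(F(x) + x): transformation branch F, skip connection,
    # then activation delta (ReLU in both places in this sketch).
    x = np.asarray(x, float)
    Fx = np.maximum(_causal_dilated_conv(x, f, d), 0.0)  # simplified F(.)
    return np.maximum(Fx + x, 0.0)                       # skip connection + delta
```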
An ordinary RNN can only capture the relationship between the predicted value and the last few entries of the input sequence; the causal convolution network can extract the relationship between the predicted value and a much longer stretch of the sequence, i.e., the long-term temporal relationship.
The BIGRU layer learns the output vector of the causal dilated convolution layer with a forward GRU network and a backward GRU network to obtain bidirectional time-series features, combines these with the non-time-series features, and feeds the result to the output layer. Specifically, the bidirectional GRU mechanism uses the forward and backward GRU structures to learn the output vector of the TCN network, and the hidden state of the GRU layer at time t is denoted h_t.
The GRU is structurally simpler than the LSTM, with fewer parameters and easier convergence. The GRU contains two gating units, an update gate and a reset gate, together with a hidden state and a candidate hidden state. Let z_t denote the update-gate output at time t, r_t the reset-gate output at time t, and h_t and h̃_t the hidden state and candidate hidden state, respectively. The GRU equations are:

r_t = σ(W_r · [h_{t−1}, x_t] + b_r)
z_t = σ(W_z · [h_{t−1}, x_t] + b_z)
h̃_t = tanh(W_h · [r_t ⊙ h_{t−1}, x_t] + b_h)
h_t = (1 − z_t) ⊙ h_{t−1} + z_t ⊙ h̃_t
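One forward step of the GRU equations can be sketched in NumPy as follows. The weight shapes and the concatenation convention [h_{t−1}, x_t] follow the equations; all names are illustrative, and a bidirectional GRU would simply run a second copy of this step over the reversed sequence and concatenate the two hidden states.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gru_step(h_prev, x_t, Wr, Wz, Wh, br, bz, bh):
    # One GRU time step; [h_{t-1}, x_t] is concatenation, * is element-wise.
    hx = np.concatenate([h_prev, x_t])
    r = sigmoid(Wr @ hx + br)                                       # reset gate
    z = sigmoid(Wz @ hx + bz)                                       # update gate
    h_cand = np.tanh(Wh @ np.concatenate([r * h_prev, x_t]) + bh)   # candidate state
    return (1.0 - z) * h_prev + z * h_cand                          # new hidden state
```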
in the classical recurrent neural network, the transmission of states is developed from front to back in a single direction. When some devices perform data processing, the output at the present time is related not only to the previous state but also to the subsequent state. The bidirectional GRU is formed by superposing two GRUs up and down, the output is jointly determined by the states of the two GRUs, and the modeling capability of the time sequence of equipment operation in the degradation process can be better excavated.
The output layer is a fully connected layer that outputs the predicted gas load for the next day, i.e., time (t+1), denoted ŷ_{t+1}, from the combined time-series and non-time-series features.
The data transfer process of the TCN-BIGRU model comprises the following steps:
extracting local trend characteristics of the time sequence through the one-dimensional convolution layer from the time sequence data subjected to data preprocessing; then the extracted local trend characteristics are used for further extracting hidden information and long-term time relation in the characteristics through a causal expansion convolutional layer; then, inputting the characteristics extracted by the causal expansion convolution layer into a BIGRU layer for better learning bidirectional time sequence characteristics; and finally, combining the features and the non-time sequence features learned by the BIGRU layer, and inputting the combined features and the non-time sequence features into the full connection layer to obtain the final output.
Further, the loss function of the TCN-BIGRU model is defined as the mean absolute error (MAE), which reflects the actual magnitude of the prediction error:

MAE = (1/m) · Σ_{i=1}^{m} |y_i − ŷ_i|

where MAE is the mean absolute error; m is the total number of days for which the next-day gas load is predicted; y_i is the actual gas load on day i; and ŷ_i is the predicted gas load on day i.
Because the gas-load data are voluminous and span a long period, a large amount of anomalous data may occur. MAE, however, is robust to such outliers. Meanwhile, a dynamic learning rate effectively compensates for the drawback of MAE's constant gradient.
Further, the output result of the gas load prediction model, namely the TCN-BIGRU model, is used as the gas load prediction value of the next day.
As an embodiment of the present invention, to evaluate the prediction performance more completely, the root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and R² can be selected to evaluate the performance of the model. These statistical indices are defined as:

RMSE = √( (1/N) Σ_{w=1}^{N} (z_w − ẑ_w)² )
MAE = (1/N) Σ_{w=1}^{N} |z_w − ẑ_w|
MAPE = (100%/N) Σ_{w=1}^{N} |(z_w − ẑ_w)/z_w|
R² = 1 − Σ_{w=1}^{N} (z_w − ẑ_w)² / Σ_{w=1}^{N} (z_w − z̄)²

where N is the number of test instances, and z_w and ẑ_w are the actual and predicted load values in the w-th case. Each index has different strengths and weaknesses. RMSE measures model accuracy via the deviation between predicted and actual load values and stays in the same units as the load, but it is very sensitive to large individual errors and thus easily skewed by outliers. MAE is the average absolute error between predicted and actual load values; it handles outliers better than RMSE but does not fully reflect the degree of prediction bias. MAPE expresses accuracy as a percentage of absolute error, accounting for the relative error between predicted and actual values, but it cannot be used when an actual load value is 0. R² rescales the regression result to between 0 and 1; the closer the value is to 1, the better the model, which makes different models easier to compare. Given these trade-offs, prediction performance should be assessed by combining multiple indices. The evaluation results of the predicted load values under these indices are shown in Table 1:
[Table image not reproduced: evaluation results of the TCN-BiGRU, TCN, GRU, and BiGRU models under the above indexes]

TABLE 1
As can be seen from Table 1, the TCN-BiGRU model is superior to the other models on every evaluation index. On RMSE, the proposed model performs better than the TCN, GRU, and BiGRU models, and on MAE its evaluation index is reduced by 5.98, 6.52, and 2.68 percentage points respectively relative to those models, demonstrating that the proposed method substantially improves the gas load prediction accuracy.
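The four evaluation indexes defined above are straightforward to compute; a minimal numpy sketch follows (the toy values are illustrative and are not the data behind Table 1):

```python
import numpy as np

def rmse(z, zhat):
    """Root mean square error between actual and predicted loads."""
    return float(np.sqrt(np.mean((z - zhat) ** 2)))

def mae(z, zhat):
    """Mean absolute error."""
    return float(np.mean(np.abs(z - zhat)))

def mape(z, zhat):
    """Mean absolute percentage error; undefined if any actual value is 0."""
    return float(np.mean(np.abs((z - zhat) / z)) * 100)

def r2(z, zhat):
    """Coefficient of determination: 1 - residual SS / total SS."""
    ss_res = np.sum((z - zhat) ** 2)
    ss_tot = np.sum((z - z.mean()) ** 2)
    return float(1 - ss_res / ss_tot)

z = np.array([100.0, 120.0, 110.0, 130.0])     # actual loads (toy data)
zhat = np.array([102.0, 118.0, 111.0, 128.0])  # predicted loads (toy data)
scores = {"RMSE": rmse(z, zhat), "MAE": mae(z, zhat),
          "MAPE": mape(z, zhat), "R2": r2(z, zhat)}
```

Reporting all four together, as the text argues, guards against any single index's blind spot (for example, a low MAE can coexist with a poor RMSE when a few large errors remain).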
In summary, the embodiments of the present invention provide a deep learning method based on causal dilated convolution and a bidirectional GRU. To address the strong randomness and difficulty of short-term gas load prediction, a fixed-length sliding window is used to reconstruct the gas data; the TCN then enlarges the temporal receptive field to better extract hidden features; the output of the TCN is input into the bidirectional GRU for feature extraction in both the forward and reverse directions; and finally a fully connected layer outputs the final load prediction result. By combining causal dilated convolution and the bidirectional gated recurrent unit, hidden features can be extracted effectively and the accuracy of short-term gas load prediction is further improved; in comparisons with conventional methods, the proposed method scores well and achieves high accuracy under different evaluation indexes.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are exemplary and alternative embodiments and that the acts and modules illustrated are not necessarily required to practice the invention.
The above is a description of an embodiment of the method, and the following is a further description of the solution of the present invention by an embodiment of the apparatus.
As shown in fig. 3, the apparatus 300 includes:
an obtaining module 310, configured to obtain historical feature data;
the preprocessing module 320 is configured to screen the historical feature data, preprocess the screened historical feature data, and use the preprocessed historical feature data as training data;
the model training module 330 is used for constructing a TCN-BIGRU model, inputting the training data into the TCN-BIGRU model, training the TCN-BIGRU model, and taking the trained TCN-BIGRU model as a gas load prediction model; and predicting the gas load of the next day according to the gas load prediction model.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the described module may refer to the corresponding process in the foregoing method embodiment, and is not described herein again.
In the technical scheme of the invention, the acquisition, storage, application and the like of the personal information of the related user all accord with the regulations of related laws and regulations without violating the good customs of the public order.
The invention also provides an electronic device and a readable storage medium according to the embodiment of the invention.
FIG. 4 shows a schematic block diagram of an electronic device 400 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not intended to limit implementations of the inventions described and/or claimed herein.
The device 400 comprises a computing unit 401 which may perform various suitable actions and processes in accordance with a computer program stored in a Read Only Memory (ROM)402 or a computer program loaded from a storage unit 408 into a Random Access Memory (RAM) 403. In the RAM 403, various programs and data required for the operation of the device 400 can also be stored. The computing unit 401, ROM 402, and RAM 403 are connected to each other via a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.
A number of components in device 400 are connected to I/O interface 405, including: an input unit 406 such as a keyboard, a mouse, or the like; an output unit 407 such as various types of displays, speakers, and the like; a storage unit 408 such as a magnetic disk, optical disk, or the like; and a communication unit 409 such as a network card, modem, wireless communication transceiver, etc. The communication unit 409 allows the device 400 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
Computing unit 401 may be a variety of general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 401 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 401 executes the respective methods and processes described above, such as the methods S101 to S103. For example, in some embodiments, methods S101-S103 may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 408. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 400 via the ROM 402 and/or the communication unit 409. When the computer program is loaded into RAM 403 and executed by computing unit 401, one or more steps of methods S101-S103 described above may be performed. Alternatively, in other embodiments, the computing unit 401 may be configured to perform the methods S101-S103 by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present invention may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/acts specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combining a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solution of the present invention can be achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions are possible, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A gas load prediction method based on TCN-BIGRU is characterized by comprising the following steps:
acquiring historical characteristic data;
screening the historical characteristic data, preprocessing the screened historical characteristic data, and taking the preprocessed historical characteristic data as training data;
and constructing a TCN-BIGRU model, inputting the training data into the TCN-BIGRU model, training the TCN-BIGRU model, and predicting the gas load of the next day by taking the trained TCN-BIGRU model as a gas load prediction model.
2. The method of claim 1, wherein the historical characterization data comprises time series data and non-time series data; the time sequence data is total daily load data in the historical data; the non-time sequence data are holiday data and weather data corresponding to the current day of the historical data.
3. The method of claim 2, wherein the filtering the historical feature data comprises:
screening out historical characteristic data with the Pearson correlation coefficient larger than a threshold value;
wherein the Pearson correlation coefficient is:
$$\rho_{X,Y} = \frac{E\big[(X - E[X])(Y - E[Y])\big]}{\sigma_X \, \sigma_Y}$$

where $\rho_{X,Y}$ is the Pearson correlation coefficient; X is the weather data on the corresponding day of the historical data; Y is the total daily load in the historical data; E(·) denotes expectation; and $\sigma_X$, $\sigma_Y$ are the standard deviations of X and Y.
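The screening step of claim 3 can be sketched in numpy as follows (the threshold value and toy series are illustrative assumptions, not values fixed by the claim):

```python
import numpy as np

def pearson(x, y):
    """rho = E[(X-EX)(Y-EY)] / (sigma_X * sigma_Y), per the formula above."""
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).mean() / (x.std() * y.std()))

def screen_features(features, load, threshold=0.3):
    """Keep features whose |Pearson correlation| with the daily load
    exceeds the threshold (0.3 is an illustrative choice)."""
    return {name: v for name, v in features.items()
            if abs(pearson(v, load)) > threshold}

load = np.array([50.0, 45.0, 40.0, 30.0, 25.0])           # toy daily loads
features = {
    "max_temp": np.array([5.0, 8.0, 12.0, 20.0, 24.0]),   # strongly (negatively) correlated
    "noise":    np.array([1.0, -1.0, 1.0, -1.0, 1.0]),    # essentially uncorrelated
}
kept = screen_features(features, load)
```

Note that the absolute value of the coefficient is what matters for screening: temperature is strongly *negatively* correlated with gas load (colder days mean more heating), yet it is precisely the kind of feature that should survive the filter.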
4. The method of claim 2, wherein the pre-processing the filtered historical characterization data comprises:
normalizing the total daily load data in the historical data;

one-hot encoding the holiday data corresponding to each day of the historical data;

applying standard normalization to the maximum temperature data and the minimum temperature data in the weather data.
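The two preprocessing operations of claim 4 can be sketched in numpy as follows (a minimal illustration; the three-class day-type split used for the one-hot example is an assumption, not part of the claim):

```python
import numpy as np

def zscore(x):
    """Standard (z-score) normalization, used for load and temperature series."""
    return (x - x.mean()) / x.std()

def one_hot(labels, num_classes):
    """One-hot encode integer day-type labels
    (e.g. 0=workday, 1=weekend, 2=holiday; an illustrative coding)."""
    out = np.zeros((len(labels), num_classes))
    out[np.arange(len(labels)), labels] = 1.0
    return out

daily_load = np.array([320.0, 350.0, 290.0, 410.0])  # toy total daily loads
day_type = np.array([0, 0, 1, 2])                    # toy day-type labels
norm_load = zscore(daily_load)                       # zero mean, unit variance
encoded = one_hot(day_type, 3)                       # shape (4, 3)
```

Normalizing keeps the continuous inputs on a comparable scale for gradient-based training, while one-hot encoding prevents the model from reading a spurious ordering into categorical holiday labels.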
5. The method of claim 1, wherein the TCN-BIGRU model comprises an input layer, a one-dimensional convolutional layer, a causal expansion convolutional layer, a BIGRU layer, and an output layer arranged in this order;
the input layer is used for filtering the time series data by setting a sliding window and outputting the filtered time series data to the one-dimensional convolutional layer;
the one-dimensional convolutional layer is used for extracting local trend characteristics of the filtered time series data and outputting the local trend characteristics to the causal expansion convolutional layer;
the causal expansion convolutional layer is used for extracting hidden information and long-term time relation in the features and outputting the hidden information and the long-term time relation to the BIGRU layer;
the BIGRU layer learns the output vector of the causal dilated convolutional layer using a forward GRU network structure and a reverse GRU network structure to obtain a bidirectional time-sequence feature, combines the bidirectional time-sequence feature with the non-time-sequence features, and inputs the combined result into the output layer;
and the output layer selects a full connection layer and is used for outputting the gas load predicted value of the next day according to the combined result of the time sequence characteristic and the non-time sequence characteristic.
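The bidirectional learning step of claim 5 can be illustrated with a minimal numpy GRU sketch (random weights, illustrative dimensions; a stand-in for the BIGRU layer, not the trained model):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h, p):
    """One GRU step: update gate z, reset gate r, candidate state h~."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h)
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h)
    h_tilde = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h))
    return (1 - z) * h + z * h_tilde

def bigru(xs, p_fwd, p_bwd, hidden):
    """Run a forward and a backward GRU over the sequence and concatenate
    the two final hidden states, as the BIGRU layer does."""
    hf = np.zeros(hidden)
    for x in xs:                      # forward pass: past -> future
        hf = gru_step(x, hf, p_fwd)
    hb = np.zeros(hidden)
    for x in reversed(xs):            # backward pass: future -> past
        hb = gru_step(x, hb, p_bwd)
    return np.concatenate([hf, hb])   # bidirectional feature vector

rng = np.random.default_rng(0)
d_in, hidden, T = 4, 8, 10            # illustrative dimensions

def make_params():
    """Random GRU weights: W* map inputs, U* map the hidden state."""
    return {k: rng.normal(scale=0.1,
                          size=(hidden, d_in if k.startswith("W") else hidden))
            for k in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}

xs = rng.normal(size=(T, d_in))       # toy feature sequence from the TCN stage
feat = bigru(xs, make_params(), make_params(), hidden)
```

The concatenated vector `feat` is what would then be merged with the non-time-sequence features and passed to the fully connected output layer.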
6. The method of claim 5, wherein the loss function of the TCN-BIGRU model is defined as the mean absolute error; the mean absolute error is:
$$\mathrm{MAE} = \frac{1}{m}\sum_{i=1}^{m}\left|y_i - \hat{y}_i\right|$$

where MAE is the mean absolute error; m is the total number of days for which the next-day gas load is predicted; $y_i$ is the actual gas load on day i; and $\hat{y}_i$ is the predicted gas load on day i.
7. The method of claim 2, wherein the time series data is:
$$x_1 = [x_{t-T+1}, x_{t-T+2}, \ldots, x_t]^T$$

where $x_1$ is the time series data; t is any time; and T is the sliding window length;

$$x_2 = [Q_{\max}(s), Q_{\min}(s), I(s), i(s)]$$

where $x_2$ is the non-time-series data; $Q_{\max}(s)$ is the predicted maximum temperature on the day; $Q_{\min}(s)$ is the predicted minimum temperature on the day; I(s) is a workday indicator function, and I(s) = 1 if the predicted day is a workday; i(s) is a non-workday indicator function, and i(s) = 0 if the predicted day is a non-workday.
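The sliding-window reconstruction of the time series and the assembly of the non-sequence vector can be sketched in numpy as follows (window length and the sample values are illustrative assumptions):

```python
import numpy as np

def sliding_windows(series, T):
    """Reconstruct a daily-load series into samples x1 = [x_{t-T+1}, ..., x_t]
    with the next day's load as the prediction target."""
    X, y = [], []
    for t in range(T - 1, len(series) - 1):
        X.append(series[t - T + 1 : t + 1])  # window of the last T days
        y.append(series[t + 1])              # next-day load to predict
    return np.array(X), np.array(y)

load = np.arange(10.0)               # toy daily load series
X, y = sliding_windows(load, T=3)    # first window [0, 1, 2] targets day 3

# Non-sequence vector x2 for one predicted day (illustrative values):
q_max, q_min = 12.5, 3.0             # forecast max/min temperature Q_max(s), Q_min(s)
is_workday = 1                       # I(s): 1 when the predicted day is a workday
is_nonworkday = 0                    # i(s): 0 when the predicted day is a workday
x2 = np.array([q_max, q_min, is_workday, is_nonworkday])
```

Each row of `X` plays the role of $x_1$ for one training sample, and `x2` carries the weather and day-type information that bypasses the convolutional layers and joins the BIGRU output before the fully connected layer.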
8. A TCN-BIGRU-based gas load prediction device is characterized by comprising:
the acquisition module is used for acquiring historical characteristic data;
the preprocessing module is used for screening the historical characteristic data, preprocessing the screened historical characteristic data and taking the preprocessed historical characteristic data as training data;
and the model training module is used for constructing a TCN-BIGRU model, inputting the training data into the TCN-BIGRU model, training the TCN-BIGRU model, and predicting the gas load of the next day by taking the trained TCN-BIGRU model as a gas load prediction model.
9. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; characterized in that

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
10. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-7.
CN202111658841.8A 2021-12-30 2021-12-30 TCN-BIGRU-based gas load prediction method and device Pending CN114399101A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111658841.8A CN114399101A (en) 2021-12-30 2021-12-30 TCN-BIGRU-based gas load prediction method and device


Publications (1)

Publication Number Publication Date
CN114399101A true CN114399101A (en) 2022-04-26

Family

ID=81228281

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111658841.8A Pending CN114399101A (en) 2021-12-30 2021-12-30 TCN-BIGRU-based gas load prediction method and device

Country Status (1)

Country Link
CN (1) CN114399101A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117454124A (en) * 2023-12-26 2024-01-26 山东大学 Ship motion prediction method and system based on deep learning

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111815035A (en) * 2020-06-22 2020-10-23 国网上海市电力公司 Short-term load prediction method fusing morphological clustering and TCN-Attention

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111815035A (en) * 2020-06-22 2020-10-23 国网上海市电力公司 Short-term load prediction method fusing morphological clustering and TCN-Attention

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LIANG LI, ET AL: "Temporal Attention Based TCN-BIGRU Model for Energy Time Series Forecasting", 2021 IEEE International Conference on Computer Science, Artificial Intelligence and Electronic Engineering (CSAIEE), pages 187-192 *
GUO Ling, et al.: "Short-term load forecasting method based on a TCN-GRU model", Electric Power Engineering Technology, vol. 40, no. 3, pages 66-71 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117454124A (en) * 2023-12-26 2024-01-26 山东大学 Ship motion prediction method and system based on deep learning
CN117454124B (en) * 2023-12-26 2024-03-29 山东大学 Ship motion prediction method and system based on deep learning

Similar Documents

Publication Publication Date Title
Ma et al. A hybrid attention-based deep learning approach for wind power prediction
Ding et al. Point and interval forecasting for wind speed based on linear component extraction
Bin et al. Regression model for appraisal of real estate using recurrent neural network and boosting tree
CN115587666A (en) Load prediction method and system based on seasonal trend decomposition and hybrid neural network
CN111985719A (en) Power load prediction method based on improved long-term and short-term memory network
CN111738331A (en) User classification method and device, computer-readable storage medium and electronic device
CN116485031A (en) Method, device, equipment and storage medium for predicting short-term power load
CN116757465A (en) Line risk assessment method and device based on double training weight distribution model
CN116489038A (en) Network traffic prediction method, device, equipment and medium
CN114399101A (en) TCN-BIGRU-based gas load prediction method and device
CN114266602A (en) Deep learning electricity price prediction method and device for multi-source data fusion of power internet of things
CN116885699A (en) Power load prediction method based on dual-attention mechanism
Zhang et al. Collaborative Forecasting and Analysis of Fish Catch in Hokkaido From Multiple Scales by Using Neural Network and ARIMA Model
CN115759751A (en) Enterprise risk prediction method and device, storage medium, electronic equipment and product
CN115545319A (en) Power grid short-term load prediction method based on meteorological similar day set
CN115759343A (en) E-LSTM-based user electric quantity prediction method and device
CN114861800A (en) Model training method, probability determination method, device, equipment, medium and product
Sun et al. Short-term stock price forecasting based on an svd-lstm model
CN113033903A (en) Fruit price prediction method, medium and equipment of LSTM model and seq2seq model
Liu Stock prediction using lstm and gru
Wang et al. A-ConvRNN: A Prediction Model for E-Commerce Page Views Based on Convolutional Neural Network and Attention Mechanism
CN115759373A (en) Gas daily load prediction method, device and equipment
US20230419128A1 (en) Methods for development of a machine learning system through layered gradient boosting
CN115689036A (en) Gas daily load prediction method based on Prophet-BIGRU
CN117875467A (en) Power system payload prediction method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination