WO2022241932A1 - Prediction method based on non-intrusive attention preprocessing process and BiLSTM model - Google Patents
Prediction method based on non-intrusive attention preprocessing process and BiLSTM model
- Publication number
- WO2022241932A1 (PCT/CN2021/105889)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- bilstm
- window
- model
- input
- Prior art date
- 2021-05-21
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/067—Enterprise or organisation modelling
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Abstract
A prediction method based on a non-intrusive attention preprocessing process and a BiLSTM model. A deep learning model enhanced by a non-intrusive attention mechanism is used for long-term energy consumption prediction; it consists of an attention-based preprocessing model and a general-purpose BiLSTM network, and is called AP-BiLSTM. The attention-based preprocessing model is realized as the dot product of a convolutional layer and a fully connected layer. These two layers perform feature mapping of the original input data, which is critical to improving the performance of the AP-BiLSTM method. In this way, both local and global associations in the long-term dependencies of the input data are enhanced. The method comprises the following steps: S1: performing a non-intrusive data preprocessing process; and S2: inputting the result of S1 into the BiLSTM network model to obtain the final prediction result.
Description
The invention relates to the technical field of power load forecasting, and specifically to predicting the power consumption of a future period from the power consumption data of an existing period using a prediction method that combines non-intrusive attention preprocessing with a BiLSTM model.
Power load forecasting predicts the future value of the power load from its past and present values. Forecasting the power load makes it possible to infer the development trend and likely state of the load and to improve economic and social benefits.
Current power load forecasting methods fall broadly into traditional methods and modern methods. Modern methods mainly include the following: forecasting methods based on convolutional neural network models, methods that combine an LSTM (long short-term memory) network model with time series to predict power system load, and neural network methods that convolve directly over multi-dimensional data such as power consumption, temperature, and time. All of these have achieved good results.
Among them, BiLSTM (bidirectional long short-term memory network model) is a recurrent neural network architecture that is well suited to prediction on time series data. The BiLSTM model addresses the vanishing gradient problem and captures long-term correlations. However, although BiLSTM performs well in time series forecasting tasks, the fundamental constraints of sequential computation remain. The attention mechanism can solve this problem: it enables the model to achieve better results on the input or output sequence regardless of the length of the input data. However, current models that combine attention with recurrent networks such as BiLSTM usually require modifying the internal structure of the BiLSTM, which increases the difficulty of model design. On this basis, the present invention uses the attention mechanism as a preprocessing process combined with the BiLSTM model, which both enhances the long-term memory ability of the model and avoids internal modification of the BiLSTM model.
Summary of the Invention
The invention discloses a prediction method based on a non-intrusive attention preprocessing process and a BiLSTM model. A deep learning model enhanced by a non-intrusive attention mechanism is used for long-term energy consumption prediction; it consists of a preprocessing model based on an attention mechanism and a general BiLSTM network, and is called AP-BiLSTM. The attention-based preprocessing model is realized as the dot product of a convolutional layer and a fully connected layer. These two layers perform feature mapping of the original input data, which is the key to improving the performance of the AP-BiLSTM method. In this way, both local and global associations in the long-term dependencies of the input data are enhanced.
Technical Solution
A new forecasting method for power load forecasting, based on a non-intrusive attention preprocessing process and a BiLSTM model, comprising the following steps:
S1: a non-intrusive data preprocessing process;
S2: input the result of S1 into the BiLSTM network model to obtain the final prediction result.
In S1: data preprocessing based on the non-intrusive attention mechanism. The output produced from the training data by this processing module has already learned the relationships between earlier and later data in the time series, yet it can still serve as a new input to be fed into the BiLSTM network. The process comprises the following steps (a code sketch of steps S1.1-S1.6 is given below, after S1.6):
S1.1: Express the original input data as x_1, x_2, …, x_m, where 1, 2, …, m index the time steps of the input series, so the total length is m and the data can be represented by a matrix of shape (m, 1). Preprocess the data by sliding-window sampling along the time series: with the window length denoted window_size, successively extract window data of shape (window_size, 1) as samples to construct a sample data set. Each preprocessed sample can then be expressed as x_1, x_2, …, x_window_size, and the total sample data can be represented by a matrix of shape (m, window_size, 1);
S1.2: Divide the data into a test set, a verification set, and a training set in the proportions 20%, 10%, and 70%;
S1.3: Apply a one-dimensional convolution with kernel size k to each input training sample of S1.2. The convolution can be expressed as cx_t = Σ_{n<k} w_c·x_{t+n} + b_c, where cx_t is the convolution result, x_t is the input data at time t, x_{t+n} is the n-th order neighbor of x_t with n < k (i.e., neighbors up to order k), and w_c and b_c are the parameters to be learned;
S1.4: Apply a fully connected computation to the input training data of S1.2, expressed as dx_t = w_d·x_t + b_d, where w_d and b_d are the parameters to be learned and dx_t is the output of this step;
S1.5: Form the weighted sum of the results of S1.3 and S1.4 as the dot product ax_t = cx_t [·] dx_t, where cx_t is the convolution result from S1.3, dx_t is the fully connected result from S1.4, and [·] denotes the dot product (ax_t is the notation adopted here for this attention output);
S1.6: Concatenate the output ax_t of S1.5 with the original input sequence x_1, x_2, …, x_window_size at each time step t. After this connection operation the data can be represented by a matrix of shape (window_size, 2), where window_size is the sliding-window size of the input data in S1.1;
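For illustration only, the following is a minimal Python sketch of the S1.1-S1.6 preprocessing, assuming NumPy and Keras (TensorFlow). The names make_windows and attention_preprocess, the "same" convolution padding, and the use of element-wise multiplication for the per-step dot product are choices made here for the sketch, not details fixed by the patent.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

def make_windows(series, window_size=50):
    """S1.1: slide a window of length window_size along a 1-D series and
    stack the windows into a sample tensor of shape (num, window_size, 1)."""
    windows = np.stack([series[i:i + window_size]
                        for i in range(len(series) - window_size)])
    return windows[..., np.newaxis]

def attention_preprocess(window_size=50, kernel_size=3):
    """S1.3-S1.6 as one small Keras model: a convolution branch (local
    context) combined per step with a dense branch (global mapping),
    then concatenated with the raw input."""
    x = layers.Input(shape=(window_size, 1))
    cx = layers.Conv1D(1, kernel_size, padding="same")(x)  # S1.3: cx_t ('same' padding assumed)
    dx = layers.Dense(1)(x)                                # S1.4: dx_t = w_d*x_t + b_d
    ax = layers.Multiply()([cx, dx])                       # S1.5: ax_t = cx_t [.] dx_t
    out = layers.Concatenate(axis=-1)([ax, x])             # S1.6: shape (window_size, 2)
    return tf.keras.Model(x, out, name="attention_preprocess")
```

Under these assumptions, attention_preprocess() maps a batch of shape (num, window_size, 1) to (num, window_size, 2), matching the (window_size, 2) shape stated in S1.6.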
In S2: the BiLSTM network model has an Encoder-Decoder architecture; the Encoder consists of a BiLSTM network layer, and the Decoder consists of an LSTM network layer and a Dense network layer. The process comprises the following steps (a code sketch of this model is given below, after S2.5):
S2.1: Input the result of S1.6 into the BiLSTM network layer. The forward propagation layer of the BiLSTM has 15 neurons and the backward propagation layer has 15 neurons; with the number of neurons denoted units, the output data can be represented by a matrix of shape (m, window_size, 30), where m is the number of samples;
S2.2: Express the output of S2.1 as (y′_t1, y′_t2, …, y′_units) and input it into the LSTM network layer;
S2.3: Express the output of S2.2 as (y″_t1, y″_t2, …, y″_window_size-1) and input it into the Dense network layer, which outputs the final prediction result output, a matrix of shape (m, 1);
S2.4: Through the network model described in S1 and S2.1-S2.3, the output prediction sequence output can be expanded as (y_1, y_2, …, y_m), where y_i (i = 1…m) is the prediction for the i-th sample. Compare each y_i with the true value and update the network parameters by backpropagation, using the mean squared error (MSE) loss function with the learning rate set to 0.01. Use the model's performance on the verification set for early stopping, repeat the above steps, and keep adjusting the model parameters to obtain good accuracy.
S2.5: Use steps S1 and S2.1-S2.4 to obtain the optimized final network model, test its performance on the test set, and finally apply it in actual prediction work.
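A minimal sketch of the S2 Encoder-Decoder under the stated sizes, continuing the Keras sketch above. The 10-unit decoder LSTM is taken from step S8 of the embodiment below, and the Adam optimizer is an assumption; the patent fixes only the MSE loss and the 0.01 learning rate.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_ap_bilstm(window_size=50):
    """S2: Encoder-Decoder head on the preprocessed (window_size, 2) input.
    Encoder: BiLSTM with 15 units per direction (feature size 30, S2.1).
    Decoder: LSTM then Dense, producing one value per sample (S2.2-S2.3)."""
    inp = layers.Input(shape=(window_size, 2))
    enc = layers.Bidirectional(layers.LSTM(15, return_sequences=True))(inp)  # (m, window_size, 30)
    dec = layers.LSTM(10)(enc)   # last state only: (m, 10); 10 units per S8 below
    out = layers.Dense(1)(dec)   # final prediction: (m, 1)
    model = tf.keras.Model(inp, out, name="ap_bilstm")
    # S2.4: MSE loss, learning rate 0.01 (optimizer choice assumed here)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
                  loss="mse")
    return model
```

Early stopping on the verification set (S2.4) can then be expressed with tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=10), matching the 10-round criterion of step S10 below.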
Beneficial effects of the invention: the method adopts a new non-intrusive attention preprocessing process. By extracting the attention operation from the model's interior into the preprocessing stage, both the local and global associations in the long-term dependencies of the input data are enhanced. At the same time, this preprocessing process can be applied to most deep learning networks and avoids modifying their network structures. The method of the present invention therefore still performs well in long-term forecasting.
Fig. 1 is a flow chart of the prediction method of the present invention.
Fig. 2 is the overall model architecture diagram of the present invention.
Fig. 3 is a structural diagram of the non-intrusive attention preprocessing module of the present invention (the innovative part).
Fig. 4 is a visualization of the data used in the example of the present invention.
Fig. 5 shows the prediction-result metrics of the method of the present invention and four comparison methods.
To make the purpose and effect of the present invention clearer, the integrated model of the present invention is described in detail below, taking the prediction method based on the non-intrusive attention preprocessing process and the BiLSTM model as an example and using 1667 daily total electricity load records for the United States from July 1, 2015 to January 23, 2020 (an end-to-end code sketch assembling these steps is given after S11):
S1: Express the original input data as x_1, x_2, …, x_m, where 1, 2, …, m index the time steps of the input series; the total length is m and the data shape is (m, 1). Preprocess the data by sliding-window sampling along the time series with window length window_size, successively extracting window data of shape (window_size, 1) as samples to construct a sample data set of shape (m, window_size, 1). In the present invention the input time series length is 1667 and window_size is set to 50, so the preprocessed sample data shape is (1667, 50, 1);
S2: Divide the data into a test set, a verification set, and a training set in the proportions 20%, 10%, and 70%. The test, verification, and training sets then contain 333, 167, and 1167 samples respectively, with shapes (333, 50, 1), (167, 50, 1), and (1167, 50, 1);
S3: Apply a one-dimensional convolution with kernel size k to the input training data of S2, expressed as cx_t = Σ_{n<k} w_c·x_{t+n} + b_c, where x_t is the input data at time t, x_{t+n} is the n-th order neighbor of x_t with n < k, and w_c and b_c are the parameters to be learned. In the present invention the convolution kernel size is set to 3;
S4: Apply a fully connected computation to the input training data of S2, expressed as dx_t = w_d·x_t + b_d, where w_d and b_d are the parameters to be learned;
S5: Form the weighted sum of the results of S3 and S4 as the dot product ax_t = cx_t [·] dx_t, where cx_t is the convolution result from S3 and dx_t is the fully connected result from S4;
S6: Concatenate the output of S5 with the original input data. Each data sample can be expressed as x_1, x_2, …, x_50, and the operation concatenates ax_t (the result computed in S5) with the original input x_t at each time step; the data shape after this connection operation is (50, 2);
S7: Input the result of S6 into the BiLSTM network layer. The forward propagation layer of the BiLSTM has 15 neurons and the backward propagation layer has 15 neurons, so the output data shape is (1167, 50, 30);
S8: Input the output of S7 into the LSTM network layer, which is set to 10 neurons; the output shape is (1167, 10);
S9: Input the output of S8 into the Dense network layer, which outputs the final prediction result with shape (1167, 1);
S10: Through the network model described in S1-S9, output the prediction sequence (y_1, y_2, …, y_1167), where y_i is the prediction for the i-th training sample. Compare each y_i with the true value and update the network parameters by backpropagation. Meanwhile, validate on the verification set: when the accuracy on the verification set has not improved for 10 consecutive rounds, stop training. Repeat the above steps and keep adjusting the model parameters to obtain good accuracy.
S11: Use the final network model obtained in steps S1-S10, test its final performance on the test data set, and apply it in actual prediction work.
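Assembling the embodiment end to end, a hedged sketch with the concrete numbers above (series length 1667, window 50, 70/10/20 split, kernel size 3) might look as follows. It reuses make_windows, attention_preprocess, and build_ap_bilstm from the earlier sketches; load_series is a hypothetical data loader the patent does not specify, and the target construction, split order, and epoch count are assumptions made here.

```python
import numpy as np
import tensorflow as tf

series = np.asarray(load_series(), dtype="float32")  # hypothetical: 1667 daily load values
X = make_windows(series, window_size=50)   # note: 1617 complete windows result, although
                                           # S1 above writes the tensor as (1667, 50, 1)
y = series[50:]                            # next-day target per window (assumed)

n = len(X)                                 # chronological 70/10/20 split (order assumed)
X_tr, y_tr = X[:int(0.7 * n)], y[:int(0.7 * n)]
X_va, y_va = X[int(0.7 * n):int(0.8 * n)], y[int(0.7 * n):int(0.8 * n)]
X_te, y_te = X[int(0.8 * n):], y[int(0.8 * n):]

pre = attention_preprocess(window_size=50, kernel_size=3)   # S3-S6
head = build_ap_bilstm(window_size=50)                      # S7-S9
full = tf.keras.Model(pre.input, head(pre.output))          # chained pipeline
full.compile(optimizer=tf.keras.optimizers.Adam(0.01), loss="mse")

early = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                         restore_best_weights=True)  # S10
full.fit(X_tr, y_tr, validation_data=(X_va, y_va),
         epochs=200, callbacks=[early])                     # epoch count assumed
print("test MSE:", full.evaluate(X_te, y_te))               # S11
```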
Claims (3)
- A new forecasting method for power load forecasting based on a non-intrusive attention preprocessing process and a BiLSTM model, characterized in that it comprises the following steps: S1: a non-intrusive data preprocessing process; S2: input the result of S1 into the BiLSTM network model to obtain the final prediction result.
- The new forecasting method for power load forecasting based on a non-intrusive attention preprocessing process and a BiLSTM model according to claim 1, characterized in that, in S1, the data preprocessing process based on the non-intrusive attention mechanism comprises the following steps: S1.1: express the original input data as x_1, x_2, …, x_m, where 1, 2, …, m index the time steps of the input series, so the total length is m and the data can be represented by a matrix of shape (m, 1); preprocess the data by sliding-window sampling along the time series with window length window_size, successively extracting window data of shape (window_size, 1) as samples to construct a sample data set, so that each preprocessed sample can be expressed as x_1, x_2, …, x_window_size and the total sample data can be represented by a matrix of shape (m, window_size, 1); S1.2: divide the data into a test set, a verification set, and a training set; S1.3: apply a one-dimensional convolution with kernel size k to each input training sample of S1.2, expressed as cx_t = Σ_{n<k} w_c·x_{t+n} + b_c, where cx_t is the convolution result, x_t is the input data at time t, x_{t+n} is the n-th order neighbor of x_t with n < k, and w_c and b_c are the parameters to be learned; S1.4: apply a fully connected computation to the input training data of S1.2, expressed as dx_t = w_d·x_t + b_d, where w_d and b_d are the parameters to be learned and dx_t is the output of this step; S1.5: form the weighted sum of the results of S1.3 and S1.4 as the dot product ax_t = cx_t [·] dx_t, where cx_t is the convolution result from S1.3, dx_t is the fully connected result from S1.4, and [·] denotes the dot product; S1.6: concatenate the output ax_t of S1.5 with the original input sequence x_1, x_2, …, x_window_size at each time step; the connected data can be represented by a matrix of shape (window_size, 2), where window_size is the sliding-window size of the input data in S1.1.
- The new forecasting method for power load forecasting based on a non-intrusive attention preprocessing process and a BiLSTM model according to claim 1, characterized in that, in S2, the BiLSTM network model has an Encoder-Decoder architecture, the Encoder consisting of a BiLSTM network layer and the Decoder consisting of an LSTM network layer and a Dense network layer, and that S2 comprises the following steps: S2.1: input the result of S1.6 into the BiLSTM network layer, whose forward propagation layer has 15 neurons and whose backward propagation layer has 15 neurons; with the number of neurons denoted units, the output data can be represented by a matrix of shape (m, window_size, 30), where m is the number of samples; S2.2: express the output of S2.1 as (y′_t1, y′_t2, …, y′_units) and input it into the LSTM network layer; S2.3: express the output of S2.2 as (y″_t1, y″_t2, …, y″_window_size-1) and input it into the Dense network layer, which outputs the final prediction result output, a matrix of shape (m, 1); S2.4: through the network model of S1 and S2.1-S2.3, the output prediction sequence output can be expanded as (y_1, y_2, …, y_m), where y_i (i = 1…m) is the prediction for the i-th sample; compare each y_i with the true value and update the network parameters by backpropagation, using the mean squared error (MSE) loss function with the learning rate set to 0.01; use the model's performance on the verification set for early stopping, repeat the above steps, and keep adjusting the model parameters to obtain good accuracy; S2.5: use steps S1 and S2.1-S2.4 to obtain the optimized final network model, test its performance on the test set, and finally apply it in actual prediction work.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110557297.1 | 2021-05-21 | ||
CN202110557297.1A CN113177666A (en) | 2021-05-21 | 2021-05-21 | Prediction method based on non-invasive attention preprocessing process and BiLSTM model
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022241932A1 (en) |
Family
ID=76929640
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/105889 WO2022241932A1 (en) | 2021-05-21 | 2021-07-13 | Prediction method based on non-intrusive attention preprocessing process and bilstm model |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113177666A (en) |
WO (1) | WO2022241932A1 (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113553988A (en) * | 2021-08-03 | 2021-10-26 | 同济大学 | Analog signal identification method based on complex neural network and attention mechanism |
CN113890024B (en) * | 2021-09-30 | 2024-08-13 | 清华大学 | Non-invasive intelligent load decomposition and optimization control method |
CN114676787A (en) * | 2022-04-08 | 2022-06-28 | 浙江大学 | Non-invasive load identification method based on BilSTM-CRF algorithm |
CN115100466A (en) * | 2022-06-22 | 2022-09-23 | 国网江苏省电力有限公司信息通信分公司 | Non-invasive load monitoring method, device and medium |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11586880B2 (en) * | 2018-08-28 | 2023-02-21 | Beijing Jingdong Shangke Information Technology Co., Ltd. | System and method for multi-horizon time series forecasting with dynamic temporal context learning |
CN110889545A (en) * | 2019-11-20 | 2020-03-17 | 国网重庆市电力公司电力科学研究院 | Power load prediction method and device and readable storage medium |
CN111191841B (en) * | 2019-12-30 | 2020-08-25 | 润联软件系统(深圳)有限公司 | Power load prediction method and device, computer equipment and storage medium |
CN111652225B (en) * | 2020-04-29 | 2024-02-27 | 杭州未名信科科技有限公司 | Non-invasive camera shooting and reading method and system based on deep learning |
CN112819256A (en) * | 2021-03-08 | 2021-05-18 | 重庆邮电大学 | Convolution time sequence room price prediction method based on attention mechanism |
2021
- 2021-05-21 CN CN202110557297.1A patent/CN113177666A/en active Pending
- 2021-07-13 WO PCT/CN2021/105889 patent/WO2022241932A1/en active Application Filing
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200073937A1 (en) * | 2018-08-30 | 2020-03-05 | International Business Machines Corporation | Multi-aspect sentiment analysis by collaborative attention allocation |
CN109685314A (en) * | 2018-11-20 | 2019-04-26 | 中国电力科学研究院有限公司 | A kind of non-intruding load decomposition method and system based on shot and long term memory network |
CN112529283A (en) * | 2020-12-04 | 2021-03-19 | 天津天大求实电力新技术股份有限公司 | Comprehensive energy system short-term load prediction method based on attention mechanism |
Non-Patent Citations (1)
Title |
---|
SHUNMIAO ZHANG, CHEN MINGLONG: "Non-intrusive load decomposition based on attention mechanism and ConvBiLSTM", JOURNAL OF FUJIAN UNIVERSITY OF TECHNOLOGY, vol. 18, no. 4, 25 August 2020 (2020-08-25), pages 336 - 342, XP093005794 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116108350A (en) * | 2023-01-06 | 2023-05-12 | 中南大学 | Non-invasive electrical appliance identification method and system based on multitasking learning |
CN116108350B (en) * | 2023-01-06 | 2023-10-20 | 中南大学 | Non-invasive electrical appliance identification method and system based on multitasking learning |
CN117350158A (en) * | 2023-10-13 | 2024-01-05 | 湖北华中电力科技开发有限责任公司 | Electric power short-term load prediction method by mixing RetNet and AM-BiLSTM algorithm |
CN117674098A (en) * | 2023-11-29 | 2024-03-08 | 国网浙江省电力有限公司丽水供电公司 | Multi-element load space-time probability distribution prediction method and system for different permeability |
CN117674098B (en) * | 2023-11-29 | 2024-06-07 | 国网浙江省电力有限公司丽水供电公司 | Multi-element load space-time probability distribution prediction method and system for different permeability |
CN118568681A (en) * | 2024-07-22 | 2024-08-30 | 齐鲁工业大学(山东省科学院) | Deep learning-based refrigeration system energy consumption prediction method and system |
Also Published As
Publication number | Publication date |
---|---|
CN113177666A (en) | 2021-07-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022241932A1 (en) | Prediction method based on non-intrusive attention preprocessing process and BiLSTM model | |
WO2020024319A1 (en) | Convolutional neural network based multi-point regression forecasting model for traffic flow forecasting | |
CN109035779B (en) | DenseNet-based expressway traffic flow prediction method | |
CN110909926A (en) | TCN-LSTM-based solar photovoltaic power generation prediction method | |
CN111860982A (en) | Wind power plant short-term wind power prediction method based on VMD-FCM-GRU | |
Wang et al. | OGRU: An optimized gated recurrent unit neural network | |
CN110428082B (en) | Water quality prediction method based on attention neural network | |
CN107562784A (en) | Short text classification method based on ResLCNN models | |
CN110751318A (en) | IPSO-LSTM-based ultra-short-term power load prediction method | |
CN113673242A (en) | Text classification method based on K-neighborhood node algorithm and comparative learning | |
CN117094451B (en) | Power consumption prediction method, device and terminal | |
CN115051929B (en) | Network fault prediction method and device based on self-supervision target perception neural network | |
CN113112791A (en) | Traffic flow prediction method based on sliding window long-and-short term memory network | |
CN109598002A (en) | Neural machine translation method and system based on bidirectional circulating neural network | |
CN113935489A (en) | Variational quantum model TFQ-VQA based on quantum neural network and two-stage optimization method thereof | |
CN113836783A (en) | Digital regression model modeling method for main beam temperature-induced deflection monitoring reference value of cable-stayed bridge | |
CN113157919A (en) | Sentence text aspect level emotion classification method and system | |
Ying et al. | Processor free time forecasting based on convolutional neural network | |
Liu et al. | Prediction of Temperature Time Series Based on Wavelet Transform and Support Vector Machine. | |
CN118134284A (en) | Deep learning wind power prediction method based on multi-stage attention mechanism | |
CN116543289B (en) | Image description method based on encoder-decoder and Bi-LSTM attention model | |
CN105787265A (en) | Atomic spinning top random error modeling method based on comprehensive integration weighting method | |
CN116993185A (en) | Time sequence prediction method, device, equipment and storage medium | |
CN116843012A (en) | Time sequence prediction method integrating personalized context and time domain dynamic characteristics | |
CN112232570A (en) | Forward active total electric quantity prediction method and device and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21940381; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 21940381; Country of ref document: EP; Kind code of ref document: A1 |