WO2022241932A1 - Prediction method based on non-intrusive attention preprocessing process and BiLSTM model - Google Patents

Prediction method based on non-intrusive attention preprocessing process and BiLSTM model

Info

Publication number
WO2022241932A1
WO2022241932A1 · PCT/CN2021/105889 · CN2021105889W
Authority
WO
WIPO (PCT)
Prior art keywords
data
bilstm
window
model
input
Prior art date
Application number
PCT/CN2021/105889
Other languages
French (fr)
Chinese (zh)
Inventor
史清江
李丹丹
曾歆
Original Assignee
同济大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 同济大学 filed Critical 同济大学
Publication of WO2022241932A1 publication Critical patent/WO2022241932A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/044: Recurrent networks, e.g. Hopfield networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/06: Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/067: Enterprise or organisation modelling
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06: Energy or water supply

Definitions

  • The invention relates to the technical field of power load forecasting; specifically, based on electricity consumption data from an existing time period, it uses a prediction method combining non-intrusive attention preprocessing with a BiLSTM model to predict electricity consumption data for a future time period.
  • Power load forecasting predicts the future value of the power load from its past and present values. Such forecasts reveal the load's development trend and the states it may reach, improving economic and social benefits.
  • Current power load forecasting methods fall broadly into traditional and modern methods. Modern methods mainly include the following: forecasting methods based on convolutional neural network models, methods that combine an LSTM (long short-term memory) network with time series to predict power system load, and neural network methods that convolve directly over multi-dimensional data such as power consumption, temperature, and time; all have achieved good results.
  • LSTM: long short-term memory network model
  • BiLSTM: bi-directional long short-term memory network model
  • The BiLSTM model mitigates the vanishing-gradient problem and captures long-term correlations.
  • Although BiLSTM performs well in time-series forecasting tasks, the fundamental constraints of sequential computation still exist.
  • An attention mechanism can address this problem, enabling the model to achieve better results on the input or output sequence regardless of the input length.
  • The present invention uses the attention mechanism as a preprocessing step combined with the BiLSTM model, which both strengthens the model's long-term memory and avoids modifying the interior of the BiLSTM model.
  • The invention discloses a prediction method based on a non-intrusive attention preprocessing process and a BiLSTM model.
  • A deep learning model enhanced by a non-intrusive attention mechanism is used for long-term energy consumption prediction.
  • It consists of a preprocessing model based on an attention mechanism and a general BiLSTM network, and is called AP-BiLSTM.
  • The attention-based preprocessing model is realized as the dot product of a convolutional layer and a fully connected layer.
  • A new forecasting method for power load forecasting, based on a non-intrusive attention preprocessing process and a BiLSTM model, includes the following steps:
  • S1: Data preprocessing based on the non-intrusive attention mechanism.
  • S2: Input the result of S1 into the BiLSTM network model to obtain the final prediction result.
  • The output produced once the training data passes through the preprocessing module has already learned the relationships among earlier and later points of the time series, yet it can still be fed into the BiLSTM network as a new input. The preprocessing specifically includes the following steps:
  • S1.2: Split the data into a 20% test set, a 10% validation set, and a 70% training set;
  • The BiLSTM network model has an Encoder-Decoder architecture: the Encoder consists of a BiLSTM network layer, and the Decoder consists of an LSTM network layer and a Dense network layer. It specifically includes the following steps:
  • S2.1: Input the result of S1.6 into the BiLSTM network layer.
  • The forward propagation layer of the BiLSTM has 15 neurons, and the backward propagation layer has 15 neurons.
  • With the neuron count denoted by units, the output data can be represented by a matrix of shape (m, window_size, 30), where m is the number of samples;
  • S2.2: Denote the output of S2.1 as $(y'_{t1}, y'_{t2}, \ldots, y'_{units})$ and input it into the LSTM network layer;
  • S2.3: Denote the output of S2.2 as $(y''_{t1}, y''_{t2}, \ldots, y''_{window\_size-1})$ and input it into the Dense network layer, which outputs the final prediction result, a matrix of shape (m, 1);
  • MSE: mean squared error
  • S2.5: Use steps S1 and S2.1-S2.4 to obtain the optimized final network model, test its performance on the test set, and finally apply it to actual prediction work.
  • The method of the present invention adopts a new non-intrusive attention preprocessing process; by extracting the attention operation out of the model's interior into the preprocessing stage, both the local and global associations in the long-term dependencies of the input data are enhanced.
  • This preprocessing process can be applied to most deep learning networks and avoids modifying their network structures. The method of the present invention therefore still performs well in long-term forecasting.
  • Fig. 1 is a flow chart of the prediction method of the present invention.
  • Fig. 2 is an overall model architecture diagram of the present invention.
  • Fig. 3 is a structural diagram of the non-intrusive attention preprocessing module of the present invention (the innovative part).
  • Fig. 4 is a data visualization diagram used in the examples of the present invention.
  • Fig. 5 is an indicator diagram of the prediction results of the method of the present invention and four other comparison methods.
  • S1: Denote the original input data as $x_1, x_2, \ldots, x_m$, where the indices 1, 2, ..., m mark the time steps of the input series; the total length is m and the data shape is (m, 1).
  • The data are preprocessed by sliding-window sampling along the time series. With the window length denoted window_size, window data of shape (window_size, 1) are cut out in turn as samples to build a sample data set, giving preprocessed sample data of shape (m, window_size, 1). In the present invention the input time series has length 1667 and window_size is set to 50, so the preprocessed sample data have shape (1667, 50, 1);
  • The test, validation, and training sets contain 333, 167, and 1167 samples respectively, with shapes (333, 50, 1), (167, 50, 1), and (1167, 50, 1);
  • S3: Apply a one-dimensional convolution with kernel size k to the input training data of S2. The process can be expressed as $cx_t = w_c \cdot (x_t, x_{t+1}, \ldots, x_{t+k-1}) + b_c$, where $x_t$ is the input at time t, $x_{t+n}$ with $n < k$ is the n-th-order neighbor of $x_t$, and $w_c$ and $b_c$ are parameters to be learned.
  • In the present invention the convolution kernel size is set to 3;
  • S6: Concatenate the output of S5 with the original input data. Each data sample can be written as $x_1, x_2, \ldots, x_{50}$, and the process can be expressed as $\hat{x}_t = [\tilde{x}_t, x_t]$, where $\tilde{x}_t$ is the result computed in S5 and $x_t$ is the original input; after concatenation the data shape is (50, 2);

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Primary Health Care (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Educational Administration (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A prediction method based on a non-intrusive attention preprocessing process and a BiLSTM model. A deep learning model enhanced by a non-intrusive attention mechanism is used for long-term energy consumption prediction; it consists of an attention-mechanism-based preprocessing model and a general BiLSTM network, and is called AP-BiLSTM. The attention-mechanism-based preprocessing model is realized as the dot product of a convolutional layer and a fully connected layer. These two layers perform feature mapping of the original input data, which is critical to improving the performance of the AP-BiLSTM method. In this way, both local and global associations in the long-term dependencies of the input data are enhanced. The method comprises the following steps: S1: performing a non-intrusive data preprocessing process; and S2: inputting the result of S1 into the BiLSTM network model to obtain the final prediction result.

Description

A prediction method based on a non-intrusive attention preprocessing process and a BiLSTM model

Technical Field
The invention relates to the technical field of power load forecasting; specifically, based on electricity consumption data from an existing time period, it uses a prediction method combining non-intrusive attention preprocessing with a BiLSTM model to predict electricity consumption data for a future time period.
Background Art
Power load forecasting predicts the future value of the power load from its past and present values. Such forecasts reveal the load's development trend and the states it may reach, improving economic and social benefits.
Current power load forecasting methods fall broadly into traditional and modern methods. Modern methods mainly include the following: forecasting methods based on convolutional neural network models, methods that combine an LSTM (long short-term memory) network with time series to predict power system load, and neural network methods that convolve directly over multi-dimensional data such as power consumption, temperature, and time; all have achieved good results.
Among these, BiLSTM (bi-directional long short-term memory) is an artificial recurrent neural network structure well suited to prediction from time-series data. The BiLSTM model mitigates the vanishing-gradient problem and captures long-term correlations. However, although BiLSTM performs well in time-series forecasting tasks, the fundamental constraints of sequential computation still exist. An attention mechanism can address this problem, enabling the model to achieve better results on the input or output sequence regardless of the input length. Current models that combine attention with recurrent networks such as BiLSTM, however, usually require modifying the internal structure of the BiLSTM, which complicates model design. On this basis, the present invention treats the attention mechanism as a preprocessing process combined with the BiLSTM model, which both strengthens the model's long-term memory and avoids modifying the interior of the BiLSTM model.
Contents of the Invention
The invention discloses a prediction method based on a non-intrusive attention preprocessing process and a BiLSTM model. A deep learning model enhanced by a non-intrusive attention mechanism is used for long-term energy consumption prediction; it consists of a preprocessing model based on an attention mechanism and a general BiLSTM network, and is called AP-BiLSTM. The attention-based preprocessing model is realized as the dot product of a convolutional layer and a fully connected layer; these two layers perform feature mapping of the original input data, which is the key to improving the performance of the AP-BiLSTM method. In this way, both local and global associations in the long-term dependencies of the input data are enhanced.
Technical Solution
A new forecasting method for power load forecasting, based on a non-intrusive attention preprocessing process and a BiLSTM model, includes the following steps:
S1: a non-intrusive data preprocessing process;
S2: input the result of S1 into the BiLSTM network model to obtain the final prediction result.
In S1: data preprocessing based on the non-intrusive attention mechanism. The output produced once the training data passes through this module has already learned the relationships among earlier and later points of the time series, yet it can still be fed into the BiLSTM network as a new input. It specifically includes the following steps:
S1.1: Denote the original input data as $x_1, x_2, \ldots, x_m$, where the indices 1, 2, ..., m mark the time steps of the input series; the total length is m, so the data can be represented by a matrix of shape (m, 1). The data are preprocessed by sliding-window sampling along the time series: with the window length denoted window_size, window data of shape (window_size, 1) are cut out in turn as samples to build a sample data set. Each preprocessed sample can then be written as $x_1, x_2, \ldots, x_{window\_size}$, and the full sample data can be represented by a matrix of shape (m, window_size, 1);
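For illustration, the sliding-window sampling of S1.1 can be sketched in Python as follows (NumPy assumed; the helper name make_windows is hypothetical, not from the patent). Note that a strict sliding window over a series of length m yields m - window_size + 1 windows, so the stated shape (m, window_size, 1) presumes padding or an inclusive count of windows.

```python
import numpy as np

def make_windows(series, window_size=50):
    """Cut sliding windows of length window_size out of a series of
    shape (m, 1) and stack them into a sample tensor."""
    m = len(series)
    # start index t yields the sample series[t : t + window_size]
    windows = [series[t:t + window_size] for t in range(m - window_size + 1)]
    return np.stack(windows)  # shape: (m - window_size + 1, window_size, 1)

load = np.random.rand(1667, 1)   # stand-in for the daily load series
samples = make_windows(load, 50)
print(samples.shape)             # (1618, 50, 1)
```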
S1.2: Split the data into a 20% test set, a 10% validation set, and a 70% training set;
S1.3: Apply a one-dimensional convolution to each input training sample of S1.2, with kernel size k. The convolution process can be expressed as
$$cx_t = w_c \cdot (x_t, x_{t+1}, \ldots, x_{t+k-1}) + b_c$$
where $cx_t$ is the convolution result, $x_t$ is the input at time t, $x_{t+n}$ with $n < k$ is the n-th-order neighbor of $x_t$, and $w_c$ and $b_c$ are parameters to be learned;
S1.4: Apply a fully connected computation to the input training data of S1.2. The computation can be expressed as $dx_t = \omega_d x_t + b_d$, where $\omega_d$ and $b_d$ are parameters to be learned and $dx_t$ is the output of this step;
S1.5: Form the weighted combination of the results of S1.3 and S1.4. The computation can be expressed as
$$\tilde{x}_t = [cx_t \cdot dx_t]$$
where $cx_t$ is the convolution result of S1.3, $dx_t$ is the fully connected result of S1.4, and $[\cdot]$ denotes the dot product;
S1.6: Concatenate the output of S1.5, $\tilde{x}_t$, with the original input sequence $x_1, x_2, \ldots, x_{window\_size}$. The process can be expressed as
$$\hat{x}_t = [\tilde{x}_t, x_t]$$
where $\tilde{x}_t$ is the result computed in S1.5 and $x_t$ is the original input. After concatenation the data can be represented by a matrix of shape (window_size, 2), where window_size is the sliding-window size from S1.1;
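A minimal Keras sketch of the preprocessing module of steps S1.3-S1.6 might look as follows. This is an interpretation under stated assumptions rather than the patent's reference implementation: the elementwise product stands in for the dot product of S1.5, "same" padding keeps the window length constant, and the function and layer names are illustrative.

```python
import tensorflow as tf
from tensorflow.keras import layers

def attention_preprocess(window_size=50, kernel_size=3):
    """Non-intrusive attention preprocessing (S1.3-S1.6): convolutional
    and fully connected feature maps of the raw window are combined by
    an elementwise product and concatenated with the original input."""
    x = layers.Input(shape=(window_size, 1))               # raw window
    cx = layers.Conv1D(1, kernel_size, padding="same")(x)  # S1.3: 1-D convolution
    dx = layers.Dense(1)(x)                                # S1.4: fully connected map
    ax = layers.Multiply()([cx, dx])                       # S1.5: product of the two maps
    out = layers.Concatenate(axis=-1)([ax, x])             # S1.6: shape (window_size, 2)
    return tf.keras.Model(x, out, name="attention_preprocess")

preprocess = attention_preprocess()
preprocess.summary()
```

Because the module sits entirely in front of the forecasting network, the BiLSTM that follows needs no internal changes, which is what makes the preprocessing "non-intrusive".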
In S2: the BiLSTM network model has an Encoder-Decoder architecture. The Encoder consists of a BiLSTM network layer, and the Decoder consists of an LSTM network layer and a Dense network layer. It specifically includes the following steps:
S2.1: Input the result of S1.6 into the BiLSTM network layer. The forward propagation layer of the BiLSTM has 15 neurons and the backward propagation layer has 15 neurons; with the neuron count denoted by units, the output data can be represented by a matrix of shape (m, window_size, 30), where m is the number of samples;
S2.2: Denote the output of S2.1 as $(y'_{t1}, y'_{t2}, \ldots, y'_{units})$ and input it into the LSTM network layer;
S2.3: Denote the output of S2.2 as $(y''_{t1}, y''_{t2}, \ldots, y''_{window\_size-1})$ and input it into the Dense network layer, which outputs the final prediction result, a matrix of shape (m, 1);
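The Encoder-Decoder network of S2.1-S2.3 could be sketched as follows (Keras assumed; the 10-unit decoder LSTM is taken from step S8 of the embodiment, and build_ap_bilstm is a hypothetical name).

```python
from tensorflow.keras import layers, models

def build_ap_bilstm(window_size=50, units=15):
    """Encoder-Decoder network of S2.1-S2.3: a BiLSTM encoder
    (15 forward + 15 backward neurons), an LSTM decoder, and a
    Dense layer producing one predicted value per sample."""
    inp = layers.Input(shape=(window_size, 2))            # output of the preprocessing module
    enc = layers.Bidirectional(
        layers.LSTM(units, return_sequences=True))(inp)   # (window_size, 2*units) = (50, 30)
    dec = layers.LSTM(10)(enc)                            # decoder LSTM; 10 units as in step S8
    out = layers.Dense(1)(dec)                            # final prediction, shape (m, 1)
    return models.Model(inp, out, name="AP_BiLSTM")

model = build_ap_bilstm()
```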
S2.4: Through the network model of S1 and S2.1-S2.3, the output prediction sequence can be expanded as $(y_1, y_2, \ldots, y_i)$, where $y_i$, i = 1, ..., m, is the prediction for the i-th sample. Each $y_i$ is compared with the true value and the network parameters are updated by backpropagation, with the mean squared error (MSE) loss function and a learning rate of 0.01. The model's performance on the validation set is used for early stopping; the above steps are repeated and the model parameters are adjusted continually to obtain good accuracy.
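Continuing from the model sketch above, training as described in S2.4 could be sketched as below, with MSE loss, a learning rate of 0.01, and early stopping on the validation set. The optimizer and the variable names are assumptions (the patent does not name an optimizer), and the patience of 10 epochs follows step S10 of the embodiment.

```python
from tensorflow.keras import optimizers, callbacks

# MSE loss and learning rate 0.01 as in S2.4; Adam is an assumption.
model.compile(optimizer=optimizers.Adam(learning_rate=0.01), loss="mse")

# Early stopping driven by the validation set; patience of 10 epochs
# follows step S10 of the embodiment.
early_stop = callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                     restore_best_weights=True)

# x_train/y_train and x_val/y_val are assumed to hold the preprocessed
# input windows and their next-step load targets.
model.fit(x_train, y_train,
          validation_data=(x_val, y_val),
          epochs=200, callbacks=[early_stop])
```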
S2.5: Use steps S1 and S2.1-S2.4 to obtain the optimized final network model, test its performance on the test set, and finally apply it to actual prediction work.
Beneficial effects of the invention: the method adopts a new non-intrusive attention preprocessing process; by extracting the attention operation out of the model's interior into the preprocessing stage, both the local and global associations in the long-term dependencies of the input data are enhanced. At the same time, this preprocessing process can be applied to most deep learning networks and avoids modifying their network structures. The method of the present invention therefore still performs well in long-term forecasting.
Description of the Drawings
Fig. 1 is a flow chart of the prediction method of the present invention.
Fig. 2 is the overall model architecture diagram of the present invention.
Fig. 3 is a structural diagram of the non-intrusive attention preprocessing module of the present invention (the innovative part).
Fig. 4 is a visualization of the data used in the examples of the present invention.
Fig. 5 is an indicator diagram of the prediction results of the method of the present invention and four other comparison methods.
Detailed Description of the Embodiments
To make the purpose and effects of the present invention clearer, the integrated model of the invention is described in detail below, taking the prediction method based on a non-intrusive attention preprocessing process and a BiLSTM model as an example and using 1667 records of daily total electricity load for the whole United States from July 1, 2015 to January 23, 2020.
S1: Denote the original input data as $x_1, x_2, \ldots, x_m$, where the indices 1, 2, ..., m mark the time steps of the input series; the total length is m and the data shape is (m, 1). The data are preprocessed by sliding-window sampling along the time series: with the window denoted window_size, window data of shape (window_size, 1) are cut out in turn as samples to build a sample data set, giving preprocessed sample data of shape (m, window_size, 1). In the present invention the input time series has length 1667 and window_size is set to 50, so the preprocessed sample data have shape (1667, 50, 1);
S2: Split the data into a 20% test set, a 10% validation set, and a 70% training set. The test, validation, and training sets then contain 333, 167, and 1167 samples respectively, with shapes (333, 50, 1), (167, 50, 1), and (1167, 50, 1);
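As a sketch of this split, continuing from the windowing sketch above (variable names hypothetical; the patent does not state which end of the series is held out, so the most recent 20% of windows is assumed to form the test set):

```python
# samples has shape (1667, 50, 1); split chronologically into
# 70% training, 10% validation, 20% test (1167 / 167 / 333 samples).
n = len(samples)
n_train, n_val = round(0.7 * n), round(0.1 * n)
x_train = samples[:n_train]                   # (1167, 50, 1)
x_val   = samples[n_train:n_train + n_val]    # (167, 50, 1)
x_test  = samples[n_train + n_val:]           # (333, 50, 1)
```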
S3: Apply a one-dimensional convolution to the input training data of S2, with kernel size k:
$$cx_t = w_c \cdot (x_t, x_{t+1}, \ldots, x_{t+k-1}) + b_c$$
where $x_t$ is the input at time t, $x_{t+n}$ with $n < k$ is the n-th-order neighbor of $x_t$, and $w_c$ and $b_c$ are parameters to be learned. In the present invention the convolution kernel size is set to 3;
S4: Apply a fully connected computation to the input training data of S2: $dx_t = \omega_d x_t + b_d$, where $\omega_d$ and $b_d$ are parameters to be learned;
S5: Form the weighted combination of the results of S3 and S4:
$$\tilde{x}_t = [cx_t \cdot dx_t]$$
where $cx_t$ is the convolution result of S3 and $dx_t$ is the fully connected result of S4;
S6: Concatenate the output of S5 with the original input data. Each data sample can be written as $x_1, x_2, \ldots, x_{50}$, and the process can be expressed as
$$\hat{x}_t = [\tilde{x}_t, x_t]$$
where $\tilde{x}_t$ is the result computed in S5 and $x_t$ is the original input; after concatenation the data shape is (50, 2);
S7: Input the result of S6 into the BiLSTM network layer. The forward propagation layer of the BiLSTM has 15 neurons and the backward propagation layer has 15 neurons, so the output data have shape (1167, 50, 30);
S8: Input the output of S7 into the LSTM network layer, which has 10 neurons, so the output shape is (1167, 10);
S9: Input the output of S8 into the Dense network layer, which outputs the final prediction result with shape (1167, 1);
S10: Through the network model of S1-S9, output the prediction sequence $(y_1, y_2, \ldots, y_{1167})$, where $y_i$ is the prediction for the i-th training sample. Each $y_i$ is compared with the true value, and the network parameters are updated by backpropagation. Meanwhile, the validation set is used for verification: when the validation accuracy fails to improve for 10 consecutive rounds, training stops. The above steps are repeated and the model parameters are adjusted continually to obtain good accuracy.
S11: Take the final network model obtained by steps S1-S10, test its final performance on the test data set, and apply it to actual prediction work.

Claims (3)

  1. A new forecasting method based on a non-intrusive attention preprocessing process and a BiLSTM model for power load forecasting, characterized in that it includes the following steps:
    S1: a non-intrusive data preprocessing process;
    S2: inputting the result of S1 into the BiLSTM network model to obtain the final prediction result.
  2. The new forecasting method based on a non-intrusive attention preprocessing process and a BiLSTM model for power load forecasting as claimed in claim 1, characterized in that, in S1, the data preprocessing process based on the non-intrusive attention mechanism specifically includes the following steps:
    S1.1: Denote the original input data as $x_1, x_2, \ldots, x_m$, where the indices 1, 2, ..., m mark the time steps of the input series; the total length is m, so the data can be represented by a matrix of shape (m, 1). Preprocess the data by sliding-window sampling along the time series: with the window length denoted window_size, window data of shape (window_size, 1) are cut out in turn as samples to build a sample data set, so each preprocessed sample can be written as $x_1, x_2, \ldots, x_{window\_size}$ and the full sample data can be represented by a matrix of shape (m, window_size, 1);
    S1.2: Split the data into a test set, a validation set, and a training set;
    S1.3: Apply a one-dimensional convolution to each input training sample of S1.2, with kernel size k. The convolution process can be expressed as
    $cx_t = w_c \cdot (x_t, x_{t+1}, \ldots, x_{t+k-1}) + b_c$
    where $cx_t$ is the convolution result, $x_t$ is the input at time t, $x_{t+n}$ with $n < k$ is the n-th-order neighbor of $x_t$, and $w_c$ and $b_c$ are parameters to be learned;
    S1.4: Apply a fully connected computation to the input training data of S1.2. The computation can be expressed as $dx_t = \omega_d x_t + b_d$, where $\omega_d$ and $b_d$ are parameters to be learned and $dx_t$ is the output of this step;
    S1.5: Form the weighted combination of the results of S1.3 and S1.4. The computation can be expressed as
    $\tilde{x}_t = [cx_t \cdot dx_t]$
    where $cx_t$ is the convolution result of S1.3, $dx_t$ is the fully connected result of S1.4, and $[\cdot]$ denotes the dot product;
    S1.6: Concatenate the output of S1.5, $\tilde{x}_t$, with the original input sequence $x_1, x_2, \ldots, x_{window\_size}$. The process can be expressed as
    $\hat{x}_t = [\tilde{x}_t, x_t]$
    where $\tilde{x}_t$ is the result computed in S1.5 and $x_t$ is the original input; after concatenation the data can be represented by a matrix of shape (window_size, 2), where window_size is the sliding-window size of the input data in S1.1.
  3. The new forecasting method based on a non-intrusive attention preprocessing process and a BiLSTM model for power load forecasting as claimed in claim 1, characterized in that, in S2, the BiLSTM network model has an Encoder-Decoder architecture: the Encoder consists of a BiLSTM network layer, and the Decoder consists of an LSTM network layer and a Dense network layer; it specifically includes the following steps:
    S2.1: Input the result of S1.6 into the BiLSTM network layer. The forward propagation layer of the BiLSTM has 15 neurons and the backward propagation layer has 15 neurons; with the neuron count denoted by units, the output data can be represented by a matrix of shape (m, window_size, 30), where m is the number of samples;
    S2.2: Denote the output of S2.1 as $(y'_{t1}, y'_{t2}, \ldots, y'_{units})$ and input it into the LSTM network layer;
    S2.3: Denote the output of S2.2 as $(y''_{t1}, y''_{t2}, \ldots, y''_{window\_size-1})$ and input it into the Dense network layer, which outputs the final prediction result, a matrix of shape (m, 1);
    S2.4: Through the network model of S1 and S2.1-S2.3, the output prediction sequence can be expanded as $(y_1, y_2, \ldots, y_i)$, where $y_i$, i = 1, ..., m, is the prediction for the i-th sample; each $y_i$ is compared with the true value and the network parameters are updated by backpropagation, with the mean squared error (MSE) loss function and a learning rate of 0.01; the model's performance on the validation set is used for early stopping, and the above steps are repeated while the model parameters are adjusted continually to obtain good accuracy;
    S2.5: Use steps S1 and S2.1-S2.4 to obtain the optimized final network model, test its performance on the test set, and finally apply it to actual prediction work.
PCT/CN2021/105889 2021-05-21 2021-07-13 Prediction method based on non-intrusive attention preprocessing process and BiLSTM model WO2022241932A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110557297.1 2021-05-21
CN202110557297.1A CN113177666A (en) 2021-05-21 2021-05-21 Prediction method based on non-invasive attention preprocessing process and BiLSTM model

Publications (1)

Publication Number Publication Date
WO2022241932A1 (en) 2022-11-24

Family

ID=76929640

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/105889 WO2022241932A1 (en) 2021-05-21 2021-07-13 Prediction method based on non-intrusive attention preprocessing process and bilstm model

Country Status (2)

Country Link
CN (1) CN113177666A (en)
WO (1) WO2022241932A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116108350A (en) * 2023-01-06 2023-05-12 中南大学 Non-invasive electrical appliance identification method and system based on multitasking learning
CN117350158A (en) * 2023-10-13 2024-01-05 湖北华中电力科技开发有限责任公司 Electric power short-term load prediction method by mixing RetNet and AM-BiLSTM algorithm
CN117674098A (en) * 2023-11-29 2024-03-08 国网浙江省电力有限公司丽水供电公司 Multi-element load space-time probability distribution prediction method and system for different permeability
CN118568681A (en) * 2024-07-22 2024-08-30 齐鲁工业大学(山东省科学院) Deep learning-based refrigeration system energy consumption prediction method and system

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113553988A (en) * 2021-08-03 2021-10-26 同济大学 Analog signal identification method based on complex neural network and attention mechanism
CN113890024B (en) * 2021-09-30 2024-08-13 清华大学 Non-invasive intelligent load decomposition and optimization control method
CN114676787A (en) * 2022-04-08 2022-06-28 浙江大学 Non-invasive load identification method based on BilSTM-CRF algorithm
CN115100466A (en) * 2022-06-22 2022-09-23 国网江苏省电力有限公司信息通信分公司 Non-invasive load monitoring method, device and medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109685314A (en) * 2018-11-20 2019-04-26 中国电力科学研究院有限公司 A kind of non-intruding load decomposition method and system based on shot and long term memory network
US20200073937A1 (en) * 2018-08-30 2020-03-05 International Business Machines Corporation Multi-aspect sentiment analysis by collaborative attention allocation
CN112529283A (en) * 2020-12-04 2021-03-19 天津天大求实电力新技术股份有限公司 Comprehensive energy system short-term load prediction method based on attention mechanism

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11586880B2 (en) * 2018-08-28 2023-02-21 Beijing Jingdong Shangke Information Technology Co., Ltd. System and method for multi-horizon time series forecasting with dynamic temporal context learning
CN110889545A (en) * 2019-11-20 2020-03-17 国网重庆市电力公司电力科学研究院 Power load prediction method and device and readable storage medium
CN111191841B (en) * 2019-12-30 2020-08-25 润联软件系统(深圳)有限公司 Power load prediction method and device, computer equipment and storage medium
CN111652225B (en) * 2020-04-29 2024-02-27 杭州未名信科科技有限公司 Non-invasive camera shooting and reading method and system based on deep learning
CN112819256A (en) * 2021-03-08 2021-05-18 重庆邮电大学 Convolution time sequence room price prediction method based on attention mechanism

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200073937A1 (en) * 2018-08-30 2020-03-05 International Business Machines Corporation Multi-aspect sentiment analysis by collaborative attention allocation
CN109685314A (en) * 2018-11-20 2019-04-26 中国电力科学研究院有限公司 A kind of non-intruding load decomposition method and system based on shot and long term memory network
CN112529283A (en) * 2020-12-04 2021-03-19 天津天大求实电力新技术股份有限公司 Comprehensive energy system short-term load prediction method based on attention mechanism

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHUNMIAO ZHANG, CHEN MINGLONG: "Non-intrusive load decomposition based on attention mechanism and ConvBiLSTM", JOURNAL OF FUJIAN UNIVERSITY OF TECHNOLOGY, vol. 18, no. 4, 25 August 2020 (2020-08-25), pages 336 - 342, XP093005794 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116108350A (en) * 2023-01-06 2023-05-12 中南大学 Non-invasive electrical appliance identification method and system based on multitasking learning
CN116108350B (en) * 2023-01-06 2023-10-20 中南大学 Non-invasive electrical appliance identification method and system based on multitasking learning
CN117350158A (en) * 2023-10-13 2024-01-05 湖北华中电力科技开发有限责任公司 Electric power short-term load prediction method by mixing RetNet and AM-BiLSTM algorithm
CN117674098A (en) * 2023-11-29 2024-03-08 国网浙江省电力有限公司丽水供电公司 Multi-element load space-time probability distribution prediction method and system for different permeability
CN117674098B (en) * 2023-11-29 2024-06-07 国网浙江省电力有限公司丽水供电公司 Multi-element load space-time probability distribution prediction method and system for different permeability
CN118568681A (en) * 2024-07-22 2024-08-30 齐鲁工业大学(山东省科学院) Deep learning-based refrigeration system energy consumption prediction method and system

Also Published As

Publication number Publication date
CN113177666A (en) 2021-07-27

Similar Documents

Publication Publication Date Title
WO2022241932A1 (en) Prediction method based on non-intrusive attention preprocessing process and bilstm model
WO2020024319A1 (en) Convolutional neural network based multi-point regression forecasting model for traffic flow forecasting
CN109035779B (en) DenseNet-based expressway traffic flow prediction method
CN110909926A (en) TCN-LSTM-based solar photovoltaic power generation prediction method
CN111860982A (en) Wind power plant short-term wind power prediction method based on VMD-FCM-GRU
Wang et al. OGRU: An optimized gated recurrent unit neural network
CN110428082B (en) Water quality prediction method based on attention neural network
CN107562784A (en) Short text classification method based on ResLCNN models
CN110751318A (en) IPSO-LSTM-based ultra-short-term power load prediction method
CN113673242A (en) Text classification method based on K-neighborhood node algorithm and comparative learning
CN117094451B (en) Power consumption prediction method, device and terminal
CN115051929B (en) Network fault prediction method and device based on self-supervision target perception neural network
CN113112791A (en) Traffic flow prediction method based on sliding window long-and-short term memory network
CN109598002A (en) Neural machine translation method and system based on bidirectional circulating neural network
CN113935489A (en) Variational quantum model TFQ-VQA based on quantum neural network and two-stage optimization method thereof
CN113836783A (en) Digital regression model modeling method for main beam temperature-induced deflection monitoring reference value of cable-stayed bridge
CN113157919A (en) Sentence text aspect level emotion classification method and system
Ying et al. Processor free time forecasting based on convolutional neural network
Liu et al. Prediction of Temperature Time Series Based on Wavelet Transform and Support Vector Machine.
CN118134284A (en) Deep learning wind power prediction method based on multi-stage attention mechanism
CN116543289B (en) Image description method based on encoder-decoder and Bi-LSTM attention model
CN105787265A (en) Atomic spinning top random error modeling method based on comprehensive integration weighting method
CN116993185A (en) Time sequence prediction method, device, equipment and storage medium
CN116843012A (en) Time sequence prediction method integrating personalized context and time domain dynamic characteristics
CN112232570A (en) Forward active total electric quantity prediction method and device and readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21940381

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21940381

Country of ref document: EP

Kind code of ref document: A1
