CN113962454A - LSTM energy consumption prediction method based on dual feature selection + particle swarm optimization - Google Patents
- Publication number
- CN113962454A (application CN202111213171.9A)
- Authority
- CN
- China
- Prior art keywords
- lstm
- prediction
- particle
- model
- energy consumption
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/211—Selection of the most significant subset of features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/086—Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Description
Technical Field
The invention relates to the technical field of building energy consumption prediction, and more specifically to an LSTM energy consumption prediction method based on dual feature selection and particle swarm optimization.
Background
With the widespread use of increasingly complex technological products, global demand for electricity is rising steadily, and the power grid must be controlled to keep electricity supply sustainable. In the era of artificial intelligence, the power Internet of Things has gradually entered daily life, and the development of smart grids requires matching testing capabilities, which gave rise to the smart meter. The continued worldwide expansion of smart-meter infrastructure has also laid the foundation for introducing active energy systems into the smart grid. Since launching its "Strong Smart Grid" plan in 2009, the State Grid Corporation of China has been deploying smart meters, distribution automation, embedded intelligence and related technologies on a large scale.
For residential and enterprise buildings, using energy consumption prediction to improve energy efficiency and reduce consumption is of great practical significance. Commercial and residential buildings account for 30% to 40% of the total energy consumption of smart buildings. Current trends suggest that this share may grow in the near future, and that energy consumption and penetration are increasing worldwide. Short-term energy consumption prediction is therefore essential. It is also a challenging problem, owing to the complexity and uncertainty of building infrastructure behaviour and to the shortcomings of the traditional grid: low efficiency, serious waste of electricity, weak information exchange, and a low degree of automation.
In view of this, researchers have developed many prediction methods to improve grid quality and optimize energy use. In many related studies, time-series models such as ARIMA serve as reference models for checking whether a newly proposed method predicts better. Researchers now commonly combine historical data with machine learning and deep learning algorithms, such as artificial neural networks (ANN), support vector machines (SVM), adaptive neuro-fuzzy inference systems (ANFIS) and extreme learning machines (ELM), to make predictions. Convolutional neural networks and BP neural networks have been studied for electricity consumption, but as prediction methods they are still at an early stage.
During data preprocessing, the quality of feature selection on the raw data largely determines the accuracy of the model. A prediction model improves if the number of input features can be reduced by selecting the most effective and useful inputs. Feature selection methods include correlation analysis and numerical sensitivity analysis, but these are linear input-selection methods, whereas energy consumption data is nonlinear. Mutual-information feature selection is therefore more effective, and it computes the dependence between input and output data efficiently. Feature selection based on mutual information is a newer variable-selection approach in which mutual information quantifies the association between related variables.
1) MI mutual information algorithm
Mutual information (MI) measures the mutual dependence between two variables X and Y.
The mutual information I(X;Y) between X and Y is defined as:
$$I(X;Y)=\iint p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}\,dx\,dy \tag{1}$$
where p(x,y) is the joint probability density function, and p(x) and p(y) are the marginal probability density functions of x and y. MI quantifies how much information the occurrence of one event contributes about the occurrence of another. The MI method computes the mutual information between every feature and the target feature, ranks the features by that value, and keeps the N′ highest-ranked features, thereby achieving feature selection.
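The ranking step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes a simple equal-width histogram (plug-in) estimate of equation (1), and the names `mutual_information`, `top_features_by_mi` and the `bins` parameter are hypothetical.

```python
import math
import random
from collections import Counter

def _bin(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant series
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of I(X;Y) in nats (an assumed estimator)."""
    bx, by = _bin(x, bins), _bin(y, bins)
    n = len(x)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )

def top_features_by_mi(features, target, n_keep, bins=8):
    """Rank features by MI with the target and keep the n_keep best names."""
    scores = {name: mutual_information(col, target, bins)
              for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n_keep]

# Synthetic data: two informative features and one pure-noise feature.
random.seed(0)
target = [random.uniform(-1, 1) for _ in range(2000)]
features = {
    "squared": [t * t for t in target],                        # nonlinear dependence
    "noisy_copy": [t + random.gauss(0, 0.1) for t in target],  # near-linear dependence
    "noise": [random.uniform(-1, 1) for _ in range(2000)],     # unrelated
}
selected = top_features_by_mi(features, target, n_keep=2)
```

Note that the `squared` feature has almost no linear correlation with the target yet still ranks above the noise feature, which is the motivation the text gives for using MI on nonlinear data.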
2) Pearson correlation coefficient
The Pearson correlation coefficient r between X and Y is:
$$r=\frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2}\,\sqrt{\sum_{i=1}^{n}(Y_i-\bar{Y})^2}} \tag{2}$$
where $\bar{X}$ and $\bar{Y}$ are the mean values of X and Y. If r ≥ 0.5, the correlation between X and Y is strong; otherwise it is weak. A second round of feature selection using the Pearson correlation coefficient further reduces the number of features.
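A minimal sketch of this second-stage filter is given below. The function names and the toy data are illustrative assumptions; the threshold of 0.5 and the use of r rather than |r| follow the text above, so strongly negatively correlated features are dropped under this rule.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # assumption: treat a constant series as uncorrelated
    return cov / (sx * sy)

def filter_by_pearson(features, target, threshold=0.5):
    """Keep the names of features whose r with the target is >= threshold."""
    return [name for name, col in features.items()
            if pearson_r(col, target) >= threshold]

target = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {
    "up": [2.0, 4.0, 6.0, 8.0, 10.0],    # r = 1
    "down": [5.0, 4.0, 3.0, 2.0, 1.0],   # r = -1, dropped because r < 0.5
    "hump": [1.0, 3.0, 2.0, 3.0, 1.0],   # r = 0
}
kept = filter_by_pearson(features, target)
```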
3) LSTM model
LSTM is a deep learning model that can handle long time series effectively, learning from the data automatically and extracting deeper features. Like other neural network models, however, some of its hyperparameters are usually set according to the researcher's experience, and such a model lacks scientific rigor. PSO is simple and easy to implement, converges quickly, and has few parameters to tune. Genetic algorithms, ant colony algorithms and similar methods lack this kind of guidance mechanism.
The long short-term memory (LSTM) network was proposed by Hochreiter to address the vanishing and exploding gradient problems of back-propagation through time (BPTT). With continued refinement it evolved into the widely used LSTM architecture. Internally it consists of three gate structures and one state module that stores memory. The structure of an LSTM unit is shown in Figure 1, where $C_t$ is the state information stored by the unit, $h_t$ is the hidden-layer output of the unit, $f_t$ is the forget gate, $i_t$ is the input gate, $\tilde{C}_t$ is the current-time information, $o_t$ is the output gate, $\odot$ denotes element-wise matrix multiplication, and $+$ denotes matrix addition.
Forget gate: controls how much of the previous unit state $C_{t-1}$ is forgotten:
$$f_t=\sigma(W_f\cdot[h_{t-1},x_t]+b_f) \tag{3}$$
Input gate: controls which information is added to the unit:
$$i_t=\sigma(W_i\cdot[h_{t-1},x_t]+b_i) \tag{4}$$
Unit state update: selectively records new information into $C_t$ according to $f_t$:
$$\tilde{C}_t=\tanh(W_C\cdot[h_{t-1},x_t]+b_C) \tag{5}$$
$$C_t=f_t\odot C_{t-1}+i_t\odot\tilde{C}_t \tag{6}$$
Output gate: activates $C_t$ and controls how much of $C_t$ is filtered:
$$o_t=\sigma(W_o\cdot[h_{t-1},x_t]+b_o) \tag{7}$$
$$h_t=o_t\odot\tanh(C_t) \tag{8}$$
$W_f$, $W_i$, $W_C$, $W_o$ are the weight matrices of the respective modules, $b_f$, $b_i$, $b_C$, $b_o$ are the bias terms, $\sigma$ is the sigmoid activation function, and tanh is the hyperbolic tangent activation function, defined as
$$\sigma(x)=\frac{1}{1+e^{-x}} \tag{9}$$
$$\tanh(x)=\frac{e^{x}-e^{-x}}{e^{x}+e^{-x}} \tag{10}$$
The output layer passes $h_t$ through a fully connected (dense) layer according to equation (11) to obtain the final predicted value $y_t$:
$$y_t=\sigma(W_y\cdot h_t+b_y) \tag{11}$$
where $W_y$ and $b_y$ are the weight matrix and the bias term, respectively.
Through its gate functions, LSTM controls how historical information is passed on, which gives it the ability to process and predict time series.
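To make the gate equations concrete, the following is a small sketch of one forward step of a single LSTM unit, written with plain Python lists. It assumes the standard candidate-state form for equations (5) and (6); the parameter dictionary keys (`Wf`, `Wc`, and so on) and the helper names are hypothetical, and a real model would use a deep learning framework rather than this hand-rolled code.

```python
import math

def sigmoid(z):
    """Logistic function, equation (9)."""
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Return W·v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, p):
    """One forward step of an LSTM unit; p holds weights and biases."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f = [sigmoid(a) for a in affine(p["Wf"], z, p["bf"])]              # forget gate (3)
    i = [sigmoid(a) for a in affine(p["Wi"], z, p["bi"])]              # input gate (4)
    g = [math.tanh(a) for a in affine(p["Wc"], z, p["bc"])]            # candidate state (5)
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]  # state update (6)
    o = [sigmoid(a) for a in affine(p["Wo"], z, p["bo"])]              # output gate (7)
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]                   # hidden output (8)
    return h, c

# Toy check with hidden size 2, input size 1, and all-zero parameters:
# every gate equals sigmoid(0) = 0.5 and the candidate state equals tanh(0) = 0.
zeros_W = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
zeros_b = [0.0, 0.0]
params = {"Wf": zeros_W, "Wi": zeros_W, "Wc": zeros_W, "Wo": zeros_W,
          "bf": zeros_b, "bi": zeros_b, "bc": zeros_b, "bo": zeros_b}
h, c = lstm_step([1.0], [0.0, 0.0], [1.0, -1.0], params)
```

With zero parameters the unit simply halves the previous state, which makes the role of the forget gate in equation (6) easy to verify by hand.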
4) PSO particle swarm optimization algorithm
The basic idea of particle swarm optimization is as follows. A flock of birds searches randomly for food within a region. Every bird knows only its own distance from the food and the positions of the other birds. When a bird leaves its current position for another, it relies on two pieces of information: the area around the bird currently closest to the food, and its own flight experience of where the food is.
PSO is initialized with a swarm of random particles (random solutions) and then iterates toward the optimal solution. In each iteration, every particle updates itself by tracking two "extremes": the local best solution pbest and the global best solution gbest. Once these two values are found, the particle updates its velocity and position with the following formulas.
$$v_i=v_i+c_1\times\mathrm{rand}()\times(\mathrm{pbest}_i-x_i)+c_2\times\mathrm{rand}()\times(\mathrm{gbest}_i-x_i) \tag{12}$$
$$x_i=x_i+v_i$$
where i = 1, 2, …, N, and N is the total number of particles in the swarm;
$v_i$: the current velocity of the i-th particle;
rand(): a random number in (0, 1);
$x_i$: the current position of the i-th particle;
$c_1$ and $c_2$: the learning factors;
$\mathrm{pbest}_i$ and $\mathrm{gbest}_i$: the local best position and the global best position of the current particle swarm, respectively.
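The update in equation (12) can be sketched for a single particle as below. This is an illustration only: the function name `pso_update` is hypothetical, the default learning factors `c1 = c2 = 2.0` are an assumption (the text does not give their values), and the global best is passed as one position shared by all particles.

```python
import random

def pso_update(x, v, pbest, gbest, c1=2.0, c2=2.0, rng=random):
    """Apply equation (12) and the position update to one particle.

    x, v, pbest and gbest are equal-length lists (one entry per dimension).
    """
    new_v = [
        vi + c1 * rng.random() * (pb - xi) + c2 * rng.random() * (gb - xi)
        for xi, vi, pb, gb in zip(x, v, pbest, gbest)
    ]
    new_x = [xi + vi for xi, vi in zip(x, new_v)]
    return new_x, new_v

# A particle that already sits on both best positions keeps its velocity,
# because both attraction terms in equation (12) are zero.
x, v = [1.0, 2.0], [0.5, -0.5]
new_x, new_v = pso_update(x, v, pbest=x, gbest=x)
```

Equation (12) as written has no inertia weight or velocity limit, so practical implementations often clamp positions to the search range after each update.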
However, the existing MI algorithm, LSTM model and PSO algorithm predict energy consumption with low accuracy and unstable performance, and do not meet the requirements of building energy consumption prediction. It is therefore necessary to develop an energy consumption prediction method for buildings with high prediction accuracy and stable prediction performance.
Summary of the Invention
The purpose of the invention is to provide an LSTM energy consumption prediction method based on multi-dimensional feature selection and particle swarm optimization, applied to buildings, with high prediction accuracy and stable prediction performance.
To achieve this purpose, the technical solution of the invention is an energy consumption prediction method based on MI-LSTM-PSO, characterized in that, as shown in Figure 2, it comprises the following steps:
Step 1: apply the MI mutual information method to analyse dependence across the time and feature dimensions of the original data set, and select the N′ features most effective for the energy consumption prediction target, thereby eliminating redundant data and improving the efficiency of the model algorithm;
Step 2: compute the Pearson correlation coefficient between each of the N′ features selected by the MI method and the sequence to be predicted, and select the N″ features whose Pearson correlation coefficient is greater than or equal to 0.5;
Step 3: use the LSTM model to train and predict on the N″-dimensional feature data obtained after the PMI (Pearson plus MI) dual feature selection, obtaining the initial prediction sequence y(t);
Step 4: use the particle swarm optimization (PSO) algorithm to optimize the LSTM hyperparameters units, dropout and batchsize, thereby improving the prediction accuracy of the LSTM model and finally obtaining the MI-LSTM-PSO model.
In the above technical solution, in steps 1 and 2, N′ is 60; that is, the 60 features most effective for the energy consumption prediction target are selected.
In the above technical solution, step 1 specifically comprises the following steps:
S11: use a sliding window to turn the 20-dimensional feature data of the preceding 24 hours into 24M (i.e. 480) feature components, where the original data series comprise the photovoltaic power generation of 2 areas, the energy consumption of different facilities in 17 areas, and the total electricity input from the system grid (the data series differ depending on the scenario data set);
S12: apply the MI mutual information method to the above 24M (i.e. 480) feature components for feature selection:
$$I(X;Y)=\iint p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}\,dx\,dy \tag{1}$$
where p(x,y) is the joint probability density function of x and y, and p(x) and p(y) are the marginal density functions. If x and y are completely unrelated (independent), p(x,y) equals p(x)p(y) and the mutual information is 0; the larger I(X;Y) is, the stronger the dependence between the two variables;
S13: determine the optimal number N of MI-selected features by experiment. If N is too large, the training data set contains too much redundant information and noise, which degrades prediction performance; if N is too small, the training data set contains too little information, which also degrades the prediction. The optimal N usually lies between 3M and 6M; among the values that predict well, the smaller N is chosen;
S14: based on the ranking of mutual information between each feature sequence x(t) and the target sequence Y, combining the time and feature dimensions, select the 60 features most effective for the energy consumption prediction target as the training data set for the subsequent model.
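The sliding-window construction in S11 can be sketched as follows. This is an illustration under assumptions: the function name `make_supervised`, the choice of the last column as the target, and the synthetic data are all hypothetical, and the one-step-ahead alignment (the 24 preceding hours predict the next hour) is inferred from the description.

```python
def make_supervised(rows, target_col, window=24):
    """Flatten each run of `window` hourly rows into one feature vector.

    rows: list of per-hour feature vectors, each of length M.
    Returns (X, y): X[k] has window * M components built from rows[k:k+window],
    and y[k] is the target column of the hour that follows that window.
    """
    X, y = [], []
    for k in range(len(rows) - window):
        X.append([value for row in rows[k:k + window] for value in row])
        y.append(rows[k + window][target_col])
    return X, y

# Synthetic data: 30 hours of M = 20 features, value = hour * 100 + feature index.
rows = [[float(hour * 100 + j) for j in range(20)] for hour in range(30)]
X, y = make_supervised(rows, target_col=19)
```

With M = 20 and a 24-hour window each sample has 24 × 20 = 480 components, which is the candidate set that the MI ranking in S12 to S14 reduces to 60.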
In the above technical solution, step 2 specifically comprises the following steps:
S21: compute the Pearson correlation coefficient between each of the above 60 feature components and the target sequence Y (i.e. Gi):
$$r=\frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2}\,\sqrt{\sum_{i=1}^{n}(Y_i-\bar{Y})^2}} \tag{2}$$
where $\bar{X}$ and $\bar{Y}$ are the mean values of X and Y; if r ≥ 0.5, the correlation between X and Y is strong, otherwise it is weak;
S22: since a Pearson correlation coefficient below 0.5 indicates weak correlation, select the 37 features whose Pearson correlation coefficient is greater than or equal to 0.5.
In the above technical solution, the LSTM network internally comprises three gate structures and one state module that stores memory, as shown in Figure 1, and step 3 specifically comprises the following steps:
S31: let $C_t$ be the state information stored by the LSTM unit, $x_t$ the input of the input layer, $h_t$ the hidden-layer output of the unit, $f_t$ the forget gate, $i_t$ the input gate, $\tilde{C}_t$ the current-time information, and $o_t$ the output gate; "×" denotes element-wise matrix multiplication, "+" denotes addition, and σ is the sigmoid function;
S32: forget gate, which controls how much of the previous unit state $C_{t-1}$ is forgotten:
$$f_t=\sigma(W_f*[h_{t-1},x_t]+b_f) \tag{3}$$
S33: input gate, which controls which information is added to the unit:
$$i_t=\sigma(W_i*[h_{t-1},x_t]+b_i) \tag{4}$$
S34: state information stored by the unit, which selectively records new information into $C_t$ according to $f_t$ and $i_t$:
$$\tilde{C}_t=\tanh(W_C*[h_{t-1},x_t]+b_C) \tag{5}$$
$$C_t=f_t\times C_{t-1}+i_t\times\tilde{C}_t \tag{6}$$
S35: output gate, which activates $C_t$ and controls how much of $C_t$ is filtered:
$$o_t=\sigma(W_o*[h_{t-1},x_t]+b_o) \tag{7}$$
$$h_t=o_t*\tanh(C_t) \tag{8}$$
where $h_t$ is the hidden-layer output of the unit and $h_{t-1}$ is the hidden-layer output of the previous unit; $W_f$, $W_i$, $W_o$ are the weight matrices of $f_t$, $i_t$, $o_t$, and $b_f$, $b_i$, $b_o$ are the corresponding bias terms; $W_C$ and $b_C$ are the weight matrix and bias term of $\tilde{C}_t$; σ is the sigmoid function and tanh is the hyperbolic tangent activation function, defined as follows:
$$\sigma(x)=\frac{1}{1+e^{-x}} \tag{9}$$
$$\tanh(x)=\frac{e^{x}-e^{-x}}{e^{x}+e^{-x}} \tag{10}$$
S36: the output layer passes $h_t$ through a fully connected layer to obtain the final predicted value $y_t$:
$$y_t=\sigma(W_y*h_t+b_y) \tag{11}$$
where $W_y$ and $b_y$ are the weight matrix and the bias term, respectively.
In the above technical solution, step 4 specifically comprises the following steps:
S41: initialize the parameters to be tuned and set their ranges: units ∈ [20, 300], dropout ∈ [0, 1], batchsize ∈ [20, 300];
S42: within these ranges, randomly initialize the particle swarm (20 particles); using the fitness function (the result of fitting the LSTM model), compute the fitness value of each particle (the mean absolute error, MAE); from the current MAE of each particle, determine the best position of the swarm in this iteration (pbest) and the best position of the historical particle population (gbest);
S43: update the position and velocity of each particle according to the position and velocity of the best particle; after fitting the LSTM model with the updated particles, compute the MAE of each particle and update pbest and gbest according to the MAE:
$$v_i=v_i+c_1\times\mathrm{rand}()\times(\mathrm{pbest}_i-x_i)+c_2\times\mathrm{rand}()\times(\mathrm{gbest}_i-x_i) \tag{12}$$
$$x_i=x_i+v_i$$
In equation (12): i = 1, 2, …, N, and N is the total number of particles in the swarm;
$v_i$: the current velocity of the i-th particle;
rand(): a random number in (0, 1);
$x_i$: the current position of the i-th particle;
$c_1$ and $c_2$: the learning factors;
$\mathrm{pbest}_i$ and $\mathrm{gbest}_i$: the local best position and the global best position of the current particle swarm, respectively;
S44: after training the LSTM model with the updated particles, compute the fitness value of each particle, and use it to update the best position of the swarm in this iteration and the best position of the historical particle population;
S45: the algorithm is considered to have converged when the fitness value of the best particle no longer changes or the number of iterations reaches its upper limit; if the particles have not converged, return to S43 and continue updating the particles;
S46: substitute the resulting best particle parameters units, dropout and batchsize into the LSTM model, run model prediction on the data from step 1, and obtain the final prediction result.
Each "*" above denotes multiplication.
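Steps S41 to S46 can be sketched as the loop below. It is a rough illustration under several stated assumptions, not the patent's implementation: `toy_mae` is a hypothetical stand-in for "train the LSTM with these hyperparameters and return its MAE", the learning factors `c1 = c2 = 2.0`, the iteration cap and the `patience` stopping rule are assumed values, and positions are clamped to the S41 ranges after each update.

```python
import random

# Search ranges from step S41.
BOUNDS = {"units": (20, 300), "dropout": (0.0, 1.0), "batchsize": (20, 300)}
KEYS = list(BOUNDS)

def decode(position):
    """Clamp a position to the ranges and round the integer hyperparameters."""
    params = {}
    for key, value in zip(KEYS, position):
        lo, hi = BOUNDS[key]
        value = min(max(value, lo), hi)
        params[key] = value if key == "dropout" else int(round(value))
    return params

def toy_mae(params):
    """Hypothetical fitness: a real run would train the LSTM and return its MAE."""
    return (abs(params["units"] - 128) / 128
            + abs(params["dropout"] - 0.2)
            + abs(params["batchsize"] - 64) / 64)

def pso_search(fitness, n_particles=20, max_iter=50, c1=2.0, c2=2.0,
               patience=10, seed=0):
    rng = random.Random(seed)
    # S42: random initialization and first fitness evaluation.
    xs = [[rng.uniform(*BOUNDS[k]) for k in KEYS] for _ in range(n_particles)]
    vs = [[0.0] * len(KEYS) for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_fit = [fitness(decode(x)) for x in xs]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    stale = 0
    for _ in range(max_iter):
        for i in range(n_particles):
            # S43: velocity and position update, equation (12), then clamp.
            for d, key in enumerate(KEYS):
                lo, hi = BOUNDS[key]
                vs[i][d] += (c1 * rng.random() * (pbest[i][d] - xs[i][d])
                             + c2 * rng.random() * (gbest[d] - xs[i][d]))
                xs[i][d] = min(max(xs[i][d] + vs[i][d], lo), hi)
            # S44: re-evaluate and update the particle's own best.
            fit = fitness(decode(xs[i]))
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = xs[i][:], fit
        g = min(range(n_particles), key=pbest_fit.__getitem__)
        if pbest_fit[g] < gbest_fit:
            gbest, gbest_fit, stale = pbest[g][:], pbest_fit[g], 0
        else:
            # S45: stop once the best fitness has not improved for a while.
            stale += 1
            if stale >= patience:
                break
    # S46: the best hyperparameters would then be used to build the final LSTM.
    return decode(gbest), gbest_fit

best_params, best_mae = pso_search(toy_mae)
```

Because every fitness evaluation in the real method means training an LSTM, the swarm size and iteration cap dominate the running time; the small defaults here are only meant to keep the sketch fast.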
The invention has the following advantages:
(1) the invention is an energy consumption prediction method applied to buildings, with high prediction accuracy and stable prediction performance;
(2) through MI the invention removes 87.5% of the features as redundant (480 reduced to 60), which substantially improves the efficiency of the model algorithm;
(3) the invention uses the PSO algorithm to optimize the LSTM hyperparameters units, dropout and batchsize, which improves the prediction accuracy of the LSTM model and gives a good model fit;
(4) the predicted values of the PMI-PSO-LSTM model of the invention lie essentially within the confidence interval of the true values, and the predicted trend is close to the true values, so prediction accuracy is high;
(5) the MAE and SMAPE of the combined PMI-PSO-LSTM model of the invention are better than all results of the other models, giving higher robustness and more stable prediction performance.
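For reference, the two evaluation metrics named above can be computed as in the sketch below. The MAE definition is standard; SMAPE has several variants in the literature, and the one used here (denominator equal to the mean of |true| and |predicted|, reported in percent) is an assumption, since the text does not state which form is used. The sample values are made up for illustration.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, in percent (assumed variant)."""
    terms = [
        0.0 if t == p == 0 else abs(t - p) / ((abs(t) + abs(p)) / 2)
        for t, p in zip(y_true, y_pred)
    ]
    return 100.0 * sum(terms) / len(terms)

# Made-up hourly consumption values and predictions.
y_true = [100.0, 200.0, 300.0]
y_pred = [110.0, 190.0, 300.0]
mae_value = mae(y_true, y_pred)
smape_value = smape(y_true, y_pred)
```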
Description of the Drawings
Figure 1 is a schematic diagram of the internal structure of an existing LSTM.
Figure 2 is a schematic diagram of the structure of the PMI-PSO-LSTM model of the invention.
Figure 3 is a line chart comparing the prediction results of the base models in an embodiment of the invention.
Figure 4 is a scatter plot comparing the prediction results of the base models in an embodiment of the invention.
Figure 5 is a line chart comparing the prediction results of the combined models in an embodiment of the invention.
Figure 6 is a scatter plot comparing the prediction results of the combined models in an embodiment of the invention.
Figure 7 is a comparison chart of the model evaluation metrics in an embodiment of the invention.
Detailed description
The implementation of the present invention is described in detail below with reference to the accompanying drawings. The embodiments are examples only and do not limit the present invention. The description also makes the advantages of the present invention clearer and easier to understand.
Example
The present invention is described in detail below using electricity consumption prediction for a building as an example; the example also provides guidance for applying the present invention to energy consumption prediction for other buildings.
This embodiment treats the building's historical electricity consumption as a time series and performs short-term, single-step (1 h ahead) electricity consumption prediction.
In this embodiment, the electricity consumption prediction for the building includes the following:
1. Experimental data set and MI feature selection
The data set used in this embodiment is the electricity consumption of a building from October 15, 2019 to June 4, 2019, with 20 features in total. The features are described in Table 1, where the fifth column gives the Pearson correlation coefficient (pearsonr) between each feature and the Gi feature.
Table 1 Data set description
This embodiment uses the data from the previous 24 hours to predict the value of Gi in the next hour, so a sliding window turns the 24 hours of 20 features into 480 feature components. The mutual information (MI) method then selects, from these 480 components, the 60 components with the largest MI values.
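The sliding-window step can be sketched as follows. This is a minimal illustration under stated assumptions: `make_windows`, `target_index`, and the synthetic `series` are hypothetical names and data, not taken from the patent; only the shapes (24 hours, 20 features, 480 components) come from the text.

```python
def make_windows(series, target_index, window=24):
    """Turn hourly rows into (flattened window, next-hour target) pairs.

    series: list of rows, one per hour, each a list of feature values.
    target_index: column of the value to predict (Gi in the text).
    """
    samples, targets = [], []
    for t in range(window, len(series)):
        # Concatenate the previous `window` rows into one flat feature vector.
        flat = [value for row in series[t - window:t] for value in row]
        samples.append(flat)
        targets.append(series[t][target_index])
    return samples, targets


# Synthetic stand-in data: 30 hours x 20 features (not the building data set).
hours, n_features = 30, 20
series = [[float(h * n_features + f) for f in range(n_features)] for h in range(hours)]
X, y = make_windows(series, target_index=0)
print(len(X), len(X[0]), len(y))  # 6 480 6
```

Each sample holds 24 × 20 = 480 components, ordered from the oldest hour to the most recent, so component positions map back to a feature and a lag such as Gi(t-1).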
The selection results are shown in Table 2.
Here, a selected feature such as Gi(t-1) denotes the power input from the public grid of the industrial plant one hour before the current time.
Table 2 Features selected by MI
The MI value is the mutual information between the current feature component X and the Gi component at the current time, i.e., I(X; Gi(t)). Table 2 shows that most features from the previous four hours have large mutual information with the Gi feature at the current time, and that the Gi, Ao, Co, and A2 features from the preceding 24 hours also have relatively large mutual information with it. MI removes 87.5% of the redundant features (480 components reduced to 60), which markedly improves the efficiency of the model algorithm.
The data set used in this embodiment has 20 features, and the data from the previous 24 hours are used to predict the data of the 25th hour.
Experiments on this 20-feature data set show that:
1) Selecting the top 60 feature components gives prediction results similar to selecting 100;
2) Increasing the number of selected feature components beyond 100 makes the prediction results worse;
3) Reducing the number of selected feature components below 60 leaves too little information in the data set, which also makes the prediction results worse.
Therefore, this embodiment uses the MI method to select, from the 480 feature components formed by the sliding window, the 60 components with the largest MI values.
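Ranking components by mutual information and keeping the top k might look like the sketch below. The text does not say which MI estimator is used, so the equal-width histogram estimator here is an assumption, as are the names `discretize`, `mutual_information`, and `select_top_k`, the bin count, and the synthetic data.

```python
import math
import random
from collections import Counter


def discretize(values, bins=8):
    """Map continuous values to equal-width bin indices."""
    low, high = min(values), max(values)
    width = (high - low) / bins or 1.0  # constant columns fall into a single bin
    return [min(int((v - low) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=8):
    """Histogram estimate of I(X; Y) in nats."""
    xb, yb = discretize(x, bins), discretize(y, bins)
    n = len(xb)
    px, py, pxy = Counter(xb), Counter(yb), Counter(zip(xb, yb))
    # Sum of p(a, b) * log(p(a, b) / (p(a) * p(b))) over observed bin pairs.
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def select_top_k(columns, target, k):
    """Return the indices of the k columns with the largest MI against target."""
    scores = [mutual_information(col, target) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)
    return ranked[:k], scores


# Synthetic demonstration: two informative columns and one pure-noise column.
rng = random.Random(0)
target = [rng.uniform(-1.0, 1.0) for _ in range(500)]
columns = [
    [t + rng.gauss(0.0, 0.05) for t in target],   # noisy copy of the target
    [rng.uniform(-1.0, 1.0) for _ in target],     # unrelated noise
    [t * t for t in target],                      # nonlinear function of the target
]
selected, scores = select_top_k(columns, target, k=2)
print(sorted(selected), [round(s, 3) for s in scores])
```

With the embodiment's settings, `columns` would hold the 480 window components and `k` would be 60. Unlike a linear correlation measure, MI also scores the nonlinear column highly, which is one reason to use it for a first-pass filter.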
2. Evaluation indicators
Four evaluation indicators are used to judge the quality of the models.
Root mean square error (RMSE): the smaller the value, the better the model fit.
Mean absolute error (MAE): the smaller the value, the better the model fit.
Symmetric mean absolute percentage error (SMAPE): the smaller the value, the better the model fit.
Coefficient of determination (R2): the larger the value, the better the model fit.
In formulas (13), (14), (15), and (16), ŷi is the predicted value, yi is the actual value, ȳ is the mean of the actual values, and n is the number of data points.
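The four indicators can be computed as in the sketch below, which uses their standard textbook definitions. Formulas (13) to (16) themselves are not reproduced in this text, so the exact forms are assumptions; in particular, SMAPE has several variants, and the one here divides by (|y| + |ŷ|) / 2 and reports a percentage.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def smape(y_true, y_pred):
    """SMAPE in percent with the (|y| + |y_hat|) / 2 denominator (assumed variant).

    Undefined when an actual value and its prediction are both zero.
    """
    return 100.0 / len(y_true) * sum(
        abs(p - t) / ((abs(t) + abs(p)) / 2) for t, p in zip(y_true, y_pred)
    )


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only.
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 310.0, 390.0]
print(rmse(y_true, y_pred), mae(y_true, y_pred), smape(y_true, y_pred), r2(y_true, y_pred))
```

Because every error in this toy example has magnitude 10, RMSE and MAE coincide; they diverge as soon as the errors vary in size, since RMSE weights large errors more heavily.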
3. Model parameter settings
To verify the prediction performance of the proposed PMI-PSO-LSTM combined model, this embodiment compares the six experimental models (M1-M6), in two groups, listed in Table 3. The main parameters of the models are shown in Table 4 and Table 5.
Table 3 Baseline models for experimental comparison
Table 4 Main parameters of the comparison models (1)
Table 5 Main parameters of the comparison models (2)
4. Analysis of model experiment data
4.1 Analysis of base model experimental results
This embodiment uses the base models M1-M3 in Table 3 to run a single-step prediction comparison for the total electricity input from the public grid, Gi, using features 1-20.
In the comparison results (Table 6), all four evaluation indicators (coefficient of determination, root mean square error, mean absolute error, and symmetric mean absolute percentage error) show that the LSTM model gives the best prediction results.
Table 6 Comparison of base model experiments
Figures 3 and 4 compare the 1 h electricity consumption predictions of ARMA, K-nearest neighbors, and LSTM with the actual values. The trend predicted by the LSTM model is closest to the actual values, and only the LSTM model lies within the confidence interval of the actual values. The prediction curves of the ARMA and K-nearest neighbor models fall outside the confidence interval of the actual values and also lag behind them. Compared with the ARMA and K-nearest neighbor regression models, the LSTM model therefore predicts best, so LSTM is chosen as the base model for the experiments.
4.2 Analysis of LSTM combined model experimental results
This embodiment uses the combined models M3-M6 in Table 3 to run 20 groups of single-step prediction comparison experiments for the total electricity input from the public grid, Gi, using features 1-20.
Figures 5 and 6 compare the 1 h predictions of electricity consumption Gi from the four models with the actual values. The predicted values of all four models lie essentially within the confidence interval of the actual values, and the trend predicted by the PMI-PSO-LSTM model is closest to the actual values. Figure 7 shows that every evaluation indicator of the PMI-PSO-LSTM model is the best (in Figure 7, M3, M4, M5, and M6 are the combined models M3-M6 in Table 3).
Table 7 gives the averages of the 20 groups of experimental results for the four combined models. The first four columns are the four evaluation indicators of each prediction model, and the fifth column is its training time. Table 7 shows that the PMI-PSO-LSTM model does not improve R2 noticeably, but it improves MAE and SMAPE by roughly 20%, 10%, and 5% relative to the LSTM, MI-LSTM, and PMI-LSTM models, respectively. Compared with the LSTM model, MI-LSTM does not perform significantly better, but MI feature selection reduces the input dimension by 87.5%, which cuts model training time by about 63%. Compared with the MI-LSTM model, PMI-LSTM shows almost no performance improvement, but the second feature selection reduces the input dimension by a further 40% or so, which cuts model training time by about 20%.
Table 7 Comparison of evaluation indicators of the combined models
Figure 7 shows box plots of the four evaluation indicators over the 20 groups of experiments for M3-M6. The '+' symbols outside the boxes in Figure 7 are outliers and can be ignored. Figure 7 shows that the four evaluation indicators of the PMI-PSO-LSTM model are clearly better than those of the other three models: in every run its MAE and SMAPE are better than all results of the other models, and about 95% of its R2 and RMSE results are also better than those of the other models. The four evaluation indicators of MI-LSTM partly overlap with those of LSTM, but the overall trend of MI-LSTM is better than that of the LSTM model. Figure 7 also shows that, compared with the LSTM, MI-LSTM, and PMI-LSTM models, the PMI-PSO-LSTM model has the smallest box height (the difference between the upper and lower quartiles), which indicates that it is more stable than the other models.
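The stability comparison read off the box plots amounts to comparing interquartile ranges across repeated runs. A minimal sketch follows; the model names and MAE values are hypothetical placeholders, not results from the patent, and the "inclusive" quartile method is one of several conventions a plotting library might use.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range: upper quartile minus lower quartile."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


# Hypothetical MAE values from repeated runs of two models (not the patent's results).
runs = {
    "model_a": [5.0, 5.2, 5.1, 5.3, 4.9, 5.0, 5.1, 5.2],
    "model_b": [5.5, 6.4, 4.8, 6.0, 5.1, 6.8, 4.6, 5.9],
}
spread = {name: iqr(vals) for name, vals in runs.items()}
most_stable = min(spread, key=spread.get)
print(spread, most_stable)
```

A smaller interquartile range means the middle half of the runs are closer together, which is the sense in which a shorter box indicates a more stable model.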
In summary, the present invention proposes a combined short-term energy consumption prediction model based on PMI, PSO, and LSTM. First, in the data preprocessing stage, the mutual information method and the Pearson correlation coefficient are used to perform dual feature selection on the original data and remove redundant features. PSO then tunes the LSTM network architecture so that the LSTM topology best fits the current input data. Finally, the data after feature selection are fed into the optimized LSTM to make short-term predictions of energy consumption. To verify the effect of the PMI-PSO-LSTM model on short-term energy consumption prediction, multi-dimensional single-step prediction comparison experiments were carried out on the energy consumption time series data set of a building. The results of these experiments show that all four evaluation indicators of the PMI-PSO-LSTM combined model are the best, which indicates that the model has higher prediction accuracy and robustness as well as more stable prediction performance. The PMI-PSO-LSTM combined model offers a useful research direction for time series prediction with deep learning. However, the model still has considerable room for optimization, for example by studying noise filtering for time series and dynamic intelligent feature selection, to further improve prediction accuracy.
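The Pearson stage of the dual feature selection can be sketched as a simple filter. How the patent applies the coefficient is not spelled out in this section, so the rule used here (keep a feature when its absolute correlation with the target reaches a threshold) is an assumption, as are the names `pearson` and `pearson_filter`, the threshold of 0.3, and the toy data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    # A constant sequence has no defined correlation; treat it as 0 here.
    return cov / (sx * sy) if sx and sy else 0.0


def pearson_filter(columns, target, threshold=0.3):
    """Keep indices of columns with |r| >= threshold against the target.

    The threshold rule is an assumption made for this sketch.
    """
    return [i for i, col in enumerate(columns) if abs(pearson(col, target)) >= threshold]


# Toy data for illustration only.
target = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
columns = [
    [2.0, 4.1, 5.9, 8.2, 9.9, 12.1],  # nearly linear in the target
    [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],   # weakly related to the target
    [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated
]
kept = pearson_filter(columns, target)
print(kept)
```

The absolute value matters: a strongly negative correlation is as informative for prediction as a strongly positive one, so the anti-correlated column is retained.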
Parts not described in detail belong to the prior art.
Claims (5)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111213171.9A CN113962454A (en) | 2021-10-18 | 2021-10-18 | LSTM energy consumption prediction method based on dual feature selection + particle swarm optimization |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113962454A true CN113962454A (en) | 2022-01-21 |
Family
ID=79464357
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111213171.9A Pending CN113962454A (en) | 2021-10-18 | 2021-10-18 | LSTM energy consumption prediction method based on dual feature selection + particle swarm optimization |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113962454A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116561554A (en) * | 2023-04-18 | 2023-08-08 | 南方电网电力科技股份有限公司 | Feature extraction method, system, equipment and medium of boiler soot blower |
CN117455053A (en) * | 2023-10-31 | 2024-01-26 | 郑州轻工业大学 | Random configuration network prediction building energy consumption method based on search interval reconstruction |
CN118249408A (en) * | 2024-05-29 | 2024-06-25 | 浙江禹贡信息科技有限公司 | Grid-connected hybrid renewable energy system based on combination optimization and machine learning algorithm |
CN119203774A (en) * | 2024-10-14 | 2024-12-27 | 南京工业大学 | A PSO-LSTM temperature prediction method introducing attention mechanism |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108986470A (en) * | 2018-08-20 | 2018-12-11 | 华南理工大学 | The Travel Time Estimation Method of particle swarm algorithm optimization LSTM neural network |
CN110417011A (en) * | 2019-07-31 | 2019-11-05 | 三峡大学 | An online dynamic security assessment method based on mutual information and iterative random forest |
CN111783953A (en) * | 2020-06-30 | 2020-10-16 | 重庆大学 | A 7-day prediction method of 24-point power load value based on optimized LSTM network |
CN111985706A (en) * | 2020-08-15 | 2020-11-24 | 西北工业大学 | A daily passenger flow prediction method for scenic spots based on feature selection and LSTM |
CN112418504A (en) * | 2020-11-17 | 2021-02-26 | 西安热工研究院有限公司 | Wind speed prediction method based on mixed variable selection optimization deep belief network |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116561554A (en) * | 2023-04-18 | 2023-08-08 | 南方电网电力科技股份有限公司 | Feature extraction method, system, equipment and medium of boiler soot blower |
CN117455053A (en) * | 2023-10-31 | 2024-01-26 | 郑州轻工业大学 | Random configuration network prediction building energy consumption method based on search interval reconstruction |
CN118249408A (en) * | 2024-05-29 | 2024-06-25 | 浙江禹贡信息科技有限公司 | Grid-connected hybrid renewable energy system based on combination optimization and machine learning algorithm |
CN118249408B (en) * | 2024-05-29 | 2024-08-02 | 浙江禹贡信息科技有限公司 | Grid-connected hybrid renewable energy system based on combination optimization and machine learning algorithm |
CN119203774A (en) * | 2024-10-14 | 2024-12-27 | 南京工业大学 | A PSO-LSTM temperature prediction method introducing attention mechanism |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||