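WO2023035564A1 - Load interval prediction method and system based on quantile gradient boosting decision tree - Google Patents

Load interval prediction method and system based on quantile gradient boosting decision tree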

Info

Publication number
WO2023035564A1
WO2023035564A1 PCT/CN2022/079202 CN2022079202W WO2023035564A1 WO 2023035564 A1 WO2023035564 A1 WO 2023035564A1 CN 2022079202 W CN2022079202 W CN 2022079202W WO 2023035564 A1 WO2023035564 A1 WO 2023035564A1
Authority
WO
WIPO (PCT)
Prior art keywords
quantile
decision tree
load
distribution network
value
Prior art date
Application number
PCT/CN2022/079202
Other languages
English (en)
French (fr)
Inventor
黄园芳
段新辉
郑世明
李玲
林荣秋
吴莉琳
魏焱
刘云凯
彭显刚
付振宇
吴超成
陈宇钊
王志强
曹彦朝
谢卓均
李琦
王奕
张俊宏
Original Assignee
广东电网有限责任公司湛江供电局
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广东电网有限责任公司湛江供电局
Publication of WO2023035564A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Electricity, gas or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Definitions

  • The present application relates to the technical field of power load forecasting, and in particular to a load interval prediction method and system based on a quantile gradient boosting decision tree.
  • Distribution network station area load is directly tied to users' electricity consumption behavior. Compared with system-level load, it has higher uncertainty, which directly reduces the accuracy of traditional station area load forecasting and significantly affects the safe and stable operation of the distribution network.
  • Traditional distribution network station area load forecasting uses point prediction, which gives only a single deterministic value. It cannot account for the possible probability distribution of the future station area load and therefore struggles to meet practical needs in load uncertainty analysis.
  • A load interval prediction method for the distribution network station area can obtain the confidence interval of the station area load at a future time point and thus effectively quantify load uncertainty.
  • It therefore has application value and research significance for distribution network risk early-warning assessment, precise planning, optimal scheduling and related tasks.
  • The forecast error distribution statistics method places high demands on the quality of historical data, involves some subjectivity in dividing the statistical intervals, and is sensitive to parameter settings, all of which greatly affect the reliability of the constructed historical forecast error distribution. Probabilistic forecasting methods usually assume that the station area load follows a specific distribution, but the validity of this assumption is difficult to prove rigorously by statistical methods, and the assumed distribution can deviate substantially from the actual one, which reduces the accuracy of the station area load interval prediction.
  • Traditional quantile regression methods are mostly based on shallow machine learning algorithms such as BP neural networks, which tend to fall into local optima during training, resulting in insufficient generalization ability.
  • The present application provides a load interval prediction method and system based on a quantile gradient boosting decision tree, to solve the above technical problems of poor prediction reliability and accuracy and insufficient generalization ability.
  • A first aspect of the present application provides a load interval prediction method based on a quantile gradient boosting decision tree, including the following steps:
  • before step S1, the method includes:
  • raw data of the distribution network station area load are collected according to a preset sampling period and cleaned to obtain the original distribution network station area load sequence.
  • the original distribution network station area load sequence is a time series, and the raw data include active power.
  • step S1 specifically includes:
  • S102: repeat step S101 M times, adding different Gaussian white noise to the original distribution network station area load sequence each time, to obtain M groups of intrinsic mode components and residual components;
  • the mean value of the residual component is expressed as,
  • x(i) and its normalized counterpart represent the modal component values before and after normalization, respectively
  • x min and x max are the minimum and maximum values of the modal components, respectively.
  • step S2 specifically includes:
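  • for each decision tree in turn, m training samples are drawn at random with replacement using the Bootstrap strategy, generating a quantile gradient boosting decision tree composed of n decision trees, where n is the preset number of decision trees;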
  • D 1 represents a part of the training set, also denoted as D 1 (j, s)
  • m 1 represents the modal component set corresponding to D 1
  • c 1 represents the expected value corresponding to D 1
  • D 2 represents a part of the training set, also denoted as D 2 (j, s)
  • m 2 represents the modal component set corresponding to D 2
  • c 2 represents the expected value corresponding to D 2 ;
  • the pinball loss function is used to evaluate the prediction performance of the quantile gradient boosting decision tree prediction model, wherein the pinball loss function is,
  • L(y_i, c) represents the pinball loss function value
  • τ is the preset quantile point
  • ρ_τ represents the check function
  • r ti represents the negative gradient
  • f t-1 (x) represents the load prediction value when the quantile gradient boosting decision tree prediction model iterates t-1 times
  • c tj represents the best estimated value corresponding to ( xi , r ti );
  • f t (x) represents the output value of the updated quantile gradient boosting decision tree prediction model at t iterations
  • I( ) represents the step function
  • f T (x) represents the output value of the updated quantile gradient boosting decision tree prediction model at T iterations
  • the probability density function is,
  • n is the number of quantile points
  • K( ) represents the Gaussian kernel function
  • h is the preset window width coefficient
  • y represents the label of the test sample.
  • step S4 specifically includes:
  • s.t. represents the constraint condition
  • Pr(y ∈ [L, U]) represents the probability that y falls in the confidence interval [L, U], where [L, U] is the confidence interval satisfying the predetermined confidence level; it is output as the distribution network station area load interval prediction result.
  • the present invention also provides a load interval prediction system based on a quantile gradient boosting decision tree, including:
  • the modal decomposition module is used to decompose the original distribution network station area load sequence by ensemble empirical mode decomposition (EEMD), obtain several modal components, and normalize each modal component;
  • the decision tree prediction module is used to build a quantile gradient boosting decision tree prediction model for each modal component, obtain the predicted value of each modal component under different quantile conditions, and sum the predicted values of the modal components to obtain the conditional distribution of the predicted value at the preset quantile points;
  • the probability density calculation module is used to obtain, by kernel density estimation, the probability density function of the future distribution network station area load from the conditional distribution of the predicted value at the preset quantile points;
  • the confidence prediction module is used to compute from the probability density function the confidence interval satisfying the predetermined confidence level, and to output the distribution network station area load interval prediction result.
  • the present invention also provides a computer-readable storage medium storing a computer program which, when loaded and executed by a processor, implements the steps of the above load interval prediction method based on the quantile gradient boosting decision tree.
  • the present invention also provides an electronic device, including: a processor and a memory; wherein,
  • the memory is used to store computer programs
  • the processor is configured to load and execute the computer program, so that the electronic device executes the steps of the above-mentioned load interval prediction method based on quantile gradient boosting decision tree.
  • the present invention has the following advantages:
  • the invention decomposes the original distribution network station area load sequence by ensemble empirical mode decomposition (EEMD) to obtain modal components with different characteristics, which reduces the complexity of training the subsequent quantile gradient boosting decision tree prediction models and improves prediction accuracy.
  • the randomness of decision tree sampling ensures diverse learning across samples, so that the quantile gradient boosting decision tree is less prone to overfitting and generalizes well.
  • Fig. 1 is a flow chart of a load interval prediction method based on a quantile gradient boosting decision tree provided by an embodiment of the present application
  • Fig. 2 is a schematic structural diagram of a load interval prediction system based on a quantile gradient boosting decision tree provided by an embodiment of the present application.
  • probabilistic forecasting methods are machine learning algorithms developed from Bayesian theory. Most use kernel functions as the basis of regression analysis, and a representative algorithm is Gaussian process regression. Gaussian process regression assumes that the variance of the random variable follows a Gaussian distribution, and is mainly used to obtain the expected value of the predicted quantity and its distribution, and from these the interval prediction of the station area load at any confidence level. However, such methods usually assume that the station area load follows a specific distribution; the validity of this assumption is difficult to prove rigorously by statistical methods, and the assumed distribution can deviate substantially from the actual one, which reduces the accuracy of the station area load interval prediction;
  • the load interval prediction method based on a quantile gradient boosting decision tree provided by the present invention comprises the following steps:
  • the invention decomposes the original distribution network station area load sequence by ensemble empirical mode decomposition (EEMD) to obtain modal components with different characteristics, which reduces the complexity of training the subsequent quantile gradient boosting decision tree prediction models and improves prediction accuracy.
  • the randomness of decision tree sampling ensures diverse learning across samples, so that the quantile gradient boosting decision tree is less prone to overfitting and generalizes well.
  • a load interval prediction method based on a quantile gradient boosting decision tree comprises the following steps:
  • the original distribution network station area load sequence is time-sequential, and the original data includes active power.
  • the raw data of the distribution network station area load are collected according to a preset sampling period; because they are sampled in chronological order, a load time series is obtained.
  • step S100 specifically includes:
  • S102: repeat step S101 M times, adding different Gaussian white noise to the original distribution network station area load sequence each time, to obtain M groups of intrinsic mode components and residual components;
  • the residual component mean is expressed as,
  • x(i) and its normalized counterpart represent the modal component values before and after normalization, respectively
  • x min and x max are the minimum and maximum values of the modal components, respectively.
  • step S200 specifically includes:
  • in a typical example, the first 70% to 90% of the data set (modal component) is taken as training samples and the remaining data as test samples.
  • the data in a modal component are ordered chronologically, and the attributes and labels of the samples are obtained by combining them appropriately.
  • D 1 represents a part of the training set, also denoted as D 1 (j, s)
  • m 1 represents the modal component set corresponding to D 1
  • c 1 represents the expected value corresponding to D 1
  • D 2 represents a part of the training set, also denoted as D 2 (j, s)
  • m 2 represents the modal component set corresponding to D 2
  • c 2 represents the expected value corresponding to D 2 ;
  • the pinball loss function is used to evaluate the prediction performance of the quantile gradient boosting decision tree prediction model, wherein the pinball loss function is,
  • L(y_i, c) represents the pinball loss function value
  • τ is the preset quantile point
  • ρ_τ represents the check function
  • the pinball loss function can be used to evaluate the difference between the predicted value of the model (the quantile gradient boosting decision tree) and the actual value of the sample under different quantile conditions. The better the loss function result (that is, the lower the loss), the better the model usually performs.
  • r ti represents the negative gradient
  • f t-1 (x) represents the load prediction value when the quantile gradient boosting decision tree prediction model iterates t-1 times
  • c tj represents the best estimated value corresponding to ( xi , r ti );
  • f t (x) represents the output value of the updated quantile gradient boosting decision tree prediction model at t iterations
  • I( ) represents the step function
  • the quantile gradient boosting decision tree model is trained iteratively.
  • each new iteration uses the negative gradient to measure the performance of the base learner from the previous round and corrects earlier errors by fitting the negative gradient of the loss function, so that the output value approaches the true value ever more closely.
  • f T (x) represents the output value of the updated quantile gradient boosting decision tree prediction model at T iterations
  • the probability density function is,
  • n is the number of quantile points
  • K( ) represents the Gaussian kernel function
  • h is the preset window width coefficient
  • y represents the label of the test sample.
  • a suitable bandwidth (window width) coefficient can be selected by a rule of thumb.
  • step S400 specifically includes:
  • s.t. represents the constraint condition
  • Pr(y ∈ [L, U]) represents the probability that y falls in the confidence interval [L, U], where [L, U] is the confidence interval satisfying the predetermined confidence level; it is output as the distribution network station area load interval prediction result.
  • the present invention also provides a load interval prediction system based on a quantile gradient boosting decision tree, including:
  • the modal decomposition module 100 is used to decompose the original distribution network station area load sequence by ensemble empirical mode decomposition (EEMD), obtain several modal components, and normalize each modal component;
  • the decision tree prediction module 200 is used to build a quantile gradient boosting decision tree prediction model for each modal component, obtain the predicted value of each modal component under different quantile conditions, and sum the predicted values of the modal components to obtain the conditional distribution of the predicted value at the preset quantile points;
  • the probability density calculation module 300 is used to obtain, by kernel density estimation, the probability density function of the future distribution network station area load from the conditional distribution of the predicted value at the preset quantile points;
  • the confidence prediction module 400 is used to compute from the probability density function the confidence interval satisfying the predetermined confidence level, and to output the distribution network station area load interval prediction result.
  • the present invention provides a load interval prediction system based on a quantile gradient boosting decision tree, which decomposes the original distribution network station area load sequence by EEMD to obtain modal components with different characteristics, reducing the complexity of training the subsequent quantile gradient boosting decision tree prediction models and improving prediction accuracy.
  • it uses kernel density estimation to obtain the probability density function, avoiding the subjectivity and prior assumptions involved in constructing a probability distribution and improving the reliability and accuracy of the station area load interval prediction.
  • the randomness of decision tree sampling ensures diverse learning across samples, so that the quantile gradient boosting decision tree is less prone to overfitting and generalizes well.
  • the present invention also provides a computer-readable storage medium in which a computer program is stored.
  • when the computer program is loaded and executed by a processor, the steps of the above load interval prediction method based on the quantile gradient boosting decision tree are implemented.
  • the present invention also provides an electronic device, including: a processor and a memory; wherein,
  • the processor is used to load and execute the computer program, so that the electronic device executes the steps of the above-mentioned load interval prediction method based on quantile gradient boosting decision tree.
  • the disclosed devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of units is only a logical function division. In actual implementation, there may be other division methods.
  • for example, multiple units or components can be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • if the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence or in the part that contributes to the prior art, or in whole or in part, can be embodied as a software product. The software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage media include USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical discs and other media that can store program code.

Abstract

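A load interval prediction method and system based on a quantile gradient boosting decision tree. The original distribution network station area load sequence is decomposed by ensemble empirical mode decomposition (EEMD) into modal components with different characteristics, which reduces the complexity of training the subsequent quantile gradient boosting decision tree prediction models and improves prediction accuracy. Kernel density estimation is used to obtain the probability density function, which avoids the subjectivity and prior assumptions involved in constructing a probability distribution and improves the reliability and accuracy of the station area load interval prediction. In addition, the randomness of decision tree sampling ensures diverse learning across samples, so that the quantile gradient boosting decision tree is less prone to overfitting and generalizes well.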

Description

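Load interval prediction method and system based on quantile gradient boosting decision tree
This application claims priority to Chinese patent application No. 202111046819.8, entitled "Load interval prediction method and system based on quantile gradient boosting decision tree", filed with the China Patent Office on September 8, 2021, the entire contents of which are incorporated herein by reference.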
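Technical Field
The present application relates to the technical field of power load forecasting, and in particular to a load interval prediction method and system based on a quantile gradient boosting decision tree.
Background
With the construction of smart distribution networks and the rapid development of big data technology, the problems of collecting and storing massive data from distribution network station areas have been solved in recent years, providing a complete data foundation and the technical conditions for fine-grained management of distribution network station areas. However, station area load is directly tied to users' electricity consumption behavior. Compared with system-level load, it has higher uncertainty, which directly reduces the accuracy of traditional station area load forecasting and significantly affects the safe and stable operation of the distribution network. Traditional station area load forecasting uses point prediction, which gives only a single deterministic value, cannot account for the possible probability distribution of the future station area load, and struggles to meet practical needs in problems involving load uncertainty analysis.
A load interval prediction method for distribution network station areas, by contrast, can obtain the confidence interval of the station area load at a future time point and thereby effectively quantify load uncertainty. It therefore has application value and research significance for distribution network risk early-warning assessment, precise planning, optimal scheduling and related tasks.
The prior art mostly uses the forecast error distribution statistics method, probabilistic forecasting methods or traditional quantile regression methods. The forecast error distribution statistics method places high demands on the quality of historical data, involves some subjectivity in dividing the statistical intervals, and is sensitive to parameter settings, all of which greatly affect the reliability of the constructed historical forecast error distribution. Probabilistic forecasting methods usually assume that the station area load follows a specific distribution, but the validity of this assumption is difficult to prove rigorously by statistical methods, and the assumed distribution can deviate substantially from the actual one, which reduces the accuracy of the station area load interval prediction. Traditional quantile regression methods are mostly based on shallow machine learning algorithms such as BP neural networks, which tend to fall into local optima during training, resulting in insufficient generalization ability.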
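Summary of the Invention
The present application provides a load interval prediction method and system based on a quantile gradient boosting decision tree, to solve the above technical problems of poor prediction reliability and accuracy and insufficient generalization ability.
In view of this, a first aspect of the present application provides a load interval prediction method based on a quantile gradient boosting decision tree, including the following steps:
S1: decompose the original distribution network station area load sequence by ensemble empirical mode decomposition (EEMD) to obtain several modal components, and normalize each modal component;
S2: build a quantile gradient boosting decision tree prediction model for each modal component to obtain the predicted value of each modal component under different quantile conditions, and sum the predicted values of the modal components to obtain the conditional distribution of the predicted value at the preset quantile points;
S3: apply kernel density estimation to the conditional distribution of the predicted value at the preset quantile points to obtain the probability density function of the future distribution network station area load;
S4: compute from the probability density function the confidence interval satisfying the predetermined confidence level, and output the distribution network station area load interval prediction result.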
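Preferably, before step S1, the method includes:
collecting raw data of the distribution network station area load according to a preset sampling period and cleaning the raw data to obtain the original distribution network station area load sequence, where the original distribution network station area load sequence is a time series and the raw data include active power.
Preferably, step S1 specifically includes:
S101: add Gaussian white noise to the original distribution network station area load sequence to obtain a new load sequence, and decompose the new load sequence by ensemble empirical mode decomposition to obtain several modal components, the modal components comprising several intrinsic mode components and one residual component;
S102: repeat step S101 M times, adding different Gaussian white noise to the original distribution network station area load sequence each time, to obtain M groups of intrinsic mode components and residual components;
S103: average the M groups of intrinsic mode components and residual components separately to obtain several intrinsic mode component means and one residual component mean, where the intrinsic mode component mean is expressed as,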
Figure PCTCN2022079202-appb-000001
and the residual component mean is expressed as,
Figure PCTCN2022079202-appb-000002
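where imf_{i,m}(t) is the i-th intrinsic mode component of the m-th group, m = 1, 2, …, M, and r_m(t) is the residual component of the m-th group;
each modal component is normalized by the following formula,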
Figure PCTCN2022079202-appb-000003
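where x(i) and
Figure PCTCN2022079202-appb-000004
denote the modal component values before and after normalization, respectively, and x_min and x_max are the minimum and maximum of the modal component values.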
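Preferably, step S2 specifically includes:
S201: select training samples and test samples from the normalized modal components to build the training set and the test set, respectively;
S202: suppose the training samples are defined as D = {(x_1, y_1), (x_2, y_2), …, (x_i, y_i), …, (x_m, y_m)}, where x_i and y_i are the attributes and label of a training sample, x_i ∈ R^N, R denotes the real number field and N the dimension; based on the decision tree algorithm, for each decision tree in turn, m training samples are drawn at random with replacement using the Bootstrap strategy, generating a quantile gradient boosting decision tree composed of n decision trees, where n is the preset number of decision trees;
S203: randomly select an attribute j to split on, and sort all values of attribute j in ascending order, denoted as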
Figure PCTCN2022079202-appb-000005
the candidate split point set H_j on attribute j is then obtained by the following formula:
Figure PCTCN2022079202-appb-000006
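S204: randomly select a split point s from the candidate split point set H_j, s ∈ H_j, and split the training set into two parts according to (j, s);
S205: compute, by the following formulas, the expected values of the labels on the two parts of the split training set, as candidate estimates for the decision tree: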
Figure PCTCN2022079202-appb-000007
Figure PCTCN2022079202-appb-000008
where D_1 denotes one part of the training set, also written D_1(j, s),
Figure PCTCN2022079202-appb-000009
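m_1 denotes the set of modal components corresponding to D_1, c_1 denotes the expected value corresponding to D_1, and D_2 denotes the other part of the training set, also written D_2(j, s),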
Figure PCTCN2022079202-appb-000010
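m_2 denotes the set of modal components corresponding to D_2, and c_2 denotes the expected value corresponding to D_2;
S206: traverse all possible solutions (j, s) until the optimal solution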
Figure PCTCN2022079202-appb-000011
is found that minimizes the objective value of the following formula, and take the optimal solution
Figure PCTCN2022079202-appb-000012
as the split node:
Figure PCTCN2022079202-appb-000013
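S207: repeat steps S203 to S206 until the stop-splitting condition is met, thereby generating a decision tree, where the stop-splitting condition is that the objective value is less than a preset threshold or the preset maximum depth of the decision tree is reached;
S208: use the pinball loss function to evaluate the prediction performance of the quantile gradient boosting decision tree prediction model, where the pinball loss function is,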
Figure PCTCN2022079202-appb-000014
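where L(y_i, c) denotes the pinball loss function value, τ is the preset quantile point, and ρ_τ denotes the check function;
S209: suppose the output value of the quantile gradient boosting decision tree prediction model is f(x); then f(x) is initialized as,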
Figure PCTCN2022079202-appb-000015
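S210: let the iteration index be t = 1, 2, …, T, and compute the negative gradient of the loss function for the i-th training sample after t iterations by the following formula: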
Figure PCTCN2022079202-appb-000016
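where r_ti denotes the negative gradient and f_{t-1}(x) denotes the load prediction of the quantile gradient boosting decision tree prediction model at iteration t-1;
S211: replace (x_i, y_i) with (x_i, r_ti), i = 1, 2, …, m, and fit (x_i, r_ti) according to steps S203 to S207 to obtain the t-th decision tree, whose leaf node regions are R_tj, j = 1, 2, …, J, where J is the number of leaf nodes of the decision tree; the best estimate is calculated by the following formula: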
Figure PCTCN2022079202-appb-000017
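where c_tj denotes the best estimate corresponding to (x_i, r_ti);
S212: update the output value f(x) of the quantile gradient boosting decision tree prediction model by the following formula: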
Figure PCTCN2022079202-appb-000018
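where f_t(x) denotes the updated output value of the quantile gradient boosting decision tree prediction model at iteration t, and I(·) denotes the step function;
S213: after training is complete, the final output value f(x) of the quantile gradient boosting decision tree prediction model is,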
Figure PCTCN2022079202-appb-000019
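where f_T(x) denotes the updated output value of the quantile gradient boosting decision tree prediction model at iteration T;
S214: suppose the preset quantile points τ take the values τ_i = 0.01, 0.02, …, 0.99; given the test samples of the w-th modal component, there is a corresponding predicted value of that modal component at quantile point τ_i, denoted f_j(x|τ_i);
S215: sum the predicted values of the modal components by the following formula to obtain the conditional distribution f(x|τ_i) of the predicted value at the preset quantile points: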
Figure PCTCN2022079202-appb-000020
Preferably, the probability density function is,
Figure PCTCN2022079202-appb-000021
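where n is the number of quantile points, K(·) denotes the Gaussian kernel function, h is the preset bandwidth (window width) coefficient, and y denotes the label of the test sample.
Preferably, step S4 specifically includes:
Suppose the given confidence level is (1-α)×100%, where α denotes the significance level, α = 0.01, 0.05 or 0.1; then the lower limit L and upper limit U of the confidence interval are obtained from the probability density function such that they satisfy the following condition: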
Figure PCTCN2022079202-appb-000022
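where s.t. denotes the constraint and Pr(y ∈ [L, U]) denotes the probability that y falls within the confidence interval [L, U]; [L, U] is the confidence interval satisfying the predetermined confidence level, and it is output as the distribution network station area load interval prediction result.
In a second aspect, the present invention also provides a load interval prediction system based on a quantile gradient boosting decision tree, including:
a modal decomposition module, used to decompose the original distribution network station area load sequence by ensemble empirical mode decomposition to obtain several modal components, and to normalize each modal component;
a decision tree prediction module, used to build a quantile gradient boosting decision tree prediction model for each modal component, obtain the predicted value of each modal component under different quantile conditions, and sum the predicted values of the modal components to obtain the conditional distribution of the predicted value at the preset quantile points;
a probability density calculation module, used to obtain, by kernel density estimation, the probability density function of the future distribution network station area load from the conditional distribution of the predicted value at the preset quantile points;
a confidence prediction module, used to compute from the probability density function the confidence interval satisfying the predetermined confidence level, and to output the distribution network station area load interval prediction result.
In a third aspect, the present invention also provides a computer-readable storage medium storing a computer program which, when loaded and executed by a processor, implements the steps of the above load interval prediction method based on the quantile gradient boosting decision tree.
In a fourth aspect, the present invention also provides an electronic device, including a processor and a memory, wherein
the memory is used to store a computer program;
the processor is used to load and execute the computer program, so that the electronic device executes the steps of the above load interval prediction method based on the quantile gradient boosting decision tree.
It can be seen from the above technical solutions that the present invention has the following advantages:
The present invention decomposes the original distribution network station area load sequence by ensemble empirical mode decomposition to obtain modal components with different characteristics, which reduces the complexity of training the subsequent quantile gradient boosting decision tree prediction models and improves prediction accuracy. It uses kernel density estimation to obtain the probability density function, which avoids the subjectivity and prior assumptions involved in constructing a probability distribution and improves the reliability and accuracy of the station area load interval prediction. In addition, the randomness of decision tree sampling ensures diverse learning across samples, so that the quantile gradient boosting decision tree is less prone to overfitting and generalizes well.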
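Brief Description of the Drawings
Fig. 1 is a flow chart of a load interval prediction method based on a quantile gradient boosting decision tree provided by an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a load interval prediction system based on a quantile gradient boosting decision tree provided by an embodiment of the present application.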
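Detailed Description
To enable those skilled in the art to better understand the solution of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the scope of protection of the present application.
The prior art mostly uses the forecast error distribution statistics method, probabilistic forecasting methods or traditional quantile regression methods. The forecast error distribution statistics method takes historical station area load forecast error data, builds an error probability distribution model along two dimensions (station area load level and statistical period), and on that basis applies a probabilistic compensation correction to the deterministic forecast to obtain the interval prediction. However, it places high demands on the quality of historical data, involves some subjectivity in dividing the statistical intervals, and is sensitive to parameter settings, all of which greatly affect the reliability of the constructed historical forecast error distribution;
Probabilistic forecasting methods are machine learning algorithms developed from Bayesian theory, mostly using kernel functions as the basis of regression analysis; a representative algorithm is Gaussian process regression. Gaussian process regression assumes that the variance of the random variable follows a Gaussian distribution and is mainly used to obtain the expected value of the predicted quantity and its distribution, and from these the interval prediction of the station area load at any confidence level. However, such methods usually assume that the station area load follows a specific distribution; the validity of this assumption is difficult to prove rigorously by statistical methods, and the assumed distribution can deviate substantially from the actual one, which reduces the accuracy of the station area load interval prediction;
Traditional quantile regression methods are mostly based on shallow machine learning algorithms such as BP neural networks, which tend to fall into local optima during training, resulting in insufficient generalization ability.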
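To this end, referring to Fig. 1, the load interval prediction method based on a quantile gradient boosting decision tree provided by the present invention includes the following steps:
S1: decompose the original distribution network station area load sequence by ensemble empirical mode decomposition (EEMD) to obtain several modal components, and normalize each modal component;
S2: build a quantile gradient boosting decision tree prediction model for each modal component to obtain the predicted value of each modal component under different quantile conditions, and sum the predicted values of the modal components to obtain the conditional distribution of the predicted value at the preset quantile points;
S3: apply kernel density estimation to the conditional distribution of the predicted value at the preset quantile points to obtain the probability density function of the future distribution network station area load;
S4: compute from the probability density function the confidence interval satisfying the predetermined confidence level, and output the distribution network station area load interval prediction result.
The present invention decomposes the original distribution network station area load sequence by ensemble empirical mode decomposition to obtain modal components with different characteristics, which reduces the complexity of training the subsequent quantile gradient boosting decision tree prediction models and improves prediction accuracy. It uses kernel density estimation to obtain the probability density function, which avoids the subjectivity and prior assumptions involved in constructing a probability distribution and improves the reliability and accuracy of the station area load interval prediction. In addition, the randomness of decision tree sampling ensures diverse learning across samples, so that the quantile gradient boosting decision tree is less prone to overfitting and generalizes well.
The following is a detailed description of an embodiment of the load interval prediction method based on a quantile gradient boosting decision tree provided by the present invention.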
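The load interval prediction method based on a quantile gradient boosting decision tree provided by the present invention includes the following steps:
S0: collect raw data of the distribution network station area load according to a preset sampling period and clean the raw data to obtain the original distribution network station area load sequence, where the original distribution network station area load sequence is a time series and the raw data include active power.
It should be noted that the raw data of the distribution network station area load are collected according to a preset sampling period; because they are sampled in chronological order, a load time series is obtained.
In addition, data may be missing or abnormal for various reasons during sampling; cleaning the raw data yields a relatively complete and normal load time series.
S100: decompose the original distribution network station area load sequence by ensemble empirical mode decomposition to obtain several modal components, and normalize each modal component;
Specifically, step S100 includes:
S101: add Gaussian white noise to the original distribution network station area load sequence to obtain a new load sequence, and decompose the new load sequence by ensemble empirical mode decomposition to obtain several modal components, the modal components comprising several intrinsic mode components and one residual component;
S102: repeat step S101 M times, adding different Gaussian white noise to the original distribution network station area load sequence each time, to obtain M groups of intrinsic mode components and residual components;
S103: average the M groups of intrinsic mode components and residual components separately to obtain several intrinsic mode component means and one residual component mean, where the intrinsic mode component mean is expressed as,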
Figure PCTCN2022079202-appb-000023
and the residual component mean is expressed as,
Figure PCTCN2022079202-appb-000024
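where imf_{i,m}(t) is the i-th intrinsic mode component of the m-th group, m = 1, 2, …, M, and r_m(t) is the residual component of the m-th group;
each modal component is normalized by the following formula,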
Figure PCTCN2022079202-appb-000025
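where x(i) and
Figure PCTCN2022079202-appb-000026
denote the modal component values before and after normalization, respectively, and x_min and x_max are the minimum and maximum of the modal component values.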
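Steps S102 and S103 and the normalization can be sketched as follows. This is a minimal illustration only: the formula images are not available in this text, so the sketch assumes the ensemble mean is a plain arithmetic average over the M trials and that the normalization is standard min-max scaling to [0, 1], which is what the symbols x_min and x_max suggest. The function names `ensemble_mean` and `min_max_normalize` are hypothetical, and the decomposition itself (S101) is assumed to be done elsewhere.

```python
from typing import List


def ensemble_mean(groups: List[List[List[float]]]) -> List[List[float]]:
    """Average M noisy decompositions component by component (assumed form of S103).

    groups[m][i][t] is the value at time t of component i in trial m.
    Assumes every trial yields the same number of components and time steps.
    """
    n_trials = len(groups)
    n_comp = len(groups[0])
    n_time = len(groups[0][0])
    return [
        [sum(groups[m][i][t] for m in range(n_trials)) / n_trials
         for t in range(n_time)]
        for i in range(n_comp)
    ]


def min_max_normalize(x: List[float]) -> List[float]:
    """Scale one modal component to [0, 1] (assumed min-max form)."""
    lo, hi = min(x), max(x)
    if hi == lo:
        # A constant component has no spread; map it to zeros by convention.
        return [0.0 for _ in x]
    return [(v - lo) / (hi - lo) for v in x]
```

For example, averaging two trials with two components each and then normalizing one component gives a series whose minimum is 0 and maximum is 1; predictions made on the normalized scale would need the inverse mapping before they are summed as loads.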
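S200: build a quantile gradient boosting decision tree prediction model for each modal component to obtain the predicted value of each modal component under different quantile conditions, and sum the predicted values of the modal components to obtain the conditional distribution of the predicted value at the preset quantile points;
Specifically, step S200 includes:
S201: select training samples and test samples from the normalized modal components to build the training set and the test set, respectively;
In a typical example, the first 70% to 90% of the data set (modal component) is taken as training samples and the remaining data as test samples.
S202: suppose the training samples are defined as D = {(x_1, y_1), (x_2, y_2), …, (x_i, y_i), …, (x_m, y_m)}, where x_i and y_i are the attributes and label of a training sample, x_i ∈ R^N, R denotes the real number field and N the dimension; based on the decision tree algorithm, for each decision tree in turn, m training samples are drawn at random with replacement using the Bootstrap strategy, generating a quantile gradient boosting decision tree composed of n decision trees, where n is the preset number of decision trees;
It should be noted that, if the current sampling time is denoted t, the attributes of a sample are the data of the modal component at the N sampling times t-p, which are generally not unique, p = 1, 2, …, k, where k is any natural number; the label of a sample is the datum of the modal component at the single sampling time t+q, q = 1, 2, …, k. The data in a modal component are ordered chronologically, so the attributes and labels of the samples can be obtained by combining them appropriately.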
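The sample construction described in S201 and S202 can be sketched as a sliding window over one modal component. This is a hedged illustration: it assumes the N attributes are the N consecutive values just before the current time t and that the label is the value q steps after t, which is one reading of the note above. The names `make_samples`, `chronological_split`, `n_lags` and `train_ratio` are hypothetical.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], float]


def make_samples(series: Sequence[float], n_lags: int, q: int = 1) -> List[Sample]:
    """Build (attributes, label) pairs from a chronologically ordered series.

    For each current time t, the attributes are the n_lags values at
    t-n_lags .. t-1 and the label is the value at t+q (assumed layout).
    """
    samples = []
    for t in range(n_lags, len(series) - q):
        x = list(series[t - n_lags:t])
        y = series[t + q]
        samples.append((x, y))
    return samples


def chronological_split(samples: List[Sample],
                        train_ratio: float = 0.8) -> Tuple[List[Sample], List[Sample]]:
    """Take the first train_ratio of the samples for training, the rest for testing."""
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]
```

Splitting by position rather than at random keeps the test samples strictly later in time than the training samples, which matches the "first 70% to 90%" wording of the typical example.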
S203: randomly select an attribute j to split on, and sort all values of attribute j in ascending order, denoted as
Figure PCTCN2022079202-appb-000027
the candidate split point set H_j on attribute j is then obtained by the following formula:
Figure PCTCN2022079202-appb-000028
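In this embodiment, if the current sampling time is denoted t, the attributes of a sample are the data of the modal component at the N sampling times t-p, which are generally not unique, p = 1, 2, …, k, where k is any natural number. All values of attribute j are re-sorted in ascending order of their numerical value.
S204: randomly select a split point s from the candidate split point set H_j, s ∈ H_j, and split the training set into two parts according to (j, s);
S205: compute, by the following formulas, the expected values of the labels on the two parts of the split training set, as candidate estimates for the decision tree: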
Figure PCTCN2022079202-appb-000029
Figure PCTCN2022079202-appb-000030
where D_1 denotes one part of the training set, also written D_1(j, s),
Figure PCTCN2022079202-appb-000031
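m_1 denotes the set of modal components corresponding to D_1, c_1 denotes the expected value corresponding to D_1, and D_2 denotes the other part of the training set, also written D_2(j, s),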
Figure PCTCN2022079202-appb-000032
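m_2 denotes the set of modal components corresponding to D_2, and c_2 denotes the expected value corresponding to D_2;
S206: traverse all possible solutions (j, s) until the optimal solution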
Figure PCTCN2022079202-appb-000033
is found that minimizes the objective value of the following formula, and take the optimal solution
Figure PCTCN2022079202-appb-000034
as the split node:
Figure PCTCN2022079202-appb-000035
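S207: repeat steps S203 to S206 until the stop-splitting condition is met, thereby generating a decision tree, where the stop-splitting condition is that the objective value is less than a preset threshold or the preset maximum depth of the decision tree is reached;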
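The split search in S203 to S206 can be sketched as below. Because the formula images are missing here, the sketch makes two loudly labeled assumptions: the candidate split points H_j are the midpoints between adjacent distinct sorted values of attribute j, and the objective is the usual regression-tree sum of squared deviations of the labels from the part means c_1 and c_2. It also searches all (j, s) exhaustively instead of sampling j and s at random. The name `best_split` is hypothetical.

```python
from typing import List, Optional, Tuple


def best_split(xs: List[List[float]],
               ys: List[float]) -> Optional[Tuple[int, float, float]]:
    """Return (j, s, cost) minimizing the assumed squared-error objective.

    xs[i] is the attribute vector of sample i and ys[i] its label.
    Returns None when no attribute has two distinct values.
    """
    best = None
    for j in range(len(xs[0])):
        values = sorted(set(x[j] for x in xs))
        # Assumed candidate set H_j: midpoints of adjacent sorted values.
        candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
        for s in candidates:
            left = [y for x, y in zip(xs, ys) if x[j] <= s]
            right = [y for x, y in zip(xs, ys) if x[j] > s]
            c1 = sum(left) / len(left)    # label mean on part D_1
            c2 = sum(right) / len(right)  # label mean on part D_2
            cost = (sum((y - c1) ** 2 for y in left)
                    + sum((y - c2) ** 2 for y in right))
            if best is None or cost < best[2]:
                best = (j, s, cost)
    return best
```

Applying `best_split` recursively to each part, until the cost falls below a threshold or a maximum depth is reached, would give the tree-growing loop of S207.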
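S208: use the pinball loss function to evaluate the prediction performance of the quantile gradient boosting decision tree prediction model, where the pinball loss function is,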
Figure PCTCN2022079202-appb-000036
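where L(y_i, c) denotes the pinball loss function value, τ is the preset quantile point, and ρ_τ denotes the check function;
It should be noted that the pinball loss function can be used to evaluate the difference between the predicted value of the model (the quantile gradient boosting decision tree) and the actual value of the sample under different quantile conditions. The better the loss function result (that is, the lower the loss), the better the model usually performs.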
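The pinball loss can be sketched as follows, assuming the standard check function ρ_τ(u) = τ·u for u ≥ 0 and (τ - 1)·u for u < 0 with u = y - c. The formula image is missing, so whether the original sums or averages over samples is not known; this sketch averages, which differs only by a constant factor. The name `pinball_loss` is hypothetical.

```python
from typing import Sequence


def pinball_loss(y_true: Sequence[float], y_pred: Sequence[float], tau: float) -> float:
    """Mean pinball (quantile) loss at quantile level tau in (0, 1)."""
    total = 0.0
    for y, c in zip(y_true, y_pred):
        u = y - c
        # Check function: under-prediction costs tau per unit,
        # over-prediction costs (1 - tau) per unit.
        total += tau * u if u >= 0 else (tau - 1.0) * u
    return total / len(y_true)
```

At τ = 0.9, predicting 2 units too low costs 1.8 while predicting 2 units too high costs about 0.2, so minimizing this loss pushes the prediction toward the 90th percentile of the label.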
S209: suppose the output value of the quantile gradient boosting decision tree prediction model is f(x); then f(x) is initialized as,
Figure PCTCN2022079202-appb-000037
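S210: let the iteration index be t = 1, 2, …, T, and compute the negative gradient of the loss function for the i-th training sample after t iterations by the following formula: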
Figure PCTCN2022079202-appb-000038
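where r_ti denotes the negative gradient and f_{t-1}(x) denotes the load prediction of the quantile gradient boosting decision tree prediction model at iteration t-1;
S211: replace (x_i, y_i) with (x_i, r_ti), i = 1, 2, …, m, and fit (x_i, r_ti) according to steps S203 to S207 to obtain the t-th decision tree, whose leaf node regions are R_tj, j = 1, 2, …, J, where J is the number of leaf nodes of the decision tree; the best estimate is calculated by the following formula: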
Figure PCTCN2022079202-appb-000039
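where c_tj denotes the best estimate for leaf node region R_tj, obtained from the samples (x_i, r_ti);
S212. Update the output f(x) of the quantile gradient boosting decision tree prediction model by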
Figure PCTCN2022079202-appb-000040
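where f_t(x) denotes the updated output of the quantile gradient boosting decision tree prediction model at iteration t, and I(·) denotes the indicator (step) function;
Note that the quantile gradient boosting decision tree model is trained iteratively: each round uses the negative gradient to measure the performance of the base learners from the previous round and fits the negative gradient of the loss function to correct earlier errors, so that the output progressively approaches the true value.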
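The loop of S209 to S213 can be sketched in plain Python as below. It is a simplified illustration rather than the disclosed implementation: inputs are scalar, each tree is a depth-1 stump, Bootstrap sampling is omitted, and the standard pinball-loss gradient is assumed. All function names are hypothetical.

```python
import math


def quantile(values, tau):
    """Empirical tau-quantile; this order statistic minimizes the pinball loss."""
    ordered = sorted(values)
    k = max(0, math.ceil(tau * len(ordered)) - 1)
    return ordered[k]


def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def fit_stump(xs, targets):
    """Least-squares threshold for a depth-1 tree (None if no split exists)."""
    best = None
    distinct = sorted(set(xs))
    for a, b in zip(distinct, distinct[1:]):
        s = (a + b) / 2
        left = [r for x, r in zip(xs, targets) if x <= s]
        right = [r for x, r in zip(xs, targets) if x > s]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, s)
    return None if best is None else best[1]


def fit_quantile_boosting(xs, ys, tau, n_rounds=20):
    """Return (f0, stumps); each stump is (threshold, c_left, c_right)."""
    f0 = quantile(ys, tau)  # S209: constant that minimizes the pinball loss
    preds = [f0] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # S210: negative gradient of the pinball loss at the current output
        grads = [tau if y > p else tau - 1 for y, p in zip(ys, preds)]
        s = fit_stump(xs, grads)  # S211: fit a tree to (x_i, r_ti)
        if s is None:
            break
        # S211: leaf value = tau-quantile of the residuals in that leaf
        c_left = quantile([y - p for x, y, p in zip(xs, ys, preds) if x <= s], tau)
        c_right = quantile([y - p for x, y, p in zip(xs, ys, preds) if x > s], tau)
        stumps.append((s, c_left, c_right))
        # S212: add the leaf value of the new tree to every output
        preds = [p + (c_left if x <= s else c_right) for x, p in zip(xs, preds)]
    return f0, stumps


def predict(model, x):
    """S213: initial value plus the leaf values of all fitted trees."""
    f0, stumps = model
    return f0 + sum(cl if x <= s else cr for s, cl, cr in stumps)
```

Training one such model per quantile point τ_i and per modal component gives the predictions f_j(x | τ_i) used in the following steps.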
S213. After training, the final output f(x) of the quantile gradient boosting decision tree prediction model is
Figure PCTCN2022079202-appb-000041
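where f_T(x) denotes the updated output of the quantile gradient boosting decision tree prediction model at iteration T;
S214. Let the preset quantile point τ take the values τ_i = 0.01, 0.02, …, 0.99. Given a test sample of the j-th modal component, the corresponding prediction of that modal component at quantile point τ_i is denoted f_j(x | τ_i);
S215. Sum the predictions of all modal components by the following formula to obtain the conditional distribution f(x | τ_i) of the predicted value at the preset quantile points: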
Figure PCTCN2022079202-appb-000042
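A minimal sketch of S214 and S215 follows. It assumes that each component model already returns its predictions in the original load units (the text does not state where the normalization is inverted); the constant `TAUS`, the function name and the nested-list layout are hypothetical.

```python
# Quantile grid from S214: 0.01, 0.02, ..., 0.99 (99 points)
TAUS = [i / 100 for i in range(1, 100)]


def sum_component_quantiles(component_predictions):
    """component_predictions[j][i] holds f_j(x | tau_i) for modal component j.

    Returns the list f(x | tau_i), one summed value per quantile point.
    """
    return [sum(per_quantile) for per_quantile in zip(*component_predictions)]
```

The result is a vector of 99 values for one test sample, which serves as the input points for the kernel density estimate in S300.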
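S300. Apply kernel density estimation to the conditional distribution of the predicted value at the preset quantile points to obtain the probability density function of the future distribution network station-area load;
Specifically, the probability density function is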
Figure PCTCN2022079202-appb-000043
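where n is the number of quantile points, K(·) denotes the Gaussian kernel function, h is the preset bandwidth, and y denotes the label of the test sample.
Note that a suitable bandwidth can be selected by a rule of thumb.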
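As a sketch under stated assumptions, the density estimate can be computed as below. The standard Gaussian kernel density form, a sum of K((y - f(x | τ_i)) / h) divided by n·h, is assumed because the formula is given only as a figure reference, and the rule of thumb is taken to be the normal-reference bandwidth 1.06·σ·n^(-1/5), which the text does not name explicitly. Function names are hypothetical.

```python
import math
import statistics


def gaussian_kernel(u):
    """Standard normal density K(u)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


def rule_of_thumb_bandwidth(points):
    """Normal-reference bandwidth h = 1.06 * sigma * n^(-1/5) (an assumed choice)."""
    n = len(points)
    return 1.06 * statistics.stdev(points) * n ** (-1 / 5)


def kde_pdf(y, points, h):
    """Density at y estimated from the n summed quantile predictions."""
    n = len(points)
    return sum(gaussian_kernel((y - p) / h) for p in points) / (n * h)
```

Here `points` would be the 99 values f(x | τ_i) for one test sample; evaluating `kde_pdf` on a grid of load values gives the curve from which the interval in S400 is read.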
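S400. Compute from the probability density function the confidence interval that satisfies the predetermined confidence level, and output it as the load interval prediction result for the distribution network station area.
Specifically, step S400 includes:
Given a confidence level of (1-α)×100%, where α denotes the significance level and α = 0.01, 0.05 or 0.1, the lower limit L and the upper limit U of the confidence interval are found from the probability density function such that the following condition is satisfied: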
Figure PCTCN2022079202-appb-000044
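where s.t. denotes the constraint and Pr(y ∈ [L, U]) denotes the probability that y falls within the confidence interval [L, U]; [L, U] is the confidence interval satisfying the predetermined confidence level and is output as the load interval prediction result for the distribution network station area.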
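The condition on L and U is shown here only as a figure reference. One plausible reading of the "s.t." form, assumed in the sketch below, is to choose the shortest interval whose probability mass is at least 1-α. The function name and the grid-based integration are hypothetical simplifications.

```python
def shortest_interval(grid, density, alpha):
    """Shortest [L, U] on an evenly spaced grid holding at least 1 - alpha of the mass.

    `density` holds the density values at the grid points; the mass is
    renormalized so that truncating the grid does not bias the target.
    """
    step = grid[1] - grid[0]
    mass = [d * step for d in density]
    target = (1 - alpha) * sum(mass)
    best = None
    lo = 0
    window = 0.0
    for hi in range(len(grid)):
        window += mass[hi]
        # Drop points from the left while the window still holds enough mass.
        while window - mass[lo] >= target:
            window -= mass[lo]
            lo += 1
        if window >= target:
            if best is None or grid[hi] - grid[lo] < best[1] - best[0]:
                best = (grid[lo], grid[hi])
    return best
```

For a unimodal density this sliding-window search returns the highest-density interval; for α = 0.05 the returned pair would be reported as the 95% load interval.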
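The above describes an embodiment of the load interval prediction method based on quantile gradient boosting decision tree provided by the present invention. Referring to FIG. 2, the present invention further provides a load interval prediction system based on quantile gradient boosting decision tree, comprising:
a mode decomposition module 100, configured to decompose the original distribution network station-area load sequence by ensemble empirical mode decomposition to obtain several modal components, and to normalize each modal component;
a decision tree prediction module 200, configured to build a quantile gradient boosting decision tree prediction model for each modal component to obtain the predicted values of each modal component under different quantiles, and to sum the predicted values of all modal components to obtain the conditional distribution of the predicted value at the preset quantile points;
a probability density calculation module 300, configured to apply kernel density estimation to the conditional distribution of the predicted value at the preset quantile points to obtain the probability density function of the future distribution network station-area load;
a confidence prediction module 400, configured to compute from the probability density function the confidence interval that satisfies the predetermined confidence level, and to output it as the load interval prediction result for the distribution network station area.
Note that the working process of the load interval prediction system based on quantile gradient boosting decision tree provided by the present invention is the same as the flow of the load interval prediction method described above and is not repeated here.
The load interval prediction system based on quantile gradient boosting decision tree provided by the present invention decomposes the original distribution network station-area load sequence by ensemble empirical mode decomposition into modal components with different characteristics, which reduces the complexity of training the subsequent quantile gradient boosting decision tree prediction models and improves prediction accuracy. It uses kernel density estimation to obtain the probability density function, which avoids the subjectivity and prior assumptions involved in constructing a probability distribution and improves the reliability and accuracy of load interval prediction for the distribution network station area. In addition, the randomness of decision tree sampling diversifies learning across samples, so that the quantile gradient boosting decision tree is less prone to overfitting and generalizes well.
The present invention further provides a computer-readable storage medium storing a computer program which, when loaded and executed by a processor, implements the steps of the load interval prediction method based on quantile gradient boosting decision tree described above.
The present invention further provides an electronic device, comprising a processor and a memory, wherein
the memory is configured to store a computer program;
the processor is configured to load and execute the computer program, so that the electronic device performs the steps of the load interval prediction method based on quantile gradient boosting decision tree described above.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. On this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
The above embodiments are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.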

Claims (9)

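  1. A load interval prediction method based on quantile gradient boosting decision tree, characterized by comprising the following steps:
    S1. decomposing the original distribution network station-area load sequence by ensemble empirical mode decomposition to obtain several modal components, and normalizing each modal component;
    S2. building a quantile gradient boosting decision tree prediction model for each modal component to obtain the predicted values of each modal component under different quantiles, and summing the predicted values of all modal components to obtain the conditional distribution of the predicted value at the preset quantile points;
    S3. applying kernel density estimation to the conditional distribution of the predicted value at the preset quantile points to obtain the probability density function of the future distribution network station-area load;
    S4. computing from the probability density function the confidence interval that satisfies the predetermined confidence level, and outputting it as the load interval prediction result for the distribution network station area.
  2. The load interval prediction method based on quantile gradient boosting decision tree according to claim 1, characterized in that, before step S1, the method comprises:
    collecting raw data of the distribution network station-area load at a preset sampling period, and cleaning the raw data to obtain the original distribution network station-area load sequence, wherein the original distribution network station-area load sequence is a time series and the raw data include active power.
  3. The load interval prediction method based on quantile gradient boosting decision tree according to claim 1, characterized in that step S1 specifically comprises:
    S101. adding Gaussian white noise to the original distribution network station-area load sequence to obtain a new distribution network station-area load sequence, and decomposing the new distribution network station-area load sequence by ensemble empirical mode decomposition to obtain several modal components, the modal components comprising several intrinsic mode components and one residual component;
    S102. repeating step S101 M times, each time adding different Gaussian white noise to the original distribution network station-area load sequence, to obtain M groups of intrinsic mode components and residual components;
    S103. averaging the M groups of intrinsic mode components and residual components respectively to obtain several intrinsic mode component means and one residual component mean, wherein the intrinsic mode component mean is expressed as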
    Figure PCTCN2022079202-appb-100001
    and the residual component mean is expressed as
    Figure PCTCN2022079202-appb-100002
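    where imf_{i,m}(t) is the i-th intrinsic mode component of the m-th group, m = 1, 2, …, M, and r_m(t) is the residual component of the m-th group;
    each modal component is normalized by the following formula,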
    Figure PCTCN2022079202-appb-100003
    where x(i) and
    Figure PCTCN2022079202-appb-100004
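    denote the modal component values before and after normalization, respectively, and x_min and x_max are the minimum and maximum of the modal component values.
  4. The load interval prediction method based on quantile gradient boosting decision tree according to claim 1, characterized in that step S2 specifically comprises:
    S201. selecting training samples and test samples from the normalized modal components to construct a training set and a test set, respectively;
    S202. letting the training samples be D = {(x_1, y_1), (x_2, y_2), …, (x_i, y_i), …, (x_m, y_m)}, where x_i and y_i are the attributes and the label of the i-th training sample, x_i ∈ R^N, R denotes the field of real numbers and N the dimension; based on the decision tree algorithm, drawing m training samples at random with replacement for each decision tree in turn using the Bootstrap strategy, thereby producing a quantile gradient boosting decision tree composed of n decision trees, where n is the preset number of decision trees;
    S203. randomly selecting an attribute j to split on, and sorting all values of attribute j in ascending order, denoted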
    Figure PCTCN2022079202-appb-100005
    the candidate split point set H_j on attribute j being obtained by
    Figure PCTCN2022079202-appb-100006
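    S204. randomly selecting a split point s from the candidate split point set H_j, s ∈ H_j, and splitting the training set into two parts according to (j, s);
    S205. computing the expected value of the labels in each of the two parts of the training set by the following formulas, as the candidate estimates of the decision tree: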
    Figure PCTCN2022079202-appb-100007
    Figure PCTCN2022079202-appb-100008
    where D_1 denotes one part of the training set, also written D_1(j, s),
    Figure PCTCN2022079202-appb-100009
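    m_1 denotes the set of modal components corresponding to D_1, c_1 denotes the expected value corresponding to D_1, D_2 denotes the other part of the training set, also written D_2(j, s),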
    Figure PCTCN2022079202-appb-100010
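    m_2 denotes the set of modal components corresponding to D_2, and c_2 denotes the expected value corresponding to D_2;
    S206. traversing all possible solutions (j, s) until the optimal solution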
    Figure PCTCN2022079202-appb-100011
    is found that minimizes the objective value of the following formula, and taking the optimal solution
    Figure PCTCN2022079202-appb-100012
    as the split node:
    Figure PCTCN2022079202-appb-100013
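    S207. repeating steps S203 to S206 until the stop-splitting condition is met, thereby generating the decision tree, where the stop-splitting condition is that the objective value falls below a preset threshold or the preset maximum depth of the decision tree is reached;
    S208. using the pinball loss function to evaluate the predictive performance of the quantile gradient boosting decision tree prediction model, where the pinball loss function is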
    Figure PCTCN2022079202-appb-100014
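    where L(y_i, c) denotes the pinball loss value, τ is the preset quantile point, and ρ_τ denotes the check function;
    S209. letting f(x) denote the output of the quantile gradient boosting decision tree prediction model, f(x) being initialized as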
    Figure PCTCN2022079202-appb-100015
    S210. for iterations t = 1, 2, …, T, computing the negative gradient of the loss function for the i-th training sample at iteration t by
    Figure PCTCN2022079202-appb-100016
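    where r_ti denotes the negative gradient and f_{t-1}(x) denotes the load prediction of the quantile gradient boosting decision tree prediction model after t-1 iterations;
    S211. replacing (x_i, y_i) with (x_i, r_ti), i = 1, 2, …, m, and fitting (x_i, r_ti) according to steps S203 to S207 to obtain the t-th decision tree, whose leaf node regions are R_tj, j = 1, 2, …, J, where J is the number of leaf nodes of the decision tree; computing the best estimate by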
    Figure PCTCN2022079202-appb-100017
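    where c_tj denotes the best estimate for leaf node region R_tj, obtained from the samples (x_i, r_ti);
    S212. updating the output f(x) of the quantile gradient boosting decision tree prediction model by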
    Figure PCTCN2022079202-appb-100018
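    where f_t(x) denotes the updated output of the quantile gradient boosting decision tree prediction model at iteration t, and I(·) denotes the indicator (step) function;
    S213. after training, obtaining the final output f(x) of the quantile gradient boosting decision tree prediction model as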
    Figure PCTCN2022079202-appb-100019
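    where f_T(x) denotes the updated output of the quantile gradient boosting decision tree prediction model at iteration T;
    S214. letting the preset quantile point τ take the values τ_i = 0.01, 0.02, …, 0.99; given a test sample of the j-th modal component, denoting the corresponding prediction of that modal component at quantile point τ_i as f_j(x | τ_i);
    S215. summing the predictions of all modal components by the following formula to obtain the conditional distribution f(x | τ_i) of the predicted value at the preset quantile points: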
    Figure PCTCN2022079202-appb-100020
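  5. The load interval prediction method based on quantile gradient boosting decision tree according to claim 4, characterized in that the probability density function is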
    Figure PCTCN2022079202-appb-100021
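    where n is the number of quantile points, K(·) denotes the Gaussian kernel function, h is the preset bandwidth, and y denotes the label of the test sample.
  6. The load interval prediction method based on quantile gradient boosting decision tree according to claim 5, wherein step S4 specifically comprises:
    given a confidence level of (1-α)×100%, where α denotes the significance level and α = 0.01, 0.05 or 0.1, finding from the probability density function the lower limit L and the upper limit U of the confidence interval such that the following condition is satisfied: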
    Figure PCTCN2022079202-appb-100022
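    where s.t. denotes the constraint and Pr(y ∈ [L, U]) denotes the probability that y falls within the confidence interval [L, U]; [L, U] is the confidence interval satisfying the predetermined confidence level and is output as the load interval prediction result for the distribution network station area.
  7. A load interval prediction system based on quantile gradient boosting decision tree, characterized by comprising:
    a mode decomposition module, configured to decompose the original distribution network station-area load sequence by ensemble empirical mode decomposition to obtain several modal components, and to normalize each modal component;
    a decision tree prediction module, configured to build a quantile gradient boosting decision tree prediction model for each modal component to obtain the predicted values of each modal component under different quantiles, and to sum the predicted values of all modal components to obtain the conditional distribution of the predicted value at the preset quantile points;
    a probability density calculation module, configured to apply kernel density estimation to the conditional distribution of the predicted value at the preset quantile points to obtain the probability density function of the future distribution network station-area load;
    a confidence prediction module, configured to compute from the probability density function the confidence interval that satisfies the predetermined confidence level, and to output it as the load interval prediction result for the distribution network station area.
  8. A computer-readable storage medium storing a computer program, characterized in that the computer program, when loaded and executed by a processor, implements the steps of the load interval prediction method based on quantile gradient boosting decision tree according to any one of claims 1 to 6.
  9. An electronic device, characterized by comprising a processor and a memory, wherein
    the memory is configured to store a computer program;
    the processor is configured to load and execute the computer program, so that the electronic device performs the steps of the load interval prediction method based on quantile gradient boosting decision tree according to any one of claims 1 to 6.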
PCT/CN2022/079202 2021-09-08 2022-03-04 Load interval prediction method and system based on quantile gradient boosting decision tree WO2023035564A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111046819.8 2021-09-08
CN202111046819.8A CN113496315B (zh) 2021-09-08 2021-09-08 Load interval prediction method and system based on quantile gradient boosting decision tree

Publications (1)

Publication Number Publication Date
WO2023035564A1 true WO2023035564A1 (zh) 2023-03-16

Family

ID=77997172

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/079202 WO2023035564A1 (zh) 2021-09-08 2022-03-04 Load interval prediction method and system based on quantile gradient boosting decision tree

Country Status (2)

Country Link
CN (1) CN113496315B (zh)
WO (1) WO2023035564A1 (zh)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
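CN116300774A (zh) * 2023-05-23 2023-06-23 蓝星智云(山东)智能科技有限公司 Visual monitoring method for batch processes based on principal component analysis and kernel density estimation
CN116432478A (zh) * 2023-06-15 2023-07-14 广东电网有限责任公司东莞供电局 Energy determination method, apparatus, device and medium for a power system
CN116544931A (zh) * 2023-06-27 2023-08-04 北京理工大学 Power load distribution forecasting method based on ensemble patch transformation and temporal convolutional network
CN116596044A (zh) * 2023-07-18 2023-08-15 华能山东发电有限公司众泰电厂 Method and apparatus for training a power generation load forecasting model based on multi-source data
CN116646933A (zh) * 2023-07-24 2023-08-25 北京中能亿信软件有限公司 Big-data-based power load dispatching method and system
CN117112857A (zh) * 2023-10-23 2023-11-24 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) Machining path recommendation method suitable for industrial intelligent manufacturing
CN117239731A (zh) * 2023-09-21 2023-12-15 山东工商学院 Hybrid-model-based short-term holiday power load forecasting method
CN117290664A (zh) * 2023-09-27 2023-12-26 贵州大学 Method and apparatus for real-time dynamic prediction of cutterhead torque based on an EMD-BLSTM model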

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113496315B (zh) * 2021-09-08 2022-01-25 广东电网有限责任公司湛江供电局 Load interval prediction method and system based on quantile gradient boosting decision tree

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109359778A (zh) * 2018-11-13 2019-02-19 中石化石油工程技术服务有限公司 Short-term natural gas load forecasting method based on optimized empirical mode decomposition
CN109726865A (zh) * 2018-12-27 2019-05-07 国网江苏省电力有限公司电力科学研究院 EMD-QRF-based method, apparatus and storage medium for user load probability density forecasting
US20210097453A1 (en) * 2018-06-12 2021-04-01 Tsinghua University Method for quantile probabilistic short-term power load ensemble forecasting, electronic device and storage medium
CN113496315A (zh) * 2021-09-08 2021-10-12 广东电网有限责任公司湛江供电局 Load interval prediction method and system based on quantile gradient boosting decision tree

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10366451B2 (en) * 2016-01-27 2019-07-30 Huawei Technologies Co., Ltd. System and method for prediction using synthetic features and gradient boosted decision tree
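CN109978201A (zh) * 2017-12-27 2019-07-05 深圳市景程信息科技有限公司 Probabilistic load forecasting system and method based on a Gaussian process quantile regression model
CN109242139A (zh) * 2018-07-23 2019-01-18 华北电力大学 Method for forecasting daily peak power load
CN110969197B (zh) * 2019-11-22 2022-01-04 上海交通大学 Quantile forecasting method for wind power generation based on instance transfer
CN111523735A (zh) * 2020-05-09 2020-08-11 上海积成能源科技有限公司 System model for short-term power load forecasting based on a light gradient boosting algorithm
CN112001439A (zh) * 2020-08-19 2020-11-27 西安建筑科技大学 GBDT-based method, storage medium and device for forecasting the air-conditioning cooling load of shopping mall buildings
CN112488352A (zh) * 2020-10-21 2021-03-12 上海旻浦科技有限公司 House price interval prediction method and system based on gradient boosting regression
CN112926780A (zh) * 2021-03-01 2021-06-08 南方电网科学研究院有限责任公司 Probabilistic load forecasting method using averaged quantile regression based on sister forecasts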

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BI, YUNFAN B: "Short-Term Load Forecasting Model of Power System Based on Gradient Boosting Decision Tree", JOURNAL OF QINGDAO UNIVERSITY (ENGINEERING & TECHNOLOGY EDITION), vol. 33, no. 3, 1 August 2018 (2018-08-01), pages 1 - 6, XP093045069 *
WANG WANG LIYAN LIYAN, XU QIANG, HUANG KAIYI, WU JIAN, AI QIAN, ZHANG YAXIN, JIA XUAN: "A Two-Stage Stochastic Programming Method for Multi-Energy Microgrid System Considering the Uncertainty of New Energy and Load", DIANLI JIANSHE = ELECTRIC POWER CONSTRUCTION, DIANLI JIANSHE ZAZHISHE, CN, vol. 41, no. 4, 1 April 2020 (2020-04-01), CN , pages 100, XP093045061, ISSN: 1000-7229, DOI: 10.3969/j.issn.1000-7229.2020.04.012 *

Also Published As

Publication number Publication date
CN113496315B (zh) 2022-01-25
CN113496315A (zh) 2021-10-12

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22866042

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE