CN117350439A

CN117350439A - Load forecasting method and system for energy aggregation service providers based on horizontal federated learning

Info

Publication number: CN117350439A
Application number: CN202311535675.1A
Authority: CN
Inventors: 宋瑜辉; 黄一川; 荆朝霞
Original assignee: South China University of Technology SCUT
Current assignee: South China University of Technology SCUT
Priority date: 2023-11-16
Filing date: 2023-11-16
Publication date: 2024-01-05
Anticipated expiration: 2043-11-16
Also published as: CN117350439B

Abstract

The invention discloses a load forecasting method and system for energy aggregation service providers based on horizontal federated learning. The method and system use the FedAvg horizontal federated learning framework and a PCA-BP neural network model to forecast the load of an energy aggregation service provider while keeping data private and secure. Data and model parameters are held in the local server of each end user and in the central server of the energy aggregation service provider, respectively. Each local server uses the PCA-BP neural network model to forecast the end user's load and to calculate the root mean square error (RMSE); the central server uses the FedAvg algorithm to aggregate the weighted average loss of all local servers and to update the PCA-BP neural network model parameters. A limited number of interactions between the local servers and the central server ensures strict adherence to data security and privacy protocols and reduces the number of models and the time consumed, so that accurate load forecasting is achieved without compromising user data privacy.

Description

Load forecasting method and system for energy aggregation service providers based on horizontal federated learning

Technical Field

The present invention relates to the technical field of load forecasting for energy aggregation service providers, and in particular to a load forecasting method and system for energy aggregation service providers based on horizontal federated learning.

Background Art

New energy market participants such as distributed photovoltaics, electric vehicles and energy storage are important structural supports of the new power system. Compared with traditional energy suppliers such as coal-fired and gas-fired power plants, new energy suppliers usually have small capacities and are physically dispersed and independent of one another within the power system, so their ability to earn positive returns directly in the electricity market is relatively weak. To address this, energy aggregation service providers, whose service is to match market transactions, have emerged. Effective load forecasting is the cornerstone of an energy aggregation service provider's participation in the electricity market and a key factor in maximizing market profit. As the population of new energy suppliers grows, a large amount of power data with differing spatiotemporal granularity will be generated. These data are of great value for load forecasting and will be stored according to each supplier's energy type and environmental factors. However, because of the security and privacy characteristics of electricity as a commodity, energy suppliers are reluctant to share their data with energy aggregation service providers. How to enable energy aggregation service providers to use power data effectively for load forecasting, while ensuring the security and privacy of each supplier's data, has therefore become an important problem.

Power data is already widely used in scenarios such as load forecasting, and its social and economic value continues to be explored. Traditional load forecasting methods mainly include the wavelet neural network (WNN), generalized regression neural network (GRNN), least squares support vector regression (LSSVR), data transfer learning (DTL), gradient boosting decision tree (GBDT), deep belief network (DBN) and bidirectional long short-term memory network (BiLSTM). In an electricity market environment, however, an energy aggregation service provider usually cannot obtain a complete data set directly when providing diversified services to its energy suppliers. From the retailer's perspective, power data involves system security and privacy, and sharing it externally or leaking it may pose serious security risks. Existing research on this problem, whether it uses traditional centralized learning algorithms or distributed computing algorithms, relies mainly on encrypting the data to keep it secure and private. This approach still requires the raw data to be exchanged externally, so data privacy and security cannot be effectively guaranteed. Federated learning (FL) offers a good way to resolve this conflict while protecting data security and privacy.

Federated learning has been applied in practice in many fields, including mobile devices, industrial engineering, healthcare and finance, and is used especially widely in the healthcare and financial industries. The FL architecture has two main modes: decentralized and centralized. In the centralized mode, a central server is deployed at the aggregation service provider and is responsible for fitting the global model parameters, and each energy supplier deploys a client server that is responsible for communicating with the central server. Throughout this process the energy supplier's power data stays on the local client and takes no part in data exchange. The energy supplier only needs to download the model parameters from the central server, train locally, and upload the trained model parameters to the central server. After global optimization fitting, the central server sends the updated parameters back to the clients for further iterative training until the final forecasting model is obtained. One problem arises during federated training: under the federated learning framework, the forecasting network may perform poorly because of irrelevant features in the data set and the sheer volume of data.

Summary of the Invention

The first object of the present invention is to overcome the shortcomings and deficiencies of the prior art and to provide a load forecasting method for energy aggregation service providers based on horizontal federated learning, which allows the information of new energy entities to be shared without access to the real data sets, thereby providing accurate and effective load forecasting.

The second object of the present invention is to provide a load forecasting system for energy aggregation service providers based on horizontal federated learning.

The first object of the present invention is achieved through the following technical solution: a load forecasting method for energy aggregation service providers based on horizontal federated learning, which uses the FedAvg horizontal federated learning framework and a PCA-BP neural network model to forecast the load of an energy aggregation service provider while keeping data private and secure. Data and model parameters are held in the local server of each end user and in the central server of the energy aggregation service provider, respectively. Each local server uses the PCA-BP neural network model to forecast the end user's load and to calculate the root mean square error (RMSE). The PCA-BP neural network model builds on the original BP neural network model by using the PCA algorithm to extract the principal features and to discard useless and redundant features of the aggregator user's local data set, which prevents the network's weaknesses from being amplified in the BP neural network model. The central server uses the FedAvg algorithm to aggregate the weighted average loss of all local servers and to update the parameters of the PCA-BP neural network model. A limited number of interactions between the local servers and the central server ensures strict compliance with data security and privacy protocols and reduces the number of models and the time consumed, so that accurate load forecasting is achieved without compromising user data privacy.

Further, the load forecasting method for energy aggregation service providers based on horizontal federated learning comprises the following steps:

S1: Collect initial data of five feature types (time data, weather data, industry data, load data and economic data) and perform manual feature selection to form the aggregator user's local data set. Preprocess the local data set, including outlier detection and missing-value filling; encode discrete values with one-hot encoding and sin/cos cyclic encoding, and apply mean-variance normalization to continuous values. Then divide the preprocessed local data set into a training set and a test set, used for model training and testing respectively;

S2: Feed the training set into the PCA-BP neural network model on the local server for the first round of training, in which the model's initial parameters θ₀ are given. During training, PCA is first applied to the training data for feature extraction, comprising standardization of the data feature matrix, calculation of the covariance matrix and singular value decomposition, which yields the extracted feature matrix. The extracted feature matrix is then input into the PCA-BP neural network model to obtain the initial load forecast for each end user. In back-propagation, the smooth-curve cross-entropy method is used to calculate the loss between the forecast and the true value; after a finite number of iterations that minimize the loss, the optimal network for a single local training is obtained;

S3: Input the test set into the optimal network from the single local training to obtain load forecast information. Then calculate the smooth-curve cross-entropy loss between the forecast and the true value, together with the back-propagation gradients of the network weights and network thresholds, to generate the single local optimal network parameter update θ′;

S4: The local server uploads the single local optimal network parameter update θ′ to the central server. The central server uses the FedAvg algorithm to calculate the weighted average loss of all end users and fits the global model parameters θ through stochastic gradient descent (SGD). The central server then sends the global model parameters θ to the local servers; each local server updates its local parameters by setting θ′ = θ, and its PCA-BP neural network model starts a new round of training with the new parameters θ;

S5: Repeat steps S2-S4 until the R-th interaction is complete. Through repeated interaction between the central server and the local servers, the global optimal model for load forecasting in the current market environment is obtained. The final global optimal model parameters θ are used on the local server for the final load forecast, giving the global optimal load forecast, on the basis of which a corresponding market strategy is formulated to obtain revenue.

Further, in step S1, the influencing factors in the collected initial data include time factors, weather factors, industry factors and economic factors. For the time factor, the year, month, day, hour and minute are selected as dimensions to reflect the periodic variation of the load. The weather factors that affect load variation include temperature, humidity, precipitation, sunshine, wind direction, wind speed and air pressure. The industry factor describes the effect on load through the electricity consumption of 10 industries: agriculture/forestry/animal husbandry/fishery; industry; transportation/warehousing/postal services; information transmission/software/information technology services; wholesale and retail; accommodation and catering; finance; real estate; leasing and business services; and public services and management organizations. For the economic factor, local GDP data is used directly to reflect the relationship between the socio-economic environment and the load.

Further, in step S1, preprocessing the aggregator user's local data set includes:

a. Use the 3-sigma criterion to detect possible outliers in the energy aggregation service provider's load data set;

b. Fill missing values according to equations (1) and (2):

$x_{ab} = X_{a-1}(1 - w_{ab}) + X_a w_{ab}, \quad a = 2, 3, \ldots, T'', \quad b = 1, 2, \ldots, N \qquad (1)$

where $X_a$ is the data value of the $a$-th hour of the original sequence, $X_{a-1}$ is the data value of the $(a-1)$-th hour of the original sequence, $x_{ab}$ is the $b$-th data value within the $a$-th hour of the interpolated sequence, $w_{ab}$ is the weight of $X_a$ on $x_{ab}$, $N$ is the total number of split points within one hour, and $T''$ is the total number of hours in the original sequence;

c. Encode the discrete values in the data with one-hot encoding and sin/cos cyclic encoding;

d. Scale the continuous values in the data using mean-variance normalization, as in equation (3):

$x_{norm} = \dfrac{x - \mu}{\sigma} \qquad (3)$

where $x_{norm}$ is the normalized value of the data, $x$ is the original value, $\mu$ is the data mean and $\sigma$ is the data standard deviation.
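The preprocessing steps a-d can be illustrated with a minimal NumPy sketch. The function names are hypothetical and chosen only for illustration. Equation (2), which defines the weight $w_{ab}$, is not reproduced in the text, so the sketch assumes the linear weight $w_{ab} = b/N$; that choice is an assumption, not the formula of the invention.

```python
import numpy as np

def three_sigma_outliers(x):
    """Step a: flag points more than 3 standard deviations from the mean."""
    x = np.asarray(x, dtype=float)
    mu, sigma = x.mean(), x.std()
    if sigma == 0:
        return np.zeros(x.shape, dtype=bool)
    return np.abs(x - mu) > 3 * sigma

def interpolate_hourly(X, N):
    """Step b, equation (1): x_ab = X_{a-1} (1 - w_ab) + X_a w_ab.

    ASSUMPTION: w_ab = b / N (linear weights); equation (2) is not
    available in the text, so this is only a plausible stand-in.
    """
    X = np.asarray(X, dtype=float)
    w = np.arange(1, N + 1) / N              # b = 1..N
    prev, curr = X[:-1, None], X[1:, None]   # X_{a-1} and X_a for a = 2..T''
    return (prev * (1 - w) + curr * w).ravel()

def one_hot(labels, num_classes):
    """Step c: one-hot encode integer category labels."""
    return np.eye(num_classes)[np.asarray(labels, dtype=int)]

def cyclic_encode(values, period):
    """Step c: sin/cos encoding so that 23:00 and 00:00 end up close together."""
    angle = 2 * np.pi * np.asarray(values, dtype=float) / period
    return np.sin(angle), np.cos(angle)

def zscore(x):
    """Step d, equation (3): mean-variance normalization."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()
```

For example, `cyclic_encode(hours, 24)` would encode an hour-of-day column and `one_hot(industry_ids, 10)` a column of industry categories; both column names are placeholders.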

Further, in step S2, training with the PCA-BP neural network comprises the following steps:

S21: For each feature $d_m$ in the feature data samples of the training set, calculate the mean and the standard deviation, standardize the data to obtain the features $d'_{ij}$, and finally obtain the standardized feature matrix $D_{stand}$, where $m$ is the number of features, $n$ is the sequence length of each feature in the time dimension, $i \in n$ and $j \in m$, as shown in equations (4)-(8):

where $D$ is the feature matrix, $d_{nm}$ is the value of the $m$-th feature of the data sample at the $n$-th position in the sequence, $d'_m$ is the $m$-th feature vector of the standardized feature matrix, and $d'_{nm}$ is the value of the $m$-th feature vector of the standardized feature matrix at the $n$-th position in the sequence;

S22: Calculate the covariance matrix $A$ of the standardized feature matrix $D_{stand}$, as shown in equations (9) and (10):

where $a_{ij}$ and $a_{nm}$ are elements of the covariance matrix, $k \in n$; $d'_{ki}$ and $d'_{kj}$ are the $k$-th standardized values of the $i$-th and $j$-th features respectively, calculated by equation (7); and the means of the $i$-th and $j$-th features are calculated by equation (5);

S23: Use singular value decomposition (SVD) to calculate the eigenvalues $\lambda_q$ and eigenvectors $Z_q$ of the covariance matrix $A$, where $q \in [1, m]$, as shown in equations (11) and (12):

$Z_1 = [z_{11}\ z_{12}\ \ldots\ z_{1m}]^T,\quad Z_2 = [z_{21}\ z_{22}\ \ldots\ z_{2m}]^T,\quad \ldots,\quad Z_m = [z_{n1}\ z_{n2}\ \ldots\ z_{nm}]^T \qquad (11)$

$\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_m \ge 0 \qquad (12)$

where $\lambda_m$ and $Z_m$ are the eigenvalues and eigenvectors of the covariance matrix respectively, and $z_{nm}$ is the $n$-th element of the $m$-th eigenvector;

S24: Introduce the cumulative contribution rate $\mu'$ of the principal components as the evaluation index for the eigenvectors, select the eigenvectors $t_j$ whose cumulative contribution rate exceeds 80% to form the evaluation matrix $T'$, and obtain the feature matrix $X$ by PCA feature extraction, as shown in equations (13)-(15):

$T' = [t_1\ t_2\ \ldots\ t_m] \qquad (14)$

$x_m = d_m t_m, \quad X = D T' \qquad (15)$

where $p'$ denotes the order of the eigenvalues, $\lambda_j$ and $\lambda_k$ are both subsets of $\lambda_m$, and $x_m$ is the $m$-th feature of the feature matrix $X$;

S25: Based on the BP neural network model, build a load forecasting model consisting of 1 input layer, 3 hidden layers and 1 output layer, namely the PCA-BP neural network model. All of its activation functions are the Sigmoid function, as shown in equation (16):

$f(x) = \dfrac{1}{1 + e^{-x}} \qquad (16)$

where $f(x)$ is the activation function of the neuron and $x$ is the output of each layer of neurons;

S26: Let the weight coefficients of the 3 hidden layers be $\omega_1, \omega_2, \omega_3$ and their thresholds be $b_1, b_2, b_3$; let the weight coefficient of the output layer be $\omega_4$ and its threshold be $b_4$. The parameters of the PCA-BP neural network model are collected as $\theta \in \{W, b\}$, where $W = \{\omega_1, \omega_2, \omega_3, \omega_4\}$ and $b = \{b_1, b_2, b_3, b_4\}$; $W$ is the set of weights and $b$ is the set of thresholds of the PCA-BP neural network model. The forward propagation of the PCA-BP neural network model is shown in equation (17):

where $q \in \mathbb{Z}^+$ denotes the layer number of the network model and $h \in q$; $X_q = [x_{q,1}\ x_{q,2}\ \ldots\ x_{q,m}]$ and $Y_q = [y_{q,1}\ y_{q,2}\ \ldots\ y_{q,m}]$ are the input and output of the $q$-th layer respectively, $x_{q,m}$ is the $m$-th input of the $q$-th layer, and $y_{q,m}$ is the $m$-th output of the $q$-th layer; $\omega_h$ and $b_h$ are the weight coefficient and threshold of the $h$-th layer, with $\omega_h \in W$ and $b_h \in b$; and when $q = 1$, $X_q$ equals $X$ in equation (15), that is, $X$ is the input to the first layer of the network model;

S27: Input $X$, the training set data after PCA dimensionality reduction, into the constructed PCA-BP neural network model to obtain the forecast;

S28: Use the smooth-curve cross-entropy method to obtain the loss function $L_q$ of each layer of the PCA-BP neural network model, as shown in equation (18):

where $Y'_q$ denotes the true value for each layer of the network;

S29: Calculate the back-propagation gradient of the weights and the back-propagation gradient of the thresholds of the PCA-BP neural network model, as shown in equations (19) and (20):

S210: Update the parameters of the PCA-BP neural network model, as expressed in equation (21):

where $l$ is the learning rate of the network model, which governs how fast the iterative training converges, and $\omega'_h$ and $b'_h$ are the updated weights and thresholds of the PCA-BP neural network model;

S211: Repeat steps S26-S210, iteratively updating the PCA-BP neural network parameters θ′ until training stops after a finite number of rounds, to obtain the optimal network for a single local training.
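The PCA feature extraction of steps S21-S24 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the exact formulas of the invention: equations (4)-(10) and (13) are not reproduced in the text, so the sketch uses `np.cov` (which divides by n - 1) for the covariance matrix, defines the cumulative contribution rate as the running sum of eigenvalues divided by their total, and projects the standardized matrix onto the selected eigenvectors. The function name `pca_extract` is hypothetical.

```python
import numpy as np

def pca_extract(D, threshold=0.80):
    """Standardize D (n samples x m features, m >= 2), then keep the leading
    principal components whose cumulative contribution exceeds `threshold`."""
    D = np.asarray(D, dtype=float)
    std = D.std(axis=0)
    std[std == 0] = 1.0                      # guard for constant columns (assumption)
    D_stand = (D - D.mean(axis=0)) / std     # S21: standardization

    A = np.cov(D_stand, rowvar=False)        # S22: m x m covariance matrix

    # S23: A is symmetric positive semi-definite, so its SVD returns the
    # eigenvectors as columns of U and the eigenvalues in S, sorted descending.
    U, S, _ = np.linalg.svd(A)

    # S24: cumulative contribution rate and component selection.
    contribution = np.cumsum(S) / S.sum()
    k = int(np.searchsorted(contribution, threshold, side="right")) + 1
    k = min(k, len(S))
    T = U[:, :k]                             # evaluation matrix T'
    X = D_stand @ T                          # extracted feature matrix
    return X, T, contribution
```

Because the same projection has to be applied to the test set, a practical implementation would also store the training mean, standard deviation and `T` and reuse them rather than refitting.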

Further, step S3 comprises the following steps:

S31: Input the test set into the optimal network from the single local training to obtain the load forecast;

S32: Following equations (18)-(20), calculate the smooth-curve cross-entropy loss function $L_j(\theta)$ between the test set load forecast and the true value, together with the back-propagation gradients of the network weights and network thresholds;

S33: Generate the single local optimal network parameter update θ′ according to equation (21).
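The forward pass, gradient calculation and parameter update described in S25-S210 and reused in S31-S33 can be sketched as a small NumPy network with three Sigmoid hidden layers and a Sigmoid output layer. The smooth-curve cross-entropy loss of equation (18) is not reproduced in the text, so this sketch substitutes a mean squared error loss; that substitution, the gradient-descent form of the update, the layer sizes and all function names are assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(sizes, seed=0):
    """One (weight, threshold) pair per layer, e.g. sizes = [4, 8, 8, 8, 1]."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.5, (n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(params, X):
    """Return the activations of every layer, starting with the input X."""
    activations = [X]
    for W, b in params:
        activations.append(sigmoid(activations[-1] @ W + b))
    return activations

def train_step(params, X, Y, lr=0.5):
    """One back-propagation step; returns (updated params, loss before update).

    ASSUMPTION: mean squared error stands in for the smooth-curve
    cross-entropy loss, and the update is theta' = theta - lr * gradient.
    """
    acts = forward(params, X)
    n = X.shape[0]
    loss = 0.5 * np.mean(np.sum((acts[-1] - Y) ** 2, axis=1))
    # Gradient of the loss with respect to the output layer pre-activation.
    delta = (acts[-1] - Y) * acts[-1] * (1 - acts[-1]) / n
    new_params = [None] * len(params)
    for h in range(len(params) - 1, -1, -1):
        W, b = params[h]
        grad_W = acts[h].T @ delta           # gradient for the weights of layer h
        grad_b = delta.sum(axis=0)           # gradient for the thresholds of layer h
        if h > 0:
            # Propagate the error to the previous layer using the old weights.
            delta = (delta @ W.T) * acts[h] * (1 - acts[h])
        new_params[h] = (W - lr * grad_W, b - lr * grad_b)
    return new_params, loss
```

Since every layer uses the Sigmoid function, the network output lies in (0, 1), so the load targets would need to be scaled into that range before training and scaled back afterwards.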

Further, step S4 comprises the following steps:

S41: The local server uploads the single local optimal network parameter update θ′ to the central server;

S42: After collecting the updated model parameters $\theta' = \{\theta'_p\ (p = 1, 2, \ldots, P)\}$ from all responding users, where $\theta'_p$ is the single local optimal network parameter update of the $p$-th responding user, the central server aggregates the parameters with the FedAvg algorithm, calculates the weighted average loss of all end users, and fits the global model parameters θ through SGD, as shown in equations (22) and (23):

where $F_p(\theta)$ is the average loss over all data features of the $p$-th aggregator user, $f(\theta)$ is the function that updates θ, $F_g(\theta)$ is the average loss over all data features of the $g$-th aggregator user, and $g \in P$ is a subset;

S43: The central server of the energy aggregation service provider sends the global model parameters θ to all local servers, and the local server of each aggregator user updates its PCA-BP neural network model parameters from the global model parameters by setting θ′ = θ;

S44: The PCA-BP neural network model on the local server starts a new round of training with the new parameters θ.
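The aggregation in S42 can be illustrated with the commonly used FedAvg rule, in which the central server averages the uploaded parameters weighted by each client's number of training samples. Equations (22) and (23) are not reproduced in the text, so whether the invention uses exactly this weighting is an assumption; the function name `fedavg` and the layout of the parameters as a list of arrays per client are likewise illustrative.

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """Weighted average of client parameters.

    client_params: one entry per client, each a list of NumPy arrays
                   (for example the weights and thresholds of each layer).
    client_sizes:  number of local training samples per client
                   (ASSUMPTION: sample counts are used as the weights).
    """
    sizes = np.asarray(client_sizes, dtype=float)
    weights = sizes / sizes.sum()
    n_arrays = len(client_params[0])
    return [
        sum(w * params[k] for w, params in zip(weights, client_params))
        for k in range(n_arrays)
    ]
```

Only the parameter arrays and the sample counts reach the central server in this sketch; the local data sets themselves are never transmitted, which matches the separation of data and model parameters described above.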

Further, step S5 comprises the following steps:

S51: Repeat steps S2-S4 to complete a finite number of interactions between the central server and the local servers;

S52: After the R-th interaction, the global optimal model for load forecasting in the current market environment is obtained, where the communication round r = 1, 2, …, R counts the model parameter exchanges between the aggregator users' local servers and the central server of the energy aggregation service provider;

S53: The energy aggregation service provider verifies the global optimal model and distributes response rewards to the aggregator users according to the size of their contribution to the global optimal model;

S54: Use the final global optimal model parameters θ on the local server to make the final load forecast for the aggregator user, giving the global optimal load forecast;

S55: Formulate a corresponding market strategy based on this global optimal load forecast to obtain revenue.
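The overall loop of steps S2-S5 (local training, upload, aggregation, download, repeated for R rounds) can be sketched end to end. To keep the example short, a linear model trained by gradient descent stands in for the PCA-BP network, and the aggregation weights are the client sample counts; both are assumptions made for illustration, as are all function names. RMSE is computed as the evaluation metric mentioned in the abstract.

```python
import numpy as np

def local_train(theta, X, y, lr=0.1, epochs=5):
    """Local update on one client (stand-in for S2-S3): a few gradient steps."""
    theta = theta.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ theta - y) / len(y)
        theta -= lr * grad
    return theta

def rmse(theta, X, y):
    """Root mean square error of the linear forecast X @ theta."""
    return float(np.sqrt(np.mean((X @ theta - y) ** 2)))

def federated_rounds(clients, theta0, R=20):
    """Run R communication rounds; `clients` is a list of (X, y) pairs."""
    theta = theta0.copy()
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    for _ in range(R):
        # Each local server starts from the current global parameters (S4: theta' = theta).
        updates = [local_train(theta, X, y) for X, y in clients]
        # The central server aggregates the uploaded parameters (sample-count weights assumed).
        theta = sum(n * u for n, u in zip(sizes, updates)) / sizes.sum()
    return theta
```

Each client would then evaluate the final parameters on its own data, for example with `rmse(theta, X_test, y_test)`, where `X_test` and `y_test` are placeholders for that client's test set.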

The second object of the present invention is achieved through the following technical solution: a load forecasting system for energy aggregation service providers based on horizontal federated learning, which implements the above load forecasting method for energy aggregation service providers based on horizontal federated learning and comprises:

A data collection and processing module, which collects initial data of five feature types (time data, weather data, industry data, load data and economic data) and performs manual feature selection to form the aggregator user's local data set. The module preprocesses the local data set, including outlier detection and missing-value filling, encodes discrete values with one-hot encoding and sin/cos cyclic encoding, and applies mean-variance normalization to continuous values. It then divides the preprocessed local data set into a training set and a test set, used for model training and testing respectively;

A training module, which feeds the training set into the PCA-BP neural network model on the local server for the first round of training, in which the model's initial parameters θ₀ are given. During training, PCA is first applied to the training data for feature extraction, comprising standardization of the data feature matrix, calculation of the covariance matrix and singular value decomposition, which yields the extracted feature matrix. The extracted feature matrix is then input into the PCA-BP neural network model to obtain the initial load forecast for each end user. In back-propagation, the smooth-curve cross-entropy method is used to calculate the loss between the forecast and the true value; after a finite number of iterations that minimize the loss, the optimal network for a single local training is obtained;

A parameter update module, which inputs the test set into the optimal network from the single local training to obtain load forecast information, and then calculates the smooth-curve cross-entropy loss between the forecast and the true value, together with the back-propagation gradients of the network weights and network thresholds, to generate the single local optimal network parameter update θ′;

A calculation module, in which the local server uploads the single local optimal network parameter update θ′ to the central server. The central server uses the FedAvg algorithm to calculate the weighted average loss of all end users and fits the global model parameters θ through stochastic gradient descent (SGD). The central server then sends the global model parameters θ to the local servers; each local server updates its local parameters by setting θ′ = θ, and its PCA-BP neural network model starts a new round of training with the new parameters θ;

A load forecasting module, which obtains the global optimal model for load forecasting in the current market environment through repeated interaction between the central server and the local servers. The final global optimal model parameters θ are used on the local server for the final load forecast, giving the global optimal load forecast, on the basis of which a corresponding market strategy is formulated to obtain revenue.

Compared with the prior art, the present invention has the following advantages and beneficial effects:

1. The present invention introduces a novel federated framework for energy aggregators that keeps the data of emerging energy suppliers confidential, so that they can communicate without disclosing their electricity consumption information.

2. The present invention proposes a PCA-BP neural network model that performs feature extraction with data privacy taken into account.

3. The present invention designs a method for training the neural network model using a federated-averaging weighted combination, which provides technical support for energy aggregators to formulate market strategies and obtain revenue.

In summary, the present invention ensures strict compliance with data security and privacy protocols and reduces the number of models and the time consumed. It uses federated learning to aggregate parameters without compromising user data privacy and achieves accurate load forecasting, so it has practical application value and is worth promoting.

Brief Description of the Drawings

FIG. 1 is a schematic flow diagram of the method of the present invention.

FIG. 2 is an architecture diagram of the system of the present invention.

Detailed Description

The present invention is described in further detail below with reference to embodiments and drawings, but the embodiments of the present invention are not limited thereto.

Example 1

As shown in FIG. 1, this embodiment discloses a load forecasting method for energy aggregation service providers based on horizontal federated learning. The method combines the FedAvg horizontal federated learning framework with a PCA-BP neural network model to forecast the load of an energy aggregation service provider while preserving data privacy. Data are kept on the local server of each end user, and model parameters are held on the central server of the energy aggregation service provider. Each local server uses the PCA-BP neural network model to forecast the load of its end user and computes the root mean square error (RMSE). The PCA-BP neural network model extends the original BP neural network model by extracting the principal features with the PCA algorithm, which removes useless and redundant features from the aggregator user's local data set and prevents them from amplifying the shortcomings of the BP network. The central server uses the FedAvg algorithm to aggregate the weighted average loss of all local servers and update the parameters of the PCA-BP neural network model. Because the local servers and the central server interact only a limited number of times, data security and privacy protocols are strictly observed and the number of models and the time consumption are reduced, so accurate load forecasting is achieved without compromising user data privacy.

The load forecasting method for energy aggregation service providers is implemented in the following steps:

S1: Collect initial data of five feature types (time, weather, industry, load and economic data) and perform manual feature selection to form the aggregator user's local data set. Preprocess this data set with outlier detection and missing-value filling, encode discrete values with one-hot encoding and sin/cos cyclic encoding, and apply mean-variance normalization to continuous values. Then divide the preprocessed data set into a training set and a test set, used for model training and testing respectively.
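The encoding and normalization named in S1 can be sketched as follows. This is a minimal illustration and not the patent's implementation: the function names are hypothetical, the normalization uses the population standard deviation as an assumption, and a constant feature (zero standard deviation) would need separate handling.

```python
import math

def cyclic_encode(value, period):
    """Map a periodic integer (e.g. hour 0..23) onto the unit circle."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)

def one_hot(index, size):
    """Return a list of length `size` with a single 1 at position `index`."""
    return [1 if i == index else 0 for i in range(size)]

def zscore(values):
    """Mean-variance normalization: (x - mean) / std.

    Assumption: population standard deviation; sigma must be non-zero.
    """
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    return [(v - mu) / sigma for v in values]

# Hours 23 and 0 are adjacent in time; the cyclic encoding keeps them
# close together, whereas the raw integers 23 and 0 are far apart.
hour_23 = cyclic_encode(23, 24)
hour_0 = cyclic_encode(0, 24)
gap = math.dist(hour_23, hour_0)
```

Cyclic encoding suits time fields such as hour or month, while one-hot encoding suits unordered categories such as industry type.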

The influencing factors in the collected initial data are time, weather, industry and economic factors. For the time factor, the time-dimension data use year, month, day, hour and minute information to reflect the periodic variation of the load. The weather factors that affect the load are temperature, humidity, precipitation, sunshine, wind direction, wind speed and air pressure. The industry factor describes the effect on the load through the electricity consumption of 10 industries: agriculture/forestry/animal husbandry/fishery; industry; transportation/warehousing/postal services; information transmission/software/information technology services; wholesale and retail; accommodation and catering; finance; real estate; leasing and business services; and public services and management organizations. The economic factor directly uses local GDP data to reflect the relationship between the socio-economic environment and the load.

Preprocessing of the aggregator user's local data set includes:

$$x_{ab} = X_{a-1}\,(1 - w_{ab}) + X_a\,w_{ab}, \quad a = 2, 3, \dots, T'', \quad b = 1, 2, \dots, N \tag{1}$$
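Formula (1) is a weighted blend of two neighbouring hourly values. The sketch below fills N sub-hourly points between them. The weight w_ab = b/N is an assumption made only for this illustration, because the formula that defines the weight is not reproduced in this text, and the function name is hypothetical.

```python
def interpolate_hour(x_prev, x_curr, n_points):
    """Fill n_points values between two hourly readings.

    Implements x_ab = X_{a-1} * (1 - w_ab) + X_a * w_ab for b = 1..N.
    Assumption: w_ab = b / N (linear weights), chosen for this sketch.
    """
    filled = []
    for b in range(1, n_points + 1):
        w = b / n_points
        filled.append(x_prev * (1.0 - w) + x_curr * w)
    return filled

# Two hourly load readings, split into 4 points within the hour.
values = interpolate_hour(100.0, 120.0, 4)
```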

S2: Feed the training set into the PCA-BP neural network model on the local server for the first round of training, with the model's initial parameters θ₀ given. During training, PCA is first applied to the training data for feature extraction, which comprises standardization of the data feature matrix, computation of the covariance matrix and singular value decomposition, yielding the extracted feature matrix. The extracted feature matrix is then input into the PCA-BP neural network model to obtain the initial load forecast for each end user. In back propagation, the smooth-curve cross-entropy method is used to compute the loss between the forecast and the true value. After a finite number of iterations the loss is minimized, giving the single-round locally trained optimal network.

Training with the PCA-BP neural network includes the following steps:

$$Z_1 = [z_{11}\ z_{12}\ \dots\ z_{1m}]^T,\quad Z_2 = [z_{21}\ z_{22}\ \dots\ z_{2m}]^T,\quad \dots,\quad Z_m = [z_{n1}\ z_{n2}\ \dots\ z_{nm}]^T \tag{11}$$

$$T' = [t_1\ t_2\ \dots\ t_m] \tag{14}$$

$$x_m = d_m t_m, \quad X = D\,T' \tag{15}$$

S26: Let the weight coefficients of the three hidden layers be ω₁, ω₂, ω₃ and their thresholds be b₁, b₂, b₃; let the weight coefficient of the output layer be ω₄ and its threshold be b₄. Denote the parameters of the PCA-BP neural network model collectively as θ ∈ {W, b}, where W = {ω₁, ω₂, ω₃, ω₄} is the weight set and b = {b₁, b₂, b₃, b₄} is the threshold set of the PCA-BP neural network model. The forward propagation of the PCA-BP neural network model is shown in formula (17):
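A forward pass through such a network can be sketched in plain Python. The layer sizes (32 inputs, three hidden layers of 20 neurons, 10 outputs) follow the model-parameter table of the worked example, and the Sigmoid activation follows the claims. The random initialisation, the sign convention of the threshold (added to the weighted sum) and all function names are assumptions of this sketch.

```python
import math
import random

def sigmoid(x):
    """Sigmoid activation used on every layer."""
    return 1.0 / (1.0 + math.exp(-x))

def init_params(layer_sizes, seed=0):
    """Random weights and zero thresholds per layer (hypothetical initialisation)."""
    rng = random.Random(seed)
    weights, thresholds = [], []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights.append(
            [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        )
        thresholds.append([0.0] * n_out)
    return weights, thresholds

def forward(x, weights, thresholds):
    """Propagate one input vector through every layer: y = sigmoid(W x + b)."""
    activation = x
    for w_layer, b_layer in zip(weights, thresholds):
        activation = [
            sigmoid(sum(w * a for w, a in zip(row, activation)) + bias)
            for row, bias in zip(w_layer, b_layer)
        ]
    return activation

# 1 input layer, 3 hidden layers, 1 output layer.
sizes = [32, 20, 20, 20, 10]
weights, thresholds = init_params(sizes)
output = forward([0.1] * 32, weights, thresholds)
```

Because the output layer also uses Sigmoid, every output lies strictly between 0 and 1, which is consistent with forecasting a normalized load.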

S3: Input the test set into the single-round locally trained optimal network to obtain the load forecast. Then compute the smooth-curve cross-entropy loss between the load forecast and the true value, together with the back-propagation gradients of the network weights and thresholds, to generate the single-round local optimal network parameter update θ′, including the following steps:

S4: The local server uploads the single-round local optimal network parameter update θ′ to the central server. The central server computes the weighted average loss of all end users with the FedAvg algorithm and fits the global model parameters θ by stochastic gradient descent (SGD). The central server then sends θ to the local servers, each local server sets its local parameters θ′ = θ, and its PCA-BP neural network model starts a new round of training with the new parameters θ, including the following steps:

S42: After collecting the updated model parameters θ′ = {θ′_p (p = 1, 2, …, P)} from all responding users, where θ′_p is the single-round local optimal network parameter update of the p-th responding user, the central server aggregates the parameters with the FedAvg algorithm, computes the weighted average loss of all end users, and fits the global model parameters θ by SGD, as shown in formulas (22) and (23):

S43: The central server of the energy aggregation service provider sends the global model parameters θ to all local servers, and each aggregator user's local server updates its PCA-BP neural network model parameters from the global model parameters, θ′ = θ;
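The aggregation and broadcast in S4 can be illustrated with the standard FedAvg parameter average. Formulas (22) and (23) are not reproduced in this text, so weighting each client by its sample count is an assumption taken from the usual FedAvg formulation; the function name and the toy numbers are hypothetical.

```python
def fedavg(client_params, client_sizes):
    """Sample-size-weighted average of client parameter vectors.

    Assumption: weight of client p is n_p / sum(n), as in standard FedAvg.
    """
    total = sum(client_sizes)
    dim = len(client_params[0])
    global_params = [0.0] * dim
    for params, n in zip(client_params, client_sizes):
        for i, value in enumerate(params):
            global_params[i] += (n / total) * value
    return global_params

# Two hypothetical clients uploading their updates theta'_p.
theta_clients = [[1.0, 2.0], [3.0, 4.0]]
sample_counts = [100, 300]
theta_global = fedavg(theta_clients, sample_counts)

# Broadcast: each local server overwrites its parameters, theta' = theta.
theta_local = list(theta_global)
```

Only parameter vectors cross the network in this exchange; the raw load data never leave the local servers.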

S5: Repeat steps S2-S4 until the R-th interaction is completed. Through repeated interaction between the central server and the local servers, the globally optimal model for load forecasting under the current market environment is obtained. The final globally optimal model parameters θ are used on the local server to perform the final load forecast, giving the globally optimal load forecast value, and the corresponding market strategy is formulated from this value to obtain revenue, including the following steps:

A specific example is given below to illustrate how the load forecasting method of the above embodiment is executed and the beneficial effects it achieves.

The data set is a real multivariate data set from a large energy aggregation service provider in a city in southern China and 10 of its new energy suppliers, covering 365 days from September 2018 to August 2019. The time granularity is 1 hour, so each day contains 24 load values with multivariate features. The data set used for training is divided into a training set, a validation set and a test set in a ratio of 6:2:2.
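For a sense of scale, 365 days at hourly granularity give 8760 samples per supplier. The sketch below splits them 6:2:2. Splitting in chronological order is an assumption of this sketch, since the text does not say whether the samples are shuffled, and the function name is hypothetical.

```python
def split_622(samples):
    """Split a sequence into 60% train, 20% validation, 20% test.

    Integer arithmetic avoids floating-point rounding at the boundaries.
    Assumption: the split is chronological (no shuffling).
    """
    n = len(samples)
    n_train = n * 6 // 10
    n_val = n * 2 // 10
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, val, test

hours = list(range(365 * 24))  # one index per hourly sample
train, val, test = split_622(hours)
```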

After the factors affecting the load were considered together, time, weather, load, industry and economy were selected as the input feature types of the load forecasting model. The time-dimension data use year, month, day, hour and minute information to reflect the periodic variation of the load. The weather factors that affect the load are temperature, humidity, precipitation, sunshine, wind direction, wind speed and air pressure. The industry factor describes the effect on the load through the electricity consumption of 10 industries: agriculture/forestry/animal husbandry/fishery; industry; transportation/warehousing/postal services; information transmission/software/information technology services; wholesale and retail; accommodation and catering; finance; real estate; leasing and business services; and public services and management organizations. The economic factor directly uses local GDP data to reflect the relationship between the socio-economic environment and the load.

The original power load data are then preprocessed with anomaly detection and missing-value filling to build the load data set of the energy aggregation service provider, as shown in Table 1:

Table 1 Initial feature types of the energy aggregation service provider load data set

Manual feature selection is performed on the initial data of the five feature types (time, weather, industry, load and economy). Discrete values are converted with one-hot encoding and sin/cos cyclic encoding, and continuous values are all normalized, finally giving 29 features, as shown in Table 2:

Table 2 Feature extraction

SVD and PCA are used to extract the principal features in Table 2, giving the extracted feature matrix X.
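The component selection behind this extraction can be sketched as follows, assuming the eigenvalues of the covariance matrix are already available. The 80% cumulative contribution threshold is the one stated in the claims; the eigenvalues and the function name are hypothetical.

```python
def select_components(eigenvalues, threshold=0.80):
    """Return how many leading components are needed for the cumulative
    contribution rate to exceed `threshold`, plus the rate reached."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, lam in enumerate(ordered, start=1):
        cumulative += lam / total
        if cumulative > threshold:
            return count, cumulative
    return len(ordered), cumulative

# Hypothetical eigenvalues of a 5-feature covariance matrix.
count, rate = select_components([5.0, 2.0, 1.5, 1.0, 0.5])
```

Here the first three components carry 85% of the variance, so the remaining two would be dropped before the data reach the BP network.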

A federated-learning load forecasting architecture for energy aggregation service providers is built on the FedAvg algorithm. The participants are wind power suppliers, photovoltaic suppliers, energy storage suppliers, electric vehicle energy suppliers and the energy aggregation service provider. In this scenario each energy supplier client owns its own independent power data set. The data sets differ in sample size, but the suppliers are all located in the same region, their load forecasts are affected by the same factors, and their data share the same feature dimensions. The whole process is divided into 6 steps:

Step 1: An energy aggregation service provider that participates in market bidding needs to formulate a strategy, so it initiates a federated training request to its internal energy suppliers.

Step 2: Each energy supplier decides, according to its own needs, whether to respond to the server's request and train the prediction model with its local power data.

Step 3: In each round, every energy supplier that responded first downloads the global model parameters from the server of the energy aggregation service provider, trains the above prediction model locally on its own power data, and uploads the model parameters of that round to the server.

Step 4: After multiple rounds of communication with the responding energy supplier clients, the server of the energy aggregation service provider obtains the globally optimal model for load forecasting under the current market environment.

Step 5: The energy aggregation service provider verifies the globally optimal model and distributes response rewards to the responding energy suppliers according to their contribution to it.

Step 6: The energy aggregation service provider uses the final globally optimal model for load forecasting and formulates the corresponding market strategy to obtain revenue.
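Steps 1 to 4 can be simulated end to end with a toy model. The sketch below is an illustration under loud assumptions, not the patent's network: the local model is a single parameter fitted to y = θx, responding clients are drawn at random as a fraction C of the P clients, and aggregation weights each client by its sample count. The values C = 0.5, P = 10, E = 100, R = 10 and learning rate 0.08 mirror the model-parameter table of this example; all names are hypothetical.

```python
import random

def local_train(theta, data, lr, epochs):
    """Hypothetical local step: SGD on the one-parameter model y = theta * x."""
    for _ in range(epochs):
        for x, y in data:
            theta -= lr * 2.0 * (theta * x - y) * x
    return theta

def make_client(k):
    """Client k holds 3 + k noise-free samples of y = 3x (unequal sizes)."""
    xs = [i / 10.0 for i in range(1, 4 + k)]
    return [(x, 3.0 * x) for x in xs]

def run_federated(clients, rounds, fraction, lr, epochs, seed=0):
    rng = random.Random(seed)
    theta = 0.0  # global parameter held by the aggregator's server
    for _ in range(rounds):
        # Step 2: only a subset of suppliers responds in each round.
        m = max(1, int(fraction * len(clients)))
        responders = rng.sample(clients, m)
        # Step 3: download theta, train locally, upload the result.
        updates = [(local_train(theta, d, lr, epochs), len(d)) for d in responders]
        # Server side: sample-size-weighted average of the uploads.
        total = sum(n for _, n in updates)
        theta = sum(t * n for t, n in updates) / total
    return theta

clients = [make_client(k) for k in range(10)]
theta = run_federated(clients, rounds=10, fraction=0.5, lr=0.08, epochs=100)
```

With noise-free data every client pulls the parameter toward the same slope, so the global value settles near 3 regardless of which clients respond in a given round.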

The parameters of the federated learning model are set according to Table 3. In this embodiment the load at the current time is predicted from the historical data of the preceding 24 hours, which is a regression task. Four evaluation metrics are commonly used for such tasks: Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE) and Symmetric Mean Absolute Percentage Error (SMAPE). RMSE penalizes samples with large errors more heavily and is more sensitive to outliers, and load forecasting needs to be sensitive to prediction accuracy, so RMSE is used as the evaluation metric of the model, as shown in formula (24):

Table 3 Model parameters

| Parameter name | Parameter value |
| --- | --- |
| Input layer neurons | 32 |
| Hidden layer neurons | 60 (20×20×20) |
| Output layer neurons | 10 |
| Network learning rate l | 0.08 |
| Proportional parameter C | 0.5 |
| Client quantity parameter P | 10 |
| Local sampling parameter B | 100 |
| Local training parameter E | 100 rounds |
| Communication parameter R | 10 rounds |

In the formula, λ_RMSE is the sample standard deviation of the differences between the predicted values and the true values at each time step, taken over the total number of samples. The smaller the λ_RMSE value, the better the prediction accuracy of the model.
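As a concrete reading of this metric, the sketch below computes RMSE in its standard form, the square root of the mean squared difference between predictions and true values. The function name and the sample numbers are hypothetical.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))

# Three normalized load values: errors of 0.0, 0.1 and -0.2.
score = rmse([0.50, 0.60, 0.70], [0.50, 0.50, 0.90])
```

The single 0.2 error contributes four times as much to the sum as the 0.1 error, which is the heavier penalty on large errors that motivates choosing RMSE.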

The current mainstream load forecasting methods are compared along three dimensions, namely data privacy and security, feature extraction, and model training speed, as shown in Tables 4 and 5. PF denotes the load forecasting method based on horizontal federated learning proposed in the present invention, and Fedavg denotes PF without feature extraction. P-BP, P-LSTM and LSTM denote traditional centralized load forecasting methods that need direct access to all data; of these, P-BP and P-LSTM apply feature extraction to the data set.

Table 4 Comparison of model training time for different prediction methods

Table 5 RMSE comparison of prediction accuracy of different methods

| Client | PF | Fedavg | P-BP | P-LSTM | LSTM |
| --- | --- | --- | --- | --- | --- |
| 1 | 0.03996 | 0.03882 | 0.04774 | 0.06413 | 0.14433 |
| 2 | 0.04115 | 0.04194 | 0.04280 | 0.05893 | 0.06413 |
| 3 | 0.03966 | 0.04044 | 0.04332 | 0.07143 | 0.06074 |
| 4 | 0.04092 | 0.04545 | 0.04016 | 0.06279 | 0.12337 |
| 5 | 0.04091 | 0.04545 | 0.04016 | 0.08179 | 0.11793 |
| 6 | 0.03924 | 0.04196 | 0.04218 | 0.08229 | 0.13123 |
| 7 | 0.03923 | 0.03980 | 0.04418 | 0.04609 | 0.09295 |
| 8 | 0.03943 | 0.03937 | 0.04428 | 0.08999 | 0.13312 |
| 9 | 0.04672 | 0.04163 | 0.05576 | 0.10567 | 0.07222 |
| 10 | 0.04014 | 0.04163 | 0.04082 | 0.04279 | 0.12828 |
| Average | 0.04074 | 0.04165 | 0.04414 | 0.07059 | 0.10683 |

Tables 4 and 5 show that the PF and P-LSTM methods achieve better RMSE values than the Fedavg and LSTM methods respectively. For the same method, the variant with feature extraction predicts more accurately than the variant without it and also trains faster. This is because PCA identifies the features related to load variation before training, which makes training more efficient. The comparison shows that, relative to the traditional methods, the proposed model has the smallest deviation between the predicted and true load values and performs better. In the parallel data set setting, the PF method therefore not only protects data privacy and security effectively but also shows superior prediction accuracy and training speed.

Finally, in practice the number of data samples available to individual energy suppliers differs, so the model faces unbalanced data sets under federated distributed learning. In this experiment the original data samples of the 10 energy suppliers were randomly subsampled at ratios of 56%, 36%, 89%, 47%, 23%, 22%, 68%, 76%, 48% and 35%, and the simulation results are given in Table 6. The results show that the proposed model still maintains sufficient prediction accuracy with unbalanced data sets and generalizes well.

Table 6 RMSE of federated learning prediction accuracy under unbalanced data sets

| Client | RMSE value |
| --- | --- |
| 1 | 0.00655 |
| 2 | 0.00874 |
| 3 | 0.08602 |
| 4 | 0.01793 |
| 5 | 0.03718 |
| 6 | 0.03274 |
| 7 | 0.04143 |
| 8 | 0.08920 |
| 9 | 0.00771 |
| 10 | 0.00942 |
| Average | 0.03369 |

Example 2

This embodiment discloses a load forecasting system for energy aggregation service providers based on horizontal federated learning, which implements the load forecasting method described in Example 1. As shown in FIG. 2, the system includes the following functional modules:

The above embodiments are preferred implementations of the present invention, but the implementations of the present invention are not limited by them. Any other change, modification, substitution, combination or simplification that does not depart from the spirit and principles of the present invention is an equivalent replacement and falls within the protection scope of the present invention.

Claims

1. A load forecasting method for energy aggregation service providers based on horizontal federated learning, characterized in that the method is based on the FedAvg horizontal federated learning framework and the PCA-BP neural network model to achieve load forecasting for energy aggregation service providers under data privacy security, wherein data and model parameters are respectively placed in the local server of each terminal user and the central server of the energy aggregation service provider, and the local server uses the PCA-BP neural network model to perform load forecasting on the terminal user and calculate the root mean square error RMSE, the PCA-BP neural network model is based on the original BP neural network model and extracts the main features based on the PCA algorithm, eliminating useless and redundant features of the local data set of the aggregator user; the central server uses the FedAvg algorithm to aggregate the weighted average losses of all local servers and update the parameters of the PCA-BP neural network model; through limited interactions between the local server and the central server, strict compliance with data security and privacy protocols is ensured, and accurate load forecasting is achieved without compromising user data privacy.

2. The method for load forecasting of energy aggregation service providers based on horizontal federated learning according to claim 1, characterized in that it comprises the following steps:

S1: Collect the initial data of five feature types, namely time data, weather data, industry data, load data and economic data, and perform manual feature selection to form a local data set of aggregator users. Preprocess the local data set of aggregator users, including outlier detection and missing value supplementation, and use one-hot encoding and sin/cos cyclic encoding for discrete values in the data, and mean-variance normalization for continuous values; then, divide the preprocessed local data set of aggregator users into a training set and a test set, which are used for model training and testing respectively;

S2: Send the data of the training set to the PCA-BP neural network model of the local server for the first training. The initial parameter θ ₀ of the model is given in this training. During the training, PCA is first used to extract features from the data of the training set, including standardization of the data feature matrix, calculation of the covariance matrix and singular value decomposition, to obtain a feature matrix after feature extraction. The extracted feature matrix is then input into the PCA-BP neural network model to obtain the initial load forecast value of each terminal user. In the back propagation, the smooth curve cross entropy method is used to calculate the loss value of the load forecast result and the true value. After a finite number of iterations until the loss value is minimized, the optimal network for a single local training is obtained.

S3: Input the data of the test set into the single local training optimal network to obtain the load forecast information. Then, calculate the smooth curve cross entropy loss function of the load forecast information and the true value and the back propagation gradient of the network weight and network threshold to generate a single local optimal network parameter update value θ′;

S4: The local server uploads the single local optimal network parameter update value θ′ to the central server. The central server calculates the weighted average loss of all terminal users based on the FedAvg algorithm and fits the global model parameter θ through stochastic gradient descent SGD. Then, the central server sends the global model parameter θ to the local server. The local server updates the local parameter θ′=θ based on θ. The PCA-BP neural network model of the local server uses the new parameter θ for a new round of training.

S5: Repeat steps S2-S4 until the Rth interaction is completed; through continuous interaction and iteration between the central server and the local server, the global optimal model for load forecasting under the current market environment is obtained, and the final global optimal model parameter θ is used to perform the final load forecast on the local server to obtain the global optimal load forecast value, and formulate the corresponding market strategy based on this global optimal load forecast value to obtain benefits.

3. The load forecasting method for energy aggregation service providers based on horizontal federated learning according to claim 2, characterized in that, in step S1, the influencing factors in the collected initial data include time factors, weather factors, industry factors and economic factors; wherein the time-dimension data of the time factor use year, month, day, hour and minute information to reflect the periodic variation of the load; the weather factors affecting the load include temperature, humidity, precipitation, sunshine, wind direction, wind speed and air pressure; the industry factor describes the effect on the load through the electricity consumption of 10 industries, namely agriculture/forestry/animal husbandry/fishery, industry, transportation/warehousing/postal services, information transmission/software/information technology services, wholesale and retail, accommodation and catering, finance, real estate, leasing and business services, and public services and management organizations; the economic factor directly uses local GDP data to reflect the relationship between the socio-economic environment and the load.

4. The method for load forecasting of energy aggregation service providers based on horizontal federated learning according to claim 3 is characterized in that, in step S1, the local data set of the aggregator user is preprocessed, including:

a. Use the 3-Sigma criterion to detect possible outliers in the energy aggregation service provider's load data set;

b. Fill missing values according to formula (1) and (2):

$$x_{ab} = X_{a-1}\,(1 - w_{ab}) + X_a\,w_{ab}, \quad a = 2, 3, \dots, T'', \quad b = 1, 2, \dots, N \tag{1}$$

Where X_a is the data value of the a-th hour of the original sequence, X_{a-1} is the data value of the (a-1)-th hour of the original sequence, x_ab is the b-th data value within the a-th hour of the interpolated sequence, w_ab is the weight of X_a in x_ab, N is the total number of split points within 1 hour, and T″ is the total number of hours in the original sequence;

c. Use one-hot encoding and sin/cos cyclic encoding for discrete values in the data;

d. Use mean variance normalization to scale the features of continuous values in the data, as shown in formula (3):

In the formula, x_norm is the normalized value of the data, μ is the mean of the data, and σ is the standard deviation of the data.

5. The method for load forecasting of energy aggregation service providers based on horizontal federated learning according to claim 4 is characterized in that, in step S2, a PCA-BP neural network is used for training, comprising the following steps:

S21: Calculate the mean and the standard deviation of each feature d_m in the feature data samples of the training set, standardize the data to obtain the features d′_ij, and finally obtain the standardized feature matrix D_stand, where m represents the number of features, n represents the sequence length of each feature in the time dimension, i∈n, j∈m, as shown in equations (4)-(8):

Where D is the feature matrix, d_nm represents the specific data of the m-th dimension feature of the data sample at the n-th length, d′_m represents the m-th dimension feature vector of the standardized feature matrix, and d′_nm represents the specific data of the m-th dimension feature vector of the standardized feature matrix at the n-th length;

S22: Calculate the covariance matrix A of the standardized feature matrix D_stand, as shown in equations (9) and (10):

In the formula, a_ij and a_nm represent the elements of the covariance matrix, k∈n, and d′_ki and d′_kj represent the k-th data of the i-th and j-th dimension features after standardization, respectively, calculated by formula (7). The means of the i-th and j-th dimension features are calculated by formula (5);

S23: Calculate the eigenvalues λ_q and eigenvectors Z_q of the covariance matrix A using singular value decomposition (SVD), where q∈[1,m], as shown in equations (11) and (12):

$$Z_1 = [z_{11}\ z_{12}\ \dots\ z_{1m}]^T,\quad Z_2 = [z_{21}\ z_{22}\ \dots\ z_{2m}]^T,\quad \dots,\quad Z_m = [z_{n1}\ z_{n2}\ \dots\ z_{nm}]^T \tag{11}$$

$$\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_m \ge 0 \tag{12}$$

Where λ_m and Z_m are the eigenvalue and eigenvector of the covariance matrix respectively, and z_nm is the n-th data of the m-th eigenvector;

S24: The cumulative contribution rate μ′ of the principal components is introduced as the evaluation index of the eigenvectors, and the eigenvectors t_j whose cumulative contribution rate exceeds 80% are selected to form the evaluation matrix T′. The feature matrix X is obtained by PCA feature extraction, as shown in equations (13)-(15):

T′=[t ₁ t ₂ ... t _m ] (14);

x _m = d _m t _m , X = DT′ (15);

Where p′ represents the order of eigenvalues, λ _j and λ _k are both subsets of λ _m , and x _m is the m-th dimension feature of the feature matrix X;
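The selection rule in S24 can be sketched as follows. The sketch assumes that the cumulative contribution rate is the running sum of the largest eigenvalues divided by the sum of all eigenvalues, which is the usual PCA definition; equation (13) is not reproduced here, so this form is an assumption. The function name and the sample eigenvalues are hypothetical.

```python
def select_components(eigenvalues, threshold=0.80):
    """Return (count, rate): the smallest number of leading eigenvalues whose
    cumulative contribution rate exceeds `threshold`, and that rate."""
    ordered = sorted(eigenvalues, reverse=True)  # lambda_1 >= lambda_2 >= ...
    total = sum(ordered)                         # assumed to be positive
    cumulative = 0.0
    for count, lam in enumerate(ordered, start=1):
        cumulative += lam
        rate = cumulative / total
        if rate > threshold:
            return count, rate
    return len(ordered), 1.0

# Hypothetical eigenvalues of a 4 x 4 covariance matrix.
count, rate = select_components([6.0, 3.0, 0.5, 0.5])
```

The eigenvectors belonging to the first `count` eigenvalues would then form the columns of T′ before computing X = DT′.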

S25: Based on the BP neural network model, a load forecasting model consisting of 1 input layer, 3 hidden layers and 1 output layer is built, namely the PCA-BP neural network model. All of its activation functions are the Sigmoid function, as shown in formula (16):

f(x) = 1/(1 + e ^(-x) ) (16);

In the formula, f(x) is the activation function of the neuron, and x represents the output of each layer of neurons;

S26: Set the weight coefficients of the three hidden layers to ω ₁ , ω ₂ , ω ₃ , and the thresholds to b ₁ , b ₂ , b ₃ ; set the weight coefficient of the output layer to ω ₄ , and the threshold to b ₄ ; collect the parameters of the PCA-BP neural network model as θ∈{W,b}, where W = {ω ₁ ,ω ₂ ,ω ₃ ,ω ₄ }, b = {b ₁ ,b ₂ ,b ₃ ,b ₄ }, W is the weight set of the PCA-BP neural network model, and b is the threshold set of the PCA-BP neural network model; the forward propagation of the PCA-BP neural network model is shown in formula (17):

In the formula, q∈Z ⁺ indexes the layers of the network model, h∈q; X _q = [x _q,1 x _q,2 … x _q,m ] and Y _q = [y _q,1 y _q,2 … y _q,m ] represent the input and output of the q-th layer network respectively, x _q,m is the m-th input of the q-th layer network, and y _q,m is the m-th output of the q-th layer network; ω _h and b _h represent the weight coefficient and threshold of the h-th layer network respectively, ω _h ∈W, b _h ∈b, and when q=1, X _q is equal to X in formula (15), that is, X is the first layer input of the network model;
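A minimal forward pass through such a network might look like the sketch below. It assumes each layer computes Y = f(ωX + b), with the threshold added to the weighted sum, and that the output of one layer is the input of the next; formula (17) is not reproduced here, so that form is an assumption. Layer sizes, weights, and function names are hypothetical.

```python
import math

def sigmoid(x):
    """Sigmoid activation: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(x, weights, thresholds):
    """One layer: weights[k] is the weight row of neuron k, thresholds[k] its threshold."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, thresholds)]

def forward(x, layers):
    """Feed x through every (weights, thresholds) pair in order."""
    for weights, thresholds in layers:
        x = layer_forward(x, weights, thresholds)
    return x

# Hypothetical network: 2 inputs, three hidden layers of 2 neurons, 1 output.
zero_hidden = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
layers = [zero_hidden, zero_hidden, zero_hidden, ([[0.0, 0.0]], [0.0])]
y = forward([1.0, -2.0], layers)
```

With all weights and thresholds at zero, every neuron outputs sigmoid(0) = 0.5, which makes the sketch easy to check by hand.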

S27: Feed the training set data X obtained after PCA dimension reduction into the constructed PCA-BP neural network model to obtain the prediction result;

S28: The smooth curve cross entropy method is used to obtain the loss function L _q of each level of the PCA-BP neural network model, as shown in formula (18):

In the formula, Y′ _q represents the true value of each layer of the network;

S29: Calculate the back propagation gradients of the PCA-BP neural network model weights and thresholds, as shown in formulas (19) and (20):

S210: Update the PCA-BP neural network model parameters, expressed as formula (21):

Where l is the learning rate of the network model, which is used to indicate the speed of iterative convergence of network model training; ω′ _h and b′ _h respectively represent the updated weight and threshold of the PCA-BP neural network model;

S211: Repeat steps S26-S210 for a limited number of training rounds, after which the iterative update of the PCA-BP neural network parameters θ′ stops and the optimal network for a single local training is obtained.
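The update in S210 and the repetition in S211 can be sketched as a plain gradient-descent loop. The sketch assumes the update has the standard form "new parameter = old parameter minus learning rate times gradient"; formula (21) is not reproduced here. For brevity it uses a toy quadratic loss in place of the smooth curve cross entropy loss, and all function names are hypothetical.

```python
def gradient_step(params, grads, learning_rate):
    """One update: each parameter moves against its gradient, scaled by the learning rate l."""
    return [p - learning_rate * g for p, g in zip(params, grads)]

def train(params, grad_fn, learning_rate, rounds):
    """Repeat the update for a fixed, finite number of rounds."""
    for _ in range(rounds):
        params = gradient_step(params, grad_fn(params), learning_rate)
    return params

def toy_grad(params):
    """Gradient of the toy loss sum((p - 3)^2); it stands in for formulas (19)-(20)."""
    return [2.0 * (p - 3.0) for p in params]

theta = train([0.0], toy_grad, learning_rate=0.1, rounds=100)
```

A larger learning rate speeds up convergence on this toy loss until it becomes unstable, which is the trade-off the learning rate l controls.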

6. The method for load forecasting of energy aggregation service providers based on horizontal federated learning according to claim 5, characterized in that step S3 comprises the following steps:

S31: input the data of the test set into the single local training optimal network to obtain the load forecast value;

S32: Calculate the smooth curve cross entropy loss function L _j (θ) between the load prediction value and the true value of the test set, together with the back propagation gradients of the network weights and network thresholds, according to equations (18)-(20);

S33: Generate a single local optimal network parameter update value θ′ according to formula (21).

7. The method for load forecasting of energy aggregation service providers based on horizontal federated learning according to claim 6, wherein step S4 comprises the following steps:

S41: The local server uploads the single local optimal network parameter update value θ′ to the central server;

S42: After collecting the updated model parameters θ′ = {θ' _p (p = 1,2,…,P)} of all responding users, where θ' _p represents the single local optimal network parameter update value of the p-th responding user, the central server aggregates the parameters based on the FedAvg algorithm, calculates the weighted average loss of all terminal users, and fits the global model parameters θ through stochastic gradient descent (SGD), as shown in equations (22) and (23):

Where F _p (θ) represents the average loss of all data features of the p-th aggregator user, f(θ) is the function for updating θ, F _g (θ) represents the average loss of all data features of the g-th aggregator user, and g∈P is a subset;

S43: The central server of the energy aggregation service provider sends the global model parameter θ to all local servers, and the local server of the aggregator user updates the PCA-BP neural network model parameter θ′=θ based on the global model parameter;

S44: The PCA-BP neural network model of the local server uses the new parameter θ to perform a new round of training.
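The aggregation in S42 can be illustrated with the standard FedAvg rule, in which the central server averages the uploaded parameter vectors weighted by each user's local sample count. Equations (22) and (23) are not reproduced here, so the weighting by sample count and the flat-list parameter layout are assumptions; the function name is hypothetical.

```python
def fedavg(client_params, client_sizes):
    """Weighted average of the clients' parameter vectors.

    client_params: list of parameter vectors theta'_p, all of equal length
    client_sizes:  number of local samples held by each client (the weights)
    """
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [sum(n * theta[k] for theta, n in zip(client_params, client_sizes)) / total
            for k in range(dim)]

# Two hypothetical users: the second holds three times as much data.
theta_global = fedavg([[1.0, 2.0], [3.0, 6.0]], [1, 3])
```

Only parameter vectors and sample counts reach the central server in this sketch; the raw load data stay on each local server, which is the privacy property the method relies on.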

8. The method for load forecasting of energy aggregation service providers based on horizontal federated learning according to claim 7, characterized in that step S5 comprises the following steps:

S51: repeating steps S2-S4 to complete a limited number of interactions between the central server and the local server;

S52: After the R-th interaction, the global optimal model for load forecasting under the current market environment is obtained, where the communication round r=1,2,…,R counts the model parameter interactions between the aggregator user's local server and the energy aggregation service provider's central server;

S53: The energy aggregation service provider verifies the global optimal model and distributes response rewards to the corresponding aggregator users according to their degree of contribution to the global optimal model;

S54: using the final global optimal model parameter θ to perform a final load forecast of the aggregator user in the local server to obtain a global optimal load forecast value;

S55: Formulate corresponding market strategies according to the global optimal load forecast value to obtain benefits.

9. A load forecasting system for energy aggregation service providers based on horizontal federated learning, characterized in that it is used to implement the load forecasting method for energy aggregation service providers based on horizontal federated learning according to any one of claims 1 to 8, and comprises:

The data collection and processing module is used to collect the initial data of five feature types, namely time data, weather data, industry data, load data and economic data, and perform manual feature selection to form a local data set of aggregator users. The local data set of aggregator users is preprocessed, including outlier detection and missing value supplementation, and the discrete values in the data are encoded using one-hot encoding and sin/cos cyclic encoding, and the continuous values are normalized using mean-variance normalization operations; then, the preprocessed local data set of aggregator users is divided into a training set and a test set, which are used for model training and testing respectively;

The training module is used to send the data of the training set to the PCA-BP neural network model of the local server for the first training. The initial parameter θ ₀ of the model is given in this training. During the training, the data of the training set is firstly subjected to feature extraction by using PCA, including the standardization of the data feature matrix, the calculation of the covariance matrix and the singular value decomposition, so as to obtain the feature matrix after feature extraction. The extracted feature matrix is then input into the PCA-BP neural network model to obtain the initial load forecast value of each terminal user. In the back propagation, the smooth curve cross entropy method is used to calculate the loss value of the load forecast result and the true value, and after a finite number of rounds of iteration until the loss value is minimized, the optimal network for a single local training is obtained.

The parameter updating module is used to input the test set data into the single local training optimal network to obtain the load forecast information, and then calculate the smooth curve cross entropy loss function of the load forecast information and the true value and the back propagation gradient of the network weight and the network threshold to generate a single local optimal network parameter update value θ′;

The calculation module uploads the single local optimal network parameter update value θ′ to the central server through the local server. The central server calculates the weighted average loss of all terminal users based on the FedAvg algorithm and fits the global model parameter θ through stochastic gradient descent SGD. Then, the central server sends the global model parameter θ to the local server. The local server updates the local parameter θ′=θ based on θ. The PCA-BP neural network model of the local server uses the new parameter θ for a new round of training.

The load forecasting module obtains the global optimal model for load forecasting under the current market environment through continuous interactive iteration between the central server and the local server. The final global optimal model parameter θ is used to perform the final load forecast on the local server to obtain the global optimal load forecast value, and the corresponding market strategy is formulated based on this global optimal load forecast value to obtain benefits.
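The discrete-value encodings named in the data collection and processing module can be sketched as follows. The sketch assumes that periodic quantities such as the hour of day are mapped to a sin/cos pair and that categorical quantities such as the industry type are one-hot encoded; the category names and function names are hypothetical.

```python
import math

def cyclic_encode(value, period):
    """Map a periodic value (e.g. hour 0-23 with period 24) to a (sin, cos) pair."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)

def one_hot(value, categories):
    """Return a 0/1 vector with a single 1 at the position of `value`."""
    return [1 if value == c else 0 for c in categories]

# Hypothetical examples: hour of day and industry type.
midnight = cyclic_encode(0, 24)
late_evening = cyclic_encode(23, 24)
industry = one_hot("industrial", ["residential", "commercial", "industrial"])
```

The cyclic form keeps hour 23 close to hour 0 in the encoded space, which a plain integer encoding would not.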