CN116826743A - Power load prediction method based on federated graph neural network - Google Patents

Power load prediction method based on federated graph neural network

Info

Publication number
CN116826743A
Authority
CN
China
Prior art keywords
node
graph
power load
data
model
Prior art date
Legal status
Pending
Application number
CN202310868553.8A
Other languages
Chinese (zh)
Inventor
王峰
曹宇航
刘立
鲜学丰
方立刚
卜峰
Current Assignee
Suzhou Vocational University
Original Assignee
Suzhou Vocational University
Application filed by Suzhou Vocational University
Priority to CN202310868553.8A
Publication of CN116826743A
Legal status: Pending

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a power load prediction method based on a federated graph neural network, which comprises the following steps: S1, data acquisition; S2, data processing; S3, graph neural network construction; S4, supernode construction; S5, global graph structure building. The method enables efficient power load prediction in a distributed environment while preserving data privacy and allowing information interaction between clients.

Description

Power load prediction method based on federated graph neural network
Technical Field
The invention discloses a power load prediction method based on a federated graph neural network, and belongs to the field of power load prediction with machine learning and deep learning.
Background
In the operation of a power system, accurate power load prediction is essential: it helps power companies make effective dispatch decisions and ensures the stability and safety of the power supply.
Conventional power load prediction methods are typically based on statistical and machine learning techniques, which require a large amount of historical data as input to train models that can accurately predict future loads. In practice, however, power data is distributed across clients in different regions and often contains sensitive information, so centralized data processing carries risks of data security and privacy disclosure. In addition, collecting and processing a large amount of data in one place incurs heavy communication overhead and low computational efficiency.
With the development of federated learning and graph neural network (GNN) technology, new power load prediction methods have emerged. Federated learning allows multiple clients to jointly train a model by sharing model parameters rather than raw data, which addresses the data privacy problem. A graph neural network can process the graph-structured data of the power system and capture its topological information, thereby improving prediction accuracy.
However, in real-world scenarios, data in a power system is often stored in a distributed manner on multiple clients; centralized processing incurs heavy communication overhead and would require open data access between clients.
Disclosure of Invention
Most existing federated graph neural network prediction methods have difficulty accounting for information interaction between clients, which reduces the accuracy of the final prediction model. To address this, the invention provides a power load prediction method based on a federated graph neural network.
The technical scheme of the invention is as follows:
A power load prediction method based on a federated graph neural network comprises the following steps:
S1, data acquisition: collecting the power load data of each region through the data acquisition system of the smart grid; collecting connection relation data among the transmission towers, i.e., which towers are connected by transmission lines, according to the topological structure of the power system; collecting other relevant exogenous information for each region, including population density characteristics, climate characteristics and industrial output values;
S2, data processing: normalizing the collected power load data of each region, and constructing a graph structure of the power system from the connection relation data among the transmission towers, wherein nodes represent the transmission towers of each region, edges represent the transmission lines between towers, and the other collected data serve as node attributes for training the graph neural network model; the processed data set is divided into a training set, a validation set and a test set, with 70% of the data used for training, 15% for validation and the remaining 15% for testing;
in step S2, the data is scaled to [0,1] using Min-Max normalization:

x_normalized = (x - x_min) / (x_max - x_min)

wherein: x_normalized represents the normalized value, x represents the original data, and x_max and x_min represent the maximum and minimum values in the data, respectively;
the graph structure in step S2 is constructed as follows:
creating an adjacency matrix to represent the graph G = (V, E), wherein the rows and columns of the matrix correspond to the transmission towers of each region, and the adjacency matrix element A[i][j] = 1 if there is an edge between node i and node j, otherwise A[i][j] = 0;
s3, constructing a graph neural network: selecting GraphSAGE as a framework of a graph neural network, and extracting structure and node characteristic information in power load data of each region;
in step S3, the specific steps for using GraphSAGE as the architecture of the graph neural network are as follows:
S3-1, defining the model structure according to the GraphSAGE architecture as several GraphSAGE layers, which process node features and structural information, followed by a fully connected layer that generates the power load prediction;
S3-2, defining the mean absolute error as the loss function, and selecting Adam as the optimizer to minimize the loss;
S3-3, inputting training data into the GraphSAGE model; in each layer, for each node, a fixed number of nodes are randomly sampled from its neighbors and used as the input of the aggregation operation; performing this in every layer yields the local neighborhood of each node; in each GraphSAGE layer, an aggregation function collects local neighborhood information to the target node and generates an updated node representation, which is then transformed by a nonlinear activation function; this is repeated in every GraphSAGE layer; the GraphSAGE update rule is:

h_v^k = σ( W_k · AGGREGATE_k( {h_v^(k-1)} ∪ {h_u^(k-1), ∀u ∈ N(v)} ) )

wherein: h_v^k denotes the feature of node v at layer k, h_v^(k-1) the feature of node v at layer k-1, W_k the learnable weight matrix of layer k, N(v) the set of neighbor nodes of node v, and h_u^(k-1) the feature of neighbor node u at layer k-1;
S3-4, the fully connected layer applies a linear transformation to the input features and outputs the predicted power load:

y = W·X + b

wherein: W and b represent the weights and biases of the fully connected layer, X is the input feature, and y is the output prediction;
the mean absolute error (MAE) is used as the loss function:

MAE = (1/n) · Σ_(i=1)^(n) |Y_true,i - Y_pred,i|

wherein: n denotes the total number of power load values, Y_true the actual power load values, and Y_pred the predicted power load values;
s3-5, optimizing model parameters by using an Adam optimizer.
S4, supernode construction: for the graph each client constructs for its local jurisdiction, backbone network extraction is performed first; sampling subgraphs are then generated through high-centrality neighbor sampling with a breadth-first search (BFS) algorithm; the contribution of each sampling subgraph is computed via the cumulative contribution rate in kernel principal component analysis, and the minimal set of k subgraphs is found; finally, these subgraphs are fused through multi-layer-perceptron-based feature fusion to form a supernode that represents the features and structure of the client's local subgraph;
the specific process of supernode formation in step S4 is as follows:
S4-1, first, the importance of each node is measured by its degree centrality, computed as:

C_d(v) = degree(v) / (|V| - 1)

wherein: C_d(v) denotes the degree centrality of node v, and degree(v) denotes the degree of node v, i.e., the number of edges connected to it;
S4-2, arranging all nodes in descending order of degree centrality, and computing the importance sum T of all nodes:

T = Σ_(v∈V) C_d(v)

wherein: V denotes the node set of the original graph;
setting a cumulative contribution rate threshold P, and computing the number of nodes Q to extract for the backbone network:

Q = min{ q : Σ_(i=1)^(q) C_d(v_i) ≥ P·T }

wherein: Q is the smallest q such that the sum of the importance metrics of nodes 1 through q (in the sorted order) is greater than or equal to P times the total importance metric T;
extracting a subgraph containing the first Q nodes from the original graph as the backbone network; arranging the nodes in descending order of degree centrality, selecting the node with the highest centrality as the starting node, and performing neighbor sampling with the BFS algorithm until the sum of the importance metrics of the nodes in the subgraph reaches P times the sum over the original graph; then selecting the next node and repeating the BFS process until Q sampling subgraphs are obtained;
S4-3, for each sampling subgraph, training a GraphSAGE encoder to obtain the encoded representations of its nodes; this step is the same as step S3;
S4-4, creating a message-passing framework on each sampling subgraph; at each time step, collecting the features of each node's neighbors and aggregating them with an aggregation function to obtain a new feature representation; all new feature representations are then integrated into a unified subgraph embedding space; the update rules are as follows:
message generation function:

M(h_v, h_u) = h_v · W · h_u

wherein: h_v and h_u are the feature representations of node v and node u, respectively, and W is a learnable weight matrix;
aggregation function:

a_v = σ( Σ_(u∈N(v)) M(h_v, h_u) )

wherein: Σ_(u∈N(v)) M(h_v, h_u) represents the sum of all messages from neighbor nodes u to node v, and σ denotes the ReLU function;
state update function:

U(h_v, a_v) = h_v + a_v

wherein: a_v denotes the aggregated message;
S4-5, standardizing the data so that each feature has zero mean and unit standard deviation, and computing the kernel matrix with a Gaussian radial basis function kernel; centering the kernel matrix so that the data points mapped into the high-dimensional feature space have zero mean; computing the variance explained ratio, i.e., the contribution of each principal component;
the kernel matrix is computed with the Gaussian radial basis function kernel:

K(i,j) = exp( -γ·||x_i - x_j||² )

wherein: x_i and x_j are the i-th and j-th rows of the data matrix, γ is the width parameter of the Gaussian radial basis function kernel, and ||·|| denotes the Euclidean norm;
assuming the kernel matrix K is an n×n matrix, the centered kernel matrix is computed as:

K_centered = K - 1_n·K - K·1_n + 1_n·K·1_n

wherein: K denotes the kernel matrix, and 1_n denotes the n×n matrix with every entry equal to 1/n (the outer product of an n×1 all-ones vector and a 1×n all-ones vector, divided by n);
the variance explained ratio of each principal component is computed as:

ratio_i = λ_i / Σ_(j=1)^(m) λ_j

wherein: λ_i is the eigenvalue of the i-th representative subgraph, and m is the total number of representative subgraphs;
sorting all sampling subgraphs by contribution in descending order, then accumulating their contributions one by one from the top until the cumulative contribution exceeds 95% of the total contribution of the original subgraphs; the resulting k minimal subgraphs are the representative subgraphs;
S4-6, initializing a multi-layer perceptron model, wherein the input of each fully connected layer is the output of the previous layer and a nonlinear transformation is applied through an activation function; a fully connected layer has the form:

H_(l+1) = ReLU( H_l·W_l + b_l )

wherein: H_l is the output of layer l, and W_l and b_l are the weight and bias of layer l;
taking the embedded representation of each representative subgraph as input to the multi-layer-perceptron-based model; for each node embedding matrix H_i, the fused representation Z_i is obtained as:

Z_i = MLP(H_i)

the fused representations of all representative subgraphs are stacked together:

Z = [Z_1, Z_2, …, Z_K]

this fusion operation finally yields a high-dimensional supernode representing the features and structure of the client's local subgraph.
S5, building the global graph structure: repeating step S4 for the n clients formed from the power load information of n regions to obtain n supernodes; adding an edge between every pair of the n supernodes, by permutation and combination, for information interaction, forming a graph structure; training on this graph to obtain a model for power load prediction, and assigning a weight to each edge in the graph; then deleting edges in order of increasing weight, comparing after each deletion with the performance of the local model each client obtained by training on its local jurisdiction's graph, and stopping edge deletion when the model reaches optimal performance on all clients' subgraphs, yielding the global federated graph model.
The global graph structure is constructed as follows:
S5-1, repeating step S4 for the n clients formed from the power load information of n regions to obtain n supernodes;
S5-2, adding an edge between every pair of nodes, by permutation and combination, for information interaction, forming a graph structure in which each edge represents information interaction between its two nodes; assigning a weight to each edge using the Pearson correlation coefficient, where a higher correlation coefficient means a larger weight: the more similar the power load trends of the two regions, the more valuable the information interaction; the Pearson correlation coefficient is:

r = Σ_i (x_i - x_mean)(y_i - y_mean) / √( Σ_i (x_i - x_mean)² · Σ_i (y_i - y_mean)² )

wherein: x_i and y_i denote the power load values of region i and region j, respectively; x_mean denotes the mean of the power load series of region i, and y_mean the mean of the power load series of region j;
S5-3, training a GNN model with the graph structure obtained in S5-2, measuring the model's performance with the mean absolute error, and recording it;
S5-4, deleting edges in order of increasing Pearson correlation coefficient; after each edge deletion, repeating step S5-3 to train a new GNN model and record its performance; comparing the performance of each retrained model with that of the local model each client obtained by training on its local jurisdiction's graph; stopping edge deletion when the model reaches optimal performance on all clients' subgraphs, yielding the global federated graph model, which is distributed to each local client for power load prediction.
The beneficial effects of the invention are as follows:
the invention discloses a power load prediction method based on a federal graph neural network, which realizes high-efficiency power load prediction in a distributed environment under the privacy protection and information interaction limitation among clients. Traditional power load prediction methods typically require data from different regions to be centralized at a central server for training and reasoning. However, in a real world scenario, data in a power system is often stored in a distributed manner on multiple clients, centralized processing communication overhead is large, and open data access between clients is required. The invention provides a power load prediction method based on a federal graph neural network.
In the method, each client maintains a local map model that represents the relationships between the various regions in the power system. Nodes represent transmission towers of each region, and edges represent transmission conductors between transmission towers. To better capture key nodes and structural information in the graph, a topology backbone is extracted at each client, which is a subgraph containing nodes of high centrality and importance in the graph. In this way, computational complexity and communication overhead can be reduced while preserving the primary topological features in the graph. A neighbor sampling strategy based on Breadth First Search (BFS) is adopted to generate a sub-graph, self-adaptive Top-K neighbor sampling is realized by utilizing the accumulated contribution rate in kernel principal component analysis, the sub-graph generation process is further optimized, and finally a super node capable of representing the characteristics and the structure of a local sub-graph of a client is obtained. Then, the supernodes of each region are connected with each other to form a supernode diagram, and training is carried out by using the GNN model. And then, deleting edges in the graph step by step, comparing the edges with the local model performance obtained by each client through graph training in a local jurisdiction, obtaining a federal global graph model when the model reaches the optimal performance on sub-graphs of all the clients, and sending the model to each local client for power load prediction, so that the model performance of each local client is improved.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a supernode formation flow chart;
FIG. 3 is a diagram of edge pruning and model optimization.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for more clearly illustrating the technical aspects of the present invention, and are not intended to limit the scope of the present invention.
As shown in FIG. 1, a power load prediction method based on a federated graph neural network includes the following steps:
S1, data acquisition: collecting the power load data of each region through the data acquisition system of the smart grid; collecting connection relation data among the transmission towers, i.e., which towers are connected by transmission lines, according to the topological structure of the power system; collecting other relevant exogenous information for each region, including population density characteristics, climate characteristics and industrial output values;
S2, data processing: normalizing the collected power load data of each region, and constructing a graph structure of the power system from the connection relation data among the transmission towers, wherein nodes represent the transmission towers of each region, edges represent the transmission lines between towers, and the other collected data serve as node attributes for training the graph neural network model; the processed data set is divided into a training set, a validation set and a test set, with 70% of the data used for training, 15% for validation and the remaining 15% for testing;
in step S2, the data is scaled to [0,1] using Min-Max normalization:

x_normalized = (x - x_min) / (x_max - x_min)

wherein: x_normalized represents the normalized value, x represents the original data, and x_max and x_min represent the maximum and minimum values in the data, respectively;
the graph structure in step S2 is constructed as follows:
creating an adjacency matrix to represent the graph G = (V, E), wherein the rows and columns of the matrix correspond to the transmission towers of each region, and the adjacency matrix element A[i][j] = 1 if there is an edge between node i and node j, otherwise A[i][j] = 0;
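A minimal sketch of step S2 in Python (NumPy) is given below; the tower count, line list and load values are hypothetical, and the symmetric (undirected) adjacency follows from the A[i][j] convention in the description:

```python
import numpy as np

def min_max_normalize(x: np.ndarray) -> np.ndarray:
    """Scale a power load series to [0, 1]: (x - x_min) / (x_max - x_min)."""
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min)

def build_adjacency(num_towers: int, lines: list) -> np.ndarray:
    """Adjacency matrix of G = (V, E): A[i][j] = 1 if a transmission
    line connects towers i and j, otherwise 0 (symmetric, undirected)."""
    A = np.zeros((num_towers, num_towers), dtype=int)
    for i, j in lines:
        A[i, j] = A[j, i] = 1
    return A

# Hypothetical example: four towers connected in a ring.
loads = np.array([120.0, 180.0, 95.0, 150.0])
print(min_max_normalize(loads))                       # values in [0, 1]
print(build_adjacency(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))
```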
s3, constructing a graph neural network: selecting GraphSAGE as a framework of a graph neural network, and extracting structure and node characteristic information in power load data of each region;
in step S3, the specific steps for using GraphSAGE as the architecture of the graph neural network are as follows:
S3-1, defining the model structure according to the GraphSAGE architecture as several GraphSAGE layers, which process node features and structural information, followed by a fully connected layer that generates the power load prediction;
S3-2, defining the mean absolute error as the loss function, and selecting Adam as the optimizer to minimize the loss;
S3-3, inputting training data into the GraphSAGE model; in each layer, for each node, a fixed number of nodes are randomly sampled from its neighbors and used as the input of the aggregation operation; performing this in every layer yields the local neighborhood of each node; in each GraphSAGE layer, an aggregation function collects local neighborhood information to the target node and generates an updated node representation, which is then transformed by a nonlinear activation function; this is repeated in every GraphSAGE layer; the GraphSAGE update rule is:

h_v^k = σ( W_k · AGGREGATE_k( {h_v^(k-1)} ∪ {h_u^(k-1), ∀u ∈ N(v)} ) )

wherein: h_v^k denotes the feature of node v at layer k, h_v^(k-1) the feature of node v at layer k-1, W_k the learnable weight matrix of layer k, N(v) the set of neighbor nodes of node v, and h_u^(k-1) the feature of neighbor node u at layer k-1;
S3-4, the fully connected layer applies a linear transformation to the input features and outputs the predicted power load:

y = W·X + b

wherein: W and b represent the weights and biases of the fully connected layer, X is the input feature, and y is the output prediction;
the mean absolute error (MAE) is used as the loss function:

MAE = (1/n) · Σ_(i=1)^(n) |Y_true,i - Y_pred,i|

wherein: n denotes the total number of power load values, Y_true the actual power load values, and Y_pred the predicted power load values;
s3-5, optimizing model parameters by using an Adam optimizer.
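Steps S3-1 to S3-5 can be sketched with PyTorch Geometric's SAGEConv as below; the two-layer depth, layer sizes, learning rate and single-value output are illustrative assumptions, and full-batch message passing stands in for the random neighbor sampling of step S3-3 (which a neighbor-sampling loader would provide in practice):

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv

class LoadSAGE(torch.nn.Module):
    """Several GraphSAGE layers followed by a fully connected output
    layer, mirroring steps S3-1 and S3-4."""
    def __init__(self, in_dim: int, hidden_dim: int):
        super().__init__()
        self.sage1 = SAGEConv(in_dim, hidden_dim)
        self.sage2 = SAGEConv(hidden_dim, hidden_dim)
        self.fc = torch.nn.Linear(hidden_dim, 1)   # y = W·X + b

    def forward(self, x, edge_index):
        h = F.relu(self.sage1(x, edge_index))      # aggregate + nonlinearity
        h = F.relu(self.sage2(h, edge_index))
        return self.fc(h).squeeze(-1)              # predicted load per node

model = LoadSAGE(in_dim=8, hidden_dim=32)          # 8 node attributes (assumed)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = torch.nn.L1Loss()                        # mean absolute error (MAE)

def train_step(x, edge_index, y_true):
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(x, edge_index), y_true)   # MAE between Y_pred, Y_true
    loss.backward()
    optimizer.step()
    return loss.item()
```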
S4, supernode construction: as shown in FIG. 2, for the graph each client constructs for its local jurisdiction, backbone network extraction is performed first; sampling subgraphs are then generated through high-centrality neighbor sampling with a breadth-first search (BFS) algorithm; the contribution of each sampling subgraph is computed via the cumulative contribution rate in kernel principal component analysis, and the minimal set of k subgraphs is found; finally, these subgraphs are fused through multi-layer-perceptron-based feature fusion to form a supernode that represents the features and structure of the client's local subgraph;
the specific process of supernode formation in step S4 is as follows:
S4-1, first, the importance of each node is measured by its degree centrality, computed as:

C_d(v) = degree(v) / (|V| - 1)

wherein: C_d(v) denotes the degree centrality of node v, and degree(v) denotes the degree of node v, i.e., the number of edges connected to it;
S4-2, arranging all nodes in descending order of degree centrality, and computing the importance sum T of all nodes:

T = Σ_(v∈V) C_d(v)

wherein: V denotes the node set of the original graph;
setting a cumulative contribution rate threshold P, and computing the number of nodes Q to extract for the backbone network:

Q = min{ q : Σ_(i=1)^(q) C_d(v_i) ≥ P·T }

wherein: Q is the smallest q such that the sum of the importance metrics of nodes 1 through q (in the sorted order) is greater than or equal to P times the total importance metric T;
extracting a subgraph containing the first Q nodes from the original graph as the backbone network; arranging the nodes in descending order of degree centrality, selecting the node with the highest centrality as the starting node, and performing neighbor sampling with the BFS algorithm until the sum of the importance metrics of the nodes in the subgraph reaches P times the sum over the original graph; then selecting the next node and repeating the BFS process until Q sampling subgraphs are obtained;
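A sketch of the backbone extraction and BFS-based neighbor sampling of steps S4-1/S4-2 using NetworkX follows; the threshold P = 0.9 is illustrative, and `nx.degree_centrality` implements C_d(v) = degree(v) / (|V| - 1):

```python
import networkx as nx

def extract_backbone(G: nx.Graph, P: float = 0.9):
    """Rank nodes by degree centrality and keep the smallest prefix
    whose importance sum reaches P * T (steps S4-1/S4-2)."""
    cent = nx.degree_centrality(G)                 # C_d(v) = degree(v)/(|V|-1)
    ranked = sorted(cent, key=cent.get, reverse=True)
    T = sum(cent.values())
    total, backbone = 0.0, []
    for v in ranked:
        backbone.append(v)
        total += cent[v]
        if total >= P * T:                         # cumulative threshold reached
            break
    return G.subgraph(backbone).copy(), cent

def bfs_sample(G: nx.Graph, start, cent: dict, budget: float) -> nx.Graph:
    """High-centrality neighbor sampling: BFS outward from `start` until
    the sampled importance sum reaches `budget` (= P times the original
    graph's importance sum in the description)."""
    sampled, total = [start], cent[start]
    for _, v in nx.bfs_edges(G, start):
        if total >= budget:
            break
        sampled.append(v)
        total += cent[v]
    return G.subgraph(sampled).copy()
```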
S4-3, for each sampling subgraph, training a GraphSAGE encoder to obtain the encoded representations of its nodes; this step is the same as step S3;
S4-4, creating a message-passing framework on each sampling subgraph; at each time step, collecting the features of each node's neighbors and aggregating them with an aggregation function to obtain a new feature representation; all new feature representations are then integrated into a unified subgraph embedding space; the update rules are as follows:
message generation function:

M(h_v, h_u) = h_v · W · h_u

wherein: h_v and h_u are the feature representations of node v and node u, respectively, and W is a learnable weight matrix;
aggregation function:

a_v = σ( Σ_(u∈N(v)) M(h_v, h_u) )

wherein: Σ_(u∈N(v)) M(h_v, h_u) represents the sum of all messages from neighbor nodes u to node v, and σ denotes the ReLU function;
state update function:

U(h_v, a_v) = h_v + a_v

wherein: a_v denotes the aggregated message;
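Because M(h_v, h_u) = h_v · W · h_u leaves the product shapes open, the sketch below adopts one possible reading that keeps each message a d-dimensional vector (elementwise product of h_v with W·h_u); the embedding size and random features are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                      # embedding size, hypothetical
W = rng.normal(scale=0.1, size=(d, d))     # learnable weight matrix

def message(h_v, h_u):
    # One reading of M(h_v, h_u) = h_v * W * h_u that yields a d-vector.
    return h_v * (W @ h_u)

def aggregate(h_v, neighbor_feats):
    # a_v = ReLU( sum of messages from all neighbors u of v ).
    a_v = sum(message(h_v, h_u) for h_u in neighbor_feats)
    return np.maximum(a_v, 0.0)            # ReLU

def update(h_v, a_v):
    return h_v + a_v                       # U(h_v, a_v) = h_v + a_v

h_v = rng.normal(size=d)
neighbors = [rng.normal(size=d) for _ in range(3)]
h_v_new = update(h_v, aggregate(h_v, neighbors))
```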
S4-5, standardizing the data so that each feature has zero mean and unit standard deviation, and computing the kernel matrix with a Gaussian radial basis function kernel; centering the kernel matrix so that the data points mapped into the high-dimensional feature space have zero mean; computing the variance explained ratio, i.e., the contribution of each principal component;
the kernel matrix is computed with the Gaussian radial basis function kernel:

K(i,j) = exp( -γ·||x_i - x_j||² )

wherein: x_i and x_j are the i-th and j-th rows of the data matrix, γ is the width parameter of the Gaussian radial basis function kernel, and ||·|| denotes the Euclidean norm;
assuming the kernel matrix K is an n×n matrix, the centered kernel matrix is computed as:

K_centered = K - 1_n·K - K·1_n + 1_n·K·1_n

wherein: K denotes the kernel matrix, and 1_n denotes the n×n matrix with every entry equal to 1/n (the outer product of an n×1 all-ones vector and a 1×n all-ones vector, divided by n);
the variance explained ratio of each principal component is computed as:

ratio_i = λ_i / Σ_(j=1)^(m) λ_j

wherein: λ_i is the eigenvalue of the i-th representative subgraph, and m is the total number of representative subgraphs;
sorting all sampling subgraphs by contribution in descending order, then accumulating their contributions one by one from the top until the cumulative contribution exceeds 95% of the total contribution of the original subgraphs; the resulting k minimal subgraphs are the representative subgraphs;
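Step S4-5 can be sketched with NumPy and scikit-learn's rbf_kernel; treating each sampling subgraph's embedding as one row of X, as well as γ = 0.1 and the 95% threshold defaults, are assumptions for illustration:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def contribution_ratios(X: np.ndarray, gamma: float = 0.1) -> np.ndarray:
    """Standardize, build the RBF kernel matrix, center it, and return
    the variance explained ratio of each principal component."""
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)  # zero mean, unit std
    K = rbf_kernel(X, gamma=gamma)           # K(i,j) = exp(-γ||x_i - x_j||²)
    n = K.shape[0]
    one_n = np.ones((n, n)) / n              # the 1_n centering matrix
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    lam = np.clip(np.linalg.eigvalsh(Kc)[::-1], 0.0, None)  # eigenvalues, desc.
    return lam / lam.sum()                   # ratio_i = λ_i / Σ_j λ_j

def select_representatives(ratios: np.ndarray, threshold: float = 0.95):
    """Keep the smallest k subgraphs whose cumulative contribution
    exceeds `threshold` of the total (the representative subgraphs)."""
    order = np.argsort(ratios)[::-1]         # sort by contribution, descending
    k = int(np.searchsorted(np.cumsum(ratios[order]), threshold) + 1)
    return order[:k]
```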
S4-6, initializing a multi-layer perceptron model, wherein the input of each fully connected layer is the output of the previous layer and a nonlinear transformation is applied through an activation function; a fully connected layer has the form:

H_(l+1) = ReLU( H_l·W_l + b_l )

wherein: H_l is the output of layer l, and W_l and b_l are the weight and bias of layer l;
taking the embedded representation of each representative subgraph as input to the multi-layer-perceptron-based model; for each node embedding matrix H_i, the fused representation Z_i is obtained as:

Z_i = MLP(H_i)

the fused representations of all representative subgraphs are stacked together:

Z = [Z_1, Z_2, …, Z_K]

this fusion operation finally yields a high-dimensional supernode representing the features and structure of the client's local subgraph.
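A sketch of the S4-6 fusion in PyTorch follows; the mean-pooling of each node-embedding matrix H_i into one vector before the MLP, the layer sizes, and averaging the stacked Z into the final supernode are assumptions made for illustration:

```python
import torch

class FusionMLP(torch.nn.Module):
    """Fuse each representative subgraph's node-embedding matrix H_i
    into a single vector Z_i, then stack the Z_i into Z (step S4-6)."""
    def __init__(self, embed_dim: int, hidden_dim: int, out_dim: int):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(embed_dim, hidden_dim),
            torch.nn.ReLU(),                 # H_(l+1) = ReLU(H_l W_l + b_l)
            torch.nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, H_i: torch.Tensor) -> torch.Tensor:
        # Mean-pool node embeddings to one vector per subgraph before the
        # MLP -- a pooling choice assumed here, not fixed by the description.
        return self.net(H_i.mean(dim=0))

mlp = FusionMLP(embed_dim=32, hidden_dim=64, out_dim=16)
H_list = [torch.randn(n, 32) for n in (12, 9, 15)]  # K=3 representative subgraphs
Z = torch.stack([mlp(H) for H in H_list])           # Z = [Z_1, ..., Z_K]
supernode = Z.mean(dim=0)  # one reading of the final fusion into a supernode
```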
S5, building the global graph structure: as shown in FIG. 3, repeating step S4 for the n clients formed from the power load information of n regions to obtain n supernodes; adding an edge between every pair of the n supernodes, by permutation and combination, for information interaction, forming a graph structure; training on this graph to obtain a model for power load prediction, and assigning a weight to each edge in the graph; then deleting edges in order of increasing weight, comparing after each deletion with the performance of the local model each client obtained by training on its local jurisdiction's graph, and stopping edge deletion when the model reaches optimal performance on all clients' subgraphs, yielding the global federated graph model.
The global graph structure is constructed as follows:
S5-1, repeating step S4 for the n clients formed from the power load information of n regions to obtain n supernodes;
S5-2, adding an edge between every pair of nodes, by permutation and combination, for information interaction, forming a graph structure in which each edge represents information interaction between its two nodes; assigning a weight to each edge using the Pearson correlation coefficient, where a higher correlation coefficient means a larger weight: the more similar the power load trends of the two regions, the more valuable the information interaction; the Pearson correlation coefficient is:

r = Σ_i (x_i - x_mean)(y_i - y_mean) / √( Σ_i (x_i - x_mean)² · Σ_i (y_i - y_mean)² )

wherein: x_i and y_i denote the power load values of region i and region j, respectively; x_mean denotes the mean of the power load series of region i, and y_mean the mean of the power load series of region j;
S5-3, training a GNN model with the graph structure obtained in S5-2, measuring the model's performance with the mean absolute error, and recording it;
S5-4, deleting edges in order of increasing Pearson correlation coefficient; after each edge deletion, repeating step S5-3 to train a new GNN model and record its performance; comparing the performance of each retrained model with that of the local model each client obtained by training on its local jurisdiction's graph; stopping edge deletion when the model reaches optimal performance on all clients' subgraphs, yielding the global federated graph model, which is distributed to each local client for power load prediction.
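The S5-2/S5-4 edge weighting and pruning loop can be sketched as below; `eval_model` is a hypothetical stand-in for retraining the GNN after each deletion and comparing with the clients' local models (assumed to return a score where higher is better):

```python
import numpy as np
from itertools import combinations

def pearson(x: np.ndarray, y: np.ndarray) -> float:
    """Pearson correlation between two regions' power load series."""
    xd, yd = x - x.mean(), y - y.mean()
    return float((xd * yd).sum() / np.sqrt((xd ** 2).sum() * (yd ** 2).sum()))

def prune_edges(region_loads: list, eval_model) -> list:
    """Weight every supernode pair by Pearson correlation, then delete
    edges from the smallest weight upward while performance improves."""
    weights = {(i, j): pearson(region_loads[i], region_loads[j])
               for i, j in combinations(range(len(region_loads)), 2)}
    edges = sorted(weights, key=weights.get)   # ascending edge weight
    best = eval_model(edges)                   # score of the fully connected graph
    while len(edges) > 1:
        trial = edges[1:]                      # drop the weakest edge
        score = eval_model(trial)
        if score <= best:                      # stopped improving: keep edges
            break
        best, edges = score, trial
    return edges
```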
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.

Claims (6)

1. A power load prediction method based on a federated graph neural network, characterized by comprising the following steps:
S1, data acquisition: collecting the power load data of each region through the data acquisition system of the smart grid; collecting connection relation data among the transmission towers, i.e., which towers are connected by transmission lines, according to the topological structure of the power system; collecting other relevant exogenous information for each region, including population density characteristics, climate characteristics and industrial output values;
S2, data processing: normalizing the collected power load data of each region, and constructing a graph structure of the power system from the connection relation data among the transmission towers, wherein nodes represent the transmission towers of each region, edges represent the transmission lines between towers, and the other collected data serve as node attributes for training the graph neural network model; the processed data set is divided into a training set, a validation set and a test set, with 70% of the data used for training, 15% for validation and the remaining 15% for testing;
S3, constructing the graph neural network: selecting GraphSAGE as the architecture of the graph neural network, and extracting structural and node feature information from the power load data of each region;
S4, supernode construction: for the graph each client constructs for its local jurisdiction, backbone network extraction is performed first; sampling subgraphs are then generated through high-centrality neighbor sampling with a breadth-first search (BFS) algorithm; the contribution of each sampling subgraph is computed via the cumulative contribution rate in kernel principal component analysis, and the minimal set of k subgraphs is found; finally, these subgraphs are fused through multi-layer-perceptron-based feature fusion to form a supernode that represents the features and structure of the client's local subgraph;
S5, building the global graph structure: repeating step S4 for the n clients formed from the power load information of n regions to obtain n supernodes; adding an edge between every pair of the n supernodes, by permutation and combination, for information interaction, forming a graph structure; training on this graph to obtain a model for power load prediction, and assigning a weight to each edge in the graph; then deleting edges in order of increasing weight, comparing after each deletion with the performance of the local model each client obtained by training on its local jurisdiction's graph, and stopping edge deletion when the model reaches optimal performance on all clients' subgraphs, yielding the global federated graph model.
2. The power load prediction method based on the federated graph neural network according to claim 1, characterized in that the data processing in step S2 scales the data to [0,1] using Min-Max normalization:

x_normalized = (x - x_min) / (x_max - x_min)

wherein: x_normalized represents the normalized value, x represents the original data, and x_max and x_min represent the maximum and minimum values in the data, respectively.
3. The power load prediction method based on the federated graph neural network according to claim 1, characterized in that the graph structure in step S2 is constructed as follows:
creating an adjacency matrix to represent the graph G = (V, E), wherein the rows and columns of the matrix correspond to the transmission towers of each region, and the adjacency matrix element A[i][j] = 1 if there is an edge between node i and node j, otherwise A[i][j] = 0.
4. The power load prediction method based on the federated graph neural network according to claim 1, characterized in that the specific steps of using GraphSAGE as the architecture of the graph neural network in step S3 are as follows:
S3-1, defining the model structure according to the GraphSAGE architecture as several GraphSAGE layers, which process node features and structural information, followed by a fully connected layer that generates the power load prediction;
S3-2, defining the mean absolute error as the loss function, and selecting Adam as the optimizer to minimize the loss;
S3-3, inputting training data into the GraphSAGE model; in each layer, for each node, a fixed number of nodes are randomly sampled from its neighbors and used as the input of the aggregation operation; performing this in every layer yields the local neighborhood of each node; in each GraphSAGE layer, an aggregation function collects local neighborhood information to the target node and generates an updated node representation, which is then transformed by a nonlinear activation function; this is repeated in every GraphSAGE layer; the GraphSAGE update rule is:

h_v^k = σ( W_k · AGGREGATE_k( {h_v^(k-1)} ∪ {h_u^(k-1), ∀u ∈ N(v)} ) )

wherein: h_v^k denotes the feature of node v at layer k, h_v^(k-1) the feature of node v at layer k-1, W_k the learnable weight matrix of layer k, N(v) the set of neighbor nodes of node v, and h_u^(k-1) the feature of neighbor node u at layer k-1;
S3-4, the fully connected layer applies a linear transformation to the input features and outputs the predicted power load:

y = W·X + b

wherein: W and b represent the weights and biases of the fully connected layer, X is the input feature, and y is the output prediction;
the mean absolute error (MAE) is used as the loss function:

MAE = (1/n) · Σ_(i=1)^(n) |Y_true,i - Y_pred,i|

wherein: n denotes the total number of power load values, Y_true the actual power load values, and Y_pred the predicted power load values;
S3-5, optimizing the model parameters with the Adam optimizer.
5. The power load prediction method based on the federated graph neural network according to claim 1, characterized in that the specific process of supernode formation in step S4 is as follows:
S4-1, first, the importance of each node is measured by its degree centrality, computed as:

C_d(v) = degree(v) / (|V| - 1)

wherein: C_d(v) denotes the degree centrality of node v, and degree(v) denotes the degree of node v, i.e., the number of edges connected to it;
S4-2, arranging all nodes in descending order of degree centrality, and computing the importance sum T of all nodes:

T = Σ_(v∈V) C_d(v)

wherein: V denotes the node set of the original graph;
setting a cumulative contribution rate threshold P, and computing the number of nodes Q to extract for the backbone network:

Q = min{ q : Σ_(i=1)^(q) C_d(v_i) ≥ P·T }

wherein: Q is the smallest q such that the sum of the importance metrics of nodes 1 through q (in the sorted order) is greater than or equal to P times the total importance metric T;
extracting a subgraph containing the first Q nodes from the original graph as the backbone network; arranging the nodes in descending order of degree centrality, selecting the node with the highest centrality as the starting node, and performing neighbor sampling with the BFS algorithm until the sum of the importance metrics of the nodes in the subgraph reaches P times the sum over the original graph; then selecting the next node and repeating the BFS process until Q sampling subgraphs are obtained;
S4-3, for each sampling subgraph, training a GraphSAGE encoder to obtain the encoded representations of its nodes; this step is the same as step S3;
S4-4, creating a message-passing framework on each sampling subgraph; at each time step, collecting the features of each node's neighbors and aggregating them with an aggregation function to obtain a new feature representation; all new feature representations are then integrated into a unified subgraph embedding space; the update rules are as follows:
message generation function:

M(h_v, h_u) = h_v · W · h_u

wherein: h_v and h_u are the feature representations of node v and node u, respectively, and W is a learnable weight matrix;
aggregation function:

a_v = σ( Σ_(u∈N(v)) M(h_v, h_u) )

wherein: Σ_(u∈N(v)) M(h_v, h_u) represents the sum of all messages from neighbor nodes u to node v, and σ denotes the ReLU function;
state update function:

U(h_v, a_v) = h_v + a_v

wherein: a_v denotes the aggregated message;
S4-5, standardizing the data so that each feature has zero mean and unit standard deviation, and computing the kernel matrix with a Gaussian radial basis function kernel; centering the kernel matrix so that the data points mapped into the high-dimensional feature space have zero mean; computing the variance explained ratio, i.e., the contribution of each principal component;
the kernel matrix is computed with the Gaussian radial basis function kernel:

K(i,j) = exp( -γ·||x_i - x_j||² )

wherein: x_i and x_j are the i-th and j-th rows of the data matrix, γ is the width parameter of the Gaussian radial basis function kernel, and ||·|| denotes the Euclidean norm;
assuming the kernel matrix K is an n×n matrix, the centered kernel matrix is computed as:

K_centered = K - 1_n·K - K·1_n + 1_n·K·1_n

wherein: K denotes the kernel matrix, and 1_n denotes the n×n matrix with every entry equal to 1/n (the outer product of an n×1 all-ones vector and a 1×n all-ones vector, divided by n);
the variance explained ratio of each principal component is computed as:

ratio_i = λ_i / Σ_(j=1)^(m) λ_j

wherein: λ_i is the eigenvalue of the i-th representative subgraph, and m is the total number of representative subgraphs;
sorting all sampling subgraphs by contribution in descending order, then accumulating their contributions one by one from the top until the cumulative contribution exceeds 95% of the total contribution of the original subgraphs; the resulting k minimal subgraphs are the representative subgraphs;
S4-6, initializing a multi-layer perceptron model, wherein the input of each fully connected layer is the output of the previous layer and a nonlinear transformation is applied through an activation function; a fully connected layer has the form:

H_(l+1) = ReLU( H_l·W_l + b_l )

wherein: H_l is the output of layer l, and W_l and b_l are the weight and bias of layer l;
taking the embedded representation of each representative subgraph as input to the multi-layer-perceptron-based model; for each node embedding matrix H_i, the fused representation Z_i is obtained as:

Z_i = MLP(H_i)

the fused representations of all representative subgraphs are stacked together:

Z = [Z_1, Z_2, …, Z_K]

this fusion operation finally yields a high-dimensional supernode representing the features and structure of the client's local subgraph.
6. The power load prediction method based on the federated graph neural network according to claim 1, characterized in that the global graph structure in step S5 is constructed as follows:
S5-1, repeating step S4 for the n clients formed from the power load information of n regions to obtain n supernodes;
S5-2, adding an edge between every pair of nodes, by permutation and combination, for information interaction, forming a graph structure in which each edge represents information interaction between its two nodes; assigning a weight to each edge using the Pearson correlation coefficient, where a higher correlation coefficient means a larger weight: the more similar the power load trends of the two regions, the more valuable the information interaction; the Pearson correlation coefficient is:

r = Σ_i (x_i - x_mean)(y_i - y_mean) / √( Σ_i (x_i - x_mean)² · Σ_i (y_i - y_mean)² )

wherein: x_i and y_i denote the power load values of region i and region j, respectively; x_mean denotes the mean of the power load series of region i, and y_mean the mean of the power load series of region j;
S5-3, training a GNN model with the graph structure obtained in S5-2, measuring the model's performance with the mean absolute error, and recording it;
S5-4, deleting edges in order of increasing Pearson correlation coefficient; after each edge deletion, repeating step S5-3 to train a new GNN model and record its performance; comparing the performance of each retrained model with that of the local model each client obtained by training on its local jurisdiction's graph; stopping edge deletion when the model reaches optimal performance on all clients' subgraphs, yielding the global federated graph model, which is distributed to each local client for power load prediction.
CN202310868553.8A 2023-07-14 2023-07-14 Power load prediction method based on federated graph neural network Pending CN116826743A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310868553.8A CN116826743A (en) Power load prediction method based on federated graph neural network

Publications (1)

Publication Number Publication Date
CN116826743A true CN116826743A (en) 2023-09-29

Family

ID=88112694

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310868553.8A Pending CN116826743A (en) 2023-07-14 2023-07-14 Power load prediction method based on federal graph neural network

Country Status (1)

Country Link
CN (1) CN116826743A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117453414A (en) * 2023-11-10 2024-01-26 国网山东省电力公司营销服务中心(计量中心) Contribution degree prediction method and system for participation of power data in data sharing calculation



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination