CN116826743A - Power load prediction method based on federated graph neural network - Google Patents

Power load prediction method based on federated graph neural network

Info

Publication number
CN116826743A
Authority
CN
China
Prior art keywords
node
graph
power load
data
model
Prior art date
Legal status
Pending
Application number
CN202310868553.8A
Other languages
Chinese (zh)
Inventor
王峰
曹宇航
刘立
鲜学丰
方立刚
卜峰
Current Assignee
Suzhou Vocational University
Original Assignee
Suzhou Vocational University
Application filed by Suzhou Vocational University
Priority to CN202310868553.8A
Publication of CN116826743A
Legal status: Pending

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a power load prediction method based on a federated graph neural network, which comprises the following steps: S1, data acquisition; S2, data processing; S3, graph neural network construction; S4, supernode construction; S5, global graph structure building. The method enables efficient power load prediction in a distributed environment while preserving data privacy and allowing information interaction between clients.

Description

Power load prediction method based on federated graph neural network
Technical Field
The invention discloses a power load prediction method based on a federated graph neural network, and belongs to the field of power load prediction with machine learning and deep learning.
Background
In the operation of a power system, accurate power load prediction is essential: it helps power companies make effective dispatch decisions and ensures the stability and safety of the power supply.
Conventional power load prediction methods are typically based on statistical and machine learning techniques, which require a large amount of historical data as input to train models that can accurately predict future loads. In practice, however, power data is distributed across clients in different regions and often contains sensitive information, so centralized data processing carries risks of data security and privacy disclosure. In addition, collecting and processing a large amount of data in one place incurs heavy communication overhead and low computational efficiency.
With the development of federated learning and graph neural network (GNN) technology, new power load prediction methods have emerged. Federated learning allows multiple clients to jointly train a model by sharing model parameters rather than raw data, which addresses the data privacy problem. A graph neural network can process the graph-structured data of the power system and capture its topological information, thereby improving prediction accuracy.
However, in real-world scenarios, data in a power system is often stored in a distributed manner on multiple clients; centralized processing incurs heavy communication overhead and would require open data access between clients.
Disclosure of Invention
Most existing federated graph neural network prediction methods have difficulty accounting for information interaction between clients, which reduces the accuracy of the final prediction model. To address this, the invention provides a power load prediction method based on a federated graph neural network.
The technical scheme of the invention is as follows:
A power load prediction method based on a federated graph neural network comprises the following steps:
S1, data acquisition: collecting the power load data of each region through the data acquisition system of the smart grid; collecting connection relation data among the transmission towers, i.e., which towers are connected by transmission lines, according to the topological structure of the power system; collecting other relevant exogenous information for each region, including population density characteristics, climate characteristics and industrial output values;
S2, data processing: normalizing the collected power load data of each region, and constructing a graph structure of the power system from the connection relation data among the transmission towers, wherein nodes represent the transmission towers of each region, edges represent the transmission lines between towers, and the other collected data serve as node attributes for training the graph neural network model; the processed data set is divided into a training set, a validation set and a test set, with 70% of the data used for training, 15% for validation and the remaining 15% for testing;
in step S2, the data is scaled to [0,1] using Min-Max normalization:

x_normalized = (x - x_min) / (x_max - x_min)

wherein: x_normalized represents the normalized value, x represents the original data, and x_max and x_min represent the maximum and minimum values in the data, respectively;
the graph structure in step S2 is constructed as follows:
creating an adjacency matrix to represent the graph G = (V, E), wherein the rows and columns of the matrix correspond to the transmission towers of each region, and the adjacency matrix element A[i][j] = 1 if there is an edge between node i and node j, otherwise A[i][j] = 0;
s3, constructing a graph neural network: selecting GraphSAGE as a framework of a graph neural network, and extracting structure and node characteristic information in power load data of each region;
in step S3, the specific steps for using GraphSAGE as the architecture of the graph neural network are as follows:
S3-1, defining the model structure according to the GraphSAGE architecture as several GraphSAGE layers, which process node features and structural information, followed by a fully connected layer that generates the power load prediction;
S3-2, defining the mean absolute error as the loss function, and selecting Adam as the optimizer to minimize the loss;
S3-3, inputting training data into the GraphSAGE model; in each layer, for each node, a fixed number of nodes are randomly sampled from its neighbors and used as the input of the aggregation operation; performing this in every layer yields the local neighborhood of each node; in each GraphSAGE layer, an aggregation function collects local neighborhood information to the target node and generates an updated node representation, which is then transformed by a nonlinear activation function; this is repeated in every GraphSAGE layer; the GraphSAGE update rule is:

h_v^k = σ( W_k · AGGREGATE_k( {h_v^(k-1)} ∪ {h_u^(k-1), ∀u ∈ N(v)} ) )

wherein: h_v^k denotes the feature of node v at layer k, h_v^(k-1) the feature of node v at layer k-1, W_k the learnable weight matrix of layer k, N(v) the set of neighbor nodes of node v, and h_u^(k-1) the feature of neighbor node u at layer k-1;
S3-4, the fully connected layer applies a linear transformation to the input features and outputs the predicted power load:

y = W·X + b

wherein: W and b represent the weights and biases of the fully connected layer, X is the input feature, and y is the output prediction;
the mean absolute error (MAE) is used as the loss function:

MAE = (1/n) · Σ_(i=1)^(n) |Y_true,i - Y_pred,i|

wherein: n denotes the total number of power load values, Y_true the actual power load values, and Y_pred the predicted power load values;
s3-5, optimizing model parameters by using an Adam optimizer.
S4, supernode construction: for the graph each client constructs for its local jurisdiction, backbone network extraction is performed first; sampling subgraphs are then generated through high-centrality neighbor sampling with a breadth-first search (BFS) algorithm; the contribution of each sampling subgraph is computed via the cumulative contribution rate in kernel principal component analysis, and the minimal set of k subgraphs is found; finally, these subgraphs are fused through multi-layer-perceptron-based feature fusion to form a supernode that represents the features and structure of the client's local subgraph;
the specific process of supernode formation in step S4 is as follows:
S4-1, first, the importance of each node is measured by its degree centrality, computed as:

C_d(v) = degree(v) / (|V| - 1)

wherein: C_d(v) denotes the degree centrality of node v, and degree(v) denotes the degree of node v, i.e., the number of edges connected to it;
S4-2, arranging all nodes in descending order of degree centrality, and computing the importance sum T of all nodes:

T = Σ_(v∈V) C_d(v)

wherein: V denotes the node set of the original graph;
setting a cumulative contribution rate threshold P, and computing the number of nodes Q to extract for the backbone network:

Q = min{ q : Σ_(i=1)^(q) C_d(v_i) ≥ P·T }

wherein: Q is the smallest q such that the sum of the importance metrics of nodes 1 through q (in the sorted order) is greater than or equal to P times the total importance metric T;
extracting a subgraph containing the first Q nodes from the original graph as the backbone network; arranging the nodes in descending order of degree centrality, selecting the node with the highest centrality as the starting node, and performing neighbor sampling with the BFS algorithm until the sum of the importance metrics of the nodes in the subgraph reaches P times the sum over the original graph; then selecting the next node and repeating the BFS process until Q sampling subgraphs are obtained;
S4-3, for each sampling subgraph, training a GraphSAGE encoder to obtain the encoded representations of its nodes; this step is the same as step S3;
S4-4, creating a message-passing framework on each sampling subgraph; at each time step, collecting the features of each node's neighbors and aggregating them with an aggregation function to obtain a new feature representation; all new feature representations are then integrated into a unified subgraph embedding space; the update rules are as follows:
message generation function:

M(h_v, h_u) = h_v · W · h_u

wherein: h_v and h_u are the feature representations of node v and node u, respectively, and W is a learnable weight matrix;
aggregation function:

a_v = σ( Σ_(u∈N(v)) M(h_v, h_u) )

wherein: Σ_(u∈N(v)) M(h_v, h_u) represents the sum of all messages from neighbor nodes u to node v, and σ denotes the ReLU function;
state update function:

U(h_v, a_v) = h_v + a_v

wherein: a_v denotes the aggregated message;
S4-5, standardizing the data so that each feature has zero mean and unit standard deviation, and computing the kernel matrix with a Gaussian radial basis function kernel; centering the kernel matrix so that the data points mapped into the high-dimensional feature space have zero mean; computing the variance explained ratio, i.e., the contribution of each principal component;
the kernel matrix is computed with the Gaussian radial basis function kernel:

K(i,j) = exp( -γ·||x_i - x_j||² )

wherein: x_i and x_j are the i-th and j-th rows of the data matrix, γ is the width parameter of the Gaussian radial basis function kernel, and ||·|| denotes the Euclidean norm;
assuming the kernel matrix K is an n×n matrix, the centered kernel matrix is computed as:

K_centered = K - 1_n·K - K·1_n + 1_n·K·1_n

wherein: K denotes the kernel matrix, and 1_n denotes the n×n matrix with every entry equal to 1/n (the outer product of an n×1 all-ones vector and a 1×n all-ones vector, divided by n);
the variance explained ratio of each principal component is computed as:

ratio_i = λ_i / Σ_(j=1)^(m) λ_j

wherein: λ_i is the eigenvalue of the i-th representative subgraph, and m is the total number of representative subgraphs;
sorting all sampling subgraphs by contribution in descending order, then accumulating their contributions one by one from the top until the cumulative contribution exceeds 95% of the total contribution of the original subgraphs; the resulting k minimal subgraphs are the representative subgraphs;
S4-6, initializing a multi-layer perceptron model, wherein the input of each fully connected layer is the output of the previous layer and a nonlinear transformation is applied through an activation function; a fully connected layer has the form:

H_(l+1) = ReLU( H_l·W_l + b_l )

wherein: H_l is the output of layer l, and W_l and b_l are the weight and bias of layer l;
taking the embedded representation of each representative subgraph as input to the multi-layer-perceptron-based model; for each node embedding matrix H_i, the fused representation Z_i is obtained as:

Z_i = MLP(H_i)

the fused representations of all representative subgraphs are stacked together:

Z = [Z_1, Z_2, …, Z_K]

this fusion operation finally yields a high-dimensional supernode representing the features and structure of the client's local subgraph.
S5, building the global graph structure: repeating step S4 for the n clients formed from the power load information of n regions to obtain n supernodes; adding an edge between every pair of the n supernodes, by permutation and combination, for information interaction, forming a graph structure; training on this graph to obtain a model for power load prediction, and assigning a weight to each edge in the graph; then deleting edges in order of increasing weight, comparing after each deletion with the performance of the local model each client obtained by training on its local jurisdiction's graph, and stopping edge deletion when the model reaches optimal performance on all clients' subgraphs, yielding the global federated graph model.
The global graph structure is constructed as follows:
S5-1, repeating step S4 for the n clients formed from the power load information of n regions to obtain n supernodes;
S5-2, adding an edge between every pair of nodes, by permutation and combination, for information interaction, forming a graph structure in which each edge represents information interaction between its two nodes; assigning a weight to each edge using the Pearson correlation coefficient, where a higher correlation coefficient means a larger weight: the more similar the power load trends of the two regions, the more valuable the information interaction; the Pearson correlation coefficient is:

r = Σ_i (x_i - x_mean)(y_i - y_mean) / √( Σ_i (x_i - x_mean)² · Σ_i (y_i - y_mean)² )

wherein: x_i and y_i denote the power load values of region i and region j, respectively; x_mean denotes the mean of the power load series of region i, and y_mean the mean of the power load series of region j;
S5-3, training a GNN model with the graph structure obtained in S5-2, measuring the model's performance with the mean absolute error, and recording it;
S5-4, deleting edges in order of increasing Pearson correlation coefficient; after each edge deletion, repeating step S5-3 to train a new GNN model and record its performance; comparing the performance of each retrained model with that of the local model each client obtained by training on its local jurisdiction's graph; stopping edge deletion when the model reaches optimal performance on all clients' subgraphs, yielding the global federated graph model, which is distributed to each local client for power load prediction.
The beneficial effects of the invention are as follows:
the invention discloses a power load prediction method based on a federal graph neural network, which realizes high-efficiency power load prediction in a distributed environment under the privacy protection and information interaction limitation among clients. Traditional power load prediction methods typically require data from different regions to be centralized at a central server for training and reasoning. However, in a real world scenario, data in a power system is often stored in a distributed manner on multiple clients, centralized processing communication overhead is large, and open data access between clients is required. The invention provides a power load prediction method based on a federal graph neural network.
In the method, each client maintains a local map model that represents the relationships between the various regions in the power system. Nodes represent transmission towers of each region, and edges represent transmission conductors between transmission towers. To better capture key nodes and structural information in the graph, a topology backbone is extracted at each client, which is a subgraph containing nodes of high centrality and importance in the graph. In this way, computational complexity and communication overhead can be reduced while preserving the primary topological features in the graph. A neighbor sampling strategy based on Breadth First Search (BFS) is adopted to generate a sub-graph, self-adaptive Top-K neighbor sampling is realized by utilizing the accumulated contribution rate in kernel principal component analysis, the sub-graph generation process is further optimized, and finally a super node capable of representing the characteristics and the structure of a local sub-graph of a client is obtained. Then, the supernodes of each region are connected with each other to form a supernode diagram, and training is carried out by using the GNN model. And then, deleting edges in the graph step by step, comparing the edges with the local model performance obtained by each client through graph training in a local jurisdiction, obtaining a federal global graph model when the model reaches the optimal performance on sub-graphs of all the clients, and sending the model to each local client for power load prediction, so that the model performance of each local client is improved.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a supernode formation flow chart;
FIG. 3 is a diagram of edge pruning and model optimization.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for more clearly illustrating the technical aspects of the present invention, and are not intended to limit the scope of the present invention.
As shown in FIG. 1, a power load prediction method based on a federated graph neural network includes the following steps:
S1, data acquisition: collecting the power load data of each region through the data acquisition system of the smart grid; collecting connection relation data among the transmission towers, i.e., which towers are connected by transmission lines, according to the topological structure of the power system; collecting other relevant exogenous information for each region, including population density characteristics, climate characteristics and industrial output values;
S2, data processing: normalizing the collected power load data of each region, and constructing a graph structure of the power system from the connection relation data among the transmission towers, wherein nodes represent the transmission towers of each region, edges represent the transmission lines between towers, and the other collected data serve as node attributes for training the graph neural network model; the processed data set is divided into a training set, a validation set and a test set, with 70% of the data used for training, 15% for validation and the remaining 15% for testing;
in step S2, the data is scaled to [0,1] using Min-Max normalization:

x_normalized = (x - x_min) / (x_max - x_min)

wherein: x_normalized represents the normalized value, x represents the original data, and x_max and x_min represent the maximum and minimum values in the data, respectively;
the graph structure in step S2 is constructed as follows:
creating an adjacency matrix to represent the graph G = (V, E), wherein the rows and columns of the matrix correspond to the transmission towers of each region, and the adjacency matrix element A[i][j] = 1 if there is an edge between node i and node j, otherwise A[i][j] = 0;
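A minimal sketch of step S2 in Python (NumPy) is given below; the tower count, line list and load values are hypothetical, and the symmetric (undirected) adjacency follows from the A[i][j] convention in the description:

```python
import numpy as np

def min_max_normalize(x: np.ndarray) -> np.ndarray:
    """Scale a power load series to [0, 1]: (x - x_min) / (x_max - x_min)."""
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min)

def build_adjacency(num_towers: int, lines: list) -> np.ndarray:
    """Adjacency matrix of G = (V, E): A[i][j] = 1 if a transmission
    line connects towers i and j, otherwise 0 (symmetric, undirected)."""
    A = np.zeros((num_towers, num_towers), dtype=int)
    for i, j in lines:
        A[i, j] = A[j, i] = 1
    return A

# Hypothetical example: four towers connected in a ring.
loads = np.array([120.0, 180.0, 95.0, 150.0])
print(min_max_normalize(loads))                       # values in [0, 1]
print(build_adjacency(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))
```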
s3, constructing a graph neural network: selecting GraphSAGE as a framework of a graph neural network, and extracting structure and node characteristic information in power load data of each region;
in step S3, the specific steps for using GraphSAGE as the architecture of the graph neural network are as follows:
S3-1, defining the model structure according to the GraphSAGE architecture as several GraphSAGE layers, which process node features and structural information, followed by a fully connected layer that generates the power load prediction;
S3-2, defining the mean absolute error as the loss function, and selecting Adam as the optimizer to minimize the loss;
S3-3, inputting training data into the GraphSAGE model; in each layer, for each node, a fixed number of nodes are randomly sampled from its neighbors and used as the input of the aggregation operation; performing this in every layer yields the local neighborhood of each node; in each GraphSAGE layer, an aggregation function collects local neighborhood information to the target node and generates an updated node representation, which is then transformed by a nonlinear activation function; this is repeated in every GraphSAGE layer; the GraphSAGE update rule is:

h_v^k = σ( W_k · AGGREGATE_k( {h_v^(k-1)} ∪ {h_u^(k-1), ∀u ∈ N(v)} ) )

wherein: h_v^k denotes the feature of node v at layer k, h_v^(k-1) the feature of node v at layer k-1, W_k the learnable weight matrix of layer k, N(v) the set of neighbor nodes of node v, and h_u^(k-1) the feature of neighbor node u at layer k-1;
S3-4, the fully connected layer applies a linear transformation to the input features and outputs the predicted power load:

y = W·X + b

wherein: W and b represent the weights and biases of the fully connected layer, X is the input feature, and y is the output prediction;
the mean absolute error (MAE) is used as the loss function:

MAE = (1/n) · Σ_(i=1)^(n) |Y_true,i - Y_pred,i|

wherein: n denotes the total number of power load values, Y_true the actual power load values, and Y_pred the predicted power load values;
s3-5, optimizing model parameters by using an Adam optimizer.
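Steps S3-1 to S3-5 can be sketched with PyTorch Geometric's SAGEConv as below; the two-layer depth, layer sizes, learning rate and single-value output are illustrative assumptions, and full-batch message passing stands in for the random neighbor sampling of step S3-3 (which a neighbor-sampling loader would provide in practice):

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv

class LoadSAGE(torch.nn.Module):
    """Several GraphSAGE layers followed by a fully connected output
    layer, mirroring steps S3-1 and S3-4."""
    def __init__(self, in_dim: int, hidden_dim: int):
        super().__init__()
        self.sage1 = SAGEConv(in_dim, hidden_dim)
        self.sage2 = SAGEConv(hidden_dim, hidden_dim)
        self.fc = torch.nn.Linear(hidden_dim, 1)   # y = W·X + b

    def forward(self, x, edge_index):
        h = F.relu(self.sage1(x, edge_index))      # aggregate + nonlinearity
        h = F.relu(self.sage2(h, edge_index))
        return self.fc(h).squeeze(-1)              # predicted load per node

model = LoadSAGE(in_dim=8, hidden_dim=32)          # 8 node attributes (assumed)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = torch.nn.L1Loss()                        # mean absolute error (MAE)

def train_step(x, edge_index, y_true):
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(x, edge_index), y_true)   # MAE between Y_pred, Y_true
    loss.backward()
    optimizer.step()
    return loss.item()
```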
S4, supernode construction: as shown in FIG. 2, for the graph each client constructs for its local jurisdiction, backbone network extraction is performed first; sampling subgraphs are then generated through high-centrality neighbor sampling with a breadth-first search (BFS) algorithm; the contribution of each sampling subgraph is computed via the cumulative contribution rate in kernel principal component analysis, and the minimal set of k subgraphs is found; finally, these subgraphs are fused through multi-layer-perceptron-based feature fusion to form a supernode that represents the features and structure of the client's local subgraph;
the specific process of supernode formation in step S4 is as follows:
S4-1, first, the importance of each node is measured by its degree centrality, computed as:

C_d(v) = degree(v) / (|V| - 1)

wherein: C_d(v) denotes the degree centrality of node v, and degree(v) denotes the degree of node v, i.e., the number of edges connected to it;
S4-2, arranging all nodes in descending order of degree centrality, and computing the importance sum T of all nodes:

T = Σ_(v∈V) C_d(v)

wherein: V denotes the node set of the original graph;
setting a cumulative contribution rate threshold P, and computing the number of nodes Q to extract for the backbone network:

Q = min{ q : Σ_(i=1)^(q) C_d(v_i) ≥ P·T }

wherein: Q is the smallest q such that the sum of the importance metrics of nodes 1 through q (in the sorted order) is greater than or equal to P times the total importance metric T;
extracting a subgraph containing the first Q nodes from the original graph as the backbone network; arranging the nodes in descending order of degree centrality, selecting the node with the highest centrality as the starting node, and performing neighbor sampling with the BFS algorithm until the sum of the importance metrics of the nodes in the subgraph reaches P times the sum over the original graph; then selecting the next node and repeating the BFS process until Q sampling subgraphs are obtained;
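A sketch of the backbone extraction and BFS-based neighbor sampling of steps S4-1/S4-2 using NetworkX follows; the threshold P = 0.9 is illustrative, and `nx.degree_centrality` implements C_d(v) = degree(v) / (|V| - 1):

```python
import networkx as nx

def extract_backbone(G: nx.Graph, P: float = 0.9):
    """Rank nodes by degree centrality and keep the smallest prefix
    whose importance sum reaches P * T (steps S4-1/S4-2)."""
    cent = nx.degree_centrality(G)                 # C_d(v) = degree(v)/(|V|-1)
    ranked = sorted(cent, key=cent.get, reverse=True)
    T = sum(cent.values())
    total, backbone = 0.0, []
    for v in ranked:
        backbone.append(v)
        total += cent[v]
        if total >= P * T:                         # cumulative threshold reached
            break
    return G.subgraph(backbone).copy(), cent

def bfs_sample(G: nx.Graph, start, cent: dict, budget: float) -> nx.Graph:
    """High-centrality neighbor sampling: BFS outward from `start` until
    the sampled importance sum reaches `budget` (= P times the original
    graph's importance sum in the description)."""
    sampled, total = [start], cent[start]
    for _, v in nx.bfs_edges(G, start):
        if total >= budget:
            break
        sampled.append(v)
        total += cent[v]
    return G.subgraph(sampled).copy()
```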
S4-3, for each sampling subgraph, training a GraphSAGE encoder to obtain the encoded representations of its nodes; this step is the same as step S3;
S4-4, creating a message-passing framework on each sampling subgraph; at each time step, collecting the features of each node's neighbors and aggregating them with an aggregation function to obtain a new feature representation; all new feature representations are then integrated into a unified subgraph embedding space; the update rules are as follows:
message generation function:

M(h_v, h_u) = h_v · W · h_u

wherein: h_v and h_u are the feature representations of node v and node u, respectively, and W is a learnable weight matrix;
aggregation function:

a_v = σ( Σ_(u∈N(v)) M(h_v, h_u) )

wherein: Σ_(u∈N(v)) M(h_v, h_u) represents the sum of all messages from neighbor nodes u to node v, and σ denotes the ReLU function;
state update function:

U(h_v, a_v) = h_v + a_v

wherein: a_v denotes the aggregated message;
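Because M(h_v, h_u) = h_v · W · h_u leaves the product shapes open, the sketch below adopts one possible reading that keeps each message a d-dimensional vector (elementwise product of h_v with W·h_u); the embedding size and random features are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                      # embedding size, hypothetical
W = rng.normal(scale=0.1, size=(d, d))     # learnable weight matrix

def message(h_v, h_u):
    # One reading of M(h_v, h_u) = h_v * W * h_u that yields a d-vector.
    return h_v * (W @ h_u)

def aggregate(h_v, neighbor_feats):
    # a_v = ReLU( sum of messages from all neighbors u of v ).
    a_v = sum(message(h_v, h_u) for h_u in neighbor_feats)
    return np.maximum(a_v, 0.0)            # ReLU

def update(h_v, a_v):
    return h_v + a_v                       # U(h_v, a_v) = h_v + a_v

h_v = rng.normal(size=d)
neighbors = [rng.normal(size=d) for _ in range(3)]
h_v_new = update(h_v, aggregate(h_v, neighbors))
```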
S4-5, standardizing the data so that each feature has zero mean and unit standard deviation, and computing the kernel matrix with a Gaussian radial basis function kernel; centering the kernel matrix so that the data points mapped into the high-dimensional feature space have zero mean; computing the variance explained ratio, i.e., the contribution of each principal component;
the kernel matrix is computed with the Gaussian radial basis function kernel:

K(i,j) = exp( -γ·||x_i - x_j||² )

wherein: x_i and x_j are the i-th and j-th rows of the data matrix, γ is the width parameter of the Gaussian radial basis function kernel, and ||·|| denotes the Euclidean norm;
assuming the kernel matrix K is an n×n matrix, the centered kernel matrix is computed as:

K_centered = K - 1_n·K - K·1_n + 1_n·K·1_n

wherein: K denotes the kernel matrix, and 1_n denotes the n×n matrix with every entry equal to 1/n (the outer product of an n×1 all-ones vector and a 1×n all-ones vector, divided by n);
the variance explained ratio of each principal component is computed as:

ratio_i = λ_i / Σ_(j=1)^(m) λ_j

wherein: λ_i is the eigenvalue of the i-th representative subgraph, and m is the total number of representative subgraphs;
sorting all sampling subgraphs by contribution in descending order, then accumulating their contributions one by one from the top until the cumulative contribution exceeds 95% of the total contribution of the original subgraphs; the resulting k minimal subgraphs are the representative subgraphs;
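Step S4-5 can be sketched with NumPy and scikit-learn's rbf_kernel; treating each sampling subgraph's embedding as one row of X, as well as γ = 0.1 and the 95% threshold defaults, are assumptions for illustration:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def contribution_ratios(X: np.ndarray, gamma: float = 0.1) -> np.ndarray:
    """Standardize, build the RBF kernel matrix, center it, and return
    the variance explained ratio of each principal component."""
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)  # zero mean, unit std
    K = rbf_kernel(X, gamma=gamma)           # K(i,j) = exp(-γ||x_i - x_j||²)
    n = K.shape[0]
    one_n = np.ones((n, n)) / n              # the 1_n centering matrix
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    lam = np.clip(np.linalg.eigvalsh(Kc)[::-1], 0.0, None)  # eigenvalues, desc.
    return lam / lam.sum()                   # ratio_i = λ_i / Σ_j λ_j

def select_representatives(ratios: np.ndarray, threshold: float = 0.95):
    """Keep the smallest k subgraphs whose cumulative contribution
    exceeds `threshold` of the total (the representative subgraphs)."""
    order = np.argsort(ratios)[::-1]         # sort by contribution, descending
    k = int(np.searchsorted(np.cumsum(ratios[order]), threshold) + 1)
    return order[:k]
```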
S4-6, initializing a multi-layer perceptron model, wherein the input of each fully connected layer is the output of the previous layer and a nonlinear transformation is applied through an activation function; a fully connected layer has the form:

H_(l+1) = ReLU( H_l·W_l + b_l )

wherein: H_l is the output of layer l, and W_l and b_l are the weight and bias of layer l;
taking the embedded representation of each representative subgraph as input to the multi-layer-perceptron-based model; for each node embedding matrix H_i, the fused representation Z_i is obtained as:

Z_i = MLP(H_i)

the fused representations of all representative subgraphs are stacked together:

Z = [Z_1, Z_2, …, Z_K]

this fusion operation finally yields a high-dimensional supernode representing the features and structure of the client's local subgraph.
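A sketch of the S4-6 fusion in PyTorch follows; the mean-pooling of each node-embedding matrix H_i into one vector before the MLP, the layer sizes, and averaging the stacked Z into the final supernode are assumptions made for illustration:

```python
import torch

class FusionMLP(torch.nn.Module):
    """Fuse each representative subgraph's node-embedding matrix H_i
    into a single vector Z_i, then stack the Z_i into Z (step S4-6)."""
    def __init__(self, embed_dim: int, hidden_dim: int, out_dim: int):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(embed_dim, hidden_dim),
            torch.nn.ReLU(),                 # H_(l+1) = ReLU(H_l W_l + b_l)
            torch.nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, H_i: torch.Tensor) -> torch.Tensor:
        # Mean-pool node embeddings to one vector per subgraph before the
        # MLP -- a pooling choice assumed here, not fixed by the description.
        return self.net(H_i.mean(dim=0))

mlp = FusionMLP(embed_dim=32, hidden_dim=64, out_dim=16)
H_list = [torch.randn(n, 32) for n in (12, 9, 15)]  # K=3 representative subgraphs
Z = torch.stack([mlp(H) for H in H_list])           # Z = [Z_1, ..., Z_K]
supernode = Z.mean(dim=0)  # one reading of the final fusion into a supernode
```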
S5, building the global graph structure: as shown in FIG. 3, repeating step S4 for the n clients formed from the power load information of n regions to obtain n supernodes; adding an edge between every pair of the n supernodes, by permutation and combination, for information interaction, forming a graph structure; training on this graph to obtain a model for power load prediction, and assigning a weight to each edge in the graph; then deleting edges in order of increasing weight, comparing after each deletion with the performance of the local model each client obtained by training on its local jurisdiction's graph, and stopping edge deletion when the model reaches optimal performance on all clients' subgraphs, yielding the global federated graph model.
The global graph structure is constructed as follows:
S5-1, repeating step S4 for the n clients formed from the power load information of n regions to obtain n supernodes;
S5-2, adding an edge between every pair of nodes, by permutation and combination, for information interaction, forming a graph structure in which each edge represents information interaction between its two nodes; assigning a weight to each edge using the Pearson correlation coefficient, where a higher correlation coefficient means a larger weight: the more similar the power load trends of the two regions, the more valuable the information interaction; the Pearson correlation coefficient is:

r = Σ_i (x_i - x_mean)(y_i - y_mean) / √( Σ_i (x_i - x_mean)² · Σ_i (y_i - y_mean)² )

wherein: x_i and y_i denote the power load values of region i and region j, respectively; x_mean denotes the mean of the power load series of region i, and y_mean the mean of the power load series of region j;
S5-3, training a GNN model with the graph structure obtained in S5-2, measuring the model's performance with the mean absolute error, and recording it;
S5-4, deleting edges in order of increasing Pearson correlation coefficient; after each edge deletion, repeating step S5-3 to train a new GNN model and record its performance; comparing the performance of each retrained model with that of the local model each client obtained by training on its local jurisdiction's graph; stopping edge deletion when the model reaches optimal performance on all clients' subgraphs, yielding the global federated graph model, which is distributed to each local client for power load prediction.
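The S5-2/S5-4 edge weighting and pruning loop can be sketched as below; `eval_model` is a hypothetical stand-in for retraining the GNN after each deletion and comparing with the clients' local models (assumed to return a score where higher is better):

```python
import numpy as np
from itertools import combinations

def pearson(x: np.ndarray, y: np.ndarray) -> float:
    """Pearson correlation between two regions' power load series."""
    xd, yd = x - x.mean(), y - y.mean()
    return float((xd * yd).sum() / np.sqrt((xd ** 2).sum() * (yd ** 2).sum()))

def prune_edges(region_loads: list, eval_model) -> list:
    """Weight every supernode pair by Pearson correlation, then delete
    edges from the smallest weight upward while performance improves."""
    weights = {(i, j): pearson(region_loads[i], region_loads[j])
               for i, j in combinations(range(len(region_loads)), 2)}
    edges = sorted(weights, key=weights.get)   # ascending edge weight
    best = eval_model(edges)                   # score of the fully connected graph
    while len(edges) > 1:
        trial = edges[1:]                      # drop the weakest edge
        score = eval_model(trial)
        if score <= best:                      # stopped improving: keep edges
            break
        best, edges = score, trial
    return edges
```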
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.

Claims (6)

1. A power load prediction method based on a federated graph neural network, characterized by comprising the following steps:
S1, data acquisition: collecting the power load data of each region through the data acquisition system of the smart grid; collecting connection relation data among the transmission towers, i.e., which towers are connected by transmission lines, according to the topological structure of the power system; collecting other relevant exogenous information for each region, including population density characteristics, climate characteristics and industrial output values;
S2, data processing: normalizing the collected power load data of each region, and constructing a graph structure of the power system from the connection relation data among the transmission towers, wherein nodes represent the transmission towers of each region, edges represent the transmission lines between towers, and the other collected data serve as node attributes for training the graph neural network model; the processed data set is divided into a training set, a validation set and a test set, with 70% of the data used for training, 15% for validation and the remaining 15% for testing;
S3, constructing the graph neural network: selecting GraphSAGE as the architecture of the graph neural network, and extracting structural and node feature information from the power load data of each region;
S4, supernode construction: for the graph each client constructs for its local jurisdiction, backbone network extraction is performed first; sampling subgraphs are then generated through high-centrality neighbor sampling with a breadth-first search (BFS) algorithm; the contribution of each sampling subgraph is computed via the cumulative contribution rate in kernel principal component analysis, and the minimal set of k subgraphs is found; finally, these subgraphs are fused through multi-layer-perceptron-based feature fusion to form a supernode that represents the features and structure of the client's local subgraph;
S5, building the global graph structure: repeating step S4 for the n clients formed from the power load information of n regions to obtain n supernodes; adding an edge between every pair of the n supernodes, by permutation and combination, for information interaction, forming a graph structure; training on this graph to obtain a model for power load prediction, and assigning a weight to each edge in the graph; then deleting edges in order of increasing weight, comparing after each deletion with the performance of the local model each client obtained by training on its local jurisdiction's graph, and stopping edge deletion when the model reaches optimal performance on all clients' subgraphs, yielding the global federated graph model.
2. The power load prediction method based on the federated graph neural network according to claim 1, characterized in that the data processing in step S2 scales the data to [0,1] using Min-Max normalization:

x_normalized = (x - x_min) / (x_max - x_min)

wherein: x_normalized represents the normalized value, x represents the original data, and x_max and x_min represent the maximum and minimum values in the data, respectively.
3. The power load prediction method based on the federated graph neural network according to claim 1, characterized in that the graph structure in step S2 is constructed as follows:
creating an adjacency matrix to represent the graph G = (V, E), wherein the rows and columns of the matrix correspond to the transmission towers of each region, and the adjacency matrix element A[i][j] = 1 if there is an edge between node i and node j, otherwise A[i][j] = 0.
4. The power load prediction method based on the federated graph neural network according to claim 1, characterized in that the specific steps of using GraphSAGE as the architecture of the graph neural network in step S3 are as follows:
S3-1, defining the model structure according to the GraphSAGE architecture as several GraphSAGE layers, which process node features and structural information, followed by a fully connected layer that generates the power load prediction;
S3-2, defining the mean absolute error as the loss function, and selecting Adam as the optimizer to minimize the loss;
S3-3, inputting training data into the GraphSAGE model; in each layer, for each node, a fixed number of nodes are randomly sampled from its neighbors and used as the input of the aggregation operation; performing this in every layer yields the local neighborhood of each node; in each GraphSAGE layer, an aggregation function collects local neighborhood information to the target node and generates an updated node representation, which is then transformed by a nonlinear activation function; this is repeated in every GraphSAGE layer; the GraphSAGE update rule is:

h_v^k = σ( W_k · AGGREGATE_k( {h_v^(k-1)} ∪ {h_u^(k-1), ∀u ∈ N(v)} ) )

wherein: h_v^k denotes the feature of node v at layer k, h_v^(k-1) the feature of node v at layer k-1, W_k the learnable weight matrix of layer k, N(v) the set of neighbor nodes of node v, and h_u^(k-1) the feature of neighbor node u at layer k-1;
S3-4, the fully connected layer applies a linear transformation to the input features and outputs the predicted power load:

y = W·X + b

wherein: W and b represent the weights and biases of the fully connected layer, X is the input feature, and y is the output prediction;
the mean absolute error (MAE) is used as the loss function:

MAE = (1/n) · Σ_(i=1)^(n) |Y_true,i - Y_pred,i|

wherein: n denotes the total number of power load values, Y_true the actual power load values, and Y_pred the predicted power load values;
S3-5, optimizing the model parameters with the Adam optimizer.
5. The power load prediction method based on the federated graph neural network according to claim 1, characterized in that the specific process of supernode formation in step S4 is as follows:
S4-1, first, the importance of each node is measured by its degree centrality, computed as:

C_d(v) = degree(v) / (|V| - 1)

wherein: C_d(v) denotes the degree centrality of node v, and degree(v) denotes the degree of node v, i.e., the number of edges connected to it;
S4-2, arranging all nodes in descending order of degree centrality, and computing the importance sum T of all nodes:

T = Σ_(v∈V) C_d(v)

wherein: V denotes the node set of the original graph;
setting a cumulative contribution rate threshold P, and computing the number of nodes Q to extract for the backbone network:

Q = min{ q : Σ_(i=1)^(q) C_d(v_i) ≥ P·T }

wherein: Q is the smallest q such that the sum of the importance metrics of nodes 1 through q (in the sorted order) is greater than or equal to P times the total importance metric T;
extracting a subgraph containing the first Q nodes from the original graph as the backbone network; arranging the nodes in descending order of degree centrality, selecting the node with the highest centrality as the starting node, and performing neighbor sampling with the BFS algorithm until the sum of the importance metrics of the nodes in the subgraph reaches P times the sum over the original graph; then selecting the next node and repeating the BFS process until Q sampling subgraphs are obtained;
S4-3, for each sampling subgraph, training a GraphSAGE encoder to obtain the encoded representations of its nodes; this step is the same as step S3;
S4-4, creating a message-passing framework on each sampling subgraph; at each time step, collecting the features of each node's neighbors and aggregating them with an aggregation function to obtain a new feature representation; all new feature representations are then integrated into a unified subgraph embedding space; the update rules are as follows:
message generation function:

M(h_v, h_u) = h_v · W · h_u

wherein: h_v and h_u are the feature representations of node v and node u, respectively, and W is a learnable weight matrix;
aggregation function:

a_v = σ( Σ_(u∈N(v)) M(h_v, h_u) )

wherein: Σ_(u∈N(v)) M(h_v, h_u) represents the sum of all messages from neighbor nodes u to node v, and σ denotes the ReLU function;
state update function:

U(h_v, a_v) = h_v + a_v

wherein: a_v denotes the aggregated message;
S4-5, standardizing the data so that each feature has zero mean and unit standard deviation, and computing the kernel matrix with a Gaussian radial basis function kernel; centering the kernel matrix so that the data points mapped into the high-dimensional feature space have zero mean; computing the variance explained ratio, i.e., the contribution of each principal component;
the kernel matrix is computed with the Gaussian radial basis function kernel:

K(i,j) = exp( -γ·||x_i - x_j||² )

wherein: x_i and x_j are the i-th and j-th rows of the data matrix, γ is the width parameter of the Gaussian radial basis function kernel, and ||·|| denotes the Euclidean norm;
assuming the kernel matrix K is an n×n matrix, the centered kernel matrix is computed as:

K_centered = K - 1_n·K - K·1_n + 1_n·K·1_n

wherein: K denotes the kernel matrix, and 1_n denotes the n×n matrix with every entry equal to 1/n (the outer product of an n×1 all-ones vector and a 1×n all-ones vector, divided by n);
the variance explained ratio of each principal component is computed as:

ratio_i = λ_i / Σ_(j=1)^(m) λ_j

wherein: λ_i is the eigenvalue of the i-th representative subgraph, and m is the total number of representative subgraphs;
sorting all sampling subgraphs by contribution in descending order, then accumulating their contributions one by one from the top until the cumulative contribution exceeds 95% of the total contribution of the original subgraphs; the resulting k minimal subgraphs are the representative subgraphs;
S4-6, initializing a multi-layer perceptron model, wherein the input of each fully connected layer is the output of the previous layer and a nonlinear transformation is applied through an activation function; a fully connected layer has the form:

H_(l+1) = ReLU( H_l·W_l + b_l )

wherein: H_l is the output of layer l, and W_l and b_l are the weight and bias of layer l;
taking the embedded representation of each representative subgraph as input to the multi-layer-perceptron-based model; for each node embedding matrix H_i, the fused representation Z_i is obtained as:

Z_i = MLP(H_i)

the fused representations of all representative subgraphs are stacked together:

Z = [Z_1, Z_2, …, Z_K]

this fusion operation finally yields a high-dimensional supernode representing the features and structure of the client's local subgraph.
6. The power load prediction method based on the federated graph neural network according to claim 1, characterized in that the global graph structure in step S5 is constructed as follows:
S5-1, repeating step S4 for the n clients formed from the power load information of n regions to obtain n supernodes;
S5-2, adding an edge between every pair of nodes, by permutation and combination, for information interaction, forming a graph structure in which each edge represents information interaction between its two nodes; assigning a weight to each edge using the Pearson correlation coefficient, where a higher correlation coefficient means a larger weight: the more similar the power load trends of the two regions, the more valuable the information interaction; the Pearson correlation coefficient is:

r = Σ_i (x_i - x_mean)(y_i - y_mean) / √( Σ_i (x_i - x_mean)² · Σ_i (y_i - y_mean)² )

wherein: x_i and y_i denote the power load values of region i and region j, respectively; x_mean denotes the mean of the power load series of region i, and y_mean the mean of the power load series of region j;
S5-3, training a GNN model with the graph structure obtained in S5-2, measuring the model's performance with the mean absolute error, and recording it;
S5-4, deleting edges in order of increasing Pearson correlation coefficient; after each edge deletion, repeating step S5-3 to train a new GNN model and record its performance; comparing the performance of each retrained model with that of the local model each client obtained by training on its local jurisdiction's graph; stopping edge deletion when the model reaches optimal performance on all clients' subgraphs, yielding the global federated graph model, which is distributed to each local client for power load prediction.
CN202310868553.8A 2023-07-14 2023-07-14 Power load prediction method based on federated graph neural network Pending CN116826743A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310868553.8A CN116826743A (en) Power load prediction method based on federated graph neural network

Publications (1)

Publication Number Publication Date
CN116826743A true CN116826743A (en) 2023-09-29

Family

ID=88112694

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310868553.8A Pending CN116826743A (en) 2023-07-14 2023-07-14 Power load prediction method based on federal graph neural network

Country Status (1)

Country Link
CN (1) CN116826743A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117453414A (en) * 2023-11-10 2024-01-26 国网山东省电力公司营销服务中心(计量中心) Contribution degree prediction method and system for participation of power data in data sharing calculation



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination