CN115842768A

CN115842768A - SDN route optimization method based on time-space feature fusion of graph neural network

Info

Publication number: CN115842768A
Application number: CN202211473921.0A
Authority: CN
Inventors: 陈俊彦; 谢小兰; 俸皓; 王勇; 廖岑卉珊
Original assignee: Guilin University of Electronic Technology
Current assignee: Guilin University of Electronic Technology
Priority date: 2022-11-22
Filing date: 2022-11-22
Publication date: 2023-03-24

Abstract

The invention discloses an SDN route optimization method based on spatio-temporal feature fusion of a graph neural network. A reinforcement learning agent learns the interdependence between the traffic load of network switches and network performance, and determines a set of optimal route forwarding schemes that balance the end-to-end path bandwidth capacity and the load balance of the network. The optimal path for data packets is found with the help of graph neural network prediction: a spatio-temporal feature fusion network model based on the graph neural network (the GCT-Route network model) is developed to assist deep reinforcement learning in quickly completing its self-learning process. The invention thereby addresses the degradation of network routing performance caused by prolonged trial-and-error exploration during the learning process of the reinforcement learning agent.

Description

SDN route optimization method based on time-space feature fusion of graph neural network

Technical Field

The invention relates to the technical field of Software Defined Networking (SDN), and in particular to an SDN route optimization method based on spatio-temporal feature fusion of a graph neural network.

Background

The advent of software defined networking has made centralized management and operation possible, and network resources such as switches can be flexibly configured through programmable interfaces. Traditional SDN route optimization methods approximate the current network state through modeling and use heuristics to compute routing configurations for multimedia flow requests in real time. These methods apply only to narrowly defined scenarios, incur a high computational cost, and struggle to cope with future real-time, highly dynamic network environments. In recent years, advances in deep reinforcement learning (DRL) have opened new avenues for solving highly complex routing problems. A DRL-based routing scheme can learn and adapt to complex networks, improving routing decisions in an experience-driven, model-free manner. However, because reinforcement learning must explore in order to determine the best policy, network performance may degrade during the learning process. In particular, if the network topology changes, the DRL agent must relearn the route optimization. The more complex the characteristics of the network traffic, the longer convergence takes, resulting in prolonged poor network performance.

Disclosure of Invention

The invention aims to solve the problem that existing SDN route optimization methods degrade network performance during the learning process, and provides an SDN route optimization method based on spatio-temporal feature fusion of a graph neural network.

In order to solve the above problem, the invention is realized by the following technical scheme:

An SDN route optimization method based on spatio-temporal feature fusion of a graph neural network comprises the following steps:

step 1, an SDN control plane collects historical network information, namely network topology, a routing scheme and a flow matrix to generate a training data set, and the training data set is stored in a storage area of an SDN knowledge plane;

step 2, constructing a spatio-temporal feature fusion network model based on a graph neural network on an SDN knowledge plane, and performing off-line training on the spatio-temporal feature fusion network model based on the graph neural network by using a training data set;

step 3, the SDN control plane inputs the traffic matrix at the current time and the reward value $r_{t'}$ at the current time into the reinforcement learning agent of the SDN knowledge plane, which generates an action value; the action value is then input into the spatio-temporal feature fusion network model based on the graph neural network;

step 4, the spatio-temporal feature fusion network model based on the graph neural network on the SDN knowledge plane takes the received action value as the routing scheme P, and takes the routing scheme P, the network topology M, the traffic matrix at the current time, and the traffic matrix sequence consisting of the traffic matrices at the current time and the q preceding times together as its input; the model outputs the link-level performance matrix, the path-level performance matrix, and the predicted traffic matrix for the next time; wherein q is a set value;

step 5, the reward function of the reinforcement learning agent of the SDN knowledge plane generates the reward value $r_{t'+1}$ for the next time according to the link-level performance matrix and the path-level performance matrix;

step 6, judging whether the standard deviation of the reward value sequence $\{r_{t'+1-s}, \ldots, r_{t'}, r_{t'+1}\}$, consisting of the reward values at the next time and the s times before it, is less than a set convergence threshold, wherein s is a set value;

if so, the spatio-temporal feature fusion network model based on the graph neural network on the SDN knowledge plane issues the routing scheme P to the SDN control plane; the SDN control plane generates SDN flow table entries by combining the routing scheme P with the network topology M and issues them to each network node of the data plane; the network nodes forward network traffic according to the SDN flow table entries, realizing route optimization of the data plane;

otherwise, let $t' = t' + 1$, assign the predicted traffic matrix at the next time to the traffic matrix at the current time, assign the reward value $r_{t'+1}$ at the next time to the reward value $r_{t'}$ at the current time, and return to step 3.

In step 1, the training data set D is composed of the network topology M, the routing scheme P and the traffic matrix at each time t, wherein $M = [m_{ij}]_{n \times n}$ is the network topology and $m_{ij}$ represents the connection relationship between network node $v_i$ and network node $v_j$: $m_{ij} = 1$ when $v_i$ and $v_j$ are connected, and $m_{ij} = 0$ when they are not connected; $P = [p_{ij}]_{n \times n}$ is the routing scheme and $p_{ij}$ represents the path from network node $v_i$ to network node $v_j$; the element in row i and column j of the traffic matrix at time t represents the traffic from network node $v_i$ to network node $v_j$ at time t; $i, j = 1, 2, \ldots, n$, and n is the number of network nodes.

In the step 2, the time-space feature fusion network model based on the graph neural network comprises a network performance prediction module and a flow matrix prediction module;

(1) in the network performance prediction module:

firstly, defining the minimum value of the link capacity of all links of each path in a routing scheme as a link characteristic, and assigning the link characteristic to an initial link state vector; meanwhile, defining the flow among all network nodes of the flow matrix at the time t as a path characteristic, and assigning the path characteristic to an initial path state vector;

then, the same message passing operation is repeated T times on the current link state vector and the current path state vector; during these T iterations the links and paths exchange hidden states with each other, yielding the final link hidden state and the final path hidden state; specifically:

message aggregation process: the current link state vector and the current path state vector are concatenated to obtain a link and path feature matrix, which is input together with the network topology into a two-layer GCN to obtain the output state of the two-layer GCN; the output state of the two-layer GCN is then fed into a three-layer GCN to obtain the output state of the three-layer GCN; the output states of the two-layer GCN and the three-layer GCN are then fed into a residual module to obtain the output state of the residual module, which is passed through a Softmax function to obtain the hidden state of the path;

path state update process: the current path state vector and the hidden state of the path are concatenated to obtain a path feature matrix, which is input into a GRU to obtain the updated path state vector;

link state update process: each link sums the hidden states of all paths that contain it to obtain the hidden state of the link; the current link state vector and the hidden state of the link are concatenated to obtain a link feature matrix, which is input into a GRU to obtain the updated link state vector;

iterative process: the updated path state vector and the updated link state vector serve as the input of the next message passing operation, and the final link hidden state and the final path hidden state are obtained after T message passing operations; wherein T is a set value;

finally, the final link hidden state is passed through a fully connected layer to obtain the link-level performance matrix, and the final path hidden state is passed through a fully connected layer to obtain the path-level performance matrix;

(2) in the traffic matrix prediction module:

firstly, the traffic matrix sequence consisting of the traffic matrices at time t and the q preceding times is input, together with the network topology, into a one-layer GCN to obtain hidden states carrying spatial correlation information;

then, the hidden states carrying spatial correlation information are input into an LSTM, which computes hidden states covering the spatio-temporal features, giving a set of hidden states with spatio-temporal features;

next, the weight of each hidden state in this set is computed with an attention mechanism, and a context vector describing the traffic variation of the global network topology is obtained;

finally, the context vector is passed through a fully connected layer with a ReLU function to obtain the predicted traffic matrix at the next time.

In the above step 4, the network topology M, the routing scheme P and the traffic matrix at the current time are input into the network performance prediction module of the spatio-temporal feature fusion network model based on the graph neural network, which outputs the link-level performance matrix and the path-level performance matrix; the network topology M and the traffic matrix sequence consisting of the traffic matrices at the current time and the previous q times are input into the traffic matrix prediction module of the same model, which outputs the predicted traffic matrix at the next time.

In the above step 5, the reward value $r_{t'+1}$ of the reward function at the next time is computed from the link-level performance matrix and the path-level performance matrix: the sum over the rows of the link-level performance matrix is weighted by α, and the variance of the rows of the path-level performance matrix about their average value is weighted by β and subtracted; α and β are set adjustable parameters, and n is the number of network nodes.

Compared with the prior art, the invention uses a reinforcement learning agent to learn the interdependence between the traffic load of network switches and network performance, and determines a set of optimal route forwarding schemes that balance the end-to-end path bandwidth capacity and the load balance of the network. The optimal path for data packets is found with the help of graph neural network prediction: a spatio-temporal feature fusion network model based on the graph neural network (the GCT-Route network model) is developed to assist deep reinforcement learning in quickly completing its self-learning process. The invention thereby addresses the degradation of network routing performance caused by prolonged trial-and-error exploration during the learning process of the reinforcement learning agent.

Drawings

Fig. 1 is a flowchart of the SDN route optimization method based on spatio-temporal feature fusion of a graph neural network;

Fig. 2 is a schematic diagram of the GCT-Route network model;

Fig. 3 is a system architecture diagram for implementing the SDN route optimization method based on spatio-temporal feature fusion of a graph neural network.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to specific examples.

An SDN route optimization method based on spatio-temporal feature fusion of a graph neural network, as shown in fig. 1, includes the following steps:

s1, modeling a computer network scene, collecting historical network information, and generating a training data set.

A Graph Neural Network (GNN) is a type of neural network suited to applications whose information is structured as a graph. GNNs generalize well to different graph structures and help model the relationships between network nodes and edges.

For a communication network G, the invention uses $G = (V, L)$ to describe the network structure; $V = [v_1, v_2, \ldots, v_n]$ denotes the set of network nodes, where n is the number of network nodes; $L = [l_1, l_2, \ldots, l_m]$ denotes the set of links, where m is the number of links. Following the definition of a graph network, the topological connection relationship of the network is defined as the network topology M:

$M = [m_{ij}]_{n \times n}$ (1)

in the formula, $m_{ij}$ is the element in row i and column j of the network topology M and represents the connection relationship between network node $v_i$ and network node $v_j$.
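The topology matrix described above can be illustrated with a short sketch. It assumes links are given as pairs of node indices and are bidirectional; the function name `build_topology` and the four-node example are hypothetical and not taken from the patent.

```python
def build_topology(n, links):
    """Return the n x n topology matrix M, with M[i][j] = 1 when nodes i and j are connected."""
    M = [[0] * n for _ in range(n)]
    for i, j in links:
        M[i][j] = 1
        M[j][i] = 1  # assumption: every link is bidirectional
    return M


# Hypothetical 4-node ring: 0-1, 1-2, 2-3, 3-0.
links = [(0, 1), (1, 2), (2, 3), (0, 3)]
M = build_topology(4, links)
```

Nodes 0 and 2 are not adjacent in this example, so `M[0][2]` stays 0.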

Define the routing scheme P in the network:

$P = [p_{ij}]_{n \times n}$ (2)

in the formula, $p_{ij}$ is the element in row i and column j of the routing scheme and represents the path from network node $v_i$ to network node $v_j$. For convenience of presentation, each path $p_{ij}$ is denoted $p_k$, and $p_k$ is defined as a sequence of links:

$p_k = [l_{k(1)}, l_{k(2)}, \ldots, l_{k(|p_k|)}]$

where $k(i)$ is the index of the i-th link in path $p_k$.

Define the traffic matrix at time t, whose element in row i and column j represents the traffic from network node $v_i$ to network node $v_j$ at time t.

The network monitoring module on the control plane collects network information data, including the network topology M, the routing scheme P and the traffic matrices, generates the training data set D, and stores it in the storage area of the knowledge plane.

S2, constructing a spatio-temporal feature fusion network model based on the graph neural network (the GCT-Route network model) on the knowledge plane, and performing offline training on the GCT-Route network model by using the training data set.

The GCT-Route network model is divided into a network performance prediction module and a traffic matrix prediction module, as shown in Fig. 2. The network performance prediction module uses a Message Passing Neural Network (MPNN) to represent the dependency between the links and paths of a given routing scheme, following the principle that the state of a path depends on the states of all links on that path, and the state of a link depends on the states of all paths that traverse it. The module is composed of a Graph Convolutional Network (GCN), a residual connection (skip connection), a Gated Recurrent Unit (GRU) and a fully connected layer (FC). To fully capture the spatial and temporal correlation of the network, the traffic matrix prediction module first uses the GCN to capture the spatial features of the network information, and then uses a Long Short-Term Memory network (LSTM) to capture the temporal features, obtaining network state information with fused spatio-temporal features. A self-attention mechanism is then introduced to adjust the importance of different time slices and to collect global spatio-temporal information, improving prediction accuracy. Finally, a fully connected layer (FC) with the ReLU function produces the final prediction result. The constructed GCT-Route network model takes as input the network topology M, the routing scheme P and the traffic matrix; it models the complex relationships among topology, routing and input traffic using the network nodes and links of the topology, the source-destination paths of the routing scheme, and the traffic passing through them; and it outputs link-level and path-level network performance estimates together with the predicted traffic matrix for the next time. During training, the network topology M, the routing scheme P and the traffic matrix at time t are input into the network performance prediction module, and the network topology M and the traffic matrix sequence consisting of the traffic matrices at time t and the previous q times are input into the traffic matrix prediction module, where q is the length of the traffic matrix sequence.

(1) For the network performance prediction module:

1) The minimum link capacity over all links of each path $p_k$ in the routing scheme P is defined as the link feature $x_l$; the traffic between network nodes given by the traffic matrix at time t, i.e. the bandwidth carried by each source-target path, is defined as the path feature $x_p$. The link feature $x_l$ and the path feature $x_p$ are then assigned to the initial link state vector and the initial path state vector respectively, initializing the state vectors of the links and paths.
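The initialization in 1) can be sketched as follows. This is only an illustration under stated assumptions: paths are stored as lists of link indices keyed by (source, destination), and each state vector holds the feature in its first slot with zero padding up to a fixed width. The padding scheme and the names `STATE_DIM`, `link_capacity`, `paths` and `traffic` are hypothetical, since the patent does not specify them.

```python
STATE_DIM = 4  # hypothetical width of a state vector


def init_state(feature, dim=STATE_DIM):
    """Put a scalar feature in the first slot and zero-pad the rest (assumed scheme)."""
    return [float(feature)] + [0.0] * (dim - 1)


link_capacity = [10.0, 5.0, 8.0]                      # hypothetical capacities of links 0..2
paths = {(0, 2): [0, 1], (1, 2): [1], (0, 3): [2]}    # (src, dst) -> link indices on the path
traffic = {(0, 2): 3.0, (1, 2): 1.5, (0, 3): 2.0}     # traffic-matrix entries at time t

# Link feature as described: minimum link capacity over all links of each path.
x_l = {od: min(link_capacity[l] for l in links) for od, links in paths.items()}
# Path feature: traffic carried between the source and destination of the path.
x_p = {od: traffic[od] for od in paths}

h_l0 = {od: init_state(v) for od, v in x_l.items()}   # initial link state vectors
h_p0 = {od: init_state(v) for od, v in x_p.items()}   # initial path state vectors
```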

2) To resolve the dependency between links and paths, the network performance prediction module repeats the same message passing operation T times on the state vectors of the links and paths. During these T iterations the links and paths exchange hidden states with each other, finally yielding the link hidden state $h_l$ and the path hidden state $h_p$.

2.1) For the paths in the routing scheme P, each path collects messages from all the links it contains, and message aggregation is performed with the GCN and the residual module to capture the spatial relationship of the path states, giving the hidden state of the path.

First, the current path state vector and the current link state vector are concatenated to obtain the link and path feature matrix $X_t$.

After concatenation, the network topology M, as the adjacency matrix, and the link and path feature matrix $X_t$ are input into a two-layer GCN, whose graph convolution operations give the output state $h^{gcn}$ of the two-layer GCN:

$h^{gcn} = \mathrm{GCN}^{(2)}(M, X_t)$ (8)

The output state $h^{gcn}$ of the two-layer GCN is then passed through a three-layer GCN to obtain the output state $\mathrm{GCN}^{(3)}(M, h^{gcn})$ of the three-layer GCN, and this output state together with the output state $h^{gcn}$ of the two-layer GCN is fed into the residual module to obtain the output state $h^{r}$ of the residual module:

$h^{r} = \mathrm{GCN}^{(3)}(M, h^{gcn}) + h^{gcn}$ (9)

Subsequently, the output state $h^{r}$ of the residual module is passed through a Softmax function to obtain the hidden state of the path.
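Equations (8) and (9) can be illustrated with a small pure-Python sketch. Several details are assumptions rather than statements of the patent: the symmetric normalization of the adjacency matrix with self-loops, the ReLU activation inside each layer, the row-wise Softmax, and the treatment of the feature matrix as one row per network node. The helper names and the identity weight matrices are hypothetical stand-ins for trained parameters.

```python
import math


def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def normalize_adjacency(M):
    """D^-1/2 (M + I) D^-1/2, a common GCN normalization (assumed here)."""
    n = len(M)
    A = [[M[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    d = [sum(row) ** -0.5 for row in A]
    return [[d[i] * A[i][j] * d[j] for j in range(n)] for i in range(n)]


def gcn_layer(A_hat, H, W):
    """One graph convolution: ReLU(A_hat @ H @ W)."""
    Z = matmul(matmul(A_hat, H), W)
    return [[max(0.0, z) for z in row] for row in Z]


def gcn_stack(A_hat, H, weights):
    """Apply one GCN layer per weight matrix, in order."""
    for W in weights:
        H = gcn_layer(A_hat, H, W)
    return H


def softmax_rows(H):
    """Numerically stable Softmax applied to each row."""
    out = []
    for row in H:
        m = max(row)
        e = [math.exp(v - m) for v in row]
        s = sum(e)
        out.append([v / s for v in e])
    return out


def aggregate(M, X, w2, w3):
    """Two-layer GCN, three-layer GCN, residual sum, then Softmax."""
    A_hat = normalize_adjacency(M)
    h_gcn = gcn_stack(A_hat, X, w2)                                          # eq. (8)
    h_3 = gcn_stack(A_hat, h_gcn, w3)
    h_r = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(h_3, h_gcn)]    # eq. (9)
    return softmax_rows(h_r)


M = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]          # hypothetical 3-node chain
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]       # hypothetical feature matrix
I2 = [[1.0, 0.0], [0.0, 1.0]]                  # identity weights as placeholders
hidden = aggregate(M, X, [I2, I2], [I2, I2, I2])
```

The residual sum requires the two-layer and three-layer outputs to have the same shape, which is why every placeholder weight matrix here is square.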

2.2) After the above message aggregation process is completed, the path state is updated to obtain the updated path state vector.

First, the current path state vector and the hidden state of the path are concatenated to obtain the path feature matrix.

After concatenation, the path feature matrix is used as the input of a one-layer GRU, which updates the path state and gives the updated path state vector.

2.3) After the path state has been updated, the link state is updated to obtain the updated link state vector.

First, based on the hidden states of the paths, each link sums the hidden states of all paths that contain it, giving the hidden state of the link; the summation for the i-th link runs over the hidden states of every path $p_k$ that contains that link.

Then the current link state vector and the hidden state of the link are concatenated to obtain the link feature matrix.

After concatenation, the link feature matrix is used as the input of a one-layer GRU, which updates the link state and gives the updated link state vector.
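The summation in 2.3) can be sketched in a few lines. The sketch assumes each path is stored as a list of link indices and each hidden state as a list of floats; the function name `link_hidden_states` and the example values are hypothetical.

```python
def link_hidden_states(paths, path_hidden, num_links):
    """For every link, sum the hidden states of all paths that contain it."""
    dim = len(next(iter(path_hidden.values())))
    link_hidden = [[0.0] * dim for _ in range(num_links)]
    for k, links in paths.items():
        for l in links:
            for d in range(dim):
                link_hidden[l][d] += path_hidden[k][d]
    return link_hidden


paths = {"p0": [0, 1], "p1": [1, 2]}                       # path -> link indices
path_hidden = {"p0": [0.25, 0.75], "p1": [0.5, 0.5]}       # hypothetical path hidden states
h = link_hidden_states(paths, path_hidden, num_links=3)
```

Link 1 lies on both paths, so its hidden state is the element-wise sum of the two path hidden states; links 0 and 2 each inherit the hidden state of their single path.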

2.4) After the path state and the link state have been updated, the updated path state vector and the updated link state vector serve as the input of the next message passing operation. After T message passing operations, the link hidden state $h_l$ and the path hidden state $h_p$ are finally obtained.
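The overall iteration in 2.1) to 2.4) can be outlined as a loop with pluggable aggregation and update functions. This is a structural sketch only: the element-wise mean stands in for the GCN and residual aggregation, and a simple average stands in for the GRU update. Both stand-ins, and all names below, are assumptions made for illustration.

```python
def message_passing(paths, h_link, h_path, T, aggregate, update):
    """Run T rounds in which paths and links exchange hidden states."""
    for _ in range(T):
        # 2.1: each path aggregates messages from the links it contains.
        m_path = {k: aggregate([h_link[l] for l in links]) for k, links in paths.items()}
        # 2.2: path state update (a GRU in the patent, a stand-in here).
        h_path = {k: update(h_path[k], m_path[k]) for k in paths}
        # 2.3: each link sums the hidden states of the paths that contain it.
        m_link = {l: [0.0] * len(h) for l, h in h_link.items()}
        for k, links in paths.items():
            for l in links:
                m_link[l] = [a + b for a, b in zip(m_link[l], m_path[k])]
        h_link = {l: update(h_link[l], m_link[l]) for l in h_link}
        # 2.4: the updated states feed the next round.
    return h_link, h_path


def mean_vectors(vectors):
    """Element-wise mean, standing in for the GCN-based aggregation."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def average_update(state, message):
    """Average of state and message, standing in for the GRU."""
    return [(s + m) / 2 for s, m in zip(state, message)]


paths = {"p0": [0, 1], "p1": [1]}
h_link = {0: [1.0, 0.0], 1: [0.0, 1.0]}
h_path = {"p0": [0.5, 0.5], "p1": [0.5, 0.5]}
links_out, paths_out = message_passing(paths, h_link, h_path, 1, mean_vectors, average_update)
```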

3) The link hidden state $h_l$ and the path hidden state $h_p$ are passed through fully connected layers to obtain the performance prediction indexes of the links and paths, i.e. the link-level performance matrix and the path-level performance matrix. The link-level performance matrix measures the remaining link capacity of the network, and the path-level performance matrix measures the bearer bandwidth between network nodes.

(2) For the traffic matrix prediction module:

1) A GCN module is used to capture the spatial correlation information of the state information.

The traffic matrix sequence $Y_{TM}$, consisting of the traffic matrix at time t and the traffic matrices at the previous q times, is input into the GCN as the feature matrix of the network nodes; at the same time, the network topology M is input into the GCN as the adjacency matrix; the graph convolution operation then gives hidden states carrying spatial correlation information.

Then, to capture the temporal correlation of the network, the hidden states carrying spatial correlation information are input into an LSTM, which computes hidden states covering the spatio-temporal features.

These hidden states are collected as H, i.e. H is the set of hidden states with spatio-temporal features.

2) The weight of each hidden state is computed with an attention mechanism, and a context vector describing the traffic variation of the global network topology is obtained.

A self-attention mechanism is introduced to adjust the importance of different time slices and to collect global spatio-temporal information, improving prediction accuracy. In the attention mechanism:

First, the hidden state set H is taken as input, and the corresponding output is obtained through two hidden layers, where $w_1$ and $b_1$ are the weight and bias of the first hidden layer, and $w_2$ and $b_2$ are the weight and bias of the second hidden layer.

Then, the weight of each feature is computed with the Softmax normalized exponential function.

Finally, the context vector $b_t$ describing the traffic variation of the global network topology is obtained as the weighted sum of the hidden states.
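The attention step can be sketched as below. The patent text here does not preserve the exact form of the two hidden layers, so this sketch assumes two linear layers without an intermediate activation, with the second layer reducing each hidden state to a scalar score. The function name `attention_context` and the parameter shapes are hypothetical.

```python
import math


def attention_context(H, w1, b1, w2, b2):
    """Score each hidden state with two linear layers, Softmax the scores,
    and return the weights and the weighted sum (context vector)."""
    scores = []
    for h in H:
        # First hidden layer: z = w1 @ h + b1 (w1 is a matrix, b1 a vector).
        z = [sum(w * x for w, x in zip(row, h)) + b for row, b in zip(w1, b1)]
        # Second hidden layer: scalar score = w2 . z + b2.
        scores.append(sum(w * v for w, v in zip(w2, z)) + b2)
    m = max(scores)
    exp = [math.exp(s - m) for s in scores]          # stable Softmax
    total = sum(exp)
    alpha = [e / total for e in exp]
    dim = len(H[0])
    context = [sum(a * h[d] for a, h in zip(alpha, H)) for d in range(dim)]
    return alpha, context


H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # hypothetical hidden states, one per time slice
w1 = [[1.0, 0.0], [0.0, 1.0]]
b1 = [0.0, 0.0]
w2 = [1.0, 1.0]
b2 = 0.0
alpha, context = attention_context(H, w1, b1, w2, b2)
```

With these placeholder parameters the third time slice receives the largest score, so it dominates the context vector.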

3) A fully connected layer outputs the prediction result, with the ReLU function applied to obtain the final prediction, i.e. the predicted traffic matrix at the next time; here FC(·) denotes a one-layer fully connected network.

(3) Loss function

In the training of the network performance prediction module, the sum of the mean square error between the predicted link-level performance matrix and its true value $Y_l$ and the mean square error between the predicted path-level performance matrix and its true value $Y_p$ is used as the loss function, and the model parameters of the network performance prediction module are updated by gradient descent.

In the training of the traffic matrix prediction module, the mean square error between the predicted traffic matrix and its true value, plus an $L_2$ regularization term, is used as the loss function, in which λ is a hyper-parameter, and the model parameters of the traffic matrix prediction module are updated by gradient descent.

The loss function Loss of the whole GCT-Route network model is:

$Loss = Loss1 + Loss2$ (27)

in the formula, Loss1 is the loss term of the network performance prediction module, and Loss2 is the loss term of the traffic matrix prediction module.
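The loss terms can be written out as a short sketch. It assumes the matrices are flattened into lists, and that the regularization term is λ times the sum of squared model parameters; that exact form is an assumption, as the patent's formula is not preserved here. All function names are hypothetical.

```python
def mse(pred, true):
    """Mean square error between two equally long lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred)


def loss_performance(link_pred, link_true, path_pred, path_true):
    """Loss1: link-level MSE plus path-level MSE."""
    return mse(link_pred, link_true) + mse(path_pred, path_true)


def loss_traffic(tm_pred, tm_true, params, lam):
    """Loss2: traffic-matrix MSE plus an L2 term (assumed form: lam * sum of squares)."""
    return mse(tm_pred, tm_true) + lam * sum(w * w for w in params)


loss1 = loss_performance([1.0, 2.0], [1.0, 4.0], [0.0], [1.0])
loss2 = loss_traffic([2.0], [2.0], params=[1.0, 2.0], lam=0.5)
total = loss1 + loss2   # eq. (27)
```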

S3, the network monitoring module of the control plane acquires the traffic matrix at the current time. The traffic matrix at the current time and the reward value $r_{t'}$ at the current time are input into the reinforcement learning agent; the agent generates an action value according to this state, and the action value is then input into the GCT-Route network model.

S4, the GCT-Route network model takes the received action value as the routing scheme P, and takes it together with the traffic matrix at the current time, the traffic matrix sequence consisting of the traffic matrices at the current time and the q preceding times, and the network topology M as the input of the model; the model then outputs the link-level performance matrix, the path-level performance matrix and the predicted traffic matrix for the next time.

S5, the reward function of the reinforcement learning agent generates the reward value for the next time according to the link-level performance matrix and the path-level performance matrix.

The reward function takes load balancing as the optimization target, i.e. maximizing the remaining link capacity while minimizing the variance of the path bearer bandwidth between network nodes. The reward value $r_{t'+1}$ at the next time is:

$r_{t'+1} = \alpha r_l - \beta r_p$ (28)

in the formula, α and β are adjustable parameters.

The link remaining capacity $r_l$ is the sum of the remaining capacities of the network links, taken over the rows of the link-level performance matrix, where n is the number of network nodes.

The path bearer bandwidth term $r_p$ between network nodes is the variance of the bearer bandwidth of the source-target paths, taken over the rows of the path-level performance matrix about their average value, where n is the number of network nodes.
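Equation (28) can be illustrated with a minimal sketch. It assumes the link-level and path-level predictions are flattened into lists and that the variance is the population variance (division by the number of entries); the patent's exact normalization is not preserved here. The function name `reward` and the sample values are hypothetical.

```python
def reward(link_perf, path_perf, alpha, beta):
    """r = alpha * r_l - beta * r_p, following eq. (28)."""
    r_l = sum(link_perf)                                   # total remaining link capacity
    mean = sum(path_perf) / len(path_perf)
    r_p = sum((y - mean) ** 2 for y in path_perf) / len(path_perf)   # assumed population variance
    return alpha * r_l - beta * r_p


r = reward(link_perf=[4.0, 6.0], path_perf=[1.0, 3.0], alpha=1.0, beta=2.0)
```

Here the remaining capacity sums to 10 and the path bandwidths have variance 1, so a larger β penalizes unevenly loaded paths more strongly.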

S6, judging whether the reward value has converged.

Define the reward value sequence $R_{t'+1}$, formed by the reward values at the next time and the s times before it: $R_{t'+1} = \{r_{t'+1-s}, \ldots, r_{t'}, r_{t'+1}\}$, where s is the length of the reward value sequence.

Let the convergence threshold be ε. If the standard deviation $\sigma(R_{t'+1})$ of the reward value sequence is less than ε, i.e. $\sigma(R_{t'+1}) < \varepsilon$, the reward value is determined to have converged. In that case, the GCT-Route network model issues the routing scheme P to the flow table generating module of the control plane; the flow table generating module combines the routing scheme P with the network topology M to generate SDN flow table entries and issues them to each network node of the data plane; the network nodes forward network traffic according to the latest flow table entries, realizing route optimization of the data plane.

Otherwise, it is determined not to converge.
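The convergence test of S6 can be sketched with the standard library. The sketch assumes the population standard deviation (`statistics.pstdev`), since the patent does not say whether the population or sample form is meant, and it treats a history shorter than s + 1 values as not yet converged. The function name `has_converged` is hypothetical.

```python
import statistics


def has_converged(rewards, s, epsilon):
    """True when the standard deviation of the last s + 1 rewards is below epsilon."""
    if len(rewards) < s + 1:
        return False          # assumption: too little history means not converged
    window = rewards[-(s + 1):]
    return statistics.pstdev(window) < epsilon


stable = has_converged([1.0, 5.0, 5.0, 5.0], s=2, epsilon=0.1)
unstable = has_converged([1.0, 5.0, 5.0], s=2, epsilon=0.1)
```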

S7, let $t' = t' + 1$, assign the predicted traffic matrix at the next time to the traffic matrix at the current time, assign the reward value $r_{t'+1}$ at the next time to the reward value $r_{t'}$ at the current time, and return to S3. In this way network self-learning is formed, and the degradation of network routing performance caused by prolonged trial-and-error exploration of the reinforcement learning agent during learning is resolved.
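Steps S3 to S7 together form a loop that can be outlined as follows. The agent, the GCT-Route model and the reward function are passed in as plain callables; the initial reward of 0.0, the `max_steps` safeguard and the toy stand-ins at the bottom are assumptions added for this sketch and are not part of the patent.

```python
import statistics


def optimize_route(agent, model, reward_fn, topology, tm_history,
                   q, s, epsilon, max_steps=1000):
    """Iterate S3-S7 until the reward sequence converges; return the routing scheme."""
    history = list(tm_history)      # traffic matrices, most recent last
    reward = 0.0                    # assumed initial reward value
    rewards = []
    for _ in range(max_steps):      # safeguard added for this sketch
        tm_now = history[-1]
        route = agent(tm_now, reward)                                   # S3
        link_perf, path_perf, tm_next = model(                          # S4
            route, topology, tm_now, history[-(q + 1):])
        reward = reward_fn(link_perf, path_perf)                        # S5
        rewards.append(reward)
        window = rewards[-(s + 1):]                                     # S6
        if len(window) == s + 1 and statistics.pstdev(window) < epsilon:
            return route            # converged: hand the scheme to the control plane
        history.append(tm_next)                                         # S7
    return None                     # no convergence within max_steps


# Toy stand-ins so the loop can run end to end.
def toy_agent(tm, reward):
    return "P0"


def toy_model(route, topology, tm_now, tm_seq):
    return [1.0], [1.0], tm_now


def toy_reward(link_perf, path_perf):
    return sum(link_perf) - sum(path_perf)


result = optimize_route(toy_agent, toy_model, toy_reward, topology=[[0]],
                        tm_history=[[[0.0]]], q=2, s=3, epsilon=1e-6)
```

Because the model is queried instead of the live network during exploration, only the converged routing scheme would reach the data plane in this outline.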

The system architecture for implementing the method is shown in Fig. 3 and comprises a data plane, a control plane and a knowledge plane. The data plane consists of SDN switches (network nodes) that forward network traffic data. The control plane consists of an SDN controller and is functionally divided into a network monitoring module and a flow table generating module: the network monitoring module collects network information data, and the flow table generating module generates SDN flow table entries according to the routing scheme issued by the upper layer and issues them to the SDN switches of the data plane. The knowledge plane consists of a reinforcement learning agent, the GCT-Route network model and a storage area: the storage area stores the training data set, and the reinforcement learning agent and the GCT-Route network model interact with each other to generate the network traffic scheduling strategy.

It should be noted that the above embodiments are illustrative and do not limit the present invention; the present invention is therefore not limited to the specific embodiments described above. Other embodiments obtained by those skilled in the art in light of the teachings of the present invention, without departing from its principles, are considered to be within the scope of protection of the present invention.

Claims

1. An SDN route optimization method based on space-time feature fusion of a graph neural network is characterized by comprising the following steps:

step 3, the SDN control plane inputs the traffic matrix at the current time and the reward value $r_{t'}$ at the current time into the reinforcement learning agent of the SDN knowledge plane, which generates an action value; the action value is then input into the spatio-temporal feature fusion network model based on the graph neural network;

step 4, the spatio-temporal feature fusion network model based on the graph neural network on the SDN knowledge plane takes the received action value as the routing scheme P, and takes the routing scheme P, the network topology M, the traffic matrix at the current time, and the traffic matrix sequence consisting of the traffic matrices at the current time and the q preceding times together as its input; the model outputs the link-level performance matrix, the path-level performance matrix, and the predicted traffic matrix for the next time; wherein q is a set value;

and the path-level performance matrix, generating the reward value $r_{t'+1}$ for the next time;

assigning to the traffic matrix at the current time

2. The SDN route optimization method based on spatio-temporal feature fusion of the graph neural network as claimed in claim 1, wherein in step 1, the training data set D is:

wherein $M = [m_{ij}]_{n \times n}$ is the network topology and $m_{ij}$ represents the connection relationship between network node $v_i$ and network node $v_j$: $m_{ij} = 1$ when $v_i$ and $v_j$ are connected, and $m_{ij} = 0$ when they are not connected; $P = [p_{ij}]_{n \times n}$ is the routing scheme and $p_{ij}$ represents the path from network node $v_i$ to network node $v_j$;

for the traffic matrix at time t,

3. The SDN routing optimization method based on the spatio-temporal feature fusion of the graph neural network as claimed in claim 1 or 2, wherein in the step 2, the spatio-temporal feature fusion network model based on the graph neural network comprises a network performance prediction module and a traffic matrix prediction module;

(1) in the network performance prediction module:

firstly, defining the minimum value of the link capacity of all links of each path in a routing scheme as a link characteristic, and assigning to an initial link state vector; meanwhile, defining the flow among all network nodes of the flow matrix at the time t as a path characteristic, and assigning the path characteristic to an initial path state vector;

path state update procedure: firstly, matrix splicing is carried out on a current path state vector and a hidden state of a path to obtain a path characteristic matrix; inputting the path feature matrix into the GRU to obtain an updated path state vector;

and (3) link state updating process: each link sums the hidden states of all paths including the link to obtain the hidden state of the link; performing matrix splicing on the current link state vector and the hidden state of the link to obtain a link characteristic matrix; then inputting the link characteristic matrix into the GRU to obtain an updated link state vector;

(2) in the traffic matrix prediction module:

firstly, inputting a traffic matrix sequence and a network topology composed of traffic matrixes at t moment and q moments before the t moment into a layer of GCN to obtain a hidden state with space-related information; wherein q is a set value;

4. The SDN route optimization method based on spatio-temporal feature fusion of a graph neural network as claimed in claim 1, wherein in step 4, the network topology M, the routing scheme P and the traffic matrix at the current time are input into the network performance prediction module of the spatio-temporal feature fusion network model based on the graph neural network, which outputs the link-level performance matrix and the path-level performance matrix; the network topology M and the traffic matrix sequence consisting of the traffic matrices at the current time and the previous q times are input into the traffic matrix prediction module of the same model, which outputs the predicted traffic matrix at the next time.

5. The SDN route optimization method based on spatio-temporal feature fusion of a graph neural network as claimed in claim 1, wherein in step 5, the reward value $r_{t'+1}$ of the reward function at the next time is computed from the link-level performance matrix and the path-level performance matrix, using the average value of the rows of the path-level performance matrix; α and β are set adjustable parameters, and n is the number of network nodes.