CN115842768A - SDN route optimization method based on time-space feature fusion of graph neural network - Google Patents

Publication number
CN115842768A
Authority
CN
China
Prior art keywords
network
matrix
path
link
sdn
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211473921.0A
Other languages
Chinese (zh)
Inventor
陈俊彦
谢小兰
俸皓
王勇
廖岑卉珊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin University of Electronic Technology
Priority to CN202211473921.0A
Publication of CN115842768A
Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/50: Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses an SDN route optimization method based on spatio-temporal feature fusion of a graph neural network. A reinforcement learning agent learns the interdependence between the traffic load of the network switches and network performance, and determines a set of optimal route forwarding schemes that balances end-to-end path bandwidth capacity against the load balance of the network. The optimal path for data packets is found with the help of graph neural network predictions: a spatio-temporal feature fusion network model based on a graph neural network (the GCT-Route network model) is developed to help deep reinforcement learning complete its self-learning process quickly. The invention thereby addresses the degradation of network routing performance caused by prolonged trial-and-error exploration while the reinforcement learning agent is learning.

Description

SDN route optimization method based on time-space feature fusion of graph neural network
Technical Field
The invention relates to the technical field of Software Defined Networking (SDN), and in particular to an SDN route optimization method based on spatio-temporal feature fusion of a graph neural network.
Background
The advent of software defined networking has made centralized management and operation possible, and network resources such as switches can now be configured flexibly through programmable interfaces. Traditional SDN route optimization methods approximate the current network state through modeling and use heuristics to compute routing configurations for multimedia flow requests in real time. These methods apply only to narrowly defined scenarios, incur a large computational cost, and struggle to cope with future real-time, highly dynamic network environments. In recent years, advances in deep reinforcement learning (DRL) have opened new ways to solve highly complex routing problems. A DRL-based routing scheme can learn and adapt to complex networks, improving routing decisions in an experience-driven, model-free manner. However, because reinforcement learning must explore while it searches for the best policy, network performance may degrade during the learning process. In particular, if the network topology changes, the DRL agent must relearn the route optimization. The more complex the characteristics of the network traffic, the longer convergence takes, which leaves the network performing poorly for a long time.
Disclosure of Invention
The invention aims to solve the problem that existing SDN route optimization methods degrade network performance during the learning process, and provides an SDN route optimization method based on spatio-temporal feature fusion of a graph neural network.
To solve this problem, the invention is realized by the following technical scheme:
An SDN route optimization method based on spatio-temporal feature fusion of a graph neural network comprises the following steps:
Step 1, the SDN control plane collects historical network information, namely the network topology, the routing scheme and the traffic matrix, to generate a training data set, and stores the training data set in a storage area of the SDN knowledge plane;
Step 2, a spatio-temporal feature fusion network model based on a graph neural network is constructed on the SDN knowledge plane and trained offline with the training data set;
Step 3, the SDN control plane inputs the traffic matrix at the current time $t'$ and the reward value $r_{t'}$ at the current time into the reinforcement learning agent of the SDN knowledge plane, which generates an action value; the action value is then input into the spatio-temporal feature fusion network model based on the graph neural network;
Step 4, the spatio-temporal feature fusion network model of the SDN knowledge plane takes the received action value as the routing scheme P; the routing scheme P, the network topology M, the traffic matrix at the current time, and the traffic matrix sequence consisting of the traffic matrices at the current time and the q preceding times are together used as the input of the model, which outputs the link-level performance matrix, the path-level performance matrix, and the predicted traffic matrix for the next time; wherein q is a set value;
Step 5, the reward function of the reinforcement learning agent of the SDN knowledge plane generates the reward value $r_{t'+1}$ for the next time from the link-level performance matrix and the path-level performance matrix;
Step 6, it is judged whether the standard deviation of the reward value sequence $\{r_{t'+1-s}, \ldots, r_{t'}, r_{t'+1}\}$, which consists of the reward values at the next time and the s times before it, is less than a set convergence threshold, wherein s is a set value:
if so, the spatio-temporal feature fusion network model of the SDN knowledge plane issues the routing scheme P to the SDN control plane; the SDN control plane combines the routing scheme P with the network topology M to generate SDN flow table entries and issues them to each network node of the data plane; the network nodes forward network traffic according to the SDN flow table entries, thereby realizing route optimization of the data plane;
otherwise, let $t' = t' + 1$, assign the predicted traffic matrix for the next time to the traffic matrix at the current time, assign the reward value $r_{t'+1}$ for the next time to the reward value $r_{t'}$ at the current time, and return to step 3.
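The control flow of steps 3 to 6 can be sketched as a loop. This is only an illustrative sketch: `agent`, `model`, `reward_fn` and `converged` are hypothetical callables standing in for the reinforcement learning agent, the prediction model, the reward function and the convergence test, and the `max_steps` safeguard is an assumption that does not appear in the method itself.

```python
def optimize_route(agent, model, reward_fn, converged,
                   topology, tm_history, r0, q, max_steps=100):
    """Repeat steps 3-6 until the reward sequence converges.

    All callables are hypothetical stand-ins, not the patent's components.
    Returns the routing scheme to issue, or None if max_steps is reached.
    """
    rewards = [r0]                      # reward values, oldest first
    history = list(tm_history)          # traffic matrices, oldest first
    for _ in range(max_steps):
        tm_now = history[-1]
        # Step 3: the agent maps (current traffic, current reward) to an action.
        routing = agent(tm_now, rewards[-1])
        # Step 4: the model predicts performance and the next traffic matrix
        # from the current matrix and the window of the q preceding ones.
        link_perf, path_perf, tm_next = model(
            routing, topology, tm_now, history[-(q + 1):])
        # Step 5: the reward for the next time comes from the predictions.
        rewards.append(reward_fn(link_perf, path_perf))
        # Step 6: stop and issue the routing scheme once rewards have settled.
        if converged(rewards):
            return routing
        # Otherwise the predicted traffic matrix becomes the current one.
        history.append(tm_next)
    return None
```

Because the next traffic matrix is predicted rather than measured, the loop can iterate without applying unconverged routing schemes to the data plane.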
In step 1, the training data set D is composed of the network topology M, the routing scheme P, and the traffic matrix at time t, where:
$M = [m_{ij}]_{n \times n}$ is the network topology, and $m_{ij}$ represents the connection relationship between network node $v_i$ and network node $v_j$: $m_{ij} = 1$ when the two nodes are connected, and $m_{ij} = 0$ when they are not;
$P = [p_{ij}]_{n \times n}$ is the routing scheme, and $p_{ij}$ represents the path from network node $v_i$ to network node $v_j$;
the traffic matrix at time t is an n × n matrix whose entry in row i and column j is the traffic from network node $v_i$ to network node $v_j$ at time t;
i, j = 1, 2, …, n, and n is the number of network nodes.
In step 2, the spatio-temporal feature fusion network model based on the graph neural network comprises a network performance prediction module and a traffic matrix prediction module;
(1) in the network performance prediction module:
first, the minimum link capacity over all links of each path in the routing scheme is defined as the link feature and assigned to the initial link state vector; meanwhile, the traffic between the network nodes in the traffic matrix at time t is defined as the path feature and assigned to the initial path state vector;
then, the same message passing operation is repeated T times on the current link state vector and the current path state vector; during these T iterations the links and the paths exchange hidden states with each other, yielding the final link hidden state and the final path hidden state; specifically:
message aggregation process: the current link state vector and the current path state vector are concatenated into a link-and-path feature matrix, which is input together with the network topology into a two-layer GCN to obtain the output state of the two-layer GCN; this output state is then fed into a three-layer GCN to obtain the output state of the three-layer GCN; the output states of the two-layer GCN and of the three-layer GCN are then fed into a residual module, and the output state of the residual module is passed through a Softmax function to obtain the hidden state of the path;
path state update process: the current path state vector and the hidden state of the path are concatenated into a path feature matrix, which is input into a GRU to obtain the updated path state vector;
link state update process: each link sums the hidden states of all paths that contain it to obtain the hidden state of the link; the current link state vector and the hidden state of the link are concatenated into a link feature matrix, which is input into a GRU to obtain the updated link state vector;
iterative process: the updated path state vector and the updated link state vector serve as the input of the next message passing operation, and the final link hidden state and the final path hidden state are obtained after T message passing operations, wherein T is a set value;
finally, the final link hidden state is passed through a fully connected layer to obtain the link-level performance matrix, and the final path hidden state is passed through a fully connected layer to obtain the path-level performance matrix;
(2) in the traffic matrix prediction module:
first, the traffic matrix sequence consisting of the traffic matrices at time t and the q preceding times is input, together with the network topology, into a one-layer GCN to obtain hidden states carrying spatial correlation information;
then, the hidden states carrying spatial correlation information are input into an LSTM, which computes hidden states covering the spatio-temporal features, yielding a set of hidden states with spatio-temporal features;
next, the weight of each hidden state in this set is computed with an attention mechanism, and a context vector describing the global traffic variation over the network topology is obtained;
finally, the context vector is passed through a fully connected layer with a ReLU function to obtain the predicted traffic matrix for the next time.
In step 4, the network topology M, the routing scheme P, and the traffic matrix at the current time are input into the network performance prediction module of the spatio-temporal feature fusion network model based on the graph neural network, which outputs the link-level performance matrix and the path-level performance matrix; the network topology M and the traffic matrix sequence consisting of the traffic matrices at the current time and the q preceding times are input into the traffic matrix prediction module of the model, which outputs the predicted traffic matrix for the next time.
In step 5, the reward value $r_{t'+1}$ of the reward function at the next time is:
$$r_{t'+1} = \alpha r_l - \beta r_p$$
where α and β are set adjustable parameters; $r_l$ is the link remaining capacity, obtained by summing the link remaining capacities given by the link-level performance matrix; $r_p$ is the path bearer bandwidth between network nodes, obtained as the variance of the bearer bandwidths given by the path-level performance matrix about their mean; and n is the number of network nodes.
Compared with the prior art, the invention uses a reinforcement learning agent to learn the interdependence between the traffic load of the network switches and network performance, and determines a set of optimal route forwarding schemes that balances end-to-end path bandwidth capacity against the load balance of the network. The optimal path for data packets is found with the help of graph neural network predictions: a spatio-temporal feature fusion network model based on a graph neural network (the GCT-Route network model) is developed to help deep reinforcement learning complete its self-learning process quickly. The invention thereby addresses the degradation of network routing performance caused by prolonged trial-and-error exploration while the reinforcement learning agent is learning.
Drawings
Fig. 1 is a flowchart of the SDN route optimization method based on spatio-temporal feature fusion of a graph neural network;
Fig. 2 is a schematic diagram of the GCT-Route network model;
Fig. 3 is a system architecture diagram for implementing the SDN route optimization method based on spatio-temporal feature fusion of a graph neural network.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to specific examples.
An SDN route optimization method based on spatio-temporal feature fusion of a graph neural network, as shown in fig. 1, includes the following steps:
S1, modeling the computer network scenario, collecting historical network information, and generating a training data set.
A Graph Neural Network (GNN) is a type of neural network suited to applications whose information is structured as a graph. GNNs generalize well to different graph structures and help capture the relationships between different network nodes and edges.
For a communication network G, the invention describes the network structure as G = (V, L), where $V = [v_1, v_2, \ldots, v_n]$ is the set of network nodes, n is the number of network nodes, $L = [l_1, l_2, \ldots, l_m]$ is the set of links, and m is the number of links. Following the definition of a graph network, the connection relationship of the network topology is defined as the network topology M:
$$M = [m_{ij}]_{n \times n} \tag{1}$$
where $m_{ij}$, the entry in row i and column j of the network topology M, represents the connection relationship between network node $v_i$ and network node $v_j$.
The routing scheme P in the network is defined as:
$$P = [p_{ij}]_{n \times n} \tag{2}$$
where $p_{ij}$, the entry in row i and column j of the routing scheme, represents the path from network node $v_i$ to network node $v_j$. For convenience of presentation, each path $p_{ij}$ is denoted $p_k$, and $p_k$ is defined as the sequence of links
$$p_k = (l_{k(1)}, l_{k(2)}, \ldots, l_{k(|p_k|)}) \tag{3}$$
where k(i) is the index of the i-th link in path $p_k$.
The traffic matrix at time t is defined as an n × n matrix whose entry in row i and column j is the traffic from network node $v_i$ to network node $v_j$ at time t.
The network monitoring module on the control plane collects network information data, including the network topology M, the routing scheme P and the traffic matrix, to generate the training data set D, which is stored in a storage area of the knowledge plane.
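The three ingredients of a training sample can be illustrated with NumPy. This is a minimal sketch under stated assumptions: the 4-node ring topology, the bidirectional links, the dictionary layout of the routing scheme and the helper names `adjacency_matrix` and `path_links` are all hypothetical choices made for illustration.

```python
import numpy as np

def adjacency_matrix(n, links):
    """Return M = [m_ij], with m_ij = 1 when nodes i and j are connected."""
    M = np.zeros((n, n), dtype=int)
    for i, j in links:
        M[i, j] = M[j, i] = 1   # assumption: every link is bidirectional
    return M

def path_links(path, link_index):
    """Convert a node sequence into its link-index sequence k(1), k(2), ..."""
    return [link_index[frozenset(hop)] for hop in zip(path, path[1:])]

# Hypothetical 4-node ring, nodes numbered from 0.
links = [(0, 1), (1, 2), (2, 3), (3, 0)]
link_index = {frozenset(l): k for k, l in enumerate(links)}
M = adjacency_matrix(4, links)

# Routing scheme: p_ij stored as the node sequence from v_i to v_j.
P = {(0, 2): [0, 1, 2], (1, 3): [1, 2, 3]}

# Traffic matrix at one time step: entry (i, j) is the traffic from v_i to v_j.
traffic = np.zeros((4, 4))
traffic[0, 2], traffic[1, 3] = 5.0, 3.0

sample = {"M": M, "P": P, "traffic": traffic}
```

Storing each path as a link-index sequence makes it easy to look up which paths traverse a given link, which the message passing step relies on.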
S2, a spatio-temporal feature fusion network model based on the graph neural network (the GCT-Route network model) is constructed on the knowledge plane, and the GCT-Route network model is trained offline with the data set.
The GCT-Route network model is divided into a network performance prediction module and a traffic matrix prediction module, as shown in fig. 2. The network performance prediction module uses a Message Passing Neural Network (MPNN) to represent the dependency between the links and the paths of a given routing scheme, following the principle that the state of a path depends on the states of all links on the path, and the state of a link depends on the states of all paths that traverse the link. The module is composed of a Graph Convolutional Network (GCN), a residual connection (skip connection), a Gated Recurrent Unit (GRU), and a fully connected layer (FC). To fully capture the spatial correlation and the temporal correlation of the network, the traffic matrix prediction module first uses a GCN to capture the spatial features of the network information and then uses a Long Short-Term Memory network (LSTM) to capture its temporal features, obtaining network state information with fused spatio-temporal features. A self-attention mechanism is then introduced to adjust the importance of the different time slices and to gather global spatio-temporal information, which improves prediction accuracy. Finally, the result is passed through a fully connected layer (FC) with a ReLU function to obtain the final prediction. The GCT-Route network model takes the network topology M, the routing scheme P and the traffic matrix as input; it models the complex relationships among topology, routing and input traffic in terms of the network nodes and links of the topology, the source-destination paths of the routing scheme, and the traffic passing through them; and it outputs performance estimates at the network link level and path level together with the predicted traffic matrix for the next time. During training, the network topology M, the routing scheme P and the traffic matrix at time t are input into the network performance prediction module, and the network topology M and the traffic matrix sequence formed by the traffic matrices at time t and the q preceding times are input into the traffic matrix prediction module, where q is the length of the traffic matrix sequence.
(1) For the network performance prediction module:
1) For each path $p_k$ in the routing scheme P, the minimum link capacity over all of its links is defined as the link feature $x_l$, and the traffic between the network nodes in the traffic matrix at time t, that is, the bandwidth carried by each source-target path, is defined as the path feature $x_p$. The link feature $x_l$ and the path feature $x_p$ are then assigned to the initial link state vector and the initial path state vector respectively, which initializes the state vectors of the links and the paths.
2) To resolve the dependency between links and paths, the network performance prediction module repeats the same message passing operation T times on the state vectors of the links and the paths. During these T iterations the links and the paths exchange hidden states with each other, finally yielding the link hidden state $h_l$ and the path hidden state $h_p$.
2.1) For the paths in the routing scheme P, each path collects messages from all the links it contains, and the GCN and the residual module aggregate these messages to capture the spatial relationship of the path states, giving the hidden state of the path.
First, the current path state vector and the current link state vector are concatenated to obtain the link-and-path feature matrix $X_t$.
After concatenation, the network topology M, used as the adjacency matrix, and the link-and-path feature matrix $X_t$ are input into a two-layer GCN, whose graph convolution operations give the output state $h_{gcn}$ of the two-layer GCN:
$$h_{gcn} = GCN^{(2)}(M, X_t) \tag{8}$$
The output state $h_{gcn}$ of the two-layer GCN is then passed through a three-layer GCN to obtain the output state $GCN^{(3)}(M, h_{gcn})$ of the three-layer GCN. The output state of the three-layer GCN and the output state $h_{gcn}$ of the two-layer GCN are fed into the residual module to obtain the output state $h_r$ of the residual module:
$$h_r = GCN^{(3)}(M, h_{gcn}) + h_{gcn} \tag{9}$$
Subsequently, the output state $h_r$ of the residual module is passed through a Softmax function to obtain the hidden state of the path.
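The message aggregation of equations (8) and (9), followed by the Softmax, can be sketched in NumPy. Several details here are assumptions, since the text does not fix them: the symmetric normalization with self-loops (the commonly used Kipf and Welling propagation rule), ReLU activations in every GCN layer, a feature matrix with one row per network node, Softmax taken over the feature axis, and the helper names themselves.

```python
import numpy as np

def normalize_adjacency(M):
    """Symmetrically normalize M with self-loops: D^-1/2 (M + I) D^-1/2."""
    A = M + np.eye(M.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_stack(A_hat, X, weights):
    """Apply one graph convolution with ReLU per weight matrix."""
    H = X
    for W in weights:
        H = np.maximum(A_hat @ H @ W, 0.0)
    return H

def softmax(Z, axis=-1):
    Z = Z - Z.max(axis=axis, keepdims=True)   # subtract max for stability
    E = np.exp(Z)
    return E / E.sum(axis=axis, keepdims=True)

def aggregate(M, X, w2, w3):
    """Two-layer GCN, three-layer GCN, residual sum, then Softmax."""
    A_hat = normalize_adjacency(M)
    h_gcn = gcn_stack(A_hat, X, w2)            # eq. (8)
    h_r = gcn_stack(A_hat, h_gcn, w3) + h_gcn  # eq. (9), residual connection
    return softmax(h_r, axis=1)                # hidden state of the path
```

The residual sum requires the three-layer GCN to preserve the width of `h_gcn`, so the weight matrices in `w3` must be square in this sketch.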
2.2) After the message aggregation process is completed, the path state is updated to obtain the updated path state vector.
First, the current path state vector and the hidden state of the path are concatenated to obtain the path feature matrix.
After concatenation, the path feature matrix is used as the input of a one-layer GRU, which updates the path state and gives the updated path state vector.
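The path state update can be sketched with a hand-written GRU step. This is an illustration under assumptions: the standard GRU gate equations are used, the concatenated path feature matrix is the GRU input, and the current path state vector is taken as the GRU's previous hidden state, which the text does not state explicitly. The parameter tuple layout and the function names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_update(x, h, params):
    """One standard GRU step; x is the input, h the previous state."""
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params
    z = sigmoid(x @ Wz + h @ Uz + bz)               # update gate
    r = sigmoid(x @ Wr + h @ Ur + br)               # reset gate
    h_cand = np.tanh(x @ Wh + (r * h) @ Uh + bh)    # candidate state
    return (1.0 - z) * h + z * h_cand               # blend old and candidate

def update_path_state(path_state, path_hidden, params):
    """Concatenate state and hidden state, then run one GRU step."""
    # Path feature matrix: one row per path.
    x = np.concatenate([path_state, path_hidden], axis=1)
    # Assumption: the current path state acts as the previous GRU state.
    return gru_update(x, path_state, params)
```

The link state update has the same shape: the link feature matrix replaces the path feature matrix and the current link state vector replaces the path state.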
2.3) After the path state is updated, the link state is updated to obtain the updated link state vector.
First, based on the hidden states of the paths, each link sums the hidden states of all paths that contain it, which gives the hidden state of the link; that is, the hidden state of the i-th link is the sum of the hidden states of the paths $p_k$ that contain the i-th link.
Then, the current link state vector and the hidden state of the link are concatenated to obtain the link feature matrix.
After concatenation, the link feature matrix is used as the input of a one-layer GRU, which updates the link state and gives the updated link state vector.
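The summation that produces the hidden state of each link can be sketched as follows. The representation of paths as lists of link indices and the function name `link_hidden_states` are assumptions made for illustration.

```python
import numpy as np

def link_hidden_states(paths, path_hidden, num_links):
    """Sum, for every link, the hidden states of the paths that contain it.

    paths[k] lists the link indices of path p_k; path_hidden[k] is its
    hidden state vector.
    """
    path_hidden = np.asarray(path_hidden, dtype=float)
    link_hidden = np.zeros((num_links, path_hidden.shape[1]))
    for k, links in enumerate(paths):
        for l in set(links):            # each path contributes once per link
            link_hidden[l] += path_hidden[k]
    return link_hidden

# Two paths sharing link 1: link 1 receives both hidden states.
example = link_hidden_states([[0, 1], [1, 2]], [[1.0, 2.0], [10.0, 20.0]], 3)
```

A link that carries many paths therefore accumulates a larger hidden state, which is how path load information reaches the link level.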
2.4) After the path state and the link state have been updated, the updated path state vector and the updated link state vector are used as the input of the next message passing operation. After T message passing operations, the link hidden state $h_l$ and the path hidden state $h_p$ are finally obtained.
3) The link hidden state $h_l$ and the path hidden state $h_p$ are passed through fully connected layers to obtain the performance prediction indicators of the links and the paths, namely the link-level performance matrix and the path-level performance matrix. The link-level performance matrix measures the remaining link capacity of the network, and the path-level performance matrix measures the bearer bandwidth between the network nodes.
(2) For the traffic matrix prediction module:
1) A GCN module captures the spatial correlation information of the state information.
The input is the traffic matrix sequence $Y_{TM}$ consisting of the traffic matrix at time t and the traffic matrices at the q preceding times. $Y_{TM}$ is input into the GCN as the feature matrix of the network nodes, and the network topology M is input into the GCN as the adjacency matrix; the graph convolution operation then gives hidden states carrying spatial correlation information.
Then, to capture the temporal correlation of the network, the hidden states carrying spatial correlation information are input into an LSTM, which computes hidden states covering the spatio-temporal features.
These hidden states are collectively denoted H; that is, H is the set of hidden states with spatio-temporal features.
2) The weight of each hidden state is computed with an attention mechanism, and a context vector describing the global traffic variation over the network topology is obtained.
A self-attention mechanism is introduced to adjust the importance of the different time slices and to gather global spatio-temporal information, which improves prediction accuracy. In the attention mechanism:
First, the hidden state set H is used as input and passed through two hidden layers to obtain the corresponding output, where $w_1$ and $b_1$ are the weight and bias of the first hidden layer, and $w_2$ and $b_2$ are the weight and bias of the second hidden layer.
Then, the weight of each hidden state is computed with the Softmax normalized exponential function.
Finally, the context vector $b_t$ of the global traffic variation information over the network topology is obtained as the weighted sum of the hidden states.
3) A one-layer fully connected network, denoted FC(·), outputs the prediction result, and a ReLU function is applied to obtain the final prediction, namely the predicted traffic matrix for the next time.
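The attention weighting and the final prediction layer can be sketched in NumPy. The layout of H as one row per time slice, the absence of a nonlinearity between the two hidden layers, and the names `attention_context`, `predict_next`, `W_fc` and `b_fc` are assumptions; the text only fixes the two hidden layers, the Softmax, the weighted sum, and the fully connected layer with ReLU.

```python
import numpy as np

def attention_context(H, w1, b1, w2, b2):
    """Score each hidden state, normalize with Softmax, and take the weighted sum.

    H has shape (time_slices, features); w2 must map to a single score.
    Returns (weights, context).
    """
    scores = ((H @ w1 + b1) @ w2 + b2).reshape(-1)  # two hidden layers
    exp = np.exp(scores - scores.max())             # stable Softmax
    weights = exp / exp.sum()                       # one weight per time slice
    context = weights @ H                           # weighted sum of hidden states
    return weights, context

def predict_next(context, W_fc, b_fc):
    """Fully connected layer followed by ReLU."""
    return np.maximum(context @ W_fc + b_fc, 0.0)
```

The ReLU keeps every predicted traffic entry non-negative, which matches the physical meaning of a traffic volume.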
(3) Loss function
In training the network performance prediction module, the loss function Loss1 is the sum of two mean square errors: that between the predicted link-level performance matrix and its true value $Y_l$, and that between the predicted path-level performance matrix and its true value $Y_p$. The model parameters of the network performance prediction module are updated by gradient descent.
In training the traffic matrix prediction module, the loss function Loss2 is the mean square error between the predicted traffic matrix and its true value, plus an $L_2$ regularization term in which λ is a hyper-parameter. The model parameters of the traffic matrix prediction module are updated by gradient descent.
The loss function Loss of the whole GCT-Route network model is:
$$Loss = Loss1 + Loss2 \tag{27}$$
where Loss1 is the loss term of the network performance prediction module, and Loss2 is the loss term of the traffic matrix prediction module.
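The two loss terms and their sum in equation (27) can be sketched as follows. The sketch assumes that λ multiplies the sum of squared model weights and that each mean square error averages over all matrix entries; the function names are hypothetical.

```python
import numpy as np

def mse(pred, true):
    """Mean square error over all entries."""
    return float(np.mean((np.asarray(pred) - np.asarray(true)) ** 2))

def loss1(link_pred, link_true, path_pred, path_true):
    """Loss of the performance module: link-level MSE plus path-level MSE."""
    return mse(link_pred, link_true) + mse(path_pred, path_true)

def loss2(tm_pred, tm_true, weights, lam):
    """Loss of the traffic module: MSE plus lam times the squared L2 norm."""
    l2 = sum(float(np.sum(np.asarray(w) ** 2)) for w in weights)
    return mse(tm_pred, tm_true) + lam * l2

def total_loss(l1, l2):
    """Eq. (27): Loss = Loss1 + Loss2."""
    return l1 + l2
```

In practice the gradient of this scalar with respect to the model parameters would come from an automatic differentiation framework; the sketch only shows how the scalar is assembled.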
S3, the network monitoring module of the control plane obtains the traffic matrix at the current time. The traffic matrix at the current time and the reward value $r_{t'}$ at the current time are input into the reinforcement learning agent; the agent generates an action value according to this state, and the action value is then input into the GCT-Route network model.
S4, the GCT-Route network model takes the received action value as the routing scheme P and uses it, together with the traffic matrix at the current time, the traffic matrix sequence consisting of the traffic matrices at the current time and the q preceding times, and the network topology M, as the input of the model. The model then outputs the link-level performance matrix, the path-level performance matrix, and the predicted traffic matrix for the next time.
S5, performance matrix of reward function of reinforcement learning agent according to link level
Figure BDA00039561954100000812
And the performance matrix of the path level +>
Figure BDA00039561954100000813
A prize value for the next time instance is generated.
The reward function takes load balance as an optimization target, namely maximizing the residual capacity of a link and minimizing the path bearing bandwidth among network nodes, namely a reward value r at the next moment t′+1 Comprises the following steps:
r t′+1 =αr l -βr p (28)
in the formula, alpha and beta are adjustable parameters.
The link residual capacity r_l is the sum of the residual capacities of the network links, i.e.:
Figure BDA00039561954100000814
In the formula, (Figure BDA00039561954100000815), i.e. (Figure BDA00039561954100000816), is the i-th row of the link-level performance matrix (Figure BDA00039561954100000817); (Figure BDA00039561954100000818) is the link-level performance matrix, and n is the number of network nodes.
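The image of the link residual capacity formula did not survive extraction. One plausible reading of the surrounding definitions is sketched below; the symbols \hat{Y}^{l} (link-level performance matrix) and \hat{y}^{l}_{i} (its i-th row) are chosen here purely for illustration and are not taken from the original figures, and reducing each row to the sum of its entries is an assumption.

```latex
% Hedged reconstruction: r_l as the sum of residual capacity over all rows
% of the link-level performance matrix (symbols are illustrative).
r_l = \sum_{i=1}^{n} \hat{y}^{l}_{i},
\qquad
\hat{y}^{l}_{i} = \sum_{j=1}^{n} \hat{Y}^{l}_{ij}
```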
The path bearer bandwidth term r_p between network nodes is the variance of the bandwidth carried by each source-destination path, i.e.:
Figure BDA00039561954100000819
In the formula,
Figure BDA00039561954100000820
(Figure BDA00039561954100000821) is the i-th row of the path-level performance matrix (Figure BDA00039561954100000822), (Figure BDA00039561954100000823) is the mean of all (Figure BDA00039561954100000824), (Figure BDA00039561954100000825) is the path-level performance matrix, and n is the number of network nodes.
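The reward of equation (28) can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name `reward` and the plain nested-list matrices are hypothetical, and because the formula images for r_l and r_p are missing, the code assumes that r_l is the sum of all entries of the link-level matrix and that r_p is the population variance of the per-row sums of the path-level matrix.

```python
from statistics import pvariance


def reward(link_perf, path_perf, alpha=1.0, beta=1.0):
    """Hedged sketch of r = alpha * r_l - beta * r_p (equation 28).

    link_perf and path_perf are n x n nested lists standing in for the
    link-level and path-level performance matrices.
    """
    # Assumption: r_l adds up the residual capacity over every row.
    r_l = sum(sum(row) for row in link_perf)
    # Assumption: each row is reduced to its sum before the variance is taken.
    per_row_bandwidth = [sum(row) for row in path_perf]
    r_p = pvariance(per_row_bandwidth)  # population variance (divides by n)
    return alpha * r_l - beta * r_p


link = [[0.0, 2.0], [3.0, 0.0]]
path = [[0.0, 1.0], [3.0, 0.0]]
value = reward(link, path, alpha=1.0, beta=2.0)
```

Raising α rewards routing schemes that leave more spare link capacity, while raising β penalizes schemes whose paths carry uneven bandwidth.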
S6, judging whether the reward value has converged.
A reward value sequence R_{t′+1} is formed from the reward value at the next moment and the reward values at the s moments before it: R_{t′+1} = {r_{t′+1-s}, …, r_{t′}, r_{t′+1}}, where s is a set value that determines the length of the reward value sequence.
A convergence threshold ε is set. If the standard deviation σ(R_{t′+1}) of the reward value sequence is less than ε, i.e. σ(R_{t′+1}) < ε, the reward value is judged to have converged. In this case, the GCT-Route network model issues the routing scheme P to the flow table generation module of the control plane; the flow table generation module combines the routing scheme P with the network topology M to generate SDN flow table entries and issues them to each network node of the data plane; the network nodes then forward network traffic according to the latest flow table entries, thereby realizing routing optimization of the data plane.
Otherwise, the reward value is judged not to have converged.
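The convergence test of S6 can be sketched as a sliding window over recent reward values. The helper names `make_window` and `has_converged` are hypothetical. The sketch assumes the population standard deviation (the source does not say whether the population or sample form is meant) and assumes that no decision is made until the window holds all s + 1 values.

```python
from collections import deque
from statistics import pstdev


def make_window(s):
    """Window holding r_{t'+1-s} ... r_{t'+1}, i.e. s + 1 reward values."""
    return deque(maxlen=s + 1)


def has_converged(window, epsilon):
    """Return True when the window is full and its std deviation is below epsilon."""
    if len(window) < window.maxlen:
        return False  # assumption: wait until s + 1 rewards are available
    return pstdev(window) < epsilon


window = make_window(s=2)
for r in (1.0, 1.0, 1.0):
    window.append(r)
converged = has_converged(window, epsilon=0.1)
```

Because the deque has a fixed `maxlen`, appending a new reward automatically discards the oldest one, which matches the sliding sequence R_{t′+1}.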
S7, let t′ = t′ + 1, assign the predicted traffic matrix at the next moment (Figure BDA0003956195410000091) to the traffic matrix at the current moment (Figure BDA0003956195410000092), assign the reward value r_{t′+1} at the next moment to the reward value r_{t′} at the current moment, and go to S3. In this way network self-learning is formed, which addresses the degradation of network routing performance caused by prolonged exploration and trial and error of the reinforcement learning agent during learning.
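Steps S3 to S7 form a closed loop between the agent and the GCT-Route model, which can be sketched as follows. Everything named here is hypothetical: `agent`, `model` and `reward_fn` are caller-supplied callables standing in for the reinforcement learning agent, the GCT-Route network model and the reward function. The initial reward of 0.0, the `max_steps` cap, and appending the predicted matrix to the traffic matrix sequence are assumptions that the source does not state.

```python
from collections import deque
from statistics import pstdev


def optimize_route(agent, model, reward_fn, topology, tm, tm_history,
                   q, s, epsilon, max_steps=1000):
    """Hedged sketch of S3-S7; returns the routing scheme P, or None."""
    reward = 0.0                                  # assumed initial r_{t'}
    rewards = deque(maxlen=s + 1)                 # reward sequence of S6
    history = deque(tm_history, maxlen=q + 1)     # current + q previous matrices
    for _ in range(max_steps):
        routing = agent(tm, reward)                               # S3
        link_perf, path_perf, next_tm = model(
            routing, tm, list(history), topology)                 # S4
        reward = reward_fn(link_perf, path_perf)                  # S5
        rewards.append(reward)
        if len(rewards) == rewards.maxlen and pstdev(rewards) < epsilon:
            return routing                        # S6: converged, issue P
        tm = next_tm                              # S7: prediction becomes current
        history.append(tm)
    return None                                   # no convergence within the cap
```

In this sketch the data plane is never probed during exploration: the agent only interacts with the model's predictions, and a routing scheme is returned for flow table generation only once the reward has stabilized.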
The system architecture for implementing the method is shown in FIG. 3 and comprises a data plane, a control plane and a knowledge plane. The data plane consists of SDN switches (network nodes) that forward network traffic data. The control plane consists of an SDN controller and is functionally divided into a network monitoring module and a flow table generation module: the network monitoring module collects network information data, and the flow table generation module generates SDN flow table entries according to the routing scheme issued by the upper layer and issues them to the SDN switches of the data plane. The knowledge plane consists of a reinforcement learning agent, the GCT-Route network model and a storage area: the storage area stores the training data set, and the reinforcement learning agent and the GCT-Route network model interact with each other to generate the network traffic scheduling policy.
It should be noted that the above embodiments of the present invention are illustrative and do not limit the present invention; the present invention is therefore not limited to the specific embodiments described above. Other embodiments obtained by those skilled in the art under the teaching of the present invention, without departing from its principles, are considered to be within the scope of protection of the present invention.

Claims (5)

1. An SDN route optimization method based on spatio-temporal feature fusion of a graph neural network, characterized by comprising the following steps:
step 1, the SDN control plane collects historical network information, namely the network topology, the routing scheme and the traffic matrix, to generate a training data set, and stores the training data set in a storage area of the SDN knowledge plane;
step 2, a spatio-temporal feature fusion network model based on a graph neural network is constructed on the SDN knowledge plane, and the spatio-temporal feature fusion network model based on the graph neural network is trained offline using the training data set;
step 3, the SDN control plane inputs the traffic matrix at the current moment (Figure FDA0003956195400000011) and the reward value r_{t′} at the current moment into the reinforcement learning agent of the SDN knowledge plane to generate an action value, and the action value is then input into the spatio-temporal feature fusion network model based on the graph neural network;
step 4, the spatio-temporal feature fusion network model based on the graph neural network of the SDN knowledge plane receives the action value as the routing scheme P, and the routing scheme P, the network topology M, the traffic matrix at the current moment (Figure FDA0003956195400000012), and the traffic matrix sequence consisting of the traffic matrices at the current moment and the q moments before it (Figure FDA0003956195400000013) are taken together as the input of the spatio-temporal feature fusion network model based on the graph neural network, which outputs the link-level performance matrix (Figure FDA0003956195400000014), the path-level performance matrix (Figure FDA0003956195400000015), and the predicted traffic matrix at the next moment (Figure FDA0003956195400000016); wherein q is a set value;
step 5, the reward function of the reinforcement learning agent of the SDN knowledge plane generates the reward value r_{t′+1} at the next moment according to the link-level performance matrix (Figure FDA0003956195400000017) and the path-level performance matrix (Figure FDA0003956195400000018);
step 6, judging whether the standard deviation of the reward value sequence {r_{t′+1-s}, …, r_{t′}, r_{t′+1}}, which consists of the reward values at the next moment and at the s moments before it, is less than a set convergence threshold, wherein s is a set value;
if so, the spatio-temporal feature fusion network model based on the graph neural network of the SDN knowledge plane issues the routing scheme P to the SDN control plane, the SDN control plane generates SDN flow table entries by combining the routing scheme P with the network topology M and issues them to each network node of the data plane, and the network nodes forward network traffic according to the SDN flow table entries, thereby realizing routing optimization of the data plane;
otherwise, let t′ = t′ + 1, assign the predicted traffic matrix at the next moment (Figure FDA0003956195400000019) to the traffic matrix at the current moment (Figure FDA00039561954000000110), assign the reward value r_{t′+1} at the next moment to the reward value r_{t′} at the current moment, and go to step 3.
2. The SDN route optimization method based on spatio-temporal feature fusion of a graph neural network as claimed in claim 1, wherein in step 1, the training data set D is:
Figure FDA00039561954000000111
wherein M = [m_ij]_{n×n} is the network topology, and m_ij represents the connection relationship between network node v_i and network node v_j: m_ij = 1 when network node v_i and network node v_j are connected, and m_ij = 0 when they are not connected; P = [p_ij]_{n×n} is the routing scheme, and p_ij represents the path from network node v_i to network node v_j; (Figure FDA00039561954000000112) is the traffic matrix at time t, and (Figure FDA00039561954000000113) represents the traffic from network node v_i to network node v_j at time t; i, j = 1, 2, …, n, and n is the number of network nodes.
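As an illustration of the network topology matrix M defined in claim 2, the following sketch builds M from a list of links. The function name `adjacency_matrix` is hypothetical, nodes are indexed from 0 rather than 1 for convenience, and links are assumed to be bidirectional, which the claim does not state.

```python
def adjacency_matrix(n, links):
    """Build M = [m_ij] with m_ij = 1 if nodes i and j are connected, else 0."""
    m = [[0] * n for _ in range(n)]
    for i, j in links:
        m[i][j] = 1
        m[j][i] = 1  # assumption: every link is bidirectional
    return m


# Three nodes in a line: v0 - v1 - v2
M = adjacency_matrix(3, [(0, 1), (1, 2)])
```

A routing scheme P can be held in the same n × n layout, with each entry p_ij storing the node sequence of the path from v_i to v_j.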
3. The SDN route optimization method based on spatio-temporal feature fusion of a graph neural network as claimed in claim 1 or 2, wherein in step 2, the spatio-temporal feature fusion network model based on the graph neural network comprises a network performance prediction module and a traffic matrix prediction module;
(1) in the network performance prediction module:
firstly, the minimum link capacity among all links of each path in the routing scheme is defined as the link feature and assigned to the initial link state vector; meanwhile, the traffic between all network nodes in the traffic matrix at time t is defined as the path feature and assigned to the initial path state vector;
then, the same message passing operation is repeated T times on the current link state vector and the current path state vector; during the T iterations, the links and the paths exchange hidden states with each other to obtain the final link hidden state and the final path hidden state; specifically:
message aggregation process: the current link state vector and the current path state vector are concatenated to obtain a link-and-path feature matrix, and the link-and-path feature matrix and the network topology are input into a two-layer GCN to obtain the output state of the two-layer GCN; the output state of the two-layer GCN is then fed into a three-layer GCN to obtain the output state of the three-layer GCN; the output state of the two-layer GCN and the output state of the three-layer GCN are then fed into a residual module to obtain the output state of the residual module, and the output state of the residual module is passed through a Softmax function to obtain the hidden state of the path;
path state update process: the current path state vector and the hidden state of the path are concatenated to obtain a path feature matrix; the path feature matrix is input into a GRU to obtain the updated path state vector;
link state update process: each link sums the hidden states of all paths that include the link to obtain the hidden state of the link; the current link state vector and the hidden state of the link are concatenated to obtain a link feature matrix; the link feature matrix is then input into a GRU to obtain the updated link state vector;
iterative process: the updated path state vector and the updated link state vector are used as the input of the next message passing operation, and the final link hidden state and the final path hidden state are obtained after T message passing operations; wherein T is a set value;
finally, the final link hidden state is passed through a fully connected layer to obtain the link-level performance matrix, and the final path hidden state is passed through a fully connected layer to obtain the path-level performance matrix;
(2) in the traffic matrix prediction module:
firstly, the traffic matrix sequence consisting of the traffic matrices at time t and the q moments before time t, together with the network topology, is input into a one-layer GCN to obtain hidden states carrying spatial feature information; wherein q is a set value;
then, the hidden states carrying spatial feature information are input into an LSTM, which computes hidden states covering the spatio-temporal features, so as to obtain a set of hidden states with spatio-temporal features;
next, the weight of each hidden state in the set of hidden states with spatio-temporal features is calculated based on an attention mechanism, and a context vector of the global network topology traffic variation information is obtained;
finally, the context vector is passed through a fully connected layer, and the predicted traffic matrix at the next moment is obtained using a ReLU function.
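The aggregation at the start of the link state update process in claim 3, where each link sums the hidden states of all paths that include it, can be sketched without any deep learning library. The name `aggregate_link_states` and the data layout (each path given as a list of link indices, each hidden state as a list of floats) are assumptions made for illustration. The subsequent concatenation and GRU update are not shown.

```python
def aggregate_link_states(paths, path_hidden, num_links):
    """Sum, for every link, the hidden states of all paths that traverse it.

    paths[k] lists the link indices used by path k; path_hidden[k] is the
    hidden state vector of path k.
    """
    dim = len(path_hidden[0]) if path_hidden else 0
    link_hidden = [[0.0] * dim for _ in range(num_links)]
    for path_links, hidden in zip(paths, path_hidden):
        for link in set(path_links):  # count each path once per link
            for k, value in enumerate(hidden):
                link_hidden[link][k] += value
    return link_hidden


paths = [[0, 1], [1, 2]]                # path 0 uses links 0,1; path 1 uses links 1,2
path_hidden = [[1.0, 2.0], [10.0, 20.0]]
link_hidden = aggregate_link_states(paths, path_hidden, num_links=3)
```

Link 1 is shared by both paths, so its hidden state is the element-wise sum of both path hidden states, which is how congestion on a shared link becomes visible to every path crossing it.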
4. The SDN route optimization method based on spatio-temporal feature fusion of a graph neural network as claimed in claim 1, wherein in step 4, the network topology M, the routing scheme P and the traffic matrix at the current moment (Figure FDA0003956195400000031) are input into the network performance prediction module of the spatio-temporal feature fusion network model based on the graph neural network, which outputs the link-level performance matrix (Figure FDA0003956195400000032) and the path-level performance matrix (Figure FDA0003956195400000033); the network topology M and the traffic matrix sequence consisting of the traffic matrices at the current moment and the q moments before it (Figure FDA0003956195400000034) are input into the traffic matrix prediction module of the spatio-temporal feature fusion network model based on the graph neural network, which outputs the predicted traffic matrix at the next moment (Figure FDA0003956195400000035).
5. The SDN route optimization method based on spatio-temporal feature fusion of a graph neural network as claimed in claim 1, wherein in step 5, the reward value r_{t′+1} of the reward function at the next moment is:
Figure FDA0003956195400000036
in the formula, α and β are set adjustable parameters;
Figure FDA0003956195400000037
(Figure FDA0003956195400000038) is the link-level performance matrix;
Figure FDA0003956195400000039
(Figure FDA00039561954000000310) is the path-level performance matrix, (Figure FDA00039561954000000311) is the mean of all (Figure FDA00039561954000000312), and n is the number of network nodes.
CN202211473921.0A 2022-11-22 2022-11-22 SDN route optimization method based on time-space feature fusion of graph neural network Pending CN115842768A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211473921.0A CN115842768A (en) 2022-11-22 2022-11-22 SDN route optimization method based on time-space feature fusion of graph neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211473921.0A CN115842768A (en) 2022-11-22 2022-11-22 SDN route optimization method based on time-space feature fusion of graph neural network

Publications (1)

Publication Number Publication Date
CN115842768A true CN115842768A (en) 2023-03-24

Family

ID=85575970

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211473921.0A Pending CN115842768A (en) 2022-11-22 2022-11-22 SDN route optimization method based on time-space feature fusion of graph neural network

Country Status (1)

Country Link
CN (1) CN115842768A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117395188A (en) * 2023-12-07 2024-01-12 南京信息工程大学 Deep reinforcement learning-based heaven-earth integrated load balancing routing method
CN117395188B (en) * 2023-12-07 2024-03-12 南京信息工程大学 Deep reinforcement learning-based heaven-earth integrated load balancing routing method
CN117454930A (en) * 2023-12-22 2024-01-26 苏州元脑智能科技有限公司 Method and device for outputting expression characteristic data aiming at graphic neural network
CN117454930B (en) * 2023-12-22 2024-04-05 苏州元脑智能科技有限公司 Method and device for outputting expression characteristic data aiming at graphic neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination