WO2024037136A1 - Graph structure feature-based routing optimization method and system - Google Patents

Graph structure feature-based routing optimization method and system Download PDF

Info

Publication number
WO2024037136A1
WO2024037136A1 (PCT/CN2023/098735)
Authority
WO
WIPO (PCT)
Prior art keywords
network
graph
target
policy
routing
Prior art date
Application number
PCT/CN2023/098735
Other languages
French (fr)
Chinese (zh)
Inventor
郭永安
吴庆鹏
张啸
佘昊
钱琪杰
Original Assignee
南京邮电大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 南京邮电大学
Publication of WO2024037136A1 publication Critical patent/WO2024037136A1/en

Links

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 45/00 Routing or path finding of packets in data switching networks
    • H04L 45/02 Topology update or discovery
    • H04L 45/08 Learning-based routing, e.g. using neural networks or artificial intelligence

Definitions

  • the invention relates to the field of computer network technology, and in particular to a routing optimization method and system based on graph structure characteristics.
  • the purpose of the present invention is to provide a routing optimization method and system based on graph structure characteristics that is suitable for SDN network environments in which switches or routing devices support traditional layer-2 network protocols, optimizes global routing overhead across multiple network attributes, adapts to dynamic and complex SDN networks, and guarantees SDN network performance.
  • the present invention designs a routing optimization method based on graph structure characteristics.
  • the following steps S1 to S3 are performed to obtain the routing overhead of each link in the target SDN network and adjust the weight of each link, completing routing optimization of the target SDN network.
  • Step S1 For the target SDN network, based on the southbound interface protocol, obtain the network topology diagram of the target SDN network, and construct a graph adjacency matrix according to the connection relationship between the nodes on each link of the target SDN network in the network topology diagram, respectively. For each node on each link of the target SDN network, construct the information feature vector of each node based on the link bandwidth, traffic, packet loss rate, and transmission delay of each node, and build the target SDN based on the information feature vector of each node Network information feature matrix of the network.
  • Step S2 Taking the graph adjacency matrix and the network information feature matrix as the state of the target SDN network, a graph learning neural network takes the graph adjacency matrix and the network information feature matrix as input and, through the deep graph learning method, outputs the routing policy and routing cost of the target SDN network in the current state.
  • based on the gradient backpropagation method, the network parameters of the graph learning neural network are updated, and after a preset number of iterations the graph learning neural network is trained to obtain a deep graph learning model that minimizes the routing overhead of the target SDN network and maximizes link utilization.
  • Step S3 Based on the trained deep graph learning model and the status of the target SDN network, obtain the routing policy that minimizes the routing cost of the target SDN network, deploy the routing policy to the target SDN network, and change the route weight of each link of the target SDN network according to the routing policy to complete routing optimization of the target SDN network.
  • the specific steps of step S1 are as follows:
  • Step S1.1 For the target SDN network, based on the southbound interface protocol, obtain the network topology of the target SDN network, where the network topology includes M routers and N links.
  • Step S1.2 Based on the network topology of the target SDN network, each router corresponds to a real node and each link corresponds to an edge; insert a virtual node on the edge corresponding to each link, thereby converting the network topology of the target SDN network into a graph of real nodes, virtual nodes, and edges.
  • V real = {v s1 , v s2 , ..., v sM } represents the set of real nodes, where v s1 , v s2 , ..., v sM are the M real nodes
  • V virtual = {v x1 , v x2 , ..., v xN } represents the set of N virtual nodes
  • e 1 , e 2 , ..., e 2N represent the 2N edges (inserting one virtual node splits each of the N links into two edges).
  • the elements a ij in the graph adjacency matrix A are as follows: a ij = 1 if node i and node j are connected by an edge, and a ij = 0 otherwise
  • B wi is the link bandwidth of node i
  • T hi is the traffic of node i
  • L pi is the packet loss rate of node i
  • D ti is the transmission delay of node i
  • the network information feature matrix H of the target SDN network is constructed by stacking the node feature vectors, H = [h 1 , h 2 , ..., h x ] T
  • h 1 , h 2 , ..., h i , ..., h x are the information feature vectors of each node, with h i = [B wi , T hi , L pi , D ti ] and x = M + N the total number of nodes.
  • for node i described in step S1.4: if node i is a virtual node, then the traffic T hi , packet loss rate L pi , and transmission delay D ti of node i are 0; if node i is a real node, the link bandwidth B wi of node i is 0.
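As a non-limiting sketch of the construction in steps S1.2 to S1.4, the following Python snippet builds the graph adjacency matrix A and the network information feature matrix H for a toy topology; the helper names (`build_graph`, `build_features`) and all numeric values are illustrative assumptions, not taken from the patent:

```python
# Illustrative sketch (not the patent's code): build the graph adjacency
# matrix A and the network information feature matrix H for a toy topology.
def build_graph(num_routers, links):
    """links: list of (router_i, router_j) pairs, 0-indexed.
    Inserting one virtual node per link yields x = M + N nodes and 2N edges."""
    M, N = num_routers, len(links)
    x = M + N
    A = [[0] * x for _ in range(x)]
    edges = []
    for k, (i, j) in enumerate(links):
        v = M + k                      # virtual node inserted on link k
        for a, b in ((i, v), (v, j)):  # the link becomes two edges
            A[a][b] = A[b][a] = 1
            edges.append((a, b))
    return A, edges

def build_features(num_routers, link_bandwidths, traffic, loss, delay):
    """Feature vector per node: [B_w, T_h, L_p, D_t].
    Virtual nodes carry only bandwidth; real nodes carry the other three."""
    rows = [[0.0, traffic[i], loss[i], delay[i]] for i in range(num_routers)]
    rows += [[bw, 0.0, 0.0, 0.0] for bw in link_bandwidths]
    return rows

# Toy example: 3 routers in a line, 2 links.
A, E = build_graph(3, [(0, 1), (1, 2)])
H = build_features(3, link_bandwidths=[100.0, 50.0],
                   traffic=[10.0, 25.0, 5.0],
                   loss=[0.01, 0.02, 0.0],
                   delay=[1.0, 2.0, 1.5])
```

Note that two routers are never directly adjacent in A; a virtual (link) node always sits between them, which is how link attributes enter the node feature matrix.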
  • the deep graph learning method in step S2 includes four graph learning neural networks and an experience pool.
  • the four graph learning neural networks are respectively an online graph policy network, an online graph value network, a target graph policy network, and a target graph value network; each of the four graph learning neural networks includes an input layer, two hidden layers, and an output layer.
  • the input layer of the online graph policy network and the target graph policy network takes the graph adjacency matrix A and the network information feature matrix H as input
  • the outputs of the online graph policy network and the target graph policy network are used as the inputs of the online graph value network and the target graph value network respectively.
  • the propagation formula from the input layer to the first hidden layer and between hidden layers is the same for each graph learning neural network; if the input layer is recorded as layer 0, the first hidden layer as layer 1, and the second hidden layer as layer 2, the propagation formula is as follows:
  • H l+1 = σ( D~ -1/2 (A + I) D~ -1/2 H l W l+1 )
  • σ( · ) means normalizing the formula inside the brackets
  • H l is the network information feature matrix of the l-th layer
  • W l+1 is the weight matrix of the (l+1)-th layer
  • H 0 = H
  • I is the x-order unit matrix; D~ is the degree matrix of A + I, whose diagonal elements are D~ ii = Σ j (A + I) ij .
  • W 1 is a 4 × 4 matrix
  • W 2 is a 4 × 1 matrix
  • the output layer is a fully connected layer
  • its output value is an x × 1 matrix
  • K is the weight matrix of the output layer of the online graph policy network and the target graph policy network
  • H 2 is the network information feature matrix of the second layer
  • W 1 and W 2 are both 1 × 1 matrices
  • the output layer is the aggregation layer
  • its output value is a 1 × 1 matrix, recorded as Value, specifically as follows:
  • Value = Q · Σ i H 2 i , where Q is the weight value of the output layer and H 2 i is the i-th value in the layer-2 network information feature matrix H 2 ; according to the routing policy Policy output by the online graph policy network, the routing cost of each link in the target SDN network is updated.
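The layer propagation and the two output heads described above can be sketched in pure Python as follows; this is a hedged illustration in which ReLU stands in for the normalization σ, and the tiny fixed weight matrices are assumptions chosen only to make the example run:

```python
# Minimal sketch of H_{l+1} = sigma(D~^-1/2 (A+I) D~^-1/2 H_l W_{l+1}).
# ReLU is used as a stand-in for sigma; weights are illustrative.
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def gcn_layer(A, H, W):
    x = len(A)
    A_hat = [[A[i][j] + (1 if i == j else 0) for j in range(x)] for i in range(x)]
    d = [sum(row) for row in A_hat]                    # degree of A + I
    # symmetric normalization D~^-1/2 (A+I) D~^-1/2
    S = [[A_hat[i][j] / math.sqrt(d[i] * d[j]) for j in range(x)] for i in range(x)]
    Z = matmul(matmul(S, H), W)
    return [[max(0.0, v) for v in row] for row in Z]   # stand-in nonlinearity

# Two nodes, 4 features; W1 is 4x4 and W2 is 4x1, as in the policy networks.
A = [[0, 1], [1, 0]]
H0 = [[1.0, 0.0, 2.0, 0.5], [0.0, 3.0, 1.0, 0.0]]
W1 = [[0.1 if i == j else 0.0 for j in range(4)] for i in range(4)]  # 0.1 * I
W2 = [[0.5], [0.5], [0.5], [0.5]]
H1 = gcn_layer(A, H0, W1)
H2 = gcn_layer(A, H1, W2)   # x-by-1 output, one value per node
```

The policy networks would then pass the x × 1 result through their fully connected output layer, while the value networks would aggregate it into a single scalar Value.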
  • the specific steps of step S2 are as follows:
  • Step S2.1 Initialize the weight matrices of the online graph policy network, online graph value network, target graph policy network, and target graph value network.
  • the weight matrix of the online graph policy network is W ⁇ and the weight matrix of the online graph value network is W ⁇ ′
  • the weight matrix of the target graph policy network is W ⁇
  • the weight matrix of the target graph value network is W ⁇ ′ .
  • Step S2.2 Initialize the experience pool. The specific steps are as follows:
  • Step S2.2.2 Define the outputs of the output layers of the online graph policy network, target graph policy network, online graph value network, and target graph value network at time t respectively; calculate the routing policy output by the online graph policy network according to the following formula
  • U(B w , Th , L p , D t ) is the link utilization rate
  • B w , T h , L p , and D t are respectively the link bandwidth, traffic, packet loss rate, and transmission delay of the target SDN network; K f is the proportional coefficient
  • the objective function for maximizing the link utilization of the target SDN network is constructed as U max (B w , T h , L p , D t ).
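The exact expression of U(B w , T h , L p , D t ) is not reproduced in this text extract; purely for illustration, one plausible shape that rewards throughput relative to bandwidth and penalizes loss and delay, scaled by the proportional coefficient K f , might look like the following (a hypothetical stand-in, not the patent's formula):

```python
# Hypothetical illustration only: the exact utilization formula is elided
# in this extract. This toy version combines throughput over bandwidth,
# penalized by loss and delay, scaled by the coefficient K_f.
def utilization(B_w, T_h, L_p, D_t, K_f=1.0):
    if B_w <= 0 or D_t <= 0:
        return 0.0
    return K_f * (T_h / B_w) * (1.0 - L_p) / D_t

u = utilization(B_w=100.0, T_h=40.0, L_p=0.05, D_t=2.0)
```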
  • Step S2.2.3 Define the experience pool R as follows:
  • s t+1 represents the status of the target SDN network at time t+1, that is, the status of the target SDN network obtained after the routing policy output by the online graph policy network is applied.
  • Step S2.3 For the target SDN network, perform a preset number of iterations, where the preset number of iterations is T.
  • the specific steps are as follows:
  • Step S2.3.2 Based on the status s t of the target SDN network at time t, the online graph policy network outputs the routing policy.
  • this process is recorded as π(s t | θ), where θ is the network parameter of the online graph policy network;
  • Step S2.3.3 According to the routing policy, update the routing cost of each link in the target SDN network;
  • Step S2.3.4 Obtain the target SDN network state s t+1 updated by the routing policy, and obtain the environmental feedback f t at the same time;
  • Step S2.3.5 Store (s t , π(s t | θ), f t , s t+1 ) in the experience pool R as a set of historical records;
  • Step S2.3.6 Randomly select Y groups of historical records from the experience pool R, where the subscript m denotes any set of historical records in the experience pool R;
  • Step S2.3.7 Based on the historical records extracted in step S2.3.6, calculate the corresponding output of the target graph value network, as follows:
  • ⁇ ′ is the network parameter of the target graph policy network
  • ⁇ ′ is the network parameter of the target graph value network
  • γ is the discount factor, a constant with γ ∈ (0,1);
  • Step S2.3.8 Calculate the loss Loss ogvn of the online graph value network output value according to the following formula:
  • this term denotes the output of the online graph value network with network parameter ω, in the state s m of the target SDN network, when the routing policy output by the online graph policy network is π(s m | θ);
  • Step S2.3.9 According to the loss Loss ogvn of the output value of the online graph value network, based on the gradient backpropagation method, update the network parameters ω of the online graph value network;
  • Step S2.3.10 Calculate the gradient value; according to this gradient value and based on the gradient backpropagation method, update the network parameters θ of the online graph policy network, where ∇( · ) indicates taking the gradient of the formula in parentheses;
  • this coefficient is a constant, and lies in (0,1);
  • Step S2.3.12 Repeat steps S2.3.2 to S2.3.11 until the number of iterations reaches the preset number T, obtaining the routing policy that minimizes the routing cost of the target SDN network.
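The iteration loop of steps S2.3.2 to S2.3.11 follows a familiar actor-critic pattern with an experience pool. The sketch below is a toy, runnable outline in which the four networks, the environment, the reward shape, and all hyperparameters are stand-in assumptions, not the patent's actual models:

```python
# Schematic of one training iteration (steps S2.3.2-S2.3.11), with toy
# scalar stand-ins for the four networks and the environment.
import random

random.seed(0)
replay = []                        # experience pool R
theta, omega = 0.0, 0.0            # online policy / value "parameters"
theta_t, omega_t = theta, omega    # target network copies
gamma, tau, Y = 0.9, 0.05, 4       # discount, soft-update rate, batch size

def policy(s, p):   return p + 0.1 * s            # stand-in for pi(s | theta)
def value(s, a, w): return w + s + a              # stand-in for the value net
def env_step(s, a): return s + 0.5, 1.0 - abs(a)  # next state, feedback f_t

s = 0.0
for t in range(50):
    a = policy(s, theta)                       # S2.3.2: output routing policy
    s_next, f = env_step(s, a)                 # S2.3.3-S2.3.4: apply, observe
    replay.append((s, a, f, s_next))           # S2.3.5: store record
    if len(replay) >= Y:
        batch = random.sample(replay, Y)       # S2.3.6: sample Y records
        for sm, am, fm, sm1 in batch:
            target = fm + gamma * value(sm1, policy(sm1, theta_t), omega_t)
            td = target - value(sm, am, omega)  # S2.3.7-S2.3.8: loss term
            omega += 0.01 * td                  # S2.3.9: value update (toy)
            theta += 0.01 * td * sm             # S2.3.10: policy update (toy)
        theta_t = tau * theta + (1 - tau) * theta_t  # S2.3.11: target tracking
        omega_t = tau * omega + (1 - tau) * omega_t
    s = s_next
```

In the patent's setting the scalar stand-ins would be the graph learning neural networks of Figure 3, and the environment step would be the deployed routing policy acting on the target SDN network.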
  • the specific steps of step S3 are as follows:
  • Step S31 Obtain the graph adjacency matrix A and network information feature matrix H of the target SDN network;
  • Step S32 Based on the trained deep graph learning model and according to the status [A, H] of the target SDN network, obtain the routing strategy that minimizes the routing cost of the target SDN network;
  • Step S33 Deploy to the target SDN network according to the routing policy obtained in step S32, and change the link weights of the target SDN network according to the routing policy;
  • Step S34 During the traffic transmission process, the updated weight of each link is used for traffic transmission according to the shortest path scheme.
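Step S34's shortest-path forwarding over the updated link weights can be illustrated with a minimal Dijkstra routine; the topology and the weight values below are invented purely for the example:

```python
# After deployment (step S33), traffic follows shortest paths under the
# updated link weights (step S34). Minimal Dijkstra sketch with toy values.
import heapq

def shortest_path(weights, src, dst):
    """weights: {node: {neighbor: link_weight}} after the optimizer's updates."""
    dist, prev = {src: 0.0}, {}
    heap = [(0.0, src)]
    seen = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in seen:
            continue
        seen.add(u)
        if u == dst:
            break
        for v, w in weights.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path, node = [], dst
    while node != src:
        path.append(node)
        node = prev[node]
    return [src] + path[::-1], dist[dst]

# Toy network: the optimizer has raised the weight of edge A-C,
# steering traffic through B instead.
W = {"A": {"B": 1.0, "C": 5.0}, "B": {"C": 1.0}, "C": {}}
path, cost = shortest_path(W, "A", "C")
```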
  • the present invention also provides a system for the routing optimization method based on graph structure characteristics.
  • the target SDN network includes a control plane and a data plane, where the control plane includes an information acquisition module, a policy deployment module, and a DGL module, such that the system implements the routing optimization method based on graph structure characteristics.
  • Each link and node of the target SDN network is deployed on the data plane.
  • the information acquisition module on the control plane is used to obtain the network topology diagram of the target SDN network, generate a graph adjacency matrix and a network information feature matrix, and send them to the DGL module.
  • the DGL module is based on the graph learning neural network: it takes the graph adjacency matrix and the network information feature matrix as inputs and, through the deep graph learning method, outputs the routing cost of the target SDN network in the current state; based on the gradient backpropagation method, it updates the network parameters of the graph learning neural network, and after a preset number of iterations the graph learning neural network is trained to obtain a deep graph learning model that minimizes the routing overhead of the target SDN network and maximizes link utilization.
  • the policy deployment module on the control plane is used to obtain, from the trained deep graph learning model produced by the DGL module and the status of the target SDN network, the routing policy that minimizes the routing cost of the target SDN network, and to send the routing policy and the routing overhead of the target SDN network to the data plane.
  • the advantages of the present invention include:
  • the deep graph learning model has strong generalization ability.
  • the trained deep graph learning model is still effective when the network topology changes, and can adapt to large-scale dynamic and complex networks.
  • Figure 1 is an overall block diagram of the system for the routing optimization method based on graph structure features, provided according to an embodiment of the present invention
  • Figure 2 is a DGL algorithm framework diagram provided according to an embodiment of the present invention.
  • Figure 3 is a structural diagram of a graph learning neural network provided according to an embodiment of the present invention.
  • the embodiment of the present invention provides a routing optimization method based on graph structure characteristics. For the target SDN network, the following steps S1 to S3 are performed to obtain the routing overhead of each link in the target SDN network, adjust the weight of each link, and complete the routing optimization of the target SDN network.
  • Step S1 Referring to Figure 1, for the target SDN network, obtain the network topology diagram of the target SDN network based on the southbound interface protocol, and construct a graph adjacency matrix based on the connection relationships between the nodes on each link of the target SDN network in the network topology diagram.
  • for each node on each link of the target SDN network, construct the information feature vector of each node based on its link bandwidth, traffic, packet loss rate, and transmission delay, and based on the information feature vectors of the nodes, construct the network information feature matrix of the target SDN network.
  • the specific steps of step S1 are as follows:
  • Step S1.1 For the target SDN network, based on the southbound interface protocol, obtain the network topology of the target SDN network, where the network topology includes M routers and N links.
  • Step S1.2 Based on the network topology of the target SDN network, each router corresponds to a real node and each link corresponds to an edge; insert a virtual node on the edge corresponding to each link, thereby converting the network topology of the target SDN network into a graph of real nodes, virtual nodes, and edges.
  • V real = {v s1 , v s2 , ..., v sM } represents the set of real nodes, where v s1 , v s2 , ..., v sM are the M real nodes
  • V virtual = {v x1 , v x2 , ..., v xN } represents the set of N virtual nodes
  • e 1 , e 2 , ..., e 2N represent the 2N edges (inserting one virtual node splits each of the N links into two edges).
  • the elements a ij in the graph adjacency matrix A are as follows: a ij = 1 if node i and node j are connected by an edge, and a ij = 0 otherwise
  • B wi is the link bandwidth of node i
  • T hi is the traffic of node i
  • L pi is the packet loss rate of node i
  • D ti is the transmission delay of node i.
  • for the node i: if the node i is a virtual node, the traffic T hi , the packet loss rate L pi , and the transmission delay D ti of the node i are 0; if the node i is a real node, the link bandwidth B wi of the node i is 0.
  • the network information feature matrix H of the target SDN network is constructed by stacking the node feature vectors, H = [h 1 , h 2 , ..., h x ] T
  • h 1 , h 2 , ..., h i , ..., h x are the information feature vectors of each node, with h i = [B wi , T hi , L pi , D ti ] and x = M + N the total number of nodes.
  • Step S2 Take the graph adjacency matrix and the network information feature matrix as the target SDN network state. Based on the graph learning neural network, with the graph adjacency matrix and the network information feature matrix as input, the deep graph learning (Deep Graph Learning, DGL) method outputs the routing strategy and routing cost of the target SDN network in the current state. Based on the gradient backpropagation method, the network parameters of the graph learning neural network are updated, and after a preset number of iterations the graph learning neural network is trained to obtain a deep graph learning model that minimizes routing overhead and maximizes link utilization in the target SDN network.
  • the deep graph learning method described in step S2 includes four graph learning neural networks and an experience pool.
  • the four graph learning neural networks are the Online Graph Strategy Network (OGSN), the Online Graph Value Network (OGVN), the Target Graph Strategy Network (TGSN), and the Target Graph Value Network (TGVN); referring to Figure 3, each of the four graph learning neural networks includes an input layer, two hidden layers, and an output layer.
  • the input layer of the online graph policy network and the target graph policy network takes the graph adjacency matrix A and the network information feature matrix H as inputs, and the outputs of the online graph policy network and the target graph policy network serve as the inputs of the online graph value network and the target graph value network respectively.
  • the propagation formulas from the input layer to the hidden layer and between hidden layers of each graph learning neural network are the same.
  • the input layer is recorded as layer 0
  • the first hidden layer is recorded as layer 1
  • the second hidden layer is recorded as layer 2
  • the propagation formula is as follows:
  • H l+1 = σ( D~ -1/2 (A + I) D~ -1/2 H l W l+1 )
  • σ( · ) means normalizing the formula inside the brackets
  • H l is the network information feature matrix of the l-th layer
  • W l+1 is the weight matrix of the (l+1)-th layer
  • H 0 = H
  • I is the x-order unit matrix; D~ is the degree matrix of A + I, whose diagonal elements are D~ ii = Σ j (A + I) ij .
  • W 1 is a 4 × 4 matrix
  • W 2 is a 4 × 1 matrix
  • the output layer is a fully connected layer
  • its output value is an x × 1 matrix
  • K is the weight matrix of the output layer of the online graph policy network and the target graph policy network
  • H 2 is the network information feature matrix of the second layer.
  • W 1 and W 2 are both 1 × 1 matrices
  • the output layer is the aggregation layer
  • its output value is a 1 × 1 matrix, recorded as Value, as follows:
  • Value = Q · Σ i H 2 i , where Q is the weight value of the output layer and H 2 i is the i-th value in the layer-2 network information feature matrix H 2 ; according to the routing policy Policy output by the online graph policy network, the routing cost of each link in the target SDN network is updated.
  • the specific steps of step S2 are as follows:
  • Step S2.1 Initialize the weight matrices of the online graph policy network, online graph value network, target graph policy network, and target graph value network.
  • the weight matrix of the online graph policy network is W ⁇ and the weight matrix of the online graph value network is W ⁇ ′
  • the weight matrix of the goal graph policy network is W ⁇
  • the weight matrix of the goal graph value network is W ⁇ ′ .
  • the network parameters of the online graph policy network and the target graph policy network are initialized to be consistent, and the network parameters of the online graph value network and the target graph value network are initialized to be consistent.
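After this consistent initialization, the target networks typically track the online networks rather than being retrained from scratch; a common mechanism is a soft update. The patent's exact update rule is not shown in this extract, so the rule below is an assumption for illustration only:

```python
# Assumed soft update: blend each target weight toward its online
# counterpart by a small factor tau (illustrative, not the patent's rule).
def soft_update(online, target, tau):
    return [tau * o + (1 - tau) * t for o, t in zip(online, target)]

W_online = [0.8, -0.2, 0.5]   # online network weights after some training
W_target = [0.1, 0.1, 0.1]    # target network weights drifting behind
W_target = soft_update(W_online, W_target, tau=0.1)
```

A small tau keeps the target networks slowly moving, which stabilizes the value targets computed in step S2.3.7.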
  • Step S2.2 Initialize the experience pool. The specific steps are as follows:
  • Step S2.2.2 Define the outputs of the output layers of the online graph policy network, target graph policy network, online graph value network, and target graph value network at time t respectively; calculate the routing policy output by the online graph policy network according to the following formula
  • U(B w , Th , L p , D t ) is the link utilization rate
  • B w , T h , L p , and D t are respectively the link bandwidth, traffic, packet loss rate, and transmission delay of the target SDN network; K f is the proportional coefficient.
  • the objective function for maximizing the link utilization of the target SDN network is constructed as U max (B w , T h , L p , D t ).
  • Step S2.2.3 Define the experience pool R as follows:
  • s t+1 represents the status of the target SDN network at time t+1, that is, the status of the target SDN network obtained after the routing policy output by the online graph policy network is applied.
  • Step S2.3 For the target SDN network, perform a preset number of iterations, where the preset number of iterations is T.
  • the specific steps are as follows:
  • Step S2.3.2 Based on the status s t of the target SDN network at time t, the online graph policy network outputs the routing policy.
  • this process is recorded as π(s t | θ), where θ is the network parameter of the online graph policy network;
  • Step S2.3.3 According to the routing policy, update the routing cost of each link in the target SDN network;
  • Step S2.3.4 Obtain the target SDN network state s t+1 updated by the routing policy, and obtain the environmental feedback f t at the same time;
  • Step S2.3.5 Store (s t , π(s t | θ), f t , s t+1 ) in the experience pool R as a set of historical records;
  • Step S2.3.6 Randomly select Y groups of historical records from the experience pool R, where the subscript m denotes any set of historical records in the experience pool R;
  • Step S2.3.7 Based on the historical records extracted in step S2.3.6, calculate the corresponding output of the target graph value network, as follows:
  • ⁇ ′ is the network parameter of the target graph policy network
  • ⁇ ′ is the network parameter of the target graph value network
  • γ is the discount factor, a constant with γ ∈ (0,1).
  • Step S2.3.8 Calculate the loss Loss ogvn of the online graph value network output value according to the following formula:
  • this term denotes the output of the online graph value network with network parameter ω, in the state s m of the target SDN network, when the routing policy output by the online graph policy network is π(s m | θ);
  • Step S2.3.9 According to the loss Loss ogvn of the output value of the online graph value network, based on the gradient backpropagation method, update the network parameters ω of the online graph value network.
  • Step S2.3.10 Calculate the gradient value; according to this gradient value and based on the gradient backpropagation method, update the network parameters θ of the online graph policy network, where ∇( · ) indicates taking the gradient of the formula in parentheses.
  • this coefficient is a constant, and lies in (0,1).
  • Step S2.3.12 Repeat S2.3.2 to Step S2.3.11 until the number of iterations reaches the preset number T, and obtain the routing strategy that minimizes the routing cost of the target SDN network.
  • Step S3 Based on the trained deep graph learning model and the status of the target SDN network, obtain the routing policy that minimizes the routing cost of the target SDN network, deploy the routing policy to the target SDN network, and change the route weight of each link of the target SDN network according to the routing policy to complete routing optimization of the target SDN network.
  • the specific steps of step S3 are as follows:
  • Step S31 Obtain the graph adjacency matrix A and network information feature matrix H of the target SDN network;
  • Step S32 Based on the trained deep graph learning model and according to the status [A, H] of the target SDN network, obtain the routing strategy that minimizes the routing cost of the target SDN network;
  • Step S33 Deploy to the target SDN network according to the routing policy obtained in step S32, and change the link weights of the target SDN network according to the routing policy;
  • Step S34 During the traffic transmission process, the updated weight of each link is used for traffic transmission according to the shortest path scheme.
  • Embodiments of the present invention also provide a system for routing optimization methods based on graph structure characteristics.
  • the target SDN network includes a control plane and a data plane, where the control plane includes an information acquisition module, a policy deployment module, and a DGL module, such that the system implements the routing optimization method based on graph structure characteristics.
  • Each link and node of the target SDN network is deployed on the data plane.
  • the information acquisition module on the control plane is used to obtain the network topology diagram of the target SDN network, generate a graph adjacency matrix and a network information feature matrix, and send them to the DGL module.
  • the DGL module is based on the graph learning neural network: it takes the graph adjacency matrix and the network information feature matrix as inputs and, through the deep graph learning method, outputs the routing cost of the target SDN network in the current state; based on the gradient backpropagation method, it updates the network parameters of the graph learning neural network, and after a preset number of iterations the graph learning neural network is trained to obtain a deep graph learning model that minimizes the routing overhead of the target SDN network and maximizes link utilization.
  • the policy deployment module on the control plane is used to obtain, from the trained deep graph learning model produced by the DGL module and the status of the target SDN network, the routing policy that minimizes the routing cost of the target SDN network, and to send the routing policy and the routing overhead of the target SDN network to the data plane.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

Disclosed in the present invention are a graph structure feature-based routing optimization method and system. The system is used for an SDN environment and comprises a control plane and a data plane, wherein the control plane comprises an information acquisition module, a strategy deployment module, and a deep graph learning (DGL) module. The method comprises: acquiring network topology structure information and information in a network, and generating a corresponding graph adjacency matrix and a network information feature matrix; training a graph learning neural network according to the graph adjacency matrix and the network information feature matrix, to obtain a DGL model allowing for the minimum SDN routing overhead and the maximum link utilization rate; and using the DGL model and deploying same to an SDN. According to the method, a dynamic and complex network topology is learned from a spatial dimension, the difficulty in optimizing a dynamic topology is overcome, and a better routing scheme is provided for the SDN.

Description

一种基于图结构特征的路由优化方法与系统A route optimization method and system based on graph structure characteristics 技术领域Technical field
本发明涉及计算机网络技术领域,具体涉及一种基于图结构特征的路由优化方法与系统。The invention relates to the field of computer network technology, and in particular to a routing optimization method and system based on graph structure characteristics.
背景技术Background technique
近年来,随着网络环境的复杂化,业务流量的多样化,路由路径优化问题成为一个研究热点。在传统网络中,路由选择采用尽力而为(Best-Effort)模型,利用OSPF技术来提供最短路径,无法适应动态、复杂的网络环境。软件定义网络(Software Defined Network,SDN)架构的提出将传统网络的控制平面和数据平面进行解耦,大大增加了路由优化问题解决方案的空间。在SDN环境下,深度强化学习与神经网络的结合能够为路由决策提供极大地帮助。但CNN、RNN、LSTM等算法本质上适用于欧式空间,例如图像、网格等。网络拓扑通常是一个复杂的模型,链路与链路、节点与节点之间有很强的空间相关性,传统神经网络很难将这一特征表现出来,并且基于深度强化学习的路由优化模型在网络拓扑发生变化时需要重新训练,不具有对动态拓扑的泛化能力。因此,需要有一种方法能够对网络拓扑的空间特征进行提取,从空间维度上学习动态、复杂的网络拓扑,并且能够克服动态拓扑的优化问题,提供更加优质的路由方案。In recent years, with the complexity of the network environment and the diversification of business traffic, the routing path optimization problem has become a research hotspot. In traditional networks, routing selection adopts the best-effort model and uses OSPF technology to provide the shortest path, which cannot adapt to dynamic and complex network environments. The proposal of Software Defined Network (SDN) architecture decouples the control plane and data plane of traditional networks, greatly increasing the space for solutions to routing optimization problems. In an SDN environment, the combination of deep reinforcement learning and neural networks can greatly help routing decisions. However, algorithms such as CNN, RNN, and LSTM are essentially suitable for Euclidean spaces, such as images, grids, etc. Network topology is usually a complex model with strong spatial correlation between links and nodes. It is difficult for traditional neural networks to express this feature, and routing optimization models based on deep reinforcement learning are When the network topology changes, it needs to be retrained, and it does not have the ability to generalize to dynamic topology. Therefore, there is a need for a method that can extract the spatial characteristics of network topology, learn dynamic and complex network topology from the spatial dimension, and be able to overcome the optimization problems of dynamic topology and provide better routing solutions.
发明内容Contents of the invention
本发明目的:在于提供一种基于图结构特征的路由优化方法与系统,适用于SDN网络环境下,交换机或路由设备支持传统的二层网络协议,实现从多个网络属性上优化全局的路由开销,适应动态、复杂的SDN网络,保障SDN网络性能。The purpose of the present invention is to provide a routing optimization method and system based on graph structure characteristics, which is suitable for SDN network environments. Switches or routing devices support traditional layer 2 network protocols to optimize global routing overhead from multiple network attributes. , adapt to dynamic and complex SDN networks and ensure SDN network performance.
为实现以上功能,本发明设计一种基于图结构特征的路由优化方法,针对目标SDN网络,执行以下步骤S1-步骤S3,获得目标SDN网络中各条链路的路由开销,调整各条链路的权重,完成目标SDN网络的路由优化。In order to realize the above functions, the present invention designs a routing optimization method based on graph structure characteristics. For the target SDN network, the following steps S1 to S3 are performed to obtain the routing overhead of each link in the target SDN network and adjust each link. weight to complete routing optimization of the target SDN network.
Step S1: For the target SDN network, obtain its network topology graph via the southbound interface protocol; construct a graph adjacency matrix from the connection relationships between the nodes on each link of the target SDN network in the topology graph; for each node on each link, construct the node's information feature vector from its link bandwidth, traffic, packet loss rate, and transmission delay; and build the network information feature matrix of the target SDN network from the information feature vectors of all nodes.
Step S2: Take the graph adjacency matrix and the network information feature matrix as the state of the target SDN network. Based on graph learning neural networks, with the graph adjacency matrix and the network information feature matrix as input, the deep graph learning method outputs the routing policy and routing cost of the target SDN network in the current state. The network parameters of the graph learning neural networks are updated by gradient back-propagation and, after a preset number of iterations, the networks are trained to obtain a deep graph learning model that minimizes the routing cost and maximizes the link utilization of the target SDN network.
Step S3: Using the trained deep graph learning model and the state of the target SDN network, obtain the routing policy that minimizes the routing cost of the target SDN network, deploy the policy to the target SDN network, and change the link weights of the target SDN network according to the policy, thereby completing routing optimization of the target SDN network.
In a preferred technical solution of the present invention, step S1 comprises the following specific steps:
Step S1.1: For the target SDN network, obtain its network topology via the southbound interface protocol, the topology comprising M routers and N links.
Step S1.2: In the network topology of the target SDN network, each router corresponds to a real node and each link corresponds to an edge. A virtual node is inserted on the edge corresponding to each link, so that the topology is represented as a network topology graph G(V, E) with M real nodes, N virtual nodes, and 2N edges, where V is the node set and E is the edge set:

V = {V_real, V_virtual}

where V_real is the set of real nodes and V_virtual is the set of virtual nodes;

V_real = {v_s1, v_s2, ..., v_sM}

where v_s1, v_s2, ..., v_sM are the M real nodes;

V_virtual = {v_x1, v_x2, ..., v_xN}

where v_x1, v_x2, ..., v_xN are the N virtual nodes;

E = {e_1, e_2, ..., e_2N}

where e_1, e_2, ..., e_2N are the 2N edges.
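The virtual-node construction of step S1.2 can be sketched as follows. This is an illustrative sketch, not taken from the patent text; the function name, the 0-indexed node numbering, and the `(router_a, router_b)` link representation are assumptions.

```python
# Sketch (not from the patent text): represent the topology of step S1.2 as
# M real nodes plus one virtual node per link, giving 2N edges.
def build_topology(num_routers, links):
    """links: list of (router_a, router_b) pairs, 0-indexed.
    Returns (real_nodes, virtual_nodes, edges); each link (a, b) becomes
    two edges a--x and x--b through its virtual node x."""
    real_nodes = list(range(num_routers))          # v_s1 ... v_sM
    virtual_nodes = []                             # v_x1 ... v_xN
    edges = []
    for k, (a, b) in enumerate(links):
        x = num_routers + k                        # index of the k-th virtual node
        virtual_nodes.append(x)
        edges.append((a, x))
        edges.append((x, b))
    return real_nodes, virtual_nodes, edges
```

For a 3-router chain with links (0, 1) and (1, 2), this yields virtual nodes 3 and 4 and the four edges 0-3, 3-1, 1-4, 4-2, matching the M + N node count and 2N edge count above.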
Step S1.3: Let x = M + N, where x is the total number of nodes (M real nodes and N virtual nodes). Based on the network topology graph of the target SDN network, construct the x-order graph adjacency matrix A = (a_ij), i, j = 1, ..., x, where the element a_ij is

a_ij = 1 if node i and node j are connected by an edge, and a_ij = 0 otherwise.
Step S1.4: For any node i of the target SDN network, construct the information feature vector h_i of node i from its link bandwidth, traffic, packet loss rate, and transmission delay:

h_i = [B_wi, T_hi, L_pi, D_ti]

where B_wi is the link bandwidth of node i, T_hi is the traffic of node i, L_pi is the packet loss rate of node i, and D_ti is the transmission delay of node i.

Based on the information feature vectors of all nodes, the network information feature matrix H of the target SDN network is constructed as the x×4 matrix whose rows are the node feature vectors:

H = [h_1; h_2; ...; h_i; ...; h_x]

where h_1, h_2, ..., h_i, ..., h_x are the information feature vectors of the nodes.
In a preferred technical solution of the present invention, for the node i described in step S1.4: if node i is a virtual node, its traffic T_hi, packet loss rate L_pi, and transmission delay D_ti are 0; if node i is a real node, its link bandwidth B_wi is 0.
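Steps S1.3 and S1.4, together with the real/virtual zeroing rule above, can be sketched as a state constructor. The `feats` dictionary layout and function name are illustrative assumptions; only the A/H structure and the zeroing rule come from the text.

```python
import numpy as np

# Sketch of steps S1.3-S1.4: x-order adjacency matrix A and x-by-4 feature
# matrix H with rows [Bw, Th, Lp, Dt]; virtual nodes carry only bandwidth,
# real nodes only traffic/loss/delay, per the rule stated above.
def build_state(num_real, num_virtual, edges, feats):
    """feats: dict node -> (Bw, Th, Lp, Dt) raw measurements."""
    x = num_real + num_virtual
    A = np.zeros((x, x))
    for i, j in edges:               # a_ij = 1 iff nodes i and j share an edge
        A[i, j] = A[j, i] = 1.0
    H = np.zeros((x, 4))
    for n, (bw, th, lp, dt) in feats.items():
        if n < num_real:             # real node: link bandwidth forced to 0
            H[n] = [0.0, th, lp, dt]
        else:                        # virtual node: only bandwidth is kept
            H[n] = [bw, 0.0, 0.0, 0.0]
    return A, H
```

The pair (A, H) is exactly the state S = [A, H] consumed by the networks of step S2.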
In a preferred technical solution of the present invention, the deep graph learning method in step S2 comprises four graph learning neural networks and one experience pool. The four networks are the online graph policy network, the online graph value network, the target graph policy network, and the target graph value network; each comprises one input layer, two hidden layers, and one output layer.
The input layers of the online graph policy network and the target graph policy network take the graph adjacency matrix A and the network information feature matrix H as input, and the outputs of the online and target graph policy networks serve as the inputs of the online and target graph value networks, respectively. In each graph learning neural network the propagation formula from the input layer to the hidden layers, and between the hidden layers, is the same. Denoting the input layer as layer 0, the first hidden layer as layer 1, and the second hidden layer as layer 2, the propagation formula is

H^(l+1) = σ( D̃^(−1/2) Ã D̃^(−1/2) H^l W^(l+1) )

where σ(·) normalizes the expression in parentheses, H^l is the network information feature matrix of layer l, W^(l+1) is the weight matrix of layer l+1, H^0 = H, Ã = A + I with I the x-order identity matrix, and D̃ is the degree matrix of Ã, given by

D̃_ii = Σ_j ã_ij

where ã_ij is the element of Ã.

In the online graph policy network and the target graph policy network, W^1 is a 4×4 matrix and W^2 is a 4×1 matrix; the output layer is a fully connected layer whose output is an x×1 matrix, denoted the routing policy Policy:

Policy = H^2 × K

where K is the weight matrix of the output layer of the online and target graph policy networks and H^2 is the network information feature matrix of layer 2. In the online graph value network and the target graph value network, W^1 and W^2 are both 1×1 matrices; the output layer is an aggregation layer whose output is a 1×1 matrix, denoted Value:

Value = Q × Σ_{i=1..x} h_i^(2)

where Q is the weight value of the output layer and h_i^(2) is the i-th value in the layer-2 network information feature matrix H^2. According to the routing policy Policy output by the online graph policy network, the routing cost of each link in the target SDN network is updated.
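The layer propagation rule can be sketched numerically. The patent only states that σ(·) "normalizes" the bracketed expression, so the max-abs normalization used here is an assumption standing in for the unspecified σ; the rest follows the D̃^(−1/2) Ã D̃^(−1/2) H W form above.

```python
import numpy as np

# Sketch of the propagation rule H^(l+1) = sigma(D~^-1/2 (A+I) D~^-1/2 H^l W),
# with D~ the row-sum degree matrix of A+I. sigma is an assumed normalization
# stand-in, since the patent does not define it concretely.
def gcn_layer(A, H, W):
    x = A.shape[0]
    A_tilde = A + np.eye(x)                       # add self-loops: A~ = A + I
    d = A_tilde.sum(axis=1)                       # D~_ii = sum_j a~_ij
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    Z = D_inv_sqrt @ A_tilde @ D_inv_sqrt @ H @ W
    return Z / (np.abs(Z).max() + 1e-8)           # assumed normalization sigma
```

Stacking two such layers with W^1 of shape 4×4 and W^2 of shape 4×1 turns the x×4 feature matrix H into the x×1 matrix H^2 from which Policy = H^2 × K is read out.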
In a preferred technical solution of the present invention, step S2 comprises the following specific steps:
Step S2.1: Initialize the weight matrices of the online graph policy network, the online graph value network, the target graph policy network, and the target graph value network, where the weight matrix of the online graph policy network is W_θ, that of the online graph value network is W_ω, that of the target graph policy network is W_θ′, and that of the target graph value network is W_ω′.
Step S2.2: Initialize the experience pool, as follows:
Step S2.2.1: Take the graph adjacency matrix A and the network information feature matrix H as the state S of the target SDN network, defining S = [A, H]. Let s_t denote the state of the target SDN network at time t, s_t = [A_t, H_t], where A_t is the graph adjacency matrix and H_t the network information feature matrix of the target SDN network at time t.
Step S2.2.2: Define π(s_t|θ), π′(s_t|θ′), Q(s_t, π(s_t|θ)|ω), and Q′(s_t, π′(s_t|θ′)|ω′) as the outputs of the output layers of the online graph policy network, the target graph policy network, the online graph value network, and the target graph value network at time t, respectively. The environment feedback f_t obtained when the online graph policy network outputs the routing policy π(s_t|θ) is computed as

f_t = U(B_w, T_h, L_p, D_t) × K_f

where U(B_w, T_h, L_p, D_t) is the link utilization, B_w, T_h, L_p, and D_t are the link bandwidth, traffic, packet loss rate, and transmission delay of the target SDN network, respectively, and K_f is a proportionality coefficient. The objective function maximizing the link utilization of the target SDN network is U_max(B_w, T_h, L_p, D_t).
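The feedback f_t can be illustrated with a concrete utilization function. The patent leaves U(B_w, T_h, L_p, D_t) unspecified, so the formula below (throughput over bandwidth, discounted by loss and delay) is an assumption for illustration only, not the patented definition.

```python
# Sketch of the feedback f_t = U(Bw, Th, Lp, Dt) * K_f. The patent does not
# spell out U; utilization is illustrated here as throughput over bandwidth,
# penalized by packet loss and delay -- an assumed form, not the patent's.
def feedback(bw, th, lp, dt, k_f=1.0):
    utilization = (th / bw) * (1.0 - lp) / (1.0 + dt)
    return utilization * k_f
```

Any monotone function of the four link attributes could play the same role; the learning loop only requires that higher utilization yield higher feedback.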
Step S2.2.3: Define the experience pool R as

R = {(s_t, π(s_t|θ), f_t, s_{t+1})}

where s_{t+1} is the state of the target SDN network at time t+1, i.e. the state obtained after the online graph policy network outputs the routing policy π(s_t|θ).
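The experience pool R and the uniform sampling of step S2.3.6 can be sketched as a small replay buffer. The fixed capacity is an assumed implementation detail not stated in the patent.

```python
import random
from collections import deque

# Sketch of the experience pool R: stores (s_t, policy_t, f_t, s_t1) tuples
# and supports uniform random sampling of Y records (step S2.3.6).
class ExperiencePool:
    def __init__(self, capacity=10000):
        self.buf = deque(maxlen=capacity)   # oldest records are evicted first

    def store(self, s_t, policy_t, f_t, s_t1):
        self.buf.append((s_t, policy_t, f_t, s_t1))

    def sample(self, y):
        return random.sample(self.buf, min(y, len(self.buf)))
```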
Step S2.3: For the target SDN network, perform a preset number of iterations, the preset number being T, as follows:
Step S2.3.1: Let t = 1 and obtain the initial state s_1 of the target SDN network;
Step S2.3.2: The online graph policy network outputs a routing policy according to the state s_t of the target SDN network at time t; this process is denoted π(s_t|θ), where θ is the network parameter of the online graph policy network;
Step S2.3.3: Update the routing cost of each link in the target SDN network according to the routing policy π(s_t|θ);
Step S2.3.4: Obtain the state s_{t+1} of the target SDN network after the update by the routing policy π(s_t|θ), and obtain the environment feedback f_t;
Step S2.3.5: Store (s_t, π(s_t|θ), f_t, s_{t+1}) as one history record in the experience pool R;
Step S2.3.6: Randomly sample Y history records (s_m, π(s_m|θ), f_m, s_{m+1}) from the experience pool R, where the subscript m denotes any one record in the experience pool R;
Step S2.3.7: From the history records (s_m, π(s_m|θ), f_m, s_{m+1}) sampled in step S2.3.6, compute the corresponding output y_m of the target graph value network as

y_m = f_m + γ Q′(s_{m+1}, π′(s_{m+1}|θ′)|ω′)

where π′(s_{m+1}|θ′) is the routing policy selected by the target graph policy network according to the state s_{m+1} of the target SDN network, θ′ is the network parameter of the target graph policy network, ω′ is the network parameter of the target graph value network, Q′(s_{m+1}, π′(s_{m+1}|θ′)|ω′) is the expected value, given by the target graph value network with network parameter ω′ in state s_{m+1}, of the routing policy π′(s_{m+1}|θ′) selected by the target graph policy network, and γ is the discount factor, a constant with γ ∈ (0, 1);
Step S2.3.8: Compute the loss Loss_ogvn of the online graph value network output as

Loss_ogvn = (1/Y) Σ_{m=1..Y} ( y_m − Q(s_m, π(s_m|θ)|ω) )²

where Q(s_m, π(s_m|θ)|ω) is the value output by the online graph value network with network parameter ω in state s_m when the routing policy output by the online graph policy network is π(s_m|θ);
Step S2.3.9: According to the loss Loss_ogvn of the online graph value network output, update the network parameter ω of the online graph value network by gradient back-propagation;
Step S2.3.10: Compute the gradient value ∇_θ Q(s_m, π(s_m|θ)|ω) and, according to this gradient value, update the network parameter θ of the online graph policy network by gradient back-propagation, where ∇_θ(·) denotes the gradient of the bracketed expression with respect to θ;
Step S2.3.11: Update the network parameter θ′ of the target graph policy network and the network parameter ω′ of the target graph value network according to

θ′ = τθ + (1−τ)θ′
ω′ = τω + (1−τ)ω′

where τ is a constant with τ ∈ (0, 1);
Step S2.3.12: Repeat steps S2.3.2 to S2.3.11 until the number of iterations reaches the preset number T, obtaining the routing policy that minimizes the routing cost of the target SDN network.
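The core arithmetic of steps S2.3.7, S2.3.8, and S2.3.11 can be sketched in a few lines. Network evaluations are passed in as plain arrays for illustration; the helper names are assumptions.

```python
import numpy as np

# Numeric sketch of steps S2.3.7, S2.3.8 and S2.3.11: discounted target
# value y_m, mean-squared value-network loss over Y samples, and the soft
# update theta' = tau*theta + (1-tau)*theta' (likewise for omega).
def target_values(f, q_target_next, gamma=0.9):
    return f + gamma * q_target_next              # y_m per sampled record

def critic_loss(y, q_online):
    return float(np.mean((y - q_online) ** 2))    # Loss_ogvn

def soft_update(online, target, tau=0.01):
    return tau * online + (1.0 - tau) * target    # blends slowly toward online
```

With τ ∈ (0, 1) small, the target parameters trail the online parameters, which stabilizes the bootstrapped targets y_m during training.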
In a preferred technical solution of the present invention, step S3 comprises the following specific steps:

Step S3.1: Obtain the graph adjacency matrix A and the network information feature matrix H of the target SDN network;

Step S3.2: Based on the trained deep graph learning model and the state [A, H] of the target SDN network, obtain the routing policy that minimizes the routing cost of the target SDN network;

Step S3.3: Deploy the routing policy obtained in step S3.2 to the target SDN network, and change the link weights of the target SDN network according to the routing policy;

Step S3.4: During traffic transmission, transmit traffic according to the shortest-path scheme using the updated link weights.
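Forwarding under the updated weights reduces to a standard shortest-path computation. The sketch below uses Dijkstra's algorithm over a weight map; the data layout is an illustrative assumption.

```python
import heapq

# Sketch of step S3.4: once the policy has rewritten the link weights,
# forwarding follows the shortest path under those weights (Dijkstra).
def shortest_path(weights, src, dst):
    """weights: dict node -> {neighbor: link_weight}."""
    dist, prev = {src: 0.0}, {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue                              # stale heap entry
        for v, w in weights.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path, node = [], dst
    while node != src:                            # walk predecessors back
        path.append(node)
        node = prev[node]
    return [src] + path[::-1]
```

When the policy lowers the weight of an underused detour below that of a congested direct link, the same shortest-path rule automatically shifts traffic onto the detour.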
The present invention further provides a system for the routing optimization method based on graph structure features. The target SDN network comprises a control plane and a data plane, the control plane comprising an information acquisition module, a policy deployment module, and a DGL module, such that the system implements the routing optimization method based on graph structure features.
The links and nodes of the target SDN network are deployed on the data plane. The information acquisition module on the control plane obtains the network topology graph of the target SDN network, generates the graph adjacency matrix and the network information feature matrix, and sends them to the DGL module.
Based on graph learning neural networks, the DGL module takes the graph adjacency matrix and the network information feature matrix as input and, through the deep graph learning method, outputs the routing cost of the target SDN network in the current state. It updates the network parameters of the graph learning neural networks by gradient back-propagation and, after a preset number of iterations, trains them to obtain a deep graph learning model that minimizes the routing cost and maximizes the link utilization of the target SDN network.
The policy deployment module on the control plane uses the trained deep graph learning model obtained by the DGL module and the state of the target SDN network to obtain the routing policy that minimizes the routing cost of the target SDN network, and sends the routing policy and the routing cost of the target SDN network to the data plane.
Beneficial effects: Compared with the prior art, the advantages of the present invention include:

1. A graph learning neural network is used to capture the spatial relationships between nodes and links in the network topology.

2. The policy-network and value-network scheme performs unsupervised learning of the algorithm, making its learning capability more fine-grained.

3. An intelligent algorithm optimizes the routing cost in an SDN network environment and improves link utilization, thereby optimizing the average end-to-end delay, packet loss rate, throughput, and the like.

4. The deep graph learning model has strong generalization ability: once trained, it remains effective when the network topology changes and can adapt to large-scale dynamic, complex networks.
Description of the Drawings
Figure 1 is an overall block diagram of a system for the routing optimization method based on graph structure features according to an embodiment of the present invention;

Figure 2 is a framework diagram of the DGL algorithm according to an embodiment of the present invention;

Figure 3 is a structural diagram of a graph learning neural network according to an embodiment of the present invention.
Detailed Description
The present invention is further described below with reference to the accompanying drawings. The following embodiments are only intended to illustrate the technical solutions of the present invention more clearly and are not intended to limit the scope of protection of the present invention.
An embodiment of the present invention provides a routing optimization method based on graph structure features. For a target SDN network, the following steps S1 to S3 are performed to obtain the routing cost of each link in the target SDN network and adjust the weight of each link, thereby completing routing optimization of the target SDN network.
Step S1: Referring to Figure 1, for the target SDN network, obtain its network topology graph via the southbound interface protocol; construct a graph adjacency matrix from the connection relationships between the nodes on each link of the target SDN network in the topology graph; for each node on each link, construct the node's information feature vector from its link bandwidth, traffic, packet loss rate, and transmission delay; and build the network information feature matrix of the target SDN network from the information feature vectors of all nodes.
Step S1 comprises the following specific steps:
Step S1.1: For the target SDN network, obtain its network topology via the southbound interface protocol, the topology comprising M routers and N links.
Step S1.2: In the network topology of the target SDN network, each router corresponds to a real node and each link corresponds to an edge. A virtual node is inserted on the edge corresponding to each link, so that the topology is represented as a network topology graph G(V, E) with M real nodes, N virtual nodes, and 2N edges, where V is the node set and E is the edge set:

V = {V_real, V_virtual}

where V_real is the set of real nodes and V_virtual is the set of virtual nodes;

V_real = {v_s1, v_s2, ..., v_sM}

where v_s1, v_s2, ..., v_sM are the M real nodes;

V_virtual = {v_x1, v_x2, ..., v_xN}

where v_x1, v_x2, ..., v_xN are the N virtual nodes;

E = {e_1, e_2, ..., e_2N}

where e_1, e_2, ..., e_2N are the 2N edges.
Step S1.3: Let x = M + N, where x is the total number of nodes (M real nodes and N virtual nodes). Based on the network topology graph of the target SDN network, construct the x-order graph adjacency matrix A = (a_ij), i, j = 1, ..., x, where the element a_ij is

a_ij = 1 if node i and node j are connected by an edge, and a_ij = 0 otherwise.
Step S1.4: For any node i of the target SDN network, construct the information feature vector h_i of node i from its link bandwidth, traffic, packet loss rate, and transmission delay:

h_i = [B_wi, T_hi, L_pi, D_ti]

where B_wi is the link bandwidth of node i, T_hi is the traffic of node i, L_pi is the packet loss rate of node i, and D_ti is the transmission delay of node i.
For the node i: if node i is a virtual node, its traffic T_hi, packet loss rate L_pi, and transmission delay D_ti are 0; if node i is a real node, its link bandwidth B_wi is 0.
Based on the information feature vectors of all nodes, the network information feature matrix H of the target SDN network is constructed as the x×4 matrix whose rows are the node feature vectors:

H = [h_1; h_2; ...; h_i; ...; h_x]

where h_1, h_2, ..., h_i, ..., h_x are the information feature vectors of the nodes.
Step S2: Take the graph adjacency matrix and the network information feature matrix as the state of the target SDN network. Based on graph learning neural networks, with the graph adjacency matrix and the network information feature matrix as input, the Deep Graph Learning (DGL) method outputs the routing policy and routing cost of the target SDN network in the current state. The network parameters of the graph learning neural networks are updated by gradient back-propagation and, after a preset number of iterations, the networks are trained to obtain a deep graph learning model that minimizes the routing cost and maximizes the link utilization of the target SDN network.
The deep graph learning method in step S2 comprises four graph learning neural networks and one experience pool. Referring to Figure 2, the four networks are the Online Graph Strategy Network (OGSN), the Online Graph Value Network (OGVN), the Target Graph Strategy Network (TGSN), and the Target Graph Value Network (TGVN). Referring to Figure 3, each of the four networks comprises one input layer, two hidden layers, and one output layer.
The input layers of the online graph policy network and the target graph policy network take the graph adjacency matrix A and the network information feature matrix H as input, and the outputs of the online and target graph policy networks serve as the inputs of the online and target graph value networks, respectively. In each graph learning neural network the propagation formula from the input layer to the hidden layers, and between the hidden layers, is the same. Denoting the input layer as layer 0, the first hidden layer as layer 1, and the second hidden layer as layer 2, the propagation formula is

H^(l+1) = σ( D̃^(−1/2) Ã D̃^(−1/2) H^l W^(l+1) )

where σ(·) normalizes the expression in parentheses, H^l is the network information feature matrix of layer l, W^(l+1) is the weight matrix of layer l+1, H^0 = H, Ã = A + I with I the x-order identity matrix, and D̃ is the degree matrix of Ã, given by

D̃_ii = Σ_j ã_ij

where ã_ij is the element of Ã.

In the online graph policy network and the target graph policy network, W^1 is a 4×4 matrix and W^2 is a 4×1 matrix; the output layer is a fully connected layer whose output is an x×1 matrix, denoted the routing policy Policy:

Policy = H^2 × K

where K is the weight matrix of the output layer of the online and target graph policy networks and H^2 is the network information feature matrix of layer 2.
In the online graph value network and the target graph value network, W^1 and W^2 are both 1×1 matrices; the output layer is an aggregation layer whose output is a 1×1 matrix, denoted Value:

Value = Q × Σ_{i=1..x} h_i^(2)

where Q is the weight value of the output layer and h_i^(2) is the i-th value in the layer-2 network information feature matrix H^2. According to the routing policy Policy output by the online graph policy network, the routing cost of each link in the target SDN network is updated.
Referring to Figure 2, step S2 comprises the following specific steps:
Step S2.1: Initialize the weight matrices of the online graph policy network, the online graph value network, the target graph policy network, and the target graph value network, where the weight matrix of the online graph policy network is W_θ, that of the online graph value network is W_ω, that of the target graph policy network is W_θ′, and that of the target graph value network is W_ω′. At initialization, the network parameters of the online graph policy network and the target graph policy network are identical, and the network parameters of the online graph value network and the target graph value network are identical.
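The initialization rule above (target networks start as exact copies of the online networks) can be sketched as follows; the dictionary layout and random initializer are illustrative assumptions.

```python
import numpy as np

# Sketch of step S2.1: the target networks copy the online networks' weights
# at initialization, so theta' = theta and omega' = omega before training.
def init_networks(rng, shapes):
    """shapes: dict weight name -> shape, e.g. {"W1": (4, 4), "W2": (4, 1)}."""
    online = {name: rng.standard_normal(shape) for name, shape in shapes.items()}
    target = {name: w.copy() for name, w in online.items()}  # exact, independent copies
    return online, target
```

Using `.copy()` rather than sharing references matters: the target parameters must drift away from the online ones only through the soft updates of step S2.3.11.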
Step S2.2: Initialize the experience pool, as follows:
Step S2.2.1: Take the graph adjacency matrix A and the network information feature matrix H as the state S of the target SDN network, defining S = [A, H]. Let s_t denote the state of the target SDN network at time t, s_t = [A_t, H_t], where A_t is the graph adjacency matrix and H_t the network information feature matrix of the target SDN network at time t.
Step S2.2.2: Define π(s_t|θ), π′(s_t|θ′), Q(s_t, π(s_t|θ)|ω), and Q′(s_t, π′(s_t|θ′)|ω′) as the outputs of the output layers of the online graph policy network, the target graph policy network, the online graph value network, and the target graph value network at time t, respectively. The environment feedback f_t obtained when the online graph policy network outputs the routing policy π(s_t|θ) is computed as

f_t = U(B_w, T_h, L_p, D_t) × K_f

where U(B_w, T_h, L_p, D_t) is the link utilization, B_w, T_h, L_p, and D_t are the link bandwidth, traffic, packet loss rate, and transmission delay of the target SDN network, respectively, and K_f is a proportionality coefficient.

The objective function maximizing the link utilization of the target SDN network is U_max(B_w, T_h, L_p, D_t).
Step S2.2.3: Define the experience pool R as

R = {(s_t, π(s_t|θ), f_t, s_{t+1})}

where s_{t+1} is the state of the target SDN network at time t+1, i.e. the state obtained after the online graph policy network outputs the routing policy π(s_t|θ).
Step S2.3: For the target SDN network, perform a preset number of iterations, the preset number being T, as follows:
Step S2.3.1: Let t = 1 and obtain the initial state s_1 of the target SDN network;
步骤S2.3.2:在线图策略网络根据t时刻目标SDN网络的状态st,输出路由策略过程记为其中,θ为在线图策略网络的网络参数;Step S2.3.2: Based on the status s t of the target SDN network at time t, the online graph policy network outputs the routing policy. The process is recorded as Among them, θ is the network parameter of the online graph policy network;
步骤S2.3.3:根据路由策略更新目标SDN网络中各条链路的路由开销;Step S2.3.3: According to routing policy Update the routing costs of each link in the target SDN network;
步骤S2.3.4:获取根据路由策略更新后的目标SDN网络的状态st+1,同时获取环境反馈ftStep S2.3.4: Obtain the routing policy Updated target SDN network state s t+1 , and obtain environmental feedback f t at the same time;
步骤S2.3.5:将作为一组历史记录存入经验池R中;Step S2.3.5: Place Stored in the experience pool R as a set of historical records;
步骤S2.3.6:从经验池R中随机抽取Y组历史记录其中,下标m表示经验池R中任意一组历史记录;Step S2.3.6: Randomly select Y groups of historical records from the experience pool R Among them, the subscript m represents any set of historical records in the experience pool R;
Step S2.3.7: From the history records (s_m, π_m, f_m, s_{m+1}) drawn in step S2.3.6, compute the corresponding output y_m of the target graph value network as:

y_m = f_m + γQ′(s_{m+1}, π′(s_{m+1}|θ′)|ω′)

where π′(s_{m+1}|θ′) denotes the routing policy selected by the target graph policy network according to the state s_{m+1} of the target SDN network, θ′ is the network parameter of the target graph policy network, ω′ is the network parameter of the target graph value network, Q′(s_{m+1}, π′(s_{m+1}|θ′)|ω′) denotes the expected value assigned by the target graph value network, with network parameter ω′, to the routing policy π′(s_{m+1}|θ′) in state s_{m+1}, and γ is the discount factor, a constant with γ ∈ (0, 1).
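The target output of step S2.3.7 is a temporal-difference target: the sampled feedback f_m plus the discounted estimate of the target graph value network for the next state. Once that estimate has been evaluated, the computation is one line; the numeric values below are made up for illustration:

```python
def td_target(f_m, q_target_next, gamma=0.99):
    # y_m = f_m + gamma * (target value network's estimate for s_{m+1})
    # gamma is the discount factor, a constant in (0, 1).
    assert 0.0 < gamma < 1.0
    return f_m + gamma * q_target_next

# Illustrative sample: feedback 1.0, target network estimate 2.0, gamma 0.9.
y_m = td_target(f_m=1.0, q_target_next=2.0, gamma=0.9)
```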
Step S2.3.8: Compute the loss Loss_ogvn of the online graph value network output value as:

Loss_ogvn = (1/Y) Σ_{m=1}^{Y} (y_m − Q(s_m, π(s_m|θ)|ω))²

where Q(s_m, π(s_m|θ)|ω) denotes the value output by the online graph value network, with network parameter ω, in state s_m of the target SDN network when the routing policy output by the online graph policy network is π(s_m|θ).
Step S2.3.9: According to the loss Loss_ogvn of the online graph value network output value, update the network parameter ω of the online graph value network by gradient backpropagation.

Step S2.3.10: Compute the gradient value ∇_θ J ≈ (1/Y) Σ_{m=1}^{Y} ∇_θ Q(s_m, π(s_m|θ)|ω) and, according to this gradient value, update the network parameter θ of the online graph policy network by gradient backpropagation, where ∇_θ(·) denotes taking the gradient of the bracketed expression with respect to θ.
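The loss of step S2.3.8 is a mean squared error between the sampled targets y_m and the online graph value network's outputs. A minimal sketch of that loss; the actual parameter updates of steps S2.3.9 and S2.3.10 would be carried out by an autodiff framework and are omitted, and the sample values are illustrative:

```python
def critic_loss(y, q):
    # Loss_ogvn = (1/Y) * sum_m (y_m - Q(s_m, pi(s_m|theta) | omega))^2
    assert len(y) == len(q) and len(y) > 0
    return sum((ym - qm) ** 2 for ym, qm in zip(y, q)) / len(y)

# Two sampled records: targets y_m vs. online value network outputs.
loss = critic_loss([2.8, 1.0], [2.3, 1.5])
```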
Step S2.3.11: Update the network parameter θ′ of the target graph policy network and the network parameter ω′ of the target graph value network according to:

θ′ = τθ + (1−τ)θ′

ω′ = τω + (1−τ)ω′

where τ is a constant and τ ∈ (0, 1).
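The soft update of step S2.3.11 can be sketched directly; the parameters are shown as flat lists of floats for illustration:

```python
def soft_update(target, online, tau=0.01):
    # theta' = tau*theta + (1 - tau)*theta'   (and likewise omega' from omega)
    # Small tau makes the target network track the online network slowly.
    assert 0.0 < tau < 1.0
    return [tau * o + (1.0 - tau) * t for o, t in zip(online, target)]

# Target parameters start at zero; online parameters are [1.0, 2.0].
theta_t = soft_update(target=[0.0, 0.0], online=[1.0, 2.0], tau=0.1)
```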
Step S2.3.12: Repeat steps S2.3.2 to S2.3.11 until the number of iterations reaches the preset number T, obtaining the routing policy that minimizes the routing cost of the target SDN network.
Step S3: Using the trained deep graph learning model and the state of the target SDN network, obtain the routing policy that minimizes the routing cost of the target SDN network, deploy the routing policy to the target SDN network, and change the weight of each link of the target SDN network according to the routing policy, completing the routing optimization of the target SDN network.

The specific steps of step S3 are as follows:

Step S31: Obtain the graph adjacency matrix A and the network information feature matrix H of the target SDN network;

Step S32: Based on the trained deep graph learning model and the state [A, H] of the target SDN network, obtain the routing policy that minimizes the routing cost of the target SDN network;

Step S33: Deploy the routing policy obtained in step S32 to the target SDN network, and change the weight of each link of the target SDN network according to the routing policy;

Step S34: During traffic transmission, transmit traffic over shortest paths computed with the updated link weights.
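Forwarding in step S34 is a standard shortest-path computation over the updated link weights. A self-contained Dijkstra sketch using only the standard library; the sample weights stand in for those set by the deployed routing policy and are illustrative:

```python
import heapq

def shortest_path(weights, src, dst):
    """Dijkstra over directed link weights: weights[u][v] = cost of link u->v."""
    dist, prev = {src: 0.0}, {}
    heap, seen = [(0.0, src)], set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in seen:
            continue
        seen.add(u)
        if u == dst:
            break
        for v, w in weights.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    # Walk predecessors back from dst to src.
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return list(reversed(path))

# Illustrative weights after policy deployment: A->B->C (cost 2) beats A->C (cost 4).
w = {"A": {"B": 1.0, "C": 4.0}, "B": {"C": 1.0}, "C": {}}
path = shortest_path(w, "A", "C")
```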
An embodiment of the present invention further provides a system for the routing optimization method based on graph structure features. Referring to Figure 1, the target SDN network comprises a control plane and a data plane, where the control plane comprises an information acquisition module, a policy deployment module, and a DGL module, such that the system implements the routing optimization method based on graph structure features.

Each link and each node of the target SDN network are deployed on the data plane. The information acquisition module on the control plane obtains the network topology graph of the target SDN network, generates the graph adjacency matrix and the network information feature matrix, and sends them to the DGL module.

The DGL module, based on graph learning neural networks, takes the graph adjacency matrix and the network information feature matrix as input, uses the deep graph learning method to output the routing cost of the target SDN network in the current state, updates the network parameters of the graph learning neural networks by gradient backpropagation, and trains the graph learning neural networks through a preset number of iterations to obtain a deep graph learning model that minimizes the routing cost and maximizes the link utilization of the target SDN network.

The policy deployment module on the control plane obtains, from the trained deep graph learning model produced by the DGL module and the state of the target SDN network, the routing policy that minimizes the routing cost of the target SDN network, and sends the routing policy and the routing cost of the target SDN network to the data plane.

The embodiments of the present invention have been described in detail above with reference to the accompanying drawings; however, the present invention is not limited to the above embodiments, and various changes can be made within the scope of knowledge of those of ordinary skill in the art without departing from the spirit of the present invention.

Claims (7)

  1. A routing optimization method based on graph structure features, characterized in that, for a target SDN network, the following steps S1 to S3 are executed to obtain the routing cost of each link in the target SDN network, adjust the weight of each link, and complete the routing optimization of the target SDN network:
    Step S1: for the target SDN network, obtain the network topology graph of the target SDN network based on the southbound interface protocol; construct a graph adjacency matrix according to the connection relationships between the nodes on each link of the target SDN network in the network topology graph; for each node on each link of the target SDN network, construct the information feature vector of the node from its link bandwidth, traffic, packet loss rate, and transmission delay; and construct the network information feature matrix of the target SDN network from the information feature vectors of the nodes;
    Step S2: take the graph adjacency matrix and the network information feature matrix as the state of the target SDN network; based on graph learning neural networks, with the graph adjacency matrix and the network information feature matrix as input, use a deep graph learning method to output the routing policy and routing cost of the target SDN network in the current state; update the network parameters of the graph learning neural networks by gradient backpropagation; and train the graph learning neural networks through a preset number of iterations to obtain a deep graph learning model that minimizes the routing cost and maximizes the link utilization of the target SDN network;
    Step S3: using the trained deep graph learning model and the state of the target SDN network, obtain the routing policy that minimizes the routing cost of the target SDN network, deploy the routing policy to the target SDN network, and change the weight of each link of the target SDN network according to the routing policy, completing the routing optimization of the target SDN network.
  2. The routing optimization method based on graph structure features according to claim 1, characterized in that the specific steps of step S1 are as follows:
    Step S1.1: for the target SDN network, obtain the network topology of the target SDN network based on the southbound interface protocol, the network topology comprising M routers and N links;
    Step S1.2: in the network topology of the target SDN network, each router corresponds to a real node and each link corresponds to an edge; insert a virtual node on the edge corresponding to each link, and express the network topology of the target SDN network as a network topology graph G(V, E) with M real nodes, N virtual nodes, and 2N edges, where V denotes the node set and E the edge set, specifically:
    V = {V_real, V_virtual}
    where V_real denotes the set of real nodes and V_virtual the set of virtual nodes;
    V_real = {v_s1, v_s2, ..., v_sM}
    where v_s1, v_s2, ..., v_sM denote the M real nodes;
    V_virtual = {v_x1, v_x2, ..., v_xN}
    where v_x1, v_x2, ..., v_xN denote the N virtual nodes;
    E = {e_1, e_2, ..., e_2N}
    where e_1, e_2, ..., e_2N denote the 2N edges;
    Step S1.3: let x = M + N, where x denotes the total number of nodes, the nodes comprising the M real nodes and the N virtual nodes; based on the network topology graph of the target SDN network, construct the x-order graph adjacency matrix A = (a_ij)_{x×x}, where the element a_ij is:
    a_ij = 1 if an edge connects node i and node j, otherwise a_ij = 0;
    Step S1.4: for any node i of the target SDN network, construct the information feature vector h_i of node i from its link bandwidth, traffic, packet loss rate, and transmission delay:
    h_i = [B_wi, T_hi, L_pi, D_ti]
    where B_wi is the link bandwidth of node i, T_hi the traffic of node i, L_pi the packet loss rate of node i, and D_ti the transmission delay of node i;
    based on the information feature vectors of the nodes, construct the network information feature matrix H of the target SDN network as:
    H = [h_1, h_2, ..., h_i, ..., h_x]^T
    where h_1, h_2, ..., h_i, ..., h_x are the information feature vectors of the nodes.
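The graph construction of steps S1.2 to S1.4, together with the zero-filling convention of claim 3, can be sketched as follows: one virtual node is inserted per link, giving x = M + N nodes, and each original link contributes two edges through its virtual node. The function and variable names and the sample values are illustrative assumptions:

```python
def build_graph(links, router_feats, link_bw):
    """Build adjacency A and feature matrix H for M routers and N links.

    links: list of (u, v) router-index pairs, one per link;
    router_feats: per-router (traffic Th, loss Lp, delay Dt);
    link_bw: per-link bandwidth Bw.
    """
    m, n = len(router_feats), len(links)
    x = m + n                               # total nodes: M real + N virtual
    A = [[0] * x for _ in range(x)]
    H = []
    for th, lp, dt in router_feats:
        H.append([0.0, th, lp, dt])         # real node: Bw = 0 (claim 3)
    for k, (u, v) in enumerate(links):
        vk = m + k                          # index of the link's virtual node
        A[u][vk] = A[vk][u] = 1             # each link becomes two edges
        A[v][vk] = A[vk][v] = 1             # through its virtual node
        H.append([link_bw[k], 0.0, 0.0, 0.0])  # virtual node: Th = Lp = Dt = 0
    return A, H

# Two routers joined by one 100 Mbps link -> 3 nodes, 2 edges.
A, H = build_graph(links=[(0, 1)],
                   router_feats=[(10.0, 0.01, 2.0), (8.0, 0.02, 3.0)],
                   link_bw=[100.0])
```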
  3. The routing optimization method based on graph structure features according to claim 2, characterized in that, for the node i in step S1.4, if node i is a virtual node, its traffic T_hi, packet loss rate L_pi, and transmission delay D_ti are 0; if node i is a real node, its link bandwidth B_wi is 0.
  4. The routing optimization method based on graph structure features according to claim 2, characterized in that the deep graph learning method in step S2 comprises four graph learning neural networks and one experience pool, the four graph learning neural networks being an online graph policy network, an online graph value network, a target graph policy network, and a target graph value network, each comprising one input layer, two hidden layers, and one output layer;
    the input layers of the online graph policy network and the target graph policy network take the graph adjacency matrix A and the network information feature matrix H as input, and the outputs of the online graph policy network and the target graph policy network serve as the inputs of the online graph value network and the target graph value network, respectively; the propagation formula from the input layer to the hidden layers, and between hidden layers, is the same for each graph learning neural network; denoting the input layer as layer 0, the first hidden layer as layer 1, and the second hidden layer as layer 2, the propagation formula is:
    H_{l+1} = σ(D̃^{-1/2} Ã D̃^{-1/2} H_l W_{l+1})
    where σ(·) normalizes the expression inside the brackets, H_l is the network information feature matrix of layer l, W_{l+1} is the weight matrix of layer l+1, H_0 = H, Ã = A + I, I is the x-order identity matrix, and D̃ is the degree matrix of Ã, with:
    D̃_ii = Σ_j ã_ij
    where ã_ij is the element of Ã;
    in the online graph policy network and the target graph policy network, W_1 is a 4×4 matrix and W_2 is a 4×1 matrix; the output layer is a fully connected layer, and its output value is an x×1 matrix, denoted the routing policy Policy:
    Policy = H_2 × K
    where K is the weight matrix of the output layer of the online graph policy network and the target graph policy network, and H_2 is the network information feature matrix of layer 2;
    in the online graph value network and the target graph value network, W_1 and W_2 are both 1×1 matrices; the output layer is an aggregation layer, and its output value is a 1×1 matrix, denoted Value:
    Value = Q × Σ_{i=1}^{x} h_2,i
    where Q is the weight value of the output layer and h_2,i is the i-th value in the network information feature matrix H_2 of layer 2;
    the routing cost of each link in the target SDN network is updated according to the routing policy Policy output by the online graph policy network.
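The propagation rule of claim 4 matches the standard graph-convolution form H_{l+1} = σ(D̃^{-1/2}ÃD̃^{-1/2}H_lW_{l+1}). A pure-Python sketch of one layer; since the claim only says that σ(·) "normalizes", the ReLU used here is an assumption, and the two-node example with 1-dimensional features is illustrative:

```python
import math

def matmul(P, Q):
    # Naive matrix product, adequate for small examples.
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def gcn_layer(A, H, W):
    """One propagation step: H_{l+1} = sigma(D~^{-1/2} (A + I) D~^{-1/2} H_l W_{l+1})."""
    x = len(A)
    A_t = [[A[i][j] + (1 if i == j else 0) for j in range(x)]
           for i in range(x)]                       # A~ = A + I (self-loops)
    d = [sum(row) for row in A_t]                   # diagonal of degree matrix D~
    a_hat = [[A_t[i][j] / math.sqrt(d[i] * d[j])    # symmetric normalization
              for j in range(x)] for i in range(x)]
    z = matmul(matmul(a_hat, H), W)
    return [[max(0.0, v) for v in row] for row in z]  # sigma: ReLU (assumption)

# Two connected nodes with scalar features and a 1x1 weight matrix.
H1 = gcn_layer([[0, 1], [1, 0]], [[1.0], [2.0]], [[1.0]])
```

Stacking two such layers and applying the output layer (Policy = H_2 × K) would give the x×1 policy vector described in the claim.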
  5. The routing optimization method based on graph structure features according to claim 4, characterized in that the specific steps of step S2 are as follows:
    Step S2.1: initialize the weight matrices of the online graph policy network, the online graph value network, the target graph policy network, and the target graph value network, where the weight matrix of the online graph policy network is W_θ, the weight matrix of the target graph policy network is W_θ′, the weight matrix of the online graph value network is W_ω, and the weight matrix of the target graph value network is W_ω′;
    Step S2.2: initialize the experience pool; the specific steps are as follows:
    Step S2.2.1: take the graph adjacency matrix A and the network information feature matrix H as the state S of the target SDN network, defining S = [A, H]; s_t denotes the state of the target SDN network at time t, s_t = [A_t, H_t], where A_t denotes the graph adjacency matrix and H_t the network information feature matrix of the target SDN network at time t;
    Step S2.2.2: define π_t, π′_t, Q_t, Q′_t as the outputs of the output layers of the online graph policy network, the target graph policy network, the online graph value network, and the target graph value network at time t, respectively; compute the environmental feedback f_t obtained when the online graph policy network outputs the routing policy π_t as:
    f_t = U(B_w, T_h, L_p, D_t) × K_f
    where U(B_w, T_h, L_p, D_t) is the link utilization, B_w, T_h, L_p, and D_t are respectively the link bandwidth, traffic, packet loss rate, and transmission delay of the target SDN network, and K_f is a proportional coefficient;
    the objective function for maximizing the link utilization of the target SDN network is constructed as U_max(B_w, T_h, L_p, D_t);
    Step S2.2.3: define the experience pool R as:
    R = {(s_t, π_t, f_t, s_{t+1})}
    where s_{t+1} denotes the state of the target SDN network at time t+1, i.e., the state obtained after the online graph policy network outputs the routing policy π_t;
    Step S2.3: perform a preset number of iterations for the target SDN network, where the preset number of iterations is T; the specific steps are as follows:
    Step S2.3.1: let t = 1 and obtain the initial state s_1 of the target SDN network;
    Step S2.3.2: the online graph policy network outputs the routing policy π_t according to the state s_t of the target SDN network at time t, the process being denoted π_t = π(s_t|θ), where θ is the network parameter of the online graph policy network;
    Step S2.3.3: update the routing cost of each link in the target SDN network according to the routing policy π_t;
    Step S2.3.4: obtain the state s_{t+1} of the target SDN network after it has been updated according to the routing policy π_t, and obtain the environmental feedback f_t;
    Step S2.3.5: store (s_t, π_t, f_t, s_{t+1}) in the experience pool R as one history record;
    Step S2.3.6: randomly draw Y history records (s_m, π_m, f_m, s_{m+1}) from the experience pool R, where the subscript m denotes any history record in R;
    Step S2.3.7: from the history records (s_m, π_m, f_m, s_{m+1}) drawn in step S2.3.6, compute the corresponding output y_m of the target graph value network as:
    y_m = f_m + γQ′(s_{m+1}, π′(s_{m+1}|θ′)|ω′)
    where π′(s_{m+1}|θ′) denotes the routing policy selected by the target graph policy network according to the state s_{m+1} of the target SDN network, θ′ is the network parameter of the target graph policy network, ω′ is the network parameter of the target graph value network, Q′(s_{m+1}, π′(s_{m+1}|θ′)|ω′) denotes the expected value assigned by the target graph value network, with network parameter ω′, to the routing policy π′(s_{m+1}|θ′) in state s_{m+1}, and γ is the discount factor, a constant with γ ∈ (0, 1);
    Step S2.3.8: compute the loss Loss_ogvn of the online graph value network output value as:
    Loss_ogvn = (1/Y) Σ_{m=1}^{Y} (y_m − Q(s_m, π(s_m|θ)|ω))²
    where Q(s_m, π(s_m|θ)|ω) denotes the value output by the online graph value network, with network parameter ω, in state s_m of the target SDN network when the routing policy output by the online graph policy network is π(s_m|θ);
    Step S2.3.9: according to the loss Loss_ogvn of the online graph value network output value, update the network parameter ω of the online graph value network by gradient backpropagation;
    Step S2.3.10: compute the gradient value ∇_θ J ≈ (1/Y) Σ_{m=1}^{Y} ∇_θ Q(s_m, π(s_m|θ)|ω) and, according to this gradient value, update the network parameter θ of the online graph policy network by gradient backpropagation, where ∇_θ(·) denotes taking the gradient of the bracketed expression with respect to θ;
    Step S2.3.11: update the network parameter θ′ of the target graph policy network and the network parameter ω′ of the target graph value network according to:
    θ′ = τθ + (1−τ)θ′
    ω′ = τω + (1−τ)ω′
    where τ is a constant and τ ∈ (0, 1);
    Step S2.3.12: repeat steps S2.3.2 to S2.3.11 until the number of iterations reaches the preset number T, obtaining the routing policy that minimizes the routing cost of the target SDN network.
  6. The routing optimization method based on graph structure features according to claim 5, characterized in that the specific steps of step S3 are as follows:
    Step S31: obtain the graph adjacency matrix A and the network information feature matrix H of the target SDN network;
    Step S32: based on the trained deep graph learning model and the state [A, H] of the target SDN network, obtain the routing policy that minimizes the routing cost of the target SDN network;
    Step S33: deploy the routing policy obtained in step S32 to the target SDN network, and change the weight of each link of the target SDN network according to the routing policy;
    Step S34: during traffic transmission, transmit traffic over shortest paths computed with the updated link weights.
  7. A system for the routing optimization method based on graph structure features, characterized in that a target SDN network comprises a control plane and a data plane, the control plane comprising an information acquisition module, a policy deployment module, and a DGL module, such that the system implements the routing optimization method based on graph structure features according to any one of claims 1-6;
    each link and each node of the target SDN network are deployed on the data plane; the information acquisition module on the control plane obtains the network topology graph of the target SDN network, generates the graph adjacency matrix and the network information feature matrix, and sends them to the DGL module;
    the DGL module, based on graph learning neural networks, takes the graph adjacency matrix and the network information feature matrix as input, uses the deep graph learning method to output the routing cost of the target SDN network in the current state, updates the network parameters of the graph learning neural networks by gradient backpropagation, and trains the graph learning neural networks through a preset number of iterations to obtain a deep graph learning model that minimizes the routing cost and maximizes the link utilization of the target SDN network;
    the policy deployment module on the control plane obtains, from the trained deep graph learning model produced by the DGL module and the state of the target SDN network, the routing policy that minimizes the routing cost of the target SDN network, and sends the routing policy and the routing cost of the target SDN network to the data plane.
PCT/CN2023/098735 2022-08-15 2023-06-07 Graph structure feature-based routing optimization method and system WO2024037136A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210974378.6A CN115225561B (en) 2022-08-15 2022-08-15 Route optimization method and system based on graph structure characteristics
CN202210974378.6 2022-08-15

Publications (1)

Publication Number Publication Date
WO2024037136A1 true WO2024037136A1 (en) 2024-02-22

Family

ID=83615692

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/098735 WO2024037136A1 (en) 2022-08-15 2023-06-07 Graph structure feature-based routing optimization method and system

Country Status (2)

Country Link
CN (1) CN115225561B (en)
WO (1) WO2024037136A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115225561B (en) * 2022-08-15 2022-12-06 南京邮电大学 Route optimization method and system based on graph structure characteristics
CN116055378B (en) * 2023-01-10 2024-05-28 中国联合网络通信集团有限公司 Training method and device for traffic scheduling strategy generation model

Citations (6)

Publication number Priority date Publication date Assignee Title
CN111245718A (en) * 2019-12-30 2020-06-05 浙江工商大学 Routing optimization method based on SDN context awareness
CN111314171A (en) * 2020-01-17 2020-06-19 深圳供电局有限公司 Method, device and medium for predicting and optimizing SDN routing performance
CN113194034A (en) * 2021-04-22 2021-07-30 华中科技大学 Route optimization method and system based on graph neural network and deep reinforcement learning
WO2022116957A1 (en) * 2020-12-02 2022-06-09 中兴通讯股份有限公司 Algorithm model determining method, path determining method, electronic device, sdn controller, and medium
CN114697229A (en) * 2022-03-11 2022-07-01 华中科技大学 Construction method and application of distributed routing planning model
CN115225561A (en) * 2022-08-15 2022-10-21 南京邮电大学 Route optimization method and system based on graph structure characteristics

Family Cites Families (11)

Publication number Priority date Publication date Assignee Title
CN103281247B (en) * 2013-05-09 2016-06-15 北京交通大学 The general method for routing of a kind of data center network and system
US20190184561A1 (en) * 2017-12-15 2019-06-20 The Regents Of The University Of California Machine Learning based Fixed-Time Optimal Path Generation
CN110275437B (en) * 2019-06-06 2022-11-15 江苏大学 SDN network flow dominance monitoring node dynamic selection system and method thereof
CN110611619B (en) * 2019-09-12 2020-10-09 西安电子科技大学 Intelligent routing decision method based on DDPG reinforcement learning algorithm
CN113556281A (en) * 2020-04-23 2021-10-26 中兴通讯股份有限公司 Rerouting method and device, electronic equipment and computer readable medium
CN111862579B (en) * 2020-06-10 2021-07-13 深圳大学 Taxi scheduling method and system based on deep reinforcement learning
CN113036772B (en) * 2021-05-11 2022-07-19 国网江苏省电力有限公司南京供电分公司 Power distribution network topology voltage adjusting method based on deep reinforcement learning
CN113285831B (en) * 2021-05-24 2022-08-02 广州大学 Network behavior knowledge intelligent learning method and device, computer equipment and storage medium
CN114286413B (en) * 2021-11-02 2023-09-19 北京邮电大学 TSN network joint routing and stream distribution method and related equipment
CN114500360B (en) * 2022-01-27 2022-11-11 河海大学 Network traffic scheduling method and system based on deep reinforcement learning
CN114859719A (en) * 2022-05-05 2022-08-05 电子科技大学长三角研究院(衢州) Graph neural network-based reinforcement learning cluster bee-congestion control method

Patent Citations (6)

Publication number Priority date Publication date Assignee Title
CN111245718A (en) * 2019-12-30 2020-06-05 浙江工商大学 Routing optimization method based on SDN context awareness
CN111314171A (en) * 2020-01-17 2020-06-19 深圳供电局有限公司 Method, device and medium for predicting and optimizing SDN routing performance
WO2022116957A1 (en) * 2020-12-02 2022-06-09 中兴通讯股份有限公司 Algorithm model determining method, path determining method, electronic device, sdn controller, and medium
CN113194034A (en) * 2021-04-22 2021-07-30 华中科技大学 Route optimization method and system based on graph neural network and deep reinforcement learning
CN114697229A (en) * 2022-03-11 2022-07-01 华中科技大学 Construction method and application of distributed routing planning model
CN115225561A (en) * 2022-08-15 2022-10-21 南京邮电大学 Route optimization method and system based on graph structure characteristics

Non-Patent Citations (1)

Title
CHE XIANG-BEI, KANG WEN-QIAN, DENG BING, YANG KE-HAN, LI JIAN: "A Prediction Model of SDN Routing Performance Based on Graph Neural Network", ACTA ELECTRONICA SINICA, vol. 49, no. 3, 1 March 2021 (2021-03-01), pages 484 - 491, XP093140194, DOI: 10.12263/DZXB.20200120 *

Also Published As

Publication number Publication date
CN115225561B (en) 2022-12-06
CN115225561A (en) 2022-10-21

Similar Documents

Publication Publication Date Title
WO2024037136A1 (en) Graph structure feature-based routing optimization method and system
CN110012516B (en) Low-orbit satellite routing strategy method based on deep reinforcement learning architecture
CN112437020B (en) Data center network load balancing method based on deep reinforcement learning
CN112491714B (en) Intelligent QoS route optimization method and system based on deep reinforcement learning in SDN environment
Mao et al. An intelligent route computation approach based on real-time deep learning strategy for software defined communication systems
CN112491712B (en) Data packet routing algorithm based on multi-agent deep reinforcement learning
CN111988225B (en) Multi-path routing method based on reinforcement learning and transfer learning
CN110611619A (en) Intelligent routing decision method based on DDPG reinforcement learning algorithm
CN114697229B (en) Construction method and application of distributed routing planning model
CN116527567B (en) Intelligent network path optimization method and system based on deep reinforcement learning
CN113395207B (en) Deep reinforcement learning-based route optimization framework and method under SDN framework
Lei et al. Congestion control in SDN-based networks via multi-task deep reinforcement learning
CN115396366B (en) Distributed intelligent routing method based on graph attention network
CN114143264A (en) Traffic scheduling method based on reinforcement learning in SRv6 network
CN111917642B (en) SDN intelligent routing data transmission method for distributed deep reinforcement learning
Mai et al. Packet routing with graph attention multi-agent reinforcement learning
WO2023109699A1 (en) Multi-agent communication learning method
CN115714741A (en) Routing decision method and system based on collaborative multi-agent reinforcement learning
CN112529148B (en) Intelligent QoS inference method based on graph neural network
Wei et al. GRL-PS: Graph embedding-based DRL approach for adaptive path selection
Bhavanasi et al. Dealing with changes: Resilient routing via graph neural networks and multi-agent deep reinforcement learning
CN116055324B (en) Digital twin method for self-optimization of data center network
CN117061360A (en) SDN network flow prediction method and system based on space-time information
CN107169561A (en) Power-consumption-oriented hybrid particle swarm spiking neural network mapping method
CN115150335A (en) Optimal flow segmentation method and system based on deep reinforcement learning

Legal Events

Date Code Title Description
121 Ep: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23854042

Country of ref document: EP

Kind code of ref document: A1