CN113194034A - Route optimization method and system based on graph neural network and deep reinforcement learning - Google Patents
Route optimization method and system based on graph neural network and deep reinforcement learning
- Publication number
- CN113194034A (application CN202110435964.9A)
- Authority
- CN
- China
- Prior art keywords
- network
- neural network
- reinforcement learning
- deep reinforcement
- route optimization
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/12—Shortest path evaluation
- H04L45/125—Shortest path evaluation based on throughput or bandwidth
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/38—Flow based routing
Abstract
The invention discloses a route optimization method and system based on a graph neural network and deep reinforcement learning, belonging to the field of network route optimization. The current network state s is measured, and according to the traffic demand that the current network state requests to allocate, the k shortest paths from the source node to the destination node are selected as the action set a; the action set a is input into a graph neural network, which aggregates link characteristics and iteratively updates them, and the estimated Q value of the network state s and the action set a is obtained through a Q function; deep reinforcement learning is performed according to the estimated Q value to obtain the routing strategy in the current network state, which is fed back to the network topology to execute the corresponding routing action. The invention provides a network route optimization architecture based on a graph neural network and deep reinforcement learning, which aims to use the graph neural network to learn the relationships between graph elements in the topology and the rules for composing them, and to use a deep reinforcement learning algorithm to make decisions, thereby optimizing network routing.
Description
Technical Field
The invention belongs to the field of network route optimization, and particularly relates to a route optimization method and system based on a graph neural network and deep reinforcement learning.
Background
In the network field, finding the optimal routing configuration for a given traffic matrix is a fundamental problem and a non-deterministic polynomial (NP) problem. Existing solutions based on deep reinforcement learning (DRL) usually preprocess the network state data, present it as a fixed-size matrix, and then process it with a conventional neural network (e.g., a fully connected or convolutional neural network) to solve the routing optimization problem. Deep reinforcement learning is studied as a key technology for network routing optimization, with the goal of building a self-driven Software Defined Network (SDN). However, the technique does not generalize when applied to different network scenarios, because most existing deep reinforcement learning methods can only be trained on a fixed network topology and cannot generalize to, or operate effectively on, dynamic network topologies. The main reason for this limitation is that computer networks are essentially expressed as graph structures (such as network topologies and routing strategies), while current DRL-based schemes almost entirely use traditional neural network architectures, which are not suited to learning, generalizing or modeling graph-structured information. Even the most advanced deep reinforcement learning methods (such as SoA-DRL) perform poorly when trained on dynamic network topologies and cannot generalize to new network topologies.
In recent years, Graph Neural Networks (GNNs) have been proposed to model and operate on graphs, facilitating relational reasoning and structural generalization. In other words, graph neural networks help learn the relationships between graph elements and the rules that compose them, and have shown unprecedented generalization ability in the field of network modeling and optimization. Network topologies and routing strategies are precisely the kind of graph-structured objects that such algorithms need to learn and optimize.
Disclosure of Invention
In view of the defects of the prior art, the invention aims to provide a route optimization method and system based on a graph neural network and deep reinforcement learning, so as to solve the traffic allocation problem on dynamic and unknown network topologies, which existing deep reinforcement learning methods struggle to handle.
In an SDN-based network topology scenario, the SDN controller has a global view of the current network state and must make routing decisions at the arrival of each traffic demand. This problem can be described as a typical network resource allocation problem: in a network topology, each link has a fixed channel capacity, and the controller is responsible for receiving traffic requests and allocating different bandwidths to each link in real time as needed. Therefore, the key issue is how to allocate bandwidth in a network topology to maximize the traffic through the network topology. In this case, the route optimization problem is defined as: an optimal routing strategy is found for each incoming traffic demand from source to destination.
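The allocation model described above can be sketched in a few lines of Python. The class and names below are illustrative stand-ins, not the patent's implementation: each link has a fixed capacity, and routing a traffic demand over a path consumes bandwidth on every link of that path.

```python
# Minimal sketch of the bandwidth-allocation model (illustrative, assumed names).

class Network:
    def __init__(self, capacities):
        # capacities: dict mapping a link (u, v) -> available capacity
        self.capacity = dict(capacities)

    def can_route(self, path, demand):
        # A path is feasible only if every link on it can carry the demand.
        links = list(zip(path, path[1:]))
        return all(self.capacity.get(l, 0) >= demand for l in links)

    def route(self, path, demand):
        # Allocate the demand on each link of the path; return the
        # traffic actually admitted (0 if the path is infeasible).
        if not self.can_route(path, demand):
            return 0
        for l in zip(path, path[1:]):
            self.capacity[l] -= demand
        return demand

net = Network({("A", "B"): 10, ("B", "C"): 5})
admitted = net.route(["A", "B", "C"], 4)   # feasible: both links have >= 4 free
rejected = net.route(["A", "B", "C"], 3)   # link ("B", "C") now has only 1 left
```

The controller's objective is then to choose, for each arriving demand, the path that keeps as much future traffic admissible as possible.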
To achieve the above object, according to a first aspect of the present invention, there is provided a route optimization method based on a graph neural network and deep reinforcement learning, including the following steps:
S0, measuring the current network state s, and taking the traffic demand that the current network state requests to allocate as the target traffic demand;
S1, selecting the k shortest paths from the source node to the destination node according to the target traffic demand, the set of these shortest paths being called the action set a, where k is a positive integer;
S2, inputting the action set a into a graph neural network to calculate link characteristics, performing aggregation and iterative updating, and obtaining the estimated Q value of the network state s and the action set a through a Q function;
S3, performing deep reinforcement learning according to the estimated Q value to obtain the routing strategy in the current network state, feeding the routing strategy back to the network topology to execute the corresponding routing action, and obtaining a new network state s';
S4, judging, in combination with the new network state s', whether a new traffic demand exists; if so, taking the traffic demand that the new network state s' requests to allocate as the target traffic demand and returning to S1; if not, waiting for the next traffic demand and returning to S0.
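The S0-S4 loop above can be organized roughly as follows. Every function here is a stand-in stub with assumed names (the real state measurement, GNN Q-estimator and path computation are detailed later in the description); only the control flow mirrors the steps.

```python
# Hedged sketch of the S0-S4 decision loop; all components are stubs.
import random

def measure_state():                    # S0: measure current network state s (stub)
    return {"load": random.random()}

def k_shortest_paths(demand, k=4):      # S1: candidate action set a (stub)
    return [f"path_{i}" for i in range(k)]

def q_values(state, actions):           # S2: GNN-based Q estimates (stub)
    return {a: random.random() for a in actions}

def select_action(q, epsilon=0.1):      # S3: epsilon-greedy choice over Q values
    if random.random() < epsilon:
        return random.choice(list(q))   # explore
    return max(q, key=q.get)            # exploit

def optimize_route(demand):
    state = measure_state()             # S0
    actions = k_shortest_paths(demand)  # S1
    q = q_values(state, actions)        # S2
    return select_action(q)             # S3; S4 would loop on the next demand

chosen = optimize_route(demand=8)
```

In the full system, applying the chosen action yields the new state s' and a reward used for training, as described below.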
Preferably, step S2 specifically includes:
calculating link characteristics of each link and adjacent links on the shortest path, aggregating the link characteristics connected with the same node, and updating the link characteristics of each link;
iterating the steps for T times, wherein T is a preset value;
and aggregating the link characteristics after iterative updating, and obtaining the estimated Q value of the network state s and the action set a through a Q function.
Preferably, the graph neural network is a neural network model consisting of a fully connected network and a recurrent neural network RNN:
calculating link characteristics by using a message passing algorithm;
completing aggregation of link characteristics by a fully connected neural network;
the updating of the link characteristics is carried out by the recurrent neural network RNN.
Preferably, the method further comprises: and regularly acquiring the reward r after the routing action is executed, feeding the reward r back to the deep reinforcement learning for accumulation, and training the deep reinforcement learning.
Preferably, the method further comprises: after the reward r for each executed routing action is obtained, forming a tuple {s, a, r, s'} from the current network state s, the action set a, the reward r and the new network state s', storing the tuple in an experience replay buffer, training the graph neural network by randomly sampling from the experience replay buffer, and updating the parameters of the graph neural network.
Preferably, after deep reinforcement learning obtains the estimated Q values, an ε-greedy exploration strategy is used: with probability ε an action is selected at random, and with probability (1−ε) the action with the maximum estimated Q value is selected; the final selection is used as the routing strategy in the current network state.
Preferably, the network state is defined by the characteristics of the topology links, including the link capacity, the link betweenness and the current traffic demand. The link capacity represents the available capacity on the link; the link betweenness is a centrality measure inherited from graph theory, indicating how many paths may traverse the link. Specifically, the link betweenness may be calculated as follows: for each pair of nodes in the topology, k candidate paths (e.g., the k shortest paths) are computed, and a counter on each link is updated to record how many paths pass through that link. The betweenness of each link is then the number of end-to-end paths through the link divided by the total number of paths. For data processing purposes, the link state features are arranged in a vector {x1, x2, ..., xN}, where x1 is the link available capacity, x2 is the link betweenness, x3 is the bandwidth demand allocated according to the current traffic request (the bandwidth allocated on the link after the routing action is applied), and x4-xN are zero-padded values.
Preferably, in a realistic large-scale network, the number of possible route combinations for each source-destination node pair results in a high-dimensional data space. This greatly complicates the routing problem, since the controller would have to estimate a Q value for every possible routing action. To reduce the dimensionality, the action set of each source-destination pair is limited to k candidate paths. In the experimental environment adopted by the invention, to keep a good balance between routing flexibility and evaluation cost, k is set to 4 shortest paths (measured in hops). The action set may vary with the source node, the destination node and the traffic demand being routed.
Aiming at the model generalization problem in network route optimization, the invention innovatively provides a DRL + GNN network route optimization architecture by incorporating a Graph Neural Network (GNN) for modeling and operating on graphs. The architecture uses the GNN to learn the relationships between the graph elements in the topology and the rules that compose them, and uses a DRL algorithm to make decisions, thereby optimizing network routing, with the potential to generalize to dynamic and unknown network topologies. The specific objective of the architecture is to allocate traffic as traffic demands arrive, maximizing the traffic carried by the network, thereby implementing a route optimization method that can both generalize and optimize.
According to a second aspect of the present invention, there is provided a route optimization system based on a graph neural network and deep reinforcement learning, comprising: a computer-readable storage medium and a processor;
the computer-readable storage medium is used for storing executable instructions;
the processor is used for reading the executable instructions stored in the computer readable storage medium and executing the route optimization method based on the graph neural network and the deep reinforcement learning.
Through the above technical scheme, compared with the prior art, the invention innovatively provides a network route optimization architecture based on a graph neural network and deep reinforcement learning, addressing the model generalization problem in network route optimization by incorporating a graph neural network for modeling and operating on graphs. The architecture uses the graph neural network to learn the relationships between the graph elements in the topology and the rules that compose them, and uses a deep reinforcement learning algorithm to make decisions, thereby optimizing network routing, with the potential to generalize to dynamic and unknown network topologies. Compared with deep reinforcement learning methods built on traditional neural network architectures, this architecture can optimize routing performance to a greater degree and, in particular, generalizes more strongly to dynamic and unknown network topologies.
Drawings
FIG. 1 is a block diagram of a network architecture to which a routing optimization method based on a graph neural network and deep reinforcement learning is applied, according to the present invention;
FIG. 2 is a schematic flow chart of a route optimization method based on a graph neural network and deep reinforcement learning according to the present invention;
fig. 3 is a schematic diagram of a message passing algorithm-based graph neural network architecture provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The invention provides a route optimization method based on a graph neural network and deep reinforcement learning, which is applied to a network architecture shown in figure 1, wherein the flow schematic diagram of the method is shown in figure 2, and the method comprises the following steps:
S0, measuring the current network state s, and taking the traffic demand that the current network state requests to allocate as the target traffic demand;
S1, selecting the k shortest paths from the source node to the destination node according to the target traffic demand, the set of these shortest paths being called the action set a, where k is a positive integer;
S2, inputting the action set a into a graph neural network to calculate link characteristics, performing aggregation and iterative updating, and obtaining the estimated Q value of the network state s and the action set a through a Q function;
S3, performing deep reinforcement learning according to the estimated Q value to obtain the routing strategy in the current network state, feeding the routing strategy back to the network topology to execute the corresponding routing action, and obtaining a new network state s';
S4, judging, in combination with the new network state s', whether a new traffic demand exists; if so, taking the traffic demand that the new network state s' requests to allocate as the target traffic demand and returning to S1; if not, waiting for the next traffic demand and returning to S0.
Specifically, step S2 specifically includes:
calculating link characteristics of each link and adjacent links on the shortest path, aggregating the link characteristics connected with the same node, and updating the link characteristics of each link;
iterating the steps for T times, wherein T is a preset value;
and aggregating the link characteristics after iterative updating, and obtaining the estimated Q value of the network state s and the action set a through a Q function.
Specifically, the graph neural network is a neural network model composed of a fully-connected network and a recurrent neural network RNN:
calculating link characteristics by using a message passing algorithm;
completing aggregation of link characteristics by a fully connected neural network;
the updating of the link characteristics is carried out by the recurrent neural network RNN.
As shown in fig. 3, the specific steps are:
(1) calculating the link characteristics of each link and its adjacent links on the path through a message function M;
(2) aggregating the link characteristics connected to the same node with a fully connected neural network (sum);
(3) updating the link state of each link with a recurrent neural network (RNN); steps (1) to (3) are iterated T times;
(4) inputting the iteratively updated link states into a fully connected neural network (sum) for aggregation, and obtaining the estimated Q value of the network state s and the action set a through the Q function.
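The message-passing loop just described can be illustrated numerically. In this sketch the learned components are replaced by fixed arithmetic stand-ins: a scalar per link instead of a feature vector, halving instead of the message function, plain summation instead of the fully connected aggregator, and an averaging step instead of the RNN cell. Only the control flow matches the patent's steps (1)-(4).

```python
# Toy message-passing sketch; all operations are arithmetic stand-ins,
# not the patent's trained networks.

# Each link carries a scalar hidden state; links sharing a node are neighbors.
links = ["AB", "BC", "CD"]
neighbors = {"AB": ["BC"], "BC": ["AB", "CD"], "CD": ["BC"]}
h = {"AB": 1.0, "BC": 2.0, "CD": 3.0}

T = 2  # number of message-passing iterations (a preset value, as in the text)
for _ in range(T):
    # (1) message: each link receives a transformed state from its neighbors
    msgs = {l: [0.5 * h[n] for n in neighbors[l]] for l in links}
    # (2) aggregation: sum incoming messages (fully connected net in the patent)
    agg = {l: sum(m) for l, m in msgs.items()}
    # (3) update: combine old state and aggregate (RNN cell in the patent)
    h = {l: 0.5 * h[l] + 0.5 * agg[l] for l in links}

# (4) readout: aggregate final link states and map them to a Q-value estimate
q_value = sum(h.values())
```

A real implementation would use vector-valued link states and trainable parameters throughout, but the information flow between links is the same.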
Specifically, the method further comprises: and regularly acquiring the reward r after the routing action is executed, feeding the reward r back to the deep reinforcement learning for accumulation, and training the deep reinforcement learning.
Specifically, the method further comprises: after the reward r for each executed routing action is obtained, forming a tuple {s, a, r, s'} from the current network state s, the action set a, the reward r and the new network state s'; each tuple is stored in an experience replay buffer, the graph neural network is trained by random sampling from the buffer, and the parameters of the graph neural network are updated.
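The experience replay mechanism described above can be sketched as follows; the class name and capacity value are illustrative assumptions.

```python
# Minimal experience replay sketch: store (s, a, r, s') tuples and
# sample random minibatches for training.
import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=10000):
        self.buf = deque(maxlen=capacity)   # oldest tuples are evicted first

    def store(self, s, a, r, s_next):
        self.buf.append((s, a, r, s_next))

    def sample(self, batch_size):
        # Random sampling breaks the correlation between consecutive steps.
        return random.sample(self.buf, min(batch_size, len(self.buf)))

buf = ReplayBuffer(capacity=3)
for step in range(5):                       # store 5 transitions; only 3 kept
    buf.store(f"s{step}", "a", 1.0, f"s{step + 1}")
batch = buf.sample(2)
```

Training on randomly sampled tuples rather than the most recent transition is what stabilizes Q-value estimation here.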
Specifically, after deep reinforcement learning obtains the estimated Q values, an ε-greedy exploration strategy is used: with probability ε an action is selected at random, and with probability (1−ε) the action with the maximum estimated Q value is selected; the final selection is used as the routing strategy in the current network state.
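The ε-greedy selection just described can be sketched in a few lines; the action names and Q values below are illustrative.

```python
# Sketch of epsilon-greedy action selection over estimated Q values.
import random

def epsilon_greedy(q_estimates, epsilon):
    # q_estimates: dict mapping action -> estimated Q value
    if random.random() < epsilon:
        return random.choice(list(q_estimates))    # explore with probability eps
    return max(q_estimates, key=q_estimates.get)   # exploit otherwise

q = {"path_0": 0.2, "path_1": 0.9, "path_2": 0.5}
greedy = epsilon_greedy(q, epsilon=0.0)   # epsilon = 0 always exploits
```

In practice ε is usually decayed over training so the agent explores early and exploits once the Q estimates are reliable.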
Specifically, the network state is defined by the characteristics of the topology links, including the link capacity, the link betweenness and the current traffic demand. The link capacity represents the available capacity on the link; the link betweenness is a centrality measure inherited from graph theory, indicating how many paths may traverse the link. The link betweenness may be calculated as follows: for each pair of nodes in the topology, k candidate paths (e.g., the k shortest paths) are computed, and a counter on each link is updated to record how many paths pass through that link. The betweenness of each link is then the number of end-to-end paths through the link divided by the total number of paths. For data processing purposes, the link state features are arranged in a vector {x1, x2, ..., xN}, where x1 is the link available capacity, x2 is the link betweenness, x3 is the bandwidth demand allocated according to the current traffic request (the bandwidth allocated on the link after the routing action is applied), and x4-xN are zero-padded values.
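A minimal pure-Python sketch of the betweenness computation just described follows. The 4-node topology is a toy example, and for brevity a single BFS shortest path per node pair is used (i.e. k = 1), whereas the patent allows k candidate paths per pair.

```python
# Link betweenness: fraction of end-to-end candidate paths traversing each link.
from collections import deque

adj = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}

def bfs_path(src, dst):
    # Standard BFS shortest path on an unweighted, connected graph.
    prev, seen, q = {}, {src}, deque([src])
    while q:
        u = q.popleft()
        if u == dst:
            break
        for v in adj[u]:
            if v not in seen:
                seen.add(v); prev[v] = u; q.append(v)
    path, u = [dst], dst
    while u != src:
        u = prev[u]; path.append(u)
    return path[::-1]

counter, total = {}, 0
nodes = sorted(adj)
for i, s in enumerate(nodes):
    for d in nodes[i + 1:]:
        total += 1
        p = bfs_path(s, d)
        for link in zip(p, p[1:]):
            key = tuple(sorted(link))            # undirected link identifier
            counter[key] = counter.get(key, 0) + 1

betweenness = {l: c / total for l, c in counter.items()}
```

With k > 1 the inner loop would simply iterate over k candidate paths per pair, and `total` would count every candidate path.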
In particular, in a realistic large-scale network, the number of possible route combinations for each source-destination node pair results in a high-dimensional data space. This greatly complicates the routing problem, since the controller would have to estimate a Q value for every possible routing action. To reduce the dimensionality, the action set of each source-destination pair is limited to k candidate paths. In the experimental environment adopted by the invention, to keep a good balance between routing flexibility and evaluation cost, k is set to 4 shortest paths (measured in hops). The action set may vary with the source node, the destination node and the traffic demand being routed.
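Restricting the action set to the k hop-shortest candidate paths, as described above, can be sketched as follows (toy directed topology; the function names are illustrative). Enumerating all simple paths is fine for small graphs; a real deployment would use a dedicated k-shortest-paths algorithm such as Yen's.

```python
# Build the k hop-shortest candidate paths for one source-destination pair.
adj = {"A": ["B", "C"], "B": ["C", "D"], "C": ["D"], "D": []}

def simple_paths(src, dst, path=None):
    # Depth-first enumeration of loop-free paths from src to dst.
    path = (path or []) + [src]
    if src == dst:
        yield path
        return
    for nxt in adj[src]:
        if nxt not in path:              # keep paths simple (no revisited nodes)
            yield from simple_paths(nxt, dst, path)

def action_set(src, dst, k=4):
    # The k hop-shortest candidate paths form the action set for this pair.
    return sorted(simple_paths(src, dst), key=len)[:k]

actions = action_set("A", "D")           # only 3 simple A->D paths exist here
```

Here the pair A-D has only three simple paths, so the action set holds all of them; in a denser topology the cap at k = 4 would take effect.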
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (8)
1. A route optimization method based on a graph neural network and deep reinforcement learning is characterized by comprising the following steps:
S0, measuring the current network state s, and taking the traffic demand that the current network state requests to allocate as the target traffic demand;
S1, selecting the k shortest paths from the source node to the destination node according to the target traffic demand, the set of these shortest paths being called the action set a, where k is a positive integer;
S2, inputting the action set a into a graph neural network to calculate link characteristics, performing aggregation and iterative updating, and obtaining the estimated Q value of the network state s and the action set a through a Q function;
S3, performing deep reinforcement learning according to the estimated Q value to obtain the routing strategy in the current network state, feeding the routing strategy back to the network topology to execute the corresponding routing action, and obtaining a new network state s';
S4, judging, in combination with the new network state s', whether a new traffic demand exists; if so, taking the traffic demand that the new network state s' requests to allocate as the target traffic demand and returning to S1; if not, waiting for the next traffic demand and returning to S0.
2. The route optimization method according to claim 1, wherein the step S2 specifically includes:
calculating link characteristics of each link and adjacent links on the shortest path, aggregating the link characteristics connected with the same node, and updating the link characteristics of each link;
iterating the steps for T times, wherein T is a preset value;
and aggregating the link characteristics after iterative updating, and obtaining the estimated Q value of the network state s and the action set a through a Q function.
3. The route optimization method according to claim 2, wherein the graph neural network is a neural network model composed of a fully connected network and a recurrent neural network RNN:
calculating link characteristics by using a message passing algorithm;
completing aggregation of link characteristics by a fully connected neural network;
the updating of the link characteristics is carried out by the recurrent neural network RNN.
4. The route optimization method of claim 1, further comprising: and regularly acquiring the reward r after the routing action is executed, feeding the reward r back to the deep reinforcement learning for accumulation, and training the deep reinforcement learning.
5. The route optimization method of claim 4, further comprising: after obtaining the reward r after executing the routing action each time, forming a tuple { s, a, r, s '} by the current network state s, the action set a, the reward r and the new network state s', and accumulating the tuple;
and training the graph neural network by randomly sampling from the accumulated tuples, and updating the parameters of the graph structure network.
6. The route optimization method of claim 3, wherein after the deep reinforcement learning obtains the estimated Q values, an ε-greedy exploration strategy is used: with probability ε an action is selected at random, and with probability (1−ε) the action with the maximum estimated Q value is selected; the final selection is used as the routing strategy in the current network state.
7. The route optimization method according to claim 1, wherein the network state is represented by a vector {x1, x2, ..., xN}, where x1 is the link available capacity, x2 is the link betweenness, x3 is the current traffic demand, x4-xN are zero-padded values, and N is the dimension of the network state vector.
8. A route optimization system based on a graph neural network and deep reinforcement learning is characterized by comprising: a computer-readable storage medium and a processor;
the computer-readable storage medium is used for storing executable instructions;
the processor is used for reading executable instructions stored in the computer-readable storage medium and executing the method for route optimization based on the graph neural network and the deep reinforcement learning according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110435964.9A CN113194034A (en) | 2021-04-22 | 2021-04-22 | Route optimization method and system based on graph neural network and deep reinforcement learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110435964.9A CN113194034A (en) | 2021-04-22 | 2021-04-22 | Route optimization method and system based on graph neural network and deep reinforcement learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113194034A true CN113194034A (en) | 2021-07-30 |
Family
ID=76978560
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110435964.9A Pending CN113194034A (en) | 2021-04-22 | 2021-04-22 | Route optimization method and system based on graph neural network and deep reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113194034A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114051272A (en) * | 2021-10-30 | 2022-02-15 | 西南电子技术研究所(中国电子科技集团公司第十研究所) | Intelligent routing method for dynamic topological network |
CN114221897A (en) * | 2021-12-09 | 2022-03-22 | 网络通信与安全紫金山实验室 | Routing method, device, equipment and medium based on multi-attribute decision |
CN114389990A (en) * | 2022-01-07 | 2022-04-22 | 中国人民解放军国防科技大学 | Shortest path blocking method and device based on deep reinforcement learning |
CN114697229A (en) * | 2022-03-11 | 2022-07-01 | 华中科技大学 | Construction method and application of distributed routing planning model |
CN115022231A (en) * | 2022-06-30 | 2022-09-06 | 武汉烽火技术服务有限公司 | Optimal path planning method and system based on deep reinforcement learning |
CN115173923A (en) * | 2022-07-04 | 2022-10-11 | 重庆邮电大学 | Energy efficiency perception route optimization method and system for low-orbit satellite network |
WO2023020502A1 (en) * | 2021-08-17 | 2023-02-23 | 华为技术有限公司 | Data processing method and apparatus |
CN116366529A (en) * | 2023-04-20 | 2023-06-30 | 哈尔滨工业大学 | Adaptive routing method based on deep reinforcement learning in SDN (software defined network) background |
WO2024037136A1 (en) * | 2022-08-15 | 2024-02-22 | 南京邮电大学 | Graph structure feature-based routing optimization method and system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110611619A (en) * | 2019-09-12 | 2019-12-24 | 西安电子科技大学 | Intelligent routing decision method based on DDPG reinforcement learning algorithm |
CN110896556A (en) * | 2019-04-04 | 2020-03-20 | 中国电子科技集团公司第五十四研究所 | Time synchronization method and device for post-5G forward transmission network based on deep reinforcement learning |
CN112116129A (en) * | 2020-08-24 | 2020-12-22 | 中山大学 | Dynamic path optimization problem solving method based on deep reinforcement learning |
US20210089868A1 (en) * | 2019-09-23 | 2021-03-25 | Adobe Inc. | Reinforcement learning with a stochastic action set |
- 2021-04-22: application CN202110435964.9A filed (publication CN113194034A, status Pending)
Non-Patent Citations (2)
Title |
---|
Paul Almasan et al.: "Deep reinforcement learning meets graph neural networks: an optical network routing use case", arXiv preprint arXiv:1910.07421 * |
Liu Chenyi et al.: "A survey of intelligent routing algorithms based on machine learning", Journal of Computer Research and Development * |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023020502A1 (en) * | 2021-08-17 | 2023-02-23 | Huawei Technologies Co., Ltd. | Data processing method and apparatus |
CN114051272A (en) * | 2021-10-30 | 2022-02-15 | Southwest China Institute of Electronic Technology (The 10th Research Institute of China Electronics Technology Group Corporation) | Intelligent routing method for dynamic topology networks |
CN114221897A (en) * | 2021-12-09 | 2022-03-22 | Purple Mountain Laboratories | Routing method, device, equipment and medium based on multi-attribute decision-making |
CN114389990A (en) * | 2022-01-07 | 2022-04-22 | National University of Defense Technology | Shortest-path interdiction method and device based on deep reinforcement learning |
CN114697229A (en) * | 2022-03-11 | 2022-07-01 | Huazhong University of Science and Technology | Construction method and application of a distributed route planning model |
CN114697229B (en) * | 2022-03-11 | 2023-04-07 | Huazhong University of Science and Technology | Construction method and application of a distributed route planning model |
CN115022231A (en) * | 2022-06-30 | 2022-09-06 | Wuhan FiberHome Technical Services Co., Ltd. | Optimal path planning method and system based on deep reinforcement learning |
CN115022231B (en) * | 2022-06-30 | 2023-11-03 | Wuhan FiberHome Technical Services Co., Ltd. | Optimal path planning method and system based on deep reinforcement learning |
CN115173923A (en) * | 2022-07-04 | 2022-10-11 | Chongqing University of Posts and Telecommunications | Energy-efficiency-aware route optimization method and system for low-Earth-orbit satellite networks |
CN115173923B (en) * | 2022-07-04 | 2023-07-04 | Chongqing University of Posts and Telecommunications | Energy-efficiency-aware route optimization method and system for low-Earth-orbit satellite networks |
WO2024037136A1 (en) * | 2022-08-15 | 2024-02-22 | Nanjing University of Posts and Telecommunications | Routing optimization method and system based on graph structure features |
CN116366529A (en) * | 2023-04-20 | 2023-06-30 | Harbin Institute of Technology | Adaptive routing method based on deep reinforcement learning in an SDN (software-defined network) context |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113194034A (en) | Route optimization method and system based on graph neural network and deep reinforcement learning | |
CN110601973B (en) | Route planning method, system, server and storage medium | |
CN108076158B (en) | Minimum load route selection method and system based on naive Bayes classifier | |
CN106411749A (en) | Path selection method for software defined network based on Q learning | |
CN108075975B (en) | Method and system for determining route transmission path in Internet of things environment | |
Bernárdez et al. | Is machine learning ready for traffic engineering optimization? | |
CN111770019A (en) | Q-learning optical network-on-chip self-adaptive route planning method based on Dijkstra algorithm | |
CN113098714A (en) | Low-delay network slicing method based on deep reinforcement learning | |
CN111130853A (en) | Future route prediction method of software defined vehicle network based on time information | |
CN111641557A (en) | Minimum cost backup path method for delay tolerant network | |
van Leeuwen et al. | CoCoA: A non-iterative approach to a local search (A) DCOP solver | |
Qin et al. | Traffic optimization in satellites communications: A multi-agent reinforcement learning approach | |
CN113645589B (en) | Unmanned aerial vehicle cluster route calculation method based on inverse fact policy gradient | |
Zhang et al. | A service migration method based on dynamic awareness in mobile edge computing | |
Wang et al. | GRouting: dynamic routing for LEO satellite networks with graph-based deep reinforcement learning | |
CN116963225B (en) | Wireless mesh network routing method for streaming media transmission | |
CN111404595B (en) | Method for evaluating health degree of space-based network communication satellite | |
Meng et al. | Intelligent routing orchestration for ultra-low latency transport networks | |
CN116155805A (en) | Distributed intelligent routing method, system, electronic equipment and storage medium | |
Dana et al. | Backup path set selection in ad hoc wireless network using link expiration time | |
Singh et al. | Multi-agent reinforcement learning based efficient routing in opportunistic networks | |
Almasan et al. | Enero: Efficient real-time routing optimization | |
KR102308799B1 (en) | Method for selecting forwarding path based on learning medium access control layer collisions in internet of things networks, recording medium and device for performing the method | |
CN116418492A (en) | Route establishment method, system and quantum cryptography network | |
Xu et al. | A graph reinforcement learning based SDN routing path selection for optimizing long-term revenue |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
RJ01 | Rejection of invention patent application after publication ||
Application publication date: 2021-07-30 |