CN110986979A - SDN multi-path routing planning method based on reinforcement learning - Google Patents
- Publication number
- CN110986979A (application CN201911183909.4A; granted as CN110986979B)
- Authority
- CN
- China
- Prior art keywords
- reinforcement learning
- path
- routing
- flow
- data packet
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C21/00—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
- G01C21/26—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
- G01C21/34—Route searching; Route guidance
- G01C21/3446—Details of route searching algorithms, e.g. Dijkstra, A*, arc-flags, using precalculated routes
Abstract
The invention discloses an SDN multi-path route planning method based on reinforcement learning, comprising the following steps: applying reinforcement learning to SDN multi-path route planning, using the Q-Learning algorithm as the reinforcement learning model, and generating different reward values according to the different QoS (Quality of Service) levels of flows; setting different reward functions for flows of different QoS levels according to the input network topology matrix and the feature matrix of the flow currently to be forwarded, and planning multiple paths to forward the flows; and, when link bandwidth is insufficient, dividing a larger flow into several smaller flows, thereby improving link bandwidth utilization. By exploiting reinforcement learning's continuous interaction with the environment and policy adjustment, the invention achieves high link utilization and effectively reduces network congestion compared with traditional single-path route planning.
Description
Technical Field
The invention relates to the field of network communication technology and reinforcement learning, in particular to an SDN multi-path route planning method based on reinforcement learning.
Background
In recent years, with the popularization of the internet, and particularly with the emergence of related technologies such as cloud computing and big data, the internet has entered a period of rapid development. This rapid development has caused the data volume of network transmission services to grow quickly; in particular, with the rise of short-video and live-streaming platforms, network services have become more interactive and real-time, and end users place higher demands on the Quality of Service (QoS) of network services. However, with limited network resources, the continuous growth of internet traffic leads to problems such as sharply increasing bandwidth consumption, difficulty in guaranteeing quality of service, and growing security concerns. Clearly, the traditional network architecture struggles to meet users' diversified requirements. In view of the foregoing, the internet industry needs a new network architecture that addresses existing network problems and is more flexible and efficient than conventional architectures, so as to meet society's ever-growing traffic demands.
SDN is a novel network architecture that has attracted wide attention and solves several problems that are unavoidable in traditional networks. In the traditional architecture, each device independently makes forwarding decisions and exchanges information through a series of network protocols (such as TCP/IP). Under this architecture, control and forwarding are tightly coupled in the network device: each device plans paths for traffic centered on itself, without global network resource information, which easily leads to problems such as link congestion. SDN separates forwarding from control and can obtain link information in real time through the OpenFlow protocol. This favors centralized control of the network: the control layer obtains global resource information and performs unified management and allocation according to service demands; at the same time, centralized control lets the whole network be treated as a single entity, which simplifies maintenance. Compared with a traditional IP network, an SDN network avoids problems such as inaccurate routing information and low routing efficiency, and lays a foundation for intelligent route planning tailored to the requirements of different flows. Research on the SDN architecture is therefore of great significance. Routing is an indispensable component of both traditional and SDN networks; however, the algorithm currently adopted by mainstream SDN routing modules is the Dijkstra (shortest path) algorithm. If all data packets rely only on the shortest-path algorithm, data flows tend to select the same links, causing link congestion while leaving other links idle, which greatly reduces link utilization.
On the other hand, the shortest-path algorithm is a graph-theoretic algorithm that, when run, actually computes the shortest paths from the source node to all other nodes in the topology, so its time complexity is high. Protocols that support multipath forwarding, such as ECMP, do exist, but they do not take the quality-of-service requirements of different traffic flows into account. A better routing strategy is therefore needed in SDN networks to generate routes, improve network performance, and guarantee the service quality of different service flows.
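The single-path behaviour criticized above can be illustrated with a minimal Dijkstra sketch. The 4-switch topology and weights below are hypothetical, not from the patent; the point is that every flow between the same pair of nodes picks the same shortest path, leaving alternatives idle:

```python
import heapq

def dijkstra(adj, src):
    """Shortest distances from src to every node.
    adj: {node: [(neighbor, weight), ...]} adjacency list."""
    dist = {n: float("inf") for n in adj}
    dist[src] = 0
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

# Hypothetical topology: s1->s2->s4 (cost 2) always wins over s1->s3->s4
# (cost 3), so all flows share one path -- the congestion problem above.
adj = {
    "s1": [("s2", 1), ("s3", 1)],
    "s2": [("s1", 1), ("s4", 1)],
    "s3": [("s1", 1), ("s4", 2)],
    "s4": [("s2", 1), ("s3", 2)],
}
print(dijkstra(adj, "s1"))
```

Note that, as the text observes, the algorithm computes distances to *all* nodes even when only one destination is needed.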
Disclosure of Invention
The current SDN network mainly adopts the shortest path as its route-planning algorithm, which leads to problems such as low link bandwidth utilization. To address this, the invention provides an SDN intelligent route planning technique oriented toward high bandwidth utilization.
The technical scheme adopted by the invention for solving the technical problem is as follows: an SDN multi-path route planning method based on reinforcement learning comprises the following steps:
step 1: acquiring available bandwidth information, total bandwidth information, node information and link information of a network to construct a network topology matrix, and acquiring a characteristic matrix of a stream to be forwarded;
step 2: the Q-Learning algorithm is adopted as the reinforcement learning model, and the network topology matrix and the feature matrix of the flow to be forwarded from step 1 are input into the reinforcement learning model to train the Q-value table; the reward function R in the Q-Learning algorithm is:

R_t(S_i, A_j) = β·η − β·g(x) + δ(j − d), if T[S_i][A_j] = 1; −1, otherwise

where R_t(S_i, A_j) denotes the reward obtained when a data packet in state S_i selects action A_j; in the route-planning task it is the reward generated when the next hop selected by the data packet at node i is node j. β is the QoS level of the flow, η is the bandwidth utilization rate, d is the destination node, and δ(j − d) is an impulse function whose value is 1 when the next hop of the data packet is the destination node. T is the connection state of the network topology nodes: 1 when two nodes are connected and 0 when they are not. g(x) is a cost function, described below, where l_m is the total number of links of the network topology and x is the number of hops the data packet has traversed during forwarding;
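The terms defined above (β, η, δ(j − d), T, and the cost g(x)) can be assembled into a small sketch of the reward. Both the way the terms are combined and the concrete form of g(x) are assumptions here, since the original formulas are given only as images:

```python
import math

def reward(i, j, T, beta, eta, dest, l_m, hops):
    """Sketch of R_t(S_i, A_j): combines QoS level beta, bandwidth
    utilization eta, an impulse delta(j - d) at the destination, and the
    hop-count cost g(x); returns -1 for an unconnected next hop."""
    if T[i][j] != 1:
        return -1.0
    g = 1.0 - math.exp(-3.0 * hops / l_m)  # illustrative g(x) in (0, 1)
    delta = 1.0 if j == dest else 0.0      # impulse at the destination
    return beta * eta - beta * g + delta

# Two connected nodes; node 1 is the destination of this packet.
T = [[0, 1], [1, 0]]
print(reward(0, 1, T, beta=2, eta=0.5, dest=1, l_m=12, hops=1))
```

With these assumed constants, a connected hop into the destination yields a positive reward, while an unconnected choice is penalized with −1 as the text specifies.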
step 3: a path routing is obtained according to the Q-value table and added to the path set Routing(S, D), and it is judged whether the minimum link bandwidth of the routing is smaller than the bandwidth of the flow. If so, a sub-flow of size B_avail · β / Σ_i β_i is divided from the flow, where B_avail denotes the minimum available link bandwidth of the current output path, β denotes the QoS level of the current flow, and Σ_i β_i denotes the overall QoS level. The divided sub-flow is forwarded from the source node to the destination node through the current output path, and the remaining traffic is treated as a new flow, returning to step 2 to retrain the Q-value table. If not, planning is finished, and the planned multi-path routes are obtained from Routing(S, D).
Further, the traffic characteristic matrix to be forwarded includes a source address, a destination address, a QoS class, and a traffic size of the flow.
Further, the process of training the Q-value table by the Q-Learning algorithm is specifically as follows. Set the maximum number of steps for a single training round, then:
(1) initialize the Q-value table and the reward function R;
(2) select an action a using the ε-greedy action strategy P;
(3) execute action a, transition to state s', calculate the reward value using the reward function R, and update the Q-value table;
(4) judge whether s' is the destination node; if not, let s = s' and return to (2).
Further, in the SDN multipath route planning method based on reinforcement learning, the cost function is defined such that the cost increases with the number of hops x traversed by the forwarded data packet, with g(x) ∈ (0, 1), and the cost function should satisfy: the curve of g(x) is concave (convex upward), and as the total hop count of the data packet tends to infinity, the cost function value tends to 1.
The beneficial effects of the method are as follows: reinforcement learning is applied to SDN multi-path route planning, with the Q-Learning algorithm as the reinforcement learning model; different reward functions are set for flows of different QoS levels according to the input network topology matrix and the feature matrix of the flow currently to be forwarded; multiple paths are planned to forward the flows; and when link bandwidth is insufficient, a larger flow is divided into several smaller flows, improving link bandwidth utilization.
Drawings
Fig. 1 is an SDN multi-path route planning architecture diagram;
Fig. 2 is an SDN network topology diagram;
Fig. 3 is a graph of the cost function;
Fig. 4 is a flow chart of the reinforcement learning-based multi-path route planning.
Detailed Description
Given that existing SDN controllers adopt the Dijkstra algorithm as the shortest-route search algorithm, this method attempts to apply reinforcement learning to SDN routing. Exploiting SDN's separation of forwarding and control, the network topology environment is used directly for training the Q-value table. Considering that different services have different QoS requirements, the invention provides routes of different service quality for different services; and when link bandwidth is insufficient, a larger flow is divided into several smaller flows, improving link bandwidth utilization.
As shown in fig. 1, the present invention provides a reinforcement learning based SDN multi-path route planning method, which includes the following steps:
Step 1: acquire the available bandwidth information, total bandwidth information, node information, and link information of the network to construct a network topology matrix. A network topology as shown in Fig. 2 is constructed with Mininet, comprising 9 OpenFlow switches and 5 hosts. The feature matrix of the flow to be forwarded is acquired. According to the multi-path route planning algorithm and the SDN topology, the bandwidth of each network link is set to 200; hosts h1–h5 act as both senders and receivers, and each sender sends data to each of the other receivers with a probability of 20%. All hosts together send 30 static flows, where a static flow is a flow that, once injected into the network, occupies link bandwidth until the experiment ends.
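Step 1 can be sketched as follows. The link list below is hypothetical (the real adjacency is defined by Fig. 2, which is not reproduced here), and the flow record mirrors the feature-matrix fields named later (source address, destination address, QoS level, traffic size):

```python
N = 9  # OpenFlow switches s1..s9 (indices 0..8)
# Hypothetical links standing in for the Fig. 2 topology.
links = [(0, 1), (0, 3), (1, 2), (1, 4), (2, 5), (3, 4),
         (4, 5), (3, 6), (4, 7), (5, 8), (6, 7), (7, 8)]

T = [[0] * N for _ in range(N)]        # node connection-state matrix
B_total = [[0.0] * N for _ in range(N)]  # total bandwidth per link
for i, j in links:
    T[i][j] = T[j][i] = 1
    B_total[i][j] = B_total[j][i] = 200.0  # each link set to 200, as in the text
B_avail = [row[:] for row in B_total]  # available bandwidth, reduced as flows are placed

# One flow to be forwarded: source, destination, QoS level beta, size.
flow = {"src": 0, "dst": 8, "beta": 3, "size": 50.0}
print(sum(map(sum, T)) // 2)  # number of undirected links
```

In the full method these matrices, plus the flow record, form the input to the reinforcement learning model of step 2.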
Step 2: as shown in Fig. 1, the Q-Learning algorithm is used as the reinforcement learning model. The multi-path routing algorithm proposed by the invention is modeled as a Markov decision process; accordingly, the MDP quadruple of the proposed model is defined as follows:
(1) State set: in the network topology, each switch represents a state; thus, according to the network topology, the set of network states is defined as:

S = [s_1, s_2, s_3, …, s_9]

where s_1–s_9 represent the 9 OpenFlow switches in the network. The source node of a data packet indicates its initial state, and the destination node indicates its termination state. When a data packet reaches the destination node, it reaches the termination state. Once the current data packet reaches the termination state, one round of training terminates, and the data packet returns to the initial state for the next round of training.
(2) Action space: in an SDN network, the transmission path of a data packet is determined by the network state, i.e., a data packet can only be transmitted between connected network nodes. According to the network topology, the network connection state T is defined such that T[s_i][s_j] = 1 when nodes s_i and s_j are directly connected, and T[s_i][s_j] = 0 otherwise.
Since data packets can only be transmitted between connected network nodes, the action set of each state s_i ∈ S can be defined from the state set and the network connection state as:

A(s_i) = { s_j | T[s_i][s_j] = 1 }

That is, when the current state is s_i, the selectable actions correspond, on the network topology, to the nodes s_j directly connected to s_i; the current state s_i will only select a state s_j connected to it. For example, the action set of state s_1 is A(s_1) = {s_2, s_4}.
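The action set A(s_i) can be read straight off the connection matrix. A sketch, using a hypothetical 4-node matrix arranged so that s_1 connects to s_2 and s_4, matching the example above (0-based indices in code):

```python
def action_set(T, i):
    """A(s_i) = { s_j | T[i][j] == 1 }: next hops selectable from state s_i."""
    return [j for j, connected in enumerate(T[i]) if connected == 1]

# Hypothetical 4-node connection matrix: s1-s2, s2-s3, s3-s4, s4-s1.
T = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
print(action_set(T, 0))  # s1's neighbours: indices of s2 and s4
```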
(3) State transition: in each round of training, when a data packet in state s_i selects an action, it transitions to the corresponding next state. Another key issue in reinforcement learning is the generation of reward values: whenever the agent makes a state transition, the system feeds back a reward to the agent according to the reward function R.
(4) Reward: the final purpose of reinforcement learning-based multi-path route planning is to plan reasonable multiple paths through training, so the setting of the reward value R is also important. Bandwidth utilization and delay are mainly considered here, where delay mainly refers to the hop count of the path; to plan different paths for flows of different QoS levels, the larger the traffic level β, the smaller the hop count of the planned path should be. The design requirements are:
1. the QoS level β and link utilization η need to be considered;
2. flows with large β are encouraged to be allocated paths with fewer hops.
In summary, the reward function designed here is as follows:
wherein: rt(Si,Aj) Indicating slave status S of a data packetiSelection action AjThe obtained reward is represented in a routing planning task as the reward generated when the next hop selected by the data packet at the node i is the node j, β is the flow QoS grade, η is the bandwidth utilization rate, d is the destination node, delta (j-d) is an impulse function and represents that when the next hop of the data packet is the destination node, the value is 1, T is the connection state of the network topology nodes, the two nodes are 1 when connected and 0 when not connected, and the reward function represents that the data packet is in the state SiWhen the next hop that can be selected (connected) is j (action a)j) Then, from T [ S ]i][Aj]The bonus function when 1 yields a bonus value, otherwise the bonus value is set to-1.
g(x) is the cost function, defined so that the cost increases with the number of hops x the data packet has traversed, with g(x) ∈ (0, 1), where l_m is the total number of links in the network topology. Considering that when the network topology is large it is impractical to traverse all paths, and a data packet can only be forwarded through part of the links, the cost function should grow quickly in the early stage and level off in the later stage; if the total number of hops traversed by the data packet reaches l_m, the cost function value is maximized. The specific form of the cost function is given by the curve in Fig. 3.
As shown in Fig. 3, the cost function is an increasing function with range (0, 1); the cost increases with the hop count x, and the function grows rapidly in the early stage and levels off in the later stage, meeting the requirements of the cost function.
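Fig. 3 is not reproduced here; one concrete function satisfying the stated properties (increasing, concave, range (0, 1), tending to 1 as x grows) is g(x) = 1 − e^(−3x/l_m). This particular form is an illustrative assumption, not necessarily the patent's curve:

```python
import math

def g(x, l_m=12):
    """Illustrative cost function: concave, increasing, range (0, 1),
    approaching 1 as the hop count x grows large."""
    return 1.0 - math.exp(-3.0 * x / l_m)

# Verify the properties required by the text.
vals = [g(x) for x in range(1, 13)]
assert all(0 < v < 1 for v in vals)                      # range (0, 1)
assert all(b > a for a, b in zip(vals, vals[1:]))        # increasing
diffs = [b - a for a, b in zip(vals, vals[1:])]
assert all(d2 < d1 for d1, d2 in zip(diffs, diffs[1:]))  # concave: slowing growth
print(round(g(12), 3))
```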
The second requirement of the reward function design is to encourage flows with large β to be allocated paths with a small hop count, so the cost function g(x) in the reward function is multiplied by the traffic QoS level β. Under the same conditions, the more hops a forwarded packet traverses, the larger the cost incurred by a flow of high QoS level, so paths with fewer hops will be selected for flows of high QoS level during planning.
After the MDP quadruple is determined, the Q-value table is trained using the Q-Learning algorithm, with the following specific steps. Set the maximum number of steps for a single training round, then:
(1) initialize the Q-value table and the reward function R;
(2) select an action a using the ε-greedy action strategy P;
(3) execute action a, transition to state s', calculate the reward value using the reward function R, and update the Q-value table;
(4) judge whether s' is the destination node; if not, let s = s' and return to step (2).
As shown in fig. 4, the learning rate α is set to 0.8, the discount rate γ is set to 0.6, and the value of the action strategy ∈ -greedy strategy ∈ is equal to 0.1.
And obtaining a path Routing according to the trained Q value table, putting the path Routing into a path set Routing (S, D), and judging whether the minimum link bandwidth of the Routing is smaller than the bandwidth of the stream. If so, a slice of size is divided from the streamOf (a) wherein BCan be usedRepresents the minimum link available bandwidth for the current output path, β represents the Qos level, Σ, of the current flowiβiIndicating the overall QoS class. And the small flow reaches the target node from the source destination node through the current output path. Use ofUpdating the flow, wherein B represents the size of the planned flow, namely returning the rest flows as new flows to step 2 for the training of the Q value table; if not, the planning is finished, and the planned multi-path route is obtained from the Routing (S, D).
The above-described embodiments are intended to illustrate rather than to limit the invention, and any modifications and variations of the present invention are within the spirit of the invention and the scope of the appended claims.
Claims (4)
1. An SDN multi-path route planning method based on reinforcement learning is characterized by comprising the following steps:
step 1: acquiring available bandwidth information, total bandwidth information, node information and link information of a network to construct a network topology matrix, and acquiring a characteristic matrix of a stream to be forwarded;
step 2: the Q-Learning algorithm is adopted as the reinforcement learning model, and the network topology matrix and the feature matrix of the flow to be forwarded from step 1 are input into the reinforcement learning model to train the Q-value table; the reward function R in the Q-Learning algorithm is:

R_t(S_i, A_j) = β·η − β·g(x) + δ(j − d), if T[S_i][A_j] = 1; −1, otherwise

where R_t(S_i, A_j) denotes the reward obtained when a data packet in state S_i selects action A_j; in the route-planning task it is the reward generated when the next hop selected by the data packet at node i is node j; β is the QoS level of the flow, η is the bandwidth utilization rate, d is the destination node, δ(j − d) is an impulse function whose value is 1 when the next-hop node of the data packet is the destination node, T is the connection state of the network topology nodes (1 when two nodes are connected and 0 when they are not), and g(x) is a cost function of the number of hops x the data packet has traversed during forwarding, where l_m is the total number of links of the network topology;
step 3: a path routing is obtained according to the Q-value table and added to the path set Routing(S, D), and it is judged whether the minimum link bandwidth of the routing is smaller than the bandwidth of the flow; if so, a sub-flow of size B_avail · β / Σ_i β_i is divided from the flow, where B_avail denotes the minimum available link bandwidth of the current output path, β denotes the QoS level of the current flow, and Σ_i β_i denotes the overall QoS level; the divided sub-flow is forwarded from the source node to the destination node through the current output path, and the remaining traffic is returned as a new flow to step 2 to retrain the Q-value table; if not, planning is finished, and the planned multi-path routes are obtained from Routing(S, D).
2. The reinforcement learning-based SDN multipath route planning method of claim 1, wherein the feature matrix of the flow to be forwarded represents the flow's information in matrix form, including its source address, destination address, QoS level, and traffic size.
3. The reinforcement learning-based SDN multipath route planning method of claim 1, wherein the Q-Learning algorithm trains the Q-value table through the following steps:
after setting the maximum number of steps for a single training round:
(1) initialize the Q-value table and the reward function R;
(2) select an action a using the ε-greedy action strategy P;
(3) execute action a, transition to state s', calculate the reward value using the reward function R, and update the Q-value table;
(4) judge whether s' is the destination node; if not, let s = s' and return to step (2).
4. The reinforcement learning-based SDN multi-path route planning method according to claim 3, wherein the cost function is defined such that the cost increases with the number of hops x traversed by the forwarded data packet, with g(x) ∈ (0, 1), and the cost function satisfies: the curve of g(x) is concave (convex upward), and as the total hop count of the data packet tends to infinity, the cost function value tends to 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911183909.4A CN110986979B (en) | 2019-11-27 | 2019-11-27 | SDN multi-path routing planning method based on reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110986979A true CN110986979A (en) | 2020-04-10 |
CN110986979B CN110986979B (en) | 2021-09-10 |
Family
ID=70087419
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911183909.4A Active CN110986979B (en) | 2019-11-27 | 2019-11-27 | SDN multi-path routing planning method based on reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110986979B (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111917657A (en) * | 2020-07-02 | 2020-11-10 | 北京邮电大学 | Method and device for determining flow transmission strategy |
CN112398733A (en) * | 2020-11-24 | 2021-02-23 | 新华三大数据技术有限公司 | Traffic scheduling forwarding method and device |
CN112671648A (en) * | 2020-12-22 | 2021-04-16 | 北京浪潮数据技术有限公司 | SDN data transmission method, SDN, device and medium |
CN112822109A (en) * | 2020-12-31 | 2021-05-18 | 上海缔安科技股份有限公司 | SDN core network QoS route optimization algorithm based on reinforcement learning |
CN113098771A (en) * | 2021-03-26 | 2021-07-09 | 哈尔滨工业大学 | Distributed self-adaptive QoS routing method based on Q learning |
CN113158543A (en) * | 2021-02-02 | 2021-07-23 | 浙江工商大学 | Intelligent prediction method for software defined network performance |
CN113347104A (en) * | 2021-05-31 | 2021-09-03 | 国网山东省电力公司青岛供电公司 | SDN-based routing method and system for power distribution Internet of things |
CN113347108A (en) * | 2021-05-20 | 2021-09-03 | 中国电子科技集团公司第七研究所 | SDN load balancing method and system based on Q-learning |
CN113489654A (en) * | 2021-07-06 | 2021-10-08 | 国网信息通信产业集团有限公司 | Routing method, routing device, electronic equipment and storage medium |
CN114124828A (en) * | 2022-01-27 | 2022-03-01 | 广东省新一代通信与网络创新研究院 | Machine learning method and device based on programmable switch |
CN114845359A (en) * | 2022-03-14 | 2022-08-02 | 中国人民解放军军事科学院战争研究院 | Multi-intelligent heterogeneous network selection method based on Nash Q-Learning |
WO2022257917A1 (en) * | 2021-06-11 | 2022-12-15 | 华为技术有限公司 | Path planning method and related device |
US11606265B2 (en) | 2021-01-29 | 2023-03-14 | World Wide Technology Holding Co., LLC | Network control in artificial intelligence-defined networking |
CN117033005A (en) * | 2023-10-07 | 2023-11-10 | 之江实验室 | Deadlock-free routing method and device, storage medium and electronic equipment |
CN115941579B (en) * | 2022-11-10 | 2024-04-26 | 北京工业大学 | Mixed routing method based on deep reinforcement learning |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104640168A (en) * | 2014-12-04 | 2015-05-20 | 北京理工大学 | Q-learning based vehicular ad hoc network routing method |
CN106713143A (en) * | 2016-12-06 | 2017-05-24 | 天津理工大学 | Adaptive reliable routing method for VANETs |
CN107104819A (en) * | 2017-03-23 | 2017-08-29 | 武汉邮电科学研究院 | Adaptive self-coordinating unified communications and communication means based on SDN |
US20180034922A1 (en) * | 2016-07-28 | 2018-02-01 | At&T Intellectual Property I, L.P. | Network configuration for software defined network via machine learning |
CN107911299A (en) * | 2017-10-24 | 2018-04-13 | 浙江工商大学 | A kind of route planning method based on depth Q study |
CN109005471A (en) * | 2018-08-07 | 2018-12-14 | 安徽大学 | Based on the extensible video stream method of multicasting of QoS Intellisense under SDN environment |
CN109361601A (en) * | 2018-10-31 | 2019-02-19 | 浙江工商大学 | A kind of SDN route planning method based on intensified learning |
CN109450794A (en) * | 2018-12-11 | 2019-03-08 | 上海云轴信息科技有限公司 | A kind of communication means and equipment based on SDN network |
CN109547340A (en) * | 2018-12-28 | 2019-03-29 | 西安电子科技大学 | SDN data center network jamming control method based on heavy-route |
US20190123974A1 (en) * | 2016-06-23 | 2019-04-25 | Huawei Technologies Co., Ltd. | Method for generating routing control action in software-defined network and related device |
CN109768940A (en) * | 2018-12-12 | 2019-05-17 | 北京邮电大学 | The flow allocation method and device of multi-service SDN network |
Non-Patent Citations (2)
Title |
---|
TRUONG THU HUONG 等: "A global multipath load-balanced routing algorithm Reinforcement Learning", 《2019 INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY CONVERGENCE (ICTC)》 * |
金子晋 等: "SDN环境下基于QLearning算法的业务划分路由选路机制", 《网络与信息安全学报》 * |
Also Published As
Publication number | Publication date |
---|---|
CN110986979B (en) | 2021-09-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110986979B (en) | SDN multi-path routing planning method based on reinforcement learning | |
CN109361601B (en) | SDN route planning method based on reinforcement learning | |
CN106789648B (en) | Software defined network route decision method based on content storage and Network status | |
CN103210617B (en) | For reducing the method and system of message in network and computing cost | |
CN112822109B (en) | SDN core network QoS route optimization method based on reinforcement learning | |
CN112600759B (en) | Multipath traffic scheduling method and system based on deep reinforcement learning under Overlay network | |
CN107294852B (en) | Network routing method using topology dispersed short path set | |
CN101986628B (en) | Method for realizing multisource multicast traffic balance based on ant colony algorithm | |
CN104580165A (en) | Cooperative caching method in intelligence cooperative network | |
CN114143264A (en) | Traffic scheduling method based on reinforcement learning in SRv6 network | |
CN101674220B (en) | Forwarding history-based asynchronous rooting algorithm | |
Haeri et al. | A reinforcement learning-based algorithm for deflection routing in optical burst-switched networks | |
CN102427596A (en) | Routing method and scheduling method of node mobile network assisted by positioning information | |
CN109922161A (en) | Content distribution method, system, equipment and the medium of dynamic cloud content distributing network | |
JP4589978B2 (en) | Route setting method and route setting device | |
Li et al. | A data forwarding mechanism based on deep reinforcement learning for deterministic networks | |
CN115037669A (en) | Cross-domain data transmission method based on federal learning | |
CN114745322A (en) | Video stream routing method based on genetic algorithm in SDN environment | |
Wang et al. | Implementation of multipath network virtualization scheme with SDN and NFV | |
Iqbal | Cache-mab: A reinforcement learning-based hybrid caching scheme in named data networks | |
Liu et al. | Ppo-based reliable concurrent transmission control for telemedicine real-time services | |
CN109450809B (en) | Data center scheduling system and method | |
Dong et al. | Topology control mechanism based on link available probability in aeronautical ad hoc network | |
CN100442758C (en) | Multicast transfer route setting method, and multicast label switching method for implementing former method | |
Chunxin et al. | A hybrid scatter search algorithm for QoS multicast routing problem |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||