CN109361601B - SDN route planning method based on reinforcement learning - Google Patents

SDN route planning method based on reinforcement learning

Info

Publication number
CN109361601B
CN109361601B (application CN201811292342.XA)
Authority
CN
China
Prior art keywords
node
reinforcement learning
action
flow
state
Prior art date
Legal status
Active
Application number
CN201811292342.XA
Other languages
Chinese (zh)
Other versions
CN109361601A (en)
Inventor
李传煌
卢正勇
吴艳
唐豪
任云方
Current Assignee
Zhejiang Gongshang University
Original Assignee
Zhejiang Gongshang University
Priority date
Filing date
Publication date
Application filed by Zhejiang Gongshang University
Priority to CN201811292342.XA
Publication of CN109361601A
Application granted
Publication of CN109361601B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 45/00 Routing or path finding of packets in data switching networks
    • H04L 45/02 Topology update or discovery
    • H04L 45/08 Learning-based routing, e.g. using neural networks or artificial intelligence
    • H04L 45/12 Shortest path evaluation
    • H04L 47/00 Traffic control in data switching networks
    • H04L 47/10 Flow control; Congestion control
    • H04L 47/12 Avoiding congestion; Recovering from congestion
    • H04L 47/24 Traffic characterised by specific attributes, e.g. priority or QoS

Abstract

The invention discloses an SDN route planning method based on reinforcement learning, comprising the following steps: in the SDN control plane, a reinforcement learning model capable of generating routes is constructed using Q-learning, and the reward function of the Q-learning algorithm is designed so that different reward values are produced for flows of different QoS levels; the current network topology matrix, the traffic characteristics, and the QoS level of each flow are input into the reinforcement learning model for training, thereby realizing flow-differentiated SDN route planning and finding the shortest forwarding path that satisfies each flow's QoS requirement. By exploiting the defining property of reinforcement learning, continuous interaction with the environment and adjustment of the policy, the method achieves higher link utilization than the Dijkstra algorithm commonly used in traditional route planning and can effectively reduce network congestion.

Description

SDN route planning method based on reinforcement learning
Technical Field
The invention relates to the fields of network communication technology and reinforcement learning, and in particular to an SDN route planning method based on reinforcement learning.
Background
Internet traffic continues to grow, bringing sharply increased bandwidth consumption, difficulty in guaranteeing quality of service, and mounting security problems. The Internet has become inseparable from every industry and is currently among the most promising sectors; however, with its popularization and the growth of Internet services, industries and individual users generate vast volumes of network traffic every day, such as file transfers, voice communication, and online games, and new application patterns and requirements keep emerging. The traditional network architecture cannot cope with such rapid development and faces problems including insufficient network address space, increasingly bloated equipment, and quality of service that is hard to guarantee.
The Software Defined Network (SDN) is an innovative network architecture proposed in 2007 by the Clean Slate research group of Stanford University, USA, with the stated aim of "reshaping the Internet". As a novel network architecture it offers a fresh technical approach to existing network problems; its core idea is to separate the network device control plane from the data plane by means of OpenFlow, thereby enabling flexible control of network resources.
SDN is a programmable network architecture whose control plane is separated from the data forwarding plane, so the routing algorithm of an SDN can be customized in software. When a flow arrives at a switch, the routing algorithm on the SDN control plane plans a route, a flow table is generated according to that route, and the SDN controller issues the flow table to the switch to complete packet forwarding.
Currently, mainstream SDN controllers such as POX and FloodLight all provide packet-forwarding modules, and these basically adopt the Dijkstra (shortest path) algorithm, which searches for a shortest path from the originating node to the destination node for each forwarding decision. However, relying solely on the shortest-path algorithm for all packets causes a serious problem: data flows tend to converge onto the same forwarding paths, which greatly reduces link utilization and easily causes network congestion. Multi-path protocols exist, but they do not take into account the quality of service (QoS) requirements of different traffic flows, and because they ignore the traffic status of the whole network they remain limited from a path-optimization point of view.
Disclosure of Invention
The invention provides an SDN route planning method based on reinforcement learning to overcome the defects of the Dijkstra algorithm. By exploiting reinforcement learning's continuous interaction with the environment and adjustment of the policy, the method achieves higher link utilization than the traditional Dijkstra algorithm and can effectively reduce network congestion.
The technical scheme adopted by the invention to solve the technical problem is as follows: an SDN route planning method based on reinforcement learning, comprising the following steps: in the SDN control plane, a model capable of generating routes is constructed using Q-learning, and the reward function of the Q-learning algorithm is designed so that different reward values are produced for flows of different QoS levels; the current network topology matrix, the traffic characteristics, and the QoS level of each flow are input into the reinforcement learning model for training, thereby realizing flow-differentiated SDN route planning and finding the shortest forwarding path that satisfies each flow's QoS requirement.
Further, the traffic characteristics are: the start point, end point, and size of the flow.
Further, the reinforcement learning model is constructed by the following method:
setting the maximum number of steps for a single training episode; selecting an action a according to the action strategy P, executing a, obtaining the next state s' and a reward value r, and updating Q(s, a) according to the quality update function; repeating these operations until the end point is reached.
Further, the function required by the reinforcement learning model is constructed by the following method:
(1) An action a is selected according to equation (1); the action strategy is the ε-greedy strategy:

$$\pi(a \mid s) = \begin{cases} 1-\varepsilon+\dfrac{\varepsilon}{|A(s)|}, & a = \arg\max_{a' \in A(s)} Q(s,a') \\ \dfrac{\varepsilon}{|A(s)|}, & \text{otherwise} \end{cases} \qquad (1)$$

where π(a|s) = P(A_t = a | S_t = s) is the probability that the decision maker selects action a in state s; ε is the probability that the decision maker adopts a random strategy, i.e., selects among the possible actions with equal probability; with probability 1−ε a greedy strategy is adopted, i.e., the action with the largest quality value is selected; A(s) is the set of actions the decision maker may take in state s; and Q(s, a) is the quality obtained by selecting action a in state s;
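As a minimal illustration of equation (1), the following Python sketch selects an action ε-greedily over the valid actions of a state (the function and variable names are ours, not from the patent):

    import random

    def epsilon_greedy(Q, state, actions, epsilon=0.1):
        # Equation (1): with probability epsilon pick uniformly among the
        # valid actions; otherwise pick the action with the largest Q value.
        if random.random() < epsilon:
            return random.choice(actions)                   # random strategy
        return max(actions, key=lambda a: Q[state][a])      # greedy strategy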
(2) The reward value is calculated according to equation (2). [Equation (2) appears only as an image in the source; per the definitions below, it computes the reward from the link bandwidths, the QoS level of the flow, the endpoint indicator δ(j−d), and the connectivity matrix T.]

where i and j denote nodes in the network, and R_t(S_t, A_t | i→j) is the reward obtained by selecting action A_t (jumping from node i to node j) in state S_t; B_total is the total bandwidth of the link from node i to node j and B its residual bandwidth; B_min is the minimum bandwidth required by the flow (i.e., the flow size); β is the QoS level of the flow; d is the destination node; δ(j−d) is an impulse function whose value is 1 if the next hop j is the end point d; T describes the connectivity of the nodes: T[S_t][A_t] ≠ −1 indicates that node i is connected to node j, and T[S_t][A_t] = −1 indicates that it is not;
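Because equation (2) survives only as an image, the following Python sketch is merely a plausible rendering of the definitions above: the connectivity check via T and the endpoint bonus δ(j−d) are taken from the text, while the exact combination of B_total, B, B_min, and β is our assumption:

    def reward(T, state, action, B_total, B, B_min, beta, dest):
        # Hypothetical rendering of equation (2); only the piecewise
        # structure (connectivity check, bandwidth/QoS-dependent term,
        # endpoint bonus) is taken from the text.
        if T[state][action] == -1:             # node i not connected to node j
            return -1
        if B < B_min:                          # link cannot carry the flow
            return -1                          # assumption: infeasible hop
        bonus = 1 if action == dest else 0     # delta(j - d)
        return beta * (B / B_total) + bonus    # assumed combination of terms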
(3) The quality function is updated with the Q-learning algorithm according to equation (3):

$$Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \Big[ R_{t+1} + \gamma \max_{a} Q(S_{t+1}, a) - Q(S_t, A_t) \Big] \qquad (3)$$

where γ ∈ [0, 1] is the discount rate, expressing how important future rewards are relative to the current reward; α ∈ (0, 1] is the learning rate, determining how strongly newly acquired information overrides old information; R_{t+1} is the reward obtained after the action at time t; S_t and A_t are the state and action at time t, and S_{t+1} is the state at time t+1; Q(S_t, A_t) is the quality obtained by taking action A_t in state S_t; and Q(S_{t+1}, a) is the set of qualities obtained by taking the different actions a in state S_{t+1}.
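Equation (3) is the standard Q-learning update; a compact Python sketch under our own naming:

    def q_update(Q, s, a, r, s_next, next_actions, alpha=0.7, gamma=0.8):
        # Equation (3): move Q(s, a) toward the bootstrapped target
        # r + gamma * max over a' of Q(s', a').
        best_next = max(Q[s_next][a2] for a2 in next_actions)
        Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])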
Compared with the prior art, the invention has the following beneficial effects: the method finds the shortest forwarding path meeting the QoS requirement of each flow according to the QoS grades of different flows, has high link utilization rate and can effectively reduce network congestion.
Drawings
FIG. 1 is a diagram of the SDN route planning architecture;
FIG. 2 is a diagram of the SDN network topology.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings.
Given that mainstream SDN networks currently forward data packets essentially with the Dijkstra algorithm, the invention applies reinforcement learning to route planning. Exploiting the SDN architecture's centralized control, easy acquisition of link information, and programmability, the acquired current network topology matrix, traffic characteristics, and quality of service (QoS) level are input into a reinforcement learning model, which outputs the optimal forwarding path for the flow from its start point to its end point.
The SDN route planning method based on reinforcement learning provided by the invention exploits reinforcement learning's continuous interaction with the environment and adjustment of the policy; compared with the traditional Dijkstra algorithm it achieves high link utilization and can effectively reduce network congestion. The method designs the reward function according to the QoS levels of different flows and finds the shortest forwarding path that satisfies each flow's QoS requirement.
1. The SDN route planning architecture is shown in FIG. 1. A reinforcement learning model is constructed to generate routes and is deployed on the SDN control plane; the current network topology matrix, the traffic characteristics (the start point, end point, and size of the flow), and the QoS level are input, and after the model has been trained repeatedly on these inputs it can output the optimal forwarding path from start point to end point.
2. For each training episode, set the maximum number of steps for a single episode; select an action a according to the action strategy P, execute a, obtain the next state s' and a reward value r, and update Q(s, a) according to the quality update function; repeat these operations until the end point is reached.
3. The function required by the reinforcement learning model is constructed by the following method:
(1) An action a is selected according to equation (1); the action strategy is the ε-greedy strategy:

$$\pi(a \mid s) = \begin{cases} 1-\varepsilon+\dfrac{\varepsilon}{|A(s)|}, & a = \arg\max_{a' \in A(s)} Q(s,a') \\ \dfrac{\varepsilon}{|A(s)|}, & \text{otherwise} \end{cases} \qquad (1)$$

where π(a|s) = P(A_t = a | S_t = s) is the probability that the decision maker selects action a in state s; ε is the probability that the decision maker adopts a random strategy, i.e., selects among the possible actions with equal probability; with probability 1−ε a greedy strategy is adopted, i.e., the action with the largest quality value is selected; A(s) is the set of actions the decision maker may take in state s; and Q(s, a) is the quality obtained by selecting action a in state s;
(2) The reward value is calculated according to equation (2). [Equation (2) appears only as an image in the source; per the definitions below, it computes the reward from the link bandwidths, the QoS level of the flow, the endpoint indicator δ(j−d), and the connectivity matrix T.]

where i and j denote nodes in the network, and R_t(S_t, A_t | i→j) is the reward obtained by selecting action A_t (jumping from node i to node j) in state S_t; B_total is the total bandwidth of the link from node i to node j and B its residual bandwidth; B_min is the minimum bandwidth required by the flow (i.e., the flow size); β is the QoS level of the flow; d is the destination node; δ(j−d) is an impulse function whose value is 1 if the next hop j is the end point d; T describes the connectivity of the nodes: T[S_t][A_t] ≠ −1 indicates that node i is connected to node j, and T[S_t][A_t] = −1 indicates that it is not;
(3) The quality function is updated with the Q-learning algorithm according to equation (3):

$$Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \Big[ R_{t+1} + \gamma \max_{a} Q(S_{t+1}, a) - Q(S_t, A_t) \Big] \qquad (3)$$

where γ ∈ [0, 1] is the discount rate, expressing how important future rewards are relative to the current reward; α ∈ (0, 1] is the learning rate, determining how strongly newly acquired information overrides old information; R_{t+1} is the reward obtained after the action at time t; S_t and A_t are the state and action at time t, and S_{t+1} is the state at time t+1; Q(S_t, A_t) is the quality obtained by taking action A_t in state S_t; and Q(S_{t+1}, a) is the set of qualities obtained by taking the different actions a in state S_{t+1}.
Example:
the specific routing algorithm pseudo-code is described as follows:
[The pseudo-code was rendered as images in the source and is not reproduced here.]
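Since the original pseudo-code is not recoverable, the following self-contained Python sketch reassembles the training loop from the steps described above (episodes with a bounded step count, ε-greedy selection per equation (1), and the update of equation (3)); all names are ours, and the reward shape is an assumption standing in for equation (2):

    import random

    def valid_actions(T, s):
        # Actions available in state s: neighbours j with T[s][j] != -1.
        return [j for j in range(len(T)) if T[s][j] != -1]

    def train_q_routing(T, src, dst, episodes=300, max_steps=50,
                        epsilon=0.1, alpha=0.7, gamma=0.8, beta=1.0):
        # Train a Q matrix for routing from src to dst on topology T.
        # Assumes every switch has at least one neighbour.
        n = len(T)
        Q = [[0.0] * n for _ in range(n)]
        for _ in range(episodes):
            s = src
            for _ in range(max_steps):
                acts = valid_actions(T, s)
                if random.random() < epsilon:               # explore
                    a = random.choice(acts)
                else:                                       # exploit
                    a = max(acts, key=lambda x: Q[s][x])
                r = beta if a == dst else -0.1              # assumed reward shape
                best_next = max(Q[a][x] for x in valid_actions(T, a))
                Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
                s = a
                if s == dst:
                    break
        return Q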
the present invention will be further described with reference to the following examples.
The shortest path planning method involved in the present invention can be described as follows:
in an SDN network with 25 OpenFlow switches and 10 hosts, whose topology is shown in FIG. 2, the topology relationship can be described by a 25 × 25 matrix. The topology matrix T has entry 0 if two switches are connected and −1 if they are not, as shown below. For example, T[0][0] = −1 denotes that switch s1 is not connected to itself, and T[0][1] = 0 denotes that switch s1 is connected to s2. Define the state set S = {s1, s2, s3, …, s24, s25}; the action set for each state s ∈ S is A(s) = {x | T[s][x] ≠ −1}.
[The 25 × 25 topology matrix was rendered as an image in the source and is not reproduced here.]
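For concreteness, a small Python sketch of how such a topology matrix and the derived action sets could be built; the edge list here is illustrative only, since the patent's 25-switch matrix is not reproduced in the source:

    def build_topology(n, edges):
        # T[i][j] = 0 if switches i and j are linked, -1 otherwise,
        # matching the convention described in the text.
        T = [[-1] * n for _ in range(n)]
        for i, j in edges:
            T[i][j] = T[j][i] = 0
        return T

    # Illustrative edges only; the real topology is shown in FIG. 2.
    T = build_topology(25, [(0, 1), (1, 2), (2, 3), (3, 4)])
    A = {s: [x for x in range(25) if T[s][x] != -1] for s in range(25)}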
One host wishes to send a message to another node; the sender is the start point and the receiver is the end point. Given the start point, the end point, and the network topology, the controller performs route planning to obtain the shortest path from start point to end point that satisfies the QoS level.
Randomly select one node as the start point and another as the end point; set the total number of training episodes to 300 and the maximum number of steps per episode to 50. Until the destination node is reached, select a behavior a from all possible behaviors of the current state s, execute a to obtain the next state s', and update Q(s, a) according to the quality-function update formula; repeat until the current state is the target state. The behavior strategy is the ε-greedy strategy (ε = 0.1), the learning rate α is 0.7, and the discount rate γ is 0.8.
The final result of the Q-learning algorithm is a Q matrix, from which a shortest path from start point to end point satisfying the QoS level can be selected. When a service request arrives, the controller can easily find a shortest path satisfying the QoS level from the trained Q matrix according to the source and destination address information carried by the request.
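Extracting the planned path from the trained Q matrix then amounts to a greedy walk over the highest-quality valid actions (a sketch under the same naming assumptions as above):

    def extract_path(Q, T, src, dst, max_hops=50):
        # Follow the highest-quality loop-free action from src until dst.
        path, s = [src], src
        while s != dst and len(path) <= max_hops:
            acts = [j for j in range(len(T))
                    if T[s][j] != -1 and j not in path]
            if not acts:
                return None                  # dead end: no loop-free action
            s = max(acts, key=lambda j: Q[s][j])
            path.append(s)
        return path if s == dst else None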

Claims (2)

1. An SDN route planning method based on reinforcement learning, characterized in that the method comprises: in the SDN control plane, constructing, by means of Q-learning, a reinforcement learning model capable of generating routes, and designing the reward function of the Q-learning algorithm so that different reward values are produced for flows of different QoS levels; and inputting the current network topology matrix, the traffic characteristics, and the QoS level of each flow into the reinforcement learning model for training, thereby realizing flow-differentiated SDN route planning and finding the shortest forwarding path that satisfies each flow's QoS requirement;
the reinforcement learning model is constructed by the following method:
setting the maximum number of steps for a single training episode, selecting an action a according to the action strategy P, executing a, obtaining the next state s' and a reward value r, updating Q(s, a) according to the quality update function, and repeating these operations until the end point is reached;
the function required by the reinforcement learning model is constructed by the following method:
(1) An action a is selected according to equation (1); the action strategy is the ε-greedy strategy:

$$\pi(a \mid s) = \begin{cases} 1-\varepsilon+\dfrac{\varepsilon}{|A(s)|}, & a = \arg\max_{a' \in A(s)} Q(s,a') \\ \dfrac{\varepsilon}{|A(s)|}, & \text{otherwise} \end{cases} \qquad (1)$$

where π(a|s) = P(A_t = a | S_t = s) is the probability that the decision maker selects action a in state s; ε is the probability that the decision maker adopts a random strategy, i.e., selects among the possible actions with equal probability; with probability 1−ε a greedy strategy is adopted, i.e., the action with the largest quality value is selected; A(s) is the set of actions the decision maker may take in state s; and Q(s, a) is the quality obtained by selecting action a in state s;
(2) The reward value is calculated according to equation (2). [Equation (2) appears only as an image in the source; per the definitions below, it computes the reward from the link bandwidths, the QoS level of the flow, the endpoint indicator δ(j−d), and the connectivity matrix T.]

where i and j denote nodes in the network, and R_t(S_t, A_t | i→j) is the reward obtained by selecting action A_t (jumping from node i to node j) in state S_t; B_total is the total bandwidth of the link from node i to node j and B its residual bandwidth; B_min is the minimum bandwidth required by the flow (i.e., the flow size); β is the QoS level of the flow; d is the destination node; δ(j−d) is an impulse function whose value is 1 if the next hop j is the end point d; T describes the connectivity of the nodes: T[S_t][A_t] ≠ −1 indicates that node i is connected to node j, and T[S_t][A_t] = −1 indicates that it is not;
(3) The quality function is updated with the Q-learning algorithm according to equation (3):

$$Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \Big[ R_{t+1} + \gamma \max_{a} Q(S_{t+1}, a) - Q(S_t, A_t) \Big] \qquad (3)$$

where γ ∈ [0, 1] is the discount rate, expressing how important future rewards are relative to the current reward; α ∈ (0, 1] is the learning rate, determining how strongly newly acquired information overrides old information; R_{t+1} is the reward obtained after the action at time t; S_t and A_t are the state and action at time t, and S_{t+1} is the state at time t+1; Q(S_t, A_t) is the quality obtained by taking action A_t in state S_t; and Q(S_{t+1}, a) is the set of qualities obtained by taking the different actions a in state S_{t+1}.
2. The reinforcement learning-based SDN route planning method of claim 1, wherein the traffic characteristics comprise the start point, end point, and size of the flow.
CN201811292342.XA 2018-10-31 2018-10-31 SDN route planning method based on reinforcement learning Active CN109361601B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811292342.XA CN109361601B (en) 2018-10-31 2018-10-31 SDN route planning method based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811292342.XA CN109361601B (en) 2018-10-31 2018-10-31 SDN route planning method based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN109361601A CN109361601A (en) 2019-02-19
CN109361601B (en) 2021-03-30

Family

ID=65343754

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811292342.XA Active CN109361601B (en) 2018-10-31 2018-10-31 SDN route planning method based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN109361601B (en)

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110081893B (en) * 2019-04-01 2020-09-25 东莞理工学院 Navigation path planning method based on strategy reuse and reinforcement learning
CN110290510A (en) * 2019-05-07 2019-09-27 天津大学 Support the edge cooperation caching method under the hierarchical wireless networks of D2D communication
CN110365514B (en) * 2019-05-24 2020-10-16 北京邮电大学 SDN multistage virtual network mapping method and device based on reinforcement learning
CN110601973B (en) * 2019-08-26 2022-04-05 中移(杭州)信息技术有限公司 Route planning method, system, server and storage medium
CN110611619B (en) * 2019-09-12 2020-10-09 西安电子科技大学 Intelligent routing decision method based on DDPG reinforcement learning algorithm
CN110995619B (en) * 2019-10-17 2021-09-28 北京邮电大学 Service quality aware virtual network mapping method and device
CN110768906B (en) * 2019-11-05 2022-08-30 重庆邮电大学 SDN-oriented energy-saving routing method based on Q learning
CN110635973B (en) * 2019-11-08 2022-07-12 西北工业大学青岛研究院 Backbone network flow determining method and system based on reinforcement learning
CN110986979B (en) * 2019-11-27 2021-09-10 浙江工商大学 SDN multi-path routing planning method based on reinforcement learning
CN115643210A (en) * 2019-11-30 2023-01-24 华为技术有限公司 Control data packet sending method and system
CN111416771B (en) * 2020-03-20 2022-02-25 深圳市大数据研究院 Method for controlling routing action based on multi-agent reinforcement learning routing strategy
CN111479306B (en) * 2020-04-02 2023-08-04 中国科学院上海微系统与信息技术研究所 Q-learning-based flight ad hoc network QoS routing method
CN111770019B (en) * 2020-05-13 2021-06-15 西安电子科技大学 Q-learning optical network-on-chip self-adaptive route planning method based on Dijkstra algorithm
CN112087489B (en) * 2020-08-05 2023-06-30 北京工联科技有限公司 Relay forwarding selection method and system for online mobile phone game network transmission
CN112039767B (en) * 2020-08-11 2021-08-31 山东大学 Multi-data center energy-saving routing method and system based on reinforcement learning
CN111953603A (en) * 2020-08-20 2020-11-17 福建师范大学 Method for defining Internet of things security routing protocol based on deep reinforcement learning software
CN112260953A (en) * 2020-10-21 2021-01-22 中电积至(海南)信息技术有限公司 Multi-channel data forwarding decision method based on reinforcement learning
CN112365077B (en) * 2020-11-20 2022-06-21 贵州电网有限责任公司 Construction method of intelligent storage scheduling system for power grid defective materials
CN112822109B (en) * 2020-12-31 2023-04-07 上海缔安科技股份有限公司 SDN core network QoS route optimization method based on reinforcement learning
CN113347102B (en) * 2021-05-20 2022-08-16 中国电子科技集团公司第七研究所 SDN link surviving method, storage medium and system based on Q-learning
CN113556287B (en) * 2021-06-15 2022-10-14 南京理工大学 Software defined network routing method based on multi-agent reinforcement learning
CN113610271B (en) * 2021-07-01 2023-05-02 四川大学 Multi-Agent airport scene sliding path planning method based on historical data analysis
CN113507412B (en) * 2021-07-08 2022-04-19 中国人民解放军国防科技大学 SRv6 router progressive deployment method, system and storage medium in network interconnection
CN117033005B (en) * 2023-10-07 2024-01-26 之江实验室 Deadlock-free routing method and device, storage medium and electronic equipment

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102571570A (en) * 2011-12-27 2012-07-11 广东电网公司电力科学研究院 Network flow load balancing control method based on reinforcement learning
CN104967533A (en) * 2015-05-26 2015-10-07 国网智能电网研究院 Method and apparatus of adding IEC 61850 configuration interface to SDN controller
CN105007224A (en) * 2015-07-28 2015-10-28 清华大学 System and method for intercommunication between SDN (Software Defined Networking) network and IP (Internet Protocol) network
WO2015167372A1 (en) * 2014-04-29 2015-11-05 Telefonaktiebolaget L M Ericsson (Publ) Identification of suitable network service points
US9197568B2 (en) * 2012-10-22 2015-11-24 Electronics And Telecommunications Research Institute Method for providing quality of service in software-defined networking based network and apparatus using the same
US9225635B2 (en) * 2012-04-10 2015-12-29 International Business Machines Corporation Switch routing table utilizing software defined network (SDN) controller programmed route segregation and prioritization
CN105681191A (en) * 2016-02-25 2016-06-15 武汉烽火网络有限责任公司 SDN (Software Defined Network) platform based on router virtualization and implementation method
CN107547379A (en) * 2016-06-23 2018-01-05 华为技术有限公司 The method and relevant device of route test action are generated in software defined network
CN108401015A (en) * 2018-02-02 2018-08-14 广州大学 A kind of data center network method for routing based on deeply study
CN108540384A (en) * 2018-04-13 2018-09-14 西安交通大学 Intelligent heavy route method and device based on congestion aware in software defined network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10110465B2 (en) * 2016-07-27 2018-10-23 Cisco Technology, Inc. Distributed HSRP gateway in VxLAN flood and learn environment with faster convergence


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SDN下基于强化学习的路由规划算法研究 (Research on route planning algorithms based on reinforcement learning in SDN); 程成; 《中知网》 (CNKI); 2018-04-30; chapters 3-4 *

Also Published As

Publication number Publication date
CN109361601A (en) 2019-02-19

Similar Documents

Publication Publication Date Title
CN109361601B (en) SDN route planning method based on reinforcement learning
CN110986979B (en) SDN multi-path routing planning method based on reinforcement learning
CN112822109B (en) SDN core network QoS route optimization method based on reinforcement learning
CN105471764B (en) A kind of method of end-to-end QoS guarantee in SDN network
CN102651710B (en) Method and system for routing information in a network
WO2017078922A1 (en) Apparatus and method for network flow scheduling
CN109951335B (en) Satellite network delay and rate combined guarantee routing method based on time aggregation graph
CN114422423B (en) Satellite network multi-constraint routing method based on SDN and NDN
CN109413707B (en) Intelligent routing method based on deep reinforcement learning technology in wireless network environment
CN104883304B (en) For part entangled quantum to the method for routing of bridge communications network
CN101155118A (en) BGP routing processing method and device
CN105490962A (en) QoS management method based on OpenFlow network
CN112600759A (en) Multipath traffic scheduling method and system based on deep reinforcement learning under Overlay network
CN101986628B (en) Method for realizing multisource multicast traffic balance based on ant colony algorithm
Oužecki et al. Reinforcement learning as adaptive network routing of mobile agents
Zheng et al. ONU placement in fiber-wireless (FiWi) networks considering peer-to-peer communications
CN105743804A (en) Data flow control method and system
CN109547505B (en) Multipath TCP transmission scheduling method based on reinforcement learning
CN103905318B (en) Send, method, controller and the forward node of loading forwarding-table item
CN106656795A (en) Wireless sensor and actor networks clustering routing method
CN111865789B (en) SR path constraint method based on segment routing
JP2008219067A (en) Route calculation apparatus, method and program
Devarajan et al. An enhanced cluster gateway switch routing protocol (ECGSR) for congestion control using AODV algorithm in MANET
CN113098771A (en) Distributed self-adaptive QoS routing method based on Q learning
CN104053208A (en) Route method and device based on channel allocation in wireless ad hoc network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant