CN110986979A - SDN multi-path routing planning method based on reinforcement learning - Google Patents

SDN multi-path routing planning method based on reinforcement learning

Info

Publication number
CN110986979A
Authority
CN
China
Prior art keywords
reinforcement learning
path
routing
flow
data packet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911183909.4A
Other languages
Chinese (zh)
Other versions
CN110986979B (en)
Inventor
李传煌
方春涛
卢正勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Gongshang University
Original Assignee
Zhejiang Gongshang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Gongshang University
Priority to CN201911183909.4A
Publication of CN110986979A
Application granted
Publication of CN110986979B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/26 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
    • G01C21/34 Route searching; Route guidance
    • G01C21/3446 Details of route searching algorithms, e.g. Dijkstra, A*, arc-flags, using precalculated routes

Abstract

The invention discloses an SDN multi-path route planning method based on reinforcement learning, which comprises the following steps: reinforcement learning is applied to SDN multi-path route planning, with the QLearning algorithm used as the reinforcement learning model, and different reward values are generated according to the different QoS (Quality of Service) levels of flows; according to the input network topology matrix and the feature matrix of the current flow to be forwarded, different reward functions are set for flows with different QoS levels, and multiple paths are planned to forward the flows; and when the link bandwidth is insufficient, a large flow is divided into several small flows, thereby improving link bandwidth utilization. By exploiting the ability of reinforcement learning to continuously interact with the environment and adjust its strategy, the invention achieves high link utilization and effectively reduces network congestion compared with traditional single-path route planning.

Description

SDN multi-path routing planning method based on reinforcement learning
Technical Field
The invention relates to the fields of network communication technology and reinforcement learning, and in particular to an SDN multi-path route planning method based on reinforcement learning.
Background
In recent years, with the popularization of the Internet, and particularly with the emergence of related technologies such as cloud computing and big data, the Internet has entered a period of rapid development. This rapid development has caused the volume of network transmission traffic to grow quickly; in particular, with the rise of short-video and live-streaming platforms in recent years, network services have become more interactive and real-time, and end users place higher demands on the Quality of Service (QoS) of network services. However, with limited network resources, the continuous growth of Internet traffic causes problems such as sharply increasing bandwidth consumption, difficulty in guaranteeing quality of service, and growing security concerns. Clearly, the traditional network architecture can hardly meet the diversified requirements of users. In view of the foregoing, the Internet industry needs a new network architecture that addresses existing network problems and is more flexible and efficient than conventional architectures, so as to meet society's ever-increasing demand for traffic data.
SDN is a novel network architecture that has attracted wide attention from all sectors and solves some unavoidable problems of traditional networks. In the traditional network architecture, each device independently makes forwarding decisions and transmits information through a series of network protocols (such as TCP/IP). Under this architecture, control and forwarding are tightly coupled in the network device; each device can only plan paths for flows from its own local viewpoint, without global network resource information, which easily leads to problems such as network link congestion. SDN separates forwarding from control and can obtain link information in real time through the OpenFlow protocol, which facilitates centralized control of the network: the control layer obtains global network resource information and performs unified management and allocation according to business demands. At the same time, centralized control also allows the whole network to be treated as a single entity, which is convenient for maintenance. Compared with a traditional IP network, an SDN network solves problems such as inaccurate routing information and low routing efficiency, and lays a foundation for intelligent route planning according to the requirements of different flows. Research on the SDN network architecture is therefore of great significance. Routing is an indispensable component of both traditional networks and SDN networks; however, the basis adopted by current mainstream SDN routing modules is the Dijkstra (shortest path) algorithm. If all data packets rely only on the shortest path algorithm, data flows tend to congest a link by all selecting the same one, leaving other links idle, which greatly reduces link utilization. On the other hand, the shortest path algorithm is a graph-theoretic algorithm for finding shortest paths; when it runs, it actually computes the shortest paths from the source node to all other nodes in the topology, so its time complexity is high. There are also protocols that support multipath, such as ECMP, but these protocols do not take into account the quality-of-service requirements of different traffic flows. Therefore, a better routing strategy is needed in SDN networks to generate routes, improve network performance, and guarantee the service quality of different service flows.
Disclosure of Invention
Aiming at the problem that the current SDN network mainly adopts the shortest path as its route planning algorithm, which causes low link bandwidth utilization and related problems, the invention provides an SDN intelligent route planning technique oriented toward high bandwidth utilization.
The technical scheme adopted by the invention for solving the technical problem is as follows: an SDN multi-path route planning method based on reinforcement learning comprises the following steps:
step 1: acquiring the available bandwidth information, total bandwidth information, node information, and link information of the network to construct a network topology matrix, and acquiring the feature matrix of the flow to be forwarded;
step 2: adopting the QLearning algorithm as the reinforcement learning model, and inputting the network topology matrix and the feature matrix of the flow to be forwarded from step 1 into the reinforcement learning model to train the Q-value table; the reward function R in the QLearning algorithm is as follows:
[Equation: reward function R_t(S_i, A_j); rendered as an image in the original]
wherein: rt(Si,Aj) Indicating slave status S of a data packetiSelection action AjThe obtained reward is represented in a routing planning task as the reward generated when the next hop selected by the data packet at the node i is the node j, β is the flow QoS grade, η is the bandwidth utilization rate, d is the destination node, delta (j-d) is an impulse function and represents that when the next hop of the data packet is the destination node, the value is 1, T is the connection state of the network topology nodes, 1 when the two nodes are connected and 0 when the two nodes are not connected, and g (x) is) As a cost function, the following is shown:
[Equation: cost function g(x); rendered as an image in the original]
where l_m is the total number of links in the network topology, and x is the number of hops the data packet has passed through in forwarding;
step 3: obtaining a path Routing according to the Q-value table, putting it into the path set Routing(S, D), and judging whether the minimum link bandwidth of the path is smaller than the bandwidth of the flow. If so, a slice of the following size is divided from the flow:
[Equation: size of the divided flow slice; rendered as an image in the original]
where B_available denotes the minimum available link bandwidth of the current output path, β denotes the QoS level of the current flow, and Σ_i β_i denotes the sum of the QoS levels of all flows. The divided flow is transmitted from the source node to the destination node through the current output path, and the remaining traffic is treated as a new flow and returned to step 2 to retrain the Q-value table. If not, planning is finished, and the planned multi-path routes are obtained from Routing(S, D).
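Because the formulas above survive only as image references, the following LaTeX block gives one plausible reconstruction from the surrounding definitions; the additive combination of the reward terms, the specific form of g(x), and the slice-size expression are assumptions consistent with the text, not the patent's confirmed notation:

```latex
% Hedged reconstruction of the reward function from its textual definition
% (assumption: the terms combine additively).
R_t(S_i, A_j) =
\begin{cases}
\beta\,\eta + \delta(j - d) - \beta\,g(x), & \text{if } T[S_i][A_j] = 1,\\
-1, & \text{otherwise.}
\end{cases}

% One candidate cost function consistent with the stated constraints
% (increasing, concave, g(x) \in (0, 1), g(x) \to 1 as x \to \infty):
g(x) = \frac{x}{x + l_m}

% Assumed slice size when a path cannot carry the whole flow:
B_{split} = B_{available} \cdot \frac{\beta}{\sum_i \beta_i}
```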
Further, the feature matrix of the flow to be forwarded includes the source address, destination address, QoS level, and traffic size of the flow.
Further, the process of training the Q-value table with the QLearning algorithm is specifically as follows:
First, set the maximum number of steps for a single training round, then:
(1) initialize the Q-value table and the reward function R;
(2) select an action a according to the ε-greedy action strategy P;
(3) execute action a, transition to state s', compute the reward value with the reward function R, and update the Q-value table;
(4) judge whether s' is the destination node; if not, let s = s' and return to (2).
Further, in the reinforcement-learning-based SDN multipath route planning method, the cost function is defined such that the cost increases with the number of hops x the data packet passes through in forwarding, with g(x) ∈ (0,1), and the cost function should satisfy: the curve of g(x) is an upward-convex (concave) curve, and as the total hop count of the data packet tends to infinity, the cost function value tends to 1.
The method has the following beneficial effects: reinforcement learning is applied to SDN multi-path route planning, with the QLearning algorithm as the reinforcement learning model; different reward functions are set for flows with different QoS levels according to the input network topology matrix and the feature matrix of the current flow to be forwarded; multiple paths are planned to forward the flows; and when the link bandwidth is insufficient, a large flow is divided into several small flows, thereby improving link bandwidth utilization.
Drawings
FIG. 1 is an SDN multi-path route planning architecture diagram;
FIG. 2 is an SDN network topology diagram;
FIG. 3 is a graph of the cost function;
FIG. 4 is a flow chart of reinforcement-learning-based multi-path route planning.
Detailed Description
In view of the fact that existing SDN controllers adopt the Dijkstra algorithm as the shortest-route search algorithm, this method attempts to apply reinforcement learning to SDN routing. Exploiting SDN's separation of forwarding and control, the network topology environment is used directly for training the Q-value table. Considering that different services have different QoS requirements, the invention provides routes of different service quality for different services; and when the link bandwidth is insufficient, a large flow is divided into several small flows, thereby improving link bandwidth utilization.
As shown in FIG. 1, the present invention provides a reinforcement-learning-based SDN multi-path route planning method, which includes the following steps:
Step 1: acquire the available bandwidth information, total bandwidth information, node information, and link information of the network to construct a network topology matrix, and obtain the feature matrix of the flow to be forwarded. A network topology as shown in FIG. 2 is constructed with Mininet, comprising 9 OpenFlow switches and 5 hosts. According to the multi-path routing planning algorithm and the SDN network topology, the bandwidth of each network link is set to 200, and the sending and receiving ends are hosts h1–h5; each sending end randomly sends data to the other receiving ends with a probability of 20%, and all hosts send 30 static flows in total, where a static flow is a flow that, once injected into the network, occupies link bandwidth until the experiment ends.
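A minimal Mininet sketch of such a setup is shown below; the switch wiring and host placement are hypothetical stand-ins for FIG. 2, and the bandwidth 200 is passed as Mbit/s since the text does not state a unit:

```python
from mininet.net import Mininet
from mininet.node import RemoteController
from mininet.topo import Topo
from mininet.link import TCLink

class SdnTopo(Topo):
    """9 OpenFlow switches and 5 hosts; the wiring below is hypothetical (FIG. 2 defines the real one)."""
    def build(self):
        s = [self.addSwitch('s%d' % i) for i in range(1, 10)]
        h = [self.addHost('h%d' % i) for i in range(1, 6)]
        switch_links = [(0, 1), (0, 3), (1, 2), (1, 4), (2, 5),
                        (3, 4), (3, 6), (4, 5), (4, 7), (5, 8), (6, 7), (7, 8)]
        for i, j in switch_links:
            self.addLink(s[i], s[j], bw=200)  # each link set to 200 (assumed Mbit/s)
        for i in range(5):
            self.addLink(h[i], s[i], bw=200)  # attach h1-h5 (hypothetical placement)

if __name__ == '__main__':
    # the SDN controller (e.g., an OpenFlow controller) is assumed to run externally
    net = Mininet(topo=SdnTopo(), link=TCLink, controller=RemoteController)
    net.start()
    net.pingAll()
    net.stop()
```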
Step 2: as shown in FIG. 1, the QLearning algorithm is used as the reinforcement learning model. The multi-path routing algorithm proposed by the present invention is modeled as a Markov decision process (MDP); accordingly, the MDP quadruple of the proposed model is defined as follows:
(1) State set: in the network topology, each switch represents a state; thus, according to the network topology, the set of network states is defined herein as follows:
S = [s1, s2, s3, …, s9]
where s1–s9 represent the 9 OpenFlow switches in the network. The source node of a data packet indicates its initial state, and the destination node indicates its termination state; when a data packet reaches the destination node, it reaches the termination state. Once the current data packet reaches the termination state, one round of training ends, and the data packet returns to the initial state for the next round of training.
(2) Action space: in an SDN network, the transmission path of a data packet is constrained by the network state; that is, a data packet can only be transmitted between connected network nodes. According to the network topology, the network connection state is defined as the following formula:
[Matrix: network connection state T (node adjacency matrix); rendered as an image in the original]
Since packets can only be transmitted between connected network nodes, the action set of each state si ∈ S can be defined from the set of network states and the network connection state as follows:
A(si) = {sj | T[si][sj] = 1}
This indicates that when the current state is si, the set of selectable actions corresponds, on the network topology, to the nodes sj directly connected to si; that is, the current state si will only select a state sj connected to it. For example, the action set of state s1 is A(s1) = {s2, s4}. (A Python sketch of the action set and reward follows the quadruple definition below.)
(3) State transition: in each round of training, when a data packet in state si selects an action, it transitions to the corresponding next state. Another key issue in reinforcement learning is the generation of reward values: when the agent makes a state transition, the system feeds back a reward to the agent according to the reward function R.
(4) Reward: the final purpose of reinforcement-learning-based multi-path route planning is to plan reasonable multiple paths through training, so the setting of the reward value R is also important. Bandwidth utilization and delay are mainly considered herein, where delay mainly refers to the hop count of the path; in order to plan different paths for flows of different QoS levels, the larger the traffic level β, the smaller the hop count of the planned path should be. In short:
1. the QoS level β and link utilization η need to be considered;
2. flows with large β are encouraged to be allocated paths with fewer hops.
in summary, the reward function formula designed herein is as follows:
[Equation: reward function R_t(S_i, A_j); rendered as an image in the original]
wherein: rt(Si,Aj) Indicating slave status S of a data packetiSelection action AjThe obtained reward is represented in a routing planning task as the reward generated when the next hop selected by the data packet at the node i is the node j, β is the flow QoS grade, η is the bandwidth utilization rate, d is the destination node, delta (j-d) is an impulse function and represents that when the next hop of the data packet is the destination node, the value is 1, T is the connection state of the network topology nodes, the two nodes are 1 when connected and 0 when not connected, and the reward function represents that the data packet is in the state SiWhen the next hop that can be selected (connected) is j (action a)j) Then, from T [ S ]i][Aj]The bonus function when 1 yields a bonus value, otherwise the bonus value is set to-1.
g(x) is a cost function, defined such that the cost increases as the number of hops x passed by the packet increases, with g(x) ∈ (0,1), where l_m is the total number of links in the network topology. Considering that when the network topology is large it is impractical to traverse all paths, and a data packet can only be forwarded over a subset of the links, the cost function should satisfy: it grows quickly in the early stage and levels off in the later stage, and if the total number of hops passed by the data packet reaches l_m, the cost function value is maximized. In summary, the cost function is as follows:
[Equation: cost function g(x); rendered as an image in the original]
as shown in fig. 3, it can be seen that the cost function is an increasing function, the range of the increasing function is (0,1), the cost increases with the increase of the number of hops x, and the function grows more rapidly in the early stage and tends to be stable in the later stage, which meets the requirement of the cost function.
The second requirement in designing the reward function is to encourage flows with large β to be allocated paths with small hop counts, so the cost function g(x) in the reward function is multiplied by the traffic QoS level β. Under the same conditions, the more hops a packet passes through in forwarding, the greater the cost incurred by a flow with a high QoS level, so paths with fewer hops will be selected for high-QoS flows during planning.
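To make these definitions concrete, the following Python sketch implements the action set A(si) from item (2) and a reward of the shape described in item (4). The adjacency matrix, the additive combination of β·η, δ(j − d), and β·g(x), and the specific g(x) are assumptions consistent with the text, since the patent's matrix and formulas survive only as image references:

```python
import numpy as np

# Hypothetical 9-switch adjacency matrix T (the patent's actual T is image-rendered);
# T[i][j] = 1 means switches i and j are directly connected.
T = np.zeros((9, 9), dtype=int)
for i, j in [(0, 1), (0, 3), (1, 2), (1, 4), (2, 5), (3, 4),
             (3, 6), (4, 5), (4, 7), (5, 8), (6, 7), (7, 8)]:
    T[i][j] = T[j][i] = 1  # undirected links

L_M = int(T.sum() // 2)  # total number of links l_m in this topology

def action_set(s_i: int) -> list:
    """A(s_i) = {s_j | T[s_i][s_j] = 1}: the neighbor states selectable from s_i."""
    return [s_j for s_j in range(T.shape[0]) if T[s_i][s_j] == 1]

def g(x: int) -> float:
    """Assumed cost function: increasing, concave, range (0,1), fast early growth."""
    return x / (x + L_M)

def reward(s_i: int, a_j: int, beta: float, eta: float, d: int, x: int) -> float:
    """Assumed reward for a packet in state s_i choosing next hop a_j: reward only
    for connected next hops, a bonus for reaching destination d, and a
    QoS-weighted hop cost beta * g(x)."""
    if T[s_i][a_j] != 1:
        return -1.0                       # unconnected next hop is penalized
    delta = 1.0 if a_j == d else 0.0      # impulse function delta(j - d)
    return beta * eta + delta - beta * g(x)

print(action_set(0))  # with this hypothetical T: [1, 3], i.e., A(s1) = {s2, s4}
```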
After the MDP quadruple is determined, the Q-value table is trained with the QLearning algorithm, the specific steps of which are as follows.
First, set the maximum number of steps for a single training round, then:
(1) initialize the Q-value table and the reward function R;
(2) select an action a according to the ε-greedy action strategy P;
(3) execute action a, transition to state s', compute the reward value with the reward function R, and update the Q-value table;
(4) judge whether s' is the destination node; if not, let s = s' and return to step (2).
As shown in FIG. 4, the learning rate α is set to 0.8, the discount rate γ is set to 0.6, and the ε of the ε-greedy action strategy is set to 0.1.
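One way this training loop might be realized with the stated hyperparameters is sketched below in Python, reusing the hypothetical action_set and reward helpers from the previous sketch; the per-round step cap and episode count are assumptions:

```python
import random

ALPHA, GAMMA, EPSILON = 0.8, 0.6, 0.1  # learning rate, discount rate, exploration rate
MAX_STEPS = 100                        # assumed cap on steps per training round

def train_q_table(src: int, dst: int, beta: float, eta: float, episodes: int = 500):
    """Hedged sketch of steps (1)-(4): train a Q-value table for flows src -> dst."""
    n = T.shape[0]
    Q = [[0.0] * n for _ in range(n)]              # (1) initialize the Q-value table
    for _ in range(episodes):
        s, hops = src, 0                           # each round starts from the source
        while s != dst and hops < MAX_STEPS:
            actions = action_set(s)
            if random.random() < EPSILON:          # (2) epsilon-greedy action selection
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda j: Q[s][j])
            hops += 1
            r = reward(s, a, beta, eta, dst, hops)  # (3) reward via R, then Q update
            Q[s][a] += ALPHA * (r + GAMMA * max(Q[a]) - Q[s][a])
            s = a                                   # (4) stop once s' is the destination
    return Q
```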
A path Routing is obtained according to the trained Q-value table and put into the path set Routing(S, D), and it is judged whether the minimum link bandwidth of the path is smaller than the bandwidth of the flow. If so, a slice of the following size is divided from the flow:
[Equation: size of the divided flow slice; rendered as an image in the original]
where B_available denotes the minimum available link bandwidth of the current output path, β denotes the QoS level of the current flow, and Σ_i β_i denotes the sum of the QoS levels of all flows. The small flow travels from the source node to the destination node through the current output path. Then the formula
[Equation: flow update; rendered as an image in the original]
is used to update the flow, where B denotes the size of the flow already planned; that is, the remaining traffic is returned as a new flow to step 2 for Q-value table training. If not, planning is finished, and the planned multi-path routes are obtained from Routing(S, D).
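Putting step 3 together, the following Python sketch plans paths and splits oversized flows; the greedy path decoding from Q and the slice size B_available · β / Σ_i β_i are assumptions, since the corresponding formulas are image-rendered in the original:

```python
def extract_path(Q, src: int, dst: int, max_hops: int = 100) -> list:
    """Decode a path by greedily following the largest Q value (assumed decoding)."""
    path, s = [src], src
    while s != dst and len(path) <= max_hops:
        s = max(action_set(s), key=lambda j: Q[s][j])
        path.append(s)
    return path

def plan_multipath(avail_bw: dict, src: int, dst: int, flow_bw: float,
                   beta: float, sum_beta: float, eta: float = 0.5):
    """Hedged sketch of step 3. avail_bw maps (i, j) edges, in both orientations,
    to their available bandwidth; returns Routing(S, D) as (path, bandwidth) pairs."""
    routing = []                                   # path set Routing(S, D)
    while flow_bw > 0:
        Q = train_q_table(src, dst, beta, eta)     # retrain for the remaining flow
        path = extract_path(Q, src, dst)
        edges = list(zip(path, path[1:]))
        b_avail = min(avail_bw[e] for e in edges)  # minimum link bandwidth of the path
        if b_avail >= flow_bw:                     # the path carries the whole flow
            routing.append((path, flow_bw))
            break
        b_split = b_avail * beta / sum_beta        # assumed slice-size formula
        if b_split <= 0:
            break                                  # no capacity left on any found path
        routing.append((path, b_split))            # forward the slice on this path
        for i, j in edges:                         # consume bandwidth along the path
            avail_bw[(i, j)] -= b_split
            avail_bw[(j, i)] -= b_split
        flow_bw -= b_split                         # remaining traffic becomes a new flow
    return routing
```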
The above-described embodiments are intended to illustrate rather than limit the invention; any modifications and variations of the present invention fall within the spirit of the invention and the scope of the appended claims.

Claims (4)

1. An SDN multi-path route planning method based on reinforcement learning is characterized by comprising the following steps:
step 1: acquiring the available bandwidth information, total bandwidth information, node information, and link information of the network to construct a network topology matrix, and acquiring the feature matrix of the flow to be forwarded;
step 2: adopting the QLearning algorithm as the reinforcement learning model, and inputting the network topology matrix and the feature matrix of the flow to be forwarded from step 1 into the reinforcement learning model to train the Q-value table; the reward function R in the QLearning algorithm is as follows:
[Equation: reward function R_t(S_i, A_j); rendered as an image in the original]
wherein: rt(Si,Aj) Indicating slave status S of a data packetiSelection action AjThe obtained reward is represented in a routing planning task as the reward generated when the next hop selected by the data packet at the node i is the node j, β is the flow QoS grade, η is the bandwidth utilization rate, d is the destination node, delta (j-d) is an impulse function and represents that when the next hop node of the data packet is the destination node, the value is 1, T is the connection state of the network topology nodes, the two nodes are 1 when connected and 0 when not connected, and g (x) is a cost function, and the following steps are shown:
[Equation: cost function g(x); rendered as an image in the original]
where l_m is the total number of links in the network topology, and x is the number of hops the data packet has passed through in forwarding;
step 3: obtaining a path Routing according to the Q-value table, putting it into the path set Routing(S, D), and judging whether the minimum link bandwidth of the path is smaller than the bandwidth of the flow; if so, dividing from the flow a slice of size
[Equation: size of the divided flow slice; rendered as an image in the original]
where B_available denotes the minimum available link bandwidth of the current output path, β denotes the QoS level of the current flow, and Σ_i β_i denotes the sum of the QoS levels of all flows; the divided flow is transmitted from the source node to the destination node through the current output path, and the remaining traffic is treated as a new flow and returned to step 2 to retrain the Q-value table; if not, planning is finished, and the planned multi-path routes are obtained from Routing(S, D).
2. The reinforcement-learning-based SDN multipath route planning method of claim 1, wherein the feature matrix of the flow to be forwarded represents the information of a flow in matrix form, the information including the source address, destination address, QoS level, and traffic size of the flow.
3. The reinforcement-learning-based SDN multipath route planning method of claim 1, wherein the process of training the Q-value table with the QLearning algorithm specifically comprises the following steps:
after setting the maximum number of steps for a single training round, performing the following steps:
(1) initializing the Q-value table and the reward function R;
(2) selecting an action a according to the ε-greedy action strategy P;
(3) executing action a, transitioning to state s', computing the reward value with the reward function R, and updating the Q-value table;
(4) judging whether s' is the destination node; if not, letting s = s' and returning to step (2).
4. The reinforcement-learning-based SDN multi-path route planning method of claim 3, wherein the cost function is defined such that the cost increases as the number of hops x passed by the data packet in forwarding increases, with g(x) ∈ (0,1), and the cost function should satisfy: the curve of g(x) is an upward-convex (concave) curve, and as the total hop count of the data packet tends to infinity, the cost function value tends to 1.
CN201911183909.4A 2019-11-27 2019-11-27 SDN multi-path routing planning method based on reinforcement learning Active CN110986979B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911183909.4A CN110986979B (en) 2019-11-27 2019-11-27 SDN multi-path routing planning method based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911183909.4A CN110986979B (en) 2019-11-27 2019-11-27 SDN multi-path routing planning method based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN110986979A true CN110986979A (en) 2020-04-10
CN110986979B CN110986979B (en) 2021-09-10

Family

ID=70087419

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911183909.4A Active CN110986979B (en) 2019-11-27 2019-11-27 SDN multi-path routing planning method based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN110986979B (en)

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104640168A (en) * 2014-12-04 2015-05-20 北京理工大学 Q-learning based vehicular ad hoc network routing method
US20190123974A1 (en) * 2016-06-23 2019-04-25 Huawei Technologies Co., Ltd. Method for generating routing control action in software-defined network and related device
US20180034922A1 (en) * 2016-07-28 2018-02-01 At&T Intellectual Property I, L.P. Network configuration for software defined network via machine learning
CN106713143A (en) * 2016-12-06 2017-05-24 天津理工大学 Adaptive reliable routing method for VANETs
CN107104819A (en) * 2017-03-23 2017-08-29 武汉邮电科学研究院 Adaptive self-coordinating unified communications and communication means based on SDN
CN107911299A (en) * 2017-10-24 2018-04-13 浙江工商大学 A kind of route planning method based on depth Q study
CN109005471A (en) * 2018-08-07 2018-12-14 安徽大学 Based on the extensible video stream method of multicasting of QoS Intellisense under SDN environment
CN109361601A (en) * 2018-10-31 2019-02-19 浙江工商大学 A kind of SDN route planning method based on intensified learning
CN109450794A (en) * 2018-12-11 2019-03-08 上海云轴信息科技有限公司 A kind of communication means and equipment based on SDN network
CN109768940A (en) * 2018-12-12 2019-05-17 北京邮电大学 The flow allocation method and device of multi-service SDN network
CN109547340A (en) * 2018-12-28 2019-03-29 西安电子科技大学 SDN data center network jamming control method based on heavy-route

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
TRUONG THU HUONG et al.: "A global multipath load-balanced routing algorithm based on Reinforcement Learning", 2019 International Conference on Information and Communication Technology Convergence (ICTC) *
金子晋 et al.: "A routing selection mechanism with service partition based on the QLearning algorithm in SDN environments", Chinese Journal of Network and Information Security *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111917657A (en) * 2020-07-02 2020-11-10 北京邮电大学 Method and device for determining flow transmission strategy
CN112398733B (en) * 2020-11-24 2022-03-25 新华三大数据技术有限公司 Traffic scheduling forwarding method and device
CN112398733A (en) * 2020-11-24 2021-02-23 新华三大数据技术有限公司 Traffic scheduling forwarding method and device
CN112671648A (en) * 2020-12-22 2021-04-16 北京浪潮数据技术有限公司 SDN data transmission method, SDN, device and medium
CN112822109A (en) * 2020-12-31 2021-05-18 上海缔安科技股份有限公司 SDN core network QoS route optimization algorithm based on reinforcement learning
US11606265B2 (en) 2021-01-29 2023-03-14 World Wide Technology Holding Co., LLC Network control in artificial intelligence-defined networking
CN113158543A (en) * 2021-02-02 2021-07-23 浙江工商大学 Intelligent prediction method for software defined network performance
CN113158543B (en) * 2021-02-02 2023-10-24 浙江工商大学 Intelligent prediction method for software defined network performance
CN113098771A (en) * 2021-03-26 2021-07-09 哈尔滨工业大学 Distributed self-adaptive QoS routing method based on Q learning
CN113347108B (en) * 2021-05-20 2022-08-02 中国电子科技集团公司第七研究所 SDN load balancing method and system based on Q-learning
CN113347108A (en) * 2021-05-20 2021-09-03 中国电子科技集团公司第七研究所 SDN load balancing method and system based on Q-learning
CN113347104A (en) * 2021-05-31 2021-09-03 国网山东省电力公司青岛供电公司 SDN-based routing method and system for power distribution Internet of things
WO2022257917A1 (en) * 2021-06-11 2022-12-15 华为技术有限公司 Path planning method and related device
CN113489654A (en) * 2021-07-06 2021-10-08 国网信息通信产业集团有限公司 Routing method, routing device, electronic equipment and storage medium
CN113489654B (en) * 2021-07-06 2024-01-05 国网信息通信产业集团有限公司 Routing method, device, electronic equipment and storage medium
CN114124828A (en) * 2022-01-27 2022-03-01 广东省新一代通信与网络创新研究院 Machine learning method and device based on programmable switch
CN114845359A (en) * 2022-03-14 2022-08-02 中国人民解放军军事科学院战争研究院 Multi-intelligent heterogeneous network selection method based on Nash Q-Learning
CN115550236B (en) * 2022-08-31 2024-04-30 国网江西省电力有限公司信息通信分公司 Data protection method oriented to security middle station resource pool route optimization
CN115941579B (en) * 2022-11-10 2024-04-26 北京工业大学 Mixed routing method based on deep reinforcement learning
CN117033005A (en) * 2023-10-07 2023-11-10 之江实验室 Deadlock-free routing method and device, storage medium and electronic equipment
CN117033005B (en) * 2023-10-07 2024-01-26 之江实验室 Deadlock-free routing method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN110986979B (en) 2021-09-10

Similar Documents

Publication Publication Date Title
CN110986979B (en) SDN multi-path routing planning method based on reinforcement learning
CN109361601B (en) SDN route planning method based on reinforcement learning
CN106789648B (en) Software defined network route decision method based on content storage and Network status
CN103210617B (en) For reducing the method and system of message in network and computing cost
CN112822109B (en) SDN core network QoS route optimization method based on reinforcement learning
CN112600759B (en) Multipath traffic scheduling method and system based on deep reinforcement learning under Overlay network
CN107294852B (en) Network routing method using topology dispersed short path set
CN101986628B (en) Method for realizing multisource multicast traffic balance based on ant colony algorithm
CN104580165A (en) Cooperative caching method in intelligence cooperative network
CN114143264A (en) Traffic scheduling method based on reinforcement learning in SRv6 network
CN101674220B (en) Forwarding history-based asynchronous rooting algorithm
Haeri et al. A reinforcement learning-based algorithm for deflection routing in optical burst-switched networks
CN102427596A (en) Routing method and scheduling method of node mobile network assisted by positioning information
CN109922161A (en) Content distribution method, system, equipment and the medium of dynamic cloud content distributing network
JP4589978B2 (en) Route setting method and route setting device
Li et al. A data forwarding mechanism based on deep reinforcement learning for deterministic networks
CN115037669A (en) Cross-domain data transmission method based on federal learning
CN114745322A (en) Video stream routing method based on genetic algorithm in SDN environment
Wang et al. Implementation of multipath network virtualization scheme with SDN and NFV
Iqbal Cache-mab: A reinforcement learning-based hybrid caching scheme in named data networks
Liu et al. Ppo-based reliable concurrent transmission control for telemedicine real-time services
CN109450809B (en) Data center scheduling system and method
Dong et al. Topology control mechanism based on link available probability in aeronautical ad hoc network
CN100442758C (en) Multicast transfer route setting method, and multicast label switching method for implementing former method
Chunxin et al. A hybrid scatter search algorithm for QoS multicast routing problem

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant