CN108900419A - Routing decision method and device based on deep reinforcement learning under an SDN architecture - Google Patents
- Publication number
- CN108900419A (application CN201810945527.XA, CN201810945527A)
- Authority
- CN
- China
- Prior art keywords
- sample flow
- routing
- network
- stream
- priority
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/302—Route determination based on requested QoS
- H04L45/306—Route determination based on the nature of the carried application
- H04L45/02—Topology update or discovery
- H04L45/08—Learning-based routing, e.g. using neural networks or artificial intelligence
Abstract
The embodiments of the invention provide a routing decision method and device based on deep reinforcement learning under an SDN architecture. The method is applied to an SDN controller and includes: obtaining real-time traffic information in the network; determining the priority of each flow; and inputting the real-time traffic information into a pre-trained deep Q network (DQN), which determines the route of each flow in descending order of flow priority. The embodiments of the invention can achieve network load balancing in networks of various topologies, reduce the occurrence of network congestion, and optimize the routing policy in network environments where traffic varies highly dynamically.
Description
Technical field
The present invention relates to the field of communication technology, and in particular to a routing decision method and device based on deep reinforcement learning under an SDN architecture.
Background art
Congestion avoidance and routing optimization have long been important research topics in traffic engineering for modern communication networks. With the rapid growth in the number of users and in network scale, network structures have become increasingly complex, and network congestion and routing optimization face ever greater challenges.
Highly dynamic traffic and unevenly distributed traffic density are the main causes of network congestion. To relieve congestion, a common solution is to split traffic that may cause congestion across multiple paths, preventing the overload caused by concentrated traffic. Equal-Cost Multipath Routing (ECMP) is one widely used load-balancing technique of this kind. The basic principle of ECMP is: when multiple different links exist between a source address and a destination address, a network protocol that supports ECMP can transmit data between the source and destination over several equal-cost links simultaneously.

However, ECMP simply distributes traffic evenly across the equal-cost links without considering the actual traffic distribution in the network, so it performs poorly in networks with asymmetric topologies and asymmetric traffic. In such networks the traffic distribution is unbalanced, and the more unbalanced it is, the harder it becomes for ECMP to reduce or avoid congestion. Because congestion cannot reliably be avoided, a routing policy based on ECMP cannot be optimal in network environments where traffic varies highly dynamically.
Summary of the invention
The embodiments of the present invention aim to provide a routing decision method and device based on deep reinforcement learning under an SDN architecture, so as to reduce the occurrence of network congestion in networks of various topologies and to optimize the routing policy in network environments where traffic varies highly dynamically. The specific technical solutions are as follows:
In a first aspect, an embodiment of the invention provides a routing decision method based on deep reinforcement learning under an SDN architecture, applied to an SDN controller, the method including:

obtaining real-time traffic information in the network, where the real-time traffic information includes the link bandwidth occupied by each flow in the network;

determining the priority of each flow;

inputting the real-time traffic information into a pre-trained deep Q network (DQN), which determines the route of each flow in descending order of flow priority;

where the DQN is trained from sample traffic information and the sample routing policy corresponding to that sample traffic information; the sample traffic information includes the link bandwidth occupied by each sample flow, and the sample routing policy includes the route of each sample flow corresponding to the sample traffic information.
In a second aspect, an embodiment of the invention provides a routing decision device based on deep reinforcement learning under an SDN architecture, applied to an SDN controller, the device including:

a first obtaining module, configured to obtain real-time traffic information in the network, where the real-time traffic information includes the link bandwidth occupied by each flow in the network;

a first determining module, configured to determine the priority of each flow;

a second determining module, configured to input the real-time traffic information into a pre-trained deep Q network (DQN), which determines the route of each flow in descending order of flow priority;

where the DQN is trained from sample traffic information and the sample routing policy corresponding to that sample traffic information; the sample traffic information includes the link bandwidth occupied by each sample flow, and the sample routing policy includes the route of each sample flow corresponding to the sample traffic information.
In a third aspect, an embodiment of the invention provides an SDN controller including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with one another through the communication bus;

the memory is configured to store a computer program;

the processor, when executing the program stored in the memory, implements the steps of the routing decision method based on deep reinforcement learning under an SDN architecture described in the first aspect above.
In a fourth aspect, an embodiment of the invention provides a computer-readable storage medium storing instructions which, when run on a computer, cause the computer to execute the steps of the routing decision method based on deep reinforcement learning under an SDN architecture described in the first aspect above.
In a fifth aspect, an embodiment of the invention provides a computer program product containing instructions which, when run on a computer, cause the computer to execute the steps of the routing decision method based on deep reinforcement learning under an SDN architecture described in the first aspect above.
In the embodiments of the present invention, a DQN is trained in advance from sample traffic information and the corresponding sample routing policy. When determining the route of each flow in the network, the real-time traffic information of the network is obtained and input into the trained DQN, which then determines the route of each flow in order of flow priority. Because routes are determined by a pre-trained DQN, and the DQN can be trained on sample data from a network of any topology to be analyzed, the embodiments of the invention can reduce the occurrence of network congestion in networks of various topologies and can optimize the routing policy in network environments where traffic varies highly dynamically.

Of course, implementing any product or method of the present invention does not necessarily require achieving all of the above advantages simultaneously.
Detailed description of the invention
To explain the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings required in the description of the embodiments or the prior art are briefly introduced below.

Fig. 1 is a flow chart of a routing decision method based on deep reinforcement learning under an SDN architecture provided by an embodiment of the present invention;

Fig. 2 is another flow chart of a routing decision method based on deep reinforcement learning under an SDN architecture provided by an embodiment of the present invention;

Fig. 3 is another flow chart of a routing decision method based on deep reinforcement learning under an SDN architecture provided by an embodiment of the present invention;

Fig. 4 is a structural diagram of a routing decision device based on deep reinforcement learning under an SDN architecture provided by an embodiment of the present invention;

Fig. 5 is another structural diagram of a routing decision device based on deep reinforcement learning under an SDN architecture provided by an embodiment of the present invention;

Fig. 6 is a structural schematic diagram of an SDN controller provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
To facilitate understanding of the solution, SDN (Software Defined Network), DRL (Deep Reinforcement Learning), and DQN (Deep Q Network) are first briefly introduced below.
SDN is a novel network architecture. Unlike the traditional network architecture, SDN separates the data plane of the network from the control plane. Communication between the data plane and the control plane can be realized through an open protocol, the OpenFlow protocol. Based on the OpenFlow protocol, an OpenFlow switch in the data plane can not only forward and transmit ordinary data traffic, but can also upload the collected real-time traffic information of the network to the SDN controller in the control plane. The SDN controller collects and aggregates the traffic information uploaded by the OpenFlow switches in the network area it manages, and formulates the corresponding routing policy and forwarding mechanism according to the collected network traffic information. Compared with the traditional network architecture, SDN has many advantages: Network Function Virtualization (NFV) can be realized on top of SDN, decoupling software from hardware and abstracting network functions so that the functions of network devices no longer depend on dedicated hardware, allowing resources to be shared with full flexibility. Through the SDN controller, global routing control of the network can be fully realized. This means that the routing policies of the flows in the network and the traffic distribution can be controlled from a global perspective, so as to solve the congestion caused by unevenly distributed traffic density in the network.
DRL is a novel machine learning method proposed by DeepMind that combines deep neural networks (Deep Neural Network, DNN) with reinforcement learning (Reinforcement Learning, RL). To apply DRL to a control problem in a given scenario, the problem must satisfy the following conditions: (1) an environment with clear, well-defined rules; (2) a system that can give accurate and timely feedback; and (3) a reward function that defines the task objective. The traffic control and routing decision problems of a network satisfy these conditions; in other words, it is feasible to realize network traffic control and routing decisions with DRL. Specifically, a task handled by RL is usually described as a Markov Decision Process (MDP): in a given environment E, there is a state space S and an action space A. RL uses an agent to make decisions in environment E; each state in S represents the agent's perception of the current environment, and each action in A is a candidate action available in each state. After the agent executes an action in some state according to a policy π(s), the state transitions, and the environment E gives the agent a reward according to the next state. When the agent starts from an initial state, follows policy π(s) through a series of actions, and thus through a series of state transitions, it obtains a cumulative reward Q^π(s, a). The goal of RL is to find an optimal policy π*(s) that maximizes the cumulative reward obtained by the agent.

When RL handles a task, the optimal policy can be found by Q-learning. But when the state space is too large, solving the cumulative reward Q^π(s, a) for every state by Q-learning becomes very difficult. To solve this problem, a DNN can be used to approximate the cumulative reward Q^π(s, a). This method of combining a DNN with Q-learning is called DQN.
The main structure of a DQN is a neural network, called the Q network. The Q network takes a state s as input and outputs the value Q^π(s, a) of each candidate action in state s. Since the output Q^π(s, a) is an approximation produced by the Q network, the parameters θ of the Q network must be trained to make the approximation more accurate. Specifically, during training the value of a loss function is computed; when the value of the loss function does not satisfy a set condition, the parameters θ of the Q network are updated by the usual backpropagation and gradient descent methods. After sufficient training and updating, the Q^π(s, a) output by the Q network approaches the optimal cumulative reward Q*(s, a), and the current policy π(s) approaches the optimal policy π*.
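The training loop just described can be sketched in miniature. This is not the patent's network: it is a generic one-hidden-layer DQN-style update in NumPy with invented dimensions, showing the loss on the TD error, backpropagation, and gradient descent on the parameters θ (here W1, W2):

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny Q network: one hidden ReLU layer approximating Q(s, a).
STATE_DIM, HIDDEN, N_ACTIONS = 4, 16, 3
W1 = rng.normal(0.0, 0.1, (STATE_DIM, HIDDEN))
W2 = rng.normal(0.0, 0.1, (HIDDEN, N_ACTIONS))

def q_values(s):
    h = np.maximum(0.0, s @ W1)   # hidden activations
    return h, h @ W2              # Q(s, a) for every action a

def dqn_update(s, a, reward, s_next, done, gamma=0.9, lr=0.05):
    """One gradient-descent step on the squared TD error
    (Q(s, a) - target)^2, target = r + gamma * max_a' Q(s', a')."""
    global W1, W2
    h, q = q_values(s)
    _, q_next = q_values(s_next)
    target = reward if done else reward + gamma * np.max(q_next)
    td = q[a] - target
    # Backpropagation through both layers, for action a only.
    grad_W2 = np.outer(h, np.eye(N_ACTIONS)[a]) * td
    grad_h = W2[:, a] * td
    grad_W1 = np.outer(s, grad_h * (h > 0.0))   # ReLU gate
    W1 -= lr * grad_W1
    W2 -= lr * grad_W2
    return float(td ** 2)

# Repeated updates on one terminal transition shrink the TD error.
s, s2 = rng.random(STATE_DIM), rng.random(STATE_DIM)
losses = [dqn_update(s, a=1, reward=1.0, s_next=s2, done=True)
          for _ in range(300)]
```

A full DQN additionally uses experience replay and a separate target network; this sketch only illustrates the loss-then-gradient-descent cycle the text refers to.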
In a network, when the data traffic load transmitted on a link or device exceeds the maximum link bandwidth or the maximum processing capacity of the device, the transmission delay on that link or device increases, throughput drops, and transmitted packets are lost. This situation, in which network transmission performance declines, is called network congestion. In general, congestion in a network is caused by the overload of links or devices. When the total data transmission load in the network exceeds the maximum load the network can carry, congestion is unavoidable; in that case it can only be avoided by upgrading the network hardware or adding extra devices. In many cases, however, congestion occurs even though the total data transmission load is far below the maximum load the network can carry. Such congestion is mostly caused by an uneven distribution of traffic in the network: because traditional routing protocols often use a basic shortest-path algorithm, large traffic loads are concentrated on certain links or devices in key positions in the network, while the devices and links at the network edge carry very little load, so the utilization of network resources is very low. Congestion of this kind can be reduced or avoided by optimizing the routing policy of the network.
Obviously, the optimal routing policy of a network must be determined according to the network topology and the real-time traffic information. When the data traffic in the network changes, the optimal routing policy must change with it. This requires the real-time traffic information of the network to be known, so that the routing policy of the network can be optimized according to it.
In order to reduce the occurrence of network congestion in networks of various topologies, and to optimize the routing policy in network environments where traffic varies highly dynamically, the embodiments of the invention provide a routing decision method and device based on deep reinforcement learning under an SDN architecture.
In the solution of the present invention, under the SDN architecture, the SDN controller located in the control plane collects and aggregates the real-time traffic information of the network. After the SDN controller has aggregated the real-time traffic information of the whole network, it can use a DRL method to determine the currently optimal routing policy of the network according to that information. Further, the currently optimal routing policy of the network can be determined based on the DQN in DRL.
The routing decision method based on deep reinforcement learning under an SDN architecture provided by the embodiments of the invention is introduced first below.
As shown in Fig. 1, the routing decision method based on deep reinforcement learning under an SDN architecture provided by an embodiment of the present invention is applied to an SDN controller and may include the following steps:

S101: obtain the real-time traffic information in the network, where the real-time traffic information includes the link bandwidth occupied by each flow in the network.
The method provided by the embodiment of the present invention can be applied to an SDN controller. The SDN controller is the controller of the control plane in the SDN architecture; it can collect the real-time traffic information of the network sent by the OpenFlow switches of the data plane, and formulate the corresponding routing policy and forwarding mechanism based on that information. Thus, the above network can be a communication network with an SDN architecture.
To facilitate understanding of the network, the flows, and the congestion problem in this embodiment, the network model, the definition and routing of flows, and the network congestion problem are introduced below.
First, the network model is: a communication network with several communication nodes and m physical links. Each communication node corresponds to an OpenFlow switch of the data plane in the SDN architecture. The communication nodes can be divided into two kinds: source nodes and forwarding nodes. A source node is a node that generates and finally receives data packets; all data packets in the network are generated by source nodes and finally arrive at source nodes. In this embodiment, the number of source nodes in the network is set to n, denoted s_1, s_2, s_3, ..., s_n. A forwarding node is a node responsible for forwarding data packets; forwarding nodes do not generate data packets, they only forward, based on flow tables, the data packets transmitted from other nodes.
Next, a flow in the network refers to: all data packets that leave the same source node and finally arrive at the same destination node are classified together, and such data packets jointly form a flow. The source node and destination node of a flow cannot be the same node; here the source node of a flow means the start node of that flow. It follows that a network with n source nodes has at most N = n² - n flows. To quantitatively describe the traffic demand of each flow in the network, define: the link bandwidth occupied on one link by the normal transmission of a flow with source node s_i and destination node s_j is f_{i,j}. For each flow f_{i,j}, x alternate routes R¹_{i,j}, R²_{i,j}, ..., Rˣ_{i,j} can be determined between its source node and destination node; each alternate route specifies all the links that the flow f_{i,j} passes through from source node s_i to destination node s_j. In this embodiment, making a routing decision for the network means: for each flow in the network, selecting one actual route (called the route of that flow) from its alternate routes. The routing of a flow therefore means that the flow transmits its data packets over all the links specified by its actual route. It should be noted that in this embodiment, the flow is the minimum unit of control in the routing decision; making routing decisions with the flow as the minimum unit is easy to implement for an SDN controller that controls packet forwarding through flow tables.
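As a small concrete illustration of this flow model (the node names and bandwidth values are invented for the example), flows can be keyed by ordered (source, destination) pairs, which also makes the N = n² - n bound explicit:

```python
from itertools import permutations

n = 4
sources = [f"s{k}" for k in range(1, n + 1)]

# Every ordered (source, destination) pair with distinct endpoints is a
# potential flow, so there are at most N = n^2 - n flows.
flows = list(permutations(sources, 2))

# f_{i,j}: link bandwidth currently occupied by the flow s_i -> s_j;
# a flow that does not exist at this moment occupies bandwidth 0.
observed = {("s1", "s2"): 3.0, ("s1", "s3"): 1.5}
bandwidth = {pair: observed.get(pair, 0.0) for pair in flows}
```

Keying by (source, destination) matches the flow-table view of the SDN controller, where each flow is the minimum unit of control.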
Finally, the network congestion problem refers to: for the m physical links in the network, each physical link is given two parameters: the maximum available bandwidth thresholds of the links, t_1, t_2, t_3, ..., t_m, and the real-time link load values of the links, l_1, l_2, l_3, ..., l_m. The bandwidth threshold and the real-time link load value of a link use the same unit of measurement. In the definition of flows above, the link bandwidth occupied on a link by the normal transmission of a flow is denoted f_{i,j}. For a link k, if several flows are currently routed across link k, for example three flows f_{1,2}, f_{1,3}, f_{1,4}, the real-time link load value l_k of link k is defined as the sum of the link bandwidths occupied by those flows on the link, i.e., l_k = f_{1,2} + f_{1,3} + f_{1,4}. If the value of l_k exceeds the maximum available bandwidth threshold t_k of link k, link k is considered congested, and the degree to which l_k exceeds t_k corresponds to the severity of the congestion on link k: the higher the severity of congestion, the lower the throughput of link k and the higher the delay of the flows passing through it, i.e., the higher the transmission delay of the three flows f_{1,2}, f_{1,3}, f_{1,4}. If l_k does not exceed the maximum available bandwidth threshold t_k, link k is considered not congested; the throughput of link k increases linearly with l_k, and the flows passing through link k can be transmitted within an acceptable delay range.
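The load and congestion test just defined can be sketched directly. The link names, routes, thresholds, and bandwidths below are invented for the example; the definitions l_k = Σ f_{i,j} and "congested iff l_k > t_k" are the patent's:

```python
def link_load(link, routes, bandwidth):
    """l_k: sum of the occupied bandwidths f_{i,j} of all flows whose
    route passes through `link`."""
    return sum(bandwidth[flow] for flow, path in routes.items()
               if link in path)

def is_congested(link, routes, bandwidth, threshold):
    """Link k is congested when its real-time load l_k exceeds its
    maximum available bandwidth threshold t_k."""
    return link_load(link, routes, bandwidth) > threshold

# Three flows routed across link "k", as in the text's example.
routes = {("s1", "s2"): ["k"],
          ("s1", "s3"): ["k", "m"],
          ("s1", "s4"): ["k"]}
bandwidth = {("s1", "s2"): 4.0, ("s1", "s3"): 3.0, ("s1", "s4"): 5.0}

l_k = link_load("k", routes, bandwidth)   # 4.0 + 3.0 + 5.0 = 12.0
```

With t_k = 10.0 this link is congested (l_k exceeds the threshold by 2.0), while link "m", carrying only the 3.0 of f_{1,3}, is not.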
Based on the above introduction to the network model, the definition and routing of flows, and the network congestion problem, the goal of the routing decision problem is: in a network with n source nodes and m physical links, given the link bandwidth occupied by each flow at a certain moment, select the most suitable route for each flow, so that the load-balancing state of the network is optimal and the probability of congestion in the network is minimal. It can be understood that if a certain flow does not exist at a certain moment, its occupied link bandwidth is 0.
In this embodiment, the SDN controller can collect and aggregate the real-time traffic information of the network sent by all the OpenFlow switches in the data plane of the SDN architecture, so as to obtain the real-time traffic information of the network. This process can be realized by the prior art and is not repeated here.
In actual use, the SDN controller can periodically obtain the real-time traffic information of the network at a certain time interval. Each time the SDN controller obtains the real-time traffic information of the network, it can make a routing decision for that information. This embodies that in this embodiment, when the traffic information in the network changes, the routing policy of the network is adjusted accordingly. Thus, when the traffic in the network varies highly dynamically, the SDN controller obtains the real-time traffic information of the network in real time and adjusts the routing policy, so that the routing policy of the network remains optimal.

The above time interval can be determined according to the specific conditions of the network, in particular according to the degree of traffic variation of the network. If the traffic of the network changes quickly, the time interval can be set to a smaller value; if the traffic of the network changes slowly, the time interval can be set to a larger value.
S102: determine the priority of each flow.
Through a large number of simulation experiments, the inventors observed that when making routing decisions for networks with asymmetric topologies, the order in which routes are selected for all the flows in the network has a significant impact on the processing speed and effect of the DQN (the DQN in step S102 refers to the trained DQN): in some cases the processing speed of the DQN improves markedly, while in other cases it is very slow or even fails to converge. By comparing multiple groups of simulation experiments, it was found that the processing speed and effect of the DQN are related to the alternate routes of the different flows. When a flow has an "ideal" route among its alternate routes — an "ideal" route here meaning one on whose path the other flows bring very little load, i.e., one whose path is unlikely to become congested — performing the route selection of that flow first allows the DQN to optimize the routing decision at a higher processing speed; whereas if flows without such "ideal" alternate routes are routed first, the processing time of the DQN becomes much longer and the result is also unsatisfactory. The reason is: when one of a flow's alternate routes is clearly better than the others, the DQN can easily output the optimal routing policy of that flow; and when selecting routes for all flows in the network in sequence, the earlier that flow is placed, the faster the DQN can output its optimal routing policy, while the solution space that must be explored to optimize the routing policy of the whole network also becomes smaller and easier to handle. The above "DQN processing" means: determining the optimized route of each flow in the network based on the trained DQN.
For these reasons, in this embodiment, a method for determining flow priorities is proposed to determine the ordering of the flows' routing policies before DQN processing. In one implementation, determining the priority of each flow in step S102 may include the following steps:
S11: for each flow f_{i,j}, determine x alternate routes R¹_{i,j}, R²_{i,j}, ..., Rˣ_{i,j} of that flow, where i denotes the source node of flow f_{i,j} and j denotes the destination node of flow f_{i,j}.

The source nodes and forwarding nodes in the network can form many routes. So for each flow f_{i,j}, some alternate routes can first be selected among these routes, and one of the alternate routes is then further selected as the actual route of the flow f_{i,j}.
When choosing the alternate routes of each flow, the alternate routes of each flow can satisfy the following conditions:

Condition 1: every alternate route of each flow is loop-free.

It can be understood that when a loop exists in an alternate route, the data packets transmitted on that alternate route will never reach the destination node.

Condition 2: the path traversed by any alternate route of a flow is not completely identical to the path traversed by any other alternate route of that flow.

That is, for any flow, each alternate route of that flow differs from the flow's other alternate routes. Since the purpose of this embodiment is, for any flow, to select one optimal route among the flow's alternate routes as the flow's actual route, the alternate routes should differ from one another so that they can be compared.

Condition 3: the distance of every alternate route of each flow satisfies a preset value.

For any flow, when selecting an optimized route for the flow, a shorter route is usually desired. Therefore a preset value can be set, and routes whose distance is less than the preset value are taken as the alternate routes of that flow. The distance of a route means the distance from the source node (i.e., the start node) of the route to its destination node; specifically, it can be measured by the number of links the route passes through, or by other usual means. The preset value can be set according to actual needs. The same preset value can be used for different flows, or different preset values can be set for them separately; the present invention does not limit this.
S12: calculate an evaluation value EV_r for the r-th alternate route R^r_{i,j} of flow f_{i,j}. The formula (reproduced as an image in the original publication) is defined in terms of: the links l_1, l_2, l_3, ..., l_L passed through by the r-th alternate route R^r_{i,j}; for each such link, the total number of times the link is passed through by the alternate routes of the flows in the network other than f_{i,j}; and the maximum value among these per-link counts.
After the alternate routes of every flow are determined, each alternate route of a flow can be evaluated. Specifically, the extent to which the links traversed by each alternate route are occupied by other flows can be evaluated. In the present embodiment, the above evaluation value EV_r is used for this purpose. From the way EV_r is calculated, the following conclusion can be drawn: the larger the EV_r of an alternate route, the lower the utilization of the links the route traverses, and the smaller the possibility that flow f_{i,j} suffers congestion after selecting this route.
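The relationship just described, that a larger EV_r means the route's links are less occupied by other flows' alternates, can be sketched in code. Note the patent's exact EV_r expression is a formula image not reproduced here, so this sketch simply assumes EV_r is the negated maximum occupation count; the function and variable names are illustrative, not from the patent.

```python
def route_evaluation(route_links, occupation_count):
    """Hypothetical sketch of step S12.

    route_links: the links l1..lL traversed by the r-th alternate route.
    occupation_count: maps each link to the total number of times it
    appears in the alternate routes of all *other* flows.
    Returns an EV_r-like score: higher when the route's busiest shared
    link is shared by fewer other flows' alternate routes.
    """
    return -max(occupation_count.get(link, 0) for link in route_links)

counts = {"l1": 3, "l2": 5, "l3": 1}
route_evaluation(["l1", "l2"], counts)  # busiest shared link is l2, count 5
```

A route avoiding heavily shared links (here one through l3 only) scores higher, matching the stated conclusion that a larger EV_r implies a lower congestion risk for flow f_{i,j}.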
S13: compute the priority reference value P_{i,j} of flow f_{i,j} by the following formula:

P_{i,j} = max(E) − max(E \ {max(E)})

where E denotes the set of the evaluation values of all alternate routes of flow f_{i,j}, E = {EV_1, EV_2, EV_3, …, EV_X}; max(E) denotes the maximum value in the set E; E \ {max(E)} denotes the new set formed by removing the maximum value max(E) from E; and max(E \ {max(E)}) denotes the maximum value in the new set E \ {max(E)}.
After the evaluation values of all alternate routes of flow f_{i,j} are computed, the priority reference value P_{i,j} of flow f_{i,j} can be further calculated from them. P_{i,j} represents, among all the alternate routes of flow f_{i,j}, the difference between the largest evaluation value and the second-largest evaluation value. In other words, P_{i,j} quantifies the degree to which the most "ideal" route among all alternate routes of f_{i,j} is better than the other routes. The larger P_{i,j}, the higher the priority of flow f_{i,j}.
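The margin in step S13, best evaluation value minus runner-up, is direct to compute; a minimal sketch (function name illustrative):

```python
def priority_reference(evaluations):
    """Step S13: P = max(E) - max(E \\ {max(E)}), i.e. the margin by
    which a flow's best alternate route beats its second-best one."""
    e = sorted(evaluations, reverse=True)  # needs at least two routes
    return e[0] - e[1]

priority_reference([4, 9, 7])  # best 9, runner-up 7 -> margin 2
```

A flow whose best route clearly dominates its alternatives gets a large margin and hence, per step S14, a higher priority, so its "obvious" choice is locked in before contested flows are routed.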
S14: determine the priority of every flow according to the order of the flows' priority reference values; the flow with the highest priority reference value has priority 0, and the flow with the lowest priority reference value has priority N−1.

After the priority reference values of all flows are computed, they can be sorted from high to low. The sorted order is the order in which routes are selected for all flows: the higher a flow's priority reference value, the higher its priority, and the earlier the flow is handled when making routing decisions.

In the present embodiment, the priority of the flow with the highest priority reference value is denoted 0, and that of the flow with the lowest priority reference value is denoted N−1. After the priority of every flow is determined, the route of every flow can be determined in turn, in priority order from 0 to N−1.
S103: input the real-time traffic information into a pre-trained deep Q-network (DQN), and determine the route of every flow in turn according to the order of the flows' priorities; the DQN is trained from sample traffic information and the sample routing policy corresponding to the sample traffic information. The sample traffic information includes the link bandwidth occupied by every sample flow, and the sample routing policy includes the sample route of every sample flow corresponding to the sample traffic information.

To determine the route of every flow, the DQN can be trained with pre-acquired sample traffic information and its corresponding sample routing policy, yielding a trained DQN. After the DQN is trained, the real-time traffic information of the network can be input into the trained DQN, so that the trained DQN determines the route of every flow in turn according to the flows' priority order. The sample route of each sample flow can be regarded as that flow's optimal route; therefore, the route the DQN determines for each flow can likewise be regarded as that flow's optimal route. The training process is thus one of learning the optimal route of every sample flow. On this basis, after training, inputting the network's real-time traffic information into the trained DQN yields the optimal route of each flow in the network.
Before training, a training environment can be preset. The training environment includes a plurality of sample flows, a plurality of communication nodes (including source nodes and forwarding nodes), and a plurality of links, as well as a flow-level network traffic-load model from which the correspondence between network traffic and link load in the training environment can be obtained. Since the routing decision of each sample flow in the training environment changes as the sample traffic information changes, for a given group of sample traffic information, the real-time load of each link in the training environment can be determined during the process of determining each sample flow's route, based on that correspondence. It will be appreciated that, because the route of each sample flow is determined in priority order, the load of each link changes whenever the route of one sample flow is determined, and this change affects the route determination of the next sample flow. In the present solution, the load of a link can be regarded as the linear accumulation of all traffic on the link; specifically, the load of a link is the sum of the link bandwidths occupied by the sample flows passing through the link.
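The linear-accumulation rule above can be sketched directly: the load of each link is the sum of the bandwidths of the flows routed over it. Names and data shapes here are illustrative assumptions.

```python
def link_loads(routed_flows, links):
    """Linear accumulation of link load: each link's load is the sum of
    the occupied bandwidths of the sample flows passing through it.

    routed_flows: list of (bandwidth, route) pairs, where route is the
    list of links the flow traverses.
    links: all links in the training environment.
    """
    loads = {link: 0.0 for link in links}
    for bandwidth, route in routed_flows:
        for link in route:
            loads[link] += bandwidth
    return loads

# two flows: 2.0 over links a,b and 1.0 over link b
link_loads([(2.0, ["a", "b"]), (1.0, ["b"])], ["a", "b", "c"])
```

Because the accumulation is linear, re-running it after each flow's route is fixed reproduces the step-by-step link-load updates described for the priority-ordered decision process.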
Based on the above preset training environment, the DQN can be trained. So that the trained DQN is suitable for routing each flow in the network, the training environment can be given a training network with the same structure as the network: the numbers of source nodes, forwarding nodes, and links in the training network equal the corresponding numbers in the network, and the bandwidth of each link in the training network equals the bandwidth of the corresponding link in the network. The process of training the DQN is described in detail below.
The process of inputting the real-time traffic information into the pre-trained deep Q-network (DQN) and determining the route of every flow in turn, according to the flows' priority order, can refer to the learning process of one episode carried out for one group of sample traffic information, described below.
In the scheme provided by this embodiment of the present invention, the DQN is trained in advance from sample traffic information and the sample routing policy corresponding to that sample traffic information. Then, when determining the route of every flow in the network, after the real-time traffic information of the network is obtained, it is input into the trained DQN, so that the DQN determines the route of every flow in turn according to each flow's priority. The embodiment of the present invention can achieve network load balancing in networks of various topologies, reduce the occurrence of network congestion, and optimize the routing policy in network environments where the traffic is highly dynamic.
The process of training the DQN in the embodiment of the present invention is introduced below. As shown in Fig. 2, the training process of the DQN can include the following steps:
S201: construct an initial DQN.

In the present embodiment, an initial DQN can be constructed in order to train the DQN. The structure of the initial DQN may include a state input layer, at least one hidden layer, and an action output layer. A group of sample traffic information can be input into the DQN at the state input layer and, after processing by the at least one hidden layer, the current route of each sample flow corresponding to that group of sample traffic information can be output at the action output layer. A current route is the result the DQN outputs, under its current parameters, after one round of learning. In the initial DQN, each parameter takes an initial value; the training process continuously optimizes the parameters of the DQN, so that the current routes output by the parameter-optimized DQN steadily approach the sample routes of the sample flows.
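The layered structure just described, a state input layer, hidden layer(s), and an action output layer scoring the alternate routes, can be sketched as a tiny NumPy MLP. The layer sizes, initialization, and activation are illustrative assumptions, not values from the patent.

```python
import numpy as np

def build_dqn(state_dim, hidden_dim, num_routes, seed=0):
    """Minimal sketch of the initial DQN of step S201: random initial
    weights for one hidden layer and an output layer whose units score
    the alternate routes of the current flow."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, (state_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, 0.1, (hidden_dim, num_routes)),
        "b2": np.zeros(num_routes),
    }

def q_values(net, state):
    """Forward pass: state input layer -> ReLU hidden layer -> one
    Q value per alternate route at the action output layer."""
    h = np.maximum(0.0, state @ net["W1"] + net["b1"])
    return h @ net["W2"] + net["b2"]
```

Picking the argmax of `q_values` would then correspond to outputting a "current route" for one sample flow; training adjusts `W1`, `b1`, `W2`, `b2` so these outputs approach the sample routes.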
S202: obtain sample traffic information and the sample routing policy corresponding to the sample traffic information.

After the initial DQN is constructed, sample traffic information and its corresponding sample routing policy can be obtained, so that the DQN can be further trained with the sample traffic information and the corresponding sample routing policy. When the network traffic information changes, the routing policy of the network, i.e., the route of each flow in the network, needs to be adjusted accordingly. Therefore, in the present embodiment, the DQN is trained specifically for one group of sample traffic information, and the result of training is that the DQN can output the sample route of each sample flow corresponding to that group of sample traffic information. Since in practical applications the network traffic information can be any group of values of the link bandwidth occupied by each flow in the network, multiple different groups of sample traffic information can be obtained when training the DQN, and training can be performed for each of the different groups separately. In this way, when routing each flow in the network for a certain group of real-time traffic information, the sample traffic information closest to that group of real-time traffic information can first be determined, and the DQN trained for that sample traffic information can be used directly to determine the route of each flow in the network corresponding to the group of real-time traffic information.
S203: input the sample traffic information into the DQN, and obtain the current route of every sample flow according to the preset priority order of the sample flows.

The process of presetting the priority order of the sample flows can refer to the above process of determining the priority of every flow in the network. After the priority order of the sample flows is determined, the current route of each sample flow can be determined in that order during each round of training, which speeds up the training.
In one implementation, inputting the sample traffic information into the DQN in step S203 and obtaining the current route of every sample flow according to the preset priority order of the sample flows may include the following steps:

S21: form the initial state information from the sample traffic information, the initial value of the link load vector, and priority 0. The link load vector is a vector composed of the link load values of all links in the preset training environment, where the link load value of any link is the sum of the link bandwidths occupied by the sample flows passing through that link.
In the present embodiment, one learning process over one group of sample traffic information is called an episode. In each episode, the DQN starts from the initial state, executes an action, and then performs a series of state transitions until the terminal state, as shown in Fig. 3. In each episode, each input of one group of state information causes the DQN to output one action, and that output action represents determining a current route for one sample flow. An episode ends after the DQN outputs its last action; the full set of actions output in the episode means that the current routes of all sample flows have been determined.

In each episode, the state information input at each step consists of three parts: (1) the sample traffic information, i.e., the link bandwidth occupied by each sample flow, denoted f_{1,2}, f_{1,3}, f_{1,4}, …, f_{N,N−1}; (2) the link load vector, denoted (l_1, l_2, l_3, …, l_m); and (3) the priority value, which determines the order of the sample flows. The sample traffic information does not change across the series of state transitions within an episode, because the routing decisions made during the episode's learning are made for that one group of sample traffic information. The link load vector describes the load on each link in the current state and changes continuously as states transition: after each state transition, the change in the link load vector is determined by the previous state's link load vector and the action output in the previous state. The priority indicates the order in which routing decisions are made for the sample flows and also determines the order of the states.
In the present embodiment, the priority in the initial state information is set to 0; thereafter, each time an action is executed, the priority in the new state after the state transition increases by 1.

In the initial state information, since no route has yet been determined for any sample flow, the link load vector is the zero vector.
S22: input the initial state information into the DQN, and output the current route of the sample flow whose priority is 0.

After the initial state information is input into the DQN, the DQN outputs, based on its current parameters, the current route of the sample flow with priority 0 (sample flow 0 for short). Specifically, the DQN can select one of a plurality of alternate routes determined in advance for sample flow 0 as the current route of sample flow 0. The manner of determining the alternate routes of sample flow 0 can refer to the aforementioned manner of determining the alternate routes of every flow in the network.
S23: update the current link load vector according to the initial state information and the current route of the sample flow with priority 0, and increase the priority by 1.

In the process shown in Fig. 3, the action output after each input of state information affects the link load vector in the next state information. This is because once a route r_{i,j} has been determined for a flow f_{i,j}, the load of the links this flow passes through changes. Therefore, the load added to each link that sample flow 0 passes through can first be calculated from the current route of sample flow 0 and the link bandwidth occupied by sample flow 0 in the initial state information; the added load is then summed with the link load values, in the initial state information's link load vector, of the links that sample flow 0 passes through, yielding the updated link load vector. The updated link load vector serves as the link load vector in the next state information.
S24: for s = 1, …, N−1, execute the following steps a1–a3 cyclically in ascending order of s, outputting the current routes of the sample flows with priorities 1 to N−1, where N denotes the number of sample flows:

a1: form the s-th state information from the sample traffic information, the updated link load vector, and the current priority. This step can refer to step S21.

a2: input the s-th state information into the DQN, and output the current route of the sample flow with priority s. This step can refer to step S22.

a3: update the current link load vector according to the s-th state information and the current route of the sample flow with priority s, and increase the priority by 1. This step can refer to step S23.

By executing steps a1–a3 cyclically, the current routes of the sample flows with priorities 1 to N−1 are output in turn. When the current route of the last sample flow, sample flow N−1, has been output, the current routes of all sample flows have been determined. The current routes of all sample flows can then be compared with the optimal routes of all sample flows, in order to optimize the parameters of the DQN.
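The episode loop of steps S21–S24 can be sketched as follows. Here a plain `score(loads, route)` callback stands in for the DQN's Q values, and greedy selection stands in for its action output; all names and the data layout are illustrative assumptions.

```python
def rollout(flows, alt_routes, links, score):
    """One episode (steps S21-S24): visit sample flows in priority
    order 0..N-1; at each step pick the alternate route with the
    highest score (the DQN's action) and accumulate link load."""
    loads = {link: 0.0 for link in links}     # S21: zero link load vector
    chosen = []
    for priority, bandwidth in enumerate(flows):   # priority 0..N-1
        routes = alt_routes[priority]
        best = max(routes, key=lambda r: score(loads, r))  # S22 / a2
        for link in best:                     # S23 / a3: update loads
            loads[link] += bandwidth
        chosen.append(best)
    return chosen, loads

# toy score preferring the currently less-loaded route
chosen, loads = rollout(
    [1.0, 1.0],
    [[["a"], ["b"]], [["a"], ["b"]]],
    ["a", "b"],
    lambda loads, r: -sum(loads[l] for l in r),
)
```

With the toy score, the second flow avoids the link the first flow occupied, illustrating how each decision's load update shapes the next flow's state.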
S204: calculate the value of a preset loss function according to the current route of every sample flow and the sample routing policy.

During the training of the DQN, a loss function can be preset. The loss function measures the gap between the current route of every sample flow and the sample route of every sample flow.
In one implementation, on the basis of the implementation of step S203 (i.e., steps S21–S24), calculating the value of the preset loss function in step S204 according to the current route of every sample flow and the sample routing policy may include the following steps:

S31: calculate the target link load vector according to the (N−1)-th state information and the current route of the sample flow with priority N−1; the target link load vector includes the real-time link load value of every link in the preset training environment corresponding to the sample traffic information.

After the current route of the sample flow with priority N−1 is determined, the current routes of all sample flows have been determined. Therefore, the real-time link load value of every link in the training environment can be calculated to form the target link load vector, from which the load-balancing state of the training environment can then be evaluated. The manner of calculating the target link load vector can refer to step S23.
S32: calculate the reward function value MLV corresponding to the sample traffic information according to the target link load vector.

Using the calculated target link load vector, the load-balancing state of the training environment can be evaluated, so that the routing policy of the training environment can be further optimized. Evaluating the load-balancing state of the training environment amounts to evaluating the learning outcome of one episode. The load-balancing state of the training environment refers to the load situation on each link in the training environment.
Since the purpose of the present invention is to reduce, as far as possible, both the occurrence probability of network congestion and the degree of congestion, two requirements must be made explicit: (1) when the link load value l_k of any link in the training environment is lower than the link's maximum available bandwidth threshold t_k, l_k should stay as far below t_k as possible; and (2) when l_k exceeds t_k, l_k should be as close to t_k as possible. To realize both requirements, the relationship between the link load value and the bandwidth threshold of each link in the training environment must first be described quantitatively. A maximum loading value (MLV) of the training environment is defined here, whose expression is:

MLV = min((t_1 − l_1), (t_2 − l_2), (t_3 − l_3), …, (t_m − l_m))

where l_1, l_2, l_3, …, l_m respectively denote the real-time link load values of links 1, 2, 3, …, m, and t_1, t_2, t_3, …, t_m respectively denote the bandwidth thresholds of links 1, 2, 3, …, m.
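The MLV expression is a one-line minimum over per-link margins; a minimal sketch (function name illustrative):

```python
def mlv(loads, thresholds):
    """Maximum loading value: the smallest margin between any link's
    bandwidth threshold t_k and its real-time load l_k. Positive means
    no link is congested; the more positive, the more balanced."""
    return min(t - l for l, t in zip(loads, thresholds))

mlv([2, 5], [10, 6])  # margins 8 and 1 -> worst link decides
mlv([8], [6])         # load exceeds threshold -> negative (congested)
```

Because only the worst link's margin matters, any policy that piles load onto one link drags MLV down even if the average load looks fine, which is exactly the behavior the next paragraphs motivate.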
MLV represents the difference between the bandwidth threshold and the real-time link load value of the most heavily loaded link in the training environment. When MLV is positive, the real-time link load value of every link in the training environment is below its bandwidth threshold and there is no congestion in the training environment; the larger the value of MLV, the more balanced the load in the training environment is considered to be. When MLV is negative, the real-time link load value of at least one link in the training environment has exceeded its bandwidth threshold and congestion has occurred; the smaller the value of MLV, the more severe the congestion in the training environment.

Based on what MLV can express, MLV can be used as the reward function in the training of the DQN, and this reward function evaluates the load-balancing state of the training environment; that is, the evaluation is expressed by a reward function value. A positive reward function value rewards the actions the DQN output in the episode, and a negative reward function value punishes them. When, through learning over many episodes, the DQN has gradually learned how to output actions that obtain larger reward function values, training is complete. Based on the trained DQN, an optimal routing policy can be provided for the real-time traffic information of the network.
It will be appreciated that, in the expression for MLV, the extremum rather than the mean is chosen to describe the congestion situation of the training environment. This is because network congestion is often caused by uneven network load, so any routing policy that may cause network load to become excessively concentrated needs to be punished. By measuring the worst-loaded link in the network, the quality of a routing policy can be judged easily, whereas using the average value would make it difficult to distinguish good routing policies from bad ones.

In the present embodiment, the reward function value MLV corresponding to the sample traffic information can be calculated from the target link load vector by the above MLV formula, and the learning outcome of the episode can then be further evaluated according to the reward function value MLV.
S33: calculate the value of the preset loss function according to the reward function value and the sample routing policy.

In the present embodiment, the value of the preset loss function can be calculated from the reward function value and the sample routing policy by the following formula:

L(θ) = E[(MLV + γ·max_{a′} Q(s′, a′ | θ) − Q(s, a | θ))²]

where L(θ) denotes the loss function; MLV denotes the reward function value; γ denotes the discount factor, 0 ≤ γ ≤ 1; θ denotes the current network parameters of the DQN; Q(s, a | θ) denotes the cumulative reward obtained after the initial state information s is input into the DQN and the current route of every sample flow is output; a denotes the current route of every sample flow; and max_{a′} Q(s′, a′ | θ) denotes the optimal cumulative reward determined according to the sample routing policy.
Specifically, s′ denotes the next state transitioned to after action a is executed; max_{a′} Q(s′, a′ | θ) denotes the maximum of the cumulative rewards corresponding to all alternate routes of the current sample flow corresponding to state s′; and a′ denotes all alternate routes of the current sample flow corresponding to state s′.

The cumulative reward Q(s, a | θ) can be calculated from the reward function value MLV. The method of calculating the cumulative reward Q(s, a | θ) from the reward function value MLV, and the method of determining the optimal cumulative reward according to the sample routing policy, belong to the prior art, and the present invention does not repeat them here.
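The squared temporal-difference loss above can be sketched for a batch of transitions. This is a simplified single-reward version in which MLV plays the role of the immediate reward; the function name and batch layout are assumptions.

```python
import numpy as np

def dqn_loss(mlv_reward, gamma, q_next_max, q_sa):
    """Sketch of L = E[(MLV + gamma * max_a' Q(s',a') - Q(s,a))^2],
    averaged over a batch of transitions.

    mlv_reward: scalar MLV reward of the episode.
    q_next_max: per-transition max_a' Q(s', a' | theta).
    q_sa: per-transition Q(s, a | theta) of the taken actions.
    """
    target = mlv_reward + gamma * np.asarray(q_next_max, dtype=float)
    td_error = target - np.asarray(q_sa, dtype=float)
    return float(np.mean(td_error ** 2))

dqn_loss(1.0, 0.5, [2.0], [1.5])  # target 2.0, TD error 0.5
```

Minimizing this quantity by backpropagation and gradient descent, as step S205 describes, pushes Q(s, a | θ) toward the MLV-bootstrapped target.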
S205: when the calculated value of the loss function is not less than a first preset value, adjust the network parameters of the DQN, and return to the step of inputting the sample traffic information into the DQN and obtaining the current route of every sample flow according to the preset priority order of the sample flows.

When the calculated value of the loss function is not less than the first preset value, the training of the DQN has not yet reached the expected effect, so the network parameters of the DQN can be adjusted and the process returns to step S203. Specifically, backpropagation and gradient descent can be used to adjust the network parameters of the DQN.

Of course, the first preset value can be set according to actual needs.
S206: when the calculated value of the loss function is lower than the first preset value, end the training and obtain the trained DQN.

When the calculated value of the loss function is lower than the first preset value, the training of the DQN has reached the expected effect and the training can end, yielding the trained DQN. That is, based on the network parameters of the trained DQN, the optimized route of each flow in the network can be output.
In addition, in the present embodiment, on the basis of the embodiment shown in Fig. 1, the method may further include the following step:

S104 (not shown in the figure): update the local flow table according to the route of every flow, and send the updated flow table to each OpenFlow switch, so that each OpenFlow switch performs corresponding operations on the data in the network according to the updated flow table.

The SDN controller can formulate a corresponding routing policy and forwarding mechanism according to the collected network traffic information. Thus, after the route of every flow is determined, the SDN controller can directly update the local flow table according to the route of every flow. The updated flow table can then be sent to each OpenFlow switch in the data plane of the SDN architecture, so that each OpenFlow switch performs corresponding operations, such as forwarding data packets in the network, according to the updated flow table. In this way, each OpenFlow switch can transmit data according to the network's optimal routing policy.
The above step of updating the local flow table according to the route of every flow can be implemented by the prior art, and the present invention does not repeat it here.
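As a rough illustration of step S104, a controller-side flow table can be derived from the chosen routes by recording, for each flow, the next hop at every node along its path. This is a hypothetical sketch, not the OpenFlow wire protocol; node and flow identifiers are illustrative.

```python
def build_flow_table(routes):
    """Hypothetical local flow table update of step S104: map each
    flow (src, dst) to its per-hop forwarding rules (node, next_node)
    along the route the DQN selected."""
    table = {}
    for (src, dst), path in routes.items():
        # at node path[k], packets of this flow are forwarded to path[k+1]
        table[(src, dst)] = [(path[k], path[k + 1])
                             for k in range(len(path) - 1)]
    return table

# flow from node 1 to node 3 routed via node 2
build_flow_table({(1, 3): [1, 2, 3]})
```

Each per-hop rule would then be translated into a flow entry pushed to the corresponding OpenFlow switch, so the data plane forwards along the DQN-selected route.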
Corresponding to the above method embodiment, an embodiment of the present invention provides a routing decision apparatus based on deep reinforcement learning under an SDN architecture, applied to an SDN controller. As shown in Fig. 4, the apparatus may include:

a first obtaining module 401, configured to obtain the real-time traffic information in the network, wherein the real-time traffic information includes the link bandwidth occupied by every flow in the network;

a first determining module 402, configured to determine the priority of every flow;

a second determining module 403, configured to input the real-time traffic information into a pre-trained deep Q-network (DQN) and determine the route of every flow in turn according to the priority order of the flows;

wherein the DQN is trained from the sample traffic information and the sample routing policy corresponding to the sample traffic information; the sample traffic information includes the link bandwidth occupied by every sample flow, and the sample routing policy includes the sample route of every sample flow corresponding to the sample traffic information.
In the scheme provided by this embodiment of the present invention, the DQN is trained in advance from the sample traffic information and the sample routing policy corresponding to the sample traffic information. Then, when determining the route of every flow in the network, after the real-time traffic information of the network is obtained, it is input into the trained DQN, so that the DQN determines the route of every flow in turn according to each flow's priority. Since the embodiment of the present invention determines routes based on a pre-trained DQN, and the DQN can be trained on sample data of a network with the topology to be analyzed, the embodiment of the present invention can reduce the occurrence of network congestion in networks of various topologies and optimize the routing policy in network environments where the traffic is highly dynamic.
Further, on the basis of the embodiment shown in Fig. 4, as shown in Fig. 5, the routing decision apparatus based on deep reinforcement learning under an SDN architecture provided by the embodiment of the present invention may further include:

a construction module 501, configured to construct an initial DQN;

a second obtaining module 502, configured to obtain the sample traffic information and the sample routing policy corresponding to the sample traffic information;

a third determining module 503, configured to input the sample traffic information into the DQN and obtain the current route of every sample flow according to the preset priority order of the sample flows;

a calculation module 504, configured to calculate the value of the preset loss function according to the current route of every sample flow and the sample routing policy;

a first processing module 505, configured to adjust the network parameters of the DQN and trigger the third determining module 503 when the calculated value of the loss function is not less than a first preset value;

a second processing module 506, configured to end the training and obtain the trained DQN when the calculated value of the loss function is lower than the first preset value.
Optionally, the third determining module 503 may include:

a construction unit, configured to form the initial state information from the sample traffic information, the initial value of the link load vector, and priority 0, wherein the link load vector is a vector composed of the link load values of all links in the preset training environment, and the link load value of any link is the sum of the link bandwidths occupied by the sample flows passing through that link;

a first output unit, configured to input the initial state information into the DQN and output the current route of the sample flow with priority 0;

an updating unit, configured to update the current link load vector according to the initial state information and the current route of the sample flow with priority 0, and increase the priority by 1;

a second output unit, configured to, for s = 1, …, N−1, execute the following steps a1–a3 cyclically in ascending order of s and output the current routes of the sample flows with priorities 1 to N−1, where N denotes the number of sample flows:

a1: form the s-th state information from the sample traffic information, the updated link load vector, and the current priority;

a2: input the s-th state information into the DQN, and output the current route of the sample flow with priority s;

a3: update the current link load vector according to the s-th state information and the current route of the sample flow with priority s, and increase the priority by 1.
Optionally, the calculation module 504 may include:

a first calculation unit, configured to calculate the target link load vector according to the (N−1)-th state information and the current route of the sample flow with priority N−1, wherein the target link load vector includes the real-time link load value of every link in the preset training environment corresponding to the sample traffic information;

a second calculation unit, configured to calculate the reward function value MLV corresponding to the sample traffic information according to the target link load vector;

a third calculation unit, configured to calculate the value of the preset loss function according to the reward function value and the sample routing policy.
Optionally, the second calculating unit is specifically configured to calculate the reward function value MLV corresponding to the sample flow information according to the target link load vector by the following formula:
MLV = min((t_1 - l_1), (t_2 - l_2), (t_3 - l_3), ..., (t_m - l_m))
where l_1, l_2, l_3, ..., l_m respectively denote the real-time link load values of links 1, 2, 3, ..., m, and t_1, t_2, t_3, ..., t_m respectively denote the bandwidth thresholds of links 1, 2, 3, ..., m;
the third calculating unit is specifically configured to calculate the value of the pre-set loss function according to the reward function value and the sample routing policy by the following formula:
L(θ) = E[(MLV + γ·max_{a'} Q(s', a'|θ) - Q(s, a|θ))^2]
where L(θ) denotes the loss function, MLV denotes the reward function value, γ denotes the discount factor with 0 ≤ γ ≤ 1, θ denotes the current network parameters of the DQN, Q(s, a|θ) denotes the cumulative reward obtained after the initial state information s is input into the DQN and the current route of every sample flow is output, a denotes the current routes of the sample flows, and max_{a'} Q(s', a'|θ) denotes the optimal cumulative reward determined according to the sample routing policy.
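The reward and loss computed by these units can be sketched numerically. In the fragment below, `mlv_reward` implements MLV = min(t_k - l_k) over the links, and `dqn_loss` implements the squared temporal-difference term inside the expectation of L(θ); both function names and the scalar-valued interface are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

def mlv_reward(link_loads, thresholds):
    """MLV = min over links of (bandwidth threshold - real-time load):
    large when every link has headroom, small once some link nears
    its threshold, which steers training toward load balancing."""
    return float(np.min(np.asarray(thresholds) - np.asarray(link_loads)))

def dqn_loss(mlv, gamma, q_sa, q_next_max):
    """One sample of the patent's loss L(theta):
    (MLV + gamma * max_a' Q(s',a'|theta) - Q(s,a|theta))^2,
    i.e. a squared TD error with MLV as the immediate reward."""
    td_target = mlv + gamma * q_next_max
    return (td_target - q_sa) ** 2
```

Because MLV takes the minimum over all links, improving only lightly loaded links does not raise the reward; the bottleneck link must be relieved.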
Optionally, the first determining module 402 may include:
a first determining unit, configured to determine, for every flow f_{i,j}, x alternate routes of the flow; where i denotes the source node of the flow f_{i,j} and j denotes the destination node of the flow f_{i,j};
a fourth calculating unit, configured to calculate the evaluation value EV_r of the r-th alternate route of the flow f_{i,j} as follows: let l_1, l_2, l_3, ..., L denote the links passed through by the r-th alternate route, and count, among the alternate routes of all other flows in the network except the flow f_{i,j}, the total number of times each of the links l_1, l_2, l_3, ..., L is passed; EV_r is the maximum of these totals;
a fifth calculating unit, configured to calculate the priority reference value P_{i,j} of the flow f_{i,j} by the following formula:
P_{i,j} = max(E) - max(E \ {max(E)})
where E denotes the set composed of the evaluation values of all alternate routes of the flow f_{i,j}, E = {EV_1, EV_2, EV_3, ..., EV_x}, max(E) denotes the maximum value in the set E, E \ {max(E)} denotes the new set formed by removing the maximum value max(E) from the set E, and max(E \ {max(E)}) denotes the maximum value in the new set E \ {max(E)};
a second determining unit, configured to determine the priority of every flow according to the order of the priority reference values of the flows; where the priority of the flow with the highest priority reference value is 0, and the priority of the flow with the lowest priority reference value is N-1.
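As a sketch of these units under assumed data structures (a per-link usage count and candidate routes given as lists of link ids; the function names are illustrative, not from the patent):

```python
def route_eval(route_links, link_use_counts):
    """EV_r: contention of the r-th alternate route, measured as the
    highest usage count among its links, where usage counts how often a
    link appears in the alternate routes of the other flows."""
    return max(link_use_counts[l] for l in route_links)

def priority_ref(candidate_routes, link_use_counts):
    """P_{i,j} = max(E) - max(E \\ {max(E)}): the gap between the two
    most contended alternatives of the flow. A large gap means the flow
    has one clearly worse option, so ordering by P routes such flows
    while their good option is still available."""
    evs = sorted((route_eval(r, link_use_counts) for r in candidate_routes),
                 reverse=True)
    return evs[0] - evs[1]
```

The flow with the largest P_{i,j} receives priority 0 and is routed first.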
Optionally, the alternate routes of every flow satisfy the following conditions:
any alternate route of every flow is acyclic;
the path traversed by any alternate route of every flow is not identical to the path traversed by any other alternate route of that flow;
the length of any alternate route of every flow is within a preset value.
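The three conditions can be checked mechanically. The following is a minimal sketch, assuming a route is an ordered list of node ids and the preset distance is a maximum hop count (both representational assumptions):

```python
def valid_alternate(route, other_routes, max_hops):
    """Check a candidate route against the three conditions:
    (1) loop-free: no node appears twice;
    (2) not identical to any already-kept alternate route;
    (3) hop count within the preset bound."""
    if len(set(route)) != len(route):
        return False                      # repeated node => contains a cycle
    if any(route == other for other in other_routes):
        return False                      # duplicates an existing alternate
    return len(route) - 1 <= max_hops     # hops = nodes - 1
```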
In addition, an embodiment of the present invention further provides an SDN controller, as shown in Fig. 6, including a processor 601, a communication interface 602, a memory 603 and a communication bus 604, where the processor 601, the communication interface 602 and the memory 603 communicate with each other through the communication bus 604;
the memory 603 is configured to store a computer program;
the processor 601 is configured to, when executing the program stored on the memory 603, implement the routing decision method based on deep reinforcement learning under the SDN framework of any of the above embodiments.
The communication bus of the above SDN controller may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. Only one thick line is shown in the figure for ease of representation, which does not mean that there is only one bus or only one type of bus.
The communication interface is used for communication between the above electronic device and other devices.
The memory may include a random access memory (RAM), and may also include a non-volatile memory, for example, at least one magnetic disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The above processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
In another embodiment of the present invention, a computer-readable storage medium is further provided. The computer-readable storage medium stores instructions which, when run on a computer, cause the computer to execute the routing decision method based on deep reinforcement learning under the SDN framework of any of the above embodiments.
The above embodiments may be implemented wholly or partly by software, hardware, firmware or any combination thereof. When implemented in software, they may be realized wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present invention are produced wholly or partly. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device such as a server or data center integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)), etc.
It should be noted that, in this document, relational terms such as first and second are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device including a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. In the absence of further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device including that element.
The embodiments in this specification are described in a related manner; identical or similar parts among the embodiments may refer to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the device / SDN controller / storage medium embodiments are described relatively simply since they are substantially similar to the method embodiments; for related details, refer to the corresponding parts of the description of the method embodiments.
The above are merely preferred embodiments of the present invention and are not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present invention shall be included in the scope of protection of the present invention.
Claims (10)
1. A routing decision method based on deep reinforcement learning under an SDN framework, characterized in that the method is applied to a software-defined network (SDN) controller and includes:
obtaining real-time traffic information in a network; where the real-time traffic information includes: the link bandwidth occupied by every flow in the network;
determining a priority of every flow;
inputting the real-time traffic information into a pre-trained deep Q network (DQN), and determining the route of every flow in turn in order of the priorities of the flows;
where the DQN is obtained by training according to sample flow information and a sample routing policy corresponding to the sample flow information; the sample flow information includes the link bandwidth occupied by every sample flow, and the sample routing policy includes the sample route of every sample flow corresponding to the sample flow information.
2. The method according to claim 1, characterized in that the training process of the DQN includes:
constructing an initial DQN;
obtaining sample flow information and a sample routing policy corresponding to the sample flow information;
inputting the sample flow information into the DQN, and obtaining the current route of every sample flow according to the preset priority order of the sample flows;
calculating the value of a pre-set loss function according to the current route of every sample flow and the sample routing policy;
when the calculated value of the loss function is not lower than a first preset value, adjusting the network parameters of the DQN, and returning to the step of inputting the sample flow information into the DQN and obtaining the current route of every sample flow according to the preset priority order of the sample flows;
when the calculated value of the loss function is lower than the first preset value, ending the training to obtain a trained DQN.
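The train-until-threshold loop of claim 2 can be sketched as follows. This is a hypothetical skeleton under assumed interfaces: `compute_loss` stands for the full forward pass (routing every sample flow, then evaluating L(θ)), and `adjust_params` for whatever parameter update is used; neither name comes from the patent.

```python
def train_dqn(dqn, compute_loss, adjust_params, threshold, max_iters=1000):
    """Repeat: route all sample flows, compute the loss, and adjust the
    DQN's parameters until the loss drops below the first preset value
    (or a safety cap on iterations is hit)."""
    for _ in range(max_iters):
        loss = compute_loss(dqn)      # forward pass + L(theta)
        if loss < threshold:
            return dqn                # loss below first preset value: done
        adjust_params(dqn)            # e.g. one gradient step on theta
    return dqn                        # cap reached without converging
```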
3. The method according to claim 2, characterized in that the inputting the sample flow information into the DQN and obtaining the current route of every sample flow according to the preset priority order of the sample flows includes:
constructing initial state information from the sample flow information, the initial value of a link load vector and priority 0; where the link load vector is a vector composed of the link load values of the links in a preset training environment, and the link load value of any link is the sum of the link bandwidths occupied by the sample flows passing through that link;
inputting the initial state information into the DQN, and outputting the current route of the sample flow whose priority is 0;
updating the current link load vector according to the initial state information and the current route of the sample flow whose priority is 0, and increasing the priority by 1;
setting s = 1, ..., N-1, and cyclically executing the following steps a1-a3 in ascending order of s to output the current routes of the sample flows whose priorities are 1 to N-1, where N denotes the number of sample flows:
a1: constructing the s-th state information from the sample flow information, the updated link load vector and the current priority;
a2: inputting the s-th state information into the DQN, and outputting the current route of the sample flow whose priority is s;
a3: updating the current link load vector according to the s-th state information and the current route of the sample flow whose priority is s, and increasing the priority by 1.
4. The method according to claim 3, characterized in that the calculating the value of a pre-set loss function according to the current route of every sample flow and the sample routing policy includes:
calculating a target link load vector according to the (N-1)-th state information and the current route of the sample flow whose priority is N-1; where the target link load vector includes the real-time link load value of each link in the preset training environment corresponding to the sample flow information;
calculating a reward function value MLV corresponding to the sample flow information according to the target link load vector;
calculating the value of the pre-set loss function according to the reward function value and the sample routing policy.
5. The method according to claim 4, characterized in that:
the reward function value MLV corresponding to the sample flow information is calculated according to the target link load vector by the following formula:
MLV = min((t_1 - l_1), (t_2 - l_2), (t_3 - l_3), ..., (t_m - l_m))
where l_1, l_2, l_3, ..., l_m respectively denote the real-time link load values of links 1, 2, 3, ..., m, and t_1, t_2, t_3, ..., t_m respectively denote the bandwidth thresholds of links 1, 2, 3, ..., m;
the value of the pre-set loss function is calculated according to the reward function value and the sample routing policy by the following formula:
L(θ) = E[(MLV + γ·max_{a'} Q(s', a'|θ) - Q(s, a|θ))^2]
where L(θ) denotes the loss function, MLV denotes the reward function value, γ denotes the discount factor with 0 ≤ γ ≤ 1, θ denotes the current network parameters of the DQN, Q(s, a|θ) denotes the cumulative reward obtained after the initial state information s is input into the DQN and the current route of every sample flow is output, a denotes the current routes of the sample flows, and max_{a'} Q(s', a'|θ) denotes the optimal cumulative reward determined according to the sample routing policy.
6. The method according to claim 1, characterized in that the determining a priority of every flow includes:
for every flow f_{i,j}, determining x alternate routes of the flow; where i denotes the source node of the flow f_{i,j} and j denotes the destination node of the flow f_{i,j};
calculating the evaluation value EV_r of the r-th alternate route of the flow f_{i,j} as follows: let l_1, l_2, l_3, ..., L denote the links passed through by the r-th alternate route, and count, among the alternate routes of all other flows in the network except the flow f_{i,j}, the total number of times each of the links l_1, l_2, l_3, ..., L is passed; EV_r is the maximum of these totals;
calculating the priority reference value P_{i,j} of the flow f_{i,j} by the following formula:
P_{i,j} = max(E) - max(E \ {max(E)})
where E denotes the set composed of the evaluation values of all alternate routes of the flow f_{i,j}, E = {EV_1, EV_2, EV_3, ..., EV_x}, max(E) denotes the maximum value in the set E, E \ {max(E)} denotes the new set formed by removing the maximum value max(E) from the set E, and max(E \ {max(E)}) denotes the maximum value in the new set E \ {max(E)};
determining the priority of every flow according to the order of the priority reference values of the flows; where the priority of the flow with the highest priority reference value is 0, and the priority of the flow with the lowest priority reference value is N-1.
7. The method according to claim 6, characterized in that the alternate routes of every flow satisfy the following conditions:
any alternate route of every flow is acyclic;
the path traversed by any alternate route of every flow is not identical to the path traversed by any other alternate route of that flow;
the length of any alternate route of every flow is within a preset value.
8. A routing decision device based on deep reinforcement learning under an SDN framework, characterized in that the device is applied to an SDN controller and includes:
a first obtaining module, configured to obtain real-time traffic information in a network; where the real-time traffic information includes: the link bandwidth occupied by every flow in the network;
a first determining module, configured to determine a priority of every flow;
a second determining module, configured to input the real-time traffic information into a pre-trained deep Q network (DQN) and determine the route of every flow in turn in order of the priorities of the flows;
where the DQN is obtained by training according to sample flow information and a sample routing policy corresponding to the sample flow information; the sample flow information includes the link bandwidth occupied by every sample flow, and the sample routing policy includes the route of every sample flow corresponding to the sample flow information.
9. The device according to claim 8, characterized in that the device further includes:
a constructing module, configured to construct an initial DQN;
a second obtaining module, configured to obtain sample flow information and a sample routing policy corresponding to the sample flow information;
a third determining module, configured to input the sample flow information into the DQN and obtain the current route of every sample flow according to the preset priority order of the sample flows;
a calculating module, configured to calculate the value of a pre-set loss function according to the current route of every sample flow and the sample routing policy;
a first processing module, configured to adjust the network parameters of the DQN and trigger the third determining module when the calculated value of the loss function is not lower than a first preset value;
a second processing module, configured to end the training and obtain a trained DQN when the calculated value of the loss function is lower than the first preset value.
10. An SDN controller, characterized by including a processor, a communication interface, a memory and a communication bus, where the processor, the communication interface and the memory communicate with each other through the communication bus;
the memory is configured to store a computer program;
the processor is configured to, when executing the program stored on the memory, implement the method steps of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810945527.XA CN108900419B (en) | 2018-08-17 | 2018-08-17 | Routing decision method and device based on deep reinforcement learning under SDN framework |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108900419A true CN108900419A (en) | 2018-11-27 |
CN108900419B CN108900419B (en) | 2020-04-17 |
Family
ID=64354702
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810945527.XA Active CN108900419B (en) | 2018-08-17 | 2018-08-17 | Routing decision method and device based on deep reinforcement learning under SDN framework |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108900419B (en) |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109379747A (en) * | 2018-12-04 | 2019-02-22 | 北京邮电大学 | The deployment of wireless network multi-controller and resource allocation methods and device |
CN109547340A (en) * | 2018-12-28 | 2019-03-29 | 西安电子科技大学 | SDN data center network jamming control method based on heavy-route |
CN109614215A (en) * | 2019-01-25 | 2019-04-12 | 广州大学 | Stream scheduling method, device, equipment and medium based on deeply study |
CN109768940A (en) * | 2018-12-12 | 2019-05-17 | 北京邮电大学 | The flow allocation method and device of multi-service SDN network |
CN110247795A (en) * | 2019-05-30 | 2019-09-17 | 北京邮电大学 | A kind of cloud net resource service chain method of combination and system based on intention |
CN110324260A (en) * | 2019-06-21 | 2019-10-11 | 北京邮电大学 | A kind of network function virtualization intelligent dispatching method based on flow identification |
CN110535770A (en) * | 2019-08-30 | 2019-12-03 | 西安邮电大学 | A kind of video flowing method for intelligently routing based on QoS perception under SDN environment |
CN110995858A (en) * | 2019-12-17 | 2020-04-10 | 大连理工大学 | Edge network request scheduling decision method based on deep Q network |
CN111010294A (en) * | 2019-11-28 | 2020-04-14 | 国网甘肃省电力公司电力科学研究院 | Electric power communication network routing method based on deep reinforcement learning |
CN111200566A (en) * | 2019-12-17 | 2020-05-26 | 北京邮电大学 | Network service flow information grooming method and electronic equipment |
CN111314171A (en) * | 2020-01-17 | 2020-06-19 | 深圳供电局有限公司 | Method, device and medium for predicting and optimizing SDN routing performance |
WO2020134507A1 (en) * | 2018-12-28 | 2020-07-02 | 北京邮电大学 | Routing construction method for unmanned aerial vehicle network, unmanned aerial vehicle, and storage medium |
CN111526055A (en) * | 2020-04-23 | 2020-08-11 | 北京邮电大学 | Route planning method and device and electronic equipment |
CN111585915A (en) * | 2020-03-30 | 2020-08-25 | 西安电子科技大学 | Long and short flow balanced transmission method and system, storage medium and cloud server |
CN111917657A (en) * | 2020-07-02 | 2020-11-10 | 北京邮电大学 | Method and device for determining flow transmission strategy |
CN111988220A (en) * | 2020-08-14 | 2020-11-24 | 山东大学 | Multi-target disaster backup method and system among data centers based on reinforcement learning |
CN112039767A (en) * | 2020-08-11 | 2020-12-04 | 山东大学 | Multi-data center energy-saving routing method and system based on reinforcement learning |
CN113347108A (en) * | 2021-05-20 | 2021-09-03 | 中国电子科技集团公司第七研究所 | SDN load balancing method and system based on Q-learning |
CN113489654A (en) * | 2021-07-06 | 2021-10-08 | 国网信息通信产业集团有限公司 | Routing method, routing device, electronic equipment and storage medium |
CN113923758A (en) * | 2021-10-15 | 2022-01-11 | 广州电力通信网络有限公司 | POP point selection access method in SD-WAN network |
CN113992595A (en) * | 2021-11-15 | 2022-01-28 | 浙江工商大学 | SDN data center congestion control method based on prior experience DQN playback |
CN114039927A (en) * | 2021-11-04 | 2022-02-11 | 国网江苏省电力有限公司苏州供电分公司 | Control method for routing flow of power information network |
US11606265B2 (en) | 2021-01-29 | 2023-03-14 | World Wide Technology Holding Co., LLC | Network control in artificial intelligence-defined networking |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022014916A1 (en) * | 2020-07-15 | 2022-01-20 | 한양대학교 에리카산학협력단 | Apparatus for determining packet transmission, and method for determining packet transmission schedule |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1518291A (en) * | 2003-01-13 | 2004-08-04 | 曹伟龙 | Local network capable grading control electric device in communication mode |
US20050055618A1 (en) * | 2003-09-04 | 2005-03-10 | Thomas Finteis | Test arrangement and method for selecting a test mode output channel |
US9225635B2 (en) * | 2012-04-10 | 2015-12-29 | International Business Machines Corporation | Switch routing table utilizing software defined network (SDN) controller programmed route segregation and prioritization |
CN106559407A (en) * | 2015-11-19 | 2017-04-05 | 国网智能电网研究院 | A kind of Network traffic anomaly monitor system based on SDN |
CN106779072A (en) * | 2016-12-23 | 2017-05-31 | 深圳市唯特视科技有限公司 | A kind of enhancing based on bootstrapping DQN learns deep search method |
CN107911299A (en) * | 2017-10-24 | 2018-04-13 | 浙江工商大学 | A kind of route planning method based on depth Q study |
CN108011827A (en) * | 2016-10-28 | 2018-05-08 | 中国电信股份有限公司 | A kind of data forwarding method based on SDN, system and controller |
CN108075974A (en) * | 2016-11-14 | 2018-05-25 | 中国移动通信有限公司研究院 | A kind of flow transmission control method, device and SDN architecture systems |
CN108307435A (en) * | 2018-01-29 | 2018-07-20 | 大连大学 | A kind of multitask route selection method based on SDSIN |
CN108390833A (en) * | 2018-02-11 | 2018-08-10 | 北京邮电大学 | A kind of software defined network transmission control method based on virtual Domain |
CN108401015A (en) * | 2018-02-02 | 2018-08-14 | 广州大学 | A kind of data center network method for routing based on deeply study |
- 2018-08-17 CN CN201810945527.XA patent/CN108900419B/en active Active
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1518291A (en) * | 2003-01-13 | 2004-08-04 | 曹伟龙 | Local network capable grading control electric device in communication mode |
US20050055618A1 (en) * | 2003-09-04 | 2005-03-10 | Thomas Finteis | Test arrangement and method for selecting a test mode output channel |
US9225635B2 (en) * | 2012-04-10 | 2015-12-29 | International Business Machines Corporation | Switch routing table utilizing software defined network (SDN) controller programmed route segregation and prioritization |
CN106559407A (en) * | 2015-11-19 | 2017-04-05 | 国网智能电网研究院 | A kind of Network traffic anomaly monitor system based on SDN |
CN108011827A (en) * | 2016-10-28 | 2018-05-08 | 中国电信股份有限公司 | A kind of data forwarding method based on SDN, system and controller |
CN108075974A (en) * | 2016-11-14 | 2018-05-25 | 中国移动通信有限公司研究院 | A kind of flow transmission control method, device and SDN architecture systems |
CN106779072A (en) * | 2016-12-23 | 2017-05-31 | 深圳市唯特视科技有限公司 | A kind of enhancing based on bootstrapping DQN learns deep search method |
CN107911299A (en) * | 2017-10-24 | 2018-04-13 | 浙江工商大学 | A kind of route planning method based on depth Q study |
CN108307435A (en) * | 2018-01-29 | 2018-07-20 | 大连大学 | A kind of multitask route selection method based on SDSIN |
CN108401015A (en) * | 2018-02-02 | 2018-08-14 | 广州大学 | A kind of data center network method for routing based on deeply study |
CN108390833A (en) * | 2018-02-11 | 2018-08-10 | 北京邮电大学 | A kind of software defined network transmission control method based on virtual Domain |
Non-Patent Citations (1)
Title |
---|
尹弼柏等 (Yin et al.), "基于SDN拓扑集中更新的NDN路由策略" [NDN routing strategy based on centralized SDN topology updating], 《北京邮电大学学报》 [Journal of Beijing University of Posts and Telecommunications] * |
Cited By (39)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109379747A (en) * | 2018-12-04 | 2019-02-22 | 北京邮电大学 | The deployment of wireless network multi-controller and resource allocation methods and device |
CN109379747B (en) * | 2018-12-04 | 2022-04-12 | 北京邮电大学 | Wireless network multi-controller deployment and resource allocation method and device |
CN109768940B (en) * | 2018-12-12 | 2020-12-29 | 北京邮电大学 | Flow distribution method and device for multi-service SDN |
CN109768940A (en) * | 2018-12-12 | 2019-05-17 | 北京邮电大学 | The flow allocation method and device of multi-service SDN network |
CN109547340B (en) * | 2018-12-28 | 2020-05-19 | 西安电子科技大学 | SDN data center network congestion control method based on rerouting |
WO2020134507A1 (en) * | 2018-12-28 | 2020-07-02 | 北京邮电大学 | Routing construction method for unmanned aerial vehicle network, unmanned aerial vehicle, and storage medium |
CN109547340A (en) * | 2018-12-28 | 2019-03-29 | 西安电子科技大学 | SDN data center network jamming control method based on heavy-route |
US11129082B2 (en) | 2018-12-28 | 2021-09-21 | Beijing University Of Posts And Telecommunications | Method of route construction of UAV network, UAV and storage medium thereof |
CN109614215A (en) * | 2019-01-25 | 2019-04-12 | 广州大学 | Stream scheduling method, device, equipment and medium based on deeply study |
CN110247795A (en) * | 2019-05-30 | 2019-09-17 | 北京邮电大学 | A kind of cloud net resource service chain method of combination and system based on intention |
CN110324260A (en) * | 2019-06-21 | 2019-10-11 | 北京邮电大学 | A kind of network function virtualization intelligent dispatching method based on flow identification |
US11411865B2 (en) | 2019-06-21 | 2022-08-09 | Beijing University Of Posts And Telecommunications | Network resource scheduling method, apparatus, electronic device and storage medium |
CN110535770A (en) * | 2019-08-30 | 2019-12-03 | 西安邮电大学 | A kind of video flowing method for intelligently routing based on QoS perception under SDN environment |
CN110535770B (en) * | 2019-08-30 | 2021-10-22 | 西安邮电大学 | QoS-aware-based intelligent routing method for video stream in SDN environment |
CN111010294A (en) * | 2019-11-28 | 2020-04-14 | 国网甘肃省电力公司电力科学研究院 | Electric power communication network routing method based on deep reinforcement learning |
CN111010294B (en) * | 2019-11-28 | 2022-07-12 | 国网甘肃省电力公司电力科学研究院 | Electric power communication network routing method based on deep reinforcement learning |
CN111200566A (en) * | 2019-12-17 | 2020-05-26 | 北京邮电大学 | Network service flow information grooming method and electronic equipment |
CN110995858A (en) * | 2019-12-17 | 2020-04-10 | 大连理工大学 | Edge network request scheduling decision method based on deep Q network |
CN110995858B (en) * | 2019-12-17 | 2022-02-25 | 大连理工大学 | Edge network request scheduling decision method based on deep Q network |
CN111314171A (en) * | 2020-01-17 | 2020-06-19 | 深圳供电局有限公司 | Method, device and medium for predicting and optimizing SDN routing performance |
CN111585915A (en) * | 2020-03-30 | 2020-08-25 | 西安电子科技大学 | Long and short flow balanced transmission method and system, storage medium and cloud server |
CN111585915B (en) * | 2020-03-30 | 2023-04-07 | 西安电子科技大学 | Long and short flow balanced transmission method and system, storage medium and cloud server |
CN111526055A (en) * | 2020-04-23 | 2020-08-11 | 北京邮电大学 | Route planning method and device and electronic equipment |
CN111917657A (en) * | 2020-07-02 | 2020-11-10 | 北京邮电大学 | Method and device for determining flow transmission strategy |
CN112039767A (en) * | 2020-08-11 | 2020-12-04 | 山东大学 | Multi-data center energy-saving routing method and system based on reinforcement learning |
CN112039767B (en) * | 2020-08-11 | 2021-08-31 | 山东大学 | Multi-data center energy-saving routing method and system based on reinforcement learning |
CN111988220A (en) * | 2020-08-14 | 2020-11-24 | 山东大学 | Multi-target disaster backup method and system among data centers based on reinforcement learning |
CN111988220B (en) * | 2020-08-14 | 2021-05-28 | 山东大学 | Multi-target disaster backup method and system among data centers based on reinforcement learning |
US11606265B2 (en) | 2021-01-29 | 2023-03-14 | World Wide Technology Holding Co., LLC | Network control in artificial intelligence-defined networking |
CN113347108B (en) * | 2021-05-20 | 2022-08-02 | 中国电子科技集团公司第七研究所 | SDN load balancing method and system based on Q-learning |
CN113347108A (en) * | 2021-05-20 | 2021-09-03 | 中国电子科技集团公司第七研究所 | SDN load balancing method and system based on Q-learning |
CN113489654A (en) * | 2021-07-06 | 2021-10-08 | 国网信息通信产业集团有限公司 | Routing method, routing device, electronic equipment and storage medium |
CN113489654B (en) * | 2021-07-06 | 2024-01-05 | 国网信息通信产业集团有限公司 | Routing method, device, electronic equipment and storage medium |
CN113923758A (en) * | 2021-10-15 | 2022-01-11 | 广州电力通信网络有限公司 | POP point selection access method in SD-WAN network |
CN113923758B (en) * | 2021-10-15 | 2022-06-21 | 广州电力通信网络有限公司 | POP point selection access method in SD-WAN network |
CN114039927A (en) * | 2021-11-04 | 2022-02-11 | 国网江苏省电力有限公司苏州供电分公司 | Control method for routing flow of power information network |
CN114039927B (en) * | 2021-11-04 | 2023-09-12 | 国网江苏省电力有限公司苏州供电分公司 | Control method for routing flow of power information network |
CN113992595A (en) * | 2021-11-15 | 2022-01-28 | 浙江工商大学 | SDN data center congestion control method based on prioritized experience replay DQN |
CN113992595B (en) * | 2021-11-15 | 2023-06-09 | 浙江工商大学 | SDN data center congestion control method based on prioritized experience replay DQN |
Also Published As
Publication number | Publication date |
---|---|
CN108900419B (en) | 2020-04-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108900419A (en) | Routing decision method and device based on deep reinforcement learning under an SDN architecture | |
CN112437020B (en) | Data center network load balancing method based on deep reinforcement learning | |
CN108260169B (en) | QoS guarantee-based dynamic service function chain deployment method | |
CN102415059B (en) | Bus control device | |
CN103825823B (en) | Data forwarding method based on different priorities in software-defined network | |
CN112486690B (en) | Edge computing resource allocation method suitable for industrial Internet of things | |
CN108833279B (en) | Method for multi-constraint QoS routing based on service classification in software defined network | |
CN107409099A (en) | Method, apparatus and machine-readable media for traffic engineering in a communication network with quality-of-service flows and best-effort flows | |
CN105897575A (en) | Path computation method based on a multi-constraint path computation strategy under SDN | |
CN109547358B (en) | Method for constructing time-sensitive network slice | |
CN109600319B (en) | Flow scheduling method in real-time transmission mechanism | |
US20170061041A1 (en) | Automatic performance characterization of a network-on-chip (noc) interconnect | |
JPWO2019026684A1 (en) | Route control method and route setting device | |
CN105515987A (en) | SDN framework based virtual optical network oriented mapping method | |
Manevich et al. | A cost effective centralized adaptive routing for networks-on-chip | |
CN103329492A (en) | System and method for implementing periodic early discard in on-chip buffer memories of network elements | |
CN108076158A (en) | Minimum load route selection method and system based on Naive Bayes Classifier | |
CN108028805A (en) | System and method for controlling in-band flow balancing in a software-defined network | |
Chen et al. | Albrl: Automatic load-balancing architecture based on reinforcement learning in software-defined networking | |
CN103051546B (en) | Delay scheduling-based network traffic conflict prevention method and delay scheduling-based network traffic conflict prevention system | |
CN106059941A (en) | Backbone network traffic scheduling method for eliminating link congestion | |
CN108600098B (en) | Scheduling method for fixed bandwidth of multiple variable paths in high-performance network | |
CN110535705A (en) | Service function chain construction method adaptive to user delay requirements | |
He et al. | RTHop: Real-time hop-by-hop mobile network routing by decentralized learning with semantic attention | |
Zhou et al. | Multi-task deep learning based dynamic service function chains routing in SDN/NFV-enabled networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | ||
Effective date of registration: 2022-01-19
Address after: 609-1, Floor 6, Building 1, No. 10 Caihefang Road, Haidian District, Beijing 100080
Patentee after: Fenomen array (Beijing) Technology Co.,Ltd.
Address before: No. 10 Xitucheng Road, Haidian District, Beijing 100876
Patentee before: Beijing University of Posts and Telecommunications