CN114355775A - Multi-controller deployment method and system based on SDN (software defined network) and deep reinforcement learning - Google Patents

Multi-controller deployment method and system based on SDN (software defined network) and deep reinforcement learning

Info

Publication number
CN114355775A
Authority
CN
China
Prior art keywords
controller
controllers
deployment
atomix
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111641069.9A
Other languages
Chinese (zh)
Inventor
尤龙
陈佳
王冲
王夏菁
廖晨茜
刘上
王艳广
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aerospace Science And Technology Network Information Development Co ltd
Original Assignee
Aerospace Science And Technology Network Information Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aerospace Science And Technology Network Information Development Co ltd filed Critical Aerospace Science And Technology Network Information Development Co ltd
Priority to CN202111641069.9A
Publication of CN114355775A
Legal status: Pending


Abstract

The invention provides a multi-controller deployment method based on an SDN network and deep reinforcement learning, which comprises the following steps: acquiring a first performance optimization index of multi-controller deployment according to an SDN network structure; the first performance optimization indicator comprises: average propagation delay between the switch and the controller, average propagation delay between the controllers, synchronization overhead between the controllers, and minimum security of synchronization between the controllers; establishing a first objective function according to the first performance optimization index; acquiring a first constraint condition of multi-controller deployment; the first constraint condition comprises controller load constraint, mapping relation constraint of the switch and the controller and control layer synchronous link bandwidth constraint; constructing a multi-controller deployment model according to the first objective function and the first constraint condition; and solving the multi-controller deployment model by utilizing a deep reinforcement learning algorithm based on a Markov decision model to obtain a multi-controller deployment scheme.

Description

Multi-controller deployment method and system based on SDN (software defined network) and deep reinforcement learning
Technical Field
The invention relates to the technical field of controller deployment, in particular to a multi-controller deployment method and a multi-controller deployment system based on an SDN network and deep reinforcement learning.
Background
With the increase of network traffic and the continuous expansion of network scale, the inherent defects of a single controller, such as single-point failure and limited controller resources, become increasingly prominent and increase the communication consumption of the control link. In addition, when multiple controllers are used to manage the network, unreasonable deployment of those controllers that fails to meet network service requirements may cause network congestion or paralysis, which greatly affects the scalability of the SDN network; the deployment problem of multiple controllers is therefore particularly important. Deployment of the controllers also has a significant impact on the performance, reliability, and network cost of the SDN network. Therefore, the invention provides a multi-controller deployment method and system based on an SDN network and deep reinforcement learning.
Disclosure of Invention
The invention aims to provide a multi-controller deployment method and a multi-controller deployment system based on an SDN (software defined network) network and deep reinforcement learning.
In order to achieve the purpose, the invention provides the following scheme:
a multi-controller deployment method based on an SDN network and deep reinforcement learning comprises the following steps:
acquiring a first performance optimization index of multi-controller deployment according to an SDN network structure; the first performance optimization indicator comprises: average propagation delay between the switch and the controller, average propagation delay between the controllers, synchronization overhead between the controllers, and minimum security of synchronization between the controllers;
establishing a first objective function according to the first performance optimization index;
acquiring a first constraint condition of multi-controller deployment; the first constraint condition comprises controller load constraint, mapping relation constraint of the switch and the controller and control layer synchronous link bandwidth constraint;
constructing a multi-controller deployment model according to the first objective function and the first constraint condition;
and solving the multi-controller deployment model by utilizing a deep reinforcement learning algorithm based on a Markov decision model to obtain a multi-controller deployment scheme.
A multi-controller deployment system based on an SDN network and deep reinforcement learning comprises:
the first performance optimization index acquisition module is used for acquiring a first performance optimization index deployed by the multiple controllers according to the SDN network structure; the first performance optimization indicator comprises: average propagation delay between the switch and the controller, average propagation delay between the controllers, synchronization overhead between the controllers, and minimum security of synchronization between the controllers;
the first objective function establishing module is used for establishing a first objective function according to the first performance optimization index;
the first constraint condition acquisition module is used for acquiring a first constraint condition of multi-controller deployment; the first constraint condition comprises controller load constraint, mapping relation constraint of the switch and the controller and control layer synchronous link bandwidth constraint;
the multi-controller deployment model building module is used for building a multi-controller deployment model according to the first objective function and the first constraint condition;
and the solving module is used for solving the multi-controller deployment model by utilizing a deep reinforcement learning algorithm based on a Markov decision model to obtain a multi-controller deployment scheme.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
the invention provides a multi-controller deployment method and a multi-controller deployment system based on an SDN network and deep reinforcement learning, wherein the method comprises the following steps: acquiring a first performance optimization index of multi-controller deployment according to an SDN network structure; the first performance optimization indicator comprises: average propagation delay between the switch and the controller, average propagation delay between the controllers, synchronization overhead between the controllers, and minimum security of synchronization between the controllers; establishing a first objective function according to the first performance optimization index; acquiring a first constraint condition of multi-controller deployment; the first constraint condition comprises controller load constraint, mapping relation constraint of the switch and the controller and control layer synchronous link bandwidth constraint; constructing a multi-controller deployment model according to the first objective function and the first constraint condition; and solving the multi-controller deployment model by utilizing a deep reinforcement learning algorithm based on a Markov decision model to obtain a multi-controller deployment scheme. Aiming at special application scenes such as a complex SDN (software defined network) network, a battlefield and the like, a multi-controller deployment mechanism is provided to reduce time delay, improve network performance and avoid the problem of control node breakdown caused by frequent service flow issuing of control nodes. And the field format of the synchronous data packet of the control layer is flexibly designed according to the interactive information of the control layer in the special application scene. 
The invention establishes an optimized deployment model of the cluster, so that when one controller in the network is damaged and stops working, other controllers can take over all nodes under the control of the fault controller, thereby ensuring that the communication of all nodes is not interrupted and enhancing the survivability of the control nodes. Meanwhile, a deep reinforcement learning algorithm is applied to establish a reliable and stable data transmission channel, so that accurate management of the network is realized.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without creative efforts.
Fig. 1 is a flowchart of a multi-controller deployment method based on an SDN network and deep reinforcement learning according to embodiment 1 of the present invention;
fig. 2 is a flowchart of synchronization between controllers according to embodiment 1 of the present invention;
fig. 3 is a diagram of a neural network structure provided in embodiment 1 of the present invention;
fig. 4 is a block diagram of a multi-controller deployment system based on an SDN network and deep reinforcement learning according to embodiment 2 of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention aims to provide a multi-controller deployment method and a multi-controller deployment system based on an SDN (software defined network) network and deep reinforcement learning.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Models established in the prior art mostly adopt single-objective optimization, while multi-objective optimization models take the delay between the switch and the controller and the delay between controllers as optimization indexes and the controller load as a constraint condition. Few existing models take the overhead of synchronization information between controllers as an optimization index; although such methods are simple to implement, their optimization effect is poor. Meanwhile, most deployment schemes do not consider the optimized deployment of the cluster: little existing work addresses deployment of ONOS assisted by Atomix, and no optimization targets the deployment mode in which Atomix nodes are physically separated from the ONOS controller. The multi-controller deployment problem is NP-hard and computationally very time-consuming. Existing solving algorithms mainly comprise integer linear programming, heuristic algorithms, and the like, which suffer from high complexity, poor scalability, and a tendency to fall into local optima. Therefore, in the research of this invention the distributed coordination framework Atomix is adopted to assist the ONOS controller in establishing the cluster. The invention proposes establishing a whole-network control layer through reasonable deployment of distributed controllers in complex network environments such as SDN scenes and variable battlefields. The invention designs the control-layer synchronization data packet format and reasonably and efficiently realizes deployment of the distributed controllers under the constraint of control-layer synchronization overhead.
The design scheme mainly comprises three aspects of control layer synchronous message design, establishment of a flat multi-controller deployment model and establishment of a cluster deployment optimization model of a distributed coordination framework Atomix.
Example 1
As shown in fig. 1, the present embodiment provides a multi-controller deployment method based on an SDN network and deep reinforcement learning, including:
s1: acquiring a first performance optimization index of multi-controller deployment according to an SDN network structure; the first performance optimization indicator comprises: average propagation delay between the switch and the controller, average propagation delay between the controllers, synchronization overhead between the controllers, and minimum security of synchronization between the controllers;
Specifically, the expression of the average propagation delay between the switch and the controller is:

T_scavg = (1/N) Σ_{s_i∈S} Σ_{c_j∈C} d_ij · x_ij

where N is the number of switches, d_ij is the shortest link delay between switch i and controller j, and x_ij is a binary variable whose value 1 indicates a successful connection between switch i and controller j; s_i denotes switch i, S is the set of switches, c_j is controller j, and C is the set of controllers.

The expression of the average propagation delay between the controllers is:

T_ccavg = (1/(K(K-1))) Σ_{c_j∈C} Σ_{c_k∈C, c_k≠c_j} d^c_jk

where K is the number of controllers and d^c_jk is the shortest link delay between controller j and controller k.

The expression of the synchronization overhead among the controllers is:

C_cc = Σ_{c_j∈C} Σ_{c_k∈C, c_k≠c_j} l_jk · ps_jk

where l_jk is the length of the data packet synchronized between controller j and controller k, and ps_jk is the frequency of synchronization between controller j and controller k.
S2: establishing a first objective function according to the first performance optimization index;
Specifically, the first objective function is:

minimize (αT_scavg + βT_ccavg + ρC_cc) + μ·K

where T_scavg represents the average propagation delay between the switch and the controller, T_ccavg the average propagation delay between controllers, and C_cc the synchronization overhead between controllers; μ is a security factor that realizes the minimum security of synchronization between controllers; α, β, ρ are the weights of the first performance optimization indexes, with α + β + ρ = 1.
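As an illustrative sketch, not part of the patent's disclosure, the three optimization indexes and the weighted objective above can be computed as follows; all delay values, packet lengths, and weight settings are hypothetical examples.

```python
# Illustrative sketch only: computing the three optimization indexes and the
# weighted objective described above. All numbers below are toy example values.

def avg_switch_controller_delay(d, x, n_switches):
    """T_scavg: mean shortest-link delay over the switch-to-controller mapping."""
    return sum(d[i][j] * x[i][j] for i in d for j in d[i]) / n_switches

def avg_controller_delay(dc, k):
    """T_ccavg: mean shortest-link delay over all ordered controller pairs."""
    return sum(dc[j][m] for j in dc for m in dc[j] if m != j) / (k * (k - 1))

def sync_overhead(l, ps):
    """C_cc: sum of packet length times synchronization frequency per pair."""
    return sum(l[j][m] * ps[j][m] for j in l for m in l[j])

def objective(t_sc, t_cc, c_cc, k, alpha=0.4, beta=0.3, rho=0.3, mu=0.01):
    # alpha + beta + rho must equal 1; mu * k is the minimum-security penalty
    assert abs(alpha + beta + rho - 1.0) < 1e-9
    return alpha * t_sc + beta * t_cc + rho * c_cc + mu * k

# toy instance: 2 switches, 2 controllers
d = {0: {0: 1.0, 1: 3.0}, 1: {0: 2.0, 1: 1.0}}  # switch-to-controller delays d_ij
x = {0: {0: 1, 1: 0}, 1: {0: 0, 1: 1}}          # binary mapping x_ij
dc = {0: {1: 2.0}, 1: {0: 2.0}}                 # controller-pair delays
l = {0: {1: 128}, 1: {0: 128}}                  # sync packet lengths l_jk (bytes)
ps = {0: {1: 0.5}, 1: {0: 0.5}}                 # sync frequencies ps_jk (Hz)

t_sc = avg_switch_controller_delay(d, x, n_switches=2)
t_cc = avg_controller_delay(dc, k=2)
c_cc = sync_overhead(l, ps)
cost = objective(t_sc, t_cc, c_cc, k=2)
```

In this toy instance each switch is assigned its nearest controller, so t_sc stays small while the synchronization-overhead term dominates the weighted cost.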
S3: acquiring a first constraint condition of multi-controller deployment; the first constraint condition comprises controller load constraint, mapping relation constraint of the switch and the controller and control layer synchronous link bandwidth constraint;
Specifically, the controller load constraint requires that the number of switches managed by each controller not exceed a specific threshold:

ΔLB ≤ LB_C

where L_opt is the desired optimal controller load, L_c is the actual controller load, and L_j is the load of controller j; LBI is the load bias index defined to measure the load disparity among all controllers in the network; ΔLB, the average difference in the number of switches managed by the controllers, must not exceed the threshold LB_C.

The mapping relation constraint of the switch and the controller requires that each switch be managed by exactly one controller:

Σ_{c_j∈C} x_ij = 1, for every switch s_i ∈ S

The control-layer synchronous link bandwidth constraint is:

Σ_{c_k∈C, c_k≠c_j} l_jk · ps_jk ≤ B_W, for every controller c_j ∈ C

where B_W is the bandwidth provided by the control-layer physical links.
S4: constructing a multi-controller deployment model according to the first objective function and the first constraint condition;
s5: and solving the multi-controller deployment model by utilizing a deep reinforcement learning algorithm based on a Markov decision model to obtain a multi-controller deployment scheme.
Wherein, step S5 specifically includes:
acquiring a state space of the Markov decision model according to the controller placement condition and the network topology information of each node in the network at the current time t;
acquiring an action space of the Markov decision model according to the number of controllers required to be deployed in a network, the positions of nodes deployed by the controllers and a mapping relation between a switch and the controllers;
acquiring the state transition probability of the Markov decision model according to the probability of transition to the next state after executing a certain action in the current state;
acquiring a reward function of the Markov decision model according to the average communication time delay between the switch and the controller, the average communication time delay between the controllers, the synchronous overhead between the controllers and the minimum safety of the controllers;
obtaining a multi-controller deployment scheme based on the state space, the action space, the state transition probability, and the reward function of the Markov decision model.
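As a hedged illustration of the Markov decision formulation above, the sketch below solves a toy placement instance with tabular Q-learning in place of a deep network; the node set, delay metric, reward encoding, and all hyperparameters are illustrative assumptions, not the patent's implementation. The reward is the negative of a toy delay cost, so maximizing it minimizes delay.

```python
# Toy MDP for controller placement: state = frozenset of nodes hosting a
# controller; action = place a controller at a free node; reward at the
# terminal state = negative mean switch-to-nearest-controller delay.
import random

NODES = [0, 1, 2, 3]
K = 2                                                      # controllers to deploy
DELAY = {(i, j): abs(i - j) for i in NODES for j in NODES}  # toy delay metric

def reward(placement):
    # negative mean delay from each switch to its nearest controller
    return -sum(min(DELAY[(s, c)] for c in placement) for s in NODES) / len(NODES)

random.seed(0)
q = {}
ALPHA, EPS = 0.5, 0.2
for episode in range(500):
    state = frozenset()
    while len(state) < K:
        actions = [n for n in NODES if n not in state]
        if random.random() < EPS:                  # epsilon-greedy exploration
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda n: q.get((state, n), 0.0))
        nxt = state | {a}
        terminal = len(nxt) == K
        r = reward(nxt) if terminal else 0.0
        best_next = 0.0 if terminal else max(
            q.get((nxt, n), 0.0) for n in NODES if n not in nxt)
        old = q.get((state, a), 0.0)
        q[(state, a)] = old + ALPHA * (r + best_next - old)  # Q-learning update
        state = nxt

# greedy rollout of the learned policy yields a placement of K controllers
placement = frozenset()
while len(placement) < K:
    a = max((n for n in NODES if n not in placement),
            key=lambda n: q.get((placement, n), 0.0))
    placement = placement | {a}
```

A deep-reinforcement-learning solver would replace the Q-table with a neural network over a state vector encoding placement and topology, as the method describes, but the state, action, transition, and reward roles are the same.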
As another optional implementation, on the basis of the multiple controller deployment model, an Atomix node is embedded, and optimized deployment is performed on the Atomix node, specifically including:
(1) obtaining a second performance optimization index of Atomix node deployment; the second performance optimization index comprises the average synchronization delay between Atomix nodes and the average synchronization delay between the Atomix nodes and the controller nodes.
Specifically, the average synchronization delay between the Atomix nodes is:

T_aaavg = (1/(A(A-1))) Σ_{a_j} Σ_{a_k≠a_j} d^a_jk

where A is the number of Atomix nodes, d^a_jk is the shortest link delay between Atomix nodes, and a_i, a_j, a_k denote Atomix nodes i, j, k.
The average synchronization delay between the Atomix node and the controller node is:

T_acavg = (Σ_i Σ_j d^ac_ij · z_ij) / (Σ_i Σ_j z_ij)

where d^ac_ij is the shortest link delay between the Atomix node and the controller node, and z_ij is a binary variable whose value 1 indicates a successful connection of Atomix node i with controller j.
(2) Constructing a second objective function according to the second performance optimization index;
Specifically, the second objective function is:

minimize ρ1·T_aaavg + ρ2·T_acavg

where ρ1 and ρ2 are the weights set for the second performance optimization indexes, with ρ1 + ρ2 = 1.
(3) Acquiring a second constraint condition of Atomix node deployment; the second constraint condition is that the number of Atomix nodes deployed in the network is B, and at least one mapping relation exists between the Atomix nodes and the controller nodes.
Specifically, the constraint that at least one mapping relation exists between the Atomix nodes and the controller nodes is:

Σ_{a_i} z_ij ≥ 1, for every controller c_j ∈ C
(4) constructing an Atomix node deployment model according to the second objective function and the second constraint condition;
(5) solving the Atomix node deployment model by using a deep reinforcement learning algorithm.
The solving procedure may follow that of the multi-controller deployment model.
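A minimal sketch of evaluating the Atomix deployment objective and its mapping constraint, under the assumption that the two delay indexes average over Atomix-node pairs and over established Atomix-controller mappings respectively; all names and numbers are illustrative, not from the patent.

```python
# Toy evaluation of the second objective: weighted sum of Atomix-to-Atomix and
# Atomix-to-controller average synchronization delays, plus a check of the
# "at least one mapping per controller" constraint. All values are examples.

def atomix_objective(daa, dac, z, n_atomix, rho1=0.6, rho2=0.4):
    # rho1 + rho2 must equal 1 (weights of the two delay indexes)
    assert abs(rho1 + rho2 - 1.0) < 1e-9
    t_aa = sum(daa[i][j] for i in daa for j in daa[i] if j != i) \
        / (n_atomix * (n_atomix - 1))
    n_links = sum(z[i][j] for i in z for j in z[i])
    t_ac = sum(dac[i][j] * z[i][j] for i in dac for j in dac[i]) / n_links
    return rho1 * t_aa + rho2 * t_ac

def mapping_constraint_ok(z):
    # every controller must be served by at least one Atomix node
    controllers = {j for i in z for j in z[i]}
    return all(any(z[i].get(j, 0) for i in z) for j in controllers)

daa = {0: {1: 2.0}, 1: {0: 2.0}}                   # Atomix-to-Atomix delays
dac = {0: {0: 1.0, 1: 3.0}, 1: {0: 2.0, 1: 1.0}}   # Atomix-to-controller delays
z = {0: {0: 1, 1: 0}, 1: {0: 0, 1: 1}}             # binary mapping z_ij
obj = atomix_objective(daa, dac, z, n_atomix=2)
```

In practice the deep-reinforcement-learning solver would minimize `obj` over candidate Atomix placements, keeping only placements for which `mapping_constraint_ok` holds.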
In this embodiment, based on network environments such as a variable complex SDN scene and an actual battlefield, a multi-controller deployment mechanism based on an SDN network and deep reinforcement learning is provided. Multiple controllers are efficiently and reasonably placed in the network, realizing cooperative management of the whole network, reducing network delay and bandwidth overhead in special application scenes, improving the utilization rate of network resources, balancing the loads of the controllers, and enhancing the robustness of the network, thereby reducing the probability of network failure caused by controller collapse. Comprehensively considering network delay, overhead, load, and other performance factors, a control-layer management method for special application scenes is established, realizing distributed management of each node in the network. In conclusion, the innovation points of the invention are as follows:
(1) Establishing a multi-controller deployment model. A plurality of network performance optimization indexes are comprehensively selected according to the network requirements of the special application scene. To enable the control layer to manage the network cooperatively, so that the failure of one controller does not affect normal network operation, the synchronization overhead between controllers, the minimum security of the controllers, and the bandwidth constraint of the control-layer links are used as network performance optimization indexes in addition to delay and load. Considering these many factors that influence deployment allows the special requirements of special application scenes to be met and the controllers to be deployed reasonably.
Designing the multi-controller deployment model entails selecting and defining the network performance optimization indexes to consider when deploying controllers, obtaining an objective function by analyzing the relations among the optimization indexes and the deployment problem the model must solve, completing the establishment of the model, and solving it with a deep reinforcement learning algorithm.
(2) Aiming at the optimization of the synchronous overhead of the controllers in the multi-controller deployment model, the invention designs the synchronous data packet field format among the controllers according to the synchronous requirements of the controllers in a special application scene, thereby optimizing the network performance index of the controller deployment model in a targeted manner and establishing a more accurate and more flexible deployment model.
The control-layer synchronization packet information is designed so that the information synchronized among controllers matches the requirements of different application scenes, providing a flexible optimization target for the controller deployment scheme.
(3) And aiming at the control layer cooperative management, establishing an optimized deployment model of the controller cluster. The network control layer established based on the ONOS controller needs to utilize a distributed coordination framework Atomix, and the invention provides an Atomix node deployment model to optimize the deployment quantity and the deployment position of Atomix.
Since current versions of the ONOS controller can be physically separated from the Atomix nodes, making cluster deployment more flexible, an Atomix node deployment optimization scheme is designed to improve network synchronization performance.
(4) Aiming at the NP-Hard problem of multi-controller deployment, the method solves the model by using a deep reinforcement learning algorithm. When an actual network application scene is deployed, the distance between each node is long, and an optimal result needs to be calculated within a limited time. The deep reinforcement learning algorithm is low in complexity, and integrates historical network data learning into controller deployment and switch controller mapping decisions so as to adapt to a network environment.
Aiming at the established model, a deep reinforcement learning algorithm is applied to the solution of the deployment model, so that the result of the method is more efficient and reasonable.
In order to make the solution of the present embodiment more clearly understood, the following detailed description is provided:
control layer synchronous message design based on ONOS controller
Due to the complexity and changeability of application scenes, the invention needs to establish a large-scale, complex SDN network platform. Meanwhile, in a complex network environment the controller must frequently issue control information such as flow tables and service flows; a single controller can hardly undertake issuing control information for all nodes in the network, which may cause conflicts in control-information issuing and collapse of the control node. Therefore, to avoid a single point of failure and to increase the reaction speed and overall performance of the network, multiple controllers must be deployed across the network for network management.
In the mechanism of the present invention, the entire network is divided into a plurality of sub-domains, and each controller controls one area in the network. Because each controller needs to undertake the task of calculating the policy for the switches controlled by the controller, the controller in each domain needs to master not only the switch topology relationship in the control range of the controller, but also the switch topology relationship in the control range of other controllers. Therefore, a synchronous communication mechanism needs to be established between the domain controllers to synchronize the topology information of the domains with each other. Meanwhile, in order to avoid the out-of-control situation of the switch caused by the failure of the controller, when the corresponding controller of the switch fails, other non-failure controllers can still quickly and timely send control information to the out-of-control switch after taking over the out-of-control switch, and information such as calculated strategies and the like needs to be synchronized among all domain controllers.
Therefore, two controllers are taken as an example to describe the synchronization process between the controllers. As shown in fig. 2, the process is divided into five steps:
① each domain controller collects the topology information of the data layer;
② the controllers establish a connection through the TCP three-way handshake;
③ controller A and controller B each calculate their policies;
④ the controllers send synchronization data packets to each other for information synchronization;
⑤ after synchronization is completed, controller A and controller B send each other an ACK message to terminate the synchronization.
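The synchronization exchange of Fig. 2 can be sketched in-process as follows, with the TCP handshake and ACK exchange modeled as in-memory messages rather than real sockets; the packet fields below are placeholders and not the patent's synchronization packet format.

```python
# In-process sketch of the inter-controller synchronization flow: each
# controller holds its collected data-layer topology, exchanges SYNC packets
# with a peer, records the peer's topology, and answers with an ACK.

class Controller:
    def __init__(self, name, topology):
        self.name = name
        self.topology = topology   # topology collected from the data layer
        self.peer_topo = {}        # topology learned from peer controllers
        self.inbox = []            # stands in for the established TCP channel

    def send_sync(self, peer):
        # send a synchronization data packet (placeholder fields)
        peer.inbox.append({"type": "SYNC", "src": self.name, "topo": self.topology})

    def process_inbox(self, peer):
        # absorb peer topology and answer each SYNC with an ACK
        for msg in list(self.inbox):
            if msg["type"] == "SYNC":
                self.peer_topo[msg["src"]] = msg["topo"]
                peer.inbox.append({"type": "ACK", "src": self.name})
        self.inbox = [m for m in self.inbox if m["type"] != "SYNC"]

a = Controller("A", {"s1": ["s2"]})
b = Controller("B", {"s3": ["s4"]})
a.send_sync(b)          # mutual packet exchange
b.send_sync(a)
a.process_inbox(b)      # each side records the peer topology and ACKs
b.process_inbox(a)
```

After the exchange each controller holds the other domain's topology, which is what lets a surviving controller take over the switches of a failed peer.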
Aiming at the fourth step, the sending of synchronization data packets, the invention designs the format of the synchronization data packet. In the invention, the synchronization information among the controllers comprises topology information, flow table information, and specific service flow information. The three synchronization packet field formats are shown in Table 1, Table 2, and Table 3.
Table 1: topology information synchronization packet field format (table reproduced as an image in the original publication)
Table 2: flow table information synchronization packet field format (table reproduced as an image in the original publication)
Table 3: specific service flow information synchronization packet field format (table reproduced as an image in the original publication)
(II) Establishment of a flat, distributed multi-controller deployment model
The deployment of multiple controllers needs to address three key issues:
① given a network topology, how many controllers should be deployed;
② at which locations the controllers can most reasonably be deployed;
③ which controller should manage each switch.
Therefore, starting from the above three key problems, the invention establishes a deployment model of the controller according to a large-scale SDN and a complex network application scenario.
Physical network: the network topology is modeled as an undirected graph G(V, E), where V represents the set of switches and E represents the set of physical links. K denotes the number of controllers in the network and C = {c_1, ..., c_K} denotes the set of controllers; in the present invention each controller is placed at the position of a switch in the network, and p_θ indicates the deployment location of a controller. S_{θ_i} denotes the set of switches managed by controller θ_i, so the mapping relationship between the switches and the controllers can be expressed as the set of pairs (S_{θ_i}, c_i).
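A minimal sketch of this physical-network model, assuming BFS hop count as a stand-in for link delay and nearest-controller assignment for the switch-to-controller mapping; the topology, placement, and identifiers are illustrative.

```python
# Undirected graph G(V, E) as an adjacency dict; a controller placement
# (controller -> hosting switch); and the derived switch->controller mapping.
from collections import deque

def hop_delay(adj, src):
    """BFS shortest hop count from src to every node (toy stand-in for d_ij)."""
    dist = {src: 0}
    dq = deque([src])
    while dq:
        u = dq.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                dq.append(v)
    return dist

def nearest_controller_mapping(adj, placement):
    """Assign every switch to its closest controller (the x_ij assignment)."""
    dists = {c: hop_delay(adj, host) for c, host in placement.items()}
    return {s: min(placement, key=lambda c: dists[c].get(s, float("inf")))
            for s in adj}

adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}   # 4-switch line topology
placement = {"c1": 0, "c2": 3}                 # controllers hosted at switches 0 and 3
mapping = nearest_controller_mapping(adj, placement)
```

Each switch lands in the domain of its nearest controller, which is the mapping set the delay indexes below are evaluated over.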
The average propagation delay between the switch and the controller and the average propagation delay between the controllers are related to the deployment positions of the controllers. On this basis, the selected network performance optimization indexes are as follows:

① Average propagation delay between the switch and the controller: the average of the propagation delay between each switch and its controller, as shown in (equation 1), where N = |V| is the number of switches, d_ij is the shortest link delay between switch i and controller j, and x_ij is a binary variable whose value 1 indicates a successful connection of switch i to controller j.

T_scavg = (1/N) Σ_{s_i∈S} Σ_{c_j∈C} d_ij · x_ij    (equation 1)

② Average propagation delay between controllers: the average of the propagation delay between every pair of controllers, as shown in (equation 2), where K is the number of controllers and d^c_jk is the shortest link delay between controllers j and k.

T_ccavg = (1/(K(K-1))) Σ_{c_j∈C} Σ_{c_k∈C, c_k≠c_j} d^c_jk    (equation 2)

③ Synchronization overhead between controllers: the communication overhead incurred when controllers synchronize, as shown in (equation 3); it is related to the data packet format and the synchronization frequency between controllers. Here l_jk is the packet length for synchronization between controller j and controller k, and ps_jk is the frequency of synchronization between controller j and controller k.

C_cc = Σ_{c_j∈C} Σ_{c_k∈C, c_k≠c_j} l_jk · ps_jk    (equation 3)
④ Minimum security of synchronization between controllers:
Because each controller can obtain the topology information of the whole network through communication with the other controllers, the number of deployed controllers must be minimized to reduce the probability of information leakage when the network is attacked and to enhance network security. In the model, a security factor μ is set to constrain the number of deployed controllers.
Aiming at the description of the optimization indexes selected by the model, the multi-controller deployment model based on time delay and synchronous overhead is established in the invention. The deployment of the controller is realized by comprehensively considering the time delay and the synchronization overhead condition and combining the constraint conditions of the load of the controller, the minimum safety of synchronization and the like. The optimization objective is shown in (equation 4).
minimize (αT_scavg + βT_ccavg + ρC_cc) + μ·K    (equation 4)
In the model, the average propagation delay between switches and controllers and the average propagation delay between controllers are in tension: minimizing the switch-controller delay pushes controllers toward the switches, which spreads the controllers apart and increases the inter-controller delay, and vice versa. Under such a relationship no solution optimizes all performance indexes simultaneously; improving one index generally sacrifices the others. Therefore, the invention assigns a weight to each network performance index, with α + β + ρ = 1, so that an efficient and flexible deployment model can be configured according to the emphasis placed on each index in a specific application scenario.
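Equation 4 then reads directly as code; the weight values below are illustrative defaults, not taken from the patent:

```python
def deployment_objective(t_sc, t_cc, c_cc, k,
                         alpha=0.4, beta=0.3, rho=0.3, mu=0.1):
    """Weighted deployment cost of Equation 4: smaller is better.
    alpha, beta and rho must sum to 1."""
    assert abs(alpha + beta + rho - 1.0) < 1e-9
    return (alpha * t_sc + beta * t_cc + rho * c_cc) + mu * k
```

Raising alpha makes the switch-to-controller delay dominate the score, which is how the model is tuned to a scenario's emphasis.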
The model needs to satisfy the following constraint conditions:
Controller load constraint:

During controller deployment the model must respect the load limit of each controller, i.e., the number of switches managed by a controller cannot exceed a specific threshold. The difference in the number of switches managed by the controllers is defined as the average load difference ΔLB, shown in (Equation 7), and this value cannot exceed the specified threshold LB_C. L_opt is the desired optimal controller load. The spread of load among all controllers in the network is defined as the load deviation index LBI, shown in (Equation 6); the smaller this index, the better the network's load-balancing performance after the controllers are reasonably deployed.
(Equations 5–7, given as drawings in the original, define L_opt, LBI, and ΔLB respectively.)
Mapping relation constraint between switches and controllers:

In the model, K controllers are deployed in the network, each switch has exactly one mapping to a controller, and every switch must have a controller to which it belongs; the mapping therefore has to satisfy (Equation 8).
Σ_{c∈C} x_sc = 1, ∀s ∈ S
x_sc ≤ y_c
Σ_{c∈C} y_c = K
x_sc ∈ {0, 1}, y_c ∈ {0, 1}  (8)
Control-layer synchronization link bandwidth constraint:

In application environments such as a large-scale SDN or a complex network, the bandwidth demanded by synchronization information between controllers cannot exceed the bandwidth resource provided by the control-layer physical links. Let the bandwidth provided by the control-layer physical links be B_W; the bandwidth constraint is shown in (Equation 9).

Σ_{cj∈C} Σ_{ck∈C, k≠j} l_jk · p_sjk ≤ B_W  (9)
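The three constraints can be checked together for a candidate deployment. The sketch below uses our own names; x, y, l and p follow the symbols of Equations 8 and 9, and a simple per-controller load cap stands in for the load constraint (whose exact equations are given only as drawings):

```python
def feasible(x, y, k, l, p, bw, max_load):
    """Return True iff the deployment satisfies the mapping constraints
    (Eq. 8), the bandwidth constraint (Eq. 9) and a per-controller
    load cap.  x[s][c] is 1 iff switch s maps to site c; y[c] is 1 iff
    a controller is deployed at site c."""
    sites = len(y)
    for row in x:
        if sum(row) != 1:                 # each switch has exactly one controller
            return False
        if any(row[c] > y[c] for c in range(sites)):
            return False                  # only map to deployed controllers
    if sum(y) != k:                       # exactly K controllers deployed
        return False
    for c in range(sites):                # controller load cap
        if sum(row[c] for row in x) > max_load:
            return False
    sync_bw = sum(l[j][m] * p[j][m]       # Eq. 9: synchronization traffic
                  for j in range(sites) for m in range(sites) if m != j)
    return sync_bw <= bw
```

A solver (or the reinforcement-learning agent below) would call such a check to reject infeasible candidate placements.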
(III) Cluster deployment optimization model based on distributed coordination framework Atomix
Because the invention adopts the ONOS controller, cluster management in the ONOS environment requires the distributed coordination framework Atomix, which physically separates functions such as cluster management, service discovery, and persistent data storage from the ONOS nodes. Before an ONOS controller cluster is deployed, an Atomix cluster must first be formed for data storage and coordination, and the ONOS nodes are then configured with the list of Atomix nodes to connect to. In earlier versions of the ONOS controller, Atomix nodes had to be embedded in the controller itself to form a cluster and synchronize state, whereas in the current version functions such as state synchronization are moved into a separate Atomix cluster. The Atomix framework can be deployed on non-control nodes or embedded in control nodes; however, published references on Atomix node deployment are scarce, so effectively selecting the number and positions of Atomix nodes in a large-scale SDN is also important. Therefore, building on the model of sections (I) and (II), in which the Atomix nodes are embedded in the ONOS controllers, the invention establishes a model for the optimized deployment of Atomix nodes that determines their number and positions and realizes synchronization of network-wide state information.
In the Atomix node deployment optimization model, strong consistency between the Atomix nodes and the ONOS nodes, and among the Atomix nodes themselves, is maintained by the Raft protocol; the performance optimization indexes are selected and defined as follows.
① Average synchronization delay between Atomix nodes: represents the mean propagation delay generated by synchronization information between Atomix nodes, as shown in (Equation 10), where A is the number of Atomix nodes and d^a_jk is the shortest link delay between Atomix nodes j and k.

T_aaavg = (2/(A(A−1))) · Σ_{aj∈A} Σ_{ak∈A, k>j} d^a_jk  (10)
② Average synchronization delay between Atomix nodes and ONOS nodes: represents the mean propagation delay generated by synchronization information between the Atomix nodes and the ONOS nodes, as shown in (Equation 11), where K is the number of ONOS controllers, d^ac_ij is the shortest link delay between Atomix node i and ONOS node j, and z_ij is a binary variable whose value 1 indicates a successful connection of Atomix node i with ONOS controller j.

T_acavg = (1/K) · Σ_i Σ_j d^ac_ij · z_ij  (11)
Based on the optimization indexes above, the invention establishes a delay-based optimized deployment model for the distributed coordination framework Atomix. The optimization objective is shown in (Equation 12).
minimize ρ1·T_aaavg + ρ2·T_acavg  (12)
In the model, the average synchronization delay between Atomix nodes and the average synchronization delay between Atomix nodes and ONOS nodes constrain each other: concentrating the Atomix nodes reduces the delay among them but increases the delay between Atomix nodes and ONOS nodes, and vice versa. To balance the two, the invention sets weights with ρ1 + ρ2 = 1 and establishes an efficient and flexible Atomix node deployment model according to the specific application scenario.
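Equation 12 as code, with illustrative weights of our choosing; shifting weight toward the inter-Atomix term favors a concentrated placement, which is exactly the trade-off described above:

```python
def atomix_objective(t_aa, t_ac, rho1, rho2):
    """Weighted Atomix deployment cost of Equation 12 (rho1 + rho2 = 1)."""
    assert abs(rho1 + rho2 - 1.0) < 1e-9
    return rho1 * t_aa + rho2 * t_ac

# With most weight on the inter-Atomix delay, a concentrated placement
# (low t_aa, high t_ac) scores better than a spread-out one:
concentrated = atomix_objective(1.0, 6.0, 0.8, 0.2)  # 2.0
spread_out = atomix_objective(4.0, 2.0, 0.8, 0.2)    # 3.6
```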
The constraint conditions to be met by the model are as follows:
In the model, B Atomix nodes are deployed in the network, and at least one mapping relationship exists between the Atomix nodes and the ONOS nodes; the mapping therefore has to satisfy the condition shown in (Equation 13).
Σ_{i∈A} z_ij ≥ 1, ∀j ∈ C
Σ_{i∈A} y^a_i = B
z_ij ∈ {0, 1}, y^a_i ∈ {0, 1}  (13)
(IV) Adaptive multi-controller deployment algorithm framework design
For the established multi-controller deployment model, the invention proposes an adaptive multi-controller deployment algorithm based on deep reinforcement learning, aiming to deploy the controllers more efficiently and accurately. The controller deployment problem is converted into an MDP model and solved, with the state space, action space, state transition probability, and reward function forming the quadruple (S, A, P, R), defined as follows:
(1) state space S
In the invention, the state space is expressed as the controller placement of each node in the network and the network topology information at the current time t, represented as:

s_t = (g_t, b_t, f_t, ω_t, δ_t), with g_t = (t_t, c_t)
wherein, the meaning of each element is as follows:
g_t: representing the physical network topology information at time t, including t_t and c_t.
t_t: representing the delay of each link at time t.

c_t: representing the synchronization overhead of each control link at time t.

b_t: representing the load of the deployed control nodes at time t.

f_t: representing the probability that each node fails at time t.
ω_t: representing the controller placement of each node at time t, including K_t and P^c_t.

K_t: representing the number of controllers deployed at time t.

P^c_t: representing the deployment locations of the controllers at time t.
δ_t: representing the Atomix placement of each node at time t, including A_t and P^a_t.

A_t: representing the number of Atomix nodes deployed at time t.

P^a_t: representing the deployment locations of the Atomix nodes at time t.
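One possible container for the state tuple s_t, sketched with our own field names standing for g_t, b_t, f_t, ω_t and δ_t; all values are illustrative:

```python
from collections import namedtuple

# Hypothetical state container; the field names are ours, not the patent's.
State = namedtuple("State",
                   ["topology", "loads", "fail_prob", "ctrl_place", "atomix_place"])

s_t = State(
    topology={"link_delay": [[0, 2], [2, 0]],   # t_t: per-link delay
              "sync_cost": [[0, 1], [1, 0]]},   # c_t: per-control-link overhead
    loads=[5, 7],                               # b_t: deployed-controller loads
    fail_prob=[0.01, 0.02],                     # f_t: per-node failure probability
    ctrl_place=(2, [0, 3]),                     # omega_t: (K_t, controller positions)
    atomix_place=(1, [1]),                      # delta_t: (A_t, Atomix positions)
)
```

Flattening such a tuple into a vector gives the input of the Q-network described below in the text.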
(2) Action space A
In the invention, the action space A expresses the number of controllers to be deployed in the network, the node positions on which the controllers are deployed, and the mapping relationship between switches and controllers, represented as:

a_t = (p_θ, S_C), θ ∈ {1, 2, ..., K}
a′_t = p_c, c ∈ {1, 2, ..., B}
wherein, the meaning of each element is as follows:
p_θ: representing the position at which a controller is deployed.

p_c: representing the position at which an Atomix node is deployed.
S_C: representing the mapping between switches and controllers.
K: representing the number of controllers that need to be deployed in the network.
B: representing the number of Atomix nodes that need to be deployed in the network.
(3) Probability of state transition P
In the invention, the state transition probability represents the probability of transitioning from the current state s_t to the next state s_{t+1} after performing action a_t, expressed as:

P = P(s_{t+1} | s_t, a_t)
wherein, the meaning of each element is as follows:
s_t: representing the current state.

s_{t+1}: representing the next state.

a_t: representing an action taken in the current state.
(4) Reward function R
In the invention, each executed action generates a reward according to the set reward function; the larger the reward, the higher the value of the action and the better the performance the controller deployment can obtain. Therefore, the average communication delay between switches and controllers, the average communication delay between controllers, the synchronization overhead between controllers, and the minimum security of the controllers are set as the reward functions, represented as:

r_1 = −((α·T_1t + β·T_2t + ρ·C_cc) + μ·K)
r_2 = −(ρ1·T_3t + ρ2·T_4t)
wherein, the meaning of each element is as follows:
α: and the proportion of the average communication delay between the switch and the controller in the reward penalty measures in the deployment process is represented.
Beta: representing the proportion of the average communication delay between controllers in the reward penalty measure during deployment.
ρ: representing the proportional amount of synchronization overhead between controllers in a reward penalty measure during deployment.
μ: representing a safety factor for the controller during deployment.
ρ1: and the communication delay among the Atomix accounts for the proportion of the reward penalty measures in the deployment process.
ρ2: and the communication delay between the Atomix and the controller in the deployment process accounts for the proportion of the reward penalty measures.
T1t: representing the average communication delay between the switch and the controller during deployment.
T2t: representing the average communication delay between controllers of the deployment process.
T3t: representing the inter-Atomix communication delay during deployment.
T4t: representing the communication delay between the Atomix and the controller during deployment.
Ccc: representing the synchronization overhead between the controllers of the deployment process.
K: representing the number of controllers to deploy the process.
In the invention, the reward function accounts for the importance of link delay to the deployment, assigning different weights to the switch-controller delay and the inter-controller delay according to the actual application scenario; it also accounts for the synchronization overhead and the minimum security, so the better the deployment result meets these four objectives, the larger the reward value.
To let the model select the optimal action in a given network state while obtaining the maximum cumulative reward, and thus make the deployment result more accurate, the invention builds a neural network structure that allows the agent to better perceive environmental information such as the network topology state, and hence to generate a better policy through interactive learning with the environment.
The state at each time step is the input of the neural network, and the dimension of the input state determines the number of input-layer neurons. The two middle layers of the network are fully connected; the output is the Q value of every possible action in the input state, and the number of output neurons is determined by the size of the action set.
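The described shape — a state vector in, two fully connected hidden layers, one Q value per action out — can be sketched in plain Python; the layer sizes, random initialization, and ReLU activation are our illustrative choices, not details from the patent:

```python
import random

random.seed(0)

def init_layer(n_in, n_out):
    """One fully connected layer: weight matrix plus bias vector."""
    w = [[random.uniform(-0.1, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    return w, [0.0] * n_out

def forward(layers, state):
    """Two ReLU hidden layers, linear output: one Q-value per action."""
    h = state
    for idx, (w, b) in enumerate(layers):
        out = [sum(h[i] * w[i][j] for i in range(len(h))) + b[j]
               for j in range(len(b))]
        if idx < len(layers) - 1:       # ReLU on the hidden layers only
            out = [max(v, 0.0) for v in out]
        h = out
    return h

# e.g. a 5-dimensional state, two hidden layers of 16 neurons, 4 actions:
q_net = [init_layer(5, 16), init_layer(16, 16), init_layer(16, 4)]
q = forward(q_net, [0.0] * 5)           # one Q-value per possible action
```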
The DQN algorithm adopted by the invention is an off-policy learning method; the parameters of the neural network are set, and each control node in the network is deployed according to the output of the trained model. The structure of the neural network is shown in fig. 3. The following gives the design of the multi-controller deployment decision algorithm training flow and the Atomix deployment decision algorithm training flow.
(The training flows of the multi-controller deployment decision algorithm and the Atomix deployment decision algorithm are given as tables in the original.)
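A DQN-style training flow of the kind those tables describe can be sketched as below. This is a schematic stand-in, not the patent's algorithm: a tabular Q function replaces the neural network, and the environment and hyper-parameters are toy assumptions — only the experience replay, epsilon-greedy exploration, and periodic target-network sync mirror the described DQN structure.

```python
import random
from collections import defaultdict, deque

class TabularQ:
    """Toy stand-in for the Q-network (tabular instead of neural)."""
    def __init__(self, n_actions):
        self.n_actions = n_actions
        self.q = defaultdict(float)          # (state, action) -> Q value
    def best(self, s):
        return max(range(self.n_actions), key=lambda a: self.q[(s, a)])
    def max_q(self, s):
        return max(self.q[(s, a)] for a in range(self.n_actions))
    def update(self, s, a, target, lr=0.5):
        self.q[(s, a)] += lr * (target - self.q[(s, a)])
    def copy_from(self, other):
        self.q = other.q.copy()

def train(env, q_net, target_net, episodes=50, eps=0.2, gamma=0.9, batch=8):
    buffer = deque(maxlen=1000)              # experience replay memory
    for ep in range(episodes):
        s, done = env.reset(), False
        while not done:
            a = (random.randrange(q_net.n_actions) if random.random() < eps
                 else q_net.best(s))         # epsilon-greedy exploration
            s2, r, done = env.step(a)
            buffer.append((s, a, r, s2, done))
            s = s2
            if len(buffer) >= batch:         # replay a random mini-batch
                for bs, ba, br, bs2, bd in random.sample(list(buffer), batch):
                    target = br if bd else br + gamma * target_net.max_q(bs2)
                    q_net.update(bs, ba, target)
        if ep % 5 == 0:                      # periodic target-network sync
            target_net.copy_from(q_net)
    return q_net
```

In the patent's setting, the state would be the tuple s_t defined above, an action a deployment position (and switch mapping), and the reward r_1 or r_2.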
Example 2
As shown in fig. 4, this embodiment provides a system for the multi-controller deployment method based on an SDN network and deep reinforcement learning, including:
a first performance optimization index obtaining module M1, configured to obtain a first performance optimization index of a multi-controller deployment according to an SDN network structure; the first performance optimization indicator comprises: average propagation delay between the switch and the controller, average propagation delay between the controllers, synchronization overhead between the controllers, and minimum security of synchronization between the controllers;
a first objective function establishing module M2, configured to establish a first objective function according to the first performance optimization index;
a first constraint condition obtaining module M3, configured to obtain a first constraint condition for multi-controller deployment; the first constraint condition comprises controller load constraint, mapping relation constraint of the switch and the controller and control layer synchronous link bandwidth constraint;
a multi-controller deployment model building module M4, configured to build a multi-controller deployment model according to the first objective function and the first constraint condition;
and the solving module M5 is used for solving the multi-controller deployment model by utilizing a deep reinforcement learning algorithm based on a Markov decision model to obtain a multi-controller deployment scheme.
For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and its core concept; meanwhile, a person skilled in the art may, according to the idea of the present invention, vary the specific embodiments and the application range. In view of the above, the content of this specification should not be construed as limiting the invention.

Claims (10)

1. A multi-controller deployment method based on an SDN network and deep reinforcement learning is characterized by comprising the following steps:
acquiring a first performance optimization index of multi-controller deployment according to an SDN network structure; the first performance optimization indicator comprises: average propagation delay between the switch and the controller, average propagation delay between the controllers, synchronization overhead between the controllers, and minimum security of synchronization between the controllers;
establishing a first objective function according to the first performance optimization index;
acquiring a first constraint condition of multi-controller deployment; the first constraint condition comprises controller load constraint, mapping relation constraint of the switch and the controller and control layer synchronous link bandwidth constraint;
constructing a multi-controller deployment model according to the first objective function and the first constraint condition;
and solving the multi-controller deployment model by utilizing a deep reinforcement learning algorithm based on a Markov decision model to obtain a multi-controller deployment scheme.
2. The method of claim 1, wherein the average propagation delay between the switch and the controller is expressed as:

T_scavg = (1/N) · Σ_{si∈S} Σ_{cj∈C} d_ij · x_ij

wherein N is the number of switches, d_ij is the shortest link delay between switch i and controller j, x_ij is a binary variable whose value 1 indicates a successful connection of switch i to controller j, s_i is switch i, S is the set of switches, c_j is controller j, and C is the set of controllers.
The expression of the average propagation delay between the controllers is:

T_ccavg = (2/(K(K−1))) · Σ_{cj∈C} Σ_{ck∈C, k>j} d_jk

wherein c_k is controller k, K is the number of controllers, and d_jk is the shortest link delay between controllers;
The expression of the synchronization overhead among the controllers is:

C_cc = Σ_{cj∈C} Σ_{ck∈C, k≠j} l_jk · p_sjk

wherein l_jk is the packet length for synchronization between controller j and controller k, and p_sjk is the frequency of synchronization between controller j and controller k.
3. The method of claim 2, wherein the first objective function is:
minimize (α·T_scavg + β·T_ccavg + ρ·C_cc) + μ·K

wherein T_scavg represents the average propagation delay between the switch and the controller; T_ccavg is the average propagation delay among the controllers; C_cc is the inter-controller synchronization overhead; μ is a safety factor for realizing the minimum security of synchronization between the controllers; α, β, ρ are the weights of the first performance optimization indexes, and α + β + ρ = 1.
4. The method of claim 1, wherein the first constraint is:
the controller load constraints are:
(The load-constraint equations, given as a drawing in the original, bound ΔLB by LB_C.)

wherein L_opt is the desired optimal controller load number; L_c is the actual controller load number; L_j is the load number of controller j; LBI measures the load difference among all controllers in the network and is defined as the load deviation index; ΔLB is the difference in the number of switches managed by each controller and is defined as the average load difference; and LB_C is the threshold of ΔLB;
the mapping relation constraint of the switch and the controller is as follows:
Σ_{c∈C} x_sc = 1, ∀s ∈ S; x_sc ≤ y_c; Σ_{c∈C} y_c = K; x_sc, y_c ∈ {0, 1};
the control layer synchronous link bandwidth constraint is as follows:
Σ_{cj∈C} Σ_{ck∈C, k≠j} l_jk · p_sjk ≤ B_W

wherein B_W is the bandwidth provided by the control-layer physical links.
5. The method according to claim 3, wherein solving the multi-controller deployment model using a deep reinforcement learning algorithm based on a Markov decision model specifically comprises:
acquiring a state space of the Markov decision model according to the controller placement condition and the network topology information of each node in the network at the current moment;
acquiring an action space of the Markov decision model according to the number of controllers required to be deployed in a network, the positions of nodes deployed by the controllers and a mapping relation between a switch and the controllers;
acquiring the state transition probability of the Markov decision model according to the probability of transition to the next state after executing a certain action in the current state;
acquiring a reward function of the Markov decision model according to the average communication time delay between the switch and the controller, the average communication time delay between the controllers, the synchronous overhead between the controllers and the minimum safety of the controllers;
obtaining a multi-controller deployment scheme based on the state space, the action space, the state transition probability, and the reward function of the Markov decision model.
6. The method of claim 1, further comprising: embedding an Atomix node on the basis of the multi-controller deployment model, and performing optimized deployment on the Atomix node, wherein the method specifically comprises the following steps:
obtaining a second performance optimization index of Atomix node deployment; the second performance optimization index comprises the average synchronization delay between the Atomix nodes and the average synchronization delay between the Atomix nodes and the controller nodes;
Constructing a second objective function according to the second performance optimization index;
acquiring a second constraint condition of Atomix node deployment; the second constraint condition is that the number of Atomix nodes deployed in the network is B, and at least one mapping relation exists between the Atomix nodes and the controller nodes;
constructing an Atomix node deployment model according to the second objective function and the second constraint condition;

and solving the Atomix node deployment model by using a deep reinforcement learning algorithm.
7. The method of claim 6, wherein the average synchronization delay between the Atomix nodes is:

T_aaavg = (2/(A(A−1))) · Σ_{aj∈A} Σ_{ak∈A, k>j} d^a_jk

wherein A is the number of Atomix nodes, d^a_jk is the shortest link delay between Atomix nodes, and a_i, a_j, a_k denote Atomix nodes i, j, k;
the average synchronization delay between the Atomix node and the controller node is:

T_acavg = (1/K) · Σ_i Σ_j d^ac_ij · z_ij

wherein d^ac_ij is the shortest link delay between the Atomix node and the controller node, and z_ij is a binary variable whose value 1 indicates a successful connection of Atomix node i with controller j.
8. The method of claim 7, wherein the second objective function is:

minimize ρ1·T_aaavg + ρ2·T_acavg

wherein ρ1, ρ2 are the weights set for the second performance optimization indexes, and ρ1 + ρ2 = 1.
9. The method of claim 7, wherein the at-least-one mapping relationship between the Atomix node and the controller node satisfies:

Σ_{i∈A} z_ij ≥ 1, for every controller node j.
10. a system based on the SDN network and deep reinforcement learning multi-controller deployment method of any one of claims 1-9, comprising:
the first performance optimization index acquisition module is used for acquiring a first performance optimization index deployed by the multiple controllers according to the SDN network structure; the first performance optimization indicator comprises: average propagation delay between the switch and the controller, average propagation delay between the controllers, synchronization overhead between the controllers, and minimum security of synchronization between the controllers;
the first objective function establishing module is used for establishing a first objective function according to the first performance optimization index;
the first constraint condition acquisition module is used for acquiring a first constraint condition of multi-controller deployment; the first constraint condition comprises controller load constraint, mapping relation constraint of the switch and the controller and control layer synchronous link bandwidth constraint;
the multi-controller deployment model building module is used for building a multi-controller deployment model according to the first objective function and the first constraint condition;
and the solving module is used for solving the multi-controller deployment model by utilizing a deep reinforcement learning algorithm based on a Markov decision model to obtain a multi-controller deployment scheme.
CN202111641069.9A 2021-12-29 2021-12-29 Multi-controller deployment method and system based on SDN (software defined network) and deep reinforcement learning Pending CN114355775A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111641069.9A CN114355775A (en) 2021-12-29 2021-12-29 Multi-controller deployment method and system based on SDN (software defined network) and deep reinforcement learning


Publications (1)

Publication Number Publication Date
CN114355775A true CN114355775A (en) 2022-04-15

Family

ID=81103213

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111641069.9A Pending CN114355775A (en) 2021-12-29 2021-12-29 Multi-controller deployment method and system based on SDN (software defined network) and deep reinforcement learning

Country Status (1)

Country Link
CN (1) CN114355775A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104980296A (en) * 2014-04-11 2015-10-14 华为技术有限公司 OpenFlow multi-controller system and management method thereof
CN107276662A (en) * 2017-07-27 2017-10-20 大连大学 A kind of software definition Information Network multi-controller dynamic deployment method
CN108650131A (en) * 2018-05-10 2018-10-12 合肥工业大学 The processing system disposed for multi-controller in SDN network
CN108777636A (en) * 2018-05-25 2018-11-09 陕西师范大学 A kind of multi-controller Optimization deployment method of robust in software defined network
CN110120892A (en) * 2019-04-30 2019-08-13 山东工商学院 SDN multi-controller dispositions method and system based on improved glowworm swarm algorithm



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination