CN110247795B - Intent-based cloud network resource service chain arranging method and system

Info

Publication number
CN110247795B
CN110247795B (granted from application CN201910461367.6A; publication CN110247795A)
Authority
CN
China
Prior art keywords
service
cost
arrangement
preset
vnf
Prior art date
Legal status
Active
Application number
CN201910461367.6A
Other languages
Chinese (zh)
Other versions
CN110247795A (en)
Inventor
郭少勇
喻鹏
邱雪松
贺文晨
李文萃
申京
邵苏杰
徐思雅
亓峰
丰雷
Current Assignee
Beijing University of Posts and Telecommunications
Information and Telecommunication Branch of State Grid Henan Electric Power Co Ltd
Original Assignee
Beijing University of Posts and Telecommunications
Information and Telecommunication Branch of State Grid Henan Electric Power Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications, Information and Telecommunication Branch of State Grid Henan Electric Power Co Ltd filed Critical Beijing University of Posts and Telecommunications
Priority to CN201910461367.6A priority Critical patent/CN110247795B/en
Publication of CN110247795A publication Critical patent/CN110247795A/en
Application granted granted Critical
Publication of CN110247795B publication Critical patent/CN110247795B/en

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/08 - Configuration management of networks or network elements
    • H04L 41/14 - Network analysis or design
    • H04L 41/145 - Network analysis or design involving simulating, designing, planning or modelling of a network


Abstract

An embodiment of the invention provides an intent-based cloud network resource service chain orchestration method and system. The method comprises: providing end-to-end services over cloud network resources based on a preset northbound interface reference architecture; and performing online orchestration and dynamic adjustment of the end-to-end services through a service chain orchestration framework based on deep reinforcement learning, wherein during the online orchestration and dynamic adjustment a preset multi-objective optimization problem model is solved to minimize service chain orchestration cost and delay. By providing the preset northbound interface reference architecture and the DRL-based SFC orchestration framework and constructing the multi-objective optimization problem model, the method and system minimize the long-term service chain orchestration cost.

Description

Intent-based cloud network resource service chain arranging method and system
Technical Field
The invention relates to the technical field of communications, and in particular to an intent-based cloud network resource service chain orchestration method and system.
Background
The rapid growth of Internet of Things services with different QoS requirements presents network operators with significant challenges in rapid delivery and QoS guarantee. Network Function Virtualization (NFV) and Software Defined Networking (SDN) have become key technologies for flexible resource allocation and dynamic service provisioning. However, both technologies still require manual operations to define service models and configure network details, which in turn requires highly skilled administrators and a significant amount of time. These manual tasks hinder reliability improvement and rapid service provisioning. Therefore, Intent-Based Networking (IBN) has been proposed to simplify low-level configuration and speed up service delivery.
One key aspect of supporting intent-based service provisioning is a vendor-independent and technology-independent northbound interface (NBI) for converting the customer's language into an abstract definition of the Service Function Chain (SFC). Another key step is online orchestration based on the abstract SFC model, to achieve a demand-driven, automatically adjusted style of service delivery.
However, the above methods still need to obtain complete network details in advance in order to reach a globally optimal solution, and such accurate information is often difficult to collect. Therefore, there is a need for an intent-based cloud network resource service chain orchestration method that solves the above problems.
Disclosure of Invention
To solve the above problems, embodiments of the present invention provide an intent-based cloud network resource service chain orchestration method and system that overcome, or at least partially solve, the problems above.
In a first aspect, an embodiment of the present invention provides an intent-based cloud network resource service chain orchestration method, including:
providing end-to-end services over cloud network resources based on a preset northbound interface reference architecture;
and performing online orchestration and dynamic adjustment of the end-to-end services based on a deep-reinforcement-learning service chain orchestration framework, wherein during the online orchestration and dynamic adjustment a preset multi-objective optimization problem model is solved to minimize service chain orchestration cost and delay.
Wherein the multi-objective optimization problem model is expressed as:

min { cost(server) + cost(link) }

s.t. C1, C2, C3, C4, C5, C6, C7

wherein cost(server) is the cost associated with server resources, cost(link) is the traffic forwarding cost, and C1, C2, C3, C4, C5, C6, C7 are resource constraints.
Wherein solving the preset multi-objective optimization problem model to minimize service chain orchestration cost and delay comprises:
obtaining the optimal solution of the multi-objective optimization problem model based on a preset double deep Q network (DDQN) algorithm.
Obtaining the optimal solution of the multi-objective optimization problem model based on the preset double deep Q network algorithm comprises the following steps:
initializing the service flows;
and performing service orchestration on the initialized service flows based on a preset double deep Q network.
Initializing the service flows comprises:
randomly selecting a placement scheme meeting the requirements from the set of feasible cloud servers;
determining a routing scheme between the VNFs based on a shortest-path selection algorithm;
and calculating the orchestration cost of all service chains.
Performing service orchestration on the initialized service flows based on the preset double deep Q network comprises the following steps:
after initializing the state space, inputting a state into the double deep Q network;
acquiring the action corresponding to the input state and calculating the target Q value;
and updating the input state based on a gradient descent method until a preset termination condition is reached.
In a second aspect, an embodiment of the present invention further provides an intent-based cloud network resource service chain orchestration system, including:
a service module for providing end-to-end services over cloud network resources based on a preset northbound interface reference architecture;
and an orchestration adjustment module for performing online orchestration and dynamic adjustment of the end-to-end services based on a deep-reinforcement-learning service chain orchestration framework, wherein during the online orchestration and dynamic adjustment a preset multi-objective optimization problem model is solved to minimize service chain orchestration cost and delay.
In a third aspect, an embodiment of the present invention provides an electronic device, including:
a processor, a memory, a communication interface, and a bus; the processor, the memory, and the communication interface communicate with one another through the bus; the memory stores program instructions executable by the processor, and the processor calls the program instructions to execute the above intent-based cloud network resource service chain orchestration method.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium storing computer instructions that cause a computer to execute the above intent-based cloud network resource service chain orchestration method.
According to the intent-based cloud network resource service chain orchestration method and system provided by the embodiments of the present invention, a preset northbound interface reference architecture and a DRL-based SFC orchestration framework are provided and a multi-objective optimization problem model is constructed, so that the long-term service chain orchestration cost is minimized.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings described below illustrate some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of an intent-based cloud network resource service chain orchestration method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the training steps at different learning rates according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the training steps under different algorithms according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the average delay of different algorithms according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the total cost of different algorithms according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an intent-based cloud network resource service chain orchestration system according to an embodiment of the present invention;
Fig. 7 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly below with reference to the accompanying drawings. The described embodiments are some, but not all, embodiments of the present invention. All other embodiments derived by those skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
Fig. 1 is a schematic flowchart of an intent-based cloud network resource service chain orchestration method according to an embodiment of the present invention. As shown in fig. 1, the method includes:
101. providing end-to-end services over cloud network resources based on a preset northbound interface reference architecture;
102. performing online orchestration and dynamic adjustment of the end-to-end services based on a deep-reinforcement-learning service chain orchestration framework, wherein during the online orchestration and dynamic adjustment a preset multi-objective optimization problem model is solved to minimize service chain orchestration cost and delay.
It should be noted that the application scenario of the embodiment of the present invention is how to realize service chain orchestration of cloud network resources in the Internet of Things. For this scenario, the embodiment of the present invention provides an IBN reference architecture to manage the Internet of Things infrastructure and provide end-to-end services across multiple domains. Specifically, in step 101, the IBN reference architecture provided by the embodiment of the present invention comprises a VNF Manager (VNFM) and NFV Orchestrator (NFVO), a management and control plane, and a data plane. The VNF manager and NFV orchestrator allow customers to declare IRs in a human-readable language, and then translate the declarative policies into high-level service abstractions, such as VNF properties, QoS functions, and thresholds, through the intent-based NBI. The management and control plane includes a Virtualized Infrastructure Manager (VIM), which maps the high-level abstract policies to low-level service chain orchestration policies, and a controller that coordinates the SDN controller (SDN_C) and the Cloud controller (Cloud_C) to automate SFC orchestration. The data plane receives control messages through the southbound interface (SBI) and provides the physical resources for VNF placement and traffic routing in the cloud domain, while sensors and actuators in the Internet of Things domain are responsible for data collection. The IBN reference architecture provided by the above embodiment of the present invention can manage the Internet of Things infrastructure and provide end-to-end services across multiple domains.
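By way of illustration only, the following Python sketch shows how such an intent-based NBI translation step might look; the field names, the service catalogue, and the SFCAbstraction fields are hypothetical assumptions, since the patent specifies only that declarative policies are translated into VNF properties, QoS functions, and thresholds:

```python
from dataclasses import dataclass

@dataclass
class SFCAbstraction:
    """High-level service abstraction produced by the intent-based NBI:
    VNF properties plus QoS thresholds, as described above."""
    vnf_chain: list            # ordered VNF types forming the chain
    rate_mbps: float           # requested data transmission rate r_s
    max_delay_ms: float        # delay threshold D_s
    source: str
    destination: str

def translate_intent(intent: dict) -> SFCAbstraction:
    """Map a human-readable intent record to an abstract SFC definition.
    The keys and the service catalogue below are illustrative assumptions."""
    catalogue = {
        "secure-video": ["firewall", "ids", "transcoder"],
        "iot-telemetry": ["firewall", "aggregator"],
    }
    return SFCAbstraction(
        vnf_chain=catalogue[intent["service"]],
        rate_mbps=intent.get("rate_mbps", 20.0),
        max_delay_ms=intent.get("max_delay_ms", 50.0),
        source=intent["from"],
        destination=intent["to"],
    )

# Example: a declarative customer intent expressed in human-readable terms.
print(translate_intent({"service": "iot-telemetry", "from": "sensor-gw",
                        "to": "cloud-app", "max_delay_ms": 40}))
```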
Further, in step 102, the embodiment of the present invention provides a service chain orchestration framework based on deep reinforcement learning, i.e. a DRL-based SFC orchestration framework. The DRL-based SFC orchestration framework obtains the SFC abstraction model and environment details through the intent-based NBI and network learning models. It then coordinates the controllers to apply the corresponding actions to the current network through technology-specific SBIs, while the network provides feedback in the form of rewards or penalties that prompts the framework to adjust its behavior, so that effective training leads the DRL-based SFC orchestration framework to an optimal policy; this optimal policy is the cloud network resource service chain orchestration scheme required by the embodiment of the present invention. In the process of obtaining the optimal policy, the embodiment of the present invention establishes a multi-objective optimization problem model to minimize SFC orchestration cost and delay, and the finally required cloud network resource service chain orchestration scheme is obtained from the optimal solution of this model.
In the multi-objective optimization problem model established by the embodiment of the present invention, an SFC is denoted by a triple s = {(v_so, v_de)_s, F_s, r_s}, where (v_so, v_de)_s is the source and destination node pair of s, and v_so generates traffic with data transmission rate r_s. F_s represents the specific SFC information, including the attributes, order, and connectivity of the VNFs. For each VNF f ∈ F_s, the CPU and memory resources required in the cloud server and the processing delay are denoted by cpu_f, mem_f, and d_f. The virtual link between VNF instances u and w is denoted by l_uw^s. D_s is defined as the delay threshold of s.

The physical network of the cloud domain is represented by a weighted undirected graph G = (N, L); the numbers of servers and links are denoted by M and H, respectively. A cloud server v has CPU computing and memory resources for placing VNF instances, denoted by Cap_cpu(v) and Cap_mem(v), respectively. A physical link l_ij between nodes i and j has a maximum data transmission rate b_ij and a transmission delay d_ij.

Two binary decision variables describe the orchestration: x_{f,s}^v = 1 if VNF f of s is mapped to cloud server v, and 0 otherwise; y_{uw,s}^{ij} = 1 if virtual link l_uw^s of s is mapped to physical link l_ij, and 0 otherwise.
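To make the notation concrete, a minimal Python sketch of these model elements follows; the container shapes and the example values are our own illustrative assumptions:

```python
import random

# Physical network G = (N, L): M servers with capacities Cap_cpu(v), Cap_mem(v),
# and H links with maximum rate b_ij (Mbps) and transmission delay d_ij (ms).
servers = {v: {"cap_cpu": 32, "cap_mem": random.randint(100, 200)}
           for v in range(10)}
links = {(0, 1): {"b": 1000.0, "d": 2.0},
         (1, 2): {"b": 1000.0, "d": 1.0}}

# One SFC s = {(v_so, v_de)_s, F_s, r_s} with per-VNF demands cpu_f, mem_f
# and processing delay d_f, plus the delay threshold D_s.
sfc = {"endpoints": (0, 2), "rate": 30.0, "delay_threshold": 50.0,
       "vnfs": [{"cpu": 2, "mem": 5, "d": 3.0},
                {"cpu": 4, "mem": 10, "d": 2.0}]}

# Decision variables: x[f] plays the role of x_{f,s}^v (hosting server of VNF f),
# and y[(u, w)] the role of y_{uw,s}^{ij} (physical links carrying l_uw^s).
x = {0: 0, 1: 1}
y = {(0, 1): [(0, 1)], (1, 2): [(1, 2)]}
```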
VNF instances require CPU computing resources and memory resources in the cloud servers. Load balancing also needs to be considered, because maintaining load balance among servers and links avoids traffic congestion and further improves network cost efficiency. Therefore, the embodiment of the present invention introduces two load balancing factors, Φ_v and Θ_ij, to indicate the load state of the network; their values are positively correlated with resource utilization. Φ_v is a piecewise function of U_v, linear or exponential depending on the range of U_v, where α1, β1, γ1 are positive parameters used to adjust the value of Φ_v in the cost calculation. U_v represents the weighted sum of CPU and memory utilization of server v and is calculated by:

U_v = e_p · (occupied CPU of v / Cap_cpu(v)) + e_m · (occupied memory of v / Cap_mem(v));

where e_p and e_m are the weights of CPU and memory usage, with e_p + e_m = 1. The associated cost of server resources, cost(server), prices the CPU and memory resources occupied by the placed VNFs at unit prices c1 and c2, respectively, weighted by the load balancing factor Φ_v of the hosting server.

Next, consider the forwarding cost of traffic routing. The load balancing factor Θ_ij is, analogously, a piecewise (linear or exponential) function of the link utilization, where α2, β2, γ2 are positive parameters used to adjust the value of Θ_ij. U_ij represents the utilization of the transmission rate of link l_ij, i.e. the ratio of the aggregate data rate routed over l_ij to its maximum data transmission rate:

U_ij = (Σ_s Σ_{l_uw^s∈s} y_{uw,s}^{ij} · r_s) / b_ij;

The traffic forwarding cost cost(link) prices the traffic carried on each physical link at c3, the unit price of link transmission rate, weighted by (e_l · Θ_ij + e_d · d_ij / D_s), the weighted sum of Θ_ij and the delay d_ij, with e_l + e_d = 1. cost(link) thus consists of three parts: Θ_ij, the fixed delay d_ij, and the unit price. As the above calculations show, nodes and links with larger remaining resources have relatively lower cost.
The total cost Cost_total of the SFC orchestration flow is calculated as follows:

Cost_total = cost(server) + cost(link);
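A hedged Python sketch of this cost model is given below; the piecewise form of Φ_v and Θ_ij (linear below a utilization threshold, exponential above it) and all parameter values are assumptions for illustration, since the patent states only the qualitative behavior of the load balancing factors:

```python
import math

def lb_factor(util, alpha, beta, gamma, threshold=0.8):
    """Load balancing factor Phi_v / Theta_ij: assumed piecewise form,
    linear below a utilization threshold and exponential above it."""
    return alpha * util if util <= threshold else beta * math.exp(gamma * util)

def server_cost(util_cpu, util_mem, cpu_used, mem_used,
                e_p=0.5, e_m=0.5, c1=1.0, c2=0.1):
    # U_v: weighted sum of CPU and memory utilization, with e_p + e_m = 1.
    u_v = e_p * util_cpu + e_m * util_mem
    phi_v = lb_factor(u_v, alpha=1.0, beta=0.5, gamma=2.0)
    # Occupied resources priced at c1 (CPU) and c2 (memory), weighted by Phi_v.
    return phi_v * (c1 * cpu_used + c2 * mem_used)

def link_cost(util_link, rate, d_ij, d_s, e_l=0.5, e_d=0.5, c3=0.01):
    theta_ij = lb_factor(util_link, alpha=1.0, beta=0.5, gamma=2.0)
    # Weighted sum of Theta_ij and the normalized fixed delay, e_l + e_d = 1.
    return c3 * rate * (e_l * theta_ij + e_d * d_ij / d_s)

# Cost_total = cost(server) + cost(link) for one placed VNF and one used link.
total = server_cost(0.4, 0.3, cpu_used=2, mem_used=5) \
        + link_cost(0.2, rate=30.0, d_ij=2.0, d_s=50.0)
print(round(total, 3))
```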
the resource constraints are given by:
C1: Σ_s Σ_{f∈F_s} x_{f,s}^v · cpu_f ≤ Cap_cpu(v), for every cloud server v ∈ N (server CPU capacity);

C2: Σ_s Σ_{f∈F_s} x_{f,s}^v · mem_f ≤ Cap_mem(v), for every cloud server v ∈ N (server memory capacity);

C3: Σ_s Σ_{l_uw^s∈s} y_{uw,s}^{ij} · r_s ≤ b_ij, for every physical link l_ij ∈ L (link transmission rate capacity);

C4: Σ_{v∈N} x_{f,s}^v = 1, for every f ∈ F_s and every s (each VNF is placed on exactly one cloud server);

C5: flow conservation of the mapped virtual links, i.e. for each virtual link l_uw^s the selected physical links y_{uw,s}^{ij} form a continuous path between the servers hosting u and w;

C7: x_{f,s}^v ∈ {0,1} and y_{uw,s}^{ij} ∈ {0,1} (binary decision variables).

The delay constraint is given by:

C6: Σ_{f∈F_s} d_f + Σ_{l_ij∈L} Σ_{l_uw^s∈s} y_{uw,s}^{ij} · d_ij ≤ D_s, for every s (the end-to-end delay of each SFC stays within its threshold).
therefore, a multi-objective optimization problem model aiming at improving cost efficiency and ensuring QoS is established:
min { cost(server) + cost(link) }

s.t. C1, C2, C3, C4, C5, C6, C7
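For concreteness, the following sketch checks a candidate orchestration against these constraints; it follows the constraint forms reconstructed above, which are standard SFC-embedding constraints rather than formulas quoted verbatim from the patent:

```python
def check_constraints(servers, links, sfcs, x, y):
    """Verify C1-C4 and C6 for a candidate placement x and routing y.

    x[(s_id, f_id)] -> hosting server; y[(s_id, u, w)] -> list of link keys.
    C5 (path continuity) is assumed to hold by construction of y, and C7
    holds implicitly because x and y encode hard 0/1 decisions.
    """
    cpu = {v: 0 for v in servers}
    mem = {v: 0 for v in servers}
    load = {l: 0.0 for l in links}
    for s_id, s in enumerate(sfcs):
        delay = sum(f["d"] for f in s["vnfs"])       # VNF processing delays
        for f_id, f in enumerate(s["vnfs"]):         # C4: one server per VNF
            v = x[(s_id, f_id)]
            cpu[v] += f["cpu"]
            mem[v] += f["mem"]
        for key, path in y.items():
            if key[0] != s_id:
                continue
            for l in path:
                load[l] += s["rate"]                 # traffic carried on link
                delay += links[l]["d"]               # link transmission delays
        if delay > s["delay_threshold"]:             # C6: end-to-end delay
            return False
    return (all(cpu[v] <= servers[v]["cap_cpu"] for v in servers)      # C1
            and all(mem[v] <= servers[v]["cap_mem"] for v in servers)  # C2
            and all(load[l] <= links[l]["b"] for l in links))          # C3
```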
according to the method and the system for arranging the cloud network resource service chain based on the intention, provided by the embodiment of the invention, a multi-target optimization problem model is constructed by providing a preset northbound interface reference framework and an SFC arrangement framework based on the DRL, so that the long-term service chain arrangement cost is reduced to the maximum extent.
On the basis of the above embodiment, solving the preset multi-objective optimization problem model to minimize service chain orchestration cost and delay comprises:
obtaining the optimal solution of the multi-objective optimization problem model based on a preset double deep Q network (DDQN) algorithm.
For the multi-objective optimization problem model provided in the above embodiment, the embodiment of the present invention designs a double deep Q network algorithm to obtain its optimal solution.
In particular, the embodiment of the present invention formulates the optimization problem as a Markov decision process {ST, A, Rd, P}, where ST represents the state space, A represents the action space, Rd is defined as the reward function, and P is the state transition probability. These are defined as follows:
The state space: each agent has a corresponding orchestration scheme at a given time. The state is defined by whether the QoS requirements of all SFCs are satisfied and is given by:

ST = {st_1, st_2, ..., st_K};

where st_s ∈ {0,1} and K is the number of SFCs. st_s = 1 indicates that the delay requirement of SFC s can be satisfied under the current orchestration scheme; otherwise, st_s = 0. The number of all states is 2^K.
The action space: a transition between two states of an SFC means that the VNF placement or routing is changed by taking an action. The action set is defined as:

A = {X, Y};

where X is the set of available actions for placing the VNFs of an SFC service flow; given X, the routes between the VNFs can be obtained by a shortest path algorithm. Y is the set of available actions for traffic routing between the VNFs. The action space thus covers both VNF placement and traffic routing, and the number of actions of s is 2^(M+H).
The reward: if the agent takes an action a_s, state st_s transitions to a new state st'_s, and the agent obtains an immediate reward Rd_s(a_s, st, st'), defined as the cost reduction achieved by transferring from st_s to st'_s:

Rd_s(a_s, st, st') = cost(st_s) - cost(st'_s);

where cost(st_s) and cost(st'_s) represent the orchestration costs of states st_s and st'_s. Accumulating the long-term reward Rd_s(a_s, st, st') yields the highest cost efficiency, and a policy π gives, for the current state, the corresponding action the SFC should take; the best action is the one selected by the optimal policy.

Q_s(st, a) is defined as the state-action value function and represents the expected cumulative discounted reward of the specified state-action pair:

Q_s(st, a) = E[ Σ_{t≥0} γ^t · Rd_s(a_t, st_t, st_{t+1}) | st_0 = st, a_0 = a ];

where γ is the discount factor, indicating the importance of future rewards in learning. From the Bellman equation, the optimal value function can be obtained as follows:

Q*_s(st, a) = Σ_{st'} P_s(a_s, st, st') · [ Rd_s(a_s, st, st') + γ · max_{a'∈A} Q*_s(st', a') ];

where P_s(a_s, st, st') represents the probability of transitioning from state st to state st' under action a_s. Therefore, the optimal policy π* can be obtained based on the above formula and is expressed as:

π*(st) = argmax_{a∈A} Q*_s(st, a).

In practice, it is often difficult to obtain accurate transition probabilities. Therefore, Q-learning is designed to find the optimal solution in an iterative manner based on the available information; it updates the Q-value function by:

Q_s(st, a) ← Q_s(st, a) + η · [ Rd_s(a_s, st, st') + γ · max_{a'∈A} Q_s(st', a') - Q_s(st, a) ];

where the learning rate η controls the update rate of Q_s(st, a) and thus affects the learning efficiency.
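As a minimal illustration, the tabular Q-learning update can be written in Python as follows (state and action encodings are assumed to be hashable; the η and γ values are arbitrary):

```python
from collections import defaultdict

Q = defaultdict(float)       # tabular Q[(state, action)]
eta, gamma = 0.1, 0.9        # learning rate and discount factor (arbitrary)

def q_learning_step(st, a, reward, st_next, actions):
    """One Q-learning iteration: move Q(st, a) toward the bootstrapped target."""
    best_next = max(Q[(st_next, a2)] for a2 in actions)
    Q[(st, a)] += eta * (reward + gamma * best_next - Q[(st, a)])
```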
It can be appreciated that Q-learning iterates over a Q-value table, so it is difficult to obtain the optimal solution when the state and action spaces are very large. To overcome this weakness, the Deep Q Network (DQN) provided by the embodiment of the present invention approximates the Q-value function by a deep neural network (DNN) rather than a Q-value table. The DNN can be viewed as a deep model with multiple processing layers; θ represents the weights of these layers, which are updated by gradient descent. The value-function approximation used by DQN is:

Q_s(st, a, θ) ≈ Q_s(st, a).

In addition, DQN utilizes experience replay and an independent target network to eliminate data correlations. The target network computes the target Q value based on the weights θ^-. The difference between θ and θ^- is that θ is updated in every iteration, whereas θ^- is updated only after a fixed number of iterations. The target Q function is given by:
Target_Q_s = Rd_s(a_s, st, st') + γ · max_{a'∈A} Q_s(st', a', θ^-);

The loss function of DQN is defined as the mean squared error and is calculated by:

L(θ) = E[ (Target_Q_s - Q_s(st, a, θ))² ];

In each iteration, the weights θ are updated along the gradient ∇Q_s(st, a, θ) so as to minimize the loss function. The update is calculated by:

θ' = θ + [ Target_Q_s - Q_s(st, a, θ) ] · ∇Q_s(st, a, θ);

It is worth emphasizing that both DQN and Q-learning use the maximum function to calculate Target_Q_s, which leads to an overestimation problem. As an improvement, DDQN first finds, in the current network, the action with the largest Q value, rather than directly taking the largest Q value over all actions in the target network:

a_max(st', θ) = argmax_{a'∈A} Q_s(st', a', θ);

and then rewrites Target_Q_s using the selected action a_max(st', θ). The new Target_Q_s in DDQN is calculated by:

Target_Q_s = Rd_s(a_s, st, st') + γ · Q_s(st', a_max(st', θ), θ^-);
similarly, L (θ) and θ' also need to be updated together in DDQN.
Specifically, the double deep Q network algorithm provided by the embodiment of the present invention comprises two parts. The first part is service flow initialization, which includes the following steps:
1. For each f ∈ F_s, randomly select a feasible placement from σ_{f,s}, where σ_{f,s} is the set of feasible cloud servers for f ∈ F_s.
2. Obtain the routing scheme between the VNFs by a shortest path algorithm.
3. Calculate the orchestration cost of all the SFCs.
A sketch of this initialization procedure is given below.
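The sketch uses networkx for the shortest-path step; the graph construction, the feasible-server sets, and the key conventions are illustrative assumptions:

```python
import random
import networkx as nx

def init_service_flows(G, sfcs, feasible_servers):
    """Random feasible placement plus shortest-path routing for each SFC.
    feasible_servers[(s_id, f_id)] plays the role of sigma_{f,s}."""
    x, y = {}, {}
    for s_id, s in enumerate(sfcs):
        # Step 1: randomly select a feasible placement for every f in F_s.
        hosts = [random.choice(feasible_servers[(s_id, f_id)])
                 for f_id in range(len(s["vnfs"]))]
        for f_id, v in enumerate(hosts):
            x[(s_id, f_id)] = v
        # Step 2: shortest-path routing between consecutive hops of the chain
        # (nx.shortest_path returns the node sequence of the chosen route).
        hops = [s["endpoints"][0], *hosts, s["endpoints"][1]]
        for u, w in zip(hops, hops[1:]):
            y[(s_id, u, w)] = nx.shortest_path(G, u, w, weight="delay")
    return x, y
```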
The second part is service orchestration using the DDQN network, which includes the following steps:
1. Initialize the state space to ST = {st_1, st_2, ..., st_K}.
2. For each SFC s, feed st_s as input into the Q-value network Q_s(st_s, a, θ).
3. Obtain the action through an ε-greedy policy, which selects a random action with probability ε and the best action with probability 1 - ε.
4. Store all SFC transitions in the experience replay (ER) memory.
5. Each agent samples (st, a, Rd, st') from the ER and calculates its target Q value depending on whether st' is a terminal state.
6. Each agent performs a gradient descent step on (Target_Q_s - Q_s(st, a, θ))² with respect to the weights θ of the Q-value network. After every u_f steps, θ^- is replaced with θ.
7. If the current state is ST = {1, 1, ..., 1}, training terminates.
A training-loop sketch corresponding to these steps is given below.
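These steps correspond to a conventional DDQN training loop. A compressed, framework-agnostic sketch follows; the environment interface (env.reset/env.step) and the network helpers (value, gradient_step, load_from) are assumed placeholders, not components defined by the patent:

```python
import random
from collections import deque

def train_ddqn(env, q, q_target, actions, episodes=100,
               eps=0.1, gamma=0.9, batch=32, u_f=20):
    """Skeleton of the DDQN-based orchestration loop (steps 1-7 above)."""
    replay = deque(maxlen=10000)                    # experience replay memory
    step = 0
    for _ in range(episodes):
        st = env.reset()                            # ST = (st_1, ..., st_K)
        while st != tuple([1] * len(st)):           # step 7: stop at all-ones
            if random.random() < eps:               # step 3: epsilon-greedy
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda a2: q.value(st, a2))
            st2, rd = env.step(a)                   # apply action, get reward
            replay.append((st, a, rd, st2))         # step 4: store transition
            if len(replay) >= batch:                # steps 5-6: sample, learn
                for s0, a0, r0, s1 in random.sample(replay, batch):
                    a_max = max(actions, key=lambda a2: q.value(s1, a2))
                    target = r0 + gamma * q_target.value(s1, a_max)
                    q.gradient_step(s0, a0, target)  # descend (target - Q)^2
            step += 1
            if step % u_f == 0:
                q_target.load_from(q)               # replace theta^- with theta
            st = st2
```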
Combining the above procedures, the DDQN algorithm designed by the embodiment of the present invention can obtain the optimal solution of the multi-objective optimization problem model; it offers good cost efficiency and convergence, and can guarantee QoS requirements while balancing traffic.
To verify the performance of the method provided by the embodiment of the present invention, simulation experiments were carried out. Specifically, the embodiment of the present invention simulates the proposed algorithm on a cloud network composed of 30 nodes (10 cloud servers and 20 switches) and 50 links. The maximum data transmission rate of each link is fixed at 1 Gbps, and link transmission delays are 1-3 ms. The CPU and memory resources of each server are set to 32 and 100-200 GB, respectively. Each SFC requires 2-4 VNFs with a data transmission rate of 20-50 Mbps. Each VNF requires 2-4 CPUs and 5-10 GB of memory, with a processing delay of 2-5 ms. The DNN consists of three hidden fully connected layers with 64, 32 neurons. The simulation mainly evaluates the convergence performance and the optimization performance of the algorithm.
The convergence performance was first evaluated at different learning rates. Fig. 2 is a schematic diagram of the training steps at different learning rates according to an embodiment of the present invention. As shown in fig. 2, the DDQN algorithm requires a large number of training steps at the beginning of the episodes for all three learning rates. The number of training steps tends to decrease as the episodes progress, reflecting the good convergence of DDQN. On the other hand, the learning rate is a key factor in convergence performance. For example, at episode 90, the DDQN with learning rate 0.001 requires 92 training steps, while the variants with learning rates 0.01 and 0.1 require only 40 and 26 steps, respectively, to obtain the optimal solution.
Then, the convergence performance of different reinforcement learning algorithms is compared; fig. 3 is a schematic diagram of the training steps under different algorithms provided by an embodiment of the present invention. As shown in fig. 3, Q-learning has lower convergence performance across episodes because it takes fewer measures to remove data correlations. In contrast, DQN and DDQN employ experience replay and independent target networks to resolve data correlations, so they require fewer training steps than Q-learning; for example, the training steps of Q-learning, DQN, and DDQN are 51, 32, and 26, respectively. Solving the overestimation problem is a further advantage of DDQN over DQN.
Further, the embodiment of the present invention evaluates the average delay, total cost, and load balancing state, using the following two algorithms for comparison. QoS-driven placement algorithm (QPA): it first obtains an end-to-end path for the SFC and then places the VNFs along the path to minimize cost and delay while meeting the resource requirements. Random-fit placement algorithm (RPA): it places the VNFs in a random-fit manner, considering all solutions that satisfy the constraints, randomly selecting one of them, and then also randomly selecting a path. Fig. 4 is a schematic diagram of the average delay of the different algorithms provided by an embodiment of the present invention. As shown in fig. 4, the average delay of all four algorithms is low at the beginning because the number of SFC requests is small. As the number of SFCs increases, the delay increases by different amounts. When the number of SFCs is 200, the average delays of DDQN, DQN, QPA, and RPA are 38 ms, 43 ms, 48 ms, and 56 ms, respectively. Owing to its randomness, RPA has poor delay performance. Although DQN and QPA take delay minimization into account, they ignore the impact of load balancing, which can lead to network congestion. In contrast, DDQN has better delay performance across different numbers of SFC requests.
FIG. 5 is a schematic diagram of the total cost of the different algorithms provided by an embodiment of the present invention. As shown in fig. 5, the total cost of DDQN, DQN, and QPA is always lower than that of RPA across different numbers of SFCs, since they all consider cost minimization in the objective function. For example, when the number of SFCs is 300, the cost of RPA is 16%, 20%, and 27% higher than that of QPA, DQN, and DDQN, respectively. Among these three algorithms, DDQN has the best cost efficiency, as it overcomes the overestimation problem and accounts for load balancing, which QPA and DQN ignore. Therefore, DDQN achieves the best delay and cost in SFC orchestration.
For the load balancing state, taking 300 SFCs as an example, the variance of DDQN's link utilization is 62%, 55%, and 41% lower than that of RPA, QPA, and DQN, respectively. Similarly, the variance of DDQN's server utilization is 81%, 65%, and 48% lower. In the SFC orchestration optimization model, Φ_v and Θ_ij are designed to maintain network balance, so DDQN can effectively avoid network congestion.
Fig. 6 is a schematic structural diagram of an intent-based cloud network resource service chain orchestration system according to an embodiment of the present invention. As shown in fig. 6, the system includes a service module 601 and an orchestration adjustment module 602, wherein:
the service module 601 is used for providing end-to-end services over cloud network resources based on a preset northbound interface reference architecture;
the orchestration adjustment module 602 is configured to perform online orchestration and dynamic adjustment of the end-to-end services based on a deep-reinforcement-learning service chain orchestration framework, wherein during the online orchestration and dynamic adjustment a preset multi-objective optimization problem model is solved to minimize service chain orchestration cost and delay.
Specifically, the manner in which the service module 601 and the orchestration adjustment module 602 execute the technical solution of the intent-based cloud network resource service chain orchestration method embodiment shown in fig. 1 is similar to that embodiment; the implementation principles and technical effects are likewise similar and are not repeated here.
According to the intent-based cloud network resource service chain orchestration system provided by the embodiment of the present invention, a preset northbound interface reference architecture and a DRL-based SFC orchestration framework are provided and a multi-objective optimization problem model is constructed, so that the long-term service chain orchestration cost is minimized.
On the basis of the above embodiment, the multi-objective optimization problem model is expressed as:

min { cost(server) + cost(link) }

s.t. C1, C2, C3, C4, C5, C6, C7

wherein cost(server) is the cost associated with server resources, cost(link) is the traffic forwarding cost, and C1, C2, C3, C4, C5, C6, C7 are resource constraints.
On the basis of the above embodiment, the orchestration adjustment module includes:
a DDQN unit for obtaining the optimal solution of the multi-objective optimization problem model based on a preset double deep Q network algorithm.
On the basis of the above embodiment, the DDQN unit includes:
an initialization part for initializing the service flows;
and a service orchestration part for performing service orchestration on the initialized service flows based on a preset double deep Q network.
On the basis of the above embodiment, the initialization part is specifically configured to:
randomly select a placement scheme meeting the requirements from the set of feasible cloud servers;
determine a routing scheme between the VNFs based on a shortest-path selection algorithm;
and calculate the orchestration cost of all service chains.
On the basis of the above embodiment, the service orchestration part is specifically configured to:
after initializing the state space, input a state into the double deep Q network;
acquire the action corresponding to the input state and calculate the target Q value;
and update the input state based on a gradient descent method until a preset termination condition is reached.
An embodiment of the present invention provides an electronic device comprising at least one processor and at least one memory communicatively coupled to the processor.
Fig. 7 is a block diagram of an electronic device according to an embodiment of the present invention. Referring to fig. 7, the electronic device includes: a processor 701, a communications interface 702, a memory 703, and a bus 704, where the processor 701, the communications interface 702, and the memory 703 communicate with one another through the bus 704. The processor 701 may call logic instructions in the memory 703 to perform the following method: providing end-to-end services over cloud network resources based on a preset northbound interface reference architecture; and performing online orchestration and dynamic adjustment of the end-to-end services based on a deep-reinforcement-learning service chain orchestration framework, wherein during the online orchestration and dynamic adjustment a preset multi-objective optimization problem model is solved to minimize service chain orchestration cost and delay.
An embodiment of the present invention further discloses a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium. The computer program includes program instructions which, when executed by a computer, enable the computer to execute the methods provided by the above method embodiments, for example: providing end-to-end services over cloud network resources based on a preset northbound interface reference architecture; and performing online orchestration and dynamic adjustment of the end-to-end services based on a deep-reinforcement-learning service chain orchestration framework, wherein during the online orchestration and dynamic adjustment a preset multi-objective optimization problem model is solved to minimize service chain orchestration cost and delay.
An embodiment of the present invention provides a non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the methods provided by the above method embodiments, for example: providing end-to-end services over cloud network resources based on a preset northbound interface reference architecture; and performing online orchestration and dynamic adjustment of the end-to-end services based on a deep-reinforcement-learning service chain orchestration framework, wherein during the online orchestration and dynamic adjustment a preset multi-objective optimization problem model is solved to minimize service chain orchestration cost and delay.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, or by hardware. Based on this understanding, the above technical solutions can be embodied in the form of a software product stored in a computer-readable storage medium such as ROM/RAM, a magnetic disk, or an optical disc, including instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods of the embodiments or parts thereof.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (7)

1. An intent-based cloud network resource service chain orchestration method, characterized by comprising the following steps:
providing end-to-end services over cloud network resources based on a preset northbound interface reference architecture;
performing online orchestration and dynamic adjustment of the end-to-end services based on a deep-reinforcement-learning service chain orchestration framework, wherein during the online orchestration and dynamic adjustment a preset multi-objective optimization problem model is solved to minimize service chain orchestration cost and delay;
wherein solving the preset multi-objective optimization problem model to minimize service chain orchestration cost and delay comprises:
obtaining the optimal solution of the multi-objective optimization problem model based on a preset double deep Q network algorithm;
the multi-objective optimization problem model is expressed as:

min { cost(server) + cost(link) }, s.t. C1, C2, C3, C4, C5, C6, C7

wherein cost(server) is the cost associated with server resources, cost(link) is the traffic forwarding cost, and C1, C2, C3, C4, C5, C6, C7 are resource constraints;
wherein:

C1: Σ_s Σ_{f∈F_s} x_{f,s}^v · cpu_f ≤ Cap_cpu(v), for every cloud server v ∈ N;

C2: Σ_s Σ_{f∈F_s} x_{f,s}^v · mem_f ≤ Cap_mem(v), for every cloud server v ∈ N;

C3: Σ_s Σ_{l_uw^s∈s} y_{uw,s}^{ij} · r_s ≤ b_ij, for every physical link l_ij ∈ L;

C4: Σ_{v∈N} x_{f,s}^v = 1, for every f ∈ F_s and every s;

C5: flow conservation of the mapped virtual links, i.e. for each virtual link l_uw^s the selected physical links y_{uw,s}^{ij} form a continuous path between the servers hosting u and w;

C6: Σ_{f∈F_s} d_f + Σ_{l_ij∈L} Σ_{l_uw^s∈s} y_{uw,s}^{ij} · d_ij ≤ D_s, for every s;

C7: x_{f,s}^v ∈ {0,1} and y_{uw,s}^{ij} ∈ {0,1};
wherein s = {(v_so, v_de)_s, F_s, r_s} denotes an SFC; x_{f,s}^v = 1 indicates that VNF f of s is mapped to cloud server v, and x_{f,s}^v = 0 indicates that VNF f of s is not mapped to cloud server v; y_{uw,s}^{ij} = 1 indicates that virtual link l_uw^s of s is mapped to physical link l_ij, and y_{uw,s}^{ij} = 0 indicates that virtual link l_uw^s of s is not mapped to physical link l_ij; l_uw^s represents the virtual link between VNF instances u and w, and l_ij represents the physical link between nodes i and j; f represents an instance f of a VNF, u represents an instance u of a VNF, and w represents an instance w of a VNF.
2. The intent-based cloud network resource service chain orchestration method according to claim 1, wherein obtaining the optimal solution of the multi-objective optimization problem model based on the preset double deep Q network algorithm comprises:
initializing the service flows;
and performing service orchestration on the initialized service flows based on a preset double deep Q network.
3. The intent-based cloud network resource service chain orchestration method according to claim 2, wherein initializing the service flows comprises:
randomly selecting a placement scheme meeting the requirements from the set of feasible cloud servers;
determining a routing scheme between the VNFs based on a shortest-path selection algorithm;
and calculating the orchestration cost of all service chains.
4. The intent-based cloud network resource service chain orchestration method according to claim 2, wherein performing service orchestration on the initialized service flows based on the preset double deep Q network comprises:
after initializing the state space, inputting a state into the double deep Q network;
acquiring the action corresponding to the input state and calculating the target Q value;
and updating the input state based on a gradient descent method until a preset termination condition is reached.
5. An intent-based cloud network resource service chain orchestration system, characterized by comprising:
a service module for providing end-to-end services over cloud network resources based on a preset northbound interface reference architecture;
an orchestration adjustment module for performing online orchestration and dynamic adjustment of the end-to-end services based on a deep-reinforcement-learning service chain orchestration framework, wherein during the online orchestration and dynamic adjustment a preset multi-objective optimization problem model is solved to minimize service chain orchestration cost and delay;
wherein solving the preset multi-objective optimization problem model to minimize service chain orchestration cost and delay comprises: obtaining the optimal solution of the multi-objective optimization problem model based on a preset double deep Q network algorithm;
the multi-objective optimization problem model is expressed as:

min { cost(server) + cost(link) }, s.t. C1, C2, C3, C4, C5, C6, C7

wherein cost(server) is the cost associated with server resources, cost(link) is the traffic forwarding cost, and C1, C2, C3, C4, C5, C6, C7 are resource constraints;
wherein:

C1: Σ_s Σ_{f∈F_s} x_{f,s}^v · cpu_f ≤ Cap_cpu(v), for every cloud server v ∈ N;

C2: Σ_s Σ_{f∈F_s} x_{f,s}^v · mem_f ≤ Cap_mem(v), for every cloud server v ∈ N;

C3: Σ_s Σ_{l_uw^s∈s} y_{uw,s}^{ij} · r_s ≤ b_ij, for every physical link l_ij ∈ L;

C4: Σ_{v∈N} x_{f,s}^v = 1, for every f ∈ F_s and every s;

C5: flow conservation of the mapped virtual links, i.e. for each virtual link l_uw^s the selected physical links y_{uw,s}^{ij} form a continuous path between the servers hosting u and w;

C6: Σ_{f∈F_s} d_f + Σ_{l_ij∈L} Σ_{l_uw^s∈s} y_{uw,s}^{ij} · d_ij ≤ D_s, for every s;

C7: x_{f,s}^v ∈ {0,1} and y_{uw,s}^{ij} ∈ {0,1};
wherein s = {(v_so, v_de)_s, F_s, r_s} denotes an SFC; x_{f,s}^v = 1 indicates that VNF f of s is mapped to cloud server v, and x_{f,s}^v = 0 indicates that VNF f of s is not mapped to cloud server v; y_{uw,s}^{ij} = 1 indicates that virtual link l_uw^s of s is mapped to physical link l_ij, and y_{uw,s}^{ij} = 0 indicates that virtual link l_uw^s of s is not mapped to physical link l_ij; l_uw^s represents the virtual link between VNF instances u and w, and l_ij represents the physical link between nodes i and j; f represents an instance f of a VNF, u represents an instance u of a VNF, and w represents an instance w of a VNF.
6. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the intent-based cloud network resource service chain orchestration method according to any one of claims 1 to 4.
7. A non-transitory computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the intent-based cloud network resource service chain orchestration method according to any one of claims 1 to 4.
CN201910461367.6A 2019-05-30 2019-05-30 Intent-based cloud network resource service chain arranging method and system Active CN110247795B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910461367.6A CN110247795B (en) 2019-05-30 2019-05-30 Intent-based cloud network resource service chain arranging method and system


Publications (2)

Publication Number Publication Date
CN110247795A CN110247795A (en) 2019-09-17
CN110247795B true CN110247795B (en) 2020-09-25

Family

ID=67885322

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910461367.6A Active CN110247795B (en) 2019-05-30 2019-05-30 Intent-based cloud network resource service chain arranging method and system

Country Status (1)

Country Link
CN (1) CN110247795B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112583629B (en) * 2019-09-30 2022-06-10 华为技术有限公司 Information processing method, related equipment and computer storage medium
CN111756853A (en) * 2020-06-30 2020-10-09 北京来也网络科技有限公司 RPA simulation training method and device, computing equipment and storage medium
CN111865686A (en) * 2020-07-20 2020-10-30 北京百度网讯科技有限公司 Cloud product capacity expansion method, device, equipment and storage medium
CN112637064A (en) * 2020-12-31 2021-04-09 广东电网有限责任公司电力调度控制中心 Service chain arranging method based on improved depth-first search algorithm
CN114827284B (en) * 2022-04-21 2023-10-03 中国电子技术标准化研究院 Service function chain arrangement method and device in industrial Internet of things and federal learning system


Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
US10411989B2 (en) * 2014-09-22 2019-09-10 Wolting Holding B.V. Compiler for and method of software defined networking, storage and compute determining physical and virtual resources
US9882833B2 (en) * 2015-09-28 2018-01-30 Centurylink Intellectual Property Llc Intent-based services orchestration
CN108900419B (en) * 2018-08-17 2020-04-17 北京邮电大学 Routing decision method and device based on deep reinforcement learning under SDN framework
CN109358971B (en) * 2018-10-30 2020-06-23 电子科技大学 Rapid and load-balancing service function chain deployment method in dynamic network environment

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
CN108600101A (en) * 2018-03-21 2018-09-28 北京交通大学 A kind of network for the optimization of end-to-end time delay performance services cross-domain method of combination
CN108536144A (en) * 2018-04-10 2018-09-14 上海理工大学 A kind of paths planning method of fusion dense convolutional network and competition framework
CN109245916A (en) * 2018-08-15 2019-01-18 西安电子科技大学 A kind of the cloud access net system and method for intention driving

Non-Patent Citations (6)

Title
Markov Approximation Method for Optimal Service Orchestration in IoT Network; Wenchen He; IEEE Access; 2019-04-12; vol. 7, no. 1, pp. 49540-49549 *
Performance of Intent-based Virtualized Network Infrastructure Management; Franco Callegati; IEEE ICC 2017 SAC Symposium SDN & NFV Track; 2017-07-31; abstract, sections I-IV, conclusion *
Tree-structured DDQN method for extracting target candidate regions based on an action-attention policy (基于动作注意策略的树形DDQN目标候选区域提取方法); Zuo Guoyu; Journal of Electronics & Information Technology (电子与信息学报); 2019-03-31; vol. 41, no. 3, pp. 667-670, 673 *
Service chain mapping algorithm based on reinforcement learning (基于强化学习的服务链映射算法); Wei Liang; Journal on Communications (通信学报); 2018-01-31; vol. 39, no. 1, pp. 90-96, 99 *
Online scheduling of virtual network function service chains based on software-defined networking (基于软件定义网络的虚拟网络功能服务链在线调度技术); Xiao Yikai; Wanfang Dissertations (万方学位论文); 2018-12-18; abstract, pp. 8-11, 24-30, conclusion *
A survey of reinforcement learning (强化学习研究综述); Ma Chengqian; Command Control & Simulation (指挥控制与仿真); 2018-12-31; vol. 40, no. 6, pp. 69-71 *

Also Published As

Publication number Publication date
CN110247795A (en) 2019-09-17


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant