CN112543119A - Service function chain reliability deployment method based on deep reinforcement learning

Info

Publication number: CN112543119A
Application number: CN202011359654.5A
Authority: CN
Inventors: 王珂; 曲桦; 赵季红
Original assignee: Xian Jiaotong University
Current assignee: Xian Jiaotong University
Priority date: 2020-11-27
Filing date: 2020-11-27
Publication date: 2021-03-23
Anticipated expiration: 2040-11-27
Also published as: CN112543119B

Abstract

The invention discloses a service function chain reliability deployment method based on deep reinforcement learning, and belongs to the fields of communication technology and machine learning. The method achieves fault-tolerant service function chain deployment by separately solving two problems, priority awareness and reliable deployment of the service function chain, and uses deep reinforcement learning to adapt to complex and dynamic service function chain deployment requests. The method selects the backup scheme for the virtual network functions on a service function chain by dynamically sensing the priority of the service function chain, and uses deep reinforcement learning to determine the deployment node and the backup node of each virtual network function in the service function chain at the same time, thereby realizing reliable deployment of the service function chain. The invention takes the reliability of the service function chain into account, improves the effective utilization rate of network resources, considers the transmission delay and the load balance of nodes and links, and is suitable for scenarios with dynamic and complex service function chain deployment requests.

Description

Service function chain reliability deployment method based on deep reinforcement learning

Technical Field

The invention belongs to the field of communication technology and machine learning, and relates to a service function chain reliability deployment method based on deep reinforcement learning.

Background

With the advent of the 5G era, the number and diversity of network service requests have increased dramatically, which poses a significant challenge to the reliable deployment of service function chains. In today's operator networks, a large number of network functions, such as firewalls, network address translation (NAT) and deep packet inspection (DPI), are distributed across different server nodes. These network functions are combined in a certain order, and the resulting function sequence is called a Service Function Chain (SFC).

A traditional service function chain deployment relies on a routing policy formulated by the service operator, so that traffic passes through a plurality of virtual network functions in a certain order to provide the required network service. This approach has several drawbacks: it lacks fault tolerance measures, so the reliability requirements of service function chain deployment are difficult to meet; different service function chains have different resource demands and therefore different priorities, so a single strategy can hardly meet all deployment requirements; and the virtual network functions are pre-deployed on server nodes, which makes it difficult to serve dynamic and complex service function chain deployment requests. With the rapid development of Software Defined Networking (SDN) and Network Function Virtualization (NFV) technologies, service function chains based on a converged SDN/NFV architecture have received a lot of attention. The joint SDN/NFV architecture uses the SDN controller as the control plane of the network and NFV, as a new general-purpose infrastructure, as the data plane. It combines the advantages of separating control from forwarding, centralized control, network resource abstraction and network programmability, avoids vendor lock-in, and can provide an elastic network flexibly, quickly and on demand, so as to better serve diversified application scenarios and differentiated quality-of-service requirements.

The existing deployment methods have some shortcomings with respect to the dynamic, reliable deployment of service function chains. For example, Chinese invention patent "CN 109586982B" proposes a virtual network function backup method that determines the backup requirement of each function node from an evaluation value of each function node in a deployed service function chain. However, this method can hardly back up function nodes in real time for dynamically arriving service function chain deployment requests, and it does not prioritize service function chains, which easily wastes network resources. Chinese invention patent "CN 110460465A" proposes a service function chain deployment method for mobile edge computing that uses the Q-learning algorithm from reinforcement learning. However, this method needs to maintain a huge Q-value table, which consumes a lot of CPU performance, and it cannot accurately handle situations that do not appear in the samples. Chinese invention patent "CN 111669291A" proposes a virtualized network service function chain deployment method based on deep reinforcement learning, which achieves minimum-cost service function chain deployment by separately solving the two problems of virtual function placement and traffic routing. However, this method does not consider possible failures of function nodes, so fault tolerance and reliable deployment of the service function chain are difficult to guarantee.

Therefore, an effective solution to the problem of reliable deployment of dynamically complex service function chains is still lacking.

Disclosure of Invention

In order to overcome the defects of the prior art, the invention aims to provide a service function chain reliability deployment method based on deep reinforcement learning, which solves the problem of reliability deployment of a dynamic complex service function chain in the context of network function virtualization.

To achieve this purpose, the invention adopts the following technical scheme:

The invention discloses a service function chain reliability deployment method based on deep reinforcement learning, which comprises the following steps:

1) processing a service function chain deployment request;

2) prioritizing the service function chain, and selecting a backup scheme for the virtual network functions in the service function chain according to its priority;

3) applying deep reinforcement learning to determine the deployment node and the backup node of each virtual network function in the service function chain, thereby realizing reliable deployment of the service function chain.

Preferably, in the step 2), the service function chain is prioritized, and the method specifically includes the following steps:

a) calculating the total CPU requirement r_cpu of the virtual network functions in the service function chain and the total bandwidth requirement r_bw of the virtual links;

b) calculating the weight of r_cpu relative to the maximum possible CPU demand and the weight of r_bw relative to the maximum possible bandwidth demand;

c) calculating the average weight from the weight of r_cpu and the weight of r_bw obtained in step b);

d) when the average weight obtained in step c) is greater than 2/3, classifying the service function chain as a high-priority service function chain;

when the average weight obtained in step c) is less than 1/3, classifying the service function chain as a low-priority service function chain;

otherwise, classifying the service function chain as a medium-priority service function chain.
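
Steps a) to d) can be sketched in a few lines of Python. This is a minimal illustration rather than the patented implementation: the function name `classify_priority` and its parameters are hypothetical, and the maximum possible demands are assumed to be supplied by the caller. Exact fractions are used so that a chain whose average weight is exactly 1/3 or 2/3 falls into the medium class, as the strict inequalities in step d) imply.

```python
from fractions import Fraction


def classify_priority(r_cpu, r_bw, max_cpu, max_bw):
    """Classify one service function chain as 'high', 'medium' or 'low' priority."""
    # Step b): weight of each total demand relative to its maximum possible demand.
    w_cpu = Fraction(r_cpu) / Fraction(max_cpu)
    w_bw = Fraction(r_bw) / Fraction(max_bw)
    # Step c): average of the two weights.
    avg = (w_cpu + w_bw) / 2
    # Step d): compare the average weight against the 2/3 and 1/3 thresholds.
    if avg > Fraction(2, 3):
        return "high"
    if avg < Fraction(1, 3):
        return "low"
    return "medium"
```

For instance, a chain that needs 9 of 10 possible CPU units and 8 of 10 possible bandwidth units has an average weight of 0.85 and would be classified as high priority.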

It is further preferred that each virtual network function in a service function chain is backed up when classified as a high priority service function chain.

Further preferably, when the service function chain is classified as a medium-priority service function chain, a node to be deployed is selected for each virtual network function in the service function chain, and the importance of each selected node to be deployed is classified;

when a node to be deployed is classified as high importance, the virtual network function to be deployed on that node is backed up;

otherwise, the virtual network function to be deployed on that node is not backed up.

Preferably, the importance degree division is performed on the selected nodes to be deployed, and the method specifically includes the following steps:

① calculating the node degree and the CPU occupation number of each selected node to be deployed;

② calculating the weight of the node degree of the selected node relative to the maximum node degree, and the weight of its CPU occupation number relative to the total CPU number;

③ calculating the average weight from the two weights obtained in step ②;

④ when the average weight obtained in step ③ is greater than 0.5, classifying the node as high importance.
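
The importance classification can be sketched as follows. The helper names, the adjacency-mapping representation of the topology and the per-node dictionaries are illustrative assumptions. In particular, the sketch assumes that the "maximum node degree" is the largest degree in the physical network and that the "total CPU number" is the CPU capacity of the node itself; the text does not spell out either definition.

```python
def node_degrees(adjacency):
    """Return the degree of every node in an undirected adjacency mapping."""
    return {node: len(neighbors) for node, neighbors in adjacency.items()}


def node_importance(node, adjacency, cpu_used, cpu_total):
    """Classify a candidate node as 'high' or 'low' importance."""
    degrees = node_degrees(adjacency)
    # Weight of the node degree relative to the maximum node degree (assumed network-wide).
    w_degree = degrees[node] / max(degrees.values())
    # Weight of the occupied CPU relative to the node's total CPU (assumed per-node capacity).
    w_cpu = cpu_used[node] / cpu_total[node]
    # Average weight, compared against the 0.5 threshold.
    avg = (w_degree + w_cpu) / 2
    return "high" if avg > 0.5 else "low"
```

A well-connected node that is already heavily used scores high on both weights, so a failure there would affect more chains; that is the intuition for backing up the functions placed on it.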

It is further preferred that virtual network functions in a service function chain are not backed up when classified as a low priority service function chain.
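
Taken together, the three priority classes give a simple rule for deciding which virtual network functions receive a backup. The sketch below is illustrative only; `select_backups` and the dictionary layout are hypothetical names, and the importance labels are assumed to have been computed beforehand for the node chosen for each function.

```python
def select_backups(priority, importance_by_vnf):
    """Map each VNF name to True if it should be backed up, else False.

    importance_by_vnf maps a VNF name to the importance ('high' or 'low')
    of the node selected for it; it is only consulted for medium priority.
    """
    if priority == "high":
        # High priority: every virtual network function is backed up.
        return {vnf: True for vnf in importance_by_vnf}
    if priority == "low":
        # Low priority: no virtual network function is backed up.
        return {vnf: False for vnf in importance_by_vnf}
    if priority == "medium":
        # Medium priority: back up only functions placed on high-importance nodes.
        return {vnf: level == "high" for vnf, level in importance_by_vnf.items()}
    raise ValueError(f"unknown priority: {priority!r}")
```

With a medium-priority chain whose DPI function lands on a high-importance node, only the DPI function would be marked for backup.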

Preferably, in step 3), the deployment of the virtual network function in the service function chain by applying deep reinforcement learning specifically includes the following steps:

i) setting deep reinforcement learning parameters comprising a state set, an action set and a feedback function;

wherein the state set is S(t) = (C_r, B_r, F_info);

In the formula, C_r represents the remaining CPU resources of each node, B_r represents the remaining bandwidth resources of each link, and F_info represents the type and deployment position of the previous virtual network function;

the action set is A(t) = (V_d, V_b);

In the formula, V_d represents the deployment node position of the virtual network function, and V_b represents the backup node position of the virtual network function;

the feedback function is

In the formula, L represents the maximum transmission time delay between adjacent virtual network functions, C represents the maximum CPU resource occupation amount of all nodes, B represents the maximum bandwidth resource occupation amount of all links, and alpha, beta and gamma are weights;

ii) initializing a neural network and an experience replay set D;

iii) selecting the node with the minimum load to deploy or back up the first virtual network function in the service function chain, obtaining the current C_r, B_r and F_info information, and setting the initial state S(t);

iv) selecting a deployment and backup node pair (V_d, V_b) for the current state S(t) according to the epsilon-greedy method, and, after executing the action, obtaining the updated C_r, B_r and F_info information and moving to the next state S(t+1);

v) calculating the current feedback value R according to the feedback function R(t), storing the four-tuple (S(t), (V_d, V_b), S(t+1), R) in the experience replay set D, and setting S(t) to S(t+1);

vi) updating the neural network parameters by gradient back propagation according to the samples in D.

Further preferably, in step iv), if the virtual network function is not backed up, the backup node V_b in the node pair (V_d, V_b) is set to empty.

Further preferably, after step vi) is completed, if the current virtual network function is determined to be the last one, the deep reinforcement learning ends; otherwise, steps iv) to vi) are executed again.
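
The action selection in step iv) and the replay storage in step v) can be sketched with the Python standard library alone. All names below are hypothetical. The callable `q_value` stands in for the neural network, whose architecture the text does not specify, and the sketch assumes that a backup is placed on a different node from the deployment node, which the text does not state explicitly.

```python
import random
from collections import deque
from itertools import permutations


def candidate_actions(nodes, needs_backup):
    """Enumerate (V_d, V_b) pairs; V_b is None when no backup is required."""
    if needs_backup:
        # Assumption: deployment node and backup node must differ.
        return list(permutations(nodes, 2))
    return [(v_d, None) for v_d in nodes]


def epsilon_greedy(state, actions, q_value, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-valued one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda action: q_value(state, action))


def make_replay(capacity):
    """Create the experience replay set D as a bounded queue."""
    return deque(maxlen=capacity)


def store_transition(replay, state, action, next_state, reward):
    """Store the four-tuple (S(t), (V_d, V_b), S(t+1), R) in D."""
    replay.append((state, action, next_state, reward))
```

A bounded `deque` drops the oldest transitions once the capacity is reached, which is one common way to keep the replay set from growing without limit; the text itself does not prescribe a capacity.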

Preferably, in step 1), the service function chain reliability deployment request includes a virtual network function type, a virtual network function sequence, a CPU resource requirement of the virtual network function, and a bandwidth resource requirement of the virtual link.

Preferably, after step 3) is completed, if it is determined that the last service function chain reliability deployment request is received, the reliability deployment method is ended; otherwise, re-executing the step 1) to the step 3).

Compared with the prior art, the invention has the following beneficial effects:

The invention discloses a service function chain reliability deployment method based on deep reinforcement learning. By dynamically sensing the resource demand of a service function chain, the method determines the priority of the service function chain and the backup scheme of its virtual network functions, with the aims of improving the reliability of the service function chain, improving the effective utilization rate of network resources, reducing transmission delay and improving the load balance of nodes and links. The deployment node and the backup node are determined at the same time by deep reinforcement learning, whose advantages are used to adapt to complex and dynamic service function chain deployment requests. The method therefore realizes priority-aware reliable deployment of service function chains: fault-tolerant deployment is achieved by separately solving the two problems of priority awareness and reliable deployment, and dynamic deployment requests are handled by the deep reinforcement learning method.

Furthermore, the method determines the priority of the service function chain by dynamically sensing the CPU and bandwidth requirements of the service function chain, so that a backup scheme of the virtual network function on the service function chain is selected, the survivability of the service function chain with higher priority can be improved, the overall deployment reliability of the service function chain is further improved, and the condition of large-scale failure of the service function chain caused by physical network faults is prevented.

Furthermore, the invention determines the importance of the node by sensing the node degree of the node and the CPU resource occupation number, so as to select the backup scheme of the virtual network function on the service function chain, thereby improving the effective utilization rate of the network resource, and avoiding the resource waste caused by the fact that the non-important virtual network function on the service function chain with the medium priority occupies the node backup resource.

In conclusion, the invention realizes the purposes of improving the reliability of the service function chain, improving the effective utilization rate of network resources, reducing transmission delay, improving the load balance of nodes and links, and is suitable for the dynamic and complex service function chain deployment request scene.

Drawings

FIG. 1 is a general flow chart of the algorithm of the present invention;

FIG. 2 is a flow chart of a service function chain priority awareness algorithm of the present invention;

FIG. 3 is a flowchart of an importance degree division algorithm for nodes to be deployed according to the present invention;

FIG. 4 is a flowchart of an algorithm of a deep reinforcement learning deployment module of the present invention;

FIG. 5 is a service function chain deployment diagram of one embodiment of the invention.

Detailed Description

In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

The invention is described in detail below with reference to the following figures and specific embodiments:

Referring to FIG. 1, the priority-aware service function chain (SFC) reliability deployment method based on deep reinforcement learning (DRL) disclosed in the present invention includes the following steps:

S1, the service orchestrator responds to a service function chain deployment request;

S2, the service function chain is prioritized and the corresponding virtual network function (VNF) backup scheme is selected according to the priority; if the service function chain is of medium priority, S3 is executed, and if it is of high or low priority, S4 is executed;

S3, nodes to be deployed are selected for the virtual network functions in the service function chain, and the importance of the nodes to be deployed is classified;

S4, deep reinforcement learning is applied to reliably deploy the virtual network functions in the service function chain;

S5, the service orchestrator judges whether the current request is the last service function chain reliability deployment request; if so, the method ends, and if not, S1 is executed again.

Further, the service function chain deployment request in step S1 includes a virtual network function type, a virtual network function sequence, a CPU resource requirement of the virtual network function, a bandwidth resource requirement of the virtual link, and the like.
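
The S1 to S5 control flow and the request fields listed above can be outlined as follows. This is only a structural sketch: `SfcRequest`, `handle_requests` and the three callables passed in are hypothetical names, and the actual classification, importance rating and deep reinforcement learning deployment are left to the caller.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SfcRequest:
    vnf_types: List[str]   # VNF types, listed in chain order
    cpu_demand: List[int]  # CPU resource requirement of each VNF
    bw_demand: List[int]   # bandwidth resource requirement of each virtual link


def handle_requests(requests, classify, rate_nodes, deploy):
    """Process deployment requests one by one, following S1 to S5."""
    results = []
    for request in requests:            # S1: respond to the next request
        priority = classify(request)    # S2: prioritize the chain
        importance = None
        if priority == "medium":        # S3: only medium priority rates nodes
            importance = rate_nodes(request)
        # S4: reliable deployment (stand-in for the DRL module)
        results.append(deploy(request, priority, importance))
    return results                      # S5: stop after the last request
```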

Referring to FIG. 2, further, the step S2 includes:

S2.1, calculating the total CPU requirement r_cpu of the virtual network functions in the service function chain and the total bandwidth requirement r_bw of the virtual links;

S2.2, calculating the weights of r_cpu and r_bw relative to their respective maximum possible demands;

S2.3, judging whether the average of the two weights obtained in S2.2 is greater than 2/3; if so, outputting that the service function chain is of high priority and ending;

S2.4, judging whether the average weight is less than 1/3; if so, outputting that the service function chain is of low priority and ending; otherwise, outputting that the service function chain is of medium priority and executing S3.

Referring to FIG. 3, further, when the service function chain is classified as medium priority, a node to be deployed is selected for each virtual network function in the service function chain and the importance of each selected node is classified. The step S3 includes:

S3.1, calculating the node degree and the CPU occupation number of each selected node to be deployed;

S3.2, calculating the weight of the node degree of the selected node relative to the maximum node degree, and the weight of its CPU occupation number relative to the total CPU number;

S3.3, judging whether the average of the weights obtained in S3.2 is greater than 0.5; if so, outputting that the node is of high importance and ending; otherwise, outputting that the node is of low importance and ending.

Referring to FIG. 4, further, the step S4 includes:

S4.1, setting the deep reinforcement learning parameters:

The state set S is expressed as S(t) = (C_r, B_r, F_info);

The action set A is expressed as A(t) = (V_d, V_b);

The feedback function R is expressed as

In the formula, C_r represents the remaining CPU resources of each node, B_r represents the remaining bandwidth resources of each link, F_info represents the type and deployment location of the previous VNF, V_d represents the deployment node location of the VNF, V_b represents the backup node location of the VNF, L represents the maximum transmission delay between adjacent VNFs, C represents the maximum CPU resource occupation among all nodes, B represents the maximum bandwidth resource occupation among all links, and alpha, beta and gamma are weights;

S4.2, initializing the neural network and the experience replay set D;

S4.3, selecting the node with the minimum load to deploy or back up the first virtual network function in the service function chain, obtaining the current C_r, B_r and F_info information, and setting the initial state S(t);

S4.4, selecting a deployment and backup node pair (V_d, V_b) for the current state S(t) according to the epsilon-greedy method (if the VNF is not backed up, the backup node V_b is set to empty), and, after executing the action, obtaining the updated C_r, B_r and F_info information and moving to the next state S(t+1);

S4.5, calculating the current feedback value R according to the feedback function, storing the four-tuple (S(t), (V_d, V_b), S(t+1), R) in the experience replay set D, and setting S(t) to S(t+1);

S4.6, updating the neural network parameters through gradient back propagation according to the samples in D;

S4.7, judging whether the current VNF is the last VNF; if so, ending; otherwise, executing S4.4 again.
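
The feedback formula is not reproduced in the text above, so its exact form is not known here. Purely as an illustration, the sketch below assumes a negative weighted sum of the three quantities L, C and B, so that a lower maximum delay and lower peak occupation give a higher feedback value. The function name, the sign convention, the input layout and the default weights are all assumptions and should not be read as the patented formula.

```python
def feedback(delays, cpu_used, bw_used, alpha=1.0, beta=1.0, gamma=1.0):
    """Illustrative feedback value; ASSUMED form R = -(alpha*L + beta*C + gamma*B).

    delays   -- transmission delays between adjacent VNFs placed so far
    cpu_used -- mapping from node to its CPU resource occupation
    bw_used  -- mapping from link to its bandwidth resource occupation
    """
    L = max(delays, default=0.0)             # maximum delay between adjacent VNFs
    C = max(cpu_used.values(), default=0.0)  # maximum CPU occupation over all nodes
    B = max(bw_used.values(), default=0.0)   # maximum bandwidth occupation over all links
    return -(alpha * L + beta * C + gamma * B)
```

Under this assumed form, penalizing the maximum occupation rather than the total steers the agent away from placements that overload a single node or link, which matches the stated goal of load balance.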

The following detailed description of the present invention will be made with reference to the accompanying drawings and examples.

In this section, a detailed description will be given of a reliability deployment method based on deep reinforcement learning in a priority-aware SFC with reference to the above drawings, where the types and resource requirements of virtual network functions and virtual links are shown in table 1:

TABLE 1 resource requirement settings

As shown in FIG. 5, the present invention is illustrated by one embodiment:

A service function chain deployment request consists of three virtual network functions, namely a firewall, DPI and load balancing, in the function sequence firewall-DPI-load balancing, and the service orchestrator responds to the request.

The total CPU and bandwidth requirements of the service function chain are calculated to be 6 units and 3 units respectively, their weights relative to the maximum possible demands are calculated to be 0.6 and 0.3 respectively, and the average weight is 0.45, so the service function chain is output as medium priority.
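
The arithmetic behind this classification can be written out as follows. The maximum possible demands are not stated in this passage; the value of 10 units for both CPU and bandwidth is inferred from the stated weights (6/0.6 and 3/0.3) and should be read as an assumption.

```latex
\[
w_{cpu} = \frac{r_{cpu}}{r_{cpu}^{\max}} = \frac{6}{10} = 0.6,
\qquad
w_{bw} = \frac{r_{bw}}{r_{bw}^{\max}} = \frac{3}{10} = 0.3
\]
\[
\bar{w} = \frac{w_{cpu} + w_{bw}}{2} = \frac{0.6 + 0.3}{2} = 0.45,
\qquad
\frac{1}{3} \le 0.45 \le \frac{2}{3}
\;\Rightarrow\; \text{medium priority}
\]
```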

The node degrees and the CPU occupation numbers of the nodes A, B and C, on which the three virtual network functions are to be deployed, are calculated, and the importance weights are obtained from them. In this example, node B is assumed to be a high-importance node and nodes A and C are low-importance nodes, so the DPI network function to be deployed on node B needs to be backed up.

The deep reinforcement learning deployment module then deploys the firewall on node A, deploys the DPI on node B and backs it up on node D, and deploys the load balancing function on node C, in that order. The current service function chain reliability deployment request has then been processed.

The service orchestrator then judges whether the current request is the last service function chain reliability deployment request; if so, the method ends, and if not, the service orchestrator continues to process the subsequent service function chain reliability deployment requests in sequence.

In summary, the present invention relates to a deep reinforcement learning and network function virtualization technology, and in particular, to a reliability deployment method based on deep reinforcement learning in a priority-aware service function chain, which is used for solving a problem of reliability service function chain deployment in a network function virtualization background. The method can realize the fault-tolerant service function chain deployment by respectively solving the two problems of priority perception and reliability deployment of the service function chain, and adapts to the complex and dynamic service function chain deployment request by utilizing the advantage of deep reinforcement learning. The method selects the backup scheme of the virtual network function on the service function chain by dynamically sensing the priority of the service function chain, and simultaneously determines the deployment node and the backup node of the virtual network function in the service function chain by utilizing deep reinforcement learning, thereby realizing the reliable deployment of the service function chain. The priority of the service function chain can be determined by dynamically sensing the CPU and the bandwidth requirements of the service function chain, and the node importance can be determined by sensing the node degree of the node and the CPU resource occupation number. The invention considers the reliability of the service function chain, improves the effective utilization rate of network resources, namely the proportion between resources occupied by the deployment and the backup of the service function chain, simultaneously considers the transmission delay and the load balance of nodes and links, and is suitable for the dynamic complex service function chain deployment request scene.

The above-mentioned contents are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modification made on the basis of the technical idea of the present invention falls within the protection scope of the claims of the present invention.

Claims

1. A service function chain reliability deployment method based on deep reinforcement learning is characterized by comprising the following steps:

1) processing a service function chain deployment request;

2. The method for deploying the service function chain reliability based on the deep reinforcement learning according to claim 1, wherein in the step 2), the service function chain is prioritized, and specifically comprises the following steps:

a) calculating the total CPU requirement r_cpu of the virtual network functions in the service function chain and the total bandwidth requirement r_bw of the virtual links;

otherwise, the service function chain with the medium priority is divided.

3. The deep reinforcement learning-based service function chain reliability deployment method according to claim 2, wherein when a service function chain with a high priority is classified, each virtual network function in the service function chain is backed up.

4. The service function chain reliability deployment method based on deep reinforcement learning according to claim 2, characterized in that when a service function chain of a medium priority is divided, a node to be deployed is selected for a virtual network function in the service function chain, and importance of the selected node to be deployed is divided;

5. The service function chain reliability deployment method based on deep reinforcement learning according to claim 4, wherein the importance degree division is performed on the selected nodes to be deployed, and the method specifically comprises the following steps:

6. The deep reinforcement learning-based service function chain reliability deployment method according to claim 2, wherein when classified as a low-priority service function chain, virtual network functions in the service function chain are not backed up.

7. The method for deploying the service function chain reliability based on the deep reinforcement learning according to claim 1, wherein in the step 3), the deep reinforcement learning is applied to deploy a virtual network function in the service function chain, and specifically includes the following steps:

wherein the state set is S(t) = (C_r, B_r, F_info);

the action set is A(t) = (V_d, V_b);

the feedback function is

ii) initializing a neural network and an experience replay set D;

v) calculating the current feedback value R according to the feedback function R(t), storing the four-tuple (S(t), (V_d, V_b), S(t+1), R) in the experience replay set D, and setting S(t) to S(t+1);

8. The deep reinforcement learning-based service function chain reliability deployment method as claimed in claim 7, wherein in step iv), if the virtual network function is not backed up, the backup node V_b in the node pair (V_d, V_b) is set to empty.

9. The method for deploying the service function chain reliability based on the deep reinforcement learning of claim 7, wherein after the step vi) is completed, if the current virtual network function is determined to be the last one, the deep reinforcement learning is ended; otherwise, steps iv) to vi) are executed again.

10. The method for deploying the service function chain reliability based on the deep reinforcement learning according to claim 1, wherein after the step 3) is completed, if the current request is determined to be the last service function chain reliability deployment request, the reliability deployment method is ended; otherwise, step 1) to step 3) are executed again.