CN113708969A - Cooperative embedding method of cloud data center virtual network based on deep reinforcement learning - Google Patents
- Publication number
- CN113708969A (application CN202110995220.2A)
- Authority
- CN
- China
- Prior art keywords
- network
- virtual
- node
- cpu
- mapping
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/145—Network analysis or design involving simulating, designing, planning or modelling of a network
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L12/00—Data switching networks
- H04L12/28—Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
- H04L12/46—Interconnection of networks
- H04L12/4641—Virtual LANs, VLANs, e.g. virtual private networks [VPN]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/60—Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a collaborative embedding method for cloud data center virtual networks based on deep reinforcement learning, which comprises the following steps: S1, completing the modeling of the virtual network embedding problem; S2, completing the modeling of the computing resource (CPU) and the network resource (bandwidth) in the underlying network; and S3, constructing a neural network and processing virtual network requests to realize the embedding of the virtual networks into the underlying network. Based on deep reinforcement learning, the invention dynamically learns a resource allocation policy under a series of constraints according to the currently available resources and the tenants' virtual network requests, so as to improve the resource utilization rate and the request allocation success rate. Meanwhile, the invention incorporates the resource abstraction model Blocking Island, which abstractly expresses the structure and available resources of each part of the underlying network, thereby reducing the search space and improving the computational efficiency and algorithm performance.
Description
Technical Field
The invention relates to the technical field of cloud computing, and in particular to a collaborative embedding method that processes virtual network requests in a cloud data center and realizes the mapping from virtual networks to the underlying network based on deep reinforcement learning.
Background
A cloud computing data center is a highly multiplexed shared environment configured with a large number of physical servers and the virtual machines running on them, providing highly reliable computing, storage and network resources to multiple tenants in a pay-as-you-go mode. How to effectively share the underlying physical network resources among multiple tenants with different network characteristics and requirements is a key issue. Network Virtualization has therefore become an effective resource sharing technology, enabling multi-tenant multiplexing of computing resources (such as CPUs) and network resources (such as bandwidth) in a data center. The network structure customized by a tenant is generally called a Virtual Network (VN). How to serve the virtual network requests of multiple tenants with the existing, dynamically changing and limited underlying physical resources, allocate resources effectively and maximize the benefit is the Virtual Network Embedding (VNE) problem.
In the VNE problem, the physical network owned by a cloud computing data center (or ISP) is called the underlying network (Infrastructure Network); the network customized by a tenant is called a Virtual Network. In the underlying network, each compute node (e.g., a host) owns a certain amount of available computing resources, such as a CPU count or a processor core count; the compute nodes are connected by physical links, each of which carries a certain amount of network resources, such as bandwidth. A virtual network is essentially a tenant's requirement on its network structure and resources. In a virtual network, each node represents a virtual operation node that needs a certain amount of computing resources; each link connecting the nodes represents a virtual link that similarly requires a certain amount of network bandwidth. VNE maps one or more virtual networks onto the existing underlying network while ensuring that resources are used efficiently and requests are handled as successfully as possible. The "mapping", i.e., resource allocation, allocates resources of the underlying network to the tenant's virtual network: the computing resources of the compute nodes in the underlying network are allocated to the virtual nodes as needed, and the network resources of each link on an underlying network path are allocated to the virtual links as needed.
Mapping virtual networks onto shared physical resources, i.e., allocating the resources of the underlying network to the virtual nodes and virtual links of the virtual networks while satisfying a series of constraints, is a resource allocation and scheduling problem of extremely high computational complexity: existing studies have proved the VNE problem to be NP-hard, and methods that seek its optimal solution have exponential complexity. In real scenarios, the network architecture of a cloud data center is large and complex, and making resource allocation decisions with such methods consumes a great amount of time, which is unrealistic and can hardly meet the low-latency and high-efficiency requirements of a modern cloud data center. Therefore, much VNE research seeks approximately optimal solutions through heuristic, deterministic methods. These methods usually consist of a set of manually written, static, fixed procedures that sort, filter and finally map directly according to the characteristics of the virtual network and the underlying network only. Consequently, they cannot dynamically adjust the mapping process according to the feedback or revenue obtained in actual operation, can hardly learn resource allocation rules from historical data, and cannot adjust and optimize themselves once trapped in a local optimum. With the development of artificial-intelligence-related fields in recent years, reinforcement learning has been widely applied, as a machine learning method, in science, engineering, art and other fields. Generally, in a reinforcement learning method there is an agent interacting with an environment: at each time slice, the agent receives the current state from the environment as input and selects an action to interact with the environment according to its policy, thereby entering a new state and obtaining a reward. The reinforcement learning method continuously adjusts the policy by which the agent selects actions according to the obtained rewards and the learning target, so as to efficiently obtain approximately optimal solutions for some complex problems. The nature of reinforcement learning also makes it applicable to solving the VNE problem. In fact, some studies have begun to introduce methods such as Q-Learning and Policy Networks, learning strategies for allocating underlying network resources in combination with other machine learning models. However, there is relatively little work in this area and much room for improvement and optimization. Therefore, how to combine machine learning to solve the VNE problem is still a popular research topic in academia and industry, and has high scientific research value.
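The agent-environment loop described above can be illustrated with a minimal sketch; the `env`/`policy` interface below is a generic assumption used only for illustration and is not part of the invention:

```python
def run_episode(env, policy, max_steps=100):
    """Generic reinforcement learning loop: observe the state, act by the policy, collect rewards."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)                  # agent selects an action according to its policy
        state, reward, done = env.step(action)  # environment returns the new state and a reward
        total_reward += reward                  # the accumulated reward guides policy adjustment
        if done:
            break
    return total_reward
```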
Disclosure of Invention
Aiming at the problems of the prior art, such as complex models, high computational requirements, or simply extracting only features related to the underlying network nodes as input, the invention aims to provide a collaborative embedding method for cloud data center virtual networks based on deep reinforcement learning, so as to improve the resource utilization rate and the request allocation success rate.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
a collaborative embedding method of a cloud data center virtual network based on deep reinforcement learning comprises the following specific steps:
S1, completing the modeling of virtual network embedding;
Firstly, the underlying network of the actual data center is modeled as a weighted undirected graph G_s=(N_s,E_s), where the network nodes and links form the point set N_s and the edge set E_s respectively, and the weight C(n_s) of a point n_s∈N_s and the weight B(e_s) of an edge e_s∈E_s represent, respectively, the computing resource (CPU) of the network node and the network resource (bandwidth) of the link; the virtual network requested by the tenant is modeled in the same way as a weighted undirected graph G_v=(N_v,E_v);
Virtual network embedding is then translated into finding a set of mapping functions from the graph G_v representing the virtual network to the graph G_s representing the underlying network, f=(f_N, f_E), where f_N: N_v→N_s maps each virtual node to an underlying node and f_E: E_v→P_s maps each virtual link to an underlying path, and it is ensured that the underlying path of each virtual link connects the underlying nodes onto which its two end points are mapped;
Meanwhile, both the CPU resources and the bandwidth resources of the virtual network must be satisfied by the mapped underlying network: C(n_v)≤C(f_N(n_v)) for every virtual node n_v, and B(e_v)≤B(e_s) for every underlying edge e_s on the path f_E(e_v), where the function C(·) denotes the CPU computing resources available on (or required by) a node and the function B(·) denotes the bandwidth resources available on (or required by) a link;
On this basis, a linear combination of the total CPU demand and the total bandwidth demand in the virtual network request, R(G_v)=Σ_{n_v∈N_v} C(n_v)+α·Σ_{e_v∈E_v} B(e_v), is used to represent the expected revenue when the request is successfully processed, where α is a weighting factor; the sum of the computing-resource cost and the network-resource cost when processing the request, Cost(G_v)=Σ_{n_v∈N_v} C(n_v)+Σ_{e_v∈E_v} B(e_v)·hops(e_v), is used to represent the total overhead of processing the request, where hops(e_v) denotes the number of edges in the underlying network path onto which the virtual link e_v is mapped;
Virtual network embedding is thus defined as finding a mapping function that satisfies the above constraints while maximizing the revenue and minimizing the overhead;
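The graph model and the revenue/overhead definitions above can be sketched as follows. This is a minimal illustration using networkx; the attribute names cpu and bw, the value of ALPHA and the toy topology are assumptions, not part of the invention:

```python
import networkx as nx

ALPHA = 0.5  # weighting factor alpha between CPU and bandwidth revenue (assumed value)

def make_substrate():
    """Toy underlying network G_s: node weight = available CPU, edge weight = available bandwidth."""
    gs = nx.Graph()
    gs.add_nodes_from([("s0", {"cpu": 50}), ("s1", {"cpu": 30}), ("s2", {"cpu": 40})])
    gs.add_edge("s0", "s1", bw=100)
    gs.add_edge("s1", "s2", bw=80)
    return gs

def revenue(gv):
    """R(G_v): sum of CPU demands plus alpha times the sum of bandwidth demands."""
    return (sum(d["cpu"] for _, d in gv.nodes(data=True))
            + ALPHA * sum(d["bw"] for _, _, d in gv.edges(data=True)))

def overhead(gv, link_paths):
    """Cost(G_v): CPU demands plus each virtual link's bandwidth times the number of underlying
    edges on its mapped path; link_paths maps each virtual link (u, v) to that list of edges."""
    node_cost = sum(d["cpu"] for _, d in gv.nodes(data=True))
    link_cost = sum(d["bw"] * len(link_paths[(u, v)]) for u, v, d in gv.edges(data=True))
    return node_cost + link_cost
```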
S2, completing the modeling of the computing resource (CPU) and the network resource (bandwidth) in the underlying network. First, an upper limit CPU_max on the available CPU amount and an upper limit Band_max on the available bandwidth are defined; then a fixed set of values cpu_strides and band_strides is preset, and the two intervals [0, CPU_max] and [0, Band_max] are divided equally into cpu_strides and band_strides segments respectively, the end points of the segments being recorded as cpu_i and band_j, where i∈[0, cpu_strides) and j∈[0, band_strides); finally, for each node n_s in the underlying network, the β-BI model of the CPU resource and the β-BI model of the bandwidth resource are constructed in turn, where the CPU β-BI of n_s at level cpu_i refers to the set of nodes reachable from n_s whose available CPU is not lower than cpu_i, and the bandwidth β-BI of n_s at level band_j refers to the set of nodes reachable from n_s through edges whose available bandwidth is not lower than band_j;
S3, constructing a neural network and then processing the incoming virtual network requests: for a received virtual network request G_v=(N_v,E_v) with the structure described in step S1, the neural network traverses N_v; for each node n_v traversed, the neural network outputs a probability distribution for mapping the virtual node n_v to the different underlying nodes, and an underlying node is then sampled from this distribution to complete the embedding of the virtual node n_v; after the whole virtual node set N_v has been traversed and embedded, the virtual links E_v are mapped by the BVLO and HVLM algorithms, thereby realizing the embedding of the whole virtual network; finally, the back propagation of the neural network is performed according to the error between the reward value obtained after the virtual network request has been mapped and the value predicted by the neural network during forward propagation, and the model parameters of the neural network are updated.
The β-BI model of the CPU resource and the β-BI model of the bandwidth resource in step S2 are constructed by breadth-first search on the undirected graph G_s=(N_s,E_s) of the underlying network.
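A minimal breadth-first sketch of the two β-BI sets follows, assuming the networkx attribute names used in the earlier sketch and that CPU reachability passes only through nodes that themselves meet the threshold:

```python
from collections import deque

def cpu_beta_bi(gs, start, beta):
    """CPU beta-BI of `start`: nodes reachable from it visiting only nodes with available CPU >= beta."""
    if gs.nodes[start]["cpu"] < beta:
        return set()
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in gs.neighbors(u):
            if v not in seen and gs.nodes[v]["cpu"] >= beta:
                seen.add(v)
                queue.append(v)
    return seen

def band_beta_bi(gs, start, beta):
    """Bandwidth beta-BI of `start`: nodes reachable from it using only edges with available bandwidth >= beta."""
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in gs.neighbors(u):
            if v not in seen and gs.edges[u, v]["bw"] >= beta:
                seen.add(v)
                queue.append(v)
    return seen
```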
The step S3 specifically includes:
S31, constructing a neural network;
Firstly, the features of the input layer are extracted. The underlying network node set N_s is traversed; for each underlying network node n_s, the β-BI model of the CPU resource and the β-BI model of the bandwidth resource computed in step S2 are obtained in turn, and features are extracted from the β-BI models corresponding to the underlying network node and from the information of the virtual network node n_v currently to be mapped, as the neural network input. Specifically, for each CPU β-BI and bandwidth β-BI of the underlying network node n_s, the following are extracted as input features: the maximum, minimum, median and average of the available CPU amounts; the degree; the maximum, minimum, median and average of the available bandwidth amounts on the adjacent edges; the CPU computing resource demand of the virtual node n_v currently to be mapped; and the difference between the computing resource demand of n_v and the available computing resource amount of the underlying node n_s;
Then the remaining part of the network is constructed, which comprises 4 layers; specifically, the remaining part is, in order, a one-dimensional convolution layer containing two convolution kernels, a ReLU layer, a one-dimensional convolution layer with a single kernel, and a softmax layer;
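A minimal PyTorch sketch of the four-layer structure just described; treating the two-kernel and single-kernel convolutions as 1×1 convolutions over the per-node feature vectors is an assumption made for illustration:

```python
import torch
import torch.nn as nn

class NodeMappingPolicy(nn.Module):
    """1-D convolution (2 kernels) -> ReLU -> 1-D convolution (1 kernel) -> softmax over m underlying nodes."""
    def __init__(self, num_features: int):
        super().__init__()
        self.conv1 = nn.Conv1d(num_features, 2, kernel_size=1)  # two convolution kernels
        self.conv2 = nn.Conv1d(2, 1, kernel_size=1)              # single-kernel scoring layer

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (m, f) feature matrix M for the m underlying nodes
        x = feats.t().unsqueeze(0)            # -> (1, f, m) layout expected by Conv1d
        x = torch.relu(self.conv1(x))         # ReLU after the two-kernel convolution
        scores = self.conv2(x).view(-1)       # per-node scores v = [v_1, ..., v_m]
        return torch.softmax(scores, dim=0)   # probability of mapping n_v to each underlying node
```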
S32, the neural network receives the virtual network request G_v=(N_v,E_v) and processes it, first completing the mapping of the virtual node set N_v;
N_v is traversed, and for each virtual network node n_v the input features described in S31 are obtained;
Firstly, the input features are processed by the one-dimensional convolution layer containing two convolution kernels. The input features are denoted M∈R^{m×f}, where m is the number of underlying nodes and f is the number of features, and the two convolution kernels are denoted w_1 and w_2; the output of the convolution layer is M·[w_1, w_2];
After the convolution layer with 2 convolution kernels, the features are processed by a ReLU layer to avoid overfitting;
The ReLU layer is followed by a single-kernel one-dimensional convolution layer; specifically, a score is computed for each underlying network node from the learning results of the previous two layers, quantitatively indicating how good it is to map the current virtual node n_v to each underlying network node. The output of this layer is denoted v=[v_1, v_2, ..., v_m], so the score computed for the i-th underlying network node is v_i;
Finally, through a Softmax layer, the scores computed by the previous layer for each node are converted into probabilities and output. Specifically, the output of the Softmax layer is denoted S=[s_1, s_2, ..., s_m], where s_1, s_2, ..., s_m correspond one-to-one to the underlying nodes and the probability distribution is s_i = exp(v_i)/Σ_{k=1}^{m} exp(v_k), exp() being the exponential function with the natural constant as its base; the higher the score of an underlying network node in the previous layer, the higher the probability value output by this layer and the greater the probability that it is selected as the mapping point of the current virtual node n_v. Then, on the basis of the Softmax output, the nodes whose available computing resources cannot meet the computing resource demand of the virtual node n_v, or onto which the number of already-mapped virtual nodes has reached a certain threshold, are first filtered out, and sampling is then performed according to the probabilities output by the Softmax layer; the current virtual node n_v is mapped to the sampled underlying network node;
S33, after the mapping of the nodes in the virtual network request has been completed through step S32, the mapping of the links E_v is completed;
Firstly, the virtual links to be mapped are ordered by the BVLO algorithm. Specifically, let the underlying network nodes onto which the two end points of a virtual link e_v are mapped be n″_s and n‴_s; among the bandwidth β-BI models containing both n″_s and n‴_s, the one with the largest β value is denoted LCF_sn(n″_s, n‴_s), and its β value is the corresponding bandwidth level. The BVLO algorithm orders the virtual links according to the following keys in turn (a sorting sketch is given after this list):
2) the smaller the number of nodes in LCF_sn(n″_s, n‴_s), the earlier the virtual link is ordered;
3) the larger the required bandwidth, the earlier the virtual link is ordered;
4) if two virtual links are equal under keys 1), 2) and 3), they are ordered randomly;
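A sketch of the ordering is given below; since the first ordering key is not recoverable from the text, using the β value of LCF_sn as the leading key is an assumption, as are the dictionary-based inputs:

```python
import random

def bvlo_sort(virtual_links, lcf_beta, lcf_size, bw_demand):
    """Order virtual links by the assumed beta key, then LCF node count, then descending bandwidth demand."""
    random.shuffle(virtual_links)  # random tie-breaking for key 4)
    return sorted(virtual_links,
                  key=lambda ev: (lcf_beta[ev],      # assumed key 1): beta value of LCF_sn
                                  lcf_size[ev],      # key 2): fewer nodes in LCF_sn first
                                  -bw_demand[ev]))   # key 3): larger required bandwidth first
```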
Then virtual link mapping is performed based on the HVLM algorithm (a path-selection sketch follows these steps):
1) within LCF_sn(n″_s, n‴_s) of the underlying network nodes n″_s and n‴_s corresponding to the two end points of the virtual link, search for the set of paths from n″_s to n‴_s in which every edge meets the bandwidth requirement of the virtual link;
2) from the paths obtained in step 1), select the paths with the smallest number of edges;
3) from the paths with the smallest number of edges, select the path whose minimum available network resource over the traversed edges is the largest;
4) if multiple feasible paths still remain after steps 1), 2) and 3), select one of them randomly;
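The path-selection rule can be sketched as follows; restricting the search to the subgraph of edges that satisfy the bandwidth demand (rather than explicitly to the blocking island) and capping the number of candidate paths are simplifying assumptions:

```python
import random
import networkx as nx

def hvlm_select_path(gs, src, dst, bw_demand, max_paths=200):
    """Keep feasible shortest paths, then maximize the bottleneck bandwidth, breaking ties randomly."""
    if src == dst:
        return [src]
    feasible_edges = [(u, v) for u, v, d in gs.edges(data=True) if d["bw"] >= bw_demand]
    sub = gs.edge_subgraph(feasible_edges)
    if src not in sub or dst not in sub or not nx.has_path(sub, src, dst):
        return None                                                       # no feasible underlying path
    shortest = list(nx.all_shortest_paths(sub, src, dst))[:max_paths]     # 2) fewest edges

    def bottleneck(path):                                                 # minimum available bandwidth on the path
        return min(gs.edges[u, v]["bw"] for u, v in zip(path, path[1:]))

    best = max(bottleneck(p) for p in shortest)                           # 3) largest bottleneck
    return random.choice([p for p in shortest if bottleneck(p) == best])  # 4) random tie-break
```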
S34, the mapping of the point set N_v and the edge set E_v is completed, i.e. the processing of the virtual network request G_v=(N_v,E_v) is completed, and the neural network parameters are then updated according to the reward value of the virtual network mapping;
the reward is calculated as follows:
1) If the mapping of a virtual node fails, i.e. no mapping is found for some virtual node, then for each virtual node n_v of the virtual network request the reward is: the ratio of the difference between the available CPU amount of the underlying network node predicted with the maximum probability by the neural network in the forward propagation corresponding to that virtual node and the CPU amount required by the virtual node n_v, to the CPU amount required by the virtual node n_v;
2) If a virtual link e_v=(n″_v, n‴_v) fails to map, where n″_v and n‴_v are the end points of e_v, i.e. no suitable and feasible underlying path is found for the mapping, binary enumeration is first used to find, over all paths between the underlying nodes onto which n″_v and n‴_v are mapped, the maximum value of the available bandwidth of the minimum edge; then for each virtual node n_v of the virtual network request, the reward is the ratio of the difference between this maximum value and the required bandwidth of the virtual link to the required bandwidth of the virtual link;
3) If the virtual network request G_v=(N_v,E_v) is mapped successfully, then for each virtual node n_v the weighted sum of the computing resources of the virtual node and the network resources of the virtual links e_v connected to it is calculated and divided by the sum of the resources consumed by mapping the node and those links; the resulting quotient is used as the reward corresponding to mapping node n_v, where e_v denotes the virtual links connecting node n_v;
After the reward value corresponding to the virtual network request has been calculated, the error of each forward propagation is calculated from the value predicted in that forward propagation and its corresponding reward, and back propagation is performed to update the model parameters of the neural network.
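The three reward cases can be sketched as below; using the same weighting factor α as in the revenue definition for the successful case is an assumption, since the exact weighting is not recoverable from the text:

```python
ALPHA = 0.5  # assumed to be the same weighting factor as in the revenue definition

def reward_node_failure(predicted_host_cpu, cpu_demand):
    """Case 1): CPU shortfall of the most probable underlying node, relative to the node's demand."""
    return (predicted_host_cpu - cpu_demand) / cpu_demand

def reward_link_failure(best_bottleneck_bw, bw_demand):
    """Case 2): bandwidth shortfall of the best available path, relative to the link's demand."""
    return (best_bottleneck_bw - bw_demand) / bw_demand

def reward_success(cpu_demand, link_bw_demands, link_hops):
    """Case 3): weighted demand of the node and its links divided by the resources actually consumed."""
    gain = cpu_demand + ALPHA * sum(link_bw_demands)
    consumed = cpu_demand + sum(bw * hops for bw, hops in zip(link_bw_demands, link_hops))
    return gain / consumed
```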
The invention has the beneficial effects that:
Experimental comparison of the method of the invention with the Presto algorithm (a heuristic method for solving the VNE problem based on the BI model) and the RLA algorithm (a method that learns a virtual network mapping policy based on Policy Gradient) on the revenue-to-cost ratio, request success rate, revenue and average mapping overhead shows the following: the proposed method is higher than Presto and RLA in revenue-to-cost ratio, request success rate and revenue, and lower than both comparison methods in average mapping overhead. Therefore, dynamically learning the resource allocation policy through deep reinforcement learning effectively improves the resource utilization rate and the request allocation success rate. Meanwhile, the resource abstraction model combined in the method effectively reduces the search space and improves the computational efficiency and algorithm performance. The method performs better in terms of resource utilization, overhead saving and revenue acquisition.
Drawings
FIG. 1 is a diagram of a neural network architecture constructed in the present invention;
FIG. 2 is a graph comparing the present invention with the Presto, RLA algorithm in terms of benefit-to-cost ratio;
FIG. 3 is a graph comparing the performance of the present invention with Presto, RLA algorithms on the success rate of virtual network requests;
FIG. 4 is a graph comparing the performance of the present invention with that of Presto, RLA algorithms on average mapping overhead;
FIG. 5 is a graph of the performance of the present invention versus the Presto, RLA algorithm in terms of yield.
Detailed Description
The invention is described in detail below with reference to the figures and examples. It will be clear that the examples given are only intended to illustrate the invention and are not intended to limit the scope of the invention.
The invention relates to a deep reinforcement learning-based collaborative embedding method for a cloud data center virtual network, which comprises the following steps:
S1, completing the modeling of virtual network embedding;
Firstly, the underlying network of the actual data center is modeled as a weighted undirected graph G_s=(N_s,E_s), where the network nodes and links form the point set N_s and the edge set E_s respectively, and the weight C(n_s) of a point n_s∈N_s and the weight B(e_s) of an edge e_s∈E_s represent, respectively, the computing resource (CPU) of the network node and the network resource (bandwidth) of the link; the virtual network requested by the tenant is modeled in the same way as a weighted undirected graph G_v=(N_v,E_v);
Virtual network embedding is then translated into finding a set of mapping functions from the graph G_v representing the virtual network to the graph G_s representing the underlying network, f=(f_N, f_E), where f_N: N_v→N_s maps each virtual node to an underlying node and f_E: E_v→P_s maps each virtual link to an underlying path, and it is ensured that the underlying path of each virtual link connects the underlying nodes onto which its two end points are mapped;
Meanwhile, both the CPU resources and the bandwidth resources of the virtual network must be satisfied by the mapped underlying network: C(n_v)≤C(f_N(n_v)) for every virtual node n_v, and B(e_v)≤B(e_s) for every underlying edge e_s on the path f_E(e_v), where the function C(·) denotes the CPU computing resources available on (or required by) a node and the function B(·) denotes the bandwidth resources available on (or required by) a link;
On this basis, a linear combination of the total CPU demand and the total bandwidth demand in the virtual network request, R(G_v)=Σ_{n_v∈N_v} C(n_v)+α·Σ_{e_v∈E_v} B(e_v), is used to represent the expected revenue when the request is successfully processed, where α is a weighting factor; the sum of the computing-resource cost and the network-resource cost when processing the request, Cost(G_v)=Σ_{n_v∈N_v} C(n_v)+Σ_{e_v∈E_v} B(e_v)·hops(e_v), is used to represent the total overhead of processing the request, where hops(e_v) denotes the number of edges in the underlying network path onto which the virtual link e_v is mapped;
Virtual network embedding is thus defined as finding a mapping function that satisfies the above constraints while maximizing the revenue and minimizing the overhead;
S2, completing the modeling of the computing resource (CPU) and the network resource (bandwidth) in the underlying network. First, an upper limit CPU_max on the available CPU amount and an upper limit Band_max on the available bandwidth are defined; then a fixed set of values cpu_strides and band_strides is preset, and the two intervals [0, CPU_max] and [0, Band_max] are divided equally into cpu_strides and band_strides segments respectively, the end points of the segments being recorded as cpu_i and band_j, where i∈[0, cpu_strides) and j∈[0, band_strides); finally, for each node n_s in the underlying network, the β-BI model of the CPU resource and the β-BI model of the bandwidth resource are constructed in turn by the breadth-first search method, where the CPU β-BI of n_s at level cpu_i refers to the set of nodes reachable from n_s whose available CPU is not lower than cpu_i, and the bandwidth β-BI of n_s at level band_j refers to the set of nodes reachable from n_s through edges whose available bandwidth is not lower than band_j;
S3, constructing a neural network based on the policy gradient method, then processing the virtual network requests, completing the mapping of the virtual nodes and of the virtual links respectively to realize the embedding of the virtual network into the underlying network, and updating the parameters of the neural network according to the reward value obtained by processing the request; this step specifically comprises the following substeps:
S31, constructing a neural network with the structure shown in Fig. 1;
Firstly, the features of the input layer are extracted. The underlying network node set N_s is traversed; for each underlying network node n_s, the β-BI model of the CPU resource and the β-BI model of the bandwidth resource computed in step S2 are obtained in turn, and features are extracted from the β-BI models corresponding to the underlying network node and from the information of the virtual network node n_v currently to be mapped, as the neural network input. Specifically, for each CPU β-BI and bandwidth β-BI of the underlying network node n_s, the following are extracted as input features: the maximum, minimum, median and average of the available CPU amounts; the degree; the maximum, minimum, median and average of the available bandwidth amounts on the adjacent edges; the CPU computing resource demand of the virtual node n_v currently to be mapped; and the difference between the computing resource demand of n_v and the available computing resource amount of the underlying node n_s;
Then the remaining part of the network is constructed, which comprises 4 layers; specifically, the remaining part is, in order, a one-dimensional convolution layer containing two convolution kernels, a ReLU layer, a one-dimensional convolution layer with a single kernel, and a softmax layer;
S32, the neural network receives the virtual network request G_v=(N_v,E_v) and processes it, first completing the mapping of the virtual node set N_v;
N_v is traversed, and for each virtual network node n_v the input features described in S31 are obtained;
Firstly, the input features are processed by the one-dimensional convolution layer containing two convolution kernels. The input features are denoted M∈R^{m×f}, where m is the number of underlying nodes and f is the number of features, and the two convolution kernels are denoted w_1 and w_2; the output of the convolution layer is M·[w_1, w_2];
After the convolution layer with 2 convolution kernels, the features are processed by a ReLU layer to avoid overfitting;
The ReLU layer is followed by a single-kernel one-dimensional convolution layer; specifically, a score is computed for each underlying network node from the learning results of the previous two layers, quantitatively indicating how good it is to map the current virtual node n_v to each underlying network node. The output of this layer is denoted v=[v_1, v_2, ..., v_m], so the score computed for the i-th underlying network node is v_i;
Finally, through a Softmax layer, the scores computed by the previous layer for each node are converted into probabilities and output. Specifically, the output of the Softmax layer is denoted S=[s_1, s_2, ..., s_m], where s_1, s_2, ..., s_m correspond one-to-one to the underlying nodes and the probability distribution is s_i = exp(v_i)/Σ_{k=1}^{m} exp(v_k), exp() being the exponential function with the natural constant as its base; the higher the score of an underlying network node in the previous layer, the higher the probability value output by this layer and the greater the probability that it is selected as the mapping point of the current virtual node n_v. Then, on the basis of the Softmax output, the nodes whose available computing resources cannot meet the computing resource demand of the virtual node n_v, or onto which the number of already-mapped virtual nodes has reached a certain threshold, are first filtered out, and sampling is then performed according to the probabilities output by the Softmax layer; the current virtual node n_v is mapped to the sampled underlying network node;
S33, after the mapping of the nodes in the virtual network request has been completed through step S32, the mapping of the links E_v is completed;
Firstly, the virtual links to be mapped are ordered by the BVLO algorithm. Specifically, let the underlying network nodes onto which the two end points of a virtual link e_v are mapped be n″_s and n‴_s; among the bandwidth β-BI models containing both n″_s and n‴_s, the one with the largest β value is denoted LCF_sn(n″_s, n‴_s), and its β value is the corresponding bandwidth level. The BVLO algorithm orders the virtual links according to the following keys in turn:
2) the smaller the number of nodes in LCF_sn(n″_s, n‴_s), the earlier the virtual link is ordered;
3) the larger the required bandwidth, the earlier the virtual link is ordered;
4) if two virtual links are equal under keys 1), 2) and 3), they are ordered randomly;
Then virtual link mapping is performed based on the HVLM algorithm:
1) within LCF_sn(n″_s, n‴_s) of the underlying network nodes n″_s and n‴_s corresponding to the two end points of the virtual link, search for the set of paths from n″_s to n‴_s in which every edge meets the bandwidth requirement of the virtual link;
2) from the paths obtained in step 1), select the paths with the smallest number of edges;
3) from the paths with the smallest number of edges, select the path whose minimum available network resource over the traversed edges is the largest;
4) if multiple feasible paths still remain after steps 1), 2) and 3), select one of them randomly;
S34, the mapping of the point set N_v and the edge set E_v is completed, i.e. the processing of the virtual network request G_v=(N_v,E_v) is completed, and the neural network parameters are then updated according to the reward value of the virtual network mapping;
The reward is calculated as follows:
1) If the mapping of a virtual node fails, i.e. no mapping is found for some virtual node, then for each virtual node n_v of the virtual network request the reward is: the ratio of the difference between the available CPU amount of the underlying network node predicted with the maximum probability by the neural network in the forward propagation corresponding to that virtual node and the CPU amount required by the virtual node n_v, to the CPU amount required by the virtual node n_v;
2) If a virtual link e_v=(n″_v, n‴_v) fails to map, where n″_v and n‴_v are the end points of e_v, i.e. no suitable and feasible underlying path is found for the mapping, binary enumeration is first used to find, over all paths between the underlying nodes onto which n″_v and n‴_v are mapped, the maximum value of the available bandwidth of the minimum edge; then for each virtual node n_v of the virtual network request, the reward is the ratio of the difference between this maximum value and the required bandwidth of the virtual link to the required bandwidth of the virtual link;
3) If the virtual network request G_v=(N_v,E_v) is mapped successfully, then for each virtual node n_v the weighted sum of the computing resources of the virtual node and the network resources of the virtual links e_v connected to it is calculated and divided by the sum of the resources consumed by mapping the node and those links; the resulting quotient is used as the reward corresponding to mapping node n_v, where e_v denotes the virtual links connecting node n_v;
After the reward value corresponding to the virtual network request has been calculated, the error of each forward propagation is calculated from the value predicted in that forward propagation and its corresponding reward, and back propagation is performed to update the model parameters of the neural network.
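Since the detailed description states that the network is built on the policy gradient method, the parameter update can be sketched as a REINFORCE-style step; the optimizer choice and the way log-probabilities are collected during forward propagation are assumptions:

```python
import torch

def update_policy(optimizer, log_probs, rewards):
    """Policy-gradient update: weight each sampled action's log-probability by its reward."""
    reward_tensor = torch.tensor(rewards, dtype=torch.float32)
    loss = -(torch.stack(log_probs) * reward_tensor).sum()  # gradient ascent on expected reward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss.item())
```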
From the test results it can be seen that, as shown in Fig. 2 (where ICE refers to the present invention), the mapping strategy of the invention achieves an average revenue-to-cost ratio 3.7% higher than the Presto algorithm (at most approximately 5% higher) and 6.3% higher than the RLA algorithm (at most approximately 7.8% higher). Further, as shown in Fig. 3 (where ICE refers to the present invention), the mapping strategy of the invention also outperforms the Presto and RLA algorithms in the success rate of virtual network requests, by at most 9.4% and 14.4% respectively. As the number of virtual network requests increases, the success rate of every algorithm decreases, but it decreases most slowly with the present invention. As shown in Fig. 4 (where ICE refers to the present invention), on mapping overhead the invention saves 4.9% of resource consumption on average compared with the Presto algorithm and 8.5% on average compared with the RLA algorithm, and, as shown in Fig. 5 (where ICE refers to the present invention), it obtains at most 9.2% and 12.7% more revenue than the two algorithms respectively. This is mainly because the invention is a collaborative virtual network embedding method based on the β-BI model, which, when mapping nodes, fully considers the available resource amount and the corresponding topological information of each node, edge and each part of the underlying network, learns a node-set mapping policy from this information through a neural network, and selects an appropriate edge-set mapping step to process the incoming virtual network requests.
Claims (3)
1. A collaborative embedding method of a cloud data center virtual network based on deep reinforcement learning is characterized by comprising the following specific steps:
S1, completing the modeling of virtual network embedding;
firstly, the underlying network of the actual data center is modeled as a weighted undirected graph G_s=(N_s,E_s), where the network nodes and links form the point set N_s and the edge set E_s respectively, and the weight C(n_s) of a point n_s∈N_s and the weight B(e_s) of an edge e_s∈E_s represent, respectively, the computing resource (CPU) of the network node and the network resource (bandwidth) of the link; the virtual network requested by the tenant is modeled in the same way as a weighted undirected graph G_v=(N_v,E_v);
virtual network embedding is then translated into finding a set of mapping functions from the graph G_v representing the virtual network to the graph G_s representing the underlying network, f=(f_N, f_E), where f_N: N_v→N_s maps each virtual node to an underlying node and f_E: E_v→P_s maps each virtual link to an underlying path, and it is ensured that the underlying path of each virtual link connects the underlying nodes onto which its two end points are mapped;
meanwhile, both the CPU resources and the bandwidth resources of the virtual network must be satisfied by the mapped underlying network: C(n_v)≤C(f_N(n_v)) for every virtual node n_v, and B(e_v)≤B(e_s) for every underlying edge e_s on the path f_E(e_v), where the function C(·) denotes the CPU computing resources available on (or required by) a node and the function B(·) denotes the bandwidth resources available on (or required by) a link;
on this basis, a linear combination of the total CPU demand and the total bandwidth demand in the virtual network request, R(G_v)=Σ_{n_v∈N_v} C(n_v)+α·Σ_{e_v∈E_v} B(e_v), is used to represent the expected revenue when the request is successfully processed, where α is a weighting factor; the sum of the computing-resource cost and the network-resource cost when processing the request, Cost(G_v)=Σ_{n_v∈N_v} C(n_v)+Σ_{e_v∈E_v} B(e_v)·hops(e_v), is used to represent the total overhead of processing the request, where hops(e_v) denotes the number of edges in the underlying network path onto which the virtual link e_v is mapped;
virtual network embedding is thus defined as finding a mapping function that satisfies the above constraints while maximizing the revenue and minimizing the overhead;
S2, completing the modeling of the computing resource (CPU) and the network resource (bandwidth) in the underlying network: first, an upper limit CPU_max on the available CPU amount and an upper limit Band_max on the available bandwidth are defined; then a fixed set of values cpu_strides and band_strides is preset, and the two intervals [0, CPU_max] and [0, Band_max] are divided equally into cpu_strides and band_strides segments respectively, the end points of the segments being recorded as cpu_i and band_j, where i∈[0, cpu_strides) and j∈[0, band_strides); finally, for each node n_s in the underlying network, the β-BI model of the CPU resource and the β-BI model of the bandwidth resource are constructed in turn, where the CPU β-BI of n_s at level cpu_i refers to the set of nodes reachable from n_s whose available CPU is not lower than cpu_i, and the bandwidth β-BI of n_s at level band_j refers to the set of nodes reachable from n_s through edges whose available bandwidth is not lower than band_j;
S3, constructing a neural network and then processing the incoming virtual network requests: for a received virtual network request G_v=(N_v,E_v) with the structure described in step S1, the neural network traverses N_v; for each node n_v traversed, the neural network outputs a probability distribution for mapping the virtual node n_v to the different underlying nodes, and an underlying node is then sampled from this distribution to complete the embedding of the virtual node n_v; after the whole virtual node set N_v has been traversed and embedded, the virtual links E_v are mapped by the BVLO and HVLM algorithms, thereby realizing the embedding of the whole virtual network; finally, the back propagation of the neural network is performed according to the error between the reward value obtained after the virtual network request has been mapped and the value predicted by the neural network during forward propagation, and the model parameters of the neural network are updated.
2. The deep reinforcement learning-based collaborative embedding method for the cloud data center virtual network according to claim 1, wherein the β-BI model of the CPU resource and the β-BI model of the bandwidth resource in step S2 are constructed by breadth-first search on the undirected graph G_s=(N_s,E_s) of the underlying network.
3. The method for collaborative embedding of the cloud data center virtual network based on the deep reinforcement learning of claim 1, wherein the step S3 specifically includes:
S31, constructing a neural network;
firstly, the features of the input layer are extracted: the underlying network node set N_s is traversed; for each underlying network node n_s, the β-BI model of the CPU resource and the β-BI model of the bandwidth resource computed in step S2 are obtained in turn, and features are extracted from the β-BI models corresponding to the underlying network node and from the information of the virtual network node n_v currently to be mapped, as the neural network input; specifically, for each CPU β-BI and bandwidth β-BI of the underlying network node n_s, the following are extracted as input features: the maximum, minimum, median and average of the available CPU amounts; the degree; the maximum, minimum, median and average of the available bandwidth amounts on the adjacent edges; the CPU computing resource demand of the virtual node n_v currently to be mapped; and the difference between the computing resource demand of n_v and the available computing resource amount of the underlying node n_s;
then the remaining part of the network is constructed, which comprises 4 layers; specifically, the remaining part is, in order, a one-dimensional convolution layer containing two convolution kernels, a ReLU layer, a one-dimensional convolution layer with a single kernel, and a softmax layer;
S32, the neural network receives the virtual network request G_v=(N_v,E_v) and processes it, first completing the mapping of the virtual node set N_v;
N_v is traversed, and for each virtual network node n_v the input features described in S31 are obtained;
firstly, the input features are processed by the one-dimensional convolution layer containing two convolution kernels; the input features are denoted M∈R^{m×f}, where m is the number of underlying nodes and f is the number of features, and the two convolution kernels are denoted w_1 and w_2; the output of the convolution layer is M·[w_1, w_2];
after the convolution layer with 2 convolution kernels, the features are processed by a ReLU layer to avoid overfitting;
the ReLU layer is followed by a single-kernel one-dimensional convolution layer; specifically, a score is computed for each underlying network node from the learning results of the previous two layers, quantitatively indicating how good it is to map the current virtual node n_v to each underlying network node; the output of this layer is denoted v=[v_1, v_2, ..., v_m], so the score computed for the i-th underlying network node is v_i;
finally, through a Softmax layer, the scores computed by the previous layer for each node are converted into probabilities and output; specifically, the output of the Softmax layer is denoted S=[s_1, s_2, ..., s_m], where s_1, s_2, ..., s_m correspond one-to-one to the underlying nodes and the probability distribution is s_i = exp(v_i)/Σ_{k=1}^{m} exp(v_k), exp() being the exponential function with the natural constant as its base; the higher the score of an underlying network node in the previous layer, the higher the probability value output by this layer and the greater the probability that it is selected as the mapping point of the current virtual node n_v; then, on the basis of the Softmax output, the nodes whose available computing resources cannot meet the computing resource demand of the virtual node n_v, or onto which the number of already-mapped virtual nodes has reached a certain threshold, are first filtered out, and sampling is then performed according to the probabilities output by the Softmax layer; the current virtual node n_v is mapped to the sampled underlying network node;
S33, after the mapping of the nodes in the virtual network request has been completed through step S32, the mapping of the links E_v is completed;
firstly, the virtual links to be mapped are ordered by the BVLO algorithm; specifically, let the underlying network nodes onto which the two end points of a virtual link e_v are mapped be n″_s and n‴_s; among the bandwidth β-BI models containing both n″_s and n‴_s, the one with the largest β value is denoted LCF_sn(n″_s, n‴_s), and its β value is the corresponding bandwidth level; the BVLO algorithm orders the virtual links according to the following keys in turn:
2) the smaller the number of nodes in LCF_sn(n″_s, n‴_s), the earlier the virtual link is ordered;
3) the larger the required bandwidth, the earlier the virtual link is ordered;
4) if two virtual links are equal under keys 1), 2) and 3), they are ordered randomly;
then virtual link mapping is performed based on the HVLM algorithm:
1) within LCF_sn(n″_s, n‴_s) of the underlying network nodes n″_s and n‴_s corresponding to the two end points of the virtual link, search for the set of paths from n″_s to n‴_s in which every edge meets the bandwidth requirement of the virtual link;
2) from the paths obtained in step 1), select the paths with the smallest number of edges;
3) from the paths with the smallest number of edges, select the path whose minimum available network resource over the traversed edges is the largest;
4) if multiple feasible paths still remain after steps 1), 2) and 3), select one of them randomly;
S34, the mapping of the point set N_v and the edge set E_v is completed, i.e. the processing of the virtual network request G_v=(N_v,E_v) is completed, and the neural network parameters are then updated according to the reward value of the virtual network mapping;
the reward is calculated as follows:
1) if the mapping of a virtual node fails, i.e. no mapping is found for some virtual node, then for each virtual node n_v of the virtual network request the reward is: the ratio of the difference between the available CPU amount of the underlying network node predicted with the maximum probability by the neural network in the forward propagation corresponding to that virtual node and the CPU amount required by the virtual node n_v, to the CPU amount required by the virtual node n_v;
2) if a virtual link e_v=(n″_v, n‴_v) fails to map, where n″_v and n‴_v are the end points of e_v, i.e. no suitable and feasible underlying path is found for the mapping, binary enumeration is first used to find, over all paths between the underlying nodes onto which n″_v and n‴_v are mapped, the maximum value of the available bandwidth of the minimum edge; then for each virtual node n_v of the virtual network request, the reward is the ratio of the difference between this maximum value and the required bandwidth of the virtual link to the required bandwidth of the virtual link;
3) if the virtual network request G_v=(N_v,E_v) is mapped successfully, then for each virtual node n_v the weighted sum of the computing resources of the virtual node and the network resources of the virtual links e_v connected to it is calculated and divided by the sum of the resources consumed by mapping the node and those links; the resulting quotient is used as the reward corresponding to mapping node n_v, where e_v denotes the virtual links connecting node n_v;
after the reward value corresponding to the virtual network request has been calculated, the error of each forward propagation is calculated from the value predicted in that forward propagation and its corresponding reward, and back propagation is performed to update the model parameters of the neural network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110995220.2A CN113708969B (en) | 2021-08-27 | 2021-08-27 | Collaborative embedding method of cloud data center virtual network based on deep reinforcement learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110995220.2A CN113708969B (en) | 2021-08-27 | 2021-08-27 | Collaborative embedding method of cloud data center virtual network based on deep reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113708969A true CN113708969A (en) | 2021-11-26 |
CN113708969B CN113708969B (en) | 2023-09-29 |
Family
ID=78655942
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110995220.2A Active CN113708969B (en) | 2021-08-27 | 2021-08-27 | Collaborative embedding method of cloud data center virtual network based on deep reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113708969B (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113037546A (en) * | 2021-03-03 | 2021-06-25 | 中国石油大学(华东) | Security virtual network mapping method based on deep reinforcement learning |
CN114268552A (en) * | 2021-12-16 | 2022-04-01 | 云南电网有限责任公司电力科学研究院 | Complex network node prediction method |
CN115001978A (en) * | 2022-05-19 | 2022-09-02 | 华东师范大学 | Cloud tenant virtual network intelligent mapping method based on reinforcement learning model |
CN115361731A (en) * | 2022-08-18 | 2022-11-18 | 东南大学 | Multi-user power distribution method based on one-dimensional convolutional neural network |
CN115941506A (en) * | 2022-09-29 | 2023-04-07 | 重庆邮电大学 | Multi-type service resource arrangement method based on strategy network reinforcement learning |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20110088146A (en) * | 2010-01-28 | 2011-08-03 | 한국과학기술원 | Method and system for embedding resource for request for virtual network in mobile environment |
CN110365514A (en) * | 2019-05-24 | 2019-10-22 | 北京邮电大学 | SDN multistage mapping method of virtual network and device based on intensified learning |
CN110365568A (en) * | 2019-06-18 | 2019-10-22 | 西安交通大学 | A kind of mapping method of virtual network based on deeply study |
CN113037546A (en) * | 2021-03-03 | 2021-06-25 | 中国石油大学(华东) | Security virtual network mapping method based on deep reinforcement learning |
-
2021
- 2021-08-27 CN CN202110995220.2A patent/CN113708969B/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20110088146A (en) * | 2010-01-28 | 2011-08-03 | 한국과학기술원 | Method and system for embedding resource for request for virtual network in mobile environment |
CN110365514A (en) * | 2019-05-24 | 2019-10-22 | 北京邮电大学 | SDN multistage mapping method of virtual network and device based on intensified learning |
CN110365568A (en) * | 2019-06-18 | 2019-10-22 | 西安交通大学 | A kind of mapping method of virtual network based on deeply study |
CN113037546A (en) * | 2021-03-03 | 2021-06-25 | 中国石油大学(华东) | Security virtual network mapping method based on deep reinforcement learning |
Non-Patent Citations (1)
Title |
---|
Zhu Guohui; Zhang Yin; Liu Xiuxia; Sun Tianao: "Virtual network mapping algorithm based on two-round priority ordering", Computer Engineering and Science, no. 05 *
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113037546A (en) * | 2021-03-03 | 2021-06-25 | 中国石油大学(华东) | Security virtual network mapping method based on deep reinforcement learning |
CN114268552A (en) * | 2021-12-16 | 2022-04-01 | 云南电网有限责任公司电力科学研究院 | Complex network node prediction method |
CN114268552B (en) * | 2021-12-16 | 2023-10-13 | 云南电网有限责任公司电力科学研究院 | Complex network node prediction method |
CN115001978A (en) * | 2022-05-19 | 2022-09-02 | 华东师范大学 | Cloud tenant virtual network intelligent mapping method based on reinforcement learning model |
CN115001978B (en) * | 2022-05-19 | 2023-10-03 | 华东师范大学 | Cloud tenant virtual network intelligent mapping method based on reinforcement learning model |
CN115361731A (en) * | 2022-08-18 | 2022-11-18 | 东南大学 | Multi-user power distribution method based on one-dimensional convolutional neural network |
CN115941506A (en) * | 2022-09-29 | 2023-04-07 | 重庆邮电大学 | Multi-type service resource arrangement method based on strategy network reinforcement learning |
Also Published As
Publication number | Publication date |
---|---|
CN113708969B (en) | 2023-09-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |