Disclosure of Invention
In view of this, the present invention provides a method for reliably deploying a service function chain based on deep reinforcement learning, so as to reduce failure probability and loss on the premise that a deployment result meets an overall reliability requirement.
In order to achieve the purpose, the invention provides the following technical scheme:
A method for reliable deployment of a service function chain based on deep reinforcement learning comprises the following steps:
S1: obtaining a reliability value using a reliability measure based on the device usage degree and the surrounding safety coefficient;
S2: preliminarily determining the reliability requirement of each virtual network function (VNF) from its functional and topological characteristics;
S3: obtaining a deployable link length that satisfies the virtual link reliability requirement;
S4: based on these reliability requirements, using deep reinforcement learning to find an optimal mapping scheme suited to the virtual network environment and the base layer environment;
S5: during mapping, if the VNF reliability cannot be met, using a node backup method based on VNF importance; if the link deployment result does not meet the link reliability, using a link backup method based on link backup importance.
Further, in step S1, the reliability is determined from the device usage degree and the surrounding safety coefficient. The device usage degree estimates, through a Weibull distribution, the wear caused by the device's own operation; the surrounding safety coefficient is determined from the number of failures of the device and its distance from the most recently failed device. Specifically:
S11: the device usage degree is evaluated with a Weibull distribution. The longer a device has been in use, the more severe its wear and the greater its failure probability, so the usage degree is described indirectly through the failure probability, whose cumulative distribution function is

$$F(t) = 1 - e^{-(t/\eta)^{m}}$$

The longer the device is used, the greater the failure probability and the lower the reliability, so the reliability of physical node $n_i$ from the viewpoint of usage degree is defined as:

$$UR(n_i) = 1 - F(t) = e^{-(t/\eta)^{m}}$$

where $F(t)$ is the probability that a failure caused by device wear occurs within time $t$, and $\eta$ and $m$ are the Weibull scale and shape parameters, which can be obtained by statistical analysis of the log files and system configuration data of the local server.
For the same reason, the usage degree $UL(l_{i,j})$ of physical link $l_{i,j}$ uses the same expression;
S12: in general, some devices operate in harsh environments or are frequently attacked, and therefore fail more often; moreover, if the most recent failure occurred nearby, this indicates to some extent that the node's current environment may carry a significant safety hazard. Therefore, the surrounding safety coefficient of a physical node is defined as follows:
where $q$ is the number of failures of physical node $n_i$ within a period of time, and $pos[n_i]$ is the distance, in hops, from $n_i$ to the most recently failed physical node; a physical node with no failures or few failures has a higher reliability coefficient for its surrounding environment.
Similarly, the surrounding reliability coefficient of a physical link is:
where $pos[l_{i,j}]$ is the number of physical nodes between $l_{i,j}$ and the most recently failed link. The final reliability is the product of the two coefficients, namely: $RN(n_i) = UR(n_i)\cdot ER(n_i)$ and $RL(l_{i,j}) = UR(l_{i,j})\cdot ER(l_{i,j})$.
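The reliability measure of step S1 can be illustrated with a minimal Python sketch. It assumes that the usage-degree reliability is the survival function of a two-parameter Weibull distribution with scale eta and shape m. The expression for the surrounding coefficient ER is not reproduced here, so it is passed in as a plain number; the function names and sample values are hypothetical.

```python
import math


def weibull_usage_reliability(t: float, eta: float, m: float) -> float:
    """Usage-degree reliability, assumed to be the Weibull survival function
    exp(-(t / eta) ** m), i.e. one minus the Weibull CDF."""
    if t < 0 or eta <= 0 or m <= 0:
        raise ValueError("t must be >= 0 and eta, m must be > 0")
    return math.exp(-((t / eta) ** m))


def final_reliability(usage_reliability: float, surrounding_coefficient: float) -> float:
    """Final reliability as the product of the two coefficients (RN = UR * ER)."""
    return usage_reliability * surrounding_coefficient


# Hypothetical example: a node used for 2 time units, with an ER value
# supplied by whatever surrounding-coefficient formula is in use.
ur = weibull_usage_reliability(t=2.0, eta=10.0, m=1.5)
rn = final_reliability(ur, surrounding_coefficient=0.95)
```

The same two functions apply to a physical link by substituting the link's usage time and surrounding coefficient.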
Further, in step S2, the invention refines the SFC reliability requirement into reliability requirements for its individual components; satisfying these component requirements achieves a reliable mapping of the service function chain. Because virtual nodes and virtual links in a virtual network differ in nature, the SFC reliability requirement must first be divided into two parts, namely the virtual node reliability requirement and the virtual link reliability requirement, which are both defined as
The importance score of each virtual network function is then determined from the topology information and function information of the service function chain, namely the VNF sharing degree, the recovery resource cost, the VNF function importance and the VNF state.
Further, the reliability requirement of each virtual network function is determined as follows:
because each VNF in the SFC has different topological and functional characteristics, the impact of its failure, the difficulty of recovery and the recovery time all differ. The importance of each VNF with respect to failure is therefore determined from these characteristics, and a reliability requirement matching that importance is derived. The following reference features are used to determine the reliability requirement:
1) VNF sharing degree: function multiplexing allows a single VNF to be shared by several SFCs at the same time. In this step, the number of SFCs sharing a VNF is defined as the VNF sharing degree; compared with a VNF dedicated to one SFC, the failure of a VNF shared among several SFCs affects all related services;
2) Recovery cost: each VNF consumes resources, and the amount required differs from VNF to VNF, so the difficulty of recovery after failure also differs. The higher the resource demand, the fewer physical nodes can satisfy it and the lower the recovery probability, so more reliability needs to be provided to such a VNF; this improves the recovery probability and at the same time reduces the overall cost of long-term operation;
3) VNF functional importance: for the InP, VNF functions differ in importance. For example, failure of a content caching function loses the cached content, which is difficult to recover if it was not backed up and results in immeasurable loss, whereas a failed firewall function is easily remapped and recovered, so the resulting momentary service interruption has little impact. Because this attribute is difficult to describe with objective data, the importance of each VNF is determined from human experience, and relatively important functions are given higher reliability;
4) VNF state: each type of VNF stores its own network state information, including data flow information and address mappings. The state comprises a relevant state and an irrelevant state: the relevant state relates only to the VNF itself, whereas the irrelevant state is modified as data flows arrive and are processed. Failure recovery for the irrelevant state is more complex than for the relevant state and requires longer recovery time and higher recovery cost, so a VNF that holds an irrelevant state needs higher reliability;
the importance of a VNF is determined from these four attributes, and the corresponding reliability requirement is determined from it. Because the attributes are of different types, the attribute values are first normalized:
the sharing degree of a VNF is expressed by the number $\psi$ of SFCs sharing it; to keep excessive sharing from making common-cause failures too likely, the maximum sharing degree of a VNF is limited to 4, so the normalized result is $\psi/4$;
the recovery cost is the resource demand, normalized against the maximum resource demand in the SFC;
the VNF function importance is graded manually to obtain an importance grade $X_I$, which is recorded in a scoring table and then normalized;
the VNF state attribute is 1 if the VNF holds an irrelevant state and 0.5 if it does not;
finally, the sum of the normalized attribute values of a VNF represents its importance within the SFC, from which the normalized reliability coefficient $\omega_I$ is obtained.
The reliability requirement of each VNF can then be obtained, and at deployment time it is sufficient to find a physical node that meets this requirement.
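The attribute normalization described above can be sketched in Python as follows. The sharing-degree cap of 4, the normalization of recovery cost by the maximum demand in the SFC, and the state values 1 and 0.5 come from the text. Two points are assumptions made only for illustration: the importance grade is divided by a maximum grade `max_grade`, and the coefficient of each VNF is its score divided by the sum of all scores, so that the coefficients sum to 1. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List

MAX_SHARING_DEGREE = 4  # maximum sharing degree stated in the text


@dataclass
class VnfAttributes:
    sharing_degree: int         # psi: number of SFCs sharing the VNF
    resource_demand: float      # recovery cost, taken as the resource demand
    importance_grade: int       # manually assigned grade from the scoring table
    has_irrelevant_state: bool  # True if the VNF holds an irrelevant state


def importance_scores(vnfs: List[VnfAttributes], max_grade: int) -> List[float]:
    """Sum of the four normalized attribute values for each VNF."""
    max_demand = max(v.resource_demand for v in vnfs)
    scores = []
    for v in vnfs:
        sharing = v.sharing_degree / MAX_SHARING_DEGREE
        cost = v.resource_demand / max_demand
        grade = v.importance_grade / max_grade  # ASSUMPTION: grade / maximum grade
        state = 1.0 if v.has_irrelevant_state else 0.5
        scores.append(sharing + cost + grade + state)
    return scores


def reliability_coefficients(scores: List[float]) -> List[float]:
    """ASSUMPTION: each coefficient is the score divided by the total score."""
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical two-VNF chain.
chain = [
    VnfAttributes(sharing_degree=4, resource_demand=10.0, importance_grade=5, has_irrelevant_state=True),
    VnfAttributes(sharing_degree=1, resource_demand=5.0, importance_grade=1, has_irrelevant_state=False),
]
scores = importance_scores(chain, max_grade=5)
omegas = reliability_coefficients(scores)
```

Under this assumed normalization, a VNF whose coefficient exceeds 1/D, with D the number of VNFs in the chain, is above the average importance.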
Further, in step S3, with the virtual link reliability requirement determined, a link mapping scheme must be found that satisfies it. Function multiplexing is considered at the same time: to jointly optimize function deployment and bandwidth demand, a balance point must be found between function multiplexing and path length. A link reliable mapping model based on function multiplexing is therefore adopted, which decomposes the service function chain deployment problem into two subproblems: 1) setting the maximum number of transmission hops $u'_k$; 2) finding, within $u'_k$ hops, a valid path with the highest multiplexing degree;
for subproblem 1), function multiplexing and the link reliability requirement are considered jointly. Reliable deployment of a virtual link differs from that of a virtual node: the hop count of a link is not fixed, and the more hops the deployed link has, the lower the reliability of the mapped link. The link reliability requirement can therefore be met by controlling the link length $u_k$, which gives:
where $[\cdot]^{+}$ denotes rounding up and $r_l$ is the average reliability of the base layer links:
$L$ is the number of base layer links, and the link length at final deployment should be smaller than $u_k$. With reference to the base layer network distribution, the following must also be satisfied:
a. to ensure that the service path connects the source node and the destination node, the deployment path is not shorter than the shortest path between the sending and receiving nodes;
b. whether VNFs are reused or newly deployed, it must be guaranteed that all VNFs can be deployed on the path.
Further, $u_k$ is derived from reliability alone and does not take the base layer network topology into account, so in actual deployment conditions a and b may not be satisfiable, and the request can then only be rejected even if resources are sufficient. The main cause is insufficient path length; to avoid this, the maximum hop count $u_k$ is extended appropriately, and if the link reliability of the deployment result is then insufficient, it can be satisfied through backup, so that:
where SF is the extension factor; deploying with the extended length $SF\cdot u_k$ can greatly increase the likelihood that node deployment succeeds.
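The relation between the link reliability requirement and the deployable hop count can be sketched in Python. The exact expression for $u_k$ is not reproduced here, so the sketch assumes independent link failures, in which case a path of u hops with average link reliability r has reliability r**u, and the bound follows from ln(R) / ln(r), rounded up as the text describes. Rounding the extended length up to an integer is also an assumption, and the function names are hypothetical.

```python
import math
from typing import Sequence


def average_link_reliability(link_reliabilities: Sequence[float]) -> float:
    """Mean reliability over the L base layer links."""
    return sum(link_reliabilities) / len(link_reliabilities)


def max_hops(required_reliability: float, avg_link_reliability: float) -> int:
    """ASSUMPTION: independent links, so path reliability is r ** u.
    Returns ln(R) / ln(r) rounded up; the deployed length must stay below it."""
    if not (0.0 < required_reliability < 1.0 and 0.0 < avg_link_reliability < 1.0):
        raise ValueError("both reliabilities must lie strictly between 0 and 1")
    return math.ceil(math.log(required_reliability) / math.log(avg_link_reliability))


def extended_max_hops(u_k: int, extension_factor: float) -> int:
    """Extended hop limit SF * u_k (rounded up here to keep it an integer)."""
    return math.ceil(extension_factor * u_k)


# Hypothetical numbers: requirement 0.9, average link reliability 0.98, SF = 1.5.
u_k = max_hops(0.9, 0.98)
u_k_extended = extended_max_hops(u_k, 1.5)
```

A larger SF admits more candidate paths but makes it more likely that the deployed link needs a backup to reach the required reliability.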
Further, in step S4, the mapping scheme is obtained through deep reinforcement learning based on the VNF reliability and the deployable link length. Deep reinforcement learning applies the perception capability of deep learning to the decision process of reinforcement learning: it interacts with the environment through continual trial and error, perceives the unknown conditions of the environment through a neural network so as to obtain an accurate estimate of the reward, and finds the best strategy by maximizing the cumulative reward. The method requires three elements to be defined, namely the state set, the action set and the reward, as follows:
State space: $s_t = \{B_{left}, C_{left}, b_{map}, c_{map}\}$, where $B_{left}$ and $C_{left}$ denote the remaining node resource set and the remaining link resource set respectively; their values reflect the underlying topology. In practice, failures caused by random factors in the environment change the server network topology, and failed nodes and links are set to 0; once repair is complete, mapping on that node or link can continue. In this way the real-time interaction of reinforcement learning with the environment is used to perceive and respond to topology changes. $b_{map}$ and $c_{map}$ are the mapping state sets of the already mapped nodes and links of the SFC, representing the virtual network state corresponding to the base layer environment state; the node state comprises the deployment result and reusability of the mapped nodes, and the link state is the set of deployed links;
Action space: $a_t = \{a_n, a_l, a_\kappa\}$, where $a_n$ is the node mapping action, $a_l$ is the link mapping action, and $a_\kappa$ is the resource allocation action, which allocates resources according to demand. When the node mapping action is executed, the VNFs already carried by the target base layer node are searched; if the physical node already carries the required VNF, that VNF is multiplexed directly, the node resource allocation action is not executed, and each $\omega_I$ is recalculated; otherwise, the VNF is instantiated and consumes computing resources;
the action set is screened so as to increase the probability of selecting valid actions. The training result should correctly reflect the strategy that obtains the maximum cumulative reward, so the neural network only needs to represent deployment strategies that meet the requirements; learning invalid state-action pairs is unimportant, so screening causes no performance loss and can effectively reduce learning time. The nodes and links that meet the requirements form the valid node set and the valid link set respectively.
Reward:
the reward ensures that the VNF mapping action minimizes wasted reliability while still guaranteeing the reliability requirement, and that the resource consumption of each mapped segment is minimized while load balance is taken into account; if an invalid action is selected, for example one for which the reliability is not satisfied, the action reward is set to -100.
Deep reinforcement learning continually executes actions to obtain the corresponding reward and the next state, and collects multiple vectors $[s_t, a_t, r_t, s_{t+1}]$ into an experience replay pool. Samples are then drawn at random from the replay pool to reduce the correlation of the training data, and a convolutional neural network is trained. Through its perception capability, the neural network maps each state-action pair to a value function representing its potential future value; once training is sufficient and the fit is complete, the optimal strategy can be obtained.
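The experience replay pool described above can be illustrated with a minimal Python sketch. It stores transition vectors (s_t, a_t, r_t, s_{t+1}) in a fixed-capacity buffer and draws uniform random mini-batches. The class name, the eviction of the oldest entry when the pool is full, and the optional seed are illustrative assumptions, not details taken from the text.

```python
import random
from collections import deque
from typing import Any, List, Optional, Tuple

Transition = Tuple[Any, Any, float, Any]  # (s_t, a_t, r_t, s_{t+1})


class ReplayPool:
    """Fixed-capacity pool of transitions with uniform random sampling."""

    def __init__(self, capacity: int, seed: Optional[int] = None) -> None:
        self._buffer = deque(maxlen=capacity)  # oldest entry is dropped when full
        self._rng = random.Random(seed)

    def push(self, state: Any, action: Any, reward: float, next_state: Any) -> None:
        self._buffer.append((state, action, reward, next_state))

    def is_full(self) -> bool:
        return len(self._buffer) == self._buffer.maxlen

    def sample(self, batch_size: int) -> List[Transition]:
        """Draw a mini-batch without replacement to decorrelate training data."""
        return self._rng.sample(list(self._buffer), batch_size)

    def __len__(self) -> int:
        return len(self._buffer)


# Hypothetical usage with placeholder states and actions.
pool = ReplayPool(capacity=3, seed=0)
for step in range(4):
    pool.push(f"s{step}", f"a{step}", -1.0, f"s{step + 1}")
batch = pool.sample(2)
```

Sampling at random rather than replaying transitions in order is what breaks the correlation between consecutive deployment steps.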
Further, in step S5, to handle insufficient reliability during deployment, a node backup scheme based on virtual network function importance and a link backup scheme based on link backup importance are adopted, specifically:
during action selection, when no node meeting the reliability requirement can be selected, the node backup scheme is used. The scheme chooses the backup mode according to the VNF importance. A more important VNF, i.e. one above the average importance, receives dedicated backup: the deployed VNF is instantiated again on a nearby physical node as a backup, and if the source node fails, the VNF on the backup node is used directly; this copes effectively with multi-node failures while guaranteeing reliability. A relatively unimportant VNF receives shared backup, i.e. several backup VNFs share the resources of the same backup node, which effectively reduces resource consumption while guaranteeing reliability;
when the deployment result does not meet the virtual link reliability, the link backup scheme is executed. The scheme aims both to reduce resource consumption and to increase the reliability gain:
where BW is the resource consumption of the link backup;
at the same time, virtual links with low reliability are backed up first wherever possible, which gives:
according to this measure, backups can be chosen that give a higher reliability gain, consume as few resources as possible and reduce the failure probability of the lower-reliability links; the iterative backups are ordered by it.
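The iterative link backup can be illustrated with a greedy Python sketch. The ranking measure is not reproduced here, so the sketch makes three assumptions for illustration only: a backed-up link behaves as two parallel links, with reliability 1 - (1 - r)(1 - r_b); links are ranked by reliability gain per unit of backup bandwidth BW; and the chain's link reliability is the product of its link reliabilities. All names are hypothetical.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Tuple


@dataclass
class VirtualLink:
    name: str
    reliability: float         # reliability of the deployed link
    backup_reliability: float  # reliability of the candidate backup path
    backup_bandwidth: float    # BW: resources consumed by the backup


def backed_up_reliability(link: VirtualLink) -> float:
    """ASSUMPTION: primary and backup act as independent parallel links."""
    return 1.0 - (1.0 - link.reliability) * (1.0 - link.backup_reliability)


def backup_score(link: VirtualLink) -> float:
    """ASSUMPTION: reliability gain per unit of backup bandwidth."""
    return (backed_up_reliability(link) - link.reliability) / link.backup_bandwidth


def plan_link_backups(links: List[VirtualLink], required: float) -> Tuple[List[str], float]:
    """Back up links in descending score order until the requirement is met."""
    current = {link.name: link.reliability for link in links}
    chosen: List[str] = []
    for link in sorted(links, key=backup_score, reverse=True):
        if prod(current.values()) >= required:
            break
        current[link.name] = backed_up_reliability(link)
        chosen.append(link.name)
    return chosen, prod(current.values())


# Hypothetical chain with one weak link and one strong link.
links = [
    VirtualLink("A", reliability=0.90, backup_reliability=0.90, backup_bandwidth=10.0),
    VirtualLink("B", reliability=0.99, backup_reliability=0.99, backup_bandwidth=10.0),
]
chosen, achieved = plan_link_backups(links, required=0.97)
```

In this example the weaker link A offers the larger gain for the same bandwidth, so it is backed up first and the requirement is met without touching link B.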
The invention has the following beneficial effects: the reliable deployment method provided by the invention can cope effectively with base layer failures while guaranteeing the reliability requirement, reduces the number of SFCs rendered invalid, and maintains load balance so that the whole virtual network is more stable and reliable.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Detailed Description
The technical solutions in the embodiments of the present invention will be described in detail and clearly with reference to the accompanying drawings.
Fig. 1 is a framework diagram of the application environment of the invention. It shows a basic model for providing end-to-end services in a network virtualization environment; the framework is a network virtualization architecture built on an NFV-based orchestration and control framework. An end-to-end service function chain starts from a terminal device, and its VNFs are deployed in sequence across the access network and the core network to fulfil the corresponding service requirements. The virtual sub-network of service function chains is built and managed by the service provider (SP). The infrastructure provider (InP) is responsible for building, operating and maintaining the basic physical facilities, and provides reliable resources through a standardized interface in response to the SP's resource requests. When a user makes a request, the SP generates an end-to-end network service chain consisting of several virtual network functions, such as SFC1 in slice 1 and SFC2 in slice 2, which requires determining the required functions, their number and their order.
Fig. 2 is a schematic diagram of function multiplexing. Function multiplexing means that if, during mapping, the VNF to be deployed is found to be already carried on a server, it can be shared with the already deployed SFC, and this consumes no node computing resources. The number of VNFs multiplexed in an SFC is called the multiplexing degree, and raising it during mapping reduces the consumption of node resources. However, the deployed functions are fixed in location, so multiplexing can consume more link resources than instantiating a new VNF. For example, in Fig. 2 the virtual network function $v_I$ of a service request needs to be deployed, and there are two deployment schemes. The first deploys it on $n_i$, which requires two physical links but also consumes node resources to instantiate $v_I$. The second deploys it on $n_j$, which consumes four physical links, but server $n_j$ already carries $v_I$, so it can be multiplexed directly and no further node resources are consumed. The two schemes thus focus on optimizing link deployment and node deployment respectively. Therefore, to jointly optimize function deployment and bandwidth demand, a balance point needs to be found between function multiplexing and path length.
Fig. 3 is a schematic diagram of the deep reinforcement learning framework of the invention. When a virtual network request arrives, the global service orchestrator (GS-O) first starts exploring the base layer network environment and the virtual layer environment, as shown by the search loop of Fig. 3. The corresponding states are obtained by exploring the base layer and virtual layer environments, and the action to execute is chosen by an ε-greedy strategy: either an action is selected at random in the base layer environment, or it is selected by the value neural network.
During random action selection, the VNF requirement may be high, and there may be no physical node in the mapping range that satisfies the reliability requirement; if resources are sufficient and nodes are plentiful, the reliability can then be satisfied through backup. In this case the importance of the VNF is determined and the node backup scheme is executed to obtain an action group. Likewise, when the action with the maximum Q value selected by the neural network cannot meet the reliability requirement, the same backup operation is carried out by adding the action with the next-largest Q value.
According to the selection result, the deployment action $a_t$ of the t-th VNF is executed, the corresponding deployment reward $r_t$ is obtained, and the state of the base layer environment resources changes to $s_{t+1}$, giving the vector $[s_t, a_t, r_t, s_{t+1}]$, which is stored in the experience pool. Exploration then continues from $s_{t+1}$ and the (t+1)-th VNF is deployed, until all VNFs are deployed, which completes one deployment cycle. The next deployment cycle then starts exploring from the beginning. Vectors are collected iteratively in this process.
After enough experience vectors have been stored in the experience pool, a batch of vectors is drawn from it at random. From $s_t$ and $a_t$, the main neural network gives the estimated Q value of the corresponding action, $Q(s_t, a_t \mid \theta)$, where $\theta$ denotes the parameters of the main neural network. From $r_t$ and $s_{t+1}$, the target Q value is obtained:
The difference between the main network Q value and the target Q value is then used as the loss function:
$\theta$ is updated from the loss function using stochastic gradient descent (SGD). The target network is not trained separately; its parameters derive from the training results of the main network: to reduce time complexity, the parameters $\theta$ of the main network are copied to the target network every T training steps of the main network. The learning loop of Fig. 3 is executed repeatedly: after a batch of vectors has been trained on, the environment continues to be explored, the vectors in the experience pool are updated, and iteration continues until the loss function converges to a sufficiently small range. At this point the Q-estimation neural network has essentially converged, and the deployment loop of the figure can be carried out.
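The target value, the loss and the periodic parameter copy can be sketched in Python using the standard deep Q-network form, which is assumed here rather than taken from the text: the target is r_t + gamma * max over a of Q(s_{t+1}, a | theta^-), and the loss is the squared difference between target and estimate. The discount factor gamma, the dictionary representation of parameters and the function names are illustrative assumptions; terminal-state handling is omitted.

```python
from typing import Dict, Sequence


def td_target(reward: float, next_q_values: Sequence[float], gamma: float) -> float:
    """ASSUMED standard DQN target: r_t + gamma * max_a Q(s_{t+1}, a | theta^-).
    `next_q_values` are the target network's Q values for the next state."""
    return reward + gamma * max(next_q_values)


def squared_td_loss(q_estimate: float, target: float) -> float:
    """Squared difference between the target and the main network's estimate."""
    return (target - q_estimate) ** 2


def maybe_sync_target(step: int, sync_every: int,
                      main_params: Dict[str, float],
                      target_params: Dict[str, float]) -> bool:
    """Copy the main network parameters to the target network every T steps."""
    if step % sync_every == 0:
        target_params.clear()
        target_params.update(main_params)
        return True
    return False


# Hypothetical numbers for one transition.
y = td_target(reward=1.0, next_q_values=[0.5, 2.0], gamma=0.9)
loss = squared_td_loss(q_estimate=2.0, target=3.0)
```

Holding the target network fixed between copies keeps the regression target stable while the main network is being updated.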
If, after SFC deployment is completed in the deployment loop, the link reliability is found not to be satisfied, the link backup module is executed: the virtual links are sorted by their link backup importance and backed up iteratively from first to last until the reliability requirement is met. The final deployment scheme is then obtained.
The whole method is summarized as follows:
Step 1: initialize the experience replay pool, randomly initialize the main network parameters $\theta$ and copy them to the target network parameters $\theta^{-}$; obtain the pre-assigned VNF reliability result $\omega_I$ and the deployable length $u'_k$ from the virtual network function reliability determination model and the function-multiplexing-based virtual link reliable mapping model. Set episode = 0.
Step 2: initialize the state $s_1 = \{B_{left}, C_{left}, b_{map}, c_{map}\}$ and set t = 0.
Step 3: determine the set of selectable actions. Randomly generate a number $\tau$ between 0 and 1; if $\tau < \varepsilon$, go to step 4, otherwise go to step 6.
Step 4: if the set of selectable actions is not empty, randomly select an action $a_t$ from it; otherwise, go to step 5.
Step 5: if the corresponding candidate set is not empty: when $\omega_I$ is less than 1/D, execute shared backup, creating the new node near an already deployed backup node; if there is no deployed backup node, or $\omega_I$ is higher than 1/D, execute dedicated backup, selecting the node with the highest reliability as the deployment node and another node as the backup node, until the reliability requirement of the VNF is met.
Step 6: select an action based on the Q values estimated by the main neural network.
Step 7: execute action $a_t$ to obtain the reward $r_t$ and the next state $s_{t+1}$, and put the vector $(s_t, a_t, r_t, s_{t+1})$ into the experience replay pool. If the experience pool is full, go to step 8. If it is not full and all VNFs have been deployed, go to step 2. Otherwise set t = t + 1 and, if t < D, go to step 3.
Step 8: randomly take a mini-batch of sample vectors from the experience replay pool.
Step 9: take one vector, feed it to the main network and the target network, and obtain the loss function.
Step 10: update $\theta$ by stochastic gradient descent, and copy the main network parameters to the target network after every T updates of $\theta$. If some vectors have not yet been trained on, go to step 9; otherwise set episode = episode + 1; if episode = E, go to step 11, otherwise go to step 2.
Step 11: obtain the optimal strategy for each VNF through the main network; if $Q(s_t, a_t) < -100$, the request is rejected.
Step 12: if the virtual link reliability requirement is not satisfied, compute the link backup importance of each virtual link and iterate link backup from high to low until the virtual link reliability requirement is met.
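The action selection of steps 3 to 6 can be sketched in Python as an ε-greedy choice restricted to the screened set of valid actions. Returning `None` when no valid action exists, so that the caller can fall back to the node backup scheme, is an illustrative simplification; the function name and the dictionary of Q values are hypothetical.

```python
import random
from typing import Dict, Hashable, Optional, Sequence


def select_action(valid_actions: Sequence[Hashable],
                  q_values: Dict[Hashable, float],
                  epsilon: float,
                  rng: random.Random) -> Optional[Hashable]:
    """Epsilon-greedy selection over valid actions only.

    With probability epsilon a valid action is drawn at random (exploration);
    otherwise the valid action with the largest estimated Q value is chosen.
    """
    if not valid_actions:
        return None  # no valid action: caller falls back to node backup
    if rng.random() < epsilon:
        return rng.choice(list(valid_actions))
    return max(valid_actions, key=lambda action: q_values[action])


# Hypothetical usage with two candidate physical nodes.
rng = random.Random(0)
q = {"n1": 1.0, "n2": 2.0}
greedy_choice = select_action(["n1", "n2"], q, epsilon=0.0, rng=rng)
random_choice = select_action(["n1", "n2"], q, epsilon=1.0, rng=rng)
```

Restricting both branches to valid actions reflects the screening of the action set, so invalid state-action pairs are rarely visited during training.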
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered by the claims of the present invention.