CN111147307B - Service function chain reliable deployment method based on deep reinforcement learning

Info

Publication number: CN111147307B
Application number: CN201911397560.4A
Authority: CN
Inventors: 唐伦; 曹睿; 贺兰钦; 管令进; 胡彦娟; 陈前斌
Original assignee: Chongqing University of Post and Telecommunications
Current assignee: Ningbo Shipping Exchange Co ltd; Shenzhen Wanzhida Technology Transfer Center Co ltd
Priority date: 2019-12-30
Filing date: 2019-12-30
Publication date: 2022-04-29
Anticipated expiration: 2039-12-30
Also published as: CN111147307A

Abstract

The invention relates to a method for reliably deploying service function chains (SFCs) based on deep reinforcement learning, and belongs to the technical field of communication. The method comprises the following steps. S1: obtain a reliability value using a reliability measure based on the device usage degree and the surrounding safety coefficient. S2: preliminarily determine the reliability requirement of each virtual network function (VNF) from its functional and topological characteristics. S3: obtain a deployable link length that satisfies the virtual link reliability requirement. S4: based on these reliability requirements, use deep reinforcement learning to find an optimal mapping scheme suited to the virtual network environment and the base-layer environment. S5: during mapping, if the VNF reliability requirement cannot be met, apply a node backup method based on VNF importance; if the link deployment result does not meet the link reliability requirement, apply a link backup method based on link backup importance. While guaranteeing the reliability requirements, the method copes effectively with base-layer faults, reduces the number of failed SFCs, and maintains load balance, making the whole virtual network more stable and reliable.

Description

Service function chain reliable deployment method based on deep reinforcement learning

Technical Field

The invention belongs to the technical field of communication, and relates to a service function chain reliable deployment method based on deep reinforcement learning.

Background

With the rise of technologies such as cloud computing and big data, people increasingly rely on the Internet for daily activities such as work, shopping and video. These technologies bring convenience but also place greater demands on network performance and architecture. Conventional network architectures lack flexibility and require a range of dedicated hardware devices to provide services. The network functions thus form a rigid network function chain that is fixed with respect to the network topology and the location of the service-providing elements, so modifying the chain means modifying the network topology or changing the connections of the middleboxes. Providing a new service therefore incurs large investment and operating costs for deploying and upgrading physical infrastructure. Network function virtualization (NFV), as an industry network architecture, uses virtualization technology to offer a new approach to future network design. Its main idea is to decouple the software implementation of a network function from the underlying hardware and migrate it to a virtual platform, so that software running on industry-standard general-purpose hardware such as servers, storage and switches performs the functions of traditional dedicated network hardware. This enables flexible deployment of network functions and rapid service provisioning. At present the overall framework of the virtualized network is relatively clear, but its construction mechanisms, and the reliability problem in particular, still hinder large-scale engineering deployment and need further study.

Reliability is a measure of the ability of a network to cope with failures. As networks grow in scale and complexity, they face more and more threats. To keep the services provided to users stable and secure and to avoid irreparable losses, a network must, after natural or man-made failures occur, use preventive or restorative techniques to mitigate the damage to network services so that it can still provide a certain level of service.

Existing reliability techniques have the following disadvantages:

First, the notion of network reliability has gradually expanded from traditional communication reliability based on the network topology, to capacity reliability that takes network traffic into account, and further to service-based performance reliability that takes user requirements into account. Existing reliability description models are therefore no longer accurate. Second, existing reliability-aware mapping schemes merely satisfy the overall reliability requirement. They ignore the fact that failures of different virtual network functions cause different losses, and they do not address how to assign each virtual network function a corresponding reliability so that the loss caused by failures is reduced along with the SFC failure rate. In addition, most solutions in the current literature are heuristic; they reduce complexity but easily fall into local optima.

It is therefore worth studying how to determine a reliability model that matches current network characteristics, how to determine effectively the loss caused by the failure of each virtual network function (VNF), how to assign a high reliability requirement to a high-loss VNF, and how to obtain the corresponding deployment scheme intelligently and efficiently.

Disclosure of Invention

In view of this, the present invention provides a method for reliably deploying a service function chain based on deep reinforcement learning, so as to reduce failure probability and loss on the premise that a deployment result meets an overall reliability requirement.

In order to achieve the purpose, the invention provides the following technical scheme:

A service function chain reliable deployment method based on deep reinforcement learning comprises the following steps:

S1: obtain a reliability value using a reliability measure based on the device usage degree and the surrounding safety coefficient;

S2: preliminarily determine the reliability requirement of each virtual network function from its functional and topological characteristics;

S3: obtain a deployable link length that satisfies the virtual link reliability requirement;

S4: based on these reliability requirements, use deep reinforcement learning to find an optimal mapping scheme suited to the virtual network environment and the base-layer environment;

S5: during mapping, if the VNF reliability requirement cannot be met, apply a node backup method based on VNF importance; if the link deployment result does not meet the link reliability requirement, apply a link backup method based on link backup importance.

Further, in step S1, the reliability is determined from the device usage degree and the surrounding safety coefficient. The device usage degree uses the Weibull distribution to estimate the wear that operation causes to the device, and the surrounding safety coefficient is determined from the number of failures of the device itself and its distance from failed devices. Specifically:

S11: The device usage degree is evaluated with the Weibull distribution. The longer a device has been in use, the more severe its wear and the greater its failure probability, so the usage degree is described indirectly by the failure probability, whose cumulative distribution function is

$F(t) = 1 - e^{-(t/\eta)^{m}}$

The larger $F(t)$ is, the longer the device has been in use and the greater its failure probability, so the reliability of physical node $n_i$ from the viewpoint of usage degree is defined as:

$UR(n_i) = 1 - F(t)$

where $F(t)$ is the probability that a failure caused by equipment wear occurs within time $t$, and $\eta$ and $m$ are the Weibull scale and shape parameters, which can be obtained by statistical analysis of the log files and system configuration data of the local server.

Likewise, the usage-degree reliability $UL(l_{i,j})$ of physical link $l_{i,j}$ takes the same form;

S12: Some devices operate in harsh environments or are frequently attacked and therefore fail more often. Moreover, if the most recent failure occurred nearby, this indicates to some extent that the node's current environment carries a significant safety risk. The surrounding safety coefficient of a physical node is therefore defined as follows:

where $q$ is the number of failures of physical node $n_i$ within a period of time, and $pos[n_i]$ is the distance in hops from $n_i$ to the most recently failed physical node. A physical node that has had no failures, or only a few, has a higher surrounding safety coefficient.

Similarly, the corresponding coefficient for a physical link is:

where $pos[l_{i,j}]$ is the number of physical nodes between $l_{i,j}$ and the most recently failed link. The final reliability is the product of the two coefficients, namely $RN(n_i) = UR(n_i) \cdot ER(n_i)$ and $RL(l_{i,j}) = UR(l_{i,j}) \cdot ER(l_{i,j})$.
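The reliability measure of step S1 can be sketched in a few lines. This is a minimal illustration, not the patented formula set: it assumes the usage-degree reliability is the Weibull survival probability $1 - F(t)$, and it takes the surrounding safety coefficient ER as a given input because its exact expression is not reproduced in the text. All function names and the numeric values are hypothetical.

```python
import math


def weibull_failure_probability(t: float, eta: float, m: float) -> float:
    """Weibull CDF: probability that a wear-out failure occurs within time t."""
    return 1.0 - math.exp(-((t / eta) ** m))


def usage_reliability(t: float, eta: float, m: float) -> float:
    """UR: assumed here to be the survival probability 1 - F(t)."""
    return 1.0 - weibull_failure_probability(t, eta, m)


def final_reliability(ur: float, er: float) -> float:
    """RN or RL: product of the usage and surrounding-safety coefficients."""
    return ur * er


# Hypothetical node: 2 years in service, eta = 10 years, m = 1.5, ER = 0.9.
ur = usage_reliability(t=2.0, eta=10.0, m=1.5)
rn = final_reliability(ur, er=0.9)
print(f"UR = {ur:.4f}, RN = {rn:.4f}")
```

In practice $\eta$ and $m$ would be fitted from server logs, as the text describes; the constants above only make the sketch runnable.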

Further, in step S2, the invention refines the reliability requirement of the SFC into reliability requirements for its individual components; satisfying these component requirements achieves a reliable mapping of the service function chain. Because virtual nodes and virtual links in a virtual network differ in nature, the overall reliability requirement is first divided into two parts, the virtual node reliability requirement and the virtual link reliability requirement, both of which are defined as

The importance score of each virtual network function is then determined from the topology information and function information of the service function chain, namely the VNF sharing degree, the recovery resource cost, the VNF functional importance and the VNF state.

Further, the reliability requirement of each virtual network function is determined as follows:

Because each VNF in the SFC has different topological and functional characteristics, the impact of its failure, the difficulty of recovery and the recovery time all differ. The importance of each VNF under failure is therefore determined from these characteristics, and a reliability requirement matching that importance is then derived. The following reference features are used to determine the reliability requirement:

1) VNF sharing degree: function multiplexing allows a single VNF to be shared by several SFCs at the same time. The number of SFCs sharing a VNF is defined here as the VNF sharing degree. Compared with a VNF dedicated to one SFC, the failure of a VNF shared among several SFCs affects all of the related services;

2) Recovery cost: each VNF consumes resources, and the amount required differs from VNF to VNF, so the difficulty of recovery after a failure also differs. The higher the resource demand, the fewer physical nodes can satisfy it and the lower the probability of recovery, so such a VNF needs to be given higher reliability. This raises the recovery probability and at the same time reduces the overall cost of long-term operation;

3) VNF functional importance: for the InP, VNF functions differ in importance. For example, a failure of the content caching function loses content, which is hard to recover if it was not backed up and can cause immeasurable loss, whereas a failed firewall function is easily remapped and recovered, so the brief service interruption it causes has little impact. Because this attribute is difficult to describe with objective data, the importance of each VNF is determined from human experience, and relatively important functions are given higher reliability;

4) VNF state: each type of VNF stores its own network state information, including data flow information and address mappings. The state comprises relevant state and irrelevant state: relevant state relates only to the VNF itself, whereas irrelevant state is modified as data flows arrive and are processed. Failure recovery for a VNF holding irrelevant state is more complex than for one holding only relevant state and requires a longer recovery time and a higher recovery cost, so a VNF that contains irrelevant state needs higher reliability;

The importance of a VNF is determined from these four attributes, and the corresponding reliability requirement is determined from the importance. Because the attributes are of different types, their values are first normalized:

The sharing degree of a VNF is expressed by the number $\psi$ of SFCs that share it. To avoid an excessive probability of common-cause failures due to over-sharing, the maximum sharing degree of a VNF is limited to 4, so the normalized result is $\psi/4$;

The recovery cost is the resource demand, normalized with the maximum resource demand in the SFC as the reference;

The VNF functional importance is graded manually; the importance grade $X_I$ is recorded in a scoring table and then normalized;

For the VNF state, the normalized value is 1 if the VNF contains irrelevant state and 0.5 otherwise;

Finally, the sum of the normalized attribute values of a VNF represents its importance within the SFC, which yields the normalized reliability coefficient $\omega_I$:

From $\omega_I$ the reliability requirement of each VNF is obtained; at deployment time it is sufficient to find a physical node that meets this requirement.
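The attribute normalization and the coefficient $\omega_I$ can be illustrated with a small sketch. Several details are assumptions made only to keep the example runnable: the recovery cost is divided by the largest demand in the chain, the importance grade is divided by a hypothetical maximum grade of 5, and $\omega_I$ is taken to be each VNF's attribute sum divided by the total over the chain. The class and function names are illustrative.

```python
from dataclasses import dataclass

MAX_SHARING = 4  # maximum sharing degree stated in the text


@dataclass
class Vnf:
    name: str
    shared_sfcs: int            # psi: number of SFCs sharing this VNF (1..4)
    resource_demand: float      # recovery cost is taken to be the resource demand
    importance_grade: int       # X_I from the manual scoring table
    has_irrelevant_state: bool  # True if the VNF holds irrelevant state


def attribute_score(vnf, max_demand, max_grade):
    """Sum of the four normalized attribute values of one VNF."""
    sharing = vnf.shared_sfcs / MAX_SHARING
    recovery = vnf.resource_demand / max_demand    # assumed normalization
    importance = vnf.importance_grade / max_grade  # assumed normalization
    state = 1.0 if vnf.has_irrelevant_state else 0.5
    return sharing + recovery + importance + state


def reliability_coefficients(chain, max_grade=5):
    """omega_I for every VNF: its score divided by the chain total (assumed)."""
    max_demand = max(v.resource_demand for v in chain)
    scores = [attribute_score(v, max_demand, max_grade) for v in chain]
    total = sum(scores)
    return [s / total for s in scores]


chain = [
    Vnf("firewall", 1, 2.0, 1, False),
    Vnf("cache", 4, 4.0, 5, True),
    Vnf("nat", 2, 1.0, 3, True),
]
omega = reliability_coefficients(chain)
for vnf, w in zip(chain, omega):
    print(f"{vnf.name}: omega = {w:.3f}")
```

With this normalization the coefficients sum to 1, so a VNF of exactly average importance in a chain of D functions has $\omega_I = 1/D$.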

Further, in step S3, with the virtual link reliability requirement already determined, a link mapping scheme must now be found that satisfies it. Function multiplexing is considered at the same time: to jointly optimize function deployment and bandwidth demand, a balance point must be found between function multiplexing and path length. A link reliable-mapping model based on function multiplexing is adopted, which decomposes the deployment problem of a service function chain into two sub-problems: 1) setting the maximum number of transmission hops $u'_k$; 2) finding, within $u'_k$ hops, a valid path with the highest multiplexing degree;

For sub-problem 1), function multiplexing and the link reliability requirement are considered jointly. Reliable deployment of a virtual link differs from that of a virtual node: the hop count of a link is not fixed, and the more hops the deployed link has, the lower the reliability of the mapped link. The link reliability requirement can therefore be met by controlling the link length $u_k$, which gives:

where $[\cdot]^{+}$ denotes rounding up and $r_l$ is the average reliability of the base-layer links:

where $L$ is the number of base-layer links. The final deployed link length should be smaller than $u_k$. With reference to the base-layer network topology, the following must also be satisfied:

a. to ensure that the service path connects the source node and the destination node, the deployment path is no shorter than the shortest path between the sending and receiving nodes;

b. whether VNFs are reused or newly deployed, all VNFs must be deployable on the path.

Further, $u_k$ is derived from reliability alone and does not take the base-layer network topology into account, so in actual deployment conditions a and b may not be satisfiable, in which case the request can only be rejected even though resources are sufficient. The main cause is insufficient path length. To avoid this, the maximum hop count $u_k$ is extended appropriately; if the link reliability of the deployment result then proves insufficient, it can be restored through backup. This gives:

where SF is the extension factor. After extension, deploying with $SF \cdot u_k$ greatly increases the likelihood that node deployment succeeds.
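The hop-limit calculation of step S3 can be sketched as follows. The expression for $u_k$ is not reproduced in the text, so this sketch assumes that a path of u hops has reliability $r_l^{u}$, that $r_l$ is the arithmetic mean over the base-layer links, and that the extended limit is $SF \cdot u_k$ rounded up. The function names are hypothetical.

```python
import math


def average_link_reliability(link_reliabilities):
    """r_l: arithmetic mean over the L base-layer links (assumed form)."""
    return sum(link_reliabilities) / len(link_reliabilities)


def max_hops(link_requirement, r_l):
    """u_k under the assumption that a u-hop path has reliability r_l ** u.

    Requires 0 < r_l < 1 and 0 < link_requirement < 1. The result is rounded
    up, as the text states for its own formula, so a deployed path should be
    strictly shorter than u_k.
    """
    return math.ceil(math.log(link_requirement) / math.log(r_l))


def extended_max_hops(u_k, sf):
    """u'_k: the hop limit after extension by the factor SF (rounded up)."""
    return math.ceil(sf * u_k)


r_l = average_link_reliability([0.985, 0.99, 0.995])
u_k = max_hops(link_requirement=0.95, r_l=r_l)
u_ext = extended_max_hops(u_k, sf=1.5)
print(u_k, u_ext)
```

Because of the rounding up, a path of exactly $u_k$ hops can fall just short of the requirement under this model, which is consistent with the text's condition that the deployed length be smaller than $u_k$.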

Further, in step S4, the mapping scheme is realized through deep reinforcement learning based on the VNF reliability requirements and the deployable link length. Deep reinforcement learning applies the strong perception capability of deep learning to the decision process of reinforcement learning: the agent interacts with the environment by continual trial and error and uses a neural network to perceive the unknown conditions of the environment, thereby obtaining an accurate estimate of the reward, and finds the best policy by maximizing the cumulative reward. The method requires three elements to be defined, namely the state set, the action set and the reward, as follows:

State space: $s_t = \{B_{left}, C_{left}, b_{map}, c_{map}\}$, where $B_{left}$ and $C_{left}$ are the sets of remaining node resources and remaining link resources respectively; their values reflect the topology of the base layer. In practice, faults caused by random factors in the environment change the network topology; the resources of failed nodes and links are set to 0, and once repair is complete, mapping on that node or link can resume. In this way the real-time interaction of reinforcement learning with the environment is used to perceive and respond to topology changes. $b_{map}$ and $c_{map}$ are the mapping-state sets of the nodes and links of the SFC that have already been mapped, representing the virtual network state that corresponds to the current base-layer state; the node state comprises the deployment result and the reusability of each mapped node, and the link state is the set of deployed links;

Action space: $a_t = \{a_n, a_l, a_\kappa\}$, where $a_n$ is the node mapping action, $a_l$ is the link mapping action, and $a_\kappa$ is the resource allocation action, which allocates resources according to demand. When a node mapping action is executed, the VNFs already hosted on the selected base-layer node are checked. If the physical node already hosts the required VNF, that VNF is reused directly, no node resource allocation action is executed, and each $\omega_I$ is recalculated; otherwise the VNF is instantiated, consuming computing resources;

The action set is filtered, which increases the probability of selecting valid actions. Since the goal of training is to represent correctly the policy that obtains the maximum cumulative reward, the neural network only needs to represent deployment policies that meet the requirements; learning invalid state-action pairs is unimportant, so filtering causes no performance loss and effectively shortens learning time. Separate sets are kept for the nodes and for the links that meet the requirements.

Reward:

The reward ensures that the VNF mapping action minimizes wasted reliability while guaranteeing the reliability requirement, and that the link segment mapping minimizes resource consumption while taking load balance into account. If an invalid action is selected, for example one whose reliability requirement is not satisfied, the action reward is set to -100.

By executing actions repeatedly, deep reinforcement learning obtains the corresponding reward and the next state, and collects many vectors $[s_t, a_t, r_t, s_{t+1}]$ into an experience replay pool. Vectors are then drawn at random from the pool to reduce the correlation of the training data, and the convolutional neural network is trained on them. Through its perception capability, the neural network maps each state-action pair to an estimate of its future value (the state-action value function); once training is sufficient and the fit is complete, the optimal policy can be obtained.
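The experience replay pool and the selection among filtered actions can be sketched as below. This is a generic illustration of the two mechanisms, not the patent's implementation: the class `ReplayPool`, the function `epsilon_greedy`, and the representation of Q estimates as a plain dictionary are all assumptions; only the -100 penalty comes from the text.

```python
import random
from collections import deque

INVALID_ACTION_REWARD = -100  # penalty stated in the text for invalid actions


class ReplayPool:
    """Fixed-capacity pool of (s_t, a_t, r_t, s_next) transitions."""

    def __init__(self, capacity):
        # With maxlen set, the oldest transition is dropped when the pool is full.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform sampling without replacement reduces temporal correlation.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def epsilon_greedy(q_values, valid_actions, epsilon, rng=random):
    """Pick a random valid action with probability epsilon, otherwise the
    valid action with the largest Q estimate. q_values maps action -> Q."""
    if rng.random() < epsilon:
        return rng.choice(valid_actions)
    return max(valid_actions, key=lambda a: q_values[a])


pool = ReplayPool(capacity=3)
for t in range(5):
    pool.push(t, "map_to_n1", 1.0, t + 1)
print(len(pool), pool.sample(2))
```

Restricting the choice to `valid_actions` mirrors the action filtering described above: invalid state-action pairs are never proposed, so the network spends no training effort on them.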

Further, in step S5, to handle insufficient reliability during deployment, a node backup scheme based on virtual network function importance and a link backup scheme based on link backup importance are adopted. Specifically:

During action selection, when no node satisfying the reliability requirement can be selected, the node backup scheme is used. The scheme chooses the backup mode according to the VNF importance obtained at deployment. More important VNFs, that is, those above the average importance, receive dedicated backup: a second instance of the deployed VNF is instantiated on a nearby physical node as a backup, and if the primary node fails, the VNF on the backup node is used directly. This copes effectively with multi-node failures while guaranteeing reliability. Relatively unimportant VNFs receive shared backup, in which several VNFs share the resources of the same backup node, which effectively reduces resource consumption while guaranteeing reliability;

When the deployment result does not satisfy the virtual link reliability requirement, the link backup scheme is executed. The scheme aims to reduce resource consumption while increasing the reliability gain:

where BW is the resource consumption of the link backup.

At the same time, virtual links with lower reliability are backed up first as far as possible, which gives:

Ordering by the resulting link backup importance selects backups with a higher reliability gain and as little resource consumption as possible, and reduces the failure probability of low-reliability links. Backups are carried out iteratively in this order.
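The iterative link backup can be sketched under explicit assumptions, since the importance expression itself is not reproduced in the text. Here the link backup importance is assumed to be the reliability gain per unit of backup bandwidth, a primary path and its backup are assumed to fail independently, and the chain's link reliability is assumed to be the product over its virtual links. All names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VirtualLink:
    name: str
    reliability: float         # reliability of the deployed primary path
    backup_reliability: float  # reliability of a candidate backup path
    backup_bw: float           # BW: bandwidth the backup would consume


def backed_up_reliability(link):
    # Primary and backup are assumed to fail independently (parallel system).
    return 1.0 - (1.0 - link.reliability) * (1.0 - link.backup_reliability)


def backup_importance(link):
    # Assumed score: reliability gain per unit of backup bandwidth.
    return (backed_up_reliability(link) - link.reliability) / link.backup_bw


def chain_reliability(values):
    product = 1.0
    for value in values:
        product *= value
    return product


def iterative_link_backup(links, requirement):
    """Back up links in descending importance until the requirement is met.

    Returns (names of backed-up links, resulting chain link reliability).
    """
    current = {link.name: link.reliability for link in links}
    chosen = []
    for link in sorted(links, key=backup_importance, reverse=True):
        if chain_reliability(current.values()) >= requirement:
            break
        current[link.name] = backed_up_reliability(link)
        chosen.append(link.name)
    return chosen, chain_reliability(current.values())


links = [
    VirtualLink("a", 0.90, 0.90, 10.0),
    VirtualLink("b", 0.99, 0.99, 10.0),
    VirtualLink("c", 0.95, 0.90, 10.0),
]
chosen, result = iterative_link_backup(links, requirement=0.95)
print(chosen, round(result, 4))
```

Under this score, low-reliability links naturally rank first because a backup raises their reliability the most, which matches the preference stated in the text.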

The beneficial effects of the invention are as follows: while guaranteeing the reliability requirements, the proposed reliable deployment method copes effectively with base-layer faults, reduces the number of failed SFCs, and maintains load balance, making the whole virtual network more stable and reliable.

Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.

Drawings

For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:

FIG. 1 is a block diagram of an application environment in the present invention;

FIG. 2 is a schematic diagram of function multiplexing;

FIG. 3 is a schematic diagram of a service function chain mapping scheme based on deep reinforcement learning according to the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described in detail and clearly with reference to the accompanying drawings.

FIG. 1 is a framework diagram of the application environment of the invention. It shows a basic model for providing end-to-end services in a network virtualization environment: a network virtualization architecture built on an NFV-based orchestration and control framework. An end-to-end service function chain starts at a terminal device and traverses the access network and the core network, along which VNFs are deployed in sequence to fulfil the service requirement. The virtual sub-network of service function chains is built and managed by the service provider (SP). The infrastructure provider (InP) is responsible for building, operating and maintaining the physical infrastructure, and provides reliable resources through standardized interfaces in response to the SP's resource requests. When a user makes a request, the SP generates an end-to-end service function chain composed of several virtual network functions, such as SFC1 in slice 1 and SFC2 in slice 2; this requires determining the functions needed, their number, and the order among them.

FIG. 2 is a schematic diagram of function multiplexing. Function multiplexing means that if, during mapping, the VNF to be deployed is found to be already hosted on a server, it can be shared with the SFC already deployed there, and doing so consumes no additional node computing resources. The number of multiplexed VNFs in an SFC is called the multiplexing degree; raising it during mapping reduces node resource consumption. However, already deployed functions are fixed in location, so reusing them may consume more link resources than instantiating a new VNF. For example, in FIG. 2 the virtual network function $v_I$ of a service request needs to be deployed, and two deployment schemes are available. The first deploys it on $n_i$, which requires two physical links and also consumes node resources to instantiate $v_I$. The second deploys it on $n_j$, which consumes four physical links, but server $n_j$ already hosts $v_I$, so the function can be reused directly and no further node resources are consumed. The two schemes thus favour optimizing link deployment and node deployment respectively. To jointly optimize function deployment and bandwidth demand, a balance point must therefore be found between function multiplexing and path length.
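The trade-off in the FIG. 2 example can be made concrete with a toy cost model. The weighted-sum cost, the bandwidth and CPU figures, and the function name `deployment_cost` are illustrative assumptions and are not part of the described method; only the hop counts (two links with instantiation, four links with reuse) come from the example.

```python
def deployment_cost(hops, bandwidth, cpu_demand, reused,
                    link_weight=1.0, node_weight=1.0):
    """Weighted sum of link and node resource consumption for one VNF."""
    link_cost = hops * bandwidth                 # bandwidth is reserved on every hop
    node_cost = 0.0 if reused else cpu_demand    # a reused VNF needs no new instance
    return link_weight * link_cost + node_weight * node_cost


# Scheme 1: new instance on n_i over 2 links; scheme 2: reuse on n_j over 4 links.
new_instance = deployment_cost(hops=2, bandwidth=10.0, cpu_demand=30.0, reused=False)
reuse = deployment_cost(hops=4, bandwidth=10.0, cpu_demand=30.0, reused=True)
print(new_instance, reuse)
```

With these numbers reuse is cheaper, but a lighter VNF or scarcer bandwidth would reverse the outcome, which is why the hop limit and the multiplexing degree are optimized together.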

FIG. 3 is a schematic diagram of the deep reinforcement learning framework of the invention. When a virtual network request arrives, the global service orchestrator (GS-O) first starts exploring the base-layer network environment and the virtual-layer environment; this exploration process is the search loop of FIG. 3. The current state is obtained by observing the base-layer and virtual-layer environments, and the action to execute is selected with an ε-greedy strategy, either at random from the actions available in the base-layer environment or by the value neural network.

During random action selection, the VNF's reliability requirement may be so high that no physical node within the mapping range satisfies it; if resources are sufficient and enough nodes are available, the requirement can instead be met through backup. In that case the importance of the VNF is determined and the node backup scheme is executed to obtain a group of actions. Likewise, when the action with the maximum Q value selected by the neural network cannot satisfy the reliability requirement, the same backup operation is performed by adding the action with the next-largest Q value.

According to the selection result, the deployment action $a_t$ for the $t$-th VNF is executed, the corresponding deployment reward $r_t$ is obtained, and the base-layer resource state changes to $s_{t+1}$, giving the vector $[s_t, a_t, r_t, s_{t+1}]$, which is stored in the experience pool. Exploration then continues from $s_{t+1}$ with the deployment of the $(t+1)$-th VNF, until all VNFs of the chain are deployed; this completes one deployment cycle. The next deployment cycle then starts exploring from the beginning. Vectors are collected iteratively throughout this process.

After enough experience vectors have been stored in the experience pool, a batch of vectors is drawn from it at random. For each vector, $s_t$ and $a_t$ are fed to the main neural network to obtain the estimated Q value $Q(s_t, a_t \mid \theta)$ of the action, where $\theta$ denotes the parameters of the main neural network. $r_t$ and $s_{t+1}$ are used to obtain the target Q value:

The difference between the main-network Q value and the target Q value is then used as the loss function:

$\theta$ is updated by minimizing the loss function with stochastic gradient descent (SGD). The target network is not trained separately; it is derived from the training results of the main network: to reduce time complexity, the parameters $\theta$ of the main network are copied to the target network after every $T$ training updates of the main network. The learning loop of FIG. 3 is executed repeatedly. After a batch of vectors has been trained on, exploration of the environment continues, the vectors in the experience pool are updated, and the iteration continues until the loss function converges to a sufficiently small range, at which point the Q-estimation neural network has essentially converged and the deployment loop of FIG. 3 can be carried out.
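For reference, the target value and loss of a standard deep Q-network with a separate target network are commonly written as below. This is the textbook form, given only as an illustration of the two quantities named above; the discount factor $\gamma$ is not named in the text and is an assumption here, and the patent's own expressions may differ in detail.

```latex
y_t = r_t + \gamma \max_{a'} Q\left(s_{t+1}, a' \mid \theta^{-}\right)

L(\theta) = \left( y_t - Q\left(s_t, a_t \mid \theta\right) \right)^{2}
```

Here $\theta^{-}$ denotes the target-network parameters, which stay fixed between the periodic copies from the main network, so the regression target $y_t$ does not shift with every gradient step.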

If, in the deployment loop, the link reliability requirement is found not to be satisfied after SFC deployment is finished, the link backup module is executed: the virtual links are sorted by link backup importance and backed up iteratively from first to last until the reliability requirement is met. The final deployment scheme is then obtained.

The whole method is summarized and obtained as follows:

step 1: initializing an experience playback pool, randomly initializing parameters of a main network theta, and copying the parameters to a target network theta^-And obtaining a reliability pre-distribution result w of the virtual network function based on a virtual network function reliability determination model and a virtual link reliable mapping model based on function multiplexing_IAnd deployable length u'_k。episode＝0

Step 2: initialization state S₁＝{B_left,C_left,b_map,c_mapAnd t is 0

And step 3: determining a set of selectable actions

Randomly generating a number τ of 0 to 1 if τ<E, go to step 4, otherwise go to step 6.

And 4, step 4: if it is not

Non-null where action a is randomly selected_tOtherwise, go to step 5.

And 5: if it is not

Not null, when ω is_IExecuting shared backup when the ratio is less than 1/D, newly building nodes near the deployed backup nodes, if no deployed backup node exists or omega_IAnd executing special backup if the reliability is higher than 1/D, and selecting the node with the highest reliability as a deployment node and the other node as a backup node until the reliability requirement of the VNF is met.

Step 6: estimation based on a main neural network

And selecting an action.

And 7: performing action a_tTo obtain a reward r_tAnd the next state s_t+1Will vector(s)_t,a_t,r_t,s_t+1) And putting the experience playback pool. And if the experience pool is full, the step 8 is carried out. If not, step 2 is entered if all VNFs have been deployed. Otherwise, if t is t +1<And D, entering the step 3.

And 8: and randomly taking out a small batch sample vector set from the experience playback pool.

And step 9: taking a vector, inputting the vector into the main network, and obtaining a loss function by the target network

Step 10: and updating the parameters through random gradient descent, and copying the main network parameters to the target network after updating for T times of theta. If the vector is not trained, go to step 9, otherwise, the epsilon is equal to epsilon +1, if the epsilon is equal to E, go to step 11, otherwise, go to step 2

Step 11: obtaining the optimal strategy of each VNF through the main network, if the strategy is Q(s)_t,a_t) If < -100 then the request is denied.

Step 12: if the reliability requirements of the virtual links are not satisfied, calculating the reliability requirements of each virtual link

The link backup is iterated from high to low until the virtual link reliability requirements are met.
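The core of steps 3 to 9 can be illustrated with a minimal sketch. It is not the patented implementation: the loss function is not reproduced in this text, so the standard DQN target r + γ·max Q is assumed, and the names `select_action`, `ReplayPool`, `td_target`, and the discount factor `gamma` are hypothetical. Restricting the greedy choice in step 6 to the screened action set is also an assumption.

```python
import random
from collections import deque


def select_action(q_values, valid_actions, epsilon, rng=random):
    """Steps 3-6: explore among the screened actions with probability epsilon,
    otherwise pick the screened action with the highest estimated Q-value."""
    if not valid_actions:
        return None  # caller would fall back to the backup procedure of step 5
    if rng.random() < epsilon:
        return rng.choice(list(valid_actions))
    return max(valid_actions, key=lambda a: q_values[a])


class ReplayPool:
    """Steps 1, 7 and 8: fixed-capacity experience replay pool."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def is_full(self):
        return len(self.buffer) == self.buffer.maxlen

    def sample(self, batch_size, rng=random):
        return rng.sample(list(self.buffer), batch_size)


def td_target(reward, next_q_values, gamma, terminal=False):
    """Step 9 (assumed form): target value computed from the target network's
    Q-values for the next state."""
    if terminal or not next_q_values:
        return reward
    return reward + gamma * max(next_q_values)
```

In a full training loop the squared difference between `td_target(...)` and the main network's Q(s_t, a_t) would be minimized by stochastic gradient descent, with the target network refreshed every T updates as in step 10.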

Finally, the above embodiments are intended only to illustrate the technical solutions of the present invention and not to limit it. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art will understand that modifications or equivalent substitutions may be made to the technical solutions without departing from their spirit and scope, and all such modifications and substitutions shall be covered by the claims of the present invention.

Claims

1. A reliable service function chain deployment method based on deep reinforcement learning, characterized in that the method comprises the following steps:

the reliability is determined based on the degree of equipment use and the surrounding safety factor; the degree of equipment use estimates, through a Weibull distribution, the wear that operation causes to the equipment, and the surrounding safety factor is judged from the number of failures of the equipment itself and its distance to failed equipment; the reliability is determined specifically as follows:

represents, over time

the probability of failure caused by equipment wear, where η and m are the Weibull scale and shape parameters, which can be estimated by statistical analysis of the log files and system configuration data of the local server;

S12: the surrounding safety factor of a physical node is defined as follows:

where q is the number of failures of physical node n_i within a period of time, and pos[n_i] is the hop distance from n_i to the most recently failed physical node;

similarly, the physical link reliability coefficient is:

where pos[l_{i,j}] is the number of physical nodes separating the most recently failed link from link l_{i,j}; the final reliability is the product of the two coefficients, namely RN(n_i) = UR(n_i)·ER(n_i) and RL(l_{i,j}) = UR(l_{i,j})·ER(l_{i,j});
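As a hedged illustration of how the two coefficients combine, the sketch below uses the standard two-parameter Weibull cumulative distribution for wear-induced failure. The exact expressions for UR and ER are not reproduced in this text, so two assumptions are made: UR is taken as the survival probability 1 - F(t), and ER is supplied as an already computed value. All function names are hypothetical.

```python
import math


def weibull_failure_probability(t, eta, m):
    """Standard Weibull CDF: probability that a failure has occurred by time t,
    with scale parameter eta and shape parameter m."""
    if t < 0 or eta <= 0 or m <= 0:
        raise ValueError("t must be >= 0 and eta, m must be > 0")
    return 1.0 - math.exp(-((t / eta) ** m))


def usage_reliability(t, eta, m):
    """Assumed form of UR: probability that wear has NOT caused a failure by t."""
    return 1.0 - weibull_failure_probability(t, eta, m)


def final_reliability(ur, er):
    """RN = UR * ER (or RL = UR * ER for a link), as stated in the claim."""
    return ur * er
```

For example, `final_reliability(usage_reliability(1000.0, 5000.0, 1.5), 0.98)` would give the reliability of a node that has run for 1000 time units, under the hypothetical parameter values η = 5000 and m = 1.5.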

the VNF reliability requirement

and the virtual link reliability requirement

are both defined as

the importance score of each virtual network function is determined from the topology information and function information of the service function chain, namely from the VNF sharing degree, the recovery resource cost, the VNF functional importance, and the VNF state;

the reliability requirement of the virtual network function is determined as follows:

the importance of each VNF upon failure is determined from its topological and functional characteristics, and a reliability requirement matching that importance is then obtained; the following reference characteristics are used to determine the reliability requirement:

2) Recovery cost: each VNF consumes resources, and because the amount of resources required

differs, the difficulty of recovery after a failure differs; the higher the resource demand, the fewer physical nodes can satisfy it and the lower the probability of recovery, so more reliability needs to be provided to the VNF;

3) VNF functional importance: for the infrastructure provider (InP), VNF functions differ in importance; because this attribute is difficult to describe with objective data, an importance level is assigned to each VNF based on human experience, and relatively important functions are given higher reliability;

because the attribute types differ, the attribute values are normalized:

is ψ/4;

is:

for the VNF state, a VNF that includes an irrelevant state has

equal to 1, and a VNF with no irrelevant state has 0.5;

from the normalized attribute values, finally

at this point, the reliability requirement of

is obtained as

and physical nodes meeting this VNF requirement are searched for during deployment;

in order to jointly optimize function deployment and bandwidth requirements, an optimal balance must be found between function multiplexing and path length; a reliable link mapping model based on function multiplexing is adopted, and the deployment problem of the service function chain is decomposed into two subproblems: 1) setting the maximum number of transmission hops u'_k; 2) finding, within the range u_k, the valid path with the highest degree of multiplexing;

for problem 1), function multiplexing is considered jointly with the link reliability requirement; by controlling the link length u_k so as to satisfy the link reliability requirement, one obtains:

where [·]⁺ denotes rounding up, and r_l is the average reliability of the base layer links:

b. whether by multiplexing or by new deployment, all VNFs must be guaranteed to be deployed on the path;

u_k is derived from reliability alone and does not take the base layer network topology into account, so in actual deployment there are cases in which conditions a and b cannot be met and the request can only be rejected even though resources are sufficient; the main cause is insufficient path length. To avoid this, the maximum hop count u_k is appropriately extended, and if the link reliability of the deployment result is then insufficient, it can be satisfied through backup, so that:

where SF is the extension factor; after extension, deploying within SF·u_k hops greatly increases the probability of successful node deployment;
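The trade-off between extending the hop limit and meeting the link reliability requirement can be illustrated with a small sketch. The closed-form expression for u_k is not reproduced in this text and is not reconstructed here. The sketch assumes that a path of h hops over links of average reliability r_l has reliability r_l^h (independent links in series) and that the extended limit SF·u_k is rounded up to an integer; both are assumptions, and all function names are hypothetical.

```python
import math


def path_reliability(avg_link_reliability, hops):
    """Reliability of a path of `hops` independent links in series (assumption)."""
    return avg_link_reliability ** hops


def extended_hop_limit(u_k, sf):
    """Extended maximum hop count SF * u_k, rounded up to an integer (assumption)."""
    if sf < 1:
        raise ValueError("the extension factor is expected to be >= 1")
    return math.ceil(sf * u_k)


def needs_link_backup(avg_link_reliability, hops, requirement):
    """True when a path of this length falls short of the link reliability requirement."""
    return path_reliability(avg_link_reliability, hops) < requirement


u_ext = extended_hop_limit(5, 1.5)                      # extended limit for u_k = 5
short_path_ok = not needs_link_backup(0.99, 5, 0.95)    # path within the original limit
long_path_backup = needs_link_backup(0.99, u_ext, 0.95) # path using the full extension
```

With these hypothetical numbers, a path within the original limit meets the requirement, while a path that uses the full extended limit does not and would rely on link backup.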

2. The deep reinforcement learning-based reliable service function chain deployment method according to claim 1, characterized in that: in step S4, the mapping scheme is obtained by deep reinforcement learning based on the VNF reliability and the deployable link length; the method needs to determine three elements, namely the state set, the action set, and the reward, which are defined as follows:

action space: a_t = {a_n, a_l, a_κ}, where a_n is the node mapping action, a_l is the link mapping action, and a_κ is the resource allocation action, which allocates resources according to demand; when a node mapping action is performed, the VNFs already carried by the candidate base layer node are checked: if a physical node carrying the VNF is found, the VNF is multiplexed directly, the node resource allocation action is not executed, and each ω_I is recalculated; otherwise, the VNF is instantiated and consumes computing resources;
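The multiplex-or-instantiate decision described for the node mapping action can be sketched as follows. The data layout (a dictionary with a `hosted` set and a `cpu_left` counter), the function name `map_vnf`, and the `"rejected"` outcome for insufficient resources are assumptions introduced for illustration only; the recalculation of ω_I after multiplexing is omitted.

```python
def map_vnf(node, vnf_type, cpu_demand):
    """Map one VNF onto a base layer node.

    Returns "reused" if the node already hosts this VNF type (no resources
    are allocated), "instantiated" if a new instance is created and computing
    resources are consumed, or "rejected" if resources are insufficient.
    """
    if vnf_type in node["hosted"]:
        return "reused"
    if node["cpu_left"] < cpu_demand:
        return "rejected"
    node["cpu_left"] -= cpu_demand
    node["hosted"].add(vnf_type)
    return "instantiated"


node = {"hosted": {"firewall"}, "cpu_left": 4}
first = map_vnf(node, "firewall", 3)   # already hosted on the node
second = map_vnf(node, "nat", 3)       # new instance, consumes 3 units
third = map_vnf(node, "dpi", 3)        # only 1 unit left
```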

screening the action set also increases the probability of selecting valid actions; since the training result should correctly reflect the policy that obtains the maximum cumulative reward, the neural network only needs to represent deployment policies that meet the requirements, and learning invalid state-action pairs is unimportant, so screening causes no performance loss and can effectively reduce learning time; the node and link sets satisfying the requirements are denoted respectively as

and

reward:

the reward ensures that the VNF mapping action minimizes wasted reliability while guaranteeing the reliability requirement, and that the link segment mapping minimizes resource consumption while taking load balance into account;

3. The deep reinforcement learning-based reliable service function chain deployment method according to claim 1, characterized in that: in step S5, in response to the situation of insufficient reliability during the deployment process, a node backup scheme based on the importance of the virtual network function and a link backup scheme based on the importance of the link backup are adopted, which specifically include:

in the action selection process, when no node satisfying the reliability requirement can be selected, the node backup scheme is used; the scheme determines the backup mode from the VNF importance obtained at deployment: for the more important VNFs, namely those above the average importance, dedicated backup is performed, that is, a further instance of the deployed VNF is instantiated on a nearby physical node as a backup, and if the source node fails, the VNF on the backup node is used directly; for relatively unimportant VNFs, shared backup can be performed, that is, several backups share the resources of the same backup node, which effectively reduces resource consumption while guaranteeing reliability;
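The choice between dedicated and shared backup can be sketched as a comparison against the average importance. The function name and arguments are hypothetical, and two points are assumptions: a VNF whose importance equals the average is treated as important, and dedicated backup is used whenever no backup node has been deployed yet (as in step 5 of the embodiment).

```python
def choose_backup_mode(importance, all_importances, has_deployed_backup):
    """Return "dedicated" or "shared" for a VNF that cannot meet its
    reliability requirement on any single node."""
    average = sum(all_importances) / len(all_importances)
    if importance >= average or not has_deployed_backup:
        return "dedicated"
    return "shared"


scores = [0.4, 0.3, 0.2, 0.1]  # hypothetical importance scores, average 0.25
mode_important = choose_backup_mode(0.4, scores, has_deployed_backup=True)
mode_minor = choose_backup_mode(0.1, scores, has_deployed_backup=True)
mode_first_backup = choose_backup_mode(0.1, scores, has_deployed_backup=False)
```

If the importance scores are normalized to sum to 1 over the D VNFs of a chain, the average equals 1/D, which matches the threshold used in the embodiment.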

where BW is the link backup resource consumption;

the links are sorted according to

and backed up iteratively in that order.
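The iterative link backup can be illustrated with the sketch below. The link backup importance formula is not reproduced in this text, so each link carries a precomputed `priority` value (hypothetical). Two further assumptions are made: the chain reliability is the product of the link reliabilities, and a backed-up link behaves as two independent parallel paths, giving 1 - (1 - r)(1 - r_b).

```python
import math


def iterate_link_backup(links, requirement):
    """Back up links in descending priority until the chain reliability
    meets `requirement`. Returns (names of backed-up links, final reliability)."""
    current = {link["name"]: link["reliability"] for link in links}
    backed_up = []
    for link in sorted(links, key=lambda l: l["priority"], reverse=True):
        if math.prod(current.values()) >= requirement:
            break
        r, rb = link["reliability"], link["backup_reliability"]
        current[link["name"]] = 1.0 - (1.0 - r) * (1.0 - rb)
        backed_up.append(link["name"])
    return backed_up, math.prod(current.values())


links = [
    {"name": "a", "reliability": 0.90, "backup_reliability": 0.90, "priority": 2.0},
    {"name": "b", "reliability": 0.95, "backup_reliability": 0.95, "priority": 1.0},
]
backed_up, final = iterate_link_backup(links, requirement=0.93)
```

In this hypothetical case only link `a` is backed up, because one backup is enough to lift the chain reliability above the requirement. If the requirement cannot be met even after every link is backed up, the function returns the best reliability achieved and leaves the decision to the caller.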