CN110996334B - Virtualized wireless network function arrangement strategy

Info

Publication number: CN110996334B
Application number: CN201911247877.XA
Authority: CN
Inventors: 朱贤友; 邹赛; 李浪
Original assignee: Chongqing College of Electronic Engineering; Hengyang Normal University
Current assignee: Chongqing College of Electronic Engineering; Hengyang Normal University
Priority date: 2019-12-09
Filing date: 2019-12-09
Publication date: 2022-10-11
Anticipated expiration: 2039-12-09
Also published as: CN110996334A

Abstract

The invention provides a virtualized wireless network function arrangement strategy that helps reduce the rejection rate of Internet of Things access services and improve the utilization of network system resources. The strategy comprises the following steps. S1: establish a chemical reaction optimization (CRO) mathematical model for arranging virtualized wireless network resources. S2: solve the mathematical model established in S1, which comprises improving the local optimization capability of the CRO based on Gaussian perturbation, balancing the global and local search capabilities based on a random walk method, and improving the ability and speed of the CRO in searching for the global approximate optimal solution based on reinforcement learning. The beneficial effects of the invention are that it helps reduce the rejection rate of Internet of Things access services, improve the utilization of network system resources, speed up the search for the global approximate optimal solution, improve the quality of the approximate optimal solution, and ultimately accelerate the automation and intelligentization of the virtual network.

Description

Virtualized wireless network function arrangement strategy

Technical Field

The invention belongs to the field of mobile communication, and particularly relates to a resource arranging method for a virtualized network slice of a wireless mobile communication network.

Background

With the development of network technology, communication networks no longer serve only person-to-person communication but extend to person-to-object and object-to-object communication. However, the network performance requirements of these communication modes differ greatly. Various businesses want a dedicated vertical network to serve them: for example, autonomous vehicle networking needs real-time, highly reliable service, while the monitoring Internet of Things needs low-bandwidth, ultra-massive connections. As ever-changing applications emerge, the demand for interconnecting everything grows and both access modes and the positioning of network functions change greatly, so the chimney-type (siloed) wireless mobile access network architecture can no longer fully meet the needs of service development. With chimney-type wireless access technology it is difficult to support services efficiently through a unified air interface and network control protocol, and difficult to deploy new service types rapidly. Diversified network nodes and networking forms not only make user experience inconsistent but also place a heavy burden on network operation and maintenance. In the future, a wireless network must support, on a unified common platform, application scenarios such as eMBB, mMTC and URLLC as well as combinations of their requirements. However, the demands of different applications or services on network metrics vary greatly. To meet the differing metric requirements of each service, a future virtualized wireless network management platform needs flexible management capability and the ability to scale out and in rapidly.
Meanwhile, the future wireless network will serve not only individuals but also vertical industries (such as public safety, smart factories, smart healthcare and V2X), whose business models differ markedly. This differentiation requires decoupling the software and hardware of the wireless network, virtualizing and softwarizing network functions, making network functions programmable and customizable, and providing different network services to users in different industries on a unified architecture. Virtual network function arrangement mainly selects and orders the virtual network functions and arranges the resources (computing, communication and storage resources) that each network function requires. Resource arrangement is therefore an important part of virtual network function arrangement and one of the key technologies that determine the success of a network arrangement system.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provide a virtualized wireless network function arrangement strategy which is beneficial to reducing the rejection rate of the access service of the Internet of things and improving the utilization rate of network system resources.

The invention is realized by the following steps:

S1: a chemical reaction optimization (CRO) mathematical model for arranging virtualized wireless network resources is established using the following formula,

where n is the number of functions in the resource pool, m is the number of features in the resource pool, and $\mu_{j,k}$ is the cost required to complete the j-th network function $f_j$ of the virtual request with the k-th feature $a_k$;

S2: the mathematical model established in S1 is solved, wherein the solving comprises improving the local optimization capability of the CRO based on Gaussian perturbation, balancing the global and local search capabilities based on a random walk method, and improving the ability and speed of the CRO in searching for the global approximate optimal solution based on reinforcement learning.

Further, the step S1 includes the steps of,

S101: modeling the virtual feature cost of the virtual function, including:

the amount of resources required by the j-th network function $f_j$ of a virtual request with the k-th feature $a_k$ is represented by the following equation:

$\eta_d = \sigma_b \times \eta_s + \sigma_p \times \eta_p + \sigma_{it} \times \eta_{it}$

where $\delta_s$, $\delta_p$ and $\delta_{it}$ are the coefficients of the resources combined into functional module $x_{j',k'}$; $\eta_s$, $\eta_p$ and $\eta_{it}$ are the unit prices of the corresponding resources; $\eta_d$ is the combined cost of the resources; it denotes service resources, s denotes bandwidth resources and p denotes power-domain resources; $Mcost_{j,k}$ is the cost paid when multiple virtual function modules $f_i$ having attribute $a_j$ use the same resource together; and $\sigma_b$, $\sigma_p$ and $\sigma_{it}$ are weight coefficients subject to $0 \le \sigma_b, \sigma_p, \sigma_{it} \le 1$ and $\sigma_b + \sigma_p + \sigma_{it} = 1$,

S102: the functions selected from the virtualized network function set, and the quantity and features of the resources required by each function, satisfy the following constraints:

$R_{j,k}.it \le N \times x_{j,k}.it$,

$R_{j,k}.p \le N \times x_{j,k}.p$,

$R_{j,k}.s \le N \times x_{j,k}.s$,

where R denotes a virtual request and x denotes a selected module;

the virtual service orchestration is represented by the following constraints:

where s', p' and it' denote the resources that are already in use, and all denotes all resources;

S103: the amount of resources required by the j-th network function $f_j$ of a virtual request with the k-th feature $a_k$ is represented by the following equation:

the following mathematical model is established:

where $f_i \rightarrow f_{i+y}$ indicates that virtual function module $f_i$ and virtual function module $f_{i+y}$ have a dependency relationship, and $f_i \ne f_{i+y}$ indicates that virtual function module $f_i$ and virtual function module $f_{i+y}$ have an exclusive relationship,

S104: virtual function module $f_i$ is added in the solving process, and its cost is expressed by the following formula:

$\mu_{j,k}' = \mu_{j,k} + \mu_{j+y,k}$,

and the chemical reaction optimization mathematical model for virtualized wireless network resource arrangement is expressed by the following formula:

Further, the step S2 comprises the following steps,

S201: let $\omega(i)$ be the structure of the i-th molecule, and let KE, a measure of the state of the molecule, represent the ability of a molecule to escape from its current state to a worse molecular structure; the initial value of KE is 0; buff is the buffer energy, which is generated by ineffective molecular collisions and is managed by a global function, and its initial value is 0;

S202: let $\omega(i)'$ be the structure of the molecule after the collision (the prime mark applies to all objects), let $\omega(i).best$ be the structure of the i-th molecule with the lowest potential energy so far, and let $\omega.gbest$ be the molecular structure with the lowest global potential energy so far; first, a Gaussian perturbation is applied to the lowest-potential-energy structure of the current molecule i, and then a random walk model walks between this perturbed structure and the molecular structure with the lowest global potential energy, which yields the structure of the molecule after the collision:

where the formula contains a Gaussian perturbation term and rand is a random number;

the condition under which a molecule undergoes a wall-collision reaction is expressed by the following formula:

$PE_{\omega(i)} + KE_{\omega(i)} \ge PE_{\omega(i)'}$

the kinetic energy KE of the resulting molecule is expressed by the following formula:

$KE_{\omega(i)'} = (PE_{\omega(i)} + KE_{\omega(i)} - PE_{\omega(i)'}) \times q$,

where q is a loss coefficient and (1 - q) is the proportion of KE lost in the wall collision;

S203: let $\omega'_1$, $\omega'_2$ be the structures of the molecules produced by decomposition; a Gaussian perturbation is applied to $\omega$ and a random walk is then performed, using the following formula,

the condition under which a molecule undergoes a decomposition reaction is expressed by the following formula:

the kinetic energy KE of the resulting molecules is expressed by the following formula:

where temp is a temporary variable;

S204: two molecules $\omega_1$, $\omega_2$ exchange the values at randomly selected identical positions, and a random number is randomly added to each molecular structure; let $\omega'_1$, $\omega'_2$ be the structures of the molecules after the exchange, which are expressed by the following formula:

where one operator denotes replacing the corresponding values of $\omega_1$ with k bits taken from arbitrary positions of $\omega_2$, the other operator denotes replacing the corresponding values of $\omega_2$ with k bits taken from arbitrary positions of $\omega_1$, and $rand(\omega)$ is a randomly generated molecular structure.

The condition under which molecules undergo an exchange reaction is expressed by the following formula:

$temp2 = buff \times rand$,

where temp2 is a temporary variable;

the kinetic energy KE of the exchanged molecules is obtained by the following formula:

$buff = buff - temp2$,

S205: synthesis reaction: the values at the same positions of two molecules $\omega_1$, $\omega_2$ are added, modulo the highest value of that position. Let $\omega'$ be the structure of the synthesized molecule; $\omega'$ is expressed by the following formula:

$\omega' = \omega_1 + \omega_2$,

The condition under which molecules undergo a synthesis reaction is expressed by the following formulas:

$temp2 = buff \times rand$,

$PE_{\omega_1} + KE_{\omega_1} + PE_{\omega_2} + KE_{\omega_2} + temp2 \ge PE_{\omega'}$,

the kinetic energy KE of the resulting molecule is obtained using the following formulas,

$KE_{\omega'} = (PE_{\omega_1} + KE_{\omega_1} + PE_{\omega_2} + KE_{\omega_2} - PE_{\omega'}) \times q$,

$buff = buff - temp2$,

S206: in the Q-learning method, the set of states in which each molecule undergoes a chemical reaction is $S = \{S_1, \ldots, S_t, \ldots, S_T\}$, and $\pi$ is the action set of the Q-learning method, $\pi = \{a = a+1, a = a-1\}$ with $0 \le a \le T$; when a is 0 only the action a = a + 1 is performed, and when a is T only the action a = a - 1 is performed; the initial value of a is t, where t is the number of chemical reactions the molecule has undergone and T is the total number of iterations; the benefit of each step is $\gamma = |PE(\omega') - PE(\omega)|$, the cost of each step, l, is the amount by which buff increases when an ineffective collision or ineffective decomposition occurs, and the Q value is updated using the following equation:

where $\sigma$ is the learning rate and $\beta$ is the discount factor, and the formula also includes the benefit held in memory;

the value of q is adjusted by the following formula:

where $\lambda$ is the coefficient of the exponential distribution.

S207: each molecule in the population pop is analyzed to determine whether it meets the wall-collision reaction condition; if so, a wall-collision reaction occurs. After the wall-collision reaction, whether $PE_{\omega(i)} \ge PE_{\omega(i)'}$ holds is judged; if it holds, $\omega(i) = \omega(i)'$; otherwise the reaction is an ineffective wall collision, and the energy of the wall collision is converted into buffer energy according to the following formula,

$buff = buff + (PE_{\omega(i)} + KE_{\omega(i)} - PE_{\omega(i)'}) \times (1 - q)$;

when an ineffective wall collision occurs, the molecule continues to collide with the wall until $PE_{\omega(i)} < PE_{\omega(i)'}$;

each molecule in the population pop is analyzed to determine whether it meets the decomposition reaction condition; if so, a decomposition reaction occurs. After the decomposition reaction, whether

or

holds is judged. If it holds, $\omega(i) = \min(\omega(i)_1', \omega(i)_2')$ and a new molecule $\omega(pop+1) = \max(\omega(i)_1', \omega(i)_2')$ is added; otherwise the reaction is an ineffective decomposition, and the energy is converted into buffer energy according to the following formula:

$buff = buff + (PE_{\omega(i)} + KE_{\omega(i)} - PE_{\omega(i)_1'} - PE_{\omega(i)_2'}) \times (1 - q)$,

when an ineffective decomposition occurs, the molecule continues to decompose until

or

holds; the larger decomposed molecule $\omega(pop+1) = \max(\omega(i)_1', \omega(i)_2')$ then undergoes wall-collision reactions until $PE_{\omega(pop+1)} < PE_{\omega(pop+1)'}$, and the population is increased by 1, i.e. pop = pop + 1.

For each molecule in the population pop, another molecule is selected arbitrarily and analyzed to judge whether the exchange reaction condition is met; if not, another molecule is selected for analysis; otherwise, the exchange reaction occurs;

for each molecule in the population pop, another molecule is selected arbitrarily and analyzed to judge whether the synthesis (binding) reaction condition is met; if not, another molecule is selected for analysis; otherwise, the synthesis reaction occurs and the population is reduced by 1, i.e. pop = pop - 1.

The invention has the beneficial effects that: the method is beneficial to reducing the rejection rate of the access service of the Internet of things, improving the utilization rate of network system resources, accelerating the solving speed of the global approximate optimal solution, improving the approximation degree of the approximate optimal solution and finally accelerating the automation and intelligentization process of the virtual network.

Drawings

FIG. 1 is a flow chart of the present invention.

Detailed Description

The resource arrangement of virtualized network functions is a combinatorial optimization problem. The virtualized wireless access network architecture is heterogeneous, distributed, dynamic and open; together with the discreteness of network functions, the exponential growth of network load, and the rapid emergence and diversity of new services, this makes arranging network virtualization resources very complicated, and the problem is NP-complete. Given the characteristics of the management platform of the virtualized wireless access network, the resource arrangement of virtualized network functions is generally solved with heuristic algorithms. However, by the no-free-lunch theorem, all metaheuristics that search for extrema perform exactly the same when averaged over all possible objective functions. Therefore, in industrial applications of resource arrangement for virtualized network functions, the optimization algorithm must be good at global search while also taking search speed into account. The Chemical Reaction Optimization (CRO) algorithm is inspired by the way interacting molecules in a chemical reaction seek the lowest potential energy on a potential energy surface. It uses four elementary reactions, obeys the first and second laws of thermodynamics, and is simple, general, robust, self-learning, self-organizing and self-adaptive. When solving combinatorial optimization and function optimization problems, particularly single-objective optimization of high-dimensional multimodal functions, the algorithm converges quickly, is robust, and can effectively avoid becoming trapped in local optima.
In a broad sense, chemical reaction optimization is an algorithmic framework: only the general operating agents (molecules) and the energy management scheme need to be defined, the molecular attributes can be changed to suit the user's requirements, and the population size can be adjusted in real time. The algorithm is therefore highly flexible and can adapt itself to different optimization problems.

The arrangement process consumes a great deal of time, and the surrogate-assisted evolutionary algorithm (SAEA) is a main approach to time-consuming optimization problems. SAEA is used to modify the initialization, wall-collision, decomposition, exchange, synthesis and objective-value estimation stages of the CRO, and the number of evaluations of the real objective is minimized by judging the quality of individuals in a multi-dimensional space formed by the predicted value and the error value of a Gaussian process model.

As shown in fig. 1, the present invention provides a virtualized wireless network function orchestration policy, comprising the following steps:

S1: based on the characteristics of virtualized wireless network resource arrangement and the requirements of the chemical reaction optimization model, the following chemical reaction optimization mathematical model for virtualized wireless network resource arrangement is established:

where n is the number of functions in the resource pool, m is the number of features in the resource pool, and $\mu_{j,k}$ is the cost required to complete the j-th network function $f_j$ of the virtual request with the k-th feature $a_k$.

S2: SAEA is used to modify the initialization, wall-collision, decomposition, exchange, synthesis and objective-value estimation stages of the CRO, and the mathematical model established in S1 is solved. Specifically, the local optimization capability of the CRO is improved based on Gaussian perturbation; the global and local search capabilities are balanced based on a random walk approach; and the ability and speed of the CRO in searching for the global approximate optimal solution are improved based on reinforcement learning.

Further, the step S1 includes the steps of,

S101: modeling the virtual feature cost of the virtual function, including:

the amount of resources required by the j-th network function $f_j$ of a virtual request with the k-th feature $a_k$ may be provided by one physical AP or by multiple physical APs, or may need only a portion of the resources of one physical AP, as shown in the following equation:

When one share of virtual resources is provided by one physical node, the cost is the sum of the products of each resource's unit price and its required quantity, plus the combined cost of the resources. $\delta_s$, $\delta_p$ and $\delta_{it}$ are the coefficients of the resources combined into functional module $x_{j',k'}$; $\eta_s$, $\eta_p$ and $\eta_{it}$ are the unit prices of the corresponding resources; and $\eta_d$ is the combined cost of the resources. it denotes service resources, s denotes bandwidth resources, p denotes power-domain resources, and N denotes the number of nodes.

$\eta_d = \sigma_b \times \eta_s + \sigma_p \times \eta_p + \sigma_{it} \times \eta_{it}$ (3)

$\sigma_b$, $\sigma_p$ and $\sigma_{it}$ are weight coefficients subject to the constraints $0 \le \sigma_b, \sigma_p, \sigma_{it} \le 1$ and $\sigma_b + \sigma_p + \sigma_{it} = 1$.
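Equation (3) and its weight constraints can be illustrated with a minimal Python sketch. The function name `combined_cost` and the numeric prices are hypothetical and chosen only for illustration; the sketch simply evaluates the weighted sum after checking that the weights lie in [0, 1] and sum to 1.

```python
import math

def combined_cost(eta_s, eta_p, eta_it, sigma_b, sigma_p, sigma_it):
    """Evaluate equation (3): weighted sum of the three unit prices."""
    weights = (sigma_b, sigma_p, sigma_it)
    # Constraint: every weight lies in [0, 1].
    if any(w < 0.0 or w > 1.0 for w in weights):
        raise ValueError("each weight must lie in [0, 1]")
    # Constraint: the weights sum to 1 (tolerance for float rounding).
    if not math.isclose(sum(weights), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1")
    return sigma_b * eta_s + sigma_p * eta_p + sigma_it * eta_it

# Hypothetical unit prices and weights, for illustration only.
eta_d = combined_cost(eta_s=2.0, eta_p=4.0, eta_it=10.0,
                      sigma_b=0.5, sigma_p=0.25, sigma_it=0.25)
print(eta_d)  # 4.5
```

Rejecting invalid weights up front keeps a mis-specified weighting from silently distorting every cost computed downstream.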

When one share of virtual resources is provided by multiple physical nodes, the cost is the sum of the costs of the N nodes plus the combined cost $cost_{j,k}$ of the N nodes. $cost_{j,k}$ is the price paid for using multiple virtual function modules $f_i$ with feature $a_j$ in parallel.

When multiple virtual resources are provided by one physical node, the cost is the sum of the 1/N node costs plus the 1/N node combined cost $Mcost_{j,k}$. $Mcost_{j,k}$ is the cost paid when multiple virtual function modules $f_i$ with attribute $a_j$ use the same resource together.

As can be seen from formulas (1)-(3),

According to the system model, the functions selected from the virtualized network function set must supply, for each function, resource quantities and features greater than or equal to those of the virtual request, which gives the following constraints:

$R_{j,k}.it \le N \times x_{j,k}.it$ (4)

$R_{j,k}.p \le N \times x_{j,k}.p$ (5)

$R_{j,k}.s \le N \times x_{j,k}.s$ (6)
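Constraints (4)-(6) lend themselves to a mechanical check. The sketch below is illustrative only: the `Resources` record and the `satisfies_request` helper are hypothetical names that do not appear in the patent, and the resource amounts are made-up numbers. It reads R as the virtual request, x as the selected module and N as the number of nodes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Resources:
    it: float  # service resources
    p: float   # power-domain resources
    s: float   # bandwidth resources

def satisfies_request(request: Resources, module: Resources, n_nodes: int) -> bool:
    """Return True when constraints (4), (5) and (6) all hold."""
    return (request.it <= n_nodes * module.it      # (4)
            and request.p <= n_nodes * module.p    # (5)
            and request.s <= n_nodes * module.s)   # (6)

request = Resources(it=4.0, p=2.0, s=6.0)
module = Resources(it=2.0, p=1.0, s=3.0)
print(satisfies_request(request, module, n_nodes=2))  # True
print(satisfies_request(request, module, n_nodes=1))  # False
```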

R denotes a virtual request and x denotes a selected module. Virtual service arrangement is essentially the selection of a subset of virtual functions from a set of virtual functions. When construction costs are equal, many different selection schemes are possible; it is therefore an NP-hard problem. To make the problem easier to solve while reflecting resource scarcity, the following constraints are added:

where s', p' and it' denote the resources that are already in use, and all denotes all resources.

Combining formulas (7)-(9), formula (2) is converted to:

Functional modules may depend on one another. $f_i \rightarrow f_{i+y}$ indicates that virtual function module $f_i$ and virtual function module $f_{i+y}$ have a dependency relationship: if $f_i$ is present, then $f_{i+y}$ must be present. $f_i \ne f_{i+y}$ indicates that virtual function module $f_i$ and virtual function module $f_{i+y}$ have an exclusive relationship: if $f_i$ is present, then $f_{i+y}$ must not be present.

An exclusive relationship between virtual function module $f_i$ and virtual function module $f_{i+y}$ can be expressed in the service request itself. Therefore, in the solving process, the exclusion constraint

can be removed. Likewise, when virtual function module $f_i$ and virtual function module $f_{i+y}$ have a dependency relationship, only virtual function module $f_i$ needs to be added in the solving process, and its cost increment is:

$\mu_{j,k}' = \mu_{j,k} + \mu_{j+y,k}$ (13)
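Equation (13) folds the cost of a dependent module into the module that requires it. A minimal sketch, assuming the costs $\mu_{j,k}$ are stored as a list of rows indexed by function j and feature k; the helper name `fold_dependency_cost` and the matrix values are hypothetical.

```python
def fold_dependency_cost(mu, j, y):
    """Return a copy of mu in which row j absorbs row j + y, per equation (13)."""
    folded = [row[:] for row in mu]  # copy so the input matrix stays untouched
    folded[j] = [a + b for a, b in zip(mu[j], mu[j + y])]
    return folded

# Hypothetical 3-function, 2-feature cost matrix.
mu = [[1.0, 2.0],
      [3.0, 4.0],
      [5.0, 6.0]]
folded = fold_dependency_cost(mu, j=0, y=2)
print(folded[0])  # [6.0, 8.0]
```

With the dependency priced into row j, the solver only has to decide on $f_j$; the dependent module no longer needs its own explicit constraint.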

Formula (1) is converted into:

Further, step S2 comprises the following steps.

The local search capability of the CRO is mainly determined by the wall-collision and decomposition reactions of molecules; its global search capability is mainly determined by the exchange and synthesis reactions. Integrating the CRO with several heuristic algorithms balances global and local search capability and speeds up solving. The local optimization capability of the CRO is improved with a Gaussian random walk model, and its global optimization capability is improved using the maximum Hamming distance. The number of evaluations of the real objective is minimized by judging the quality of individuals in a multidimensional space formed by the predicted value and the error value of the Gaussian process model.

S201: let $\omega(i)$ be the structure of the i-th molecule. KE is a measure of the state of a molecule: it indicates the ability of the molecule to escape from its current state to a worse molecular structure (a new solution with a higher fitness-function value). The initial value of KE is 0. buff is the buffer energy, generated by ineffective molecular collisions and managed by a global function, with an initial value of 0.
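The state described in S201 can be pictured as a small data structure. The following sketch is one possible layout, not the patent's implementation: the class and field names are hypothetical, the structures and potential energies are made-up values, and the `best_*` fields are an assumption that anticipates the per-molecule best structure used by the reactions.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Molecule:
    structure: List[int]   # omega(i): one candidate arrangement
    pe: float              # potential energy: objective value of the structure
    ke: float = 0.0        # kinetic energy, initial value 0 as in S201
    best_structure: List[int] = field(default_factory=list)
    best_pe: float = float("inf")

    def __post_init__(self):
        # The first structure seen is also the best one seen so far.
        if not self.best_structure:
            self.best_structure = list(self.structure)
            self.best_pe = self.pe

@dataclass
class Population:
    molecules: List[Molecule]
    buff: float = 0.0      # shared buffer energy, initial value 0 as in S201

pop = Population([Molecule([0, 1, 1], pe=7.0), Molecule([1, 0, 1], pe=5.0)])
print(pop.buff, [m.ke for m in pop.molecules])  # 0.0 [0.0, 0.0]
```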

S202: wall-collision reaction:

A molecule hits the wall of the container and part of its structure changes. Let $\omega(i)'$ be the structure of the molecule after the collision (the prime mark applies to all objects), let $\omega(i).best$ be the structure of the i-th molecule with the lowest potential energy so far, and let $\omega.gbest$ be the molecular structure with the lowest global potential energy so far. First, a Gaussian perturbation is applied to the lowest-potential-energy structure of the current molecule i; then a random walk model (random walk approach) walks between this perturbed structure and the molecular structure with the lowest global potential energy, which gives the structure of the molecule after the collision:

where the formula contains a Gaussian perturbation term and rand is a random number. The condition for a molecule to undergo a wall-collision reaction is:

$PE_{\omega(i)} + KE_{\omega(i)} \ge PE_{\omega(i)'}$ (16)

By the law of conservation of energy, the kinetic energy KE of the resulting molecule is

$KE_{\omega(i)'} = (PE_{\omega(i)} + KE_{\omega(i)} - PE_{\omega(i)'}) \times q$ (17)

where q is a loss coefficient and (1 - q) is the proportion of KE lost in the wall collision.
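The energy bookkeeping of formulas (16) and (17) can be sketched in a few lines of Python. The function name `wall_collision_ke` is hypothetical, and the sketch covers only the feasibility test and the kinetic energy update, not the perturbation that produces the new structure.

```python
from typing import Optional

def wall_collision_ke(pe_old: float, ke_old: float, pe_new: float, q: float) -> Optional[float]:
    """Return the new KE per (17), or None when condition (16) fails."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    surplus = pe_old + ke_old - pe_new
    if surplus < 0.0:       # condition (16) violated: not enough energy
        return None
    return surplus * q      # (17): a fraction q of the surplus is kept as KE

print(wall_collision_ke(10.0, 2.0, 9.0, q=0.5))   # 1.5
print(wall_collision_ke(10.0, 0.0, 11.0, q=0.5))  # None
```

A molecule with spare kinetic energy can therefore accept a slightly worse structure, which is what lets the search leave a local optimum.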

S203: a molecule is decomposed into two molecules. Let $\omega'_1$, $\omega'_2$ be the structures of the molecules produced by decomposition. A Gaussian perturbation is applied to $\omega$, followed by a random walk, so that

The condition for a molecule to undergo a decomposition reaction is:

By the law of conservation of energy, the kinetic energy KE of the resulting molecules is

where temp is a temporary variable.

S204: two molecules $\omega_1$, $\omega_2$ exchange the values at some randomly chosen identical positions. To better obtain a global approximate optimal solution, a random number is randomly added to each molecular structure when molecules are exchanged. Let $\omega'_1$, $\omega'_2$ be the structures of the molecules after the exchange; then

where one operator denotes replacing the corresponding values of $\omega_2$ with k bits taken from arbitrary positions of $\omega_1$, and $rand(\omega)$ is a randomly generated molecular structure.

The condition for molecules to undergo an exchange reaction is:

$temp2 = buff \times rand$ (26)

where temp2 is a temporary variable. By the law of conservation of energy, the kinetic energy KE of the exchanged molecules can be calculated, with

$buff = buff - temp2$ (31)

S205: synthesis reaction: the values at the same positions of two molecules $\omega_1$, $\omega_2$ are added, modulo the highest value of that position. Let $\omega'$ be the structure of the synthesized molecule; then

$\omega' = \omega_1 + \omega_2$ (32)

The condition for molecules to undergo a synthesis reaction is:

$temp2 = buff \times rand$ (33)

$PE_{\omega_1} + KE_{\omega_1} + PE_{\omega_2} + KE_{\omega_2} + temp2 \ge PE_{\omega'}$ (34)

$KE_{\omega'} = (PE_{\omega_1} + KE_{\omega_1} + PE_{\omega_2} + KE_{\omega_2} - PE_{\omega'}) \times q$ (35)

$buff = buff - temp2$ (36)
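Formulas (32)-(36) can be followed step by step in a short sketch. Two points are assumptions rather than statements of the patent: the per-position modulus is passed in by the caller, because the text does not say whether the "highest value" itself remains reachable, and the buffer is left unchanged when condition (34) fails. The function names and numbers are hypothetical.

```python
import random

def synthesize(w1, w2, moduli):
    """Formula (32): position-wise sum, reduced modulo a per-position modulus."""
    return [(a + b) % m for a, b, m in zip(w1, w2, moduli)]

def synthesis_energy(pe1, ke1, pe2, ke2, pe_new, buff, q, rng=random):
    """Apply (33)-(36); return (new KE or None, updated buffer)."""
    temp2 = buff * rng.random()                  # (33): energy borrowed from buff
    total = pe1 + ke1 + pe2 + ke2
    if total + temp2 < pe_new:                   # (34) violated: no synthesis
        return None, buff                        # assumption: buffer untouched
    ke_new = (total - pe_new) * q                # (35)
    return ke_new, buff - temp2                  # (36)

print(synthesize([1, 2, 3], [3, 2, 1], moduli=[4, 3, 5]))   # [0, 1, 4]
print(synthesis_energy(5.0, 1.0, 4.0, 0.0, 8.0, buff=0.0, q=0.5))  # (1.0, 0.0)
```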

S206: adjusting CRO parameters based on the Q-learning method:

To accelerate convergence toward a global approximate optimal solution and to reduce the number of ineffective collisions and ineffective decompositions, a Q-learning method is used to determine the value of q.

In the Q-learning method, the set of states in which each molecule undergoes a chemical reaction is $S = \{S_1, \ldots, S_t, \ldots, S_T\}$, and $\pi$ is the action set of the Q-learning method, $\pi = \{a = a+1, a = a-1\}$ with $0 \le a \le T$. When a is 0 only the action a = a + 1 is performed, and when a is T only the action a = a - 1 is performed. The initial value of a is t, i.e. a = t, where t is the number of chemical reactions the molecule has undergone and T is the total number of iterations. The benefit of each step is $\gamma = |PE(\omega') - PE(\omega)|$. The cost of each step, l, is the amount by which buff increases when an ineffective collision or ineffective decomposition occurs. The Q value is updated by the following formula:

where $\sigma$ is the learning rate and $\beta$ is the discount factor. The formula shows that the larger the learning rate $\sigma$, the less of the previous training is retained. The larger the discount factor $\beta$, the greater the effect played by the corresponding term of the formula; the formula also includes the benefit held in memory.

The value of q is adjusted based on the Q-learning method so that it is larger in the early stage and smaller in the later stage.

The formula for q is:

where $\lambda$ is the coefficient of the exponential distribution.

Q-learning is a value-based reinforcement learning algorithm. Q denotes Q(s, a), the expected benefit of taking action a (a ∈ A) in state s (s ∈ S) at a given moment; the environment feeds back a corresponding reward r according to the action of the agent. The main idea of the algorithm is therefore to build a Q-table indexed by state and action to store the Q values, and then to select the action that yields the greatest benefit according to the Q values.
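The Q-table idea can be sketched with the textbook tabular Q-learning update. This is an assumption: the patent's own update formula is not reproduced in this text, so the rule below is the standard one, written with $\sigma$ as the learning rate and $\beta$ as the discount factor. How the benefit $\gamma$ and the cost l combine into a reward is likewise not specified here, so the reward is simply passed in. All function names and numbers are hypothetical.

```python
from collections import defaultdict

def allowed_actions(a, total):
    """Action set {a+1, a-1}, restricted at the bounds 0 and T."""
    if a <= 0:
        return (+1,)
    if a >= total:
        return (-1,)
    return (+1, -1)

def q_update(q_table, a, action, reward, total, sigma, beta):
    """Standard tabular update: Q <- (1 - sigma) * Q + sigma * (r + beta * max Q')."""
    nxt = a + action
    best_next = max(q_table[(nxt, b)] for b in allowed_actions(nxt, total))
    old = q_table[(a, action)]
    q_table[(a, action)] = (1 - sigma) * old + sigma * (reward + beta * best_next)
    return q_table[(a, action)]

q_table = defaultdict(float)      # Q-table keyed by (state, action), default 0
q_table[(1, +1)] = 2.0            # hypothetical value learned earlier
value = q_update(q_table, a=0, action=+1, reward=1.0, total=5, sigma=0.5, beta=0.9)
print(round(value, 6))  # 1.4
```

In this form a larger $\sigma$ shrinks the weight (1 - $\sigma$) on the old entry, and a larger $\beta$ gives more weight to the best stored value of the next state.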

S207: specific implementation of the step S2

The CROROS algorithm is implemented as follows. First, the population size pop of chemical reaction molecules and the total number of iterations T are initialized; then the virtual request R is initialized, together with the virtualized network functions and virtualized network resources of the network management system platform.

The value of the CROROS parameter q is adjusted based on the Q-learning method.

Each molecule in the population pop is analyzed to determine whether it meets the wall-collision reaction condition; if so, a wall-collision reaction occurs. After the wall-collision reaction, whether $PE_{\omega(i)} \ge PE_{\omega(i)'}$ holds is judged. If it holds, $\omega(i) = \omega(i)'$; otherwise the reaction is an ineffective wall collision, and by conservation of energy the energy of the wall collision is converted into buffer energy, as shown in the following formula.

$buff = buff + (PE_{\omega(i)} + KE_{\omega(i)} - PE_{\omega(i)'}) \times (1 - q)$ (39)

When an ineffective wall collision occurs, the molecule continues to collide with the wall until $PE_{\omega(i)} < PE_{\omega(i)'}$.
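The accept-or-buffer decision for a wall collision can be sketched as follows. This is a literal reading of the paragraph above and formula (39), not a verified implementation: the function name `settle_wall_collision` is hypothetical, and the sketch assumes the candidate structure has already passed the feasibility condition (16).

```python
def settle_wall_collision(pe_old, ke_old, pe_new, buff, q):
    """Return (accepted, updated buff) after one wall-collision attempt."""
    if pe_old >= pe_new:
        # Effective collision: the new structure replaces the old one.
        return True, buff
    # Ineffective collision: formula (39) moves energy into the buffer.
    return False, buff + (pe_old + ke_old - pe_new) * (1.0 - q)

print(settle_wall_collision(10.0, 2.0, 9.0, buff=0.0, q=0.5))   # (True, 0.0)
print(settle_wall_collision(10.0, 2.0, 11.0, buff=0.0, q=0.5))  # (False, 0.5)
```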

Or

If greater than, ω (i) = min (ω (i) ₁ ',ω(i) ₂ ') while adding one ω (pop + 1) = max (ω (i) ₁ ',ω(i) ₂ ') to a test; otherwise, the reaction is ineffective decomposition, and the energy in collision with the wall is converted into the energy of the buffer region according to the principle of energy conservation, as shown in the following formula.

buff＝buff+(PE _ω(i) +KE _ω(i) -PE _ω(i)1' -PE _ω(i)2' )×(1-q) (40)

Or

Until now, the decomposed macromolecules ω (pop + 1) = max (ω (i) ₁ ',ω(i) ₂ ') carrying out a wall-collision reaction and to PE _ω(pop+1) <PE _ω(pop+1)' Until, population is added with 1,pop = pop +1 on the original basis.

For each molecule in the population pop, another molecule is selected at random and the pair is checked against the exchange reaction condition; if the condition is not met, another molecule is selected and checked, and otherwise the exchange reaction is carried out.

For each molecule in the population pop, another molecule is selected at random and the pair is checked against the binding (synthesis) reaction condition; if the condition is not met, another molecule is selected and checked; otherwise the binding reaction occurs and the population is reduced by 1 on its original basis: pop = pop - 1.

Finally, it is noted that the above-mentioned preferred embodiments illustrate rather than limit the invention, and that, while the invention has been described in detail with reference to the above-mentioned preferred embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the scope of the invention as defined by the appended claims.

Claims

1. A virtualized wireless network function orchestration policy, comprising the following steps:

S1: establishing a chemical reaction optimization mathematical model for the arrangement of virtualized wireless network resources;

S2: solving the mathematical model established in S1, wherein the solving comprises improving the local optimization capability of the CRO based on Gaussian disturbance, balancing the global and local search capabilities based on a random walk method, and improving the search capability and the search speed of the global approximate optimal solution of the CRO based on reinforcement learning; CRO denotes the chemical reaction optimization algorithm;

said step S1 comprises the following steps,

S101: modeling the virtual feature cost of the virtual function, including,

$$\eta_d = \sigma_b \times \eta_s + \sigma_p \times \eta_p + \sigma_{it} \times \eta_{it},$$

where $\delta_s$, $\delta_p$ and $\delta_{it}$ are the coefficients of the resources composing functional module $x_{j',k'}$; $\eta_s$, $\eta_p$ and $\eta_{it}$ are the unit prices of the corresponding resources; $\eta_d$ is the combined cost of the various resources; it denotes service resources, s denotes bandwidth resources, and p denotes power domain resources; $mcot_{j,k}$ represents the cost of several virtual function modules $f_i$ having attribute $a_j$ jointly using the same resource; $\sigma_b$, $\sigma_p$ and $\sigma_{it}$ are weight coefficients subject to the constraints $0 \le \sigma_b, \sigma_p, \sigma_{it} \le 1$ and $\sigma_b + \sigma_p + \sigma_{it} = 1$;
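The combined cost and its weight constraints can be illustrated with a small Python sketch. The dictionary keys and the numeric prices and weights are hypothetical values chosen for the example; only the weighted sum and the two constraints on the weights come from the text.

```python
import math


def combined_cost(prices, weights):
    """eta_d = sigma_b*eta_s + sigma_p*eta_p + sigma_it*eta_it.

    prices: unit prices keyed by resource ("s", "p", "it").
    weights: weight coefficients keyed by "b", "p", "it".
    """
    sigma = (weights["b"], weights["p"], weights["it"])
    if any(w < 0.0 or w > 1.0 for w in sigma):
        raise ValueError("each weight must lie in [0, 1]")
    if not math.isclose(sum(sigma), 1.0):
        raise ValueError("weights must sum to 1")
    return (weights["b"] * prices["s"]
            + weights["p"] * prices["p"]
            + weights["it"] * prices["it"])


# Hypothetical unit prices and weights, for illustration only.
cost = combined_cost({"s": 10.0, "p": 20.0, "it": 30.0},
                     {"b": 0.5, "p": 0.3, "it": 0.2})
print(cost)
```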

S102: the functions selected from the virtualized network function set, together with the quantity and characteristics of the resources each function requires, are expressed by the following constraints:

$$R_{j,k}.it \le N \times x_{j,k}.it,$$

$$R_{j,k}.p \le N \times x_{j,k}.p,$$

$$R_{j,k}.s \le N \times x_{j,k}.s,$$

wherein R represents a virtual request and x represents a selected module;
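A feasibility check for these three constraints might look like the sketch below. The constant N is not defined in this passage, so it is passed in as a plain positive number; the dictionary layout of the request and of the selected module is likewise an assumption for illustration.

```python
def request_is_satisfiable(request, selected, n):
    """Check R.r <= N * x.r for each resource type r in (it, p, s)."""
    return all(request[r] <= n * selected[r] for r in ("it", "p", "s"))


# Hypothetical request and module selection.
request = {"it": 4, "p": 2, "s": 6}
print(request_is_satisfiable(request, {"it": 1, "p": 1, "s": 1}, n=10))
print(request_is_satisfiable(request, {"it": 1, "p": 1, "s": 0}, n=10))
```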

the virtual service orchestration is represented by the following constraints:

where s', p', it' denote the corresponding resources that have already been used, and all denotes the total resources;

the following mathematical model is established:

$f_i \to f_{i+y}$ indicates that virtual function module $f_i$ and virtual function module $f_{i+y}$ have a dependency relationship, and $f_i \ne f_{i+y}$ indicates that virtual function module $f_i$ and virtual function module $f_{i+y}$ have a mutually exclusive relationship;

$$\mu_{j,k}' = \mu_{j,k} + \mu_{j+y,k},$$

the chemical reaction optimization mathematical model for virtualized wireless network resource arrangement is expressed by the following formula:

Said step S2 comprises the following steps,

S202: let $\omega(i)_{Best}$ be the structure with the lowest potential energy found so far for the i-th molecule, and let $\omega_{Gbest}$ denote the molecular structure with the lowest global potential energy found so far; the lowest-potential-energy structure of the current molecule i is first perturbed with Gaussian disturbance, and a random walk model is then used to walk between this Gaussian-perturbed structure and the molecular structure with the lowest global potential energy, which gives the structure of the molecule after collision:

wherein

is the Gaussian perturbation, and rand is a random number,

$$PE_{\omega(i)} + KE_{\omega(i)} \ge PE_{\omega(i)'}$$

$$KE_{\omega(i)'} = (PE_{\omega(i)} + KE_{\omega(i)} - PE_{\omega(i)'}) \times q,$$

where q is the loss coefficient, and (1 - q) represents the proportion of KE lost in the wall collision;
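The update formula of S202 is not reproduced in this text, so the following is only an illustrative sketch of the idea it describes: perturb a molecule's best structure with Gaussian noise, then take a random step from the perturbed structure toward the global best. The real-valued encoding, the step rule `p + rand * (g - p)`, and the parameter `sigma` are all assumptions, not the patent's formula.

```python
import random


def gaussian_random_walk(best_i, best_global, sigma=0.1, rng=random):
    """Gaussian perturbation of best_i followed by a random step toward best_global.

    Illustrative only: the exact S202 formula is not given in the text.
    """
    # Step 1: Gaussian disturbance around the molecule's own best structure.
    perturbed = [x + rng.gauss(0.0, sigma) for x in best_i]
    # Step 2: random walk between the perturbed structure and the global best.
    step = rng.random()
    return [p + step * (g - p) for p, g in zip(perturbed, best_global)]


rng = random.Random(0)
print(gaussian_random_walk([0.0, 0.0], [1.0, 2.0], sigma=0.05, rng=rng))
```

A small `sigma` keeps the search local, while the random step length controls how strongly the candidate is pulled toward the global best.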

S203: let $\omega'_1$, $\omega'_2$ be the structures of the decomposed molecules; $\omega$ is first perturbed with Gaussian disturbance and then subjected to a random walk, using the following formula,

the condition under which the decomposition reaction of a molecule occurs is expressed by the following formula:

wherein temp is a temporary variable;

represents replacing the corresponding values of $\omega_1$ with k bits taken from arbitrary positions of $\omega_2$;

represents replacing the corresponding values of $\omega_2$ with k bits taken from arbitrary positions of $\omega_1$; rand(ω) is a randomly generated molecular structure,

$$temp2 = buff \times rand,$$

temp2 is a temporary variable;

the kinetic energy KE of the exchanged molecules is obtained by using the following formula:

$$buff = buff - temp2,$$
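The k-bit exchange between two molecules can be sketched as follows. The sketch assumes that molecules are equal-length lists, that the k positions are drawn independently for each direction, and that each replaced position receives the value found at the same position in the other molecule; the function name and these details are illustrative assumptions.

```python
import random


def exchange_k_bits(w1, w2, k, rng=random):
    """Return copies of w1 and w2 after swapping in k values from each other."""
    n = len(w1)
    idx1 = rng.sample(range(n), k)  # positions of w1 overwritten from w2
    idx2 = rng.sample(range(n), k)  # positions of w2 overwritten from w1
    new1, new2 = list(w1), list(w2)
    for j in idx1:
        new1[j] = w2[j]
    for j in idx2:
        new2[j] = w1[j]
    return new1, new2


rng = random.Random(1)
print(exchange_k_bits([0] * 6, [1] * 6, k=2, rng=rng))
```

The inputs are copied rather than modified in place, so a rejected exchange can simply discard the returned structures.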

S205: synthesis reaction: the values of the two molecules $\omega_1$, $\omega_2$ at the same position are added and taken modulo the highest value of that position; let $\omega'$ be the structure of the resulting molecule, expressed by the following equation,

$$\omega' = \omega_1 + \omega_2,$$

$$temp2 = buff \times rand,$$

$$PE_{\omega_1} + KE_{\omega_1} + PE_{\omega_2} + KE_{\omega_2} + temp2 \ge PE_{\omega'},$$

$$KE_{\omega'} = (PE_{\omega_1} + KE_{\omega_1} + PE_{\omega_2} + KE_{\omega_2} - PE_{\omega'}) \times q,$$

$$buff = buff - temp2,$$
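The synthesis step can be sketched in two parts: the position-wise addition with modulo, and the energy test with the buffer update. Passing the per-position highest values as a list `max_values`, drawing `rand` outside the function, and deducting temp2 from the buffer only when the reaction is accepted are assumptions made for this example.

```python
def synthesize(w1, w2, max_values):
    """Add values position by position, modulo the highest value of each position."""
    return [(a + b) % m for a, b, m in zip(w1, w2, max_values)]


def synthesis_energy(pe1, ke1, pe2, ke2, pe_new, buff, rand, q):
    """Energy test of the synthesis reaction; returns (accepted, ke_new, buff)."""
    temp2 = buff * rand  # share of the buffer offered to the reaction
    total = pe1 + ke1 + pe2 + ke2
    if total + temp2 >= pe_new:
        ke_new = (total - pe_new) * q
        return True, ke_new, buff - temp2  # assumed: buffer reduced on acceptance
    return False, None, buff


print(synthesize([1, 2, 3], [2, 2, 3], [3, 3, 4]))
print(synthesis_energy(5.0, 1.0, 4.0, 2.0, 10.0, buff=1.0, rand=0.5, q=0.5))
```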

S206: each occurrence of a chemical reaction of a molecule is taken as a state in the Q-learning method, with state set $S = \{S_1, \dots, S_t, \dots, S_T\}$; $\pi$ is the action set of the Q-learning method, where $\pi = \{a = a + 1,\ a = a - 1\}$ and $0 \le a \le T$; when a is 0, only the action a = a + 1 can be invoked, and when a is T, only the action a = a - 1 can be invoked; the initial value of a is t, where t is the number of chemical reactions the molecule has undergone and T is the number of overall iterations; the gain of each step is expressed as $\gamma = |PE(\omega') - PE(\omega)|$; the cost l of each step is the increase in buff when an ineffective collision or an ineffective decomposition occurs; and the Q value is updated by the following formula:

where σ is the learning rate, β is a discount factor,

is the gain held in memory;

the value of q is adjusted by the following formula:

wherein λ is the coefficient of the exponential distribution;
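The boundary rule on the action set and the per-step gain can be sketched as below. The function names are hypothetical, and the Q-value update and the adjustment formula for q are deliberately left out because their formulas are not reproduced in this text.

```python
def allowed_actions(a, t_max):
    """Actions available at index a, with 0 <= a <= t_max.

    +1 stands for a = a + 1 and -1 for a = a - 1.
    """
    if a <= 0:
        return [+1]      # at the lower bound only a = a + 1 is possible
    if a >= t_max:
        return [-1]      # at the upper bound only a = a - 1 is possible
    return [+1, -1]


def gain(pe_new, pe_old):
    """Per-step gain: |PE(w') - PE(w)|."""
    return abs(pe_new - pe_old)


print(allowed_actions(0, 5), allowed_actions(3, 5), allowed_actions(5, 5))
print(gain(7.5, 10.0))
```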

S207: each molecule in the population pop is checked against the wall-collision reaction condition; if the condition is met, the wall-collision reaction is performed, and afterwards it is judged whether $PE_{\omega(i)} \ge PE_{\omega(i)'}$; if so, $\omega(i) = \omega(i)'$; otherwise the wall collision is ineffective and the energy of the collision is converted into buffer energy by the following formula,

$$buff = buff + (PE_{\omega(i)} + KE_{\omega(i)} - PE_{\omega(i)'}) \times (1 - q);$$

when an ineffective wall collision occurs, the molecule continues to collide with the wall until $PE_{\omega(i)} < PE_{\omega(i)'}$;

each molecule in the population pop is analyzed to determine whether the decomposition reaction condition is met, and if so, the decomposition reaction is performed; after the decomposition reaction, it is judged whether

Or

if so, $\omega(i) = \min(\omega(i)_1', \omega(i)_2')$ and a new molecule $\omega(pop+1) = \max(\omega(i)_1', \omega(i)_2')$ is added; otherwise the decomposition is ineffective, and the energy of the collision is converted into buffer energy by the following formula:

$$buff = buff + (PE_{\omega(i)} + KE_{\omega(i)} - PE_{\omega(i)_1'} - PE_{\omega(i)_2'}) \times (1 - q),$$

Or

is satisfied; the larger decomposed molecule $\omega(pop+1) = \max(\omega(i)_1', \omega(i)_2')$ then undergoes the wall-collision reaction until $PE_{\omega(pop+1)} < PE_{\omega(pop+1)'}$, and the population is increased by 1 on its original basis, pop = pop + 1,