CN115484304A - Real-time service migration method based on lightweight learning - Google Patents
- Publication number
- CN115484304A (application number CN202210921760.0A)
- Authority
- CN
- China
- Prior art keywords
- service
- migration
- delay
- learning
- agent
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H04L41/142 — Network analysis or design using statistical or mathematical methods (under H04L41/00, arrangements for maintenance, administration or management of data switching networks; H04L41/14, network analysis or design)
- H04L41/145 — Network analysis or design involving simulating, designing, planning or modelling of a network
- H04W24/02 — Arrangements for optimising operational condition (under H04W24/00, supervisory, monitoring or testing arrangements)
Abstract
The invention discloses a real-time service migration method based on lightweight learning. It constructs a service collaborative migration framework for dynamic edge networks and formulates a dual-objective optimization problem to optimize service performance and cost simultaneously. To solve this problem, an offline expert policy based on the global state is proposed to supply optimal results as expert trajectories. To realize real-time service collaborative migration from only the observable state, the invention further proposes a lightweight online agent policy based on imitation learning that imitates the expert trajectories, and accelerates model transfer with meta updating. Experimental results show that, compared with other representative algorithms, the proposed scheme significantly improves migration performance and reduces training cost, with clear advantages on multiple metrics such as service delay and payment cost under different workloads.
Description
Technical Field
The invention relates to a collaborative migration method for real-time services in dynamic edge networks, and in particular to a service migration algorithm based on imitation learning and a model-transfer acceleration algorithm based on meta learning.
Background
Enhanced mobile broadband has pushed 5G into commercial reality. With the transition to 6G, the rapid expansion of smart devices and the explosive growth of real-time applications have brought forward advanced service demands such as holographic communication, digital twins and augmented reality, generating large volumes of data that must be processed in time; global mobile traffic is projected to reach 1 ZB per month by 2028, equivalent to 5 billion users worldwide each consuming 200 GB per month. These stringent computational requirements pose a significant challenge to resource-limited edge networks: device capabilities remain imperfect, so the timeliness requirements of real-world services collide with the limited resources available at the edge.
The high cost of updating or maintaining hardware limits the commercialization of new services. To guarantee the performance of real-time services, resources (including computation, communication and caching) are reserved according to the requirements announced at service-session setup. However, service execution requires heterogeneous resources spread across multiple edge devices and is highly dependent on the global network state. Because information is isolated on separate devices with limited communication capabilities, individual edge devices cannot observe the global state. At the same time, frequent interaction with a central node, such as a base station or other infrastructure with powerful sensing capability, burdens the network and threatens private information. A fundamental problem is therefore how to design lightweight, distributed agent policies that enable devices to cooperate autonomously and make optimal service decisions in real time, especially in dynamic edge networks. The challenges in studying this problem are as follows:
1. Resource contention is more intense among mobile devices with limited energy. Relying on a single service provider not only increases the rental burden but also reduces the efficiency of resource utilization. How to jointly schedule services and manage heterogeneous resources to optimize the quality of experience of service requesters is therefore worth studying.
2. Users are selfish and rational in the real world, and have different wishes to rent out resources. Therefore, there is a need to design an efficient pricing mechanism to incentivize devices and provide services to requesters by making a satisfactory tradeoff between stable but competitive infrastructure resources and decentralized but available device resources.
3. The training cost, communication load, and slow convergence of learning algorithms cause a sharp drop in time-sensitive quality of service. Designing a lightweight learning strategy that supports online, distributed decision-making is quite challenging.
Disclosure of Invention
The invention aims to design an efficient heterogeneous-resource integration scheme that optimizes both the performance and the cost of real-time services, and establishes a dynamic edge system supporting collaborative migration of real-time services. To minimize the delay and payment of service execution, the invention designs a lightweight continual-imitation service collaborative migration algorithm: a matching-based offline expert policy supplies expert demonstrations to the agent, and, from the resulting expert dataset, a distributed agent policy is trained by imitation learning to minimize the error between state-action pair distributions and thereby fit the expert policy. The method sheds the high learning load of traditional algorithms, reduces learning cost, and uses meta updating to accelerate model training and realize lightweight continual imitation.
The main inventive content is summarized as follows:
1. The invention constructs an intelligent service collaborative migration framework based on combinatorial resource optimization and proposes a pricing mechanism that reflects the willingness to cooperate in service provision. The problem is formulated as a dual-objective optimization problem minimizing execution latency and payment, and is decomposed, through analysis of the optimal execution latency, into selecting the execution device and determining the optimal migration rate.
2. The invention proposes an online service collaborative migration strategy (LOS) based on imitation learning, together with an offline expert policy that computes optimal matching results to generate an expert trajectory dataset for the agent.
3. The invention proposes a lightweight online agent policy that makes online decisions by imitating the obtained expert trajectory dataset. To overcome staleness of the expert dataset, the invention applies meta learning to accelerate model transfer when updating the agent policy, reducing the effort of continually retraining the model.
In view of the above, the technical scheme adopted by the invention is as follows: a real-time service migration method based on lightweight learning comprises the following steps:
1) Constructing a dynamic edge network model. Areas are divided according to the communication capacity of the infrastructure; each area contains service providers and service requesters. Service migration is executed in discrete time slots. A user terminal can act as a service requester and, at the same time, as a service provider, and a service generated by a requester can be partially migrated to other devices for execution. Migration execution of a service comprises three steps: input, execution and output. The service requester splits the service into a locally executed part and a migrated part that run in parallel, dispersing the workload to improve efficiency and reduce cost.
2) Formulating the service migration problem. Service delay and migration payment cost are taken as the indexes of collaborative-migration performance and cost, respectively, and a dual-objective optimization problem is constructed.
3) The infrastructure derives an optimal matching strategy based on the observed global state.
4) The expert dataset is passed to the agents so that each agent can train its policy via imitation learning.
5) Each agent trains its policy on the expert dataset and accelerates the model-update process with a meta-learning strategy, shedding the learning cost of a traditional neural network and reducing the traditional learning load. Every d time slots form an update period; in each period the expert trajectory dataset is refreshed and provided to the distributed agents for learning. Each device learns and updates its policy independently, based on its own observable information, to ensure the accuracy of the policy.
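The five-step procedure above can be sketched as a minimal update loop (a Python sketch; `collect_expert`, `train_agent`, `meta_update`, and `act` are hypothetical placeholders for the matching-based expert, the imitation learner, the meta-learning warm start, and the online policy — the patent does not specify code):

```python
def run_migration_agent(d, horizon, collect_expert, train_agent, meta_update, act):
    """Skeleton of the update cycle: every d slots the infrastructure publishes a
    fresh expert trajectory dataset; each distributed agent imitates it and then
    meta-updates its model so that continual retraining stays lightweight."""
    theta = None
    decisions = []
    for t in range(horizon):
        if t % d == 0:                           # start of an update period
            dataset = collect_expert(t)          # offline expert policy (global state)
            theta = train_agent(dataset, theta)  # imitation: fit state-action pairs
            theta = meta_update(theta)           # meta-learning accelerates adaptation
        decisions.append(act(theta, t))          # online decision from local observation
    return decisions
```

The stub structure makes the division of labour explicit: only `collect_expert` sees the global state, while `act` runs on each device with local observations.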
The invention has the following advantages and beneficial effects:
1. The invention constructs an intelligent service collaborative migration framework based on combinatorial resource optimization, achieving full utilization of resources by jointly optimizing heterogeneous resources. A pricing mechanism reflecting the willingness to cooperate is then proposed, with the price conveying the state of the service provider. The problem is formulated as a dual-objective optimization problem minimizing execution latency and payment, simultaneously optimizing execution performance and cost, and is decomposed, by analyzing the optimal execution latency, into selecting the execution device and determining the optimal migration rate.
2. The invention proposes an online service collaborative migration strategy based on imitation learning, together with an offline expert policy that obtains optimal matching results and generates an expert trajectory dataset for the agent; through matching, this policy yields optimal migration results on which the agent trains its local model.
3. The invention proposes a lightweight online agent policy that makes online decisions by imitating the obtained expert trajectory dataset. To overcome staleness of the expert dataset, meta learning is applied to accelerate model transfer when updating the agent policy, reducing the effort of continually retraining the model. By retaining part of the prior knowledge and recording the transfer process, the agent can update its policy at low load; the training model is thus refreshed with very little work, the update process is accelerated, and the agent runs more efficiently in practice.
Drawings
FIG. 1 is a diagram of an illustrative system model for service migration in a dynamic network;
FIG. 2 is a service migration illustration;
FIG. 3 is a schematic diagram of the variation of the percentage of power consumed, the available CPU frequency and the rent;
FIG. 4 is a graph of the accuracy performance of the algorithm proposed by the present invention and other representative algorithms for different update rounds;
FIG. 5 is a graph of the performance of the execution times of the algorithm proposed by the present invention and other representative algorithms for different update rounds;
FIG. 6 is a graph of migration-rate distributions for the algorithm proposed by the present invention and other representative algorithms under low workload;
FIG. 7 is a graph of migration-rate distributions for the algorithm proposed by the present invention and other representative algorithms under high workload;
FIG. 8 is a graph of achievable QoS distribution for the proposed algorithm and other representative algorithms under low workload;
FIG. 9 is a graph of achievable QoS distribution for the proposed algorithm and other representative algorithms under high workload;
FIG. 10 is a graphical illustration of the effect of service data size on average latency for the algorithm proposed by the present invention and other representative algorithms at low workload;
FIG. 11 is a graphical illustration of the effect of service data size on average latency for the algorithm proposed by the present invention and other representative algorithms at high workload;
FIG. 12 is a graphical illustration of the effect of service data size on average payment cost for the algorithm proposed by the present invention and other representative algorithms under low workload;
FIG. 13 is a graphical illustration of the effect of service data size on average payment cost for the algorithm proposed by the present invention and other representative algorithms under high workload;
FIG. 14 is a graphical illustration of the impact of service data size on average energy consumption ratio for the algorithm proposed by the present invention and other representative algorithms at low workload;
FIG. 15 is a schematic diagram of the effect of service data size on the average energy consumption ratio of the algorithm of the present invention and other representative algorithms at high workload;
FIG. 16 is a graph illustrating the effect of service data size on the average time-to-live gain of the proposed algorithm and other representative algorithms at low workload;
FIG. 17 is a graph illustrating the effect of service data size on the average time-to-live gain of the proposed algorithm and other representative algorithms at high workload;
FIG. 18 is a graphical illustration of the effect of communication range on average delay for the algorithm proposed by the present invention and other representative algorithms under low workload;
FIG. 19 is a graphical illustration of the effect of communication range on average delay for the algorithm proposed by the present invention and other representative algorithms under high workload;
FIG. 20 is a graphical illustration of the effect of communication range on average payment cost for the algorithm proposed by the present invention and other representative algorithms under low workload;
FIG. 21 is a graphical illustration of the effect of communication range on average payment cost for the algorithm proposed by the present invention and other representative algorithms under high workload;
FIG. 22 is a graphical illustration of the effect of communication range on average energy consumption ratio for the algorithm proposed by the present invention and other representative algorithms under low workload;
FIG. 23 is a graphical illustration of the effect of communication range on average energy consumption ratio for the algorithm proposed by the present invention and other representative algorithms under high workload;
FIG. 24 is a graphical illustration of the effect of communication range on average time-to-live gain for the algorithm proposed by the present invention and other representative algorithms under low workload;
FIG. 25 is a graphical illustration of the effect of communication range on average time-to-live gain for the algorithm proposed by the present invention and other representative algorithms under high workload;
FIG. 26 is a graph showing the effect of the number of classes of service on the average latency of the proposed algorithm and other representative algorithms under low workload;
FIG. 27 is a graph illustrating the effect of the number of classes of service on the average latency of the proposed algorithm and other representative algorithms under high workload;
FIG. 28 is a graphical illustration of the impact of the number of service classes on the average payment for the proposed algorithm and other representative algorithms under low workload;
FIG. 29 is a graphical illustration of the impact of the number of service classes on the average payment for the algorithm proposed by the present invention and other representative algorithms under high workload;
FIG. 30 is a graphical illustration of the impact of the number of classes of service on the average energy consumption ratio of the proposed algorithm and other representative algorithms at low workload;
FIG. 31 is a graphical illustration of the impact of the number of classes of service under high workload on the average energy consumption ratio of the proposed algorithm and other representative algorithms;
FIG. 32 is a graphical illustration of the impact of the number of classes of service on the average time-to-live gain of the proposed algorithm and other representative algorithms at low workload;
FIG. 33 is a graph illustrating the effect of number of service classes on average time-to-live gain for the algorithm proposed by the present invention and other representative algorithms at high workload.
Detailed Description
To show the advantages of the present invention more clearly and in detail, the following further describes embodiments of the invention with reference to the drawings.
The invention provides an efficient service collaborative migration framework, aiming to design an efficient heterogeneous-resource integration scheme that optimizes both the performance and the cost of real-time services, and, by analyzing the optimal migration rate of collaborative migration, provides a lightweight learning scheme based on imitation learning.
Step 1):
FIG. 1 is a diagram of an illustrative system model for service migration in a dynamic network. As shown, a dynamic edge network divides areas according to the communication capabilities of the infrastructure; each area contains service providers and service requesters. To capture dynamic conditions, service migration is executed in discrete time slots. A user terminal (such as a vehicle or a smart device) can act as a service provider while also being a service requester, and a service generated by a requester can be partially migrated to another device for execution.
The detailed migration-execution process of a service comprises three steps — input, execution, and output — as shown in the service-migration illustration of FIG. 2. The input of a service comprises two parts: the service data and the data packets required by the service. The service requester splits the service into a locally executed part and a migrated part that run in parallel, dispersing the workload to improve efficiency and reduce cost.
In time slot t, n_t service requests arrive at random; S_i(t) denotes the service request generated by device D_i(t). Each service is labelled with its service class, and K denotes the total number of service classes.
Step 1.1):
the details of the service execution model are as follows:
the scenario studied comprises two modes of communication, i.e. device-to-device communication and device-to-infrastructure communication, the achievable communication rate between the two devicesCan be calculated from the shannon formula as follows:
wherein, B ij Representing the communication bandwidth between device i and device j, Γ ij (t) represents the signal to interference plus noise ratio between device i and device j at time slot t. Once the communication conditions between device i and device j satisfy the restrictions, a communication link can be established.
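The rate computation can be written directly (a small Python sketch; the function names are illustrative, not from the patent):

```python
import math

def achievable_rate(bandwidth_hz, sinr):
    """Shannon formula: r_ij(t) = B_ij * log2(1 + Gamma_ij(t))."""
    return bandwidth_hz * math.log2(1.0 + sinr)
```

For example, a 1 MHz link at an SINR of 3 (about 4.8 dB) yields 2 Mbit/s.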
To guarantee communication quality, it is assumed that a user terminal can communicate with only one other device at a time, so device-to-device links do not interfere with one another and the device-to-device SINR is

Γ_ij(t) = P_i(t) g_ij(t) / σ²

where P_i(t) is the communication transmission power of device D_i(t), g_ij(t) is the channel gain between devices D_i(t) and D_j(t), and σ² is the additive white Gaussian noise power. Accordingly, if device D_i(t) and the infrastructure R(t) satisfy the communicable conditions, a communication link can be constructed based on non-orthogonal multiple access, and the SINR Γ_ir(t) is

Γ_ir(t) = P_i(t) g_ir(t) / (σ² + Σ_{m ≠ i} P_m(t) g_mr(t))

where P_i(t) is the transmission power of device D_i(t), g_ir(t) is the gain of its channel to the infrastructure, σ² is the additive white Gaussian noise power, and P_m(t) and g_mr(t) denote the communication powers and channel gains of the other devices in the device set that communicate with the infrastructure.

In time slot t, a service provider may receive more than one transmission request from other devices. The invention serves each request first-come-first-served, with arrivals following a Poisson distribution. Each user terminal maintains a single service table that can hold up to N requests, so the requests received by each device can be modeled as an M/G/1 queuing system. The transmission waiting delay T^wait is

T^wait = λ(θ² + T̄²) / (2(1 − λT̄))

where λ is the arrival intensity of transmission tasks, T̄ is the average transmission delay between the two devices, and θ² is the variance of the transmission delay. The communication delay T^comm is then obtained as T^comm = T^wait + T^trans, where T^trans is the transmission delay of the task data.
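The waiting-delay expression matches the Pollaczek–Khinchine mean waiting time of an M/G/1 queue; assuming that reading, a minimal sketch (function names are mine):

```python
def mg1_waiting_delay(arrival_rate, mean_trans_delay, trans_delay_var):
    """Mean waiting time of an M/G/1 queue (Pollaczek-Khinchine):
    T_wait = lambda * (theta^2 + T_bar^2) / (2 * (1 - lambda * T_bar))."""
    rho = arrival_rate * mean_trans_delay          # utilization; must be < 1
    if rho >= 1.0:
        raise ValueError("unstable queue: arrival rate too high")
    second_moment = trans_delay_var + mean_trans_delay ** 2
    return arrival_rate * second_moment / (2.0 * (1.0 - rho))

def communication_delay(waiting_delay, transmission_delay):
    """T_comm = T_wait + T_trans."""
    return waiting_delay + transmission_delay
```

As a sanity check, with exponential transmission times (mean 1, variance 1) and arrival rate 0.5, the formula reduces to the familiar M/M/1 waiting time of 1.0.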
When the required packets are not found on any available device, the service provider must download the data packets the service needs from the network and store them, provided it has sufficient remaining storage resources. The invention assumes that the packets buffered on a device follow a random distribution, and buffered packets may be shared with other communicable devices. Owing to the scarcity of spectrum resources for communicating with the infrastructure, the infrastructure can only download packets from the network.
After obtaining all of the input data, the service provider allocates computing resources to execute the service. The achievable processing rate (in megabytes per second) is the computing power available at the executor divided by the computation the service requires per unit of data:

v_ij(t) = f_j(t) / c_i(t)

(with f_j(t) replaced by R_comp(t) when the service is migrated to the infrastructure), where d_i(t) is the data size of service S_i(t), α_ij(t) is the task-migration decision variable (i = j indicates that the service executes locally), f_j(t) is the available computing resource of device D_j(t), R_comp(t) is the available computing resource of the infrastructure, and c_i(t) is the computational resource required by service S_i(t).
Based on this model, the delay of executing service S_i(t) on device D_j(t) comprises four parts: the service-data acquisition delay, the acquisition delay of the packets the service requires, the execution delay, and the feedback delay. According to the migration rate γ_i(t), the service is divided into a migrated part and a locally executed part. The invention defines a binary decision variable α_ij(t) to indicate the selected service provider; when the infrastructure is selected (α_ir(t) = 1), the service is migrated to the infrastructure for execution. A binary decision variable β_ijh(t) indicates the packet-sharing device; when no sharing device is selected, the data packets required by the service are obtained by downloading. The local execution delay is therefore

T_i^loc(t) = T_i^comp(t) + T_i^pkt(t),

i.e. the sum of the local computation delay T_i^comp(t) and the local packet-acquisition delay T_i^pkt(t). The local computation delay is

T_i^comp(t) = (1 − γ_i(t)) d_i(t) c_i(t) / f_i(t),

where γ_i(t) is the migration rate of service S_i(t), c_i(t) is the computational resource required to execute S_i(t), and f_i(t) is the computing power of device D_i(t). The local packet-acquisition delay combines sharing and downloading:

T_i^pkt(t) = β_iih(t) (s_h / r_ih(t) + T^wait) + (1 − Σ_h β_iih(t)) s_h / r_i^down(t),

where β_iih(t) is the packet-acquisition decision variable, s_h is the size of the data packet, r_ih(t) is the communication rate at which the packet is obtained locally, T^wait is the transmission waiting delay, and r_i^down(t) is the packet download rate. The migration execution delay adds the communication delay between the two devices, the computation delay at the executing device, and the acquisition delay of the packets the service requires:

T_i^mig(t) = γ_i(t) d_i(t) / r_ij(t) + T_ij^wait + γ_i(t) d_i(t) c_i(t) / f_j(t) + T_j^pkt(t) + d_i^out(t) / r_ij(t),

where α_ij(t) is the binary decision variable selecting the execution device, γ_i(t) is the migration-rate decision variable of service S_i(t), d_i(t) is the data size of S_i(t), d_i^out(t) is the size of its output data, r_ij(t) is the communication rate between the two devices, T_ij^wait is the communication waiting delay between them, β_ijh(t) is the service-packet acquisition decision variable, s_h is the size of the required packet, r_j^down(t) is the download rate of the device, f_j(t) is the computing resource available at the device, R_comp(t) is the computing resource available at the infrastructure, and R_down(t) is the infrastructure's data download rate (the latter two replacing f_j(t) and r_j^down(t) when the service is migrated to the infrastructure; T_j^pkt(t) is computed analogously to the local packet-acquisition delay using the executor's rates). Since the local and migrated parts execute concurrently, the total service execution delay is

T_i(t) = max(T_i^loc(t), T_i^mig(t)),

i.e. the maximum of the local execution delay T_i^loc(t) and the migration execution delay T_i^mig(t).
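The delay composition just described — local and migrated parts running in parallel, the slower one dominating — can be sketched as follows (hypothetical helper names; the packet-acquisition and waiting delays are passed in precomputed):

```python
def local_exec_delay(gamma, data_mb, cycles_per_mb, f_local, pkt_delay):
    """T_loc = local compute delay + local packet-acquisition delay."""
    compute = (1.0 - gamma) * data_mb * cycles_per_mb / f_local
    return compute + pkt_delay

def migration_exec_delay(gamma, data_mb, out_mb, rate, wait, cycles_per_mb,
                         f_remote, pkt_delay):
    """T_mig = upload + waiting + remote compute + packet acquisition + feedback."""
    upload = gamma * data_mb / rate + wait
    compute = gamma * data_mb * cycles_per_mb / f_remote
    feedback = out_mb / rate
    return upload + compute + pkt_delay + feedback

def total_delay(t_loc, t_mig):
    """Local and migrated parts run in parallel, so the slower one dominates."""
    return max(t_loc, t_mig)
```

Splitting a service (gamma between 0 and 1) shrinks both terms of the max, which is why partial migration can beat both all-local and all-remote execution.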
Step 1.2):
the rent model is detailed as follows:
Owing to users' rationality and selfishness, a fair incentive mechanism is needed to facilitate device cooperation. In the present invention, the unit lease price of computing resources varies with the state of device D_j(t) and is defined in two parts: one reflecting the available computing power f_j(t) and one reflecting the remaining available energy e_j(t), where the parameter k is a price coefficient adjusting the impact of each factor on the unit rent. Both factors are negatively correlated with the unit rent, reflecting the device's willingness to lease resources for profit. The two parts capture the different sensitivities of computing power and remaining energy to pricing; an exponential function is chosen for the battery level to give it the higher sensitivity. If e_j(t) is extremely low, the price of D_j(t) rises sharply no matter how many computing resources are available, avoiding faults caused by excessive power consumption.
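The exact pricing formula appears only as an image in the source; the following is a hypothetical instantiation with the stated properties (two additive parts, both negatively correlated with the unit rent's inputs, exponential sensitivity to battery level, price coefficient k):

```python
import math

def unit_rent(f_avail, f_max, e_remain, e_max, k=0.5):
    """Hypothetical pricing with the properties stated above: the rent rises as
    available compute f_avail shrinks (linear part) and rises steeply as the
    remaining energy e_remain approaches zero (exponential part)."""
    compute_part = k * (1.0 - f_avail / f_max)            # scarcer compute -> dearer
    energy_part = k * math.exp(-5.0 * e_remain / e_max)   # low battery -> price spike
    return compute_part + energy_part
```

The decay constant 5.0 is an arbitrary choice for illustration; any steep decay reproduces the behavior in FIG. 3, where rent spikes once the battery nears empty.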
FIG. 3 shows an example of the impact of the two relevant state factors on pricing for k = 0.5; the level of the price reflects the propensity of a service provider to lease resources to serve a requester. The horizontal axis represents time slots and the vertical axis the values of remaining energy, available computing resources, and rent. Clearly, when the remaining energy is quite low, the rent rises dramatically to prevent a collapse from power drain, no matter how many computing resources are available.
Infrastructure deployed in the real world has a fixed power supply, so the remaining energy of the infrastructure can be considered sufficient at all times. The rent function is calculated as follows:
wherein R_comp(t) is the available computing resources of the infrastructure, the constant 1 indicates that the remaining available energy is always sufficient, and k is the price coefficient. The corresponding energy consumption is calculated as:
wherein γ_i(t) is the mobility of service S_i(t); the remaining quantities are the local computation delay, e_comp the energy consumption percentage per unit of computation, the local download delay, e_down the energy consumption percentage per unit of download, the communication delay, and e_comm the energy consumption percentage per unit of communication.
Step 2):
the detailed steps of the optimization target construction are as follows:
In order to reduce the influence of the time-varying heterogeneous resource state on service cooperative migration performance, the service delay and the migration payment cost are used as the indexes of service cooperative migration performance and cost respectively, and the dual-objective optimization problem P1 can be expressed as follows:
wherein the first symbol indicates the execution slot length, α_ij(t) represents the service migration device decision variable, β_ijh(t) the service data packet acquisition decision variable, γ_i(t) the service mobility decision variable, T_i(t) the execution delay of the service in the time slot, P_i(t) the resource lease cost, and S the total number of service requests to be executed. P1 is thus subject to
C6: γ_i(t) ∈ [0,1],
Constraint C1 ensures that the execution delay of a service cannot exceed its tolerable delay, to guarantee the user's quality of experience, where T_i(t) is the service execution delay and the bound is the tolerable delay of a K_i-class service. Constraint C2 ensures that the migration portion of each service is completed within the communicable time, where the terms are the service's migration execution delay and the communicable delay between the two devices. Constraint C3 ensures that each service provider does not exhaust its remaining energy, preventing service interruption due to energy exhaustion, where the terms are the device's remaining energy and the execution energy consumption, and D_i(t) and the device set denote a device and the set of devices respectively. C4 defines the upper limit of the communication capacity between a device and the infrastructure, where α_ij(t) is the device migration decision variable and R_ch(t) the upper limit on the number of channels. Constraint C5 restricts the binary decision variables, where α_ij(t) and β_ijh(t) are the decision variables for the device migration and service data packet acquisition modes respectively, and n_t is the total number of devices. C6 gives the value range of the service mobility γ_i(t). Constraint C7 indicates that when the mobility γ_i(t) = 0, no service provider provides cooperation.
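The constraints just described can be collected into the following sketch of P1; the symbol names for the quantities whose original glyphs are not reproduced in the text (tolerable delay, migration delay, energies) are assumptions, not the patent's own notation:

```latex
\begin{aligned}
\mathrm{P1}:\quad & \min_{\alpha_{ij}(t),\,\beta_{ijh}(t),\,\gamma_i(t)}
  \Big(\tfrac{1}{S}\textstyle\sum_{i=1}^{S} T_i(t),\;
       \tfrac{1}{S}\textstyle\sum_{i=1}^{S} P_i(t)\Big) \\
\text{C1:}\quad & T_i(t) \le T^{\mathrm{tol}}_{K_i} \\
\text{C2:}\quad & T^{\mathrm{mig}}_i(t) \le T^{\mathrm{comm}}_{ij}(t) \\
\text{C3:}\quad & E^{\mathrm{exec}}_j(t) \le E^{\mathrm{res}}_j(t),\ \forall D_j(t) \\
\text{C4:}\quad & \textstyle\sum_i \alpha_{ij}(t) \le R_{ch}(t) \\
\text{C5:}\quad & \alpha_{ij}(t),\,\beta_{ijh}(t) \in \{0,1\} \\
\text{C6:}\quad & \gamma_i(t) \in [0,1] \\
\text{C7:}\quad & \gamma_i(t) = 0 \ \Rightarrow\ \text{no provider cooperates}
\end{aligned}
```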
And step 3):
the optimization problem P1 constructed varies as follows:
Since the purpose of problem P1 is to minimize the average performance of service cooperative migration, the present invention aims to minimize the average delay T_i(t) and cost P_i(t) of service cooperative migration per slot, so P1 can be converted into:
constrained by C1-C7. Owing to the mobility variable, the service execution delay T_i(t) is lowest when the local execution delay and the migration execution delay are equal, so P2 can be rewritten as:
constrained by C1-C7. Since the two decision variables α_ij(t) and β_ijh(t) are coupled with each other, and in order to evaluate the Pareto-optimal solution, the invention defines a utility metric expressing the optimal cost; the joint optimization problem can thus be decomposed into two sub-problems P4 and P5 as follows:
constrained by C3-C5.
Constrained by C1, C2, C7.
And step 4):
the detailed steps of acquiring the expert track are as follows:
The system of the present invention involves multiple devices and multiple migrated services simultaneously. In time slot t, the service requesters and the service providers can be constructed as two disjoint sets of entities. The benefit of migrating to each device can be derived from the observed global state, so the problem can be translated into a matching problem that maximizes the overall benefit.
Step 4.1):
At the beginning of each time slot in an update round, the match count D_j(t).visit of each device and the match count S_i(t).visit of each service are first initialized to 0; then the preference value of each device is initialized to 0, and the adjustment parameter is initialized to ∞;
step 4.2):
For each service request, the optimal mobility for execution on each candidate migration device is first obtained, and the matching decisions α_ij(t) and β_ijh(t) are derived from it. The lower limit of the mobility is:
wherein the quantities are the tolerable delay of a K_i-class service, the local data packet acquisition delay, and the local computation delay. When the corresponding condition holds, the upper limit of the mobility is:
wherein the quantities are the communication delay between the two devices, the communication waiting delay, the data packet acquisition delay, the communication delay, and the computation delay. In the complementary case, the upper limit of the mobility is:
wherein the quantities are the tolerable delay of the service, the communication waiting delay, the data packet acquisition delay, the communication delay, and the computation delay. Since the optimum is reached when the local delay and the migration delay are equal, the optimal mobility can be expressed as:
wherein the quantities are the local data packet acquisition delay, the local computation delay, the migration execution delay, the data packet acquisition delay, the communication delay, and the computation delay; the last symbol represents the actual, observable execution delay of the task. If γ_i(t) = 0, the mobility is obtained as follows:
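The equal-delay rule above can be sketched under an assumed linear delay model (local branch time scales with 1-γ, migrated branch time with γ); the function and parameter names are hypothetical, since the patent's exact formulas are not reproduced:

```python
def optimal_mobility(t_pkt_loc, t_comp_loc, t_pkt_mig, t_comm, t_comp_mig, t_tol):
    """Sketch of the optimal migration ratio gamma*.

    Assumed model: local branch takes t_pkt_loc + (1-g)*t_comp_loc,
    migrated branch takes t_pkt_mig + t_comm + g*t_comp_mig.
    The optimum equates the two branches (as stated in the text),
    clipped to [0, 1]; gamma = 0 is returned if even that optimum
    violates the tolerable delay t_tol (cf. constraint C1).
    """
    a, b = t_comp_loc, t_comp_mig
    # Equal-delay point: t_pkt_loc + (1-g)*a = t_pkt_mig + t_comm + g*b
    g = (t_pkt_loc + a - t_pkt_mig - t_comm) / (a + b)
    g = min(max(g, 0.0), 1.0)
    exec_delay = max(t_pkt_loc + (1 - g) * a, t_pkt_mig + t_comm + g * b)
    return g if exec_delay <= t_tol else 0.0
```

With a 10 s compute job on either side and a 2 s communication overhead, the sketch migrates 40% of the work, and falls back to γ = 0 when the tolerable delay cannot be met.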
step 4.3):
For each attempted migration device, if constraints C1-C7 are satisfied, the benefit U_ij(t) is added in descending order to the preference list of service S_i(t); otherwise, the benefit U_ij(t) for γ_i(t) = 0 is added to the preference list.
A priority value is then obtained for each service request from all preference values, as the maximum preference value over all services.
Step 4.4):
A matching operation is executed between service S_i(t) in the service request set and the device set, as follows: a suitable executing device for S_i(t) is sought from the set. An expected value U_ij(t) is defined as the sum of the service's preference value and the device's preference value. If the acceptance condition is satisfied, S_i(t) is migrated to device D_j(t) and the matching result is returned; otherwise the matching adjustment parameter Δ_j(t) needs to be updated, where the terms are the preference value of service S_i(t), the preference value of device D_j(t), and the expected value U_ij(t).
Step 4.5):
If no matching result is returned in step 4.4), an update operation is performed: for every device not previously matched, the adjustment factor is updated to min{δ, Δ_j(t)}, where δ is the adjustment factor initialized to ∞; all visited service preference values are adjusted by the adjustment variable, all device preference values are adjusted accordingly, and every adjustment variable Δ_j(t) is updated to Δ_j(t) − δ.
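Steps 4.1)-4.5) describe a preference-based matching with an adjustment parameter Δ_j(t). The following is a deliberately simplified, illustrative sketch of such a matching loop: it keeps the priority-by-maximum-preference ordering and the capacity limit (cf. C4), but abstracts away the Δ_j(t) update; all names are hypothetical.

```python
def match_services(benefit, capacity):
    """Greedy sketch of service-to-device matching.

    benefit[i][j] plays the role of the observed benefit U_ij(t) of
    migrating service i to device j; capacity[j] bounds how many
    services device j may serve. Services are processed in descending
    order of their best achievable benefit (their priority value);
    an unmatched service falls back to local execution, i.e.
    gamma_i(t) = 0. Returns {service: device or None}.
    """
    n_s, n_d = len(benefit), len(benefit[0])
    load = [0] * n_d
    # Priority of a service = its maximum preference value (step 4.3)
    order = sorted(range(n_s), key=lambda i: max(benefit[i]), reverse=True)
    result = {}
    for i in order:
        choices = sorted(range(n_d), key=lambda j: benefit[i][j], reverse=True)
        result[i] = None
        for j in choices:
            if benefit[i][j] > 0 and load[j] < capacity[j]:
                result[i], load[j] = j, load[j] + 1
                break
    return result
```

For example, with benefits [[5, 1], [4, 2], [0, -1]] and unit capacities, the first two services are matched to the two devices and the third stays local.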
The invention sets the infrastructure as the expert node to obtain the complete global state and constructs the expert trajectory <s(t), a(t)>. The execution phase can be divided into batches, each comprising state-action pairs used to update the imitation policy. In order to realize real-time service migration in an edge network, the invention provides a lightweight distributed online agent imitation strategy. After the expert strategy is completed, the expert trajectories can be collected as a dataset and transmitted to the agent training strategy as needed.
Step 5):
the online agent strategy comprises the following detailed steps:
In a dynamic edge network, devices are treated as distributed agents, and agent policies are trained to make migration decisions by imitating expert trajectories and approximating the expert policy. However, an excessively large expert trajectory dataset creates a heavy communication burden and becomes obsolete over time, so the agent must retrain its model to prevent performance loss, a repetitive process with a huge consumption of computing resources. To address this problem, the invention provides a lightweight online agent strategy that continuously imitates the updated expert trajectory from a few demonstrations.
The simulated learning process involves two participants: experts and agents. Setting d time slots as an updating period, wherein each updating period updates the expert trajectory data set and provides the updated expert trajectory data set to the distributed intelligent agent for learning. Update period is composed ofDenotes epsilon l D sample trajectory data are included to construct an expert strategy. Each device needs to independently learn and independently update the strategy based on observable information to ensure the accuracy of the strategy. The updating steps of the intelligent agent strategy are as follows:
step 5.1):
Before the model is updated, the initial model needs to be pre-trained to provide prior knowledge. After obtaining the initial expert demonstration dataset ε_0 and the expert policy, each agent obtains an initial agent model by training a neural network. The agent network estimates actions based on observed states and trains its policy by fitting, according to a loss function, the observed states and estimated action distribution to the expert policy π_e(a, s). The loss function is as follows:
wherein the first symbol represents the agent policy, π_e(a, s) the expert policy, a the actual action, s the observed state, the next symbol the predicted action, then the freezing parameter, θ_0 the initial parameters, and the final symbol the expectation. The parameter update process is therefore:
wherein l_b represents the learning rate of the base learner and the second term represents the gradient of the loss function.
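The behaviour-cloning update above (parameters moved against the gradient of the imitation loss at rate l_b) can be sketched with a toy linear policy; the squared loss is a surrogate, since the patent's own loss formula is not reproduced, and all names are illustrative:

```python
import numpy as np

def bc_update(theta, states, expert_actions, lr=0.1):
    """One behaviour-cloning gradient step for a linear policy
    a_hat = states @ theta, fitting the expert's state-action pairs
    <s(t), a(t)> under a surrogate mean-squared loss."""
    pred = states @ theta
    grad = 2 * states.T @ (pred - expert_actions) / len(states)
    return theta - lr * grad

rng = np.random.default_rng(0)
S = rng.normal(size=(64, 3))                 # observed states
true_theta = np.array([1.0, -2.0, 0.5])      # hypothetical expert policy
A = S @ true_theta                           # expert demonstrations
theta = np.zeros(3)
for _ in range(500):
    theta = bc_update(theta, S, A)
```

After the loop, theta has converged to the demonstrated expert policy, which is the sense in which the agent "approximates the expert strategy".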
Step 5.2):
In update period l (drawn from the set of update periods), the agent obtains a partially updated expert trajectory ε_l. To speed up the repeated model migration process, the invention uses meta-learning to record the scaling and translation of model migration. The meta-learning parameter in period l is denoted ω_l; the meta-learning process transforms the base parameters through ω_l, and the goal of meta-learning is to make the transformed parameters approximate the expert-optimal ones.
The meta-update of an agent comprises two sub-phases, namely base learner training and meta-learner training. In period l, an expert trajectory ε_e,l is randomly drawn from the dataset; a number of samples is drawn to train the base learning model, and a further sample is drawn to train the meta-learner. The temporary parameter θ'_l is initialized from the parameter θ_{l-1} of period l-1 and used for fine-tuning, updated as:
wherein l_b is the learning rate of the base learner, the gradient term is the gradient of the base learner's loss function, the next symbol the freezing parameter, θ_{l-1} the parameter of period l-1, and ω_{l-1} the meta-learner parameter. The meta-learner parameter ω_l is thus updated as:
wherein l_m is the learning rate of the meta-learner, the gradient term is the gradient of the meta-learner's loss function, the next symbol the freezing parameter, θ'_l the temporary parameter, and ω_{l-1} the meta-learner parameter of period l-1. The agent parameter θ_l can thus be updated as:
wherein l_m is the learning rate of the meta-learner, the gradient term is the gradient of the meta-learner's loss function, the next symbol the freezing parameter, θ'_l the temporary parameter, and ω_l the meta-learner parameter of period l.
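The two-phase meta-update above can be sketched as follows. The patent's exact formulas are images and not reproduced; this sketch only assumes an instantiation where ω_l holds a per-parameter scale and shift (the "scaling and translation of model migration") applied to the fine-tuned temporary parameters:

```python
import numpy as np

def meta_update(theta_prev, omega_prev, grad_base, grad_meta,
                lr_base=0.1, lr_meta=0.01):
    """Sketch of the two sub-phases of the agent's meta-update.

    Assumed instantiation: omega = (scale, shift) records the scaling
    and translation of model migration. Phase 1 fine-tunes a temporary
    parameter theta'_l with the base learner; phase 2 updates omega
    with the meta-learner; the agent parameter theta_l is then the
    transformed temporary parameter.
    """
    scale, shift = omega_prev
    # Phase 1 (base learner): fine-tune the temporary parameter theta'_l.
    theta_tmp = theta_prev - lr_base * grad_base
    # Phase 2 (meta learner): update the scale/shift on held-out data.
    scale = scale - lr_meta * grad_meta
    shift = shift - lr_meta * grad_meta
    # Agent parameters theta_l: scaled and shifted temporary parameters.
    theta_new = scale * theta_tmp + shift
    return theta_new, (scale, shift)
```

With a neutral transform (scale 1, shift 0) and a zero meta-gradient, the update reduces to plain fine-tuning, which is the intended special case.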
Step 5.3):
After completing the l-th training, each distributed agent makes migration decisions according to its policy and the observed state, until it enters the coverage of another infrastructure or until the (l+1)-th update period, at which point it repeats stage 2 for updating. This update process lets the agent model continuously imitate the expert strategy in a lightweight way, adapting effectively to the expert dataset while retaining known prior knowledge.
Through the above steps, the cooperative migration provided by the invention is realized. Figures 4 and 5 show the efficiency of the invention. To ensure the timeliness of the expert trajectory, it must be updated at intervals to prevent performance loss due to data obsolescence. As shown in fig. 4, the runtime of the proposed LOS agent policy drops drastically when the expert trajectory is updated. Combined with the accuracy performance of fig. 5, the LOS strategy without model migration must retrain the agent policy directly on the updated dataset to make decisions, which wastes prior knowledge and reduces accuracy on the smaller dataset; the LOS agent policy is clearly better suited to the continuous updates required by long-term scenarios, with an accuracy of around 0.74.
Fig. 6-9 illustrate the average mobility and achievable quality of service distribution for 10 update periods at low and high workloads, respectively. According to the proposed mobility acquisition scheme, the value of the optimal mobility depends on the selected service provider and the request state. The gap in achievable quality of service between high and low loads is quite small as shown in fig. 8 and 9. Based on this, the service requester of the proposed LOS policy (including the LOS agent policy and the LOS expert policy) achieves the highest quality of service except for full offloading of the service, proving the high efficiency of the proposed LOS scheme.
The policy performance at different service data sizes is shown in figs. 10-17. Figs. 10 and 11 show the average latency as the service data size increases, at low and high workloads respectively. The delay of the LOS agent policy increases from 0.91 to 3.07 seconds at low workload and from 116.85 to 426.34 seconds at high workload, just above the LOS expert policy. The LOS agent policy accommodates different workloads by balancing communication and computation load through an adjustable migration ratio, making a reasonable trade-off between local and migrated devices. Figures 14 and 15 show the average service-processing energy consumption percentages at different workloads. Clearly, the average energy consumption percentage increases with the amount of service data; combined with the decreasing payment performance in figs. 12 and 13, the LOS expert policy obtains an optimal decision that reduces cost while suppressing the rate of energy-consumption increase, providing a better expert trajectory for the LOS agent policy to imitate. Figs. 16 and 17 evaluate the average lifetime gain of service requesters at low and high workloads respectively. The rapid drop in lifetime gain shown in fig. 17 indicates that communication consumption exceeds the computation consumption saved at high workloads; the LOS agent strategy achieves a near-optimal lifetime gain by trading off the different workloads.
Different communication distance limitations are shown in figs. 18-25. Figs. 18 and 19 show the latency performance at different workloads, where the LOS agent policy has significant advantages over the other policies. Figs. 22 and 23 show the energy consumption at low and high workloads respectively. The power consumption of the LOS agent increases at low workload and decreases at high workload, indicating that service requesters are more inclined to migrate services by leasing resources at high load, at the expense of slight latency and power consumption, as figs. 18 and 20 also explain. As shown in figs. 22-25, the LOS agent policy adapts more flexibly to different communicable limits because the LOS expert policy, which approximates the optimal result, generates a global state-action distribution, reducing local power consumption and thus extending the life cycle of local devices.
Service performance under different numbers of service classes is evaluated in figs. 26-33. Experiments were run from 3 to 9 service classes to evaluate the generalization of the algorithm with multiple service classes. Under the same experimental conditions, the more cached content types there are, the lower the cache hit rate between communicating devices. Figs. 26 and 27 show that the LOS expert policy can jointly consider the communication, computation, and caching states of the communicable devices under the same conditions, yielding minimal latency; the LOS agent policy with a timely update policy imitates well from limited observed states. Figs. 28 and 29 evaluate the payment-cost performance, demonstrating a satisfactory trade-off between cached content and the migrated portion, the adaptation of the LOS agent policy to an increasing number of service classes, and the agent policy's ability to accurately model the expert decision distribution and obtain near-optimal decisions. As shown in figs. 30 and 31, the average energy consumption of the LOS agent policy is more stable than that of the other algorithms under different numbers of service classes. Under different workloads there is only a small gap for the LOS agent policy from 3 to 9 service classes, improving the lifetime gains assessed in figs. 32 and 33. This is not only because the LOS agent policy considers the status of both service requester and provider, but also because it can fit the global state from partially observed states. The performance gain of the LOS agent policy rises with the number of service classes, indicating that LOS effectively adapts to multi-service-class scenarios.
The above technical solutions only represent embodiments of the present invention and are not necessarily the most complete or accurate solutions; as technology innovates and the era moves on, more reasonable and efficient changes may be made to them. The exemplary embodiments were chosen and described in order to explain the principles and application of the invention and to help researchers and technicians understand and practice its details. All such modifications and variations are intended to fall within the scope of the invention, which is determined by the following claims and their equivalents.
Claims (7)
1. A real-time service migration method based on lightweight learning is characterized by comprising the following steps:
1) Constructing a dynamic edge network model; dividing regions according to the communication capacity of an infrastructure, wherein one region comprises a service provider and a service requester, setting that service migration is executed in discrete time slots, a user terminal can be used as the service provider while being used as the service requester, the service generated by the service requester can be partially migrated to other equipment for execution, the migration execution process of the service is divided into three steps of inputting, executing and outputting, and the service requester divides the migrated part into a local execution part and a migration execution part for parallel execution so as to disperse workload to improve the working efficiency and reduce the cost;
2) Resolving a service migration problem; respectively taking service delay and migration payment cost as indexes of service cooperative migration performance and cost to construct a dual-target optimization problem;
3) The infrastructure makes an optimal matching strategy based on the observed global state;
4) Transmitting the expert data set to the agent for the agent to train an agent strategy based on the imitation learning;
5) The intelligent agent trains the intelligent agent strategy based on the expert data set, and accelerates the model updating process based on the meta-learning strategy, so that the learning cost of the traditional neural network is eliminated, the traditional learning load is reduced, d time slots are set as an updating period, the expert track data set is updated in each updating period and provided for the distributed intelligent agent to learn, and each device needs to independently learn and independently update the strategy according to observable information to ensure the accuracy of the strategy.
2. The real-time service migration method based on lightweight learning according to claim 1, wherein: step 1) specifically comprises the steps of constructing service delay and transferring payment cost;
1.1 the service delay is as follows, wherein the terms are the local execution delay and the migration execution delay;
the migration execution delay is as follows,
wherein the terms are the communication delay between the two devices, the device computation delay, and the acquisition delay of the data packets required by the service;
1.2 the migration payment calculation process is as follows:
wherein the parameter k represents a price coefficient for adjusting the impact of the available computing power and the remaining energy on the unit rent;
wherein R_comp(t) is the infrastructure's available computing resources, the constant 1 means that the remaining available energy is always sufficient, and κ is the price coefficient; the corresponding energy consumption is calculated as
wherein γ_i(t) is the mobility of service S_i(t); the remaining quantities are the local computation delay, e_comp the energy consumption percentage per unit of computation, the local download delay, e_down the energy consumption percentage per unit of download, the communication delay, and e_comm the energy consumption percentage per unit of communication.
3. The real-time service migration method based on lightweight learning according to claim 1, wherein: step 2) the optimization problem P1 is
wherein the first symbol indicates the execution slot length, α_ij(t) represents the service migration device decision variable, β_ijh(t) the service data packet acquisition decision variable, γ_i(t) the service mobility decision variable, T_i(t) the execution delay of the service in the time slot, P_i(t) the resource lease cost, and S the total number of service requests to be executed; P1 is subject to
C6: γ_i(t) ∈ [0,1],
constraint C1 ensures that the execution delay of a service cannot exceed its tolerable delay, to guarantee the user's quality of experience, where T_i(t) is the service execution delay and the bound is the tolerable delay of a K_i-class service; constraint C2 ensures that the migration portion of each service is completed within the communicable time, where the terms are the service's migration execution delay and the communication delay between the two devices; constraint C3 ensures that each service provider does not exhaust its remaining energy, preventing service interruption due to energy exhaustion, where the terms are the device's remaining energy and the execution energy consumption, and D_i(t) and the device set denote a device and the set of devices respectively; C4 defines the upper limit of the communication capacity between a device and the infrastructure, where α_ij(t) is the device migration decision variable and R_ch(t) the upper limit on the number of channels; constraint C5 restricts the binary decision variables, where α_ij(t) and β_ijh(t) are the decision variables for the device migration and service data packet acquisition modes respectively, and n_t is the total number of devices; C6 gives the value range of the service mobility γ_i(t); constraint C7 indicates that when the mobility γ_i(t) = 0, no service provider provides cooperation.
5. The real-time service migration method based on lightweight learning according to claim 4, wherein: the step 4) specifically comprises the following steps:
step 4.1):
at the beginning of each time slot in an update round, the match count D_j(t).visit of each device and the match count S_i(t).visit of each service are first initialized to 0; then the preference value of each device is initialized to 0, and the adjustment parameter is initialized to ∞;
step 4.2):
for each service request, the optimal mobility for execution on each candidate migration device is first obtained, and the matching decisions α_ij(t) and β_ijh(t) are derived from it; the lower limit of the mobility is:
wherein the quantities are the tolerable delay of a K_i-class service, the local data packet acquisition delay, and the local computation delay; when the corresponding condition holds, the upper limit of the mobility is:
wherein the quantities are the communication delay between the two devices, the communication waiting delay, the data packet acquisition delay, the communication delay, and the computation delay; in the complementary case, the upper limit of the mobility is:
wherein the quantities are the tolerable delay of the service, the communication waiting delay, the data packet acquisition delay, the communication delay, and the computation delay; since the optimum is reached when the local delay and the migration delay are equal, the optimal migration ratio can be expressed as:
wherein the quantities are the local data packet acquisition delay, the local computation delay, the migration execution delay, the data packet acquisition delay, the communication delay, and the computation delay; the last symbol represents the actual execution delay of the task;
step 4.3):
for each attempted migration device, if constraints C1-C7 are satisfied, the benefit U_ij(t) is added in descending order to the preference list of service S_i(t); otherwise, the benefit U_ij(t) for γ_i(t) = 0 is added to the preference list; a priority value is then obtained for each service request from all preference values, as the maximum preference value over all services;
step 4.4):
a matching operation is executed between service S_i(t) in the service request set and the device set, as follows: a suitable executing device for S_i(t) is sought from the set; an expected value U_ij(t) is defined as the sum of the service's preference value and the device's preference value; if the acceptance condition is satisfied, S_i(t) is migrated to device D_j(t) and the matching result is returned; otherwise the matching adjustment parameter Δ_j(t) needs to be updated, where the terms are the preference value of service S_i(t), the preference value of device D_j(t), and the expected value U_ij(t).
Step 4.5):
if no matching result is returned in step 4.4), an update operation is performed: for every device not previously matched, the adjustment factor is updated to min{δ, Δ_j(t)}, where δ is the adjustment factor initialized to ∞; all visited service preference values are adjusted by the adjustment variable, all device preference values are adjusted accordingly, and every adjustment variable Δ_j(t) is updated to Δ_j(t) − δ.
6. The method according to claim 1, wherein the method comprises the following steps: step 5) the updating steps of the intelligent agent strategy are as follows:
step 5.1):
after obtaining the initial expert demonstration dataset ε_0 and the expert policy, each agent obtains an initial agent model by training a neural network; the agent network estimates actions based on observed states and trains its policy by fitting, according to a loss function, the observed states and estimated action distribution to the expert policy π_e(a, s); the loss function is as follows:
wherein the first symbol represents the agent policy, π_e(a, s) the expert policy, a the actual action, s the observed state, the next symbol the predicted action, then the freezing parameter, θ_0 the initial parameters, and the final symbol the expectation; the parameter update process is therefore:
wherein ι_b represents the learning rate of the base learner and the gradient term represents the gradient of the loss function;
step 5.2):
in update period l (drawn from the set of update periods), the agent obtains a partially updated expert trajectory ε_l; meta-learning records the scaling and translation of model migration, with the meta-learning parameter in period l denoted ω_l; the meta-learning process transforms the base parameters through ω_l, and the goal of meta-learning is to make the transformed parameters approximate the expert-optimal ones
Step 5.3):
7. The method according to claim 6, wherein: the meta-update of the agent comprises two sub-phases, namely base learner training and meta-learner training; in period l, an expert trajectory ε_e,l is randomly drawn from the dataset, a number of samples is drawn to train the base learning model and a further sample is drawn to train the meta-learner; the temporary parameter θ'_l is initialized from the parameter θ_{l-1} of period l-1 and used for fine-tuning, updated as:
wherein ι_b is the learning rate of the base learner, the gradient term is the gradient of the base learner's loss function, the next symbol the freezing parameter, θ_{l-1} the parameter of period l-1, and ω_{l-1} the meta-learner parameter; the meta-learner parameter ω_l is thus updated as:
wherein ι_m is the learning rate of the meta-learner, the gradient term is the gradient of the meta-learner's loss function, the next symbol the freezing parameter, θ'_l the temporary parameter, and ω_{l-1} the meta-learner parameter of period l-1; the agent parameter θ_l can thus be updated as:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210921760.0A CN115484304B (en) | 2022-08-02 | 2022-08-02 | Lightweight learning-based live service migration method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210921760.0A CN115484304B (en) | 2022-08-02 | 2022-08-02 | Lightweight learning-based live service migration method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115484304A true CN115484304A (en) | 2022-12-16 |
CN115484304B CN115484304B (en) | 2024-03-19 |
Family
ID=84422715
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210921760.0A Active CN115484304B (en) | 2022-08-02 | 2022-08-02 | Lightweight learning-based live service migration method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115484304B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2012149788A1 (en) * | 2011-09-30 | 2012-11-08 | Huawei Technologies Co., Ltd. | Service establishment method and system, radio network controller and user terminal |
CN110275758A (en) * | 2019-05-09 | 2019-09-24 | Chongqing University of Posts and Telecommunications | Intelligent virtual network function migration method |
CN111858009A (en) * | 2020-07-30 | 2020-10-30 | Aerospace Ouhua Information Technology Co., Ltd. | Task scheduling method of mobile edge computing system based on transfer and reinforcement learning |
CN111885155A (en) * | 2020-07-22 | 2020-11-03 | Dalian University of Technology | Vehicle-mounted task collaborative migration method for Internet-of-Vehicles resource fusion |
CN113114722A (en) * | 2021-03-17 | 2021-07-13 | Chongqing University of Posts and Telecommunications | Virtual network function migration method based on edge network |
US20210250838A1 (en) * | 2018-10-22 | 2021-08-12 | Huawei Technologies Co., Ltd. | Mobile handover method and related device |
CN113434212A (en) * | 2021-06-24 | 2021-09-24 | Beijing University of Posts and Telecommunications | Cache-assisted task cooperative offloading and resource allocation method based on meta reinforcement learning |
CN113543074A (en) * | 2021-06-15 | 2021-10-22 | Nanjing University of Aeronautics and Astronautics | Joint computation migration and resource allocation method based on vehicle-road-cloud cooperation |
CN114362810A (en) * | 2022-01-11 | 2022-04-15 | Chongqing University of Posts and Telecommunications | Low-orbit satellite beam-hopping optimization method based on transfer deep reinforcement learning |
Non-Patent Citations (3)
Title |
---|
ANDREAS POLZE: "Timely Virtual Machine Migration for Pro-active Fault Tolerance", 2011 14TH IEEE INTERNATIONAL SYMPOSIUM ON OBJECT/COMPONENT/SERVICE-ORIENTED REAL-TIME DISTRIBUTED COMPUTING WORKSHOPS, 21 April 2011 (2011-04-21) * |
LIU KUN: "Design and Simulation of a 5G-Based Satellite-Terrestrial Integrated Core Network", INFORMATION SCIENCE AND TECHNOLOGY SERIES, 15 March 2022 (2022-03-15) * |
TANG LUN; ZHOU YU; TAN QI; WEI YANNAN; CHEN QIANBIN: "Virtual Network Function Migration Algorithm for 5G Network Slicing Based on Reinforcement Learning", JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, no. 03, 15 March 2020 (2020-03-15) * |
Also Published As
Publication number | Publication date |
---|---|
CN115484304B (en) | 2024-03-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Chen et al. | Deep reinforcement learning for computation offloading in mobile edge computing environment | |
Wang et al. | Dynamic UAV deployment for differentiated services: A multi-agent imitation learning based approach | |
Zhou et al. | Incentive-driven deep reinforcement learning for content caching and D2D offloading | |
CN113434212B (en) | Cache auxiliary task cooperative unloading and resource allocation method based on meta reinforcement learning | |
Xie et al. | Adaptive online decision method for initial congestion window in 5G mobile edge computing using deep reinforcement learning | |
CN110290011A | Dynamic service placement method based on Lyapunov control optimization in edge computing | |
Wu et al. | Multi-agent DRL for joint completion delay and energy consumption with queuing theory in MEC-based IIoT | |
CN111262940A (en) | Vehicle-mounted edge computing application caching method, device and system | |
CN114143891A (en) | FDQL-based multi-dimensional resource collaborative optimization method in mobile edge network | |
Wang et al. | Distributed reinforcement learning for age of information minimization in real-time IoT systems | |
CN113822456A | Service composition optimization and deployment method based on deep reinforcement learning in a hybrid cloud-fog environment | |
CN116489712B | Mobile edge computing task offloading method based on deep reinforcement learning | |
CN115278708A (en) | Mobile edge computing resource management method for federal learning | |
CN113573320A (en) | SFC deployment method based on improved actor-critic algorithm in edge network | |
CN116489226A (en) | Online resource scheduling method for guaranteeing service quality | |
CN116185523A | Task offloading and deployment method | |
Li et al. | DQN-enabled content caching and quantum ant colony-based computation offloading in MEC | |
Nguyen et al. | Intelligent blockchain-based edge computing via deep reinforcement learning: solutions and challenges | |
Huang et al. | Reinforcement learning for cost-effective IoT service caching at the edge | |
Qadeer et al. | Hrl-edge-cloud: Multi-resource allocation in edge-cloud based smart-streetscape system using heuristic reinforcement learning | |
CN114051252A (en) | Multi-user intelligent transmitting power control method in wireless access network | |
CN117459112A | Mobile edge caching method and device in LEO satellite networks based on graph convolutional networks | |
Chen et al. | Distributed task offloading game in multiserver mobile edge computing networks | |
CN111901833A | Joint service scheduling and content caching method for transmission over unreliable channels | |
CN115484304A (en) | Real-time service migration method based on lightweight learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
GR01 | Patent grant ||