CN112312299A

CN112312299A - Service unloading method, device and system

Info

Publication number: CN112312299A
Application number: CN202011161252.4A
Authority: CN
Inventors: 邹彪; 杜伟; 王和平; 孟小前; 武艺; 郑思嘉; 张嘉琳
Original assignee: Sgcc General Aviation Co ltd; State Grid Corp of China SGCC
Current assignee: Sgcc General Aviation Co ltd; State Grid Corp of China SGCC
Priority date: 2020-10-27
Filing date: 2020-10-27
Publication date: 2021-02-02

Abstract

The invention provides a service unloading method, a device and a system, which relate to the technical field of communication, and the method comprises the following steps: acquiring service data to be unloaded, resource data, an initialized policy network and an initialized evaluation network; calculating a time delay parameter of the service data to be unloaded according to the resource data; generating a first service unloading result according to the service data to be unloaded and the initialized policy network; generating a first evaluation result of a first service unloading result according to the time delay parameter and the initialized evaluation network; updating a first gradient parameter of the policy network and a second gradient parameter of the evaluation network by using a depth deterministic policy gradient algorithm according to the first service unloading result and the first evaluation result; and generating a target service unloading result of the service data to be unloaded in the resource data according to the updated strategy network and the updated evaluation network. The invention fully utilizes the resource of fog calculation and provides low-delay response service for the service of the Internet of things.

Description

Service unloading method, device and system

Technical Field

The present invention relates to the field of communications technologies, and in particular, to a method, an apparatus, and a system for offloading services.

Background

Under the background of rapid development of a 5G network, the application range of the Internet of things is wider and wider, and more new services are provided for work and life of people. However, part of the internet of things services are sensitive to the calculation delay, and higher requirements are provided for the performance of the internet of things. The fog computing has the advantage of low time delay, and is gradually used as a supplement of a cloud computing technology, so that the fog computing is more and more widely applied to the Internet of things. Under the background, how to fully utilize the resource of the fog calculation and provide a lower delay response service for the service of the internet of things becomes an important research content.

Disclosure of Invention

The invention provides a service unloading method, a device and a system, which can solve the problem of large delay of network service unloading in a fog computing environment.

In a first aspect, an embodiment of the present invention provides a service offloading method, where the method includes: acquiring service data to be unloaded, resource data, an initialized policy network and an initialized evaluation network; calculating a time delay parameter of the service data to be unloaded according to the resource data; generating a first service unloading result according to the service data to be unloaded and the initialized policy network; generating a first evaluation result of the first service unloading result according to the time delay parameter and the initialized evaluation network; updating a first gradient parameter of the policy network and a second gradient parameter of the evaluation network by using a depth deterministic policy gradient algorithm according to the first service unloading result and the first evaluation result; and generating a target service unloading result of the service data to be unloaded according to the updated strategy network and the updated evaluation network.

In a second aspect, an embodiment of the present invention further provides a service offloading device, where the device includes: the acquisition module is used for acquiring service data to be unloaded, resource data, an initialized policy network and an initialized evaluation network; the time delay module is used for calculating a time delay parameter of the service data to be unloaded according to the resource data; the first generation module is used for generating a first service unloading result according to the service data to be unloaded and the initialized policy network; a second generation module, configured to generate a first evaluation result of the first service offload result according to the delay parameter and the initialized evaluation network; the updating module is used for updating a first gradient parameter of the strategy network and a second gradient parameter of the evaluation network by using a depth certainty strategy gradient algorithm according to the first service unloading result and the first evaluation result; and the result module is used for generating a target service unloading result of the service data to be unloaded according to the updated strategy network and the updated evaluation network.

In a third aspect, an embodiment of the present invention further provides a service offloading system, where the system includes: the system comprises a cloud service main control module, a regional SDN control module, an edge node module, a data module, an equipment module and any one of the service unloading devices; the service unloading device is in communication connection with the cloud service main control module, the regional SDN control module, the node module, the data module and the equipment module respectively.

In a fourth aspect, an embodiment of the present invention further provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the service uninstalling method when executing the computer program.

In a fifth aspect, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program for executing the service uninstalling method is stored in the computer-readable storage medium.

The embodiment of the invention has the following beneficial effects: the embodiment of the invention provides a service unloading scheme, which comprises the steps of obtaining service data to be unloaded, resource data, an initialized policy network and an initialized evaluation network; calculating a time delay parameter of the service data to be unloaded according to the resource data; generating a first service unloading result according to the service data to be unloaded and the initialized policy network; generating a first evaluation result of the first service unloading result according to the time delay parameter and the initialized evaluation network; updating a first gradient parameter of the policy network and a second gradient parameter of the evaluation network by using a depth deterministic policy gradient algorithm according to the first service unloading result and the first evaluation result; and generating a target service unloading result of the service data to be unloaded according to the updated strategy network and the updated evaluation network. According to the method and the device, a time delay parameter is obtained through calculation according to resource data, the time delay parameter is used for limiting time delay of different types of resource data, the time delay parameter is used for obtaining a first service unloading result and a first evaluation result, then, a strategy network and an evaluation network are updated through a depth certainty strategy gradient algorithm, a target service unloading result is obtained through the updated network, resources of fog calculation are fully utilized, and low-time-delay response service is provided for the service of the Internet of things.

Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.

In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.

Fig. 1 is a flowchart of a service offloading method according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a service offload system architecture according to an embodiment of the present invention;

fig. 3 is a schematic diagram illustrating steps of a service offloading method according to an embodiment of the present invention;

FIG. 4 is a diagram illustrating the effect of the maximum distance between adjacent cells on the average data transmission rate according to an embodiment of the present invention;

FIG. 5 is a schematic diagram illustrating the effect of the number of devices connected to each compute node on the average data transmission rate according to an embodiment of the present invention;

fig. 6 is a block diagram of a service offloading device according to an embodiment of the present invention;

FIG. 7 is a block diagram of another service offload device according to an embodiment of the present invention;

fig. 8 is a block diagram of a computer device according to an embodiment of the present invention.

Detailed Description

To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

At present, in the first existing scheme, the energy consumption of the fog computing nodes is solved, a distributed computing service unloading algorithm is provided by adopting a game theory and an intelligent optimization theory, and the energy consumption of the fog computing nodes is well reduced. In the second conventional scheme, a service unloading algorithm for reducing time delay is provided by aiming at solving the time delay of a plurality of computing tasks and adopting a cooperation mode among the computing tasks, so that the time delay of task execution is well reduced, and the energy consumption of edge nodes is reduced. In the third existing scheme, the energy consumption of the fog computing node is reduced, the task execution delay is delayed, and the task node payment cost is reduced as a comprehensive optimization target, a deep reinforcement learning algorithm is adopted, an optimized service unloading algorithm is provided, and a good effect is achieved in an experiment. In the prior art, the problem is modeled and solved by using an intelligent optimization algorithm with the aim of solving the high-definition video signal processing time delay as an objective and with the aims of minimizing energy consumption and optimizing low time delay as an optimization objective. In the prior art, the network environment density and complexity are taken as research backgrounds, a combined allocation algorithm of service unloading and resource scheduling in the network intensive environment is provided, the resource allocation problem in the network intensive environment is well solved, and the task calculation delay is reduced. In the prior art, the sixth scheme takes reducing the competition duration of user resources as an entry point, provides a service scheduling algorithm, optimizes a user resource allocation mechanism, and reduces the calculation delay of user tasks.

Known from the existing scheme, the existing scheme has achieved better results in the aspects of time delay, energy consumption and the like of service unloading. However, in the existing scheme, in terms of the fog computing environment, factors in aspects of service types, types of edge nodes and the like are less considered, and the service unloading problem in the whole network environment is not considered, particularly, factors such as large network environment range and dynamic change are not considered. Therefore, a new service offloading algorithm is urgently needed, and an algorithm for minimizing service offloading delay under the scenes of comprehensively considering network dynamic change, network scale, service type and the like is available.

Based on the above, the service offloading method, the device and the system provided by the embodiment of the invention provide a layered internet of things architecture in a large-scale fog computing environment, and according to the characteristics of tasks, four service offloading modes are provided, so that a service offloading algorithm based on a depth deterministic policy gradient is designed with the aim of minimizing service offloading delay.

To facilitate understanding of the present embodiment, a detailed description is first given of a service offloading method disclosed in the present embodiment.

An embodiment of the present invention provides a service offloading method, referring to a flow chart of a service offloading method shown in fig. 1, the method including the following steps:

step S102, acquiring service data to be unloaded, resource data, an initialized policy network and an initialized evaluation network.

In the embodiment of the invention, the service data to be unloaded is task data acquired according to the sensors of the equipment layer, and the sensors of the equipment can comprise intelligent factory sensors, intelligent market sensors, intelligent home sensors and the like. The resource data is device information in an internet of things service unloading management architecture, and can be used for determining information such as computing resources, transmission resources and idle resources. The policy network is used for generating unloading results of the service data to be unloaded on each Internet of things service unloading device, and the evaluation network is used for evaluating the unloading results. The initialized policy network and the initialized evaluation network are obtained by initializing the policy network and the evaluation network by using the initial parameters.

And step S104, calculating a time delay parameter of the service data to be unloaded according to the resource data.

In the embodiment of the present invention, the delay parameter is used to determine the execution duration of the service data to be offloaded on the device corresponding to the resource data. According to different types of resource data such as service types, types of edge nodes, network environments and the like, the time delay parameter can be calculated, so that time delay limitation on different types of resource data and different unloading modes is realized by utilizing the time delay parameter.

And step S106, generating a first service unloading result according to the service data to be unloaded and the initialized policy network.

In the embodiment of the invention, the task data to be processed can be determined according to the service data to be unloaded, and the first service unloading result can be obtained by combining the task data to be processed and the initialized policy network, wherein the first service unloading result is a service unloading scheme obtained preliminarily.

Step S108, generating a first evaluation result of the first service unloading result according to the time delay parameter and the initialized evaluation network.

In the embodiment of the present invention, the first evaluation result is used to evaluate the first service offloading result, and the first evaluation result obtained based on the delay parameter is used to evaluate the first service offloading result, so as to screen out a service offloading scheme capable of providing a low-delay response.

Step S110, according to the first service offloading result and the first evaluation result, updating a first Gradient parameter of the Policy network and a second Gradient parameter of the evaluation network by using a Deep Deterministic Policy Gradient algorithm (DDPG).

In the embodiment of the invention, a function required by a calculation process of a depth certainty strategy gradient algorithm is obtained according to a first service unloading result and a first evaluation result, and a first gradient parameter of a strategy network and a second gradient parameter of an evaluation network are updated by using the depth certainty strategy gradient algorithm.

And step S112, generating a target service unloading result of the service data to be unloaded in the resource data according to the updated policy network and the updated evaluation network.

In the embodiment of the invention, after the first gradient parameter of the policy network and the second gradient parameter of the evaluation network are updated, the updated policy network and the updated evaluation network are obtained, the updated policy network and the updated evaluation network are used for obtaining the updated service unloading result and the updated evaluation result, the updated service unloading result is screened according to the updated evaluation result to obtain the target service unloading result, and the target service unloading result is the final unloading result of the service data to be unloaded.

The embodiment of the invention provides a service unloading scheme, which comprises the steps of obtaining service data to be unloaded, resource data, an initialized policy network and an initialized evaluation network; calculating a time delay parameter of the service data to be unloaded according to the resource data; generating a first service unloading result according to the service data to be unloaded and the initialized policy network; generating a first evaluation result of the first service unloading result according to the time delay parameter and the initialized evaluation network; updating a first gradient parameter of the policy network and a second gradient parameter of the evaluation network by using a depth deterministic policy gradient algorithm according to the first service unloading result and the first evaluation result; and generating a target service unloading result of the service data to be unloaded according to the updated strategy network and the updated evaluation network. According to the method and the device, a time delay parameter is obtained through calculation according to resource data, the time delay parameter is used for limiting time delay of different types of resource data, the time delay parameter is used for obtaining a first service unloading result and a first evaluation result, then, a strategy network and an evaluation network are updated through a depth certainty strategy gradient algorithm, a target service unloading result is obtained through the updated network, resources of fog calculation are fully utilized, and low-time-delay response service is provided for the service of the Internet of things.

In order to improve the data processing efficiency, before acquiring the initialized policy network and the initialized evaluation network, the following steps may be further performed:

randomly generating an initial first gradient parameter and an initial second gradient parameter; initializing the strategy network by utilizing the initial first gradient parameter to obtain an initialized strategy network; and initializing the evaluation network by using the initial second gradient parameter to obtain the initialized evaluation network.

In the embodiment of the invention, the first gradient parameter is an initial parameter in the policy network, the second gradient parameter is an initial parameter in the evaluation network, and the policy network and the evaluation network are initialized by respectively using the first gradient parameter and the initial second gradient parameter to obtain the initialized policy network and the initialized evaluation network.

When the DDPG algorithm is used to solve the optimized service offloading policy, an Actor (policy) -Critic (evaluation) architecture is adopted. The Actor refers to a policy network and comprises an Actor _ M network and an Actor _ T network, wherein the main function of the Actor _ T network is to generate a training data set, and the main function of the Actor _ M network is to train parameters of an optimization network. Critic refers to an evaluation network, and comprises a Critic _ M network and a Critic _ T network, wherein the Critic _ T network mainly functions to generate a training data set, and the Critic _ M network mainly functions to train optimization network parameters.

In order to effectively reduce the task delay of service unloading and research the dynamic property of network topology as a variable, the resource data comprises calculation resource data and transmission resource data; calculating the delay parameter of the service data to be unloaded according to the resource data, which can be executed according to the following steps:

acquiring transmission delay data; calculating the execution duration of the service data to be unloaded according to the transmission delay data, the calculation resource data and the transmission resource data; and determining a time delay parameter of the service data to be unloaded according to the execution time length.

In the embodiment of the invention, the task delay consists of two parts, namely calculation delay and transmission delay. And pre-calculating transmission delay data, and then calculating the execution duration of the service data to be unloaded according to the transmission delay data, the calculation resource data and the transmission resource data.

In order to effectively reduce the time delay of service unloading and take the dynamic property of network topology as a variable for research, the embodiment of the invention divides the service unloading into four modes of idle equipment unloading, edge node unloading, cloud unloading and local unloading. The idle device unloading refers to unloading the service to a computing node which meets task requirements in a device layer, the edge node unloading refers to unloading the service to an edge node of the edge node layer, the cloud unloading refers to unloading the service to a corresponding server in a cloud service main control layer, and the local unloading refers to unloading the service to a computing node which meets the task requirements in the local area of the device layer. The idle device unloading and local unloading conditions are more suitable for the environment with small computing task requirements. The unloading of the edge node is more suitable for the environment with large computing task requirement and sensitive time delay. The cloud unloading is suitable for the environment with high computing task requirement and insensitive time delay. For the four service offload modes, only the local service offload mode does not need transmission delay, and the other three service offload modes all need transmission delay to be considered.

In order to obtain more reliable transmission delay data, the following formula may be used to generate the transmission delay data:

wherein, TR_D,XRepresenting transmission delay data, X representing cloud computing service, sensor terminal or edge computing node, TD_D,XIndicating the distance between the node of the task and the node receiving the task, GS indicating the Gaussian white noise power, TH_D,XRepresenting network messagesThe fading factor of the track, B represents the transmission bandwidth of the network, delta is the loss factor representing the network line,

indicating the power of the node accepting the task,

indicating the transmission line loss between the mission node and the node receiving the mission.

In order to ensure the calculation efficiency, the execution duration of the service data to be offloaded may be calculated according to the following formula based on the transmitted delay data, the calculation resource data, and the transmission resource data:

wherein,

indicating the duration of the execution of the task on the idle device,

indicating the computing power of the idle device, R_D2DRepresenting the transmission delay data from node D to node D,

indicating the duration of execution of a task on a local device，

Which represents the computing power of the local device,

indicating the duration of execution of the task on the edge device,

representing the computing power of the edge device, R_D2MRepresenting the transmission delay data from node D to the fog computing node M,

representing the execution duration of the task on the cloud device, f_RRepresenting computing power of cloud-end equipment, R_D2RRepresenting the transmission delay data from node D to cloud node R,

a set of fog computing node resources is represented,

representing a set of sensor nodes, D_jRepresenting sensor nodes, Q_pRepresenting computing resource data, S_pRepresenting transmission resource data.

In order to ensure that the time delay parameter is applicable to different unloading mode degrees, the time delay parameter of the service data to be unloaded is determined according to the execution duration by using the following formula:

wherein,

a latency parameter representing the service data to be offloaded,

a parameter representing a local offload mode is indicated,

indicating a fog computing node offload mode parameter,

a cloud offload mode parameter is represented that,

a parameter indicative of an idle device offload mode,

indicating the duration of the execution of the task on the local device,

indicating the duration of execution of the task on the edge device,

represents the execution time of the task on the cloud device,

indicating the duration of execution of the task on the idle device, K_pRepresenting the subtasks in the service data to be offloaded.

In order to obtain an optimized service unloading result, the method generates a first service unloading result according to the service data to be unloaded and the initialized policy network by using the following formula:

wherein,

a set of actions is represented that is,

representing actions performed at the edge compute node,

indicating the action to be performed at the idle device,

indicating a fog computing node offload mode parameter,

indicating an idle device offload mode parameter, Q_pRepresenting computing resource data.

In order to evaluate the first service offloading result, a first evaluation result of the first service offloading result is generated according to the time delay parameter and the initialized evaluation network by using the following formula:

wherein S is_tThe state space is represented by a representation of,

indicating that the edge compute node is used for offloading,

indicating that the use of the idle device is off-load,

which represents the total delay of the service,

to representThe delay parameter of the service data to be offloaded,

presentation service K_pThe time required for all local executions, r(s)_t,a_t) Representing a reward function, V_π(s) represents a value function, s represents a state, and π represents a strategy.

In order to improve the efficiency and accuracy of updating the neural network, according to the first service offloading result and the first evaluation result, updating a first gradient parameter of the policy network and a second gradient parameter of the evaluation network by using a depth deterministic policy gradient algorithm, which may be performed according to the following steps:

determining an action value function according to the first service unloading result and the first evaluation result; and updating a first gradient parameter of the policy network and a second gradient parameter of the evaluation network by using a depth deterministic policy gradient algorithm according to the action value function.

In order to improve the calculation efficiency, an action value function is determined according to the first service unloading result and the first evaluation result according to the following formula:

wherein Q is_π(s, a) represents an action value function, mu represents a strategy generated by the strategy network, gamma represents an attenuation factor, the value range is 0 to 1, s' represents the next state after s takes a, a represents action, and Q represents the next state after s takes a^μ(s_t,a_t) Representing the action value function obtained according to the bellman equation.

In order to improve the calculation accuracy, a first gradient parameter of the strategy network and a second gradient parameter of the evaluation network are updated by a depth certainty strategy gradient algorithm according to an action value function according to the following formula:

wherein m represents the serial number of the sensor node,

is the first gradient parameter that is to be measured,

is the second gradient parameter that is to be measured,

is the updated first gradient parameter and,

is the updated second gradient parameter,

the status is represented by a number of time slots,

represents an action, Q_iRepresents a motion value function, Q ', obtained from Bellman equation'_iRepresents the updated action value function obtained according to the bellman equation,

indicating the updated state, gamma indicates the decay factor,the function J is used to measure the behavior of the strategy mu, p^βRepresenting the distribution function of the state s.

Referring to the schematic diagram of the implementation steps of the service offloading method shown in fig. 3, the method mainly includes the following steps: designing a layered Internet of things service unloading management architecture in a fog computing environment; constructing a service unloading decision model based on DDPG in a fog computing environment; solving the service unloading algorithm scheme design for minimizing the service unloading delay; a service offload policy is executed. The procedure for carrying out this embodiment is described below in a specific example.

Referring to fig. 2, in the hierarchical internet of things service offloading management architecture in the step-designed fog computing environment, the hierarchical internet of things service offloading management architecture includes, from top to bottom, a cloud service master control layer, a regional SDN control layer, an edge node layer, a data layer, and a device layer.

The function and construction of each layer is described in detail below. (1) The cloud service master control layer mainly comprises two aspects, one is to provide services for various tasks in the environment of the Internet of things, and the other is to perform unified management on SDN controllers in various areas. In order to complete the two functions, the system mainly comprises a web server, an application server, a database server and an SDN main controller. The SDN main controller completes the unified management function of each SDN area control. (2) And the SDN control layer of each area mainly completes resource scheduling and service unloading of computing resources and task requirements in each area. Each zone includes an SDN control that obtains computing resource conditions from edge nodes and computing resource requirements from task requesters. (3) And the edge node layer mainly completes the calculation and storage functions. The computing nodes complete computing functions and are usually located near the task demand nodes; the storage node completes the storage function of accessing data for a high frequency. By reasonably arranging the computing nodes and the storage nodes, the computing and storage access time delay of tasks can be effectively reduced, and the requirement of network bandwidth is remarkably reduced. (4) And the data layer mainly completes the functions of acquiring the task request data of the sensor node of the Internet of things and transmitting the task request data to the edge node. The data layer mainly comprises a domain controller, a wireless access point and a high-reliability industrial switch. The domain controller mainly completes the functions of acquiring the requirements and requesting the resources. (5) The equipment layer mainly refers to the sensor network environment of each application scene, and mainly comprises sensor networks in intelligent environments such as factories, markets, homes and the like.

In the aspect of network environment, the network environment of the Internet of things is divided into N unit areas, and { Cell is used₁,Cell₂,…Cell_NRepresents it. The cloud computing service is represented by R. Cell in each Cell area_iIncluding m sensor terminals (using)

Representation), n fog computing nodes (usage)

Representation).

In terms of computing tasks, K is used to represent a service request, each task request comprising p subtasks K_pUsing K_p＝(Q_p,S_p,T_m) And (4) showing. Wherein Q is_pRepresenting the computational resources required by the task, S_pIndicating the transmission resources, T, required by the task_mIndicating the task's constraints on latency. The task delay consists of two parts, namely calculation delay and transmission delay. For the four service offload modes, only the local service offload mode does not need transmission delay, and the other three service offload modes all need transmission delay to be considered.

Transmission delay usage formula

Performing a calculation wherein TR_D,XRepresenting transmission delay data, X representing any one of cloud computing service, sensor terminal or edge computing node, TD_D,XIndicating the distance between the node of the task and the node receiving the task, GS indicating the Gaussian white noise power, TH_D,XTo representThe fading factor of the network channel, B represents the transmission bandwidth of the network, delta is the loss factor representing the network line,

indicating the power of the node accepting the task,

In the step of constructing a DDPG-based service unloading decision model in a fog computing environment, an SDN controller needs to unload services, firstly, a task demand condition and a fog computing resource condition need to be acquired, and then, an intelligent optimization algorithm is adopted to unload the services. In order to effectively reduce the time delay of service unloading and take the dynamic property of network topology as a variable for research, the invention divides the service unloading into four modes of idle equipment unloading, edge node unloading, cloud unloading and local unloading.

Representation of four service offloads: for the computation subtask K_pThe four service offloads are respectively expressed as: (1) idle device offload mode usage

And (4) showing. (2) Local offload mode usage

Represents; (3) fog computing node offload mode usage

And (4) showing. (4) Cloud offload mode usage

And (4) showing.

Task latency for four service offloads: for the computation subtask K_pThe task delays for the four service offloading are described below. Task delay usage formula for idle device offload mode

It is shown that, among others,

representing the execution time of the task on the idle device,

indicating the computing power of the idle device. Local offload mode usage formula

It is shown that, among others,

representing the execution time of the task on the local device,

representing the computing power of the local device. Fog calculation node unloading mode use formula

Is shown in which

Representing the execution time of the task on the edge device,

representing the computing power of the edge device. Cloud offload mode usage formula

It is shown that, among others,

representing the execution time of a task on a cloud device, f_RRepresenting the computing power of the cloud device. In summary, task K_pThe total calculated delay can be formulated

And (6) performing calculation.

It should be noted that, in the following description,

and

is 0 or 1, when the value is 0, it indicates that the unloading mode corresponding to the coefficient is not adopted, and when the value is 1, it indicates that the unloading mode corresponding to the coefficient is adopted.

In the step of solving the service unloading algorithm scheme for minimizing the service unloading delay, the solution is carried out based on a depth certainty strategy gradient algorithm in the deep reinforcement learning, so that each intelligent sensor of the Internet of things can obtain the optimized service unloading scheme. Considering that the local offload mode can perform service offload through internal calculation of the sensor, and the embodiment of the present invention mainly solves delay-sensitive service offload, therefore, the embodiment of the present invention mainly studies two offload modes, namely idle device offload and edge node offload.

1) Service unloading decision model based on DDPG

When the DDPG algorithm is used for solving the optimized service unloading strategy, an Actor-criticc framework is adopted. The Actor refers to a policy network and comprises an Actor _ M network and an Actor _ T network, wherein the main function of the Actor _ T network is to generate a training data set, and the main function of the Actor _ M network is to train parameters of an optimization network. Critic refers to an evaluation network, and comprises a Critic _ M network and a Critic _ T network, wherein the Critic _ T network mainly functions to generate a training data set, and the Critic _ M network mainly functions to train optimization network parameters. .

Deep reinforcement learning refers to learning an action strategy by using a deep neural network on the basis of traditional reinforcement learning. In general, the reinforcement learning model includes four main elements, namely environment, state s, strategy pi and action a. And (3) optimizing the four main elements by using the Agent of the Agent in the reinforcement learning model to obtain the reward r as an evaluation standard. For convenience of description, at time t, s is used_tIndicating the state at time t, using s_t+1Indicating the state at time t + 1. Wherein s is_t+1Is that Agent selects action a according to policy π_tThe next state entered after execution in the environment. To evaluate the effectiveness of a policy, the return of an Agent using the policy's actions when executed in the environment is defined as the reward r_t. The following is a detailed description of each of the key elements.

2) Deep reinforcement learning factor modeling

In the aspect of state space, the states of the fog computing nodes and the idle equipment terminals around the sensor equipment form the state space. For fog computing nodes, use

Representing the remaining resource sets of each edge node within the region, wherein,

for idle device terminals, use

Indicating the set of remaining resources for each free device within the area, wherein,

based on this, the formula is used

A state space is represented in which, among other things,

the total time delay of the service is represented by the following calculation method

Defining service offload decisions as a formula in terms of action space

Wherein,

in terms of reward functions, the reward functions are defined as formulas

Wherein,

presentation service K_pAll the time required for local execution,

indicating that the service task is in state s_tTaking action of_tThe latter execution duration.

In terms of value functions, formulas are used

Calculated value V_π(s) a function representing the expectation of the reward r of the strategy pi at the initial state s. Wherein,

representing the desired operation.

Based on this, a formula can be used

Calculating a motion value function Q_π(s, a), gamma represents the attenuation factor, the value range is 0 to 1, and s' represents the next state after s takes a. To simplify the computation of the action value function, known from Bellman Equation, according to the public

The formula Q can be obtained^μ(s_t,a_t)＝E[r(s_t,a_t)+γQ^μ(s_t+1,μ(s_t+1))]. Where μ denotes a policy generated by the Actor network.

3) Depth-deterministic policy gradients

The strategy gradient is an optimized search algorithm based on gradient, and in the invention, gradient update parameters of a Critic _ M network and an Actor _ M network are as follows

And

the gradient update parameters of the Critic _ T network and the Actor _ T network are

And

the calculation process of the Loss function is defined as a formula

Wherein,

using the formula

And (4) calculating.

Based on this, the gradient is calculated as formula

Wherein the function J uses the formula

Calculations to measure the performance of the strategy mu. Rho^βRepresenting the distribution function of the state s. Therefore, the formula

Is defined as the value of p in terms of s in the state space^βDistributed computation Q^μ(s, μ (s)) as expected.

When the Actor network interacts with the environment, in order to prevent the problem that the neural network cannot be converged due to the correlation of the conversion data sequence in time, the embodiment of the invention stores the conversion data sequence in the experience pool and then performs random sampling based on the mini-batch, thereby realizing the time irrelevance of the data.

4) Service unloading algorithm based on DDPG

A Service Offloading algorithm (Service Offloading algorithm) based on DDPG proposed by the embodiment of the present invention is shown in table 1. The method comprises two processes of model initialization and optimal service unloading strategy solving for each area of the network environment. The model initialization steps include steps 1 to 7 in table 1, and mainly initialize four network models and key parameters in the Actor-critical architecture. The step of solving the optimized service unloading strategy in each area of the network environment comprises the steps from 8 th to 25 th in the table 1, and the optimal service unloading strategy is mainly solved within the iteration number.

TABLE 1

In order to verify the performance of the algorithm, a network environment of fog computing is developed by adopting Python language. Based on the zone-dividing idea of the present invention, the network environment is divided into 25 cells. The minimum distance between the individual units is set to 10 meters. In each cell, 2 idle devices and 2 fog computing nodes are set. Where the idle device calculation frequency is 200 and 150. The calculation frequency of the fog calculation node is 650 and 600. The transmission power of the devices is subject to a uniform distribution between (5, 38). In terms of services that need to be offloaded, 100 services that need to be offloaded are randomly generated, with an even distribution of computing resource services (100, 800) required for each service.

In the aspect of parameter setting, the minimum batch in the experience pool is set to be 500, the Actor _ M network learning rate is set to be 0.0001, the criticic _ M network learning rate is set to be 0.001, the gaussian noise power during transmission delay calculation is set to be-114 dbm, the capacity of the memory pool is set to be 50000, the maximum iteration number is set to be 5000, and the exploration number is set to be 10. Considering that the existing Distributed Cooperative Task Offloading (DCTO) based on a plurality of edge nodes better realizes the calculation service Offloading of the intelligent terminal, the sooddppg algorithm of the present invention is compared with the existing DCTO algorithm in the experimental part.

The two algorithms are compared in terms of average data transmission rate, service offloading success rate, and task execution time.

In terms of average data transmission rate, a comparison is made in two dimensions from the maximum distance between adjacent units, the number of devices connected per compute node.

The effect of the maximum distance between adjacent cells on the average data transmission rate is shown in fig. 4. Wherein the X-axis indicates that the maximum distance between adjacent cells varies between 100m and 1000 m. The Y-axis represents the average data transmission rate (bps). It can be seen from the figure that as the maximum distance between adjacent cells increases, the average data transmission rate under both algorithms increases. This is because the network size increases, mutual interference in the network environment decreases, and the data transmission rate increases. Under various circumstances, the average data transfer rate of the inventive algorithm SOoDDPG is greater than the average data transfer rate of the algorithm DCTO. The algorithm can realize the service unloading in a relatively optimized mode, reduces the interference among network equipment and further improves the data transmission rate.

The effect of the number of devices connected per compute node on the average data transfer rate is shown in fig. 5. Where the X-axis represents the number of devices connected per compute node varying between 1 and 8. The Y-axis represents the average data transmission rate (bps). As can be seen from the figure, as the number of devices connected to each node increases, the average data transmission rate under both algorithms decreases rapidly, which means that the number of devices increases, which easily causes the load of the computing node to increase, thereby decreasing the average data transmission rate. However, in the context of different device numbers, the algorithm SOoDDPG of the present invention achieves better results than the algorithm DCTO.

The embodiment of the invention provides a service unloading method, a device and a system, wherein the method is used for solving based on a depth certainty strategy gradient algorithm in deep reinforcement learning, so that each intelligent sensor of the Internet of things can obtain an optimized service unloading scheme, the method has good application effect and performance, and the problem of large service unloading delay in a large-scale network is solved well.

The embodiment of the invention also provides a service unloading device, which is described in the following embodiment. Because the principle of the device for solving the problems is similar to the service unloading method, the implementation of the device can refer to the implementation of the service unloading method, and repeated details are not repeated. Referring to fig. 6, a block diagram of a service offloading device is shown, which includes:

an obtaining module 61, configured to obtain service data to be offloaded, resource data, an initialized policy network, and an initialized evaluation network; the time delay module 62 is configured to calculate a time delay parameter of the service data to be unloaded according to the resource data; a first generating module 63, configured to generate a first service offloading result according to the service data to be offloaded and the initialized policy network; a second generating module 64, configured to generate a first evaluation result of the first service offloading result according to the delay parameter and the initialized evaluation network; an updating module 65, configured to update a first gradient parameter of the policy network and a second gradient parameter of the evaluation network by using a depth deterministic policy gradient algorithm according to the first service offload result and the first evaluation result; and the result module 66 is configured to generate a target service offloading result of the service data to be offloaded according to the updated policy network and the updated evaluation network.

In one embodiment, referring to another structural block diagram of the service offloading device shown in fig. 7, the device further includes an initialization module 67 configured to: randomly generating an initial first gradient parameter and an initial second gradient parameter; initializing the strategy network by utilizing the initial first gradient parameter to obtain an initialized strategy network; initializing the evaluation network by using the initial second gradient parameter to obtain the initialized evaluation network

In one embodiment, the resource data includes computational resource data and transmission resource data; a delay module, specifically configured to: acquiring transmission delay data; calculating the execution duration of the service data to be unloaded according to the transmission delay data, the calculation resource data and the transmission resource data; and determining a time delay parameter of the service data to be unloaded according to the execution time length.

In an embodiment, before the obtaining the transmission delay data, the delay module is further configured to generate the transmission delay data according to the following formula:

wherein, TR_D,XRepresenting transmission delay data, X representing cloud computing service, sensor terminal or edge computing node, TD_D,XIndicating the distance between the node of the task and the node receiving the task, GS indicating the Gaussian white noise power, TH_D,XRepresenting the fading factor of the network channel, B representing the transmission bandwidth of the network, delta is the loss factor representing the network line,

indicating the power of the node accepting the task,

In an embodiment, the delay module is further configured to calculate an execution duration of the service data to be offloaded according to the transmission delay data, the calculation resource data, and the transmission resource data according to the following formula:

wherein,

indicating the duration of the execution of the task on the idle device,

indicating the duration of the execution of the task on the local device,

which represents the computing power of the local device,

indicating the duration of execution of the task on the edge device,

a set of fog computing node resources is represented,

In an embodiment, the delay module is further configured to determine a delay parameter of the service data to be offloaded according to the execution duration by using the following formula:

wherein,

a latency parameter representing the service data to be offloaded,

a parameter representing a local offload mode is indicated,

indicating a fog computing node offload mode parameter,

a cloud offload mode parameter is represented that,

a parameter indicative of an idle device offload mode,

indicating the duration of the execution of the task on the local device,

indicating the duration of execution of the task on the edge device,

represents the execution time of the task on the cloud device,

In an embodiment, the first generating module is specifically configured to: according to the service to be offloaded, using the following formulaThe data and the initialized policy network generate a first service offload result:

wherein,

a set of actions is represented that is,

representing actions performed at the edge compute node,

indicating the action to be performed at the idle device,

indicating a fog computing node offload mode parameter,

In an embodiment, the second generating module is specifically configured to: generating a first evaluation result of the first service offloading result according to the time delay parameter and the initialized evaluation network by using the following formula:

wherein S is_tThe state space is represented by a representation of,

indicating that the edge compute node is used for offloading,

indicating that the use of the idle device is off-load,

which represents the total delay of the service,

a latency parameter representing the service data to be offloaded,

In one embodiment, the update module is specifically configured to: according to the first service unloading result and the first evaluation result, updating a first gradient parameter of the policy network and a second gradient parameter of the evaluation network by using a deep deterministic policy gradient algorithm, wherein the method comprises the following steps: determining an action value function according to the first service unloading result and the first evaluation result; and updating a first gradient parameter of the policy network and a second gradient parameter of the evaluation network by using a depth deterministic policy gradient algorithm according to the action value function.

In one embodiment, the update module is specifically configured to: determining an action value function according to the first service uninstalling result and the first evaluation result according to the following formula:

Q^μ(s_t,a_t)＝E[r(s_t,a_t)+γQ^μ(s_t+1,μ(s_t+1))]

In one embodiment, the update module is specifically configured to: updating a first gradient parameter of the policy network and a second gradient parameter of the evaluation network by using a depth deterministic policy gradient algorithm according to the action value function according to the following formula:

wherein m represents the serial number of the sensor node,

is the first gradient parameter that is to be measured,

is the second gradient parameter that is to be measured,

is the updated first gradient parameter and,

is the updated second gradient parameter,

the status is represented by a number of time slots,

represents an action, Q_iTo representMotion value function Q 'obtained according to Bellman equation'_iRepresents the updated action value function obtained according to the bellman equation,

representing the updated state, gamma represents the decay factor, and the function J is used to measure the behavior of the strategy mu, ρ^βRepresenting the distribution function of the state s.

The embodiment of the invention also provides a service unloading system, which comprises a cloud service main control module, a service unloading module and a service unloading module, wherein the cloud service main control module comprises an SDN main controller, a data server, a Web server and an application server; the regional SDN control module comprises a plurality of SDN controllers; the node module comprises a plurality of computing nodes and cache nodes; the data module comprises a switch, a wireless access unit and a domain controller; the equipment module includes a plurality of sensors.

In the embodiment of the present invention, referring to the schematic diagram of the service offloading system architecture shown in fig. 2, a cloud service master control module may be set as a cloud service master control layer, and the main functions include two aspects, one is to provide services for various tasks in the environment of the internet of things, and the other is to perform unified management on SDN controllers in various areas. In order to complete the two functions, the system mainly comprises a Web server, an application server, a database server and an SDN main controller. The SDN main controller completes the unified management function of controlling each SDN area.

The regional SDN control module can be set as a regional SDN control layer and is mainly used for completing resource scheduling and service unloading of computing resources and task requirements in each region. Each zone includes an SDN control that obtains computing resource conditions from edge nodes and computing resource requirements from task requesters.

The node modules may be configured to perform mainly computing and storage functions for the edge node layer. The computing nodes complete computing functions and are usually located near the task demand nodes; the storage node completes the storage function of accessing data for a high frequency. By reasonably arranging the computing nodes and the storage nodes, the computing and storage access time delay of tasks can be effectively reduced, and the requirement of network bandwidth is remarkably reduced.

The data module can be set as a data layer and mainly achieves the functions of acquiring task request data of the sensor node of the Internet of things and transmitting the task request data to the edge node. The data layer mainly comprises a domain controller, a wireless access point and a high-reliability industrial switch. The domain controller mainly completes the functions of acquiring the requirements and requesting the resources.

The device module can be set as a device layer, mainly refers to the sensor network environment of each application scene, and mainly comprises sensor networks in intelligent environments such as factories, markets, homes and the like.

An embodiment of the present invention further provides a computer device, referring to the schematic block diagram of the structure of the computer device shown in fig. 8, where the computer device includes a memory 81, a processor 82, and a computer program stored in the memory and executable on the processor, and the processor implements any of the steps of the service uninstalling method when executing the computer program.

It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the computer device described above may refer to the corresponding process in the foregoing method embodiment, and is not described herein again.

An embodiment of the present invention further provides a computer-readable storage medium, where a computer program for executing any one of the above-mentioned service offloading methods is stored in the computer-readable storage medium.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Finally, it should be noted that: the above-mentioned embodiments are only specific embodiments of the present invention, which are used for illustrating the technical solutions of the present invention and not for limiting the same, and the protection scope of the present invention is not limited thereto, although the present invention is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive the technical solutions described in the foregoing embodiments or equivalent substitutes for some technical features within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present invention, and they should be construed as being included therein. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A method of service offloading, comprising:

acquiring service data to be unloaded, resource data, an initialized policy network and an initialized evaluation network;

calculating a time delay parameter of the service data to be unloaded according to the resource data;

generating a first service unloading result according to the service data to be unloaded and the initialized policy network;

generating a first evaluation result of the first service unloading result according to the time delay parameter and the initialized evaluation network;

updating a first gradient parameter of the policy network and a second gradient parameter of the evaluation network by using a deep deterministic policy gradient algorithm according to the first service offload result and the first evaluation result;

and generating a target service unloading result of the service data to be unloaded according to the updated strategy network and the updated evaluation network.

2. The method of claim 1, wherein obtaining the initialized policy network and the initialized evaluation network is preceded by:

randomly generating an initial first gradient parameter and an initial second gradient parameter;

initializing a policy network by using the initial first gradient parameter to obtain an initialized policy network;

and initializing the evaluation network by using the initial second gradient parameter to obtain an initialized evaluation network.

3. The method of claim 1, wherein the resource data comprises computational resource data and transmission resource data;

calculating a time delay parameter of the service data to be unloaded according to the resource data, wherein the time delay parameter comprises the following steps:

acquiring transmission delay data;

calculating the execution duration of the service data to be unloaded according to the transmission delay data, the calculation resource data and the transmission resource data;

and determining a time delay parameter of the service data to be unloaded according to the execution time length.

4. The method of claim 3, wherein prior to obtaining the propagation delay data, further comprising generating the propagation delay data according to the following equation:

indicating the power of the node accepting the task,

5. The method of claim 4, comprising calculating an execution duration of service data to be offloaded from the transmission delay data, the computational resource data, and the transmission resource data according to the following formula:

wherein,

indicating the duration of the execution of the task on the idle device,

indicating the duration of the execution of the task on the local device,

which represents the computing power of the local device,

indicating the duration of execution of the task on the edge device,

a set of fog computing node resources is represented,

6. The method of claim 5, comprising determining a delay parameter for the service data to be offloaded according to the execution duration using the following formula:

wherein,

a latency parameter representing the service data to be offloaded,

a parameter representing a local offload mode is indicated,

indicating a fog computing node offload mode parameter,

a cloud offload mode parameter is represented that,

a parameter indicative of an idle device offload mode,

indicating the duration of the execution of the task on the local device,

indicating the duration of execution of the task on the edge device,

represents the execution time of the task on the cloud device,

7. The method of claim 6, comprising: generating a first service unloading result according to the service data to be unloaded and the initialized policy network by using the following formula:

wherein,

a set of actions is represented that is,

representing actions performed at the edge compute node,

indicating the action to be performed at the idle device,

indicating a fog computing node offload mode parameter,

8. The method of claim 7, comprising: generating a first evaluation result of the first service offloading result according to the delay parameter and the initialized evaluation network by using the following formula:

wherein S is_tThe state space is represented by a representation of,

indicating that the edge compute node is used for offloading,

indicating that the use of the idle device is off-load,

which represents the total delay of the service,

a latency parameter representing the service data to be offloaded,

9. The method of claim 1, wherein updating a first gradient parameter of the policy network and a second gradient parameter of the evaluation network using a deep deterministic policy gradient algorithm based on the first service offload result and the first evaluation result comprises:

determining an action value function according to the first service unloading result and the first evaluation result;

updating a first gradient parameter of the policy network and a second gradient parameter of the evaluation network using a depth-deterministic policy gradient algorithm according to the action value function.

10. The method of claim 9, comprising determining an action value function based on the first service offload result and the first evaluation result according to the following formula:

11. The method of claim 9, wherein the first gradient parameter of the policy network and the second gradient parameter of the evaluation network are updated using a depth-deterministic policy gradient algorithm according to the action value function according to the following formula:

wherein m represents the serial number of the sensor node,

is the first gradient parameter that is to be measured,

is the second gradient parameter that is to be measured,

is the updated first gradient parameter and,

is the updated second gradient parameter,

the status is represented by a number of time slots,

12. A service offloading device, comprising:

the acquisition module is used for acquiring service data to be unloaded, resource data, an initialized policy network and an initialized evaluation network;

the time delay module is used for calculating a time delay parameter of the service data to be unloaded according to the resource data;

a first generating module, configured to generate a first service offload result according to the to-be-offloaded service data and the initialized policy network;

a second generation module, configured to generate a first evaluation result of the first service offloading result according to the delay parameter and the initialized evaluation network;

an updating module, configured to update a first gradient parameter of the policy network and a second gradient parameter of the evaluation network by using a depth-deterministic policy gradient algorithm according to the first service offload result and the first evaluation result;

and the result module is used for generating a target service unloading result of the service data to be unloaded according to the updated strategy network and the updated evaluation network.

13. The apparatus of claim 12, further comprising an initialization module to:

14. The apparatus of claim 12, wherein the resource data comprises computational resource data and transmission resource data; the delay module is specifically configured to:

acquiring transmission delay data;

15. The apparatus of claim 12, wherein the update module is specifically configured to:

16. A service offloading system comprising a cloud service master module, a regional SDN control module, an edge node module, a data module, a device module, and the service offloading device of any of claims 12-15;

the service unloading device is in communication connection with the cloud service main control module, the regional SDN control module, the node module, the data module and the equipment module respectively.

17. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the service offloading method of any of claims 1 to 11 when executing the computer program.

18. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program for executing the service offloading method of any of claims 1 through 11.