CN117891532B - Terminal energy efficiency optimization offloading method based on attention multi-index ranking - Google Patents

Terminal energy efficiency optimization offloading method based on attention multi-index ranking

Info

Publication number: CN117891532B
Application number: CN202410298713.4A
Authority: CN (China)
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other versions: CN117891532A (Chinese)
Inventors: 毕远国, 陈威, 郑彤, 樊彦伯, 张天旭, 张星
Original assignee: 东北大学 (Northeastern University)
Prior art keywords: task, edge server, application, terminal device
Events: application filed by 东北大学; priority to CN202410298713.4A; publication of CN117891532A; application granted; publication of CN117891532B
Abstract

The invention belongs to the technical field of mobile edge computing networks and discloses a terminal energy efficiency optimization offloading method based on attention multi-index ranking. The method aims to optimize task offloading efficiency in a mobile edge computing network, with particular attention to latency and energy consumption. It considers several key factors of each task, such as the exit task frequency, computational workload, time urgency, and per-bit application completion rate, to determine the priority order of tasks. The method adopts a reinforcement learning algorithm and a graph convolutional network to effectively extract the graph-structural features among dependent tasks, introduces an attention mechanism to concentrate weight training on important features, and uses an entropy reward to improve the convergence speed and stability of training. The method provides a brand-new optimized offloading strategy for the field of mobile edge computing and helps meet the 5G era's requirements for high-performance mobile services.

Description

Terminal energy efficiency optimization offloading method based on attention multi-index ranking
Technical Field
The invention relates to the technical field of mobile edge computing networks, and in particular to a terminal energy efficiency optimization offloading method based on attention multi-index ranking.
Background
With the rapid development of the mobile internet, intelligent devices have become part of people's daily lives, and applications have become more and more complex, including mobile payment, smart healthcare, mobile games, and virtual reality (VR). These demanding applications challenge the resource capacity of intelligent devices. Since Google proposed the cloud computing concept in 2008, cloud computing has gradually been introduced into the mobile environment, breaking through the resource limitations of intelligent devices and providing diversified and efficient applications and services. Cloud computing is not only cost-effective, but also simplifies information technology management and can respond to user demands more quickly. However, with the rise of the internet of things, higher requirements are placed on transmission bandwidth, delay, energy consumption, application performance, and reliability. Traditional cloud computing suffers from limited bandwidth, high delay, and notable energy consumption, and it is difficult for it to meet users' demands for high performance. Edge computing, a brand-new computing paradigm, is therefore considered a key technology and architectural concept for the transition to 5G. Edge computing is a technology that supports computing at the edge of the network; it moves services and functions originally located in the cloud to the vicinity of users, enables the tight integration of cloud computing platforms and networks, and provides powerful computing, storage, networking, and communication capabilities. Compared with cloud computing, edge computing focuses more on handling computing needs at the transaction level, enabling users to achieve better quality of experience (Quality of Experience, QoE) and quality of service (Quality of Service, QoS).
With the rise of 5G communications, the demand for high-quality wireless services has shown an exponential growth trend. The 5G era not only expands the applications of mobile phones and tablet computers, but also introduces new business scenarios such as autonomous driving, VR, and AR, and reaches further into everyday domains such as the smart grid, smart agriculture, smart cities, and environmental monitoring. These new service scenarios place higher demands on key 5G technical indicators such as latency, energy efficiency, and reliability. To address the challenges in mobile communications, the mobile edge computing (Mobile Edge Computing, MEC) concept has emerged. MEC brings computing and storage resources to the edge of the mobile network, enabling high-performance applications to run for user equipment while meeting stringent performance requirements. The MEC server provides a large amount of computing resources and interacts with user devices to enhance the user experience. However, offloading tasks to the MEC server may increase communication latency, resulting in reduced performance. Therefore, an end-edge-cloud collaborative offloading policy that comprehensively considers latency and energy consumption needs to be formulated to meet the performance requirements of computation-intensive applications in the 5G era.
In the field of mobile edge computing, there are a number of key problems and challenges. First, task priority computation is a core challenge, because mobile applications are typically composed of interdependent tasks with complex dependencies between them. Existing formula-based calculation methods struggle to fully account for the complete dependency relationships among tasks, so some tasks are given lower priority, their access to computing resources is limited, and higher delay ultimately results. Second, task state information processing concerns effectively capturing the dependencies between tasks; current methods may not fully achieve this, which reduces the offloading decision efficiency for subsequent tasks. In addition, offloading methods based on heuristic search require long system execution times, adapt poorly to dynamically changing scenarios, and impair the efficiency of offloading decisions. Finally, existing deep reinforcement learning-based methods have two main problems: they neglect the differences among dependency task features, leading to long training convergence times and unstable performance across environmental states and training phases, and they insufficiently consider the training optimization of network parameters in the agent, thereby reducing overall system performance. Solving these problems requires extensive research and innovation to improve the performance and efficiency of mobile edge computing systems and meet ever-growing mobile application demands.
Disclosure of Invention
In the field of mobile edge computing dependency task offloading, the main challenges include efficiently handling task dependencies, improving offloading decision efficiency, shortening system execution time, and improving deep reinforcement learning methods to accommodate different operating environments and optimize the training process. To cope with these challenges, the invention provides a terminal energy efficiency optimization offloading method based on attention multi-index ranking, which uses an attention-based multi-index ranking codec to perform dependency task offloading and to optimize energy consumption and time delay in mobile edge computing, so as to improve the judgment of task priorities and the accuracy of offloading decisions.
The technical scheme of the invention is as follows: a terminal energy efficiency optimization offloading method based on attention multi-index ranking, comprising the following steps:
Step 1, analyzing interaction among terminal equipment, an edge server and a cloud server from the multi-layer architecture of a cloud layer, an edge layer and a user layer to establish a system model; the system model comprises a network model, an application program model, a communication model and an energy consumption model;
Considering the dependency task offloading problem in a multi-coverage edge network scenario based on the system model, the goal is to minimize the average application completion time and the energy consumption. The degree of reduction of delay and energy consumption under the offloading decision, relative to the delay t_m and energy consumption e_m of executing all tasks completely locally, is regarded as the offloading utility; meanwhile, different delay weights and energy consumption weights are set for the terminal devices according to their diversified delay and energy consumption requirements, the average offloading utility of all terminal devices is defined accordingly, the tasks in all applications are standardized, and a unique virtual start node task and virtual end node task are set.

The optimization objective of the dependency task offloading problem is to maximize the average offloading utility of all terminal devices, defined as follows:

OU_n = (1 - ε) × t_m + ε × e_m, subject to constraints C1-C5, with C3: t_m ≤ ETC_m

where OU_n represents the weighted sum of delay and energy consumption for each task processed, M represents the set of all tasks to be processed, y_ij indicates whether a task offload is associated with edge server j, n_i denotes a terminal device, v_{n,i} denotes a subtask of an application, s_j denotes an edge server, ETC_m denotes the latest deadline of the application, 𝒩 represents the set of terminal devices, G_n represents an application, 𝒮 represents the set of all servers, and ε is a balance factor trading off the minimization of average application completion time against energy consumption; t_max is the maximum completion time for the application to execute locally; e_max is the maximum energy consumption for the application to execute locally; 𝒜 is the offloading decision set; α_{n,i} represents the offloading action of a task, and I represents the total number of tasks. Constraint C1 indicates that tasks can only be offloaded to a server associated with the terminal device to which they belong; constraint C2 indicates that a task can only be executed indivisibly at one computing node; constraint C3 represents the application end-time constraint; constraint C4 represents the binary and non-negativity constraints on the offloading decision variables; constraint C5 indicates that the unique virtual start node task and virtual end node task among the application tasks must be executed locally at the terminal device.
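For illustration, the following minimal Python sketch computes a per-device offloading utility in the spirit of this formulation. The normalized-reduction form used here, and all function names and numeric values, are assumptions for demonstration; the claim itself states the weighted form OU_n = (1 - ε) × t_m + ε × e_m directly.

```python
def offload_utility(t_m: float, e_m: float,
                    t_max: float, e_max: float,
                    epsilon: float = 0.5) -> float:
    """Weighted reduction of delay and energy under an offloading decision,
    relative to executing every task locally (illustrative reading).

    t_m, e_m     : delay and energy under the evaluated offloading decision
    t_max, e_max : delay and energy when the application runs fully locally
    epsilon      : balance factor between completion time and energy
    """
    delay_gain = (t_max - t_m) / t_max      # fraction of delay saved
    energy_gain = (e_max - e_m) / e_max     # fraction of energy saved
    return (1 - epsilon) * delay_gain + epsilon * energy_gain

# Average offloading utility over all terminal devices (the quantity maximized).
decisions = [(3.2, 1.1, 5.0, 2.0), (4.0, 1.5, 4.5, 1.8)]  # (t_m, e_m, t_max, e_max)
avg_utility = sum(offload_utility(*d) for d in decisions) / len(decisions)
print(f"average offloading utility: {avg_utility:.3f}")
```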
Step 2, problem analysis and transformation; aiming at the problem of unloading the dependency task, splitting the dependency task into two key steps for solving; firstly, establishing a dependency task mixed priority index based on a system model, and solving the problem of sequencing inside the dependency task; constructing an agent decision model based on attention, performing improved PPO training on the agent decision model, and solving the problem of task unloading decision after sequencing; and continuously optimizing unloading decisions by training the constructed models repeatedly and interactively with the environment.
The network model comprises three key layers: a cloud layer, an edge layer, and a user layer.

The cloud layer is the first key layer and comprises the cloud center data center; the cloud center data center stores all services and has the computing capability to support the terminal devices; the cloud center data center is used for training and deploying each model.

The edge layer is the second key layer and comprises a plurality of base stations and edge servers; each edge server is configured with computing resources and storage resources; the terminal devices offload application tasks to the edge servers.

The user layer is the third key layer and comprises the terminal devices; when a terminal device cannot meet the computation and delay requirements of an application, it offloads application tasks to an edge server.

The terminal devices communicate with the edge servers through wireless connections, the edge servers are connected to each other by wire to transmit data, and the edge servers are connected to the cloud center data center through high-speed optical fiber links; the terminal devices are not directly connected to the cloud center data center, and communication and task offloading are carried out only through the edge servers.
The application is treated as a set of dependent tasks; in each time slot, the terminal device generates dependency tasks, and a corresponding offloading decision is made according to the information of the dependency tasks, the device resources, the battery level, and the state of the edge servers.
There are N terminal devices and M edge servers in the network model. The terminal device set is expressed as 𝒩 = {n_1, n_2, ..., n_N}; the location of each terminal device is randomly assigned at the beginning of each time slot and moves according to a Gauss-Markov random model. Within the same time slot, the position of a terminal device does not change; the coordinate position of terminal device n_i is fixed within a slot and updated only at the beginning of each slot. The CPU frequency f_i indicates the local computing capability of terminal device n_i, and p_i indicates the transmission power of terminal device n_i. The edge server set is expressed as 𝒮 = {s_1, s_2, ..., s_M}, where s_j represents the j-th edge server, whose coordinates are fixed. When a group of terminal devices is connected to the same edge server, the edge server hosts their offloaded tasks and has a wired transmission function. Each terminal device has an application and decides on which edge server to execute each task; a terminal device can communicate directly with only one edge server during task execution. Offloading decisions are made taking into account the state changes of the network model at different times.
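For illustration, a minimal sketch of the Gauss-Markov mobility update mentioned above; all parameter names and values here are assumptions for demonstration, not taken from the patent:

```python
import math
import random

def gauss_markov_step(speed, direction, mean_speed, mean_dir, alpha=0.75):
    """One Gauss-Markov update of a device's speed and heading.

    alpha in [0, 1] tunes memory: 1 = constant motion, 0 = pure randomness.
    """
    sigma = math.sqrt(1 - alpha ** 2)
    speed = alpha * speed + (1 - alpha) * mean_speed + sigma * random.gauss(0, 1)
    direction = alpha * direction + (1 - alpha) * mean_dir + sigma * random.gauss(0, 0.5)
    return max(speed, 0.0), direction

# Update a device position once per time slot (positions are fixed within a slot).
x, y, v, theta = 0.0, 0.0, 1.0, 0.0
for slot in range(5):
    v, theta = gauss_markov_step(v, theta, mean_speed=1.0, mean_dir=0.0)
    x += v * math.cos(theta)
    y += v * math.sin(theta)
    print(f"slot {slot}: position=({x:.2f}, {y:.2f})")
```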
The application program model is specifically as follows:

Consider terminal devices at random positions within a certain area; each base station has a certain coverage area, and when a terminal device is within the coverage area, it can access the corresponding edge server through the wireless network. Each terminal device has an application that must be executed within the application's end-time limit. Each terminal device application is structured, and the subtasks in each application can execute on any edge server or cloud server. Each application is modeled as a directed acyclic graph G_n = (𝒱_n, ε_n), where 𝒱_n is the set of task nodes and ε_n is the set of directed edges between task nodes. The set of task nodes of each application is represented as 𝒱_n = {v_{n,1}, ..., v_{n,I}}, where I represents the number of tasks in application G_n; d_{n,i} is the input data size of task v_{n,i}, and c_{n,i} is the number of CPU cycles required to complete task v_{n,i}. Let pred(v_{n,i}) denote the set of precursor tasks of task v_{n,i}, and let d_{p,i} represent the intermediate data size between task v_{n,i} and a precursor subtask v_{n,p}. Each task is offloaded to a communicable edge server for execution. In the directed acyclic graph, a directed edge (v_{n,p}, v_{n,i}) ∈ ε_n indicates that task v_{n,i} can only be executed after task v_{n,p} is completed, i.e., the dependency requires task v_{n,p} to finish before task v_{n,i} begins.
The offloading decision determines whether each dependency task generated by the terminal device is processed by the terminal device itself or by an edge server within communication range. A task offloading variable α_{n,i} is adopted to represent the offloading decision of task v_{n,i}: α_{n,i} = 0 represents executing task v_{n,i} locally; otherwise α_{n,i} indicates the edge server on which task v_{n,i} executes, for each user n. Thus, the offloading decision set is defined as 𝒜 = {α_{n,1}, α_{n,2}, ..., α_{n,I}}.
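A minimal sketch of how such a dependency-task application and its offloading decision set could be represented; the names and the example graph are illustrative assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    tid: int
    input_bits: float        # d_{n,i}: input data size
    cpu_cycles: float        # c_{n,i}: CPU cycles needed to finish the task
    preds: list = field(default_factory=list)   # precursor task ids

# A toy application DAG with a virtual start (0) and virtual end (4) node;
# both virtual nodes must run locally (constraint C5).
app = {
    0: Task(0, 0.0, 0.0),
    1: Task(1, 2e6, 5e8, preds=[0]),
    2: Task(2, 1e6, 3e8, preds=[0]),
    3: Task(3, 4e6, 8e8, preds=[1, 2]),
    4: Task(4, 0.0, 0.0, preds=[3]),
}

# Offloading decision set A: 0 = local execution, j > 0 = edge server s_j.
decision = {0: 0, 1: 2, 2: 1, 3: 2, 4: 0}
```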
The communication model is specifically as follows:

The transmission rate between the terminal device and the edge server is defined for the case where a task is offloaded to the edge server.

In the process of offloading tasks to the edge servers, let 𝒩_j denote the set of terminal devices offloading tasks to edge server s_j; the uplink transmission rate is assumed equal to the downlink transmission rate. The transmission rate refers to the speed at which data is transferred between the terminal device and the edge server. Wireless communication between the terminal devices and the edge servers is based on orthogonal frequency division multiple access. Let r_{i,j} represent the data transmission rate between terminal device n_i and edge server s_j:

r_{i,j} = W log₂(1 + (p_i h_{i,j}) / (I_{i,j} + σ²))

where W is the bandwidth from the terminal device to the edge server, p_i is the transmission power of terminal device n_i, h_{i,j} indicates the channel gain from terminal device n_i to edge server s_j, determined by the channel power gain g₀ at the reference distance d₀ and the distance d_{i,j} between terminal device n_i and server s_j, I_{i,j} represents the interference on the current terminal device n_i from the signals of other terminal devices transmitting tasks to the same edge server s_j, and σ² represents the Gaussian noise power.
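A minimal sketch of the uplink rate computation just defined; the power-law path-loss form and all numeric values are assumptions for illustration:

```python
import math

def transmission_rate(bandwidth_hz, tx_power_w, gain, interference_w, noise_w):
    """Shannon-style rate between a terminal device and an edge server:
    r = W * log2(1 + p*h / (I + sigma^2))."""
    sinr = tx_power_w * gain / (interference_w + noise_w)
    return bandwidth_hz * math.log2(1 + sinr)

def channel_gain(d, g0=1e-3, d0=1.0, exponent=4.0):
    """Assumed path-loss model: gain g0 at reference distance d0,
    decaying with distance d (the exponent is an assumption)."""
    return g0 * (d0 / max(d, d0)) ** exponent

r = transmission_rate(bandwidth_hz=10e6,
                      tx_power_w=0.2,
                      gain=channel_gain(d=50.0),
                      interference_w=1e-13,
                      noise_w=1e-13)
print(f"uplink rate ≈ {r / 1e6:.2f} Mbps")
```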
The local processing delay refers to the time required to execute a task locally on the terminal device, and the edge server processing delay refers to the time required to transmit a task from the terminal device to an edge server and process it there; both are optimized according to the following definitions.

Definition 1: RT^l_{n,i} represents the ready time of task v_{n,i} for local execution, and RT^s_{n,i} represents the ready time of task v_{n,i} on the edge server.

Definition 2: FT^l_{n,i} represents the completion time of task v_{n,i} executed locally, and FT^s_{n,i} represents the completion time of task v_{n,i} executed at the edge server.

First, the ready time RT^l_{n,i} of task v_{n,i} for local processing is introduced, indicating the time at which the results of all precursor tasks reach terminal device n. When a precursor task is processed locally there is no intermediate data transfer delay; when it is processed on an edge server, there is an intermediate data transfer delay d_{p,i} / r_{i,j}, where d_{p,i} indicates the intermediate data size and r_{i,j} represents the transmission rate between the terminal device and the server. The terminal device is configured with CPU cores, each of which can execute only one task at a time; the minimum completion time of a task executed locally is achieved when the user's CPU is in an idle state. Thus, RT^l_{n,i} is defined as the maximum, over all precursor tasks v_{n,p} ∈ pred(v_{n,i}), of the precursor completion time plus the intermediate data transfer delay for precursors executed on an edge server, where pred(v_{n,i}) represents the precursor task set of task v_{n,i}; task v_{n,i} can start only after its precursor tasks are completed. The processing time of task v_{n,i} executed locally is c_{n,i} / f_n, where f_n indicates the capability of the terminal device to handle computing tasks; thus the completion time of task v_{n,i} is:

FT^l_{n,i} = RT^l_{n,i} + c_{n,i} / f_n

When a task of the terminal device is offloaded to an edge server, there is an edge server processing waiting delay until the task arrives and an idle CPU core processes it. The transmission time of task v_{n,i} from the terminal to the server is expressed as d_{n,i} / r_{i,j}, and the ready time RT^s_{n,i} of task v_{n,i} on the edge server is defined accordingly as the later of the arrival of its input data and the completion of its precursor tasks, plus the edge server processing waiting delay. With the CPU processing capability of the edge server expressed as f_s, the actual execution time of task v_{n,i} on the edge server is c_{n,i} / f_s; thus, the completion time of task v_{n,i} on the edge server is:

FT^s_{n,i} = RT^s_{n,i} + c_{n,i} / f_s

The total delay of the application is the local completion time of the virtual end node task. Let v_{n,I} denote the last (virtual end) task of application G_n and v_{n,1} its first virtual task; the completion delay t_m of application G_n is then t_m = FT^l_{n,I}.
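As a concrete illustration of this delay model, the following self-contained sketch performs an earliest-finish-time pass over a toy DAG; all values and the simplification that server-to-server wired transfer is neglected are assumptions:

```python
# Toy DAG: tid -> (input_bits, cpu_cycles, predecessor ids);
# 0 and 4 are the virtual start/end nodes, which must run locally.
app = {
    0: (0.0, 0.0, []),
    1: (2e6, 5e8, [0]),
    2: (1e6, 3e8, [0]),
    3: (4e6, 8e8, [1, 2]),
    4: (0.0, 0.0, [3]),
}
decision = {0: 0, 1: 2, 2: 1, 3: 2, 4: 0}   # 0 = local, j > 0 = edge server j

def completion_times(app, decision, f_local=1e9, f_edge=5e9, rate=2.5e6):
    """Earliest-finish-time pass over the dependency DAG (illustrative).
    Returns tid -> finish time; the application delay t_m is the finish
    time of the virtual end node."""
    finish = {}
    for tid in sorted(app):                 # ids are already topological here
        bits, cycles, preds = app[tid]
        ready = 0.0
        for p in preds:
            transfer = 0.0
            # Intermediate data crosses the wireless link only when exactly
            # one of (predecessor, task) runs on an edge server; wired
            # edge-to-edge transfer is neglected in this toy.
            if (decision[p] == 0) != (decision[tid] == 0):
                transfer = bits / rate
            ready = max(ready, finish[p] + transfer)
        speed = f_local if decision[tid] == 0 else f_edge
        finish[tid] = ready + cycles / speed
    return finish

ft = completion_times(app, decision)
print(f"application delay t_m ≈ {ft[4]:.3f} s")
```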
The energy consumption model includes the local processing computation energy consumption, the transmission energy consumption from the terminal device to the edge server, the transmission energy consumption from the edge server to the target server, and the processing waiting energy consumption.

The local processing computation energy consumption E^l_{n,i} is as follows:

E^l_{n,i} = κ f_n² c_{n,i}

where κ represents the computation energy consumption coefficient of the terminal device.

The transmission energy consumption E^t_{n,i} from the terminal device to the edge server is as follows:

E^t_{n,i} = p_n · (d_{n,i} / r_{i,j})

where d_{n,i} / r_{i,j} represents the transmission delay for the terminal device to transmit the task to the edge server.

The transmission energy consumption from the edge server to the target server and the processing waiting energy consumption E^w_{n,i} are as follows:

E^w_{n,i} = p^w_n · T^wait_{n,i}

where p^w_n represents the waiting power of the terminal device while a task executes on the MEC edge server. Thus, the total energy consumption e_m consumed by the terminal device is the sum of these components over all of its tasks.
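A minimal sketch of these energy components; the DVFS-style coefficient and all numeric values are assumptions for illustration:

```python
def local_energy(cycles, f_local, kappa=1e-27):
    """Local computation energy: E = kappa * cycles * f^2 (assumed DVFS model)."""
    return kappa * cycles * f_local ** 2

def tx_energy(bits, rate, tx_power):
    """Upload energy: transmit power times transmission delay."""
    return tx_power * (bits / rate)

def wait_energy(wait_seconds, wait_power):
    """Idle-waiting energy while the edge server executes the task."""
    return wait_power * wait_seconds

# Energy of one offloaded task vs. running it locally (all values assumed).
bits, cycles = 2e6, 5e8
e_offload = tx_energy(bits, rate=2.5e6, tx_power=0.2) + wait_energy(0.1, 0.05)
e_local = local_energy(cycles, f_local=1e9)
print(f"offload: {e_offload:.3f} J   local: {e_local:.3f} J")
```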
The number of times the application exits the task is as follows: E_n represents the number of exits of task v_{n,i}; if v_{n,i} is not an exit subtask, then E_n = 0, and the value is normalized; 𝒱_n represents the set of all subtasks within the application.

The computational workload is as follows: L_n represents the computational workload of task v_{n,i}, and a normalized value is then defined to represent the computational workload; ψ_{n'} represents the computing workload size of a task within the application.

The time urgency is as follows: for a task v_{n,i} arriving at time slot t_x, a normalized time urgency value is defined to represent the urgency of task v_{n,i} at slot t_x.

The per-bit application completion rate is as follows: a scale measuring the successful completion of an application during the transmission or processing of each bit of data; it represents the share of a certain subtask's per-bit application completion rate within the overall application.

Based on the task priority indices, the optimal task execution order with the maximum number of completed applications in each time slot is obtained, where succ(T_i) represents the set of direct successors of T_i, together with the successor task sets of all subtasks within the application, and K represents the set of comprehensive ranking values.
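The sketch below combines the four indices into one standardized ranking score; the min-max normalization and the equal weighting are assumptions, since the patent states only that the indices are merged into a normalized mixed priority index:

```python
def normalize(values):
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

def mixed_priority(exits, workload, urgency, per_bit_rate):
    """Merge the four task priority indices into one standardized score
    per task (equal weights assumed)."""
    cols = [normalize(exits), normalize(workload),
            normalize(urgency), normalize(per_bit_rate)]
    return [sum(c[i] for c in cols) / 4 for i in range(len(exits))]

scores = mixed_priority(exits=[0, 1, 0, 2],
                        workload=[5e8, 3e8, 8e8, 1e8],
                        urgency=[0.2, 0.9, 0.5, 0.7],
                        per_bit_rate=[0.4, 0.6, 0.3, 0.8])
order = sorted(range(len(scores)), key=lambda i: -scores[i])
print("execution order by priority:", order)
```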
The attention-based agent decision model is used to extract the dependencies between the ordered tasks and to optimize task offloading decisions using deep learning.

The system model is modeled as a time-discrete system in which the terminal device is regarded as an offloading agent; the dependency task offloading problem is modeled as a Markov decision process, and deep reinforcement learning is applied to the task offloading problem. The state space, action space, and reward function are defined as follows.
state space: at the time of scheduling tasks At this time, the state of the MEC system depends on the scheduled taskThe scheduling result of the previous task also depends on the environment information to interact; defining a state space as a combination of directed acyclic graph information, MEC system environment information, and a partial offload plan; a directed acyclic graph representing the code, Representing slaveTo the point ofIs scheduled for offloading a task sequence; the state space is expressed as:
Wherein, The directed acyclic graph information is represented,AndRespectively a set of task nodes and a set of directed edges between tasks; representing a set of task nodes on each application asRepresenting system environment information,Representing directed acyclic graph tasks,Representing an unloading action space;
Consider the directed acyclic graph of an application G_n, where 𝒱_n is the set of tasks and ε_n represents the dependencies in the application. For each task in application G_n, v_{n,i}.f is used to represent the features of task v_{n,i}; all tasks of application G_n are characterized as:

F_n = [v_{n,1}.f, v_{n,2}.f, ..., v_{n,I}.f]

For each task v_{n,i}, the number of feature dimensions is 2. The adjacency matrix A_n is used to represent the adjacency relationships of application G_n; the state of task v_{n,i} is expressed as s_{n,i}, including the features of task v_{n,i} and the information of the adjacency matrix A_n.

The goal of the state space is to output a task-level embedding O_n ∈ R^{I×D}, where D is the embedding dimension of each task and I is the number of tasks generated by the terminal device. The inputs F_n and A_n contain all information of the application requested by user n, and each row of O_n represents the embedding vector of task v_{n,i}.
Action space: a task is either offloaded to an edge server or executed locally on the terminal device; a_i = 1 indicates offloading to an edge server, and a_i = 0 indicates executing task T_i locally. The action space is defined as A_{1:i} = {0, 1, 2, 3, ..., I}.
Reward function: the optimization objective is to maximize the offloading utility, and the reward function at each step is defined as the estimated increment of the offloading utility, subject to:

λ_t + λ_e = 1

where t̄ is the average delay of a task in the directed acyclic graph and ē is the average energy consumption of a task in the directed acyclic graph; λ_t represents the delay weight coefficient and λ_e represents the energy consumption weight coefficient.
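A sketch of a per-step reward as an increment of the estimated offloading utility; the exact form of the estimate is an assumption, since the patent states only that the reward is the estimated utility increment weighted by λ_t and λ_e:

```python
def estimated_utility(t_avg, e_avg, t_max, e_max, lam_t=0.5, lam_e=0.5):
    """Weighted saving over full-local execution; lam_t + lam_e = 1 (assumed form)."""
    return lam_t * (t_max - t_avg) / t_max + lam_e * (e_max - e_avg) / e_max

def step_reward(prev_estimate, new_estimate):
    """Per-step reward = estimated increment of the offloading utility."""
    return new_estimate - prev_estimate

u0 = estimated_utility(t_avg=4.0, e_avg=1.8, t_max=5.0, e_max=2.0)
u1 = estimated_utility(t_avg=3.4, e_avg=1.6, t_max=5.0, e_max=2.0)
print(f"reward after scheduling one more task: {step_reward(u0, u1):.3f}")
```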
According to the definition of the Markov decision process, the offloading problem is converted into a codec problem based on the attention mechanism: the ordered task sequence is input, and a task scheduling plan is output. The policy π(a_i | (S, G), A_{1:i-1}) represents the probability of selecting action a_i for task T_i in state (S, G); π(A_{1:n} | (S, G)) represents the probability of obtaining the offloading plan A_{1:n} given the dependency graph G, the MEC environment information S, and n tasks.

π(A_{1:n} | (S, G)) is defined as:

π(A_{1:n} | (S, G)) = ∏_{i=1}^{n} π(a_i | (S, G), A_{1:i-1})
An attention-based agent decision model is designed to approximate the policy defined above. Combining a codec and an attention mechanism, the policy function and the value function are approximated; the attention-based agent decision model includes a policy neural network and a value neural network. The ordered task sequence is represented as [t_1, t_2, ..., t_n], and the function of the encoder in the attention-based agent decision model is represented as f_enc; the hidden state of the encoder is obtained by:

e_i = f_enc(e_{i-1}, t_i; θ_enc)

where e_i is the hidden state of encoding step i, t_i is the i-th task of the input task sequence, and θ_enc is the parameter of the encoder. f_dec is the function of the decoder in the attention-based agent decision model.

The decoder hidden state d_k of decoding step k is calculated as follows:

d_k = f_dec(d_{k-1}, a_{k-1}, c_k; θ_dec)

where θ_dec is the parameter of the decoder network and a_{k-1} represents the attention distribution of the previous time step; c_k is the context vector of the attention mechanism, defined as the weighted sum of the hidden states of the encoder:

c_k = Σ_i α_{ki} e_i

The weight α_{ki} for each hidden state of the encoder is calculated by:

α_{ki} = exp(f_score(d_{k-1}, e_i)) / Σ_{i'} exp(f_score(d_{k-1}, e_{i'}))

where the scoring function f_score(d_{k-1}, e_i) is used to measure the degree of matching between the input at encoder position i and the output at decoder position k. The policy neural network and the value neural network share all network layers except the top layer of the decoder. For the policy neural network, a fully connected layer is added on the decoder output d_k, and a softmax function converts the output into a distribution over actions π(a_k | s_k); for the value neural network, a fully connected layer is added on d_k, and its output represents the state value v(s_k). The shared parameters in the attention-mechanism codec neural network are used to extract common features of the directed acyclic graph, and training the policy neural network accelerates the training of the value neural network.
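A minimal NumPy sketch of one attention step as just defined; the bilinear form chosen for f_score is an assumption, since the patent requires only a scoring function matching encoder position i with decoder step k:

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def attention_step(d_prev, encoder_states, W):
    """One decoding step of the attention mechanism:
    score_i = d_prev · W · e_i, alpha = softmax(scores),
    c_k = sum_i alpha_i * e_i (weighted sum of encoder hidden states)."""
    scores = np.array([d_prev @ W @ e for e in encoder_states])
    alpha = softmax(scores)
    context = (alpha[:, None] * encoder_states).sum(axis=0)
    return alpha, context

rng = np.random.default_rng(0)
H = 8                                      # hidden size (assumed)
enc = rng.normal(size=(5, H))              # encoder hidden states e_1..e_5
d_prev = rng.normal(size=H)                # previous decoder state d_{k-1}
W = rng.normal(size=(H, H)) * 0.1          # bilinear scoring matrix (assumed)
alpha, c_k = attention_step(d_prev, enc, W)
print("attention weights:", np.round(alpha, 3))
```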
The offloading decision of each task is made by the trained attention-mechanism codec neural network. The steps of the offloading process are summarized as follows:

Step one: first, the ordered tasks are embedded into a vector sequence as the input state information of the encoder; next, the output sequence of the encoder is used to calculate the context vector; in the k-th decoding step, the offloading decision is generated from π(a_k | s_k).

Step two: the terminal device and the edge server cooperatively execute all tasks according to the offloading decision.
The training objective is to find an optimal policy that maximizes the long-term cumulative reward, specifically defined as:

J(θ) = E[ Σ_{t=1}^{n} R(s_t, a_t) ]

where θ is the parameter of the attention-mechanism codec neural network, n is the number of tasks of the directed acyclic graph, R is the reward function, and s_t and a_t are respectively the observation state and the offloading decision of the t-th task. The aim of training is to adjust the parameter θ to maximize the long-term cumulative reward.
The improved PPO-based training is specifically as follows;
When training the attention-based agent decision model, the whole training trajectory is divided into sequences and then input into the improved PPO network. The training trajectory includes a scheduling plan A_{1:n} and a state value sequence, first sampled from the environment, where the state value sequence is obtained by forward propagation of the ordered task sequence; a reward sequence is obtained by applying the scheduling plan to the environment.
The error term δ_t of time step t is defined as:

δ_t = r_t + γ V(s_{t+1}) − V(s_t)

where r_t represents the instant reward of time step t, γ represents the discount factor, and V(s_t) indicates the total return expected to be obtainable under the policy in state s_t.

The advantage of time step t, obtained using the generalized advantage estimator, is defined as follows:

Â_t = Σ_{l≥0} (γλ)^l δ_{t+l}

where λ is used to control the trade-off between bias and variance.
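A minimal sketch of generalized advantage estimation over one finite trajectory, matching the two formulas above; the numeric values are illustrative:

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimation:
    delta_t = r_t + gamma*V(s_{t+1}) - V(s_t);
    A_t = sum_l (gamma*lam)^l * delta_{t+l}, computed backwards.
    `values` must contain one extra bootstrap entry V(s_T)."""
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages

adv = gae_advantages(rewards=[0.1, 0.3, -0.05, 0.2],
                     values=[0.5, 0.55, 0.6, 0.4, 0.0])
print([round(a, 3) for a in adv])
```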
In the first training stage, two attention-based agent decision models are generated and initialized with the same random parameters; one is used for sampling and the other is used to update the sampling neural network. In each training cycle, the attention-based agent decision model used for sampling samples a set of training trajectories and stores them in an experience buffer 𝒟, and the advantage function sequence and the estimated state value sequence are calculated and stored; the buffer includes the task embedding sequence, the sampled scheduling plan, the reward sequence, the sampled state value sequence, the advantage function sequence, and the estimated state value sequence.
In the second training stage, the attention-based agent decision model is updated by mini-batch stochastic gradient descent on the objective function L(θ) over several epochs:

L(θ) = Ê_t[ L^CLIP(θ) − c_1 L^VF(θ) + c_2 H(π_θ(·|s_t)) ]

where L^CLIP is the clipping objective function, L^VF is the squared-error loss function, H(π_θ(·|s_t)) is the entropy of the policy in state s_t under the current policy π_θ, and c_1 and c_2 are coefficients.

The clipping objective function is defined as:

L^CLIP(θ) = Ê_t[ min( ρ_t(θ) Â_t, clip(ρ_t(θ), 1 − ϵ, 1 + ϵ) Â_t ) ]

where ϵ is a hyperparameter controlling the clipping range, and the policy probability ratio is:

ρ_t(θ) = π_θ(a_t | s_t) / π_θ_old(a_t | s_t)

The clip function aims to limit ρ_t(θ) to the interval [1 − ϵ, 1 + ϵ]; the minimum of the clipped and unclipped objectives is taken as the final objective.

The squared-error loss function L^VF is the squared-error loss between the predicted state value V_θ(s_t) and the target state value V^target_t:

L^VF(θ) = ( V_θ(s_t) − V^target_t )²

where V^target_t = Â_t + V(s_t).
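A minimal NumPy sketch of this improved-PPO objective; c1 = 0.5 and c2 = 0.01 follow the coefficients reported in the simulation settings later in this document, while everything else (batch values, names) is illustrative:

```python
import numpy as np

def ppo_loss(ratio, adv, v_pred, v_target, entropy,
             clip_eps=0.2, c1=0.5, c2=0.01):
    """PPO-style objective to be maximized:
    L = mean(min(ratio*A, clip(ratio, 1-eps, 1+eps)*A))
        - c1 * (V - V_target)^2 + c2 * policy entropy."""
    clipped = np.clip(ratio, 1 - clip_eps, 1 + clip_eps)
    l_clip = np.minimum(ratio * adv, clipped * adv).mean()
    l_vf = ((v_pred - v_target) ** 2).mean()
    return l_clip - c1 * l_vf + c2 * entropy.mean()

ratio = np.array([1.1, 0.7, 1.4])          # pi_theta / pi_theta_old
adv = np.array([0.5, -0.2, 0.8])           # GAE advantages
v_pred = np.array([0.4, 0.3, 0.6])
v_target = np.array([0.6, 0.25, 0.9])      # A_t + V(s_t)
entropy = np.array([1.2, 1.1, 1.0])        # policy entropy per state
print(f"objective to maximize: {ppo_loss(ratio, adv, v_pred, v_target, entropy):.4f}")
```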
The beneficial effects of the invention are as follows: the invention establishes a more accurate priority for each task by comprehensively considering multiple key factors, including the exit task frequency, the computational workload, the time urgency, and the per-bit application completion rate. The method can identify and process the dependencies between tasks more effectively and significantly improves the efficiency and accuracy of task scheduling, thereby reducing the time delay and energy consumption of the terminal device.

Aiming at the shortcomings of the prior art in task feature extraction and offloading decision efficiency, the invention adopts a graph convolutional network and an attention mechanism. Key features are extracted from the directed acyclic graph of the tasks, and the attention mechanism lets the model concentrate on the more important information during training, reducing interference from irrelevant information. In addition, by introducing the entropy reward concept, the invention ensures more effective exploration and decision making and accelerates the convergence speed and stability of model training. These innovations accelerate the offloading decision process and improve overall system performance.

The method not only improves the efficiency of task priority determination and offloading decisions, but also significantly improves overall system performance through accurate task feature extraction and an effective training mechanism, better meeting the 5G era's requirements for high efficiency and low latency in mobile services.
Drawings
Fig. 1 shows a schematic diagram of the terminal energy efficiency optimization offloading method based on attention multi-index ranking.

Fig. 2 shows the model used for offloading decision-making and its training process.

Fig. 3 shows the attention-based agent decision model.

Fig. 4 shows the variation of average time delay under different numbers of application subtasks; the DTO algorithm in the comparison represents the method of the invention.

Fig. 5 shows the variation of average energy consumption under different numbers of application subtasks; the DTO algorithm in the comparison represents the method of the invention.

Fig. 6 shows the quality of service (QoS) variation under different numbers of application subtasks; the DTO algorithm in the comparison represents the method of the invention.

Fig. 7 shows the quality of service (QoS) variation under different terminal device offloadability levels; the DTO algorithm in the comparison represents the method of the invention.

Fig. 8 shows a schematic diagram of the multi-layer architecture, from cloud layer and edge layer to user layer, designed by the method of the invention.

Fig. 9 shows a flow chart of the overall offloading process of the method of the invention.
Detailed Description
To achieve the above purpose, the technical scheme adopted by the invention is as follows: a terminal energy efficiency optimization offloading method based on attention multi-index ranking, with the following specific steps:
step 1: and establishing a system model comprising a network model, an application program model, a communication model and an energy consumption model.
In high-density terminal device scenarios, the traditional base station deployment mode faces three major problems: first, because base station deployment is sparse, many user terminal devices cannot be effectively covered and are left without communication service; second, gaps exist between base station coverage areas, so users frequently encounter signal interruptions when moving, degrading communication quality and user experience; finally, emerging applications such as intelligent transportation require higher-density base stations to provide strong signal coverage and low latency, requiring dense base station deployment. To address these issues, dense base station deployment becomes critical. In the invention, the base station and the edge server are regarded as a whole, which simplifies the complexity of the problem and improves the reliability and capacity of the communication network.
The method focuses on the situation of densely deployed base stations, namely, terminal equipment can exist in the coverage range of a plurality of edge servers. The multi-overlay edge network architecture includes three key layers: cloud layer, edge layer and user layer.
Different computing nodes have different computing and communication capabilities, and more efficient task processing is achieved through cooperation. The terminal devices communicate with the edge servers through wireless connections, the edge servers are connected to each other by wire to transmit data, and the edge servers are connected to the cloud center through high-speed optical fiber links. The terminal devices are not directly connected to the cloud center and can communicate and offload tasks only through the edge servers. This architecture leverages the capabilities of the different nodes to increase task processing efficiency. The invention treats each application as a set of dependent tasks, considering the positive impact of application partitioning and parallel processing on transfer time and offloading efficiency. At each time slot, the terminal devices generate a set of these tasks and send their information, together with their own resource and power conditions, to the MEC edge server. The edge server collects this information and then makes an offloading decision. The invention aims to provide a better user experience from the point of view of the terminal device, reducing average application completion time and energy consumption while ensuring that data dependency constraints are met. At the same time, it pursues fairness in task scheduling, i.e., increasing the degree to which delay and energy consumption are reduced by offloading tasks relative to executing them all locally. The core goal of the invention is to optimize task allocation and processing to provide more efficient application execution and resource utilization.
In order to improve the offloading utility of the terminal device, an optimal offloading decision needs to be made. The offloading decision is used to determine whether the dependency tasks generated by the terminal device should be handled by the terminal device itself or by some edge server within communication range, respectively. Each of the dependent tasks considered by the present invention is already at a minimum granularity and is not repartitionable, so each of the dependent tasks is considered to be locally processed or offloaded as a whole.
When offloading tasks to an edge server, it is necessary to precisely define the transmission rate between the terminal device and the edge server. This involves considering the network type of the terminal device (e.g., 4G, 5G, or Wi-Fi), device capabilities, and network environment to select an appropriate transmission rate. This is the key to ensuring efficient offloading and distributed processing of tasks, and requires optimization of the transmission rate between the terminal device and the edge server for overall system performance based on comprehensive consideration of network performance, latency and task priority.
In offloading tasks to MEC edge servers, let 𝒩_j denote the set of terminal devices that offload tasks to edge server s_j. Inter-cell interference is also considered, and the uplink and downlink are assumed to be symmetrical, i.e., the uplink transmission rate equals the downlink transmission rate. In this scenario, the transmission rate refers to the speed at which data is transferred between the terminal device and the edge server, and its stability and reliability are critical to task processing efficiency. Therefore, transmission strategies must be optimized to improve transmission efficiency and reduce delay, and measures must be taken to mitigate or avoid inter-cell interference to ensure reliability of the data transmission process. Meanwhile, since the uplink and downlink are symmetrical, transmission management is simplified, the performance of the whole edge computing system is improved, and a better user experience is provided.
Local communication processing delay refers to the time required to execute tasks locally on the terminal device, while MEC server processing delay refers to the time required to transfer tasks from the terminal device to the server for processing, optimizing both delays helps to improve mobile computing efficiency and user experience. In order to better represent the delay model, the following definitions are presented.
The energy consumption generated by the terminal device in the task processing process is also one of key indexes of the unloading effect, and the energy consumption of local processing and the energy consumption of unloading processing will be described in detail in this section. Because the MEC edge server runs with continuous wired power supply, the invention does not consider the energy consumption of the edge server for executing the unloading task and the energy consumption for transmitting the intermediate data. Thus, the portion of the energy consumption includes local processing computing energy consumption, terminal device to edge server transmission energy consumption, edge server to target server transmission energy consumption, and processing standby energy consumption.
The invention considers the dependency task offloading problem in a multi-coverage edge network scenario, with the goal of minimizing average application completion time and energy consumption. In particular, the degree to which latency and energy consumption under an offloading decision are reduced relative to the latency t_m and energy consumption e_m of performing all tasks entirely locally is regarded as the offloading utility, while different delay and energy consumption weights are set for the terminal devices according to their diversified delay and energy consumption requirements, and the average offloading utility of all terminal devices is defined accordingly.
Step 2: problem analysis and transformation;
To demonstrate that the problem studied by the invention is NP-hard, the problem is reduced to a multi-machine scheduling problem. First, a special case of the research problem is given: there are n user applications and m homogeneous edge computing nodes. Each application may execute on any edge computing node, and each application is indivisible (i.e., consists of only one task). Let t_i be the execution time required by application i. A resource allocation strategy is needed that enables the m edge computing nodes to process the n applications in the shortest time. This is a typical multi-machine scheduling problem, which is NP-hard. In summary, the special case reduces to a multi-machine scheduling problem; thus it can be inferred that the problem studied by the invention is NP-hard and difficult to solve in polynomial time. The invention divides the problem into two sub-problems: the first step solves the ordering problem inside the dependency tasks, and the second step solves the offloading decision problem after the sequence is arranged.
Step 2.1: designing a task priority ordering algorithm based on the mixed index dependency;
For each dependent application task, several task priority indices are defined, including the number of exit tasks, the computational workload, the time urgency, and the per-bit application completion rate; these are ultimately combined into one normalized priority index. Based on the task priority indices, the necessity of these priority indices is deduced and proved so as to obtain the optimal task execution order with the maximum number of completed applications in each time slot, and a standardized task priority index is designed. Task scheduling is not only affected by priority but also limited by task dependencies; before one subtask is executed, all its preceding subtasks must be completed. The terminal device maintains three queues: the first stores the most recently completed subtasks, the second stores executable subtasks, and the third maintains subtasks that cannot yet be executed for lack of data; a multi-queue priority ordering method is designed accordingly (see the sketch after this paragraph). In this way, an update structure over time and space is realized, and data maintenance and computation can be reduced to minimize application completion time.
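The sketch below illustrates the three-queue structure described above; the class and method names are illustrative assumptions, not identifiers from the patent:

```python
from collections import deque

class DeviceQueues:
    """Three queues maintained by each terminal device (sketch):
    recently completed subtasks, executable subtasks, and blocked
    subtasks that still lack predecessor data."""
    def __init__(self, app_preds):
        self.preds = {t: set(p) for t, p in app_preds.items()}
        self.completed = deque()
        self.executable = deque(t for t, p in self.preds.items() if not p)
        self.blocked = {t for t, p in self.preds.items() if p}

    def finish(self, task):
        """Mark a subtask finished and release now-executable successors."""
        self.completed.append(task)
        for t in list(self.blocked):
            self.preds[t].discard(task)
            if not self.preds[t]:
                self.blocked.discard(t)
                self.executable.append(t)

q = DeviceQueues({0: [], 1: [0], 2: [0], 3: [1, 2]})
q.finish(q.executable.popleft())          # finish subtask 0
print(list(q.executable))                 # -> [1, 2] now executable
```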
Step 2.2: attention-based coding and decoding dependent task offloading scheme design;
The invention uses a Markov decision process (MDP) to model the network environment as a time-discrete system in which the terminal device is regarded as an offloading agent. In general, an MDP is described as a 4-tuple comprising a state space, state transition probabilities, an action space, and a reward function. In this scenario, since the state of the agent is continuous and high-dimensional and the state transition probabilities are difficult to obtain accurately, the MDP is simplified and converted into a model-free process. To apply deep reinforcement learning (DRL) to the task offloading problem, the invention models the problem as an MDP.
The environmental state information includes two parts: 1) the MEC server state and 2) the user state. For edge servers, the CPU computing capability, bandwidth, and task waiting queues are necessary to decide whether the current user connects directly to the server. Likewise, if tasks are offloaded to the MEC server, the channel information should be considered; thus, the current total channel gain of the corresponding edge server also belongs to the MEC environment state information. In addition, user information has a significant impact on offloading decisions: the completion times of the tasks the current user has already decided, the computing capability of the user device, and the estimated completion time of the task the current user is executing constitute the user information. The MEC server information and the user information together constitute the necessary information of the MEC environment. Features extracted from the MEC servers and the terminal devices' dependency tasks are fed into an MLP to learn the embedding of the entire MEC environment. Dependency task embedding utilizes a GCN network to take the dependencies between tasks into account, efficiently capture the overall structure of the application, and better extract the task information that aids offloading decisions. The method is based on a GCN (graph convolutional network) for task embedding. In short, the embedding learned by the GNN can be regarded as a feature of the task, without cumbersome manual feature engineering. This process helps to better understand the relationships between tasks and improves the efficiency of task offloading decisions. As previously mentioned, consider the directed acyclic graph (DAG) of an application G_n, where 𝒱_n is the set of tasks and ε_n represents the dependencies in the application; for each task v_{n,i} in application G_n, v_{n,i}.f is used to represent the features of task v_{n,i}.
The simulation implementation of the invention is based on the PyCharm platform; the simulated terminal devices are located in small cellular networks, and the transmission rate depends on the distance between the terminal device and the MEC host. The set of available transmission rates is {2 Mbps, 8 Mbps, 14 Mbps, 20 Mbps, 26 Mbps}. The attention-based agent decision model is implemented using TensorFlow. The specific architecture is as follows: the encoder adopts two layers of bidirectional long short-term memory (LSTM) with 256 hidden units per layer; the decoder uses two layers of dynamic LSTM with 256 hidden units per layer. Furthermore, both the encoder and decoder use layer normalization. During training, the learning rate is set to 0.0001, the coefficients are set to c1 = 0.5 and c2 = 0.01, and the batch size is 500. The choice of hyperparameters has an important impact on training results and convergence speed. Comparison algorithms are also configured for the simulation verification of the invention. To evaluate the three performance indices of average response delay, average response energy consumption, and quality of service (QoS), comparison experiments were performed against All Local (fully local execution), All Remote (fully remote offloading), Random (random offloading), COFE (a heuristic algorithm), and DDQNTO (task offloading based on a double deep Q-network).
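For reference, a minimal TensorFlow sketch of an encoder matching the reported settings (two bidirectional LSTM layers, 256 hidden units each, with layer normalization); the exact wiring, such as the placement of the normalization layers and the input masking, is an assumption:

```python
import tensorflow as tf

def build_encoder(feature_dim=2, hidden=256):
    """Encoder sketch per the reported simulation settings."""
    inputs = tf.keras.Input(shape=(None, feature_dim))   # ordered task features
    x = tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(hidden, return_sequences=True))(inputs)
    x = tf.keras.layers.LayerNormalization()(x)
    x = tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(hidden, return_sequences=True))(x)
    x = tf.keras.layers.LayerNormalization()(x)
    return tf.keras.Model(inputs, x)

enc = build_encoder()
enc.summary()
```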

Claims (5)

1. A terminal energy efficiency optimization offloading method based on attention multi-index ranking, characterized by comprising the following steps:
Step 1, analyzing interaction among terminal equipment, an edge server and a cloud server from the multi-layer architecture of a cloud layer, an edge layer and a user layer to establish a system model; the system model comprises a network model, an application program model, a communication model and an energy consumption model;
Considering the dependency task offloading problem in a multi-coverage edge network scenario based on the system model, the goal being to minimize the average application completion time and the energy consumption; regarding the degree of reduction of delay and energy consumption under the offloading decision, relative to the delay t_m and energy consumption e_m of executing all tasks completely locally, as the offloading utility, while setting different delay weights and energy consumption weights for terminal devices according to their diversified delay and energy consumption requirements, defining the average offloading utility of all terminal devices accordingly, standardizing the tasks in all applications, and setting a unique virtual start node task and virtual end node task;
the dependency task offloading problem optimization objective aims to maximize the average offloading utility of all terminal devices, defined as follows:
OU_n = (1 - ε) × t_m + ε × e_m

C3: t_m ≤ ETC_m
OU_n represents the weighted sum of delay and energy consumption for each task processed, M represents the set of all tasks to be processed, y_ij indicates whether a task offload is associated with edge server j, n_i denotes a terminal device, v_{n,i} denotes a subtask of an application, s_j denotes an edge server, ETC_m represents the latest deadline of the application, 𝒩 represents the set of terminal devices, G_n represents an application, 𝒮 represents the set of all servers, and ε is a balance factor that trades off the minimization of average application completion time against energy consumption; t_max is the maximum completion time for the application to execute locally; e_max is the maximum energy consumption for the application to execute locally; 𝒜 is the offloading decision set; α_{n,i} represents the offloading action of a task, and I represents the total number of tasks; constraint C1 indicates that tasks can only be offloaded to a server associated with the terminal device to which they belong; constraint C2 indicates that a task can only be executed indivisibly at one computing node; constraint C3 represents the application end-time constraint; constraint C4 represents the binary and non-negativity constraints on the offloading decision variables; constraint C5 indicates that the unique virtual start node task and virtual end node task among the application tasks must be executed locally at the terminal device;
Step 2, problem analysis and transformation; aiming at the problem of unloading the dependency task, splitting the dependency task into two key steps for solving; firstly, establishing a dependency task mixed priority index based on a system model, and solving the problem of sequencing inside the dependency task; constructing an agent decision model based on attention, performing improved PPO training on the agent decision model, and solving the problem of task unloading decision after sequencing; through training the built models repeatedly and interactively with the environment, unloading decisions are continuously optimized;
the problem of ordering inside the dependent tasks is solved by means of the mixed priority index of the dependent tasks;
for each dependency task, defining the following task priority indexes including the number of times of application exiting the task, calculation workload, time urgency and application completion rate of each bit, and finally merging into a standardized task mixed priority index;
the number of times the application exits the task is as follows: E_n represents the number of exits of task v_{n,i}; if v_{n,i} is not an exit subtask, then E_n = 0, and the value is normalized; 𝒱_n represents the set of all subtasks within the application;

the computational workload is as follows: L_n represents the computational workload of task v_{n,i}, and a normalized value is then defined to represent the computational workload; ψ_{n'} denotes the computing workload size of a task within the application;

the time urgency is as follows: for a task v_{n,i} arriving at time slot t_x, a normalized time urgency value is defined to represent the urgency of task v_{n,i} at time slot t_x;

the per-bit application completion rate is as follows: a scale measuring the successful completion of an application during the transmission or processing of each bit of data; it represents the share of a certain subtask's per-bit application completion rate within the overall application;

based on the task priority indices, the optimal task execution order with the maximum number of completed applications in each time slot is obtained;

where succ(T_i) represents the set of direct successors of T_i, together with the successor task sets of all subtasks within the application; K represents the set of comprehensive ranking values;
the attention-based agent decision model is used to extract the dependencies between the ordered tasks and to optimize task offloading decisions using deep learning;

the system model is modeled as a time-discrete system in which the terminal device is regarded as an offloading agent; the dependency task offloading problem is modeled as a Markov decision process, and deep reinforcement learning is applied to the task offloading problem; the state space, action space, and reward function are defined as follows;
state space: at the time of scheduling tasks At this time, the state of the MEC system depends on the scheduled taskThe scheduling result of the previous task also depends on the environment information to interact; defining a state space as a combination of directed acyclic graph information, MEC system environment information, and a partial offload plan; Representing a coded directed acyclic graph, A 1:n=[a1,a2,...,ai,...,an represents a slave To the point ofIs scheduled for offloading a task sequence; the state space is expressed as:
Wherein, The directed acyclic graph information is represented,And ε n is the set of task nodes and the set of directed edges between tasks, respectively; representing a set of task nodes on each application asS represents system environment information, G represents directed acyclic graph tasks, A 1:i represents unloading action space;
consider the directed acyclic graph of an application G_n, where 𝒱_n is the set of tasks and ε_n represents the dependencies in the application; for each task v_{n,i} in application G_n, v_{n,i}.f is used to represent the features of task v_{n,i}; all tasks in application G_n are characterized as:

F_n = [v_{n,1}.f, v_{n,2}.f, ..., v_{n,I}.f]
for each task v_{n,i}, the number of feature dimensions is 2; the adjacency matrix A_n is used to represent the adjacency relationships of application G_n; the state of task v_{n,i} is represented as s_{n,i}, including the features of task v_{n,i} and the information of the adjacency matrix A_n;

the goal of the state space is to output a task-level embedding O_n ∈ R^{I×D}, where D is the embedding dimension of each task and I is the number of tasks generated by the terminal device; the inputs F_n and A_n contain all information of the application requested by user n, and each row of O_n represents the embedding vector of task v_{n,i};

action space: a task is either offloaded to an edge server or executed locally on the terminal device; a_i = 1 denotes offloading to an edge server, and a_i = 0 denotes executing task T_i locally; the action space is defined as A_{1:i} = {0, 1, 2, 3, ..., I};
Reward function: the optimization objective is to maximize the offloading utility, and the reward of each step is defined as the estimated increment of the offloading utility after the current decision, i.e., the reduction of the weighted average delay and energy cost of the directed acyclic graph, with
$\lambda_t + \lambda_e = 1$
where the average delay of a task in the directed acyclic graph and the average energy consumption of a task in the directed acyclic graph are obtained from the delay and energy models defined below; $\lambda_t$ represents the delay weight coefficient and $\lambda_e$ the energy-consumption weight coefficient;
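As a minimal sketch, assuming the utility increment is the decrease in weighted delay/energy cost produced by the current decision (the patent's exact utility expression is not reproduced here, and the function and argument names are hypothetical):

```python
def step_reward(avg_delay_before, avg_energy_before,
                avg_delay_after, avg_energy_after,
                lambda_t=0.5, lambda_e=0.5):
    """Reward of one scheduling step: how much the weighted delay/energy
    cost of the DAG improves after this task's offloading decision
    (lambda_t + lambda_e = 1)."""
    cost_before = lambda_t * avg_delay_before + lambda_e * avg_energy_before
    cost_after = lambda_t * avg_delay_after + lambda_e * avg_energy_after
    return cost_before - cost_after   # positive when the decision helps
```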
According to the definition of the Markov decision process, the offloading problem is converted into an encoder-decoder problem based on an attention mechanism: the input is the ordered task sequence and the output is a task scheduling plan; the policy $\pi(a_i \mid (S,G), A_{1:i-1})$ represents the probability of selecting action $a_i$ for task $T_i$ in state $((S,G), A_{1:i-1})$; $\pi(A_{1:n} \mid (S,G))$ represents the probability of obtaining the offloading plan $A_{1:n}$ given the MEC environment information $S$ and the $n$ tasks of the dependency graph $G$;
$\pi(A_{1:n} \mid (S,G))$ is factorized by the chain rule as:
$\pi(A_{1:n} \mid (S,G)) = \prod_{i=1}^{n} \pi(a_i \mid (S,G), A_{1:i-1})$
An attention-based agent decision model is designed to approximate the policy defined above; combining an encoder-decoder with an attention mechanism, the policy function and the value function are approximated, where the attention-based agent decision model comprises a policy neural network and a value neural network; representing the ordered task sequence as $[t_1, t_2, \dots, t_n]$ and the encoder function of the attention-based agent decision model as $f_{enc}$, the hidden state of the encoder is obtained by:
$e_i = f_{enc}(e_{i-1}, t_i; \theta_{enc})$
where $e_i$ is the hidden state of encoding step $i$, $t_i$ is the $i$-th task of the input task sequence, and $\theta_{enc}$ are the encoder parameters; $f_{dec}$ denotes the decoder function of the attention-based agent decision model;
The decoder hidden state $d_k$ of decoding step $k$ is calculated as: $d_k = f_{dec}(d_{k-1}, a_{k-1}, c_k; \theta_{dec})$
where $\theta_{dec}$ are the parameters of the decoder network and $a_{k-1}$ represents the action taken at the previous time step; $c_k$ is the context vector of the attention mechanism, defined as the weighted sum of the encoder hidden states: $c_k = \sum_{i=1}^{n} \alpha_{ki} e_i$
The weight $\alpha_{ki}$ of each encoder hidden state is calculated as a softmax over the attention scores:
$\alpha_{ki} = \dfrac{\exp(f_{score}(d_{k-1}, e_i))}{\sum_{i'=1}^{n} \exp(f_{score}(d_{k-1}, e_{i'}))}$
where the scoring function $f_{score}(d_{k-1}, e_i)$ measures the degree of matching between the encoder input at position $i$ and the decoder output at position $k$; all network layers except the top layer of the decoder are shared between the policy neural network and the value neural network; for the policy neural network, a fully connected layer is added on the decoder output $d_k$ and its output is converted into an action distribution $\pi(a_k \mid s_k)$ using the softmax function; for the value neural network, a fully connected layer is added on $d_k$ and its output represents the state value $v(s_k)$; the shared parameters of the attention encoder-decoder network extract common features of the directed acyclic graph, so that training the policy neural network accelerates the training of the value neural network;
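The following PyTorch sketch illustrates this architecture: a recurrent encoder, an attention-equipped recurrent decoder, and policy/value heads sharing the decoder trunk. The GRU cells, layer sizes, and greedy action selection are illustrative assumptions, not the patent's exact network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionAgent(nn.Module):
    """Encoder-decoder with additive attention; the decoder trunk is
    shared by a policy head (softmax over actions) and a value head."""

    def __init__(self, task_dim: int, hidden: int, n_actions: int):
        super().__init__()
        self.hidden = hidden
        self.n_actions = n_actions
        self.encoder = nn.GRUCell(task_dim, hidden)             # f_enc
        self.decoder = nn.GRUCell(n_actions + hidden, hidden)   # f_dec
        self.w_score = nn.Linear(2 * hidden, 1)                 # f_score
        self.policy_head = nn.Linear(hidden, n_actions)         # pi(a_k | s_k)
        self.value_head = nn.Linear(hidden, 1)                  # v(s_k)

    def forward(self, tasks: torch.Tensor):
        """tasks: (n, task_dim) embeddings of the ordered task sequence."""
        n = tasks.size(0)
        # encode: e_i = f_enc(e_{i-1}, t_i)
        e = torch.zeros(1, self.hidden)
        enc_states = []
        for i in range(n):
            e = self.encoder(tasks[i:i + 1], e)
            enc_states.append(e.squeeze(0))
        E = torch.stack(enc_states)                        # (n, hidden)
        # decode with attention: d_k = f_dec(d_{k-1}, a_{k-1}, c_k)
        d = torch.zeros(1, self.hidden)
        prev_a = torch.zeros(1, self.n_actions)            # one-hot a_{k-1}
        logits, values = [], []
        for _ in range(n):
            scores = self.w_score(
                torch.cat([d.expand(n, -1), E], dim=1)).squeeze(1)
            alpha = F.softmax(scores, dim=0)               # attention weights
            c = (alpha.unsqueeze(1) * E).sum(dim=0, keepdim=True)  # c_k
            d = self.decoder(torch.cat([prev_a, c], dim=1), d)
            logits.append(self.policy_head(d).squeeze(0))
            values.append(self.value_head(d).squeeze())
            a_k = int(torch.argmax(logits[-1]))  # greedy here; training samples
            prev_a = F.one_hot(torch.tensor([a_k]), self.n_actions).float()
        return torch.stack(logits), torch.stack(values)
```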
the offloading decision of each task is determined by the trained attention encoder-decoder network; the steps of the offloading process are summarized as follows:
step one: first, the ordered tasks are embedded into a vector sequence that serves as the input state information of the encoder; next, the output sequence of the encoder is used to calculate the context vector; in the $k$-th decoding step, an offloading decision is generated from the policy distribution $\pi(a_k \mid s_k)$;
Step two: the terminal device and the edge servers cooperatively execute all tasks according to the offloading decisions;
the training objective is to find an optimal policy that maximizes the long-term cumulative reward, specifically defined as:
$J(\theta) = \mathbb{E}_{\pi_\theta}\left[\sum_{t=1}^{n} R(s_t, a_t)\right]$
where $\theta$ are the parameters of the attention encoder-decoder network, $n$ is the number of tasks of the directed acyclic graph, $R(s_t, a_t)$ is the reward function, and $s_t$ and $a_t$ are the observed state and the offloading decision of the $t$-th task, respectively; the goal of training is to maximize the long-term cumulative reward by adjusting the parameter $\theta$;
the improved PPO-based training is specifically as follows;
When training the attention-based agent decision model, the whole training trajectory is divided into sequences and then input into the improved PPO network; a training trajectory comprises the scheduling plan $A_{1:n}$ and the state-value sequence $[v_\pi(s_1), v_\pi(s_2), \dots, v_\pi(s_n)]$, where the state-value sequence is obtained by forward propagation of the ordered task sequence; the reward sequence $[r_1, r_2, \dots, r_n]$ is obtained by applying the scheduling plan to the environment;
the error term delta for time step t is defined as:
$\delta_t = r_t + \gamma v_\pi(s_{t+1}) - v_\pi(s_t)$
where $r_t$ denotes the immediate reward of time step $t$, $\gamma$ denotes the discount factor, and $v_\pi(s_t)$ denotes the expected total return obtainable in state $s_t$ under the current policy;
The advantage of time step $t$ is obtained with a generalized advantage estimator, defined as follows:
$\hat{A}_t = \sum_{l=0}^{n-t} (\gamma\lambda)^l \delta_{t+l}$
where $\lambda$, $0 < \lambda < 1$, controls the trade-off between bias and variance of the estimator;
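For concreteness, the estimator in its usual backward-recursive form (with the terminal state value taken as zero, an assumption for this sketch):

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimation over one trajectory.

    rewards: [r_1..r_n]; values: [v(s_1)..v(s_n)]; v(s_{n+1}) = 0 is
    assumed for the terminal state."""
    n = len(rewards)
    adv = [0.0] * n
    next_value, running = 0.0, 0.0
    for t in reversed(range(n)):
        delta = rewards[t] + gamma * next_value - values[t]  # TD error
        running = delta + gamma * lam * running              # GAE recursion
        adv[t] = running
        next_value = values[t]
    return adv
```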
In the first training stage, two attention-based agent decision models are created and initialized with the same random parameters: one is used for sampling, and the other is updated and used to refresh the sampling network; in each training cycle, the sampling model samples a set of training trajectories and stores them in the experience buffer $D_i$, and calculates and stores the advantage sequence $[\hat{A}_1, \dots, \hat{A}_n]$ and the estimated state-value sequence $[\hat{v}(s_1), \dots, \hat{v}(s_n)]$; $D_i$ includes the task embedding sequence $[T_1, T_2, \dots, T_n]$, the sampled scheduling plan $A_{1:n}$, the reward sequence $[r_1, r_2, \dots, r_n]$, the sampled state-value sequence $[v_\pi(s_1), v_\pi(s_2), \dots, v_\pi(s_n)]$, the advantage sequence, and the estimated state-value sequence;
In the second training stage, mini-batch stochastic gradient descent is performed on $D_i$ for $m$ epochs to update the attention-based agent decision model with the objective function
$L(\theta) = \hat{\mathbb{E}}_t\left[L^{C}(\theta) - c_1 L^{VF}(\theta) + c_2 S[\pi_\theta](s_t)\right]$
where $L^{C}(\theta)$ is the clipped objective function, $L^{VF}(\theta)$ is the squared-error loss function, $S[\pi_\theta](s_t)$ represents the entropy of the current policy $\pi_\theta$ in state $s_t$, and $c_1$ and $c_2$ are coefficients;
the clipped objective function is defined as:
$L^{C}(\theta) = \hat{\mathbb{E}}_t\left[\min\left(pr_t(\theta)\hat{A}_t,\ \mathrm{clip}(pr_t(\theta), 1-\epsilon, 1+\epsilon)\hat{A}_t\right)\right]$
where $\epsilon$ is the hyperparameter that controls the clipping range, and $pr_t(\theta)$ is the policy probability ratio:
$pr_t(\theta) = \dfrac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{old}}(a_t \mid s_t)}$
The clipping function $\mathrm{clip}(pr_t(\theta), 1-\epsilon, 1+\epsilon)$ limits the value of $pr_t(\theta)$, and the minimum of the clipped objective and the unclipped objective is taken as the final objective;
The squared-error loss function $L^{VF}$ is computed between the predicted state value $\hat{v}(s_t)$ and the target state value $v_\pi(s_t)$:
$L^{VF}(\theta) = \left(\hat{v}(s_t) - v_\pi(s_t)\right)^2$
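Putting the clipped objective, the value loss, and the entropy bonus together, a compact PyTorch sketch of the resulting training loss follows; the coefficient defaults are illustrative.

```python
import torch

def ppo_loss(new_logp, old_logp, advantages, pred_values, target_values,
             entropy, eps=0.2, c1=0.5, c2=0.01):
    """PPO objective L^C - c1*L^VF + c2*S, negated so it can be
    minimized; all arguments are tensors over one mini-batch."""
    ratio = torch.exp(new_logp - old_logp)              # pr_t(theta)
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps)
    l_clip = torch.min(ratio * advantages, clipped * advantages).mean()
    l_vf = ((pred_values - target_values) ** 2).mean()  # squared-error loss
    return -(l_clip - c1 * l_vf + c2 * entropy.mean())
```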
2. The method for optimizing and offloading terminal energy efficiency based on attention multi-index ranking of claim 1, wherein the network model comprises three key layers: a cloud layer, an edge layer, and a user layer;
The cloud layer is the first key layer and comprises a cloud data center; the cloud data center stores all services and has the computing capability to support the terminal devices; the cloud data center is used for training and deploying each model;
The edge layer is the second key layer and comprises a plurality of base stations and edge servers; each edge server is configured with computing resources and storage resources; the terminal devices offload application tasks to the edge servers;
The user layer is the third key layer and comprises the terminal devices; when a terminal device cannot meet the computation and delay requirements of an application, the application task is offloaded to an edge server;
the terminal devices communicate with the edge servers through wireless connections, the edge servers are interconnected by wired links for data transmission, and the edge servers are connected to the cloud data center through high-speed optical fiber links; the terminal devices are not directly connected to the cloud data center, and communicate and offload tasks only through the edge servers;
The application is treated as a set of dependent tasks; in each time slot, the terminal device generates dependent tasks; a corresponding offloading decision is made according to the information of the dependent tasks, the device resources, the battery level, and the conditions of the edge servers;
There are $k$ terminal devices and $j$ edge servers in the network model; the terminal device set is expressed as $N = \{n_1, n_2, \dots, n_k\}$; the location of each terminal device is randomly assigned at the beginning of each time slot and moves according to a Gauss-Markov random model; the position of a terminal device does not change within the same time slot, so the coordinate position of terminal device $n_i$ is fixed within a slot and updated only at the beginning of each time slot; the local computing capability of terminal device $n_i$ is represented by the CPU frequency $f_i$, and the transmission power of terminal device $n_i$ is represented by $p_i$; the edge server set is represented as $\mathcal{S} = \{s_1, s_2, \dots, s_j\}$, where $s_j$ denotes the $j$-th edge server and the coordinates of edge server $s_j$ are fixed; when a group of terminal devices is connected to the same edge server, that edge server holds the offloaded tasks and has a wired transmission function; each terminal device has an application program and decides on which edge server to execute each task; a terminal device can communicate directly with only one edge server during task execution; the offloading decision takes into account the state changes of the network model at different times.
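For illustration, one slot of a Gauss-Markov velocity update might look as follows; `alpha`, `mean_v`, and the noise scale are assumed parameters, not values from the patent.

```python
import math, random

def gauss_markov_step(vx, vy, alpha=0.8, mean_v=1.0, mean_dir=0.0, sigma=0.3):
    """One time slot of a Gauss-Markov mobility update for a device.

    alpha tunes memory (0 = random walk, 1 = constant velocity); the
    parameter values here are illustrative only."""
    speed = math.hypot(vx, vy)
    direction = math.atan2(vy, vx)
    noise = math.sqrt(1 - alpha ** 2) * sigma
    speed = alpha * speed + (1 - alpha) * mean_v + noise * random.gauss(0, 1)
    direction = alpha * direction + (1 - alpha) * mean_dir + noise * random.gauss(0, 1)
    return speed * math.cos(direction), speed * math.sin(direction)
```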
3. The method for optimizing and offloading terminal energy efficiency based on attention multi-index ranking according to claim 2, wherein the application model is specifically:
Considering the random position of terminal device $n_i$ within a certain area, each base station has a certain coverage range, and when the terminal device is within the coverage range it can access the corresponding edge server through the wireless network; each terminal device has an application program that must be executed before the application deadline; each terminal device application is structured, and the subtasks of each application can be executed on any edge server or cloud server; each application is modeled as a directed acyclic graph $G_n = (\mathcal{V}_n, \varepsilon_n)$, where $\mathcal{V}_n$ is the set of task nodes and $\varepsilon_n$ is the set of directed edges between task nodes; the set of task nodes of each application is represented as $\mathcal{V}_n = \{v_{n,1}, \dots, v_{n,I}\}$ with $v_{n,i} = (d_{n,i}, b_{n,i}, db_{n,ii'})$, $i \in \{1, 2, \dots, I\}$; $v_{n,i}$ denotes the $i$-th task of the application, $d_{n,i}$ is the size of the input data of task $v_{n,i}$, and $b_{n,i}$ is the number of CPU cycles required to complete task $v_{n,i}$; the predecessor task set of task $v_{n,i}$ is $Pre(v_{n,i})$, and $db_{n,ii'}$ denotes the size of the intermediate data between task $v_{n,i}$ and predecessor subtask $v_{n,i'}$; each task is offloaded to a communicable edge server for execution; in the directed acyclic graph, $(v_{n,m}, v_{n,n})$ denotes that $v_{n,n}$ can only be executed after task $v_{n,m}$ is completed; the set of directed edges between tasks is denoted $\varepsilon_n$, and $|\varepsilon_n| = E$ dependencies exist between tasks, such that task $v_{n,i}$ completes before task $v_{n,j}$ begins;
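To make the notation concrete, a small hand-built example of such an application DAG and a topological-ordering helper follow; the data sizes and cycle counts are made-up illustration values.

```python
# Minimal representation of one application DAG; field names mirror the
# notation above (d = input data size in bits, b = CPU cycles, db =
# intermediate data sizes keyed by predecessor index). Values are
# illustrative only.
application = {
    "tasks": {
        1: {"d": 4.0e5, "b": 2.0e8, "pre": [],     "db": {}},
        2: {"d": 2.0e5, "b": 1.5e8, "pre": [1],    "db": {1: 1.0e5}},
        3: {"d": 3.0e5, "b": 2.5e8, "pre": [1],    "db": {1: 1.2e5}},
        4: {"d": 1.0e5, "b": 1.0e8, "pre": [2, 3], "db": {2: 0.8e5, 3: 0.6e5}},
    },
    "edges": [(1, 2), (1, 3), (2, 4), (3, 4)],  # (v_m, v_n): v_n waits for v_m
}

def topological_order(app):
    """Order tasks so every predecessor comes first (Kahn's algorithm);
    assumes the graph is acyclic."""
    pending = {i: set(t["pre"]) for i, t in app["tasks"].items()}
    order = []
    while pending:
        ready = [i for i, pre in pending.items() if not pre]
        for i in sorted(ready):
            order.append(i)
            del pending[i]
            for pre in pending.values():
                pre.discard(i)
    return order
```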
The offloading decision determines whether each dependent task generated by the terminal device is processed on the terminal device itself or on an edge server within communication range; the task offloading variable $\alpha_{n,i}$ represents the offloading decision of task $v_{n,i}$, where $\alpha_{n,i} = 0$ represents executing task $v_{n,i}$ locally and $\alpha_{n,i} \neq 0$ means that task $v_{n,i}$ is executed on an edge server; for each user, $\alpha_{n,0} = \alpha_{n,I+1} = 0$, i.e., the virtual start and end tasks run locally; thus the offloading decision set is defined as $A = \{\alpha_{n,1}, \alpha_{n,2}, \dots, \alpha_{n,I}\}$;
4. The method for optimizing and offloading terminal energy efficiency based on attention multi-index ranking of claim 3, wherein the communication model is specifically:
the transmission rate between the terminal device and the edge server is defined for the case where a task is offloaded to the edge server;
During task offloading, the set of terminal devices offloading tasks to edge server $s_j$ is denoted $N_j$; the uplink transmission rate is taken to be equal to the downlink transmission rate; the transmission rate refers to the speed at which data is transferred between the terminal device and the edge server; the wireless communication between the terminal devices and the edge servers is based on orthogonal frequency division multiple access; the data transmission rate $R_{i,j}$ between terminal device $n_i$ and edge server $s_j$ is
$R_{i,j} = W_{i,j} \log_2\left(1 + \dfrac{p_i g_{ij}}{I_j + N_0}\right)$
where $W_{i,j}$ is the bandwidth from the terminal device to the edge server, $p_i$ is the transmission power of terminal device $n_i$, and $g_{ij}$ denotes the channel gain from terminal device $n_i$ to edge server $s_j$, determined by the channel power gain $\rho_{ij}$ at the reference distance $d_0 = 1$ m and the distance $d_{ij}$ between terminal device $n_i$ and edge server $s_j$; $I_j$ represents the interference to the current terminal device $n_i$ caused by the signals of other terminal devices transmitting tasks to the same edge server $s_j$; $N_0$ represents the Gaussian noise power;
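This rate is the standard Shannon-capacity form; a direct transcription:

```python
import math

def transmission_rate(W_ij, p_i, g_ij, interference, N0):
    """Uplink rate R_ij (bits/s) between device n_i and edge server s_j
    under OFDMA, using the quantities defined above."""
    sinr = p_i * g_ij / (interference + N0)   # signal-to-interference-plus-noise
    return W_ij * math.log2(1.0 + sinr)
```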
The local processing delay refers to the time required to execute a task locally on the terminal device, and the edge server processing delay refers to the time required to transmit a task from the terminal device to an edge server and process it there; both are characterized using the following definitions;
Definition 1: $RT^{l}_{n,i}$ represents the ready time of task $v_{n,i}$ for local execution, and $RT^{s}_{n,i}$ represents the ready time of task $v_{n,i}$ on the edge server;
Definition 2: $FT^{l}_{n,i}$ denotes the completion time of task $v_{n,i}$ executed locally, and $FT^{s}_{n,i}$ denotes the completion time of task $v_{n,i}$ executed on the edge server;
First, the ready time $RT^{l}_{n,i}$ for task $v_{n,i}$ to be processed locally is introduced, indicating the time at which the results of all predecessor tasks have reached terminal device $n_i$; when a predecessor task $v_{n,j}$ is processed locally there is no intermediate-data transmission delay, and when the predecessor task $v_{n,j}$ is processed on an edge server the intermediate-data transmission delay is $db_{ii'} / R_{s,s'}$, where $db_{ii'}$ denotes the intermediate data size and $R_{s,s'}$ denotes the transmission rate between the terminal device and the server; the terminal device is configured with a set of CPU cores, each CPU core can execute only one task at a time, and a locally executed task starts at the earliest moment one of the user's CPU cores is idle; thus $RT^{l}_{n,i}$ is defined as:
$RT^{l}_{n,i} = \max_{v_{n,j} \in Pre(v_{n,i})} \left( FT_{n,j} + \mathbf{1}[\alpha_{n,j} \neq 0] \cdot \dfrac{db_{ji}}{R_{s,s'}} \right)$
where $Pre(v_{n,i})$ represents the predecessor task set of task $v_{n,i}$, and task $v_{n,i}$ can only start after its predecessor tasks are completed; the processing time of task $v_{n,i}$ in local execution is $b_{n,i} / f_i$, where $f_i$ indicates the capability of the terminal device to handle computing tasks and $RT^{l}_{n,i}$ is the local ready time; therefore the local completion time of task $v_{n,i}$ is:
$FT^{l}_{n,i} = RT^{l}_{n,i} + \dfrac{b_{n,i}}{f_i}$
When a task of the terminal device is offloaded to an edge server, a processing waiting delay exists at the edge server, and the task is processed once it has arrived and an idle CPU core is available; the transmission time of task $v_{n,i}$ from the terminal to the server is $d_{n,i} / R_{i,j}$; the ready time $RT^{s}_{n,i}$ of task $v_{n,i}$ on the edge server is defined as:
$RT^{s}_{n,i} = \max_{v_{n,j} \in Pre(v_{n,i})} FT_{n,j} + \dfrac{d_{n,i}}{R_{i,j}} + \rho_s$
where the CPU processing capability of the edge server is expressed as $f_s$ and $\rho_s$ represents the edge server processing waiting delay; the actual execution time of task $v_{n,i}$ on the edge server is $b_{n,i} / f_s$; the completion time of task $v_{n,i}$ on the edge server is therefore:
$FT^{s}_{n,i} = RT^{s}_{n,i} + \dfrac{b_{n,i}}{f_s}$
The total delay of the application is the local completion time of the virtual end-node task; denoting the virtual start task of application $G_m$ as $v_{m,0}$ and the virtual end task as $v_{m,I+1}$, the completion delay $t_m$ of the application is given as:
$t_m = FT^{l}_{m,I+1}$
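The delay recursion above can be sketched as follows, reusing the `application` structure from the earlier DAG example; CPU-core queueing and the server waiting delay $\rho_s$ are omitted for brevity, so this is a simplified model, not the full formulation.

```python
def completion_times(app, order, offload, f_local, f_edge, rate):
    """Finish times FT for every task of one application, walking the DAG
    in topological order; offload[i] == 0 means local execution."""
    FT = {}
    for i in order:
        t = app["tasks"][i]
        # ready time: all predecessor results available, plus intermediate
        # data transfer when the predecessor ran on the other side
        ready = 0.0
        for j in t["pre"]:
            transfer = t["db"][j] / rate if offload[j] != offload[i] else 0.0
            ready = max(ready, FT[j] + transfer)
        if offload[i] == 0:
            FT[i] = ready + t["b"] / f_local          # local execution
        else:
            up = t["d"] / rate                        # upload input data
            FT[i] = max(ready, up) + t["b"] / f_edge  # edge execution
    return FT
```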
5. The method for optimizing and offloading terminal energy efficiency based on attention multi-index ranking of claim 4, wherein the energy consumption model comprises local processing calculation energy consumption, terminal device to edge server transmission energy consumption, edge server to target server transmission energy consumption, and processing waiting energy consumption;
the local processing computation energy consumption $E^{l}_{n,i}$ is as follows:
$E^{l}_{n,i} = k_n f_i^2 b_{n,i}$
where $k_n$ represents the computation energy-consumption coefficient of the terminal device;
The transmission energy consumption $E^{tr}_{n,i}$ from the terminal device to the edge server is as follows:
$E^{tr}_{n,i} = p_i \, t^{tr}_{n,i}$
where $t^{tr}_{n,i}$ represents the transmission delay for the terminal device to transmit the task to the edge server;
The edge-server-to-target-server transmission energy consumption and the processing waiting energy consumption $E^{w}_{n,i}$ are as follows:
$E^{w}_{n,i} = p^{w}_{i} \, \rho_s$
where $p^{w}_{i}$ denotes the waiting power of the terminal device while the task executes on the MEC edge server;
Thus, the total energy consumption of the terminal device is as follows:
$E_{n} = \sum_{v_{n,i}} \left( \mathbf{1}[\alpha_{n,i} = 0] \, E^{l}_{n,i} + \mathbf{1}[\alpha_{n,i} \neq 0] \left( E^{tr}_{n,i} + E^{w}_{n,i} \right) \right)$
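A matching sketch of the device-side energy model, again reusing the earlier DAG structure; the coefficient `k_n` and the power values are illustrative assumptions.

```python
def device_energy(app, order, offload, f_local, f_edge, rate,
                  k_n=1e-27, p_tx=0.5, p_wait=0.1):
    """Total energy drawn from the terminal device: computation energy
    for local tasks, transmission plus waiting energy for offloaded ones.
    k_n, p_tx, and p_wait are illustrative values only."""
    total = 0.0
    for i in order:
        t = app["tasks"][i]
        if offload[i] == 0:
            # local computation energy: E = k_n * f^2 * b (CMOS model)
            total += k_n * (f_local ** 2) * t["b"]
        else:
            tx_delay = t["d"] / rate    # upload input data
            wait = t["b"] / f_edge      # device idles while the server runs
            total += p_tx * tx_delay + p_wait * wait
    return total
```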