CN111400001A - Online computing-task offloading scheduling method for an edge computing environment - Google Patents

Online computing-task offloading scheduling method for an edge computing environment

Info

Publication number
CN111400001A
CN111400001A CN202010157564.1A
Authority
CN
China
Prior art keywords
calculation
computing
user equipment
task
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010157564.1A
Other languages
Chinese (zh)
Other versions
CN111400001B (en)
Inventor
郑四发
王桢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN202010157564.1A
Publication of CN111400001A
Application granted
Publication of CN111400001B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 - Multiprogramming arrangements
    • G06F 9/48 - Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4806 - Task transfer initiation or dispatching
    • G06F 9/4843 - Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881 - Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 - Multiprogramming arrangements
    • G06F 9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5061 - Partitioning or combining of resources
    • G06F 9/5072 - Grid computing
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 - Multiprogramming arrangements
    • G06F 9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5083 - Techniques for rebalancing the load in a distributed system
    • G06F 9/5088 - Techniques for rebalancing the load in a distributed system involving task migration
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 30/00 - Commerce
    • G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
    • G06Q 30/0201 - Market modelling; Market analysis; Collecting market data
    • G06Q 30/0206 - Price or cost determination based on market factors
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 30/00 - Commerce
    • G06Q 30/06 - Buying, selling or leasing transactions
    • G06Q 30/0645 - Rental transactions; Leasing transactions
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2209/00 - Indexing scheme relating to G06F9/00
    • G06F 2209/50 - Indexing scheme relating to G06F9/50
    • G06F 2209/502 - Proximity
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2209/00 - Indexing scheme relating to G06F9/00
    • G06F 2209/50 - Indexing scheme relating to G06F9/50
    • G06F 2209/508 - Monitor
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2209/00 - Indexing scheme relating to G06F9/00
    • G06F 2209/50 - Indexing scheme relating to G06F9/50
    • G06F 2209/509 - Offload
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Game Theory and Decision Science (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an online computing-task offloading scheduling method for an edge computing environment. The algorithm is based on a dynamic Cournot game model and comprises the following steps: each server periodically publishes a rental price for its computing resources; each application on each user device generates computing tasks at a certain frequency; when a computing task is generated, the user device uses the received price information and its historical information, together with a designed iterative algorithm, to compute the number of computing resources to rent, determines the payoff function value obtained by executing on different servers, selects the optimal scheduling mode in real time, and sends a computing-task request; the server receives the request, allocates computing resources in real time, and updates the rental price; after the computation finishes, the user device receives the result data and updates its historical information. This cycle repeats dynamically until Nash equilibrium is reached, achieving the global optimum. The invention has low computational load, low communication overhead, and strong real-time performance, and accommodates dynamic network environments and the differentiated quality-of-experience requirements of different applications.

Description

Online computing-task offloading scheduling method for an edge computing environment
Technical Field
The invention belongs to the technical field of edge computing optimization, and particularly relates to an online computing-task offloading scheduling method.
Background
With the development of mobile intelligent devices and the growing demand of users for application service quality, a new fundamental problem has emerged alongside the difficulty of algorithm development: the richer the applications running on mobile devices, the more computing resources those devices require, often under tight delay constraints, which places ever higher demands on mobile computing power.
In recent years, advances in communication technology and edge computing have offered a new way to relieve the computing pressure on mobile terminals. Workloads in autonomous driving, such as SLAM (Simultaneous Localization and Mapping), video detection, target recognition and tracking, speech recognition, and cooperative optimal control, and in travel services, such as AR (Augmented Reality) navigation and panorama synthesis, are delay-sensitive and computation-intensive, and are therefore well suited to edge computing systems.
Although edge computing is developing rapidly, many key technical problems remain to be researched and solved. The core of edge computing is distributing computing tasks, and the decision and scheduling problem of computation offloading is one of its key technologies: how to place an application module and allocate resources among different servers so as to avoid congestion of computing resources and achieve a balanced optimization of network performance and computing cost.
Existing research mostly adopts centralized optimization. For example, the heuristic algorithm in the service-request distribution method for edge computing environments disclosed by Zhejiang University (Chinese patent application publication No. CN108874525A) and the branch-and-bound method in the computation offloading method for multi-user edge-server scenarios disclosed by Harbin Engineering University (Chinese patent application publication No. CN110535700A) are both centralized algorithms and require an offline setting: the global input, including the task information offloaded by all mobile terminals, must be known before solving. In practice this incurs significant communication overhead and cannot adapt dynamically to changing mobile application demands. It is therefore necessary to develop an online computation-offloading scheduling method that adapts to different scenarios. Among current online methods, some divide time into slices, for example using a Markov chain to model discrete time slices (as in the paper "A mobile virtual scheduling scheme in content centralized networks") so that centralized optimization can still be applied, but these suffer from poor real-time performance. Others monitor network conditions in real time, such as the network-monitoring and table-lookup method in the edge-computing-based autonomous-driving service offloading method disclosed by Sun Yat-sen University (Chinese patent application publication No. CN110633138A), but such methods adapt poorly to highly time-varying network environments and diverse application requirements. There are also game-theoretic approaches that build an online offloading scheduling method on a potential game model, but these incur large communication overhead and heavy computation.
To meet the practical requirements of edge computing applications, online computation-offloading scheduling methods with strong real-time performance, adaptivity, and low communication overhead must be developed to provide good network service quality.
Disclosure of Invention
In view of the above, the present invention provides an online computing-task offloading scheduling method for an edge computing environment that offers strong real-time performance and low communication overhead and can adaptively allocate computing resources according to different application requirements and scenarios, thereby achieving a joint optimization of network performance and cost.
To this end, the invention adopts the following technical scheme.
the invention provides an online computing task unloading scheduling method facing to an edge computing environment, which is applied to a three-layer framework: the edge computing environment of the cloud server, the edge server and the user equipment is characterized in that the online computing task unloading and scheduling method is based on a dynamic Guno game model and comprises the following steps:
1) Each server periodically broadcasts its computing-resource rental price, set according to its current resource occupancy and a pricing strategy. Each user device within an edge server's communication range receives the price information and runs several applications; each application leases computing resources as an independent individual, corresponding to a distinct player in the game. Each application generates computing-task requests at a certain frequency.
2) When a user device generates a computing-task request, it uses the received rental-price information and its own task-execution history to compute the number of resources to request, applying a gradient-descent iterative algorithm with adaptive learning rate and taking a combined index of end-to-end delay and cost as the optimization target. It estimates the end-to-end delay of the task with a queuing-theory model and evaluates the corresponding payoff function, thereby selecting the computation-offloading mode that maximizes its own benefit and determining the optimal offloading scheduling strategy. If offloading is required, the device sends a computing-task request packet containing the computation data and control information with the number of resources to rent, and step 3) is executed. If offloading is not required, the device executes the task locally, updates the computation result and task history after completion, returns to step 1), and continues with the next round of offloading scheduling.
3) The server receives the device's computing-task request, constructs a virtual machine to serve the task using an online dynamic bin-packing algorithm, allocates the corresponding physical-host computing resources, and executes the task; the application corresponding to each computing task has only one virtual machine at any time. The server updates its resource occupancy in real time and, following the pricing strategy of step 1), updates and broadcasts the resource price.
4) After the task completes, the server returns a result packet; the user device updates the computation result and task history and returns to step 1) for the next round of offloading scheduling. This dynamic cycle continues until Nash equilibrium is reached.
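Steps 1)-4) can be sketched as a single loop. The sketch below is illustrative only: the linear pricing rule, fixed demand, and simple price-budget decision rule are assumptions for demonstration, not the patent's actual algorithm.

```python
def run_rounds(n_rounds, base_price=1.0, slope=0.1, demand=2.0):
    """Toy skeleton of the scheduling loop in steps 1)-4): the server
    broadcasts a price derived from its occupancy (step 1), the device
    decides to offload only if the price is within its budget (step 2),
    the server allocates resources and reprices (step 3), and the device
    records the outcome as history (step 4). All rules are illustrative."""
    occupancy, history = 0.0, []
    budget = 1.5                                     # assumed device budget
    for _ in range(n_rounds):
        price = base_price + slope * occupancy       # step 1: broadcast price
        offload = price <= budget                    # step 2: offload decision
        if offload:
            occupancy += demand                      # step 3: allocate resources
        history.append((price, offload))             # step 4: update history
    return history
```

As offloading raises occupancy, the broadcast price climbs until the device stops offloading, which is the feedback through which the game settles toward an equilibrium.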
The invention has the following characteristics and beneficial effects:
The invention has the following characteristics and beneficial effects. It is an online algorithm: with n user devices, l applications, and e servers, the time complexity is O(nle), the computation required being the iteration over resource rental quantities, and the space complexity is O(nle), the stored data being historical task information, so the computational complexity is low. Few control instructions are needed: the scheduling decision algorithm runs on the user device, whose only external input is the periodically broadcast price, so the communication overhead is small. The iterative function converges quickly to a stable state in different scenarios, giving good real-time behavior. Through the iterative algorithm, the offloading decisions reach Nash equilibrium, so every user obtains a balanced optimization of performance and cost, achieving global optimization. Different applications' requirements on performance and cost are expressed through the definition of the payoff function, so each application is allocated the most reasonable computing resources, optimizing system performance; the method is thus widely applicable.
Drawings
FIG. 1 is an overall flow diagram of an embodiment of the present invention;
FIG. 2 is a schematic diagram of an application scenario of an embodiment of the present invention;
FIG. 3 is a schematic diagram of the practical effect of an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the detailed description and specific examples are intended for purposes of illustration only and are not intended to limit the scope of the invention.
For better understanding of the present invention, an application example of the online computing-task offloading scheduling method for an edge computing environment is described in detail below, taking an edge computing environment for vehicular applications as an example.
Referring to FIG. 1, the present invention provides an online computing-task offloading scheduling method applied to a three-layer architecture: an edge computing environment comprising a cloud server, several edge servers, and several user devices. The cloud server is connected to each edge server through a wired network, and each edge server is connected through a wireless network to the user devices (mobile devices) within its communication range. User devices have weak computing power. Edge servers have stronger computing power, usable by the devices connected to them, with low network delay; the cloud server has the most computing power but the largest network delay to the user devices. A user device can process a computing task with its local resources, or, at a certain cost, with the resources of a connected edge server or the cloud server, the cost being the rental fee paid to the server. Servers set rental prices to maximize their benefit, user devices select the resource-leasing strategy that maximizes their own benefit given those prices, and both sides continually adjust their strategies in the dynamic network environment, forming a dynamic Cournot game model.
Based on this dynamic Cournot game model, each server sets a computing-resource price from its current occupancy and pricing strategy and broadcasts it periodically, and the user devices within each server's communication range receive the price information and generate computing-task requests. Each application of each user device leases computing resources as an independent individual. When a computing task is generated, the application uses the received price information and its own historical execution information, including the amounts of resources previously rented and the actual end-to-end delays and payoff values obtained, to compute the number of resources needed via a gradient-descent iterative algorithm with adaptive learning rate, thereby selecting the offloading mode that maximizes its benefit; if it offloads, it sends a computing-task request to the corresponding server. The server receives the request, constructs a virtual machine to serve the task with an online dynamic bin-packing algorithm, allocates the corresponding physical-host resources, updates its occupancy, and updates and broadcasts the price according to the pricing strategy. The user device receives the computation result, updates its history, and proceeds to the next iteration. This repeats dynamically until Nash equilibrium is finally reached, realizing the overall optimization target of system performance and cost.
The embodiment of the invention is applied to the connected-vehicle edge computing environment shown in FIG. 2, where each connected vehicle (reference numerals 1-3 in the figure) acts as a user device: it drives into the coverage of the edge servers, connects to edge server 6 or edge server 7 and cloud server 8 through communication node 4 and communication node 5 by wire or wirelessly, and receives the servers' computing-resource price information. For the connected vehicles {v1, v2, v3}, the available computing resources include local on-board computing resources, edge-server computing resources, and cloud-server computing resources {s1, s2, s3}. All servers and vehicles support the proposed online computation-offloading scheduling method through a common protocol. The vehicles run different applications a1, a2, …, al with different computation and delay requirements, where l is the total number of applications loaded on a vehicle. The online computing-task offloading scheduling method comprises the following steps:
1) The cloud server 8, edge server 6, and edge server 7 set a computing-resource rental price according to their current resource occupancy and a pricing strategy, and periodically broadcast it to vehicles 1-3 through communication nodes 4 and 5. The resource occupancy may include, but is not limited to, the occupancy of the server's CPU, memory, and the like. The pricing strategy is positively correlated with the server's resource occupancy to prevent congestion of computing tasks, and can be expressed by the following model:

[pricing model - equation available only as an image in the source]

where p_uvi is the rental price offered by server s_v for application a_i of user device v_u, with v = 1, 2, …, e (e the total number of servers), u = 1, 2, …, n (n the number of user devices within the servers' communication range), and i = 1, 2, …, l (l the number of applications loaded on device v_u); q_uvi is the amount of computing resource, in units of the basic resource f_b, that application a_i of device v_u requests from server s_v, so that Σ_{u,i} q_uvi is the total resource occupancy of server s_v; x_uvi is the minimum (base) price charged by server s_v to application a_i of device v_u; and y_uvi and z > 0 set the rate at which the price rises with occupancy. x_uvi, y_uvi, and z are set by the service operator: the larger x_uvi, the higher the base price; the larger y_uvi and z, the faster the price changes with the lease rate. The pricing strategy is designed so that no user device can occupy a large share of resources for its own benefit at the expense of other devices' service quality.
The determined rental price is transmitted to the user devices by broadcast, typically at a frequency of 10 Hz.
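The pricing equation itself is reproduced only as an image in the source. As a rough sketch, a rule with the stated properties, a base price x_uvi plus a term that grows with total occupancy at a rate set by y_uvi and exponent z, could look like the following; the exact functional form is an assumption, only the monotonicity is taken from the text.

```python
def rental_price(x_uvi, y_uvi, z, total_occupancy):
    """Illustrative pricing rule: base price plus a term that rises with
    total server occupancy (sum of q_uvi over all users and applications).
    The patent's exact formula is not reproduced here; this sketch only
    preserves the stated properties (minimum price x_uvi, growth rate
    controlled by y_uvi and exponent z > 0)."""
    assert z > 0
    return x_uvi + y_uvi * total_occupancy ** z
```

At zero occupancy the price reduces to the base price x_uvi, and it rises monotonically as Σ q_uvi grows, which is the congestion-prevention property the pricing strategy requires.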
Each user device v_u runs different applications a_i, and each application of each device leases computing resources as an independent individual, corresponding to a distinct player in the game model. The computing tasks of each application have the attribute tuple ⟨λ_ui, d_ui, c_ui⟩, where λ_ui is the generation frequency of the computing tasks of application a_i on device v_u, d_ui is the data volume that must be transferred for such a task, and c_ui is the amount of computation the task requires. Each application of device v_u generates computing tasks at Poisson-distributed time intervals with mean determined by λ_ui.
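The task model above can be sketched as a Poisson arrival process. The helper below is illustrative; the names `Task` and `task_arrivals` are not from the patent, and λ_ui is interpreted as the arrival rate (exponential inter-arrival times with mean 1/λ_ui).

```python
import random
from dataclasses import dataclass

@dataclass
class Task:
    data_size: float  # d_ui: data volume to transfer for the task
    compute: float    # c_ui: computation amount the task requires

def task_arrivals(lam_ui, d_ui, c_ui, horizon, seed=0):
    """Generate (arrival_time, Task) pairs as a Poisson process of rate
    lam_ui over [0, horizon]: inter-arrival times are drawn from an
    exponential distribution with mean 1/lam_ui."""
    rng = random.Random(seed)
    t, out = 0.0, []
    while True:
        t += rng.expovariate(lam_ui)
        if t > horizon:
            return out
        out.append((t, Task(d_ui, c_ui)))
```

Each generated task carries the ⟨λ_ui, d_ui, c_ui⟩ attributes needed later for the delay estimate and the payoff evaluation.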
2) When a user device generates a computing-task request, it draws on the received rental-price information and its stored history of previous tasks, including the historical resource rental quantities and payoff values. Taking a combined index of end-to-end delay and cost as the optimization target, it computes the number of resources to request with a gradient-descent iterative algorithm with adaptive learning rate, estimates the end-to-end delay (computation delay plus communication delay) with a queuing-theory model, and evaluates the corresponding payoff function, thereby selecting the offloading mode that maximizes its own benefit and determining the optimal offloading scheduling strategy. If, under this strategy, executing on an edge server or the cloud server yields a larger payoff value, offloading is needed: the device sends a computing-task request packet containing the computation data and control information with the number of resources to rent, and step 3) is executed. If offloading is not needed, the device executes the task locally, updates the computation result and task history after completion, returns to step 1), and continues with the next round of offloading scheduling.
For each application of user device v_u, i.e., each game player, the payoff function W_ui over all servers is:

[payoff function W_ui - equation available only as an image in the source]

where τ_uvi indicates whether application a_i of device v_u executes on server s_v: τ_uvi = 1 if it executes there, otherwise τ_uvi = 0, subject to

Σ_{v=1}^{e} τ_uvi = 1,

i.e., exactly one server is selected to execute a given computing task; T_uvi is the end-to-end delay of the task of application a_i of device v_u executed on server s_v; p_uvi q_uvi is the rental cost paid by application a_i of device v_u to server s_v; and α_ui, β_ui, γ_ui are weighting factors. The larger α_ui and β_ui, the stricter the delay requirement; the larger γ_ui, the more sensitive the application is to the rental price. The extremum of the payoff function represents the best balance between network performance and cost of use. The weighting factors α_ui, β_ui, γ_ui differ across applications and reflect different performance requirements: a delay-sensitive application such as autonomous-vehicle control would set larger α_ui, β_ui and a smaller γ_ui, while a more price-sensitive entertainment application would set a larger γ_ui.
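Because W_ui is reproduced only as an image, the sketch below uses an assumed illustrative form, a fixed reward weighted by α_ui minus a delay penalty (β_ui) and a rental-cost penalty (γ_ui), purely to show how an application would pick the server that receives τ_uvi = 1.

```python
def payoff(alpha, beta, gamma, delay, price, q):
    """Illustrative payoff: reward weighted by alpha, minus a delay
    penalty scaled by beta and a rental-cost penalty scaled by gamma.
    The patent's exact W_ui is not reproduced from the image; only the
    roles of the weights are taken from the text."""
    return alpha - beta * delay - gamma * price * q

def choose_server(candidates, alpha, beta, gamma):
    """candidates: {server_name: (delay, price, q)}. Returns the server
    that maximizes the payoff, i.e. the one assigned tau_uvi = 1; the
    device's own computing unit can appear as a 'local' candidate."""
    return max(candidates,
               key=lambda s: payoff(alpha, beta, gamma, *candidates[s]))
```

Raising γ flips a price-sensitive application from a fast-but-paid edge server back to free local execution, mirroring the α/β versus γ trade-off described above.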
Through the rental-price mechanism, the payoff functions, i.e., the earnings, of the user devices influence one another; the state in which every user device's strategy attains its maximum payoff value is the Nash equilibrium of the dynamic Cournot game model. To let every user device reach Nash equilibrium with only limited information interaction, the number of requested computing resources is computed iteratively with a gradient-based algorithm with adaptive learning rate, using the received price information and the device's own historical information, and the iteration converges to Nash equilibrium. The update takes the form

q_uvi(k+1) = q_uvi(k) + s_ui · q_uvi(k) · ∂W_ui/∂q_uvi,

where k is the iteration index, s_ui is the learning-rate factor of application a_i of device v_u, and s_ui q_uvi(k) is the adaptive learning rate computed at iteration k, which accelerates convergence. Given the actual communication overhead, under limited information interaction the partial derivative ∂W_ui/∂q_uvi is difficult to compute exactly, so it is approximated by a finite difference, yielding an iterative formula that uses historical information:

[finite-difference iterative formulas - equations available only as images in the source]

where Q_{-i}(k-1) denotes the computing-resource leasing strategies of the other applications (all applications other than a_i of device v_u, on this device and on other user devices), and Q_{-i}(k) ∪ {q_uvi(k)} is the full set of player strategies. Whenever a computing task is generated, the device combines its historical information with this iterative algorithm to compute the number of computing resources to rent.
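The update and finite-difference formulas are available only as images, so the sketch below is an assumed reconstruction of the described scheme: gradient ascent on the payoff with adaptive learning rate s_ui·q(k), the derivative approximated from the last two observed (q, W) pairs in the device's history.

```python
def next_request(q_hist, w_hist, s_ui, q_min=1e-3):
    """One step of the adaptive-learning-rate iteration described in the
    text: approximate dW/dq by the finite difference of the two most
    recent (q, W) observations, then take a gradient-ascent step with
    learning rate s_ui * q(k). Assumed reconstruction; the patent's
    exact formulas are reproduced only as images."""
    q_k, q_prev = q_hist[-1], q_hist[-2]
    w_k, w_prev = w_hist[-1], w_hist[-2]
    if q_k == q_prev:                     # no change: derivative unavailable
        return q_k
    grad = (w_k - w_prev) / (q_k - q_prev)
    return max(q_min, q_k + s_ui * q_k * grad)
```

On a concave toy payoff such as W(q) = 4q - q², repeated application climbs to the maximizer q = 2, illustrating the claimed convergence to a stable point using only historical information.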
When making the offloading decision, the end-to-end delay T_uvi must be estimated. It is divided into a communication delay T^comm_uvi and a computation delay T^comp_uvi:

T_uvi = T^comm_uvi + T^comp_uvi.

The computation delay T^comp_uvi can be estimated with an M/G/1 queuing model, according to server s_v's processing logic for the computing tasks of application a_i requested by device v_u:

[M/G/1 computation-delay formula - available only as an image in the source]

where μ_ui is the processing rate of one unit of computing resource f_b for the computing tasks of application a_i of device v_u, its value determined by c_ui/f_b, and ρ is the service intensity of the M/G/1 model, computed by the standard formula.
The communication delay T_uvi^comm divides into a fixed connection delay and a part positively correlated with the transmitted data volume; its calculation formula is:

T_uvi^comm = T_uv^fix + d_ui / r_uv

where r_uv is the equivalent transmission rate between user equipment v_u and server s_v, and T_uv^fix is their fixed connection delay; both r_uv and T_uv^fix can be dynamically updated according to network conditions.
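Combining the two parts gives a minimal delay-estimation sketch; all numeric values below are assumed for illustration:

```python
def end_to_end_delay(t_fix, d_ui, r_uv, t_comp):
    """Estimated end-to-end delay T_uvi = T_comm + T_comp, where the
    communication part is a fixed connection delay plus a transfer time
    proportional to the data volume: T_comm = t_fix + d_ui / r_uv."""
    t_comm = t_fix + d_ui / r_uv
    return t_comm + t_comp

# assumed values: 10 ms fixed delay, 2 MB of task data, 10 MB/s equivalent
# rate, 150 ms computation delay from the queuing model
t = end_to_end_delay(t_fix=0.010, d_ui=2.0, r_uv=10.0, t_comp=0.150)
print(round(t, 3))  # 0.36 seconds
```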
Using the numbers of computing resources q_uvi obtained by the iterative method, the payoff function is evaluated for execution on each of the different servers, and the offloading scheme that maximizes the payoff function is selected, i.e. τ_uvi is set, which determines the computation offloading scheduling strategy. The computing unit of the user equipment itself also counts as a server. If the user equipment chooses to offload according to the scheduling result, communication node 4 and communication node 5 transmit data packets, the communication network delivers the computing task to the target server, and step 3) is executed; the data packet contains the data required by the computing task and control information specifying the amount of computing resources to rent. If local computation is chosen, the computing unit of the user equipment performs the computation; after the computing task is completed, the user equipment updates the computation result and the computing task history, returns to step 1), and continues with the next computing task offloading schedule.
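The selection step can be sketched as follows; the simplified payoff form `-alpha*delay - gamma*price*q`, the server names, and the numbers are illustrative assumptions, since the patent's payoff W_ui is given by a formula with per-application weights not reproduced here:

```python
def choose_server(candidates, alpha, gamma):
    """Pick the offloading target (i.e. set tau_uvi) that maximizes an
    assumed payoff of the form -alpha*delay - gamma*price*q.
    `candidates` maps server id -> (estimated_delay, price, q_rented);
    the local computing unit is listed as a server too."""
    best_server, best_payoff = None, float("-inf")
    for server, (delay, price, q) in candidates.items():
        payoff = -alpha * delay - gamma * price * q
        if payoff > best_payoff:
            best_server, best_payoff = server, payoff
    return best_server, best_payoff

servers = {
    "local":  (0.500, 0.0, 0.0),   # slow but free
    "edge-1": (0.120, 0.8, 2.0),   # fast, moderate rent
    "cloud":  (0.300, 0.3, 2.0),   # cheaper, farther away
}
print(choose_server(servers, alpha=10.0, gamma=1.0)[0])  # edge-1
```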
3) When the edge server 6, edge server 7, or cloud server 8 receives a computing task request from the corresponding user equipment, it first determines whether a virtual machine is already executing for that computing task; if so, the computing task joins a waiting queue. If no virtual machine corresponds to the computing task, or a waiting virtual machine has completed its computation, the server allocates computing resources to the task, in the requested quantity f_b·q_uvi. The allocation uses an online First Fit dynamic bin-packing algorithm, placing the task on the first physical host that satisfies its computing resource requirement; if no physical host satisfies the requirement, the task joins a first-in-first-out queue to wait. The server then updates its computing resource occupancy, updates p_uvi according to the pricing strategy adopted in step 1), and periodically publishes the latest price for the next round of each user equipment's iterative algorithm.
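A minimal sketch of the online First Fit placement used in this step; the host capacities are assumed values:

```python
def first_fit(hosts_free, request):
    """Online First Fit placement: scan physical hosts in order and put
    the request (f_b * q_uvi resource units) on the first host with
    enough free capacity; return the host index, or None to signal that
    the task must wait in a FIFO queue."""
    for idx, free in enumerate(hosts_free):
        if free >= request:
            hosts_free[idx] -= request   # reserve the resources
            return idx
    return None                          # no host fits: enqueue the task

hosts = [4.0, 8.0, 16.0]        # free capacity per physical host (assumed)
print(first_fit(hosts, 6.0))    # 1  (host 0 is too small, host 1 fits)
print(first_fit(hosts, 3.0))    # 0
print(first_fit(hosts, 20.0))   # None -> task waits in the FIFO queue
```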
4) After the server completes the computing task request and returns the result, the user equipment obtains the real end-to-end delay T_uvi. This value is stored in the historical information for the next run of the iterative algorithm and is used to update the network state information r_uv and T_uv^fix, improving the accuracy of the delay estimate. Returning to step 1), the next cycle begins, forming a dynamically repeated game. Eventually the resource request strategy of each user equipment reaches a Nash equilibrium, so that each application obtains its optimal payoff value, balancing cost and performance, and a global optimum is formed. As shown in fig. 3, all applications can find their optimal computing resource renting and offloading strategies according to their different requirements and can remain stable. The method also re-converges quickly when user equipment moves, the topology changes, or the network environment changes.
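The convergence of repeated play can be illustrated with the simplest dynamic Cournot game, the model family this scheme is based on; the demand and cost parameters here are textbook values, not from the patent:

```python
def best_response(q_other, a=10.0, c=1.0):
    """Best response in a two-player textbook Cournot game with inverse
    demand P = a - (q1 + q2) and unit cost c: q_i = (a - c - q_other)/2.
    Used only to illustrate how repeated play converges to a Nash
    equilibrium, as the repeated rent-and-offload cycle does here."""
    return max((a - c - q_other) / 2.0, 0.0)

q1, q2 = 0.0, 0.0
for _ in range(50):                  # the dynamically repeated game
    q1, q2 = best_response(q2), best_response(q1)
print(round(q1, 4), round(q2, 4))    # 3.0 3.0, the Nash quantity (a-c)/3
```

The oscillating best-response dynamics damp out geometrically, so neither player can improve by deviating once the fixed point is reached.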
The present invention and its embodiments have been described above schematically, and the description is not limiting; what is shown in the drawings is only one embodiment of the present invention, and the invention is not actually limited thereto. Therefore, if those skilled in the art, having received its teaching, devise similar arrangements and embodiments without departing from the spirit of the invention, these shall fall within the scope of protection of the present invention.

Claims (6)

1. An online computing task offloading scheduling method oriented to an edge computing environment, applied to an edge computing environment with a three-layer architecture of cloud server, edge server, and user equipment, characterized in that the online computing task offloading scheduling method is based on a dynamic Cournot game model and comprises the following steps:
1) each server periodically broadcasts its computing resource rental price information, the rental price being set from the server's current computing resource occupancy according to a computing resource pricing strategy; each user equipment within communication range of an edge server receives the rental price information and runs multiple applications, each application renting computing resources as an independent individual and corresponding to a different player in the game; each application generates computing task requests at a certain frequency;
2) when user equipment generates a computing task request, the user equipment uses a gradient descent iterative algorithm with adaptive learning rate, taking a combined index of end-to-end delay and cost as the optimization target, to calculate the number of computing resources to request, according to the received computing resource rental price information and its own computing task execution history; it estimates the end-to-end delay of the computing task with a queuing theory model and calculates the corresponding payoff function, thereby selecting the computation offloading mode that maximizes its own benefit and determining the optimal offloading scheduling strategy; if computation offloading is needed, the user equipment sends a computing task request data packet, containing the computation data and control information with the number of computing resources to rent, and step 3) is executed; if computation offloading is not needed, the user equipment executes the computing task locally, updates the computation result and the computing task history after the task is completed, returns to step 1), and continues with the next computing task offloading schedule;
3) a server receives the computing task request of the user equipment, constructs a virtual machine serving the computing task using an online dynamic bin-packing algorithm, allocates the corresponding physical host computing resources, and then executes the computing task; the application corresponding to each computing task has only one virtual machine at any time; the server updates its own computing resource occupancy in real time, updates the computing resource price according to the computing resource pricing strategy used in step 1), and broadcasts it;
4) after the computing task is completed, the server returns a computation result data packet, the user equipment updates the computation result and the computing task history and returns to step 1) to continue with the next computing task offloading schedule; the cycle repeats dynamically and a Nash equilibrium is reached.
2. The on-line computing task offload scheduling method of claim 1, wherein: in step 1), the computing resource occupancy comprises the CPU and memory occupancy of the server; the computing resource pricing strategy is positively correlated with the computing resource occupancy of the server and is expressed by the following model:
Figure FDA0002404625720000011
where p_uvi is the rental price charged by server s_v to application a_i of user equipment v_u; v = 1, 2, …, e, with e the total number of servers; u = 1, 2, …, n, with n the total number of user equipment within the server's communication range; i = 1, 2, …, l, with l the total number of applications loaded on user equipment v_u; q_uvi is the amount of computing resource that application a_i of user equipment v_u applies for from server s_v, in units of the basic computing resource f_b; Σ_{u,i} q_uvi is the total computing resource occupancy of server s_v; x_uvi is the minimum resource price charged by server s_v to application a_i of user equipment v_u; y_uvi, z > 0 represent the proportion and rate at which the computing resource price rises with the computing resource occupancy;
the server sends the computing resource lease price to the user equipment in a broadcast mode;
each application a_i of user equipment v_u has the attribute tuple ⟨λ_ui, d_ui, c_ui⟩, where λ_ui is the generation frequency of the computing tasks of application a_i on user equipment v_u, d_ui is the data volume to be transmitted for such a computing task, and c_ui is the amount of computation the computing task requires; each application of user equipment v_u generates computing tasks with Poisson-distributed time intervals of mean 1/λ_ui.
3. The on-line computing task offload scheduling method of claim 2, wherein: in step 2), for each application a_i of user equipment v_u, i.e. the corresponding game player, the payoff function W_ui over all servers is:
Figure FDA0002404625720000021
where τ_uvi indicates whether the computing task of application a_i of user equipment v_u is executed on server s_v: τ_uvi = 1 if it is, otherwise τ_uvi = 0, with Σ_{v=1}^{e} τ_uvi = 1, since only one server can be selected to execute a computing task; T_uvi is the end-to-end delay of the computing task of application a_i of user equipment v_u executed on server s_v; p_uvi·q_uvi is the rental cost that application a_i of user equipment v_u pays to server s_v; α_ui, β_ui, γ_ui are weighting factors: the larger α_ui and β_ui, the stricter the delay requirement, and the larger γ_ui, the more sensitive the application is to the rental price;
according to the computing resource rental price information received by the user equipment and the computing task execution history it stores, a gradient descent iterative algorithm with adaptive learning rate calculates the number of computing resources that application a_i of user equipment v_u rents on server s_v, so that under limited information interaction the user equipment reach a Nash equilibrium state in which each user's resource request strategy obtains the maximum payoff value; the iterative algorithm is expressed as:

q_uvi(k+1) = q_uvi(k) + s_ui·q_uvi(k) · ∂W_ui/∂q_uvi

where k is the number of iterations, s_ui is the learning rate factor of application a_i of user equipment v_u, and s_ui·q_uvi(k) is the adaptive learning rate calculated at the k-th iteration;
the partial derivative ∂W_ui/∂q_uvi is difficult to compute under limited information interaction, so it is approximated by a finite difference, giving an iterative formula that uses historical information:

∂W_ui/∂q_uvi ≈ [W_ui(Q_{-i}(k) ∪ {q_uvi(k)}) − W_ui(Q_{-i}(k−1) ∪ {q_uvi(k−1)})] / [q_uvi(k) − q_uvi(k−1)]

q_uvi(k+1) = q_uvi(k) + s_ui·q_uvi(k) · [W_ui(Q_{-i}(k) ∪ {q_uvi(k)}) − W_ui(Q_{-i}(k−1) ∪ {q_uvi(k−1)})] / [q_uvi(k) − q_uvi(k−1)]

where Q_{-i}(k−1) is the set of computing resource leasing strategies of the other applications, and Q_{-i}(k) ∪ {q_uvi(k)} is the set of all player strategies;
when a computing task is generated, the number of computing resources to be rented is calculated iteratively according to the above iterative algorithm, using the computing task execution history of the user equipment;
in making the offloading decision, the end-to-end delay T_uvi is calculated according to the following formula:

T_uvi = T_uvi^comm + T_uvi^comp
where T_uvi^comp is the computation delay, estimated with an M/G/1 queuing theory model according to the processing logic by which server s_v handles the computing task requested by application a_i of user equipment v_u; its calculation expression is:
Figure FDA0002404625720000034
where μ_ui is the service time of one unit computing resource f_b for the computing task of application a_i of user equipment v_u, equal to c_ui/f_b, and ρ is the service intensity in the M/G/1 queuing theory;
T_uvi^comm is the communication delay, divided into a fixed connection delay and a part positively correlated with the transmitted data volume; its calculation formula is:

T_uvi^comm = T_uv^fix + d_ui / r_uv
where r_uv is the equivalent transmission rate between user equipment v_u and server s_v, and T_uv^fix is the fixed connection delay between them; r_uv and T_uv^fix are dynamically updated according to network conditions;
with the numbers of computing resources q_uvi obtained by the iterative method, the payoff function is evaluated for executing the computing task of application a_i of user equipment v_u on each of the different servers; the offloading scheme that maximizes the payoff function is selected, i.e. τ_uvi is determined, fixing the computation offloading scheduling strategy; local computation or offloaded computation is then selected according to the computation offloading scheduling strategy result.
4. The on-line computing task offload scheduling method of claim 1 or 2, wherein: in step 3), when receiving a computing task request from user equipment, the server first determines whether a virtual machine is already executing for the corresponding computing task; if so, the computing task joins a waiting queue; if no virtual machine corresponds to the application, or a waiting virtual machine has completed its computation, the server allocates computing resources to the computing task, in the requested quantity; the allocation method adopts an online First Fit dynamic bin-packing algorithm, placing the task on the first physical host that satisfies its computing resource requirement; if no physical host satisfies the computing resource requirement, the task joins a first-in-first-out queue to wait; the server then updates its computing resource occupancy, updates the computing resource rental price p_uvi according to the computing resource pricing strategy adopted in step 1), and periodically publishes the latest price for the next round of each user's iterative algorithm.
5. The on-line computing task offload scheduling method of claim 1, wherein: in step 4), after the server completes the computing task request and returns the computation result, the user equipment obtains the real end-to-end delay; this value is stored in the historical information, used by the iterative algorithm, and used to update the network state information.
6. The on-line computing task offload scheduling method of claim 1, wherein: the broadcast frequency is typically 10 Hz.
CN202010157564.1A 2020-03-09 2020-03-09 Online computing task unloading scheduling method facing edge computing environment Active CN111400001B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010157564.1A CN111400001B (en) 2020-03-09 2020-03-09 Online computing task unloading scheduling method facing edge computing environment


Publications (2)

Publication Number Publication Date
CN111400001A true CN111400001A (en) 2020-07-10
CN111400001B CN111400001B (en) 2022-09-23

Family

ID=71430580

Country Status (1)

Country Link
CN (1) CN111400001B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017067586A1 (en) * 2015-10-21 2017-04-27 Deutsche Telekom Ag Method and system for code offloading in mobile computing
CN107220118A (en) * 2017-06-01 2017-09-29 四川大学 Resource pricing is calculated in mobile cloud computing to study with task load migration strategy
CN108920279A (en) * 2018-07-13 2018-11-30 哈尔滨工业大学 A kind of mobile edge calculations task discharging method under multi-user scene
CN109672568A (en) * 2019-01-11 2019-04-23 南京邮电大学 A kind of method of the edge calculations network Green energy distribution and Coordination Pricing
CN110377353A (en) * 2019-05-21 2019-10-25 湖南大学 Calculating task uninstalling system and method
US20190372909A1 (en) * 2018-06-01 2019-12-05 Huawei Technologies Co., Ltd. Self-configuration of servers and services in a datacenter
CN110798849A (en) * 2019-10-10 2020-02-14 西北工业大学 Computing resource allocation and task unloading method for ultra-dense network edge computing

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114064261A (en) * 2020-08-07 2022-02-18 中国科学院沈阳自动化研究所 Multi-dimensional heterogeneous resource quantification method and device based on industrial edge computing system
CN112099932A (en) * 2020-09-16 2020-12-18 广东石油化工学院 Optimal pricing method and system for soft-hard deadline task offloading in edge computing
CN112256427A (en) * 2020-10-21 2021-01-22 北京人人云图信息技术有限公司 Large-scale resource rapid allocation device based on improved branch limit method
CN112256427B (en) * 2020-10-21 2024-04-05 北京人人云图信息技术有限公司 Large-scale resource rapid allocation device based on improved branch limit method
WO2022100534A1 (en) * 2020-11-12 2022-05-19 华为云计算技术有限公司 Virtual instance setting method and apparatus
KR102514798B1 (en) * 2020-12-21 2023-03-29 한국과학기술원 Computing system for quantitative pricing-based task offloading of iot terminal considering latency in mobile edge computing environment, and method thereof
KR20220089248A (en) * 2020-12-21 2022-06-28 한국과학기술원 Computing system for quantitative pricing-based task offloading of iot terminal considering latency in mobile edge computing environment, and method thereof
CN112995280A (en) * 2021-02-03 2021-06-18 北京邮电大学 Data distribution method and device for multi-content demand service
CN112995280B (en) * 2021-02-03 2022-04-22 北京邮电大学 Data distribution method and device for multi-content demand service
CN112887435A (en) * 2021-04-13 2021-06-01 中南大学 Method for improving task unloading cooperation rate in edge calculation
CN113282413A (en) * 2021-05-20 2021-08-20 南京航空航天大学 QoS demand self-adaptive resource allocation method in vehicle edge computing network
CN113282413B (en) * 2021-05-20 2024-03-05 南京航空航天大学 QoS demand self-adaptive resource allocation method in vehicle edge computing network
CN113346938A (en) * 2021-05-20 2021-09-03 天地信息网络有限公司 Edge computing resource fusion management method for air-space-ground integrated network
CN113346938B (en) * 2021-05-20 2024-08-02 天地信息网络有限公司 Edge computing resource fusion management method for space-time integrated network
CN114139730A (en) * 2021-06-30 2022-03-04 武汉大学 Dynamic pricing and deployment method for machine learning task in edge cloud network
CN114139730B (en) * 2021-06-30 2024-04-19 武汉大学 Dynamic pricing and deployment method for machine learning tasks in edge cloud network
CN114466023A (en) * 2022-03-07 2022-05-10 中南大学 Computing service dynamic pricing method and system for large-scale edge computing system
CN114745389A (en) * 2022-05-19 2022-07-12 电子科技大学 Computing offloading method for mobile edge computing system

Also Published As

Publication number Publication date
CN111400001B (en) 2022-09-23

Similar Documents

Publication Publication Date Title
CN111400001B (en) Online computing task unloading scheduling method facing edge computing environment
CN113950066B (en) Single server part calculation unloading method, system and equipment under mobile edge environment
CN109379727B (en) MEC-based task distributed unloading and cooperative execution scheme in Internet of vehicles
CN109656703A (en) A kind of mobile edge calculations auxiliary vehicle task discharging method
US9619292B2 (en) Resource placement in networked cloud based on resource constraints
CN112422644B (en) Method and system for unloading computing tasks, electronic device and storage medium
CN111711666B (en) Internet of vehicles cloud computing resource optimization method based on reinforcement learning
CN109947574B (en) Fog network-based vehicle big data calculation unloading method
CN111641973A (en) Load balancing method based on fog node cooperation in fog computing network
CN114116047B (en) V2I unloading method for vehicle-mounted computation intensive application based on reinforcement learning
CN112822050A (en) Method and apparatus for deploying network slices
CN108512772A Quality-of-service based data center's traffic scheduling method
CN110069341A (en) What binding function configured on demand has the dispatching method of dependence task in edge calculations
Boukerche et al. Vehicular cloud network: A new challenge for resource management based systems
CN112491964A (en) Mobile assisted edge calculation method, apparatus, medium, and device
EP4024212A1 (en) Method for scheduling interference workloads on edge network resources
Qi et al. Vehicular edge computing via deep reinforcement learning
CN111614754A (en) Fog-calculation-oriented cost-efficiency optimized dynamic self-adaptive task scheduling method
Al-Hilo et al. Vehicle-assisted RSU caching using deep reinforcement learning
CN116669111A (en) Mobile edge computing task unloading method based on blockchain
Hu et al. Dynamic task offloading in MEC-enabled IoT networks: A hybrid DDPG-D3QN approach
Ma et al. Truthful computation offloading mechanisms for edge computing
Meneguette et al. An efficient green-aware architecture for virtual machine migration in sustainable vehicular clouds
CN113938394A (en) Monitoring service bandwidth allocation method and device, electronic equipment and storage medium
CN113141634B (en) VR content caching method based on mobile edge computing network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant