CN113254188A

CN113254188A - Scheduling optimization method and device, electronic equipment and storage medium

Info

Publication number: CN113254188A
Application number: CN202110765005.3A
Authority: CN
Inventors: 任涛; 胡哲源; 谷宁波; 牛建伟; 胡舒程; 李青锋
Original assignee: Hangzhou Innovation Research Institute of Beihang University
Current assignee: Hangzhou Innovation Research Institute of Beihang University
Priority date: 2021-07-07
Filing date: 2021-07-07
Publication date: 2021-08-13
Anticipated expiration: 2041-07-07
Also published as: CN113254188B

Abstract

Embodiments of the present application provide a scheduling optimization method and apparatus, an electronic device, and a storage medium, and relate to the technical field of scheduling optimization. The scheduling optimization method is applied to an electronic device that is communicatively connected to a mobile edge computing network system comprising at least one base station, at least one unmanned aerial vehicle, and at least one mobile device. The method comprises: first, acquiring a task to be processed and current position information of the at least one mobile device; second, inputting the task to be processed and the current position information into a preset scheduling optimization model to obtain a scheduling policy; and then sending the scheduling policy to the at least one mobile device. In this way, the first task can be scheduled to the unmanned aerial vehicle for processing and the second task can be scheduled to the base station for processing, which solves the prior-art problem of low scheduling optimization efficiency caused by tasks being either all executed locally on the mobile device or all scheduled to the unmanned aerial vehicle or the base station for remote execution.

Description

Scheduling optimization method and device, electronic equipment and storage medium

Technical Field

The present application relates to the field of scheduling optimization technologies, and in particular, to a scheduling optimization method and apparatus, an electronic device, and a storage medium.

Background

In the field of drone-assisted mobile edge computing, appropriate decisions need to be made regarding how a drone schedules computational tasks in a mobile edge computing network (whether the computational tasks are executed locally at the mobile device or scheduled to be executed at the drone or base station) to achieve the desired performance.

However, the inventors have found that in the prior art, tasks are either all executed locally on the mobile device or all scheduled to the unmanned aerial vehicle or the base station for remote execution, so scheduling optimization suffers from low efficiency.

Disclosure of Invention

In view of the above, an object of the present application is to provide a scheduling optimization method and apparatus, an electronic device, and a storage medium, so as to solve the problems in the prior art.

In order to achieve the above purpose, the embodiment of the present application adopts the following technical solutions:

in a first aspect, the present invention provides a scheduling optimization method, applied to an electronic device, where the electronic device is in communication connection with a mobile edge computing network system, where the mobile edge computing network system includes at least one base station, an unmanned aerial vehicle, and a mobile device, and the scheduling optimization method includes:

acquiring a task to be processed and current position information of the at least one mobile device, wherein the task to be processed comprises a first task and a second task;

inputting the task to be processed and the current position information into a preset scheduling optimization model to obtain a scheduling strategy, wherein the scheduling optimization model is obtained by training based on the established initial model;

and sending the scheduling policy to the at least one mobile device, so that the at least one mobile device sends the first task to the at least one unmanned aerial vehicle for processing based on the scheduling policy, and forwards the second task to the at least one base station for processing through the at least one unmanned aerial vehicle.

In an optional embodiment, the scheduling optimization method further includes a step of obtaining a scheduling optimization model, where the step includes:

establishing an initial model and an optimization objective function according to the initial parameters of the mobile edge computing network system;

and training the initial model according to the optimization objective function to obtain a scheduling optimization model.

In an optional embodiment, the step of establishing an initial model and an optimization objective function according to initial parameters of the mobile edge computing network system includes:

establishing an initial model according to initial parameters of the at least one base station, the unmanned aerial vehicle and the mobile equipment;

and establishing an optimization objective function according to the initial model.

In an optional embodiment, the scheduling optimization model includes an unmanned aerial vehicle trajectory planning model, a calculation task joint scheduling model, and a resource allocation model, and the step of training the initial model according to the optimization objective function to obtain the scheduling optimization model includes:

splitting the optimization objective function to obtain a first optimization objective function, a second optimization objective function and a third optimization objective function;

training the initial model according to the first optimization objective function to obtain the unmanned aerial vehicle trajectory planning model, training the initial model according to the second optimization objective function to obtain the calculation task joint scheduling model, and training the initial model according to the third optimization objective function to obtain the resource allocation model.

In an optional embodiment, the step of inputting the to-be-processed task and the current location information into a preset scheduling optimization model to obtain a scheduling policy includes:

inputting the current position information into the unmanned aerial vehicle trajectory planning model, and calculating to obtain the predicted position information of the at least one mobile device;

inputting the task to be processed and the predicted position information into the calculation task joint scheduling model, and calculating to obtain a task scheduling decision variable of the at least one mobile device;

and inputting the task to be processed and the task scheduling decision variable into the resource allocation model, and calculating to obtain a scheduling strategy.
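The three-stage data flow described above can be sketched as follows. This is a minimal illustration only: the function name `build_scheduling_policy`, the dictionary-based task format and the three `demo_*` stand-in models are hypothetical and are not taken from the application; in practice the three callables would be the trained trajectory planning, joint scheduling and resource allocation models.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Position = Tuple[float, float]
Task = Dict[str, float]


def build_scheduling_policy(
    tasks: Sequence[Task],
    current_positions: Sequence[Position],
    trajectory_model: Callable[[Sequence[Position]], List[Position]],
    joint_scheduling_model: Callable[[Sequence[Task], List[Position]], List[int]],
    resource_allocation_model: Callable[[Sequence[Task], List[int]], Dict[str, list]],
) -> Dict[str, list]:
    """Chain the three sub-models in the order described in the text."""
    # Stage 1: current positions -> predicted positions.
    predicted_positions = trajectory_model(current_positions)
    # Stage 2: tasks + predicted positions -> task scheduling decision variables.
    decisions = joint_scheduling_model(tasks, predicted_positions)
    # Stage 3: tasks + decision variables -> scheduling policy.
    return resource_allocation_model(tasks, decisions)


# Hypothetical stand-ins for the trained sub-models (illustration only).
def demo_trajectory(positions):
    return [(x + 1.0, y) for x, y in positions]  # pretend each device moves +1 in x


def demo_scheduling(tasks, predicted):
    return [1 if t["bits"] > 1000 else 0 for t in tasks]  # 1 = offload, 0 = local


def demo_allocation(tasks, decisions):
    return {"decisions": decisions,
            "cpu_share": [0.5 if d else 1.0 for d in decisions]}


policy = build_scheduling_policy(
    tasks=[{"bits": 500.0}, {"bits": 4000.0}],
    current_positions=[(0.0, 0.0), (3.0, 4.0)],
    trajectory_model=demo_trajectory,
    joint_scheduling_model=demo_scheduling,
    resource_allocation_model=demo_allocation,
)
```

Passing the sub-models as callables keeps the chaining logic independent of how each model is trained.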

In an optional embodiment, the step of inputting the current location information into the unmanned aerial vehicle trajectory planning model and calculating the predicted location information of the at least one mobile device includes:

performing motion prediction processing according to the current position information to obtain next position information of the at least one mobile device;

and clustering the next position information of the at least one mobile device to obtain the predicted position information.
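The clustering step can be illustrated with a small fuzzy c-means (FCM) routine, since the drawings refer to an FCM-based clustering algorithm for mobile devices. The sketch below is a generic textbook FCM written with the standard library only; it is not the application's own implementation. The motion prediction step is not modelled here: the next positions are simply supplied as input, and the function name, fuzzifier value and iteration count are assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def fuzzy_c_means(points: Sequence[Point], c: int, m: float = 2.0,
                  iters: int = 100, seed: int = 0):
    """Return (centers, memberships) for standard fuzzy c-means, fuzzifier m > 1."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers: List[Point] = []
    power = 2.0 / (m - 1.0)
    for _ in range(iters):
        # Center update: mean of the points weighted by membership**m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(len(points))]
            w_sum = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(len(points))) / w_sum
                for d in range(dim)))
        # Membership update from the relative distances to every center.
        for i, p in enumerate(points):
            dist = [max(math.dist(p, ctr), 1e-12) for ctr in centers]
            u[i] = [1.0 / sum((dist[j] / dk) ** power for dk in dist)
                    for j in range(c)]
    return centers, u


# Hypothetical predicted next positions of six mobile devices (two groups).
next_positions = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
                  (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, memberships = fuzzy_c_means(next_positions, c=2)
```

Each row of `memberships` sums to 1, and the cluster centers can then serve as representative positions for groups of nearby devices.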

In an optional embodiment, the step of inputting the task to be processed and the predicted position information into the calculation task joint scheduling model and calculating a task scheduling decision variable of the at least one mobile device includes:

performing task joint scheduling training processing according to the task to be processed and the predicted position information to obtain a decision action of the at least one mobile device;

and performing integrated processing on the decision action to obtain a task scheduling decision variable.

In a second aspect, the present invention provides a scheduling optimization apparatus, applied to an electronic device, where the electronic device is in communication connection with a mobile edge computing network system, where the mobile edge computing network system includes at least one base station, an unmanned aerial vehicle, and a mobile device, and the scheduling optimization apparatus includes:

the task obtaining module is used for obtaining a task to be processed and current position information of the at least one mobile device, wherein the task to be processed comprises a first task and a second task;

the strategy acquisition module is used for inputting the to-be-processed task and the current position information into a preset scheduling optimization model to obtain a scheduling strategy, wherein the scheduling optimization model is obtained by training based on the established initial model;

a policy sending module, configured to send the scheduling policy to the at least one mobile device, so that the at least one mobile device sends the first task to the at least one drone for processing based on the scheduling policy, and forwards the second task to the at least one base station through the at least one drone for processing.

In a third aspect, the present invention provides an electronic device comprising: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the schedule optimization method of any of the preceding embodiments when executing the program.

In a fourth aspect, the present invention provides a storage medium, where the storage medium includes a computer program, and the computer program controls, when running, an electronic device where the storage medium is located to execute the scheduling optimization method according to any of the foregoing embodiments.

According to the scheduling optimization method and apparatus, the electronic device and the storage medium provided by the embodiments of the present application, the task to be processed and the current position information are input into a preset scheduling optimization model to obtain a scheduling policy, and the scheduling policy is sent to the at least one mobile device. Based on the scheduling policy, the at least one mobile device sends the first task to the at least one unmanned aerial vehicle for processing and forwards the second task, through the at least one unmanned aerial vehicle, to the at least one base station for processing. The first task is thus scheduled to the unmanned aerial vehicle and the second task to the base station, which avoids the prior-art problem of low scheduling optimization efficiency caused by tasks being either all executed locally on the mobile device or all executed remotely on the unmanned aerial vehicle or the base station.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.

Fig. 1 shows a block diagram of a data processing system according to an embodiment of the present application.

Fig. 2 shows a block diagram of a scheduling optimization system according to an embodiment of the present application.

Fig. 3 shows a block diagram of an electronic device according to an embodiment of the present application.

Fig. 4 is a flowchart illustrating a scheduling optimization method according to an embodiment of the present application.

Fig. 5 is another schematic flow chart of the scheduling optimization method according to the embodiment of the present application.

Fig. 6 is a schematic structural diagram of a scheduling optimization model according to an embodiment of the present application.

Fig. 7 is a schematic structural diagram of an LSTM network according to an embodiment of the present application.

Fig. 8 is a schematic structural diagram of a mobile device location prediction model based on an LSTM network according to an embodiment of the present application.

Fig. 9 is a schematic flowchart of a clustering algorithm for a mobile device based on FCM according to an embodiment of the present application.

Fig. 10 is a schematic structural diagram of an actor neural network and a critic neural network provided in an embodiment of the present application.

Fig. 11 is a flowchart illustrating a DDPG-based calculation task scheduling algorithm according to an embodiment of the present application.

Fig. 12 is a schematic flowchart of a scheduling variable modeling integration algorithm provided in an embodiment of the present application.

Fig. 13 is a block diagram of a scheduling optimization apparatus according to an embodiment of the present application.

Reference numerals: 10 - data processing system; 100 - electronic device; 110 - first memory; 120 - first processor; 130 - communication module; 200 - scheduling optimization system; 1300 - scheduling optimization apparatus; 1310 - task obtaining module; 1320 - policy acquisition module; 1330 - policy sending module.

Detailed Description

Owing to the high mobility and flexibility of unmanned aerial vehicles (UAVs), researchers have recently proposed using UAVs to assist mobile edge computing (MEC) in various application scenarios by deploying wireless communication nodes on the UAVs to establish communication with users' mobile devices (MDs). When network infrastructure is unavailable (for example, at a rescue scene after a natural disaster), when network equipment is sparsely distributed (for example, in a field operation environment), or when a temporary surge of mobile devices far exceeds the network service capability (for example, at a large competition or conference), a UAV can serve as a communication relay station or an edge computing platform. Once computing resources are deployed on the UAV, a UAV-assisted mobile edge computing network brings many advantages, such as reduced network overhead, lower computing task execution delay, better quality of experience (QoE), and longer battery life for mobile devices.

In the field of drone-assisted mobile edge computing, appropriate decisions need to be made on the motion trajectory of the drone and on the offloading of computing tasks in the mobile edge computing network (whether a computing task is executed locally on a mobile device or offloaded to an edge server) to achieve ideal performance. Specifically, existing research and inventions minimize the computation delay or energy consumption of all mobile devices by optimizing the drone trajectory, the task offloading ratio and the task scheduling, while ensuring the reliability of the whole edge computing network.

Existing drone-assisted edge computing systems often use only one or more drones as edge computing devices to ensure low latency and reliability of network system computing task transmissions. Due to the limitations of current unmanned aerial vehicle technology development and the relatively weak computing power of deploying computing devices in unmanned aerial vehicles, simply using an unmanned aerial vehicle-assisted edge computing network is not sufficient to provide satisfactory services for multiple mobile devices. Therefore, a more promising model is to implement the building of a mobile edge computing network among mobile devices, drones, and cellular Base Stations (BSs). The existing edge computing networks composed of mobile devices, unmanned aerial vehicles and base stations only comprise one unmanned aerial vehicle, and the unmanned aerial vehicle is used as a computing device of an edge service and a relay task forwarding device, so that the computing task requirements of a plurality of mobile devices cannot be met at the same time, and the task computing time delay of a network system is increased.

In order to address at least one of the above technical problems, embodiments of the present application provide a scheduling optimization method and apparatus, an electronic device, and a storage medium. The technical solutions of the present application are described below through possible implementations.

The defects of the above solutions were identified by the inventors through practice and careful study. Therefore, both the discovery of the above problems and the solutions proposed below in the embodiments of the present application should be regarded as contributions made by the inventors in the course of the invention.

In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.

Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

It is to be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.

It should be noted that the features of the embodiments of the present application may be combined with each other without conflict.

Fig. 1 is a block diagram of a data processing system 10 according to an embodiment of the present application, which provides a possible implementation manner of the data processing system 10, and referring to fig. 1, the data processing system 10 may include one or more of an electronic device 100 and a scheduling optimization system 200.

The electronic device 100 is in communication connection with the scheduling optimization system 200, and the electronic device 100 obtains the task to be processed and the position of the scheduling optimization system 200 and obtains a scheduling policy according to the task to be processed and the position, so that the scheduling optimization system 200 performs scheduling optimization according to the scheduling policy.

Optionally, the specific composition of the scheduling optimization system 200 is not limited, and may be set according to the actual application requirement. For example, in one alternative example, the schedule optimization system 200 can include at least one base station, a drone, and a mobile device.

It should be noted that, in an alternative example, the electronic device 100 and the mobile device may be the same device; in another alternative example, the electronic device 100 and the drone may be the same device; in another alternative example, the electronic device 100 and the base station may be the same device.

Optionally, the number of the base stations is not limited, and may be set according to the actual application requirements. For example, in an alternative example, the number of base stations may be one.

That is to say, in order to solve the problem that the task computation delay of the edge computing network composed of the mobile device, the unmanned aerial vehicle and the base station is high and cannot simultaneously meet the requirements of a plurality of mobile devices with computing tasks, with reference to fig. 2, the present application establishes a mobile edge computing network composed of a single base station, a plurality of unmanned aerial vehicles and a large number of mobile devices. The computing tasks generated by the mobile devices in the network can be performed by the mobile devices themselves, can be offloaded to one of the drones for simple computation, or can be further transmitted to the base station for more intensive computation.

Referring to fig. 3, a block diagram of an electronic device 100 according to an embodiment of the present disclosure is shown, where the electronic device 100 in this embodiment may be a server, a processing device, a processing platform, and the like, which are capable of performing data interaction and processing. The electronic device 100 includes a first memory 110, a first processor 120, and a communication module 130. The elements of the first memory 110, the first processor 120 and the communication module 130 are electrically connected to each other directly or indirectly to realize data transmission or interaction. For example, the components may be electrically connected to each other via one or more communication buses or signal lines.

The first memory 110 is used for storing programs or data. The first memory 110 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.

The first processor 120 is used to read/write data or programs stored in the first memory 110 and perform corresponding functions. The communication module 130 is used for establishing a communication connection between the electronic device 100 and another communication terminal through a network, and for transceiving data through the network.

It should be understood that the configuration shown in fig. 3 is merely a schematic diagram of the configuration of the electronic device 100, and that the electronic device 100 may also include more or fewer components than shown in fig. 3, or have a different configuration than shown in fig. 3. The components shown in fig. 3 may be implemented in hardware, software, or a combination thereof.

Fig. 4 shows a flowchart of a scheduling optimization method provided in an embodiment of the present application. The method is applicable to, and executed by, the electronic device 100 shown in fig. 3. It should be understood that, in other embodiments, the order of some steps in the scheduling optimization method of this embodiment may be interchanged according to actual needs, or some steps may be omitted or deleted. The flow of the scheduling optimization method shown in fig. 4 is described in detail below.

Step S410, a task to be processed and current location information of at least one mobile device are obtained.

The tasks to be processed comprise a first task and a second task.

Step S420, inputting the task to be processed and the current position information into a preset scheduling optimization model to obtain a scheduling strategy.

The scheduling optimization model is obtained by training based on the established initial model.

Step S430, sending the scheduling policy to at least one mobile device, so that the at least one mobile device sends the first task to the at least one drone for processing based on the scheduling policy, and forwards the second task to the at least one base station for processing through the at least one drone.

According to the method, the task to be processed and the current position information are input into a preset scheduling optimization model to obtain a scheduling policy, and the scheduling policy is sent to the at least one mobile device. Based on the scheduling policy, the at least one mobile device sends the first task to the at least one unmanned aerial vehicle for processing and forwards the second task, through the at least one unmanned aerial vehicle, to the at least one base station for processing. The first task is thus scheduled to the unmanned aerial vehicle and the second task to the base station, which solves the prior-art problem of low scheduling optimization efficiency caused by tasks being either all executed locally on the mobile device or all executed remotely on the unmanned aerial vehicle or the base station.

It should be noted that, before step S410, the scheduling optimization method provided in the embodiment of the present application may further include a step of obtaining a scheduling optimization model, and with reference to fig. 5, the step may include the following sub-steps:

Step S440, establishing an initial model and an optimization objective function according to the initial parameters of the mobile edge computing network system.

Step S450, training the initial model according to the optimization objective function to obtain a scheduling optimization model.

For step S440, it should be noted that the specific way of establishing the initial model and the optimization objective function is not limited, and may be set according to the actual application requirement. For example, in an alternative example, step S440 may include the following sub-steps:

establishing an initial model according to initial parameters of at least one base station, the unmanned aerial vehicle and the mobile equipment; and establishing an optimization objective function according to the initial model.

The initial model may include a system model, a computation model and a communication model of the mobile edge computing network system, and the step of establishing the initial model may include the following sub-steps:

1. establishing a system model:

The network architecture of the system model established by the present application is divided into three layers: the mobile devices on the ground, the unmanned aerial vehicles in the air, and the remote base station. The positions of all three can be expressed in a three-dimensional Cartesian coordinate system. The total execution time of the tasks to be processed is denoted $T$ and is divided into $N$ time slices, so that the set of time slices can be expressed as:

$$\mathcal{N} = \{1, 2, \ldots, N\};$$

where the length $\tau$ of each time slice satisfies $\tau = T/N$.

Each time slice is assumed to be small enough that the position of each drone is unchanged within the time slice. Considering that computing tasks may be blocked, the network system assumes that a mobile device cannot communicate with the base station directly and can only offload tasks to the base station with the help of a drone.

In the network system, the set of mobile devices can be expressed as:

$$\mathcal{M} = \{1, 2, \ldots, M\};$$

where $M$ denotes the number of mobile devices. In time slice $n$, the position of mobile device $m$ is expressed by $x_m(n)$ and $y_m(n)$, the coordinates of the horizontal plane in which mobile device $m$ lies, with $m \in \{1, \ldots, M\}$ and $n \in \{1, \ldots, N\}$.

In time slice $n$, each mobile device $m$ generates a computation-intensive task, which is described by three quantities: the data size of the current task (unit: bit), the number of CPU cycles spent processing each bit, and the maximum time allowed for executing the current task. Without loss of generality, the maximum allowed execution time is the same for all tasks. In addition, its value is less than $\tau$, which ensures that each task can be executed within one time slice.

Each mobile device $m$ has an embedded on-board CPU with a given maximum computing frequency. By dynamically adjusting the voltage and frequency of the CPU, the actual CPU frequency of mobile device $m$ in time slice $n$ can be adaptively controlled to improve energy efficiency. The actual CPU frequency therefore must not exceed the maximum computing frequency, where all mobile devices are assumed to have the same maximum computing power.
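The task description and the CPU frequency limit above can be captured in a few lines. The sketch below is only an illustration: the class and function names are hypothetical, and the local execution time is estimated as total CPU cycles divided by CPU frequency, which is a common modelling assumption rather than the formula given in the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeTask:
    data_bits: float        # size of the task data, in bits
    cycles_per_bit: float   # CPU cycles needed to process one bit
    max_time_s: float       # maximum allowed execution time, in seconds


def fits_in_slice(task: ComputeTask, tau_s: float) -> bool:
    """The text requires the deadline to be shorter than one time slice."""
    return task.max_time_s < tau_s


def local_execution_time(task: ComputeTask, cpu_hz: float, cpu_max_hz: float) -> float:
    """Seconds needed locally, assuming time = total cycles / CPU frequency."""
    if not 0.0 < cpu_hz <= cpu_max_hz:
        raise ValueError("CPU frequency must be positive and at most the maximum")
    return task.data_bits * task.cycles_per_bit / cpu_hz


# Hypothetical numbers: a 1 Mbit task at 100 cycles/bit on a 1 GHz setting.
task = ComputeTask(data_bits=1e6, cycles_per_bit=100.0, max_time_s=0.5)
t_local = local_execution_time(task, cpu_hz=1e9, cpu_max_hz=2e9)
meets_deadline = t_local <= task.max_time_s
```

With these numbers the task needs 1e8 cycles, so a lower CPU frequency saves energy only as long as the resulting time still meets the deadline.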

In this system, the set of drones can be expressed as:

$$\mathcal{U} = \{1, 2, \ldots, U\};$$

where $U$ denotes the number of drones. In time slice $n$, the position of drone $u$ can be expressed as:

$$[x_u(n),\; y_u(n),\; H];$$

where $x_u(n)$ and $y_u(n)$ denote the coordinates of the horizontal plane in which drone $u$ lies, $u \in \{1, \ldots, U\}$, $n \in \{1, \ldots, N\}$, and $H$ denotes the altitude at which the drone is located.

The flight speed of each drone is assumed not to exceed a maximum flight speed $v_{\max}$, which can be expressed as:

$$v_u(n) \le v_{\max};$$

where $v_u(n)$ denotes the speed of drone $u$ in time slice $n$. Furthermore, in order to ensure the flight safety of the drones, the distance between any two drones in each time slice should be greater than the minimum allowable distance.
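The speed limit and the minimum separation between drones can be checked for a candidate set of trajectories as sketched below. This is an illustration under stated assumptions: the function name is hypothetical, positions are horizontal coordinates at a common altitude, and the speed in a time slice is approximated by the displacement between consecutive slices divided by the slice length.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Position = Tuple[float, float]


def trajectory_is_feasible(paths: Sequence[List[Position]], tau: float,
                           v_max: float, d_min: float) -> bool:
    """paths[u][n] is the horizontal position of drone u in time slice n."""
    # Speed limit: displacement between consecutive slices divided by tau.
    for path in paths:
        for a, b in zip(path, path[1:]):
            if math.dist(a, b) / tau > v_max:
                return False
    # Safety distance: every pair of drones must be farther apart than d_min.
    n_slices = len(paths[0]) if paths else 0
    for n in range(n_slices):
        for p, q in combinations(paths, 2):
            if math.dist(p[n], q[n]) <= d_min:
                return False
    return True


# Two hypothetical drones over two time slices of 1 s each.
paths = [[(0.0, 0.0), (3.0, 4.0)],     # moves 5 m in one slice
         [(10.0, 0.0), (10.0, 3.0)]]   # moves 3 m in one slice
ok = trajectory_is_feasible(paths, tau=1.0, v_max=6.0, d_min=2.0)
too_fast = trajectory_is_feasible(paths, tau=1.0, v_max=4.0, d_min=2.0)
too_close = trajectory_is_feasible(paths, tau=1.0, v_max=6.0, d_min=8.0)
```

A check of this kind can be used to reject or penalise candidate trajectories before the scheduling decisions are evaluated.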

In time slice $n$, the energy consumption generated by drone $u$ can be expressed as:

[formula]

where one parameter of the formula represents the weight of the drone.

Each drone may be deployed as an edge server with a given maximum computing power. In time slice $n$, for a computing task that is decided to be uploaded to and executed by drone $u$, the CPU computing resources allocated by the drone must not exceed this maximum, where all drones are assumed to have the same maximum computing power.

The location of the base station is expressed by the coordinates of the horizontal plane in which the base station is located. Because the base station and the drones are both located at a considerable height, the base station is connected to the drones through line-of-sight wireless transmission links and is not directly connected to the mobile devices. In this case, a drone acts as a relay forwarding device that forwards the tasks offloaded by the mobile devices to the base station for further computation. Since the base station has a powerful computing server and energy supply, the execution time of computing tasks at the base station is negligible, and the energy consumption of the tasks executed at the base station is not considered.

All computing tasks in the system follow the full offloading mode; that is, each computing task is either executed entirely locally, offloaded entirely to a drone $u$, or further offloaded entirely to the base station for execution. A task scheduling decision variable is used to represent the offloading decision for each computing task:

[formula]

where the decision variable indicates whether the computing task is offloaded to computing platform $k$.

It is worth noting that when task

is executed on mobile device

or on a drone, exactly one

is 1 and all the others are 0, i.e.

or

,

. When task

is executed at the base station, in addition to

, the variable of the drone through which it is offloaded must also be 1, i.e.

, since one of the drones must act as a relay from the mobile device to the base station. In summary, the variable

should satisfy the following constraints:

;

In addition, it is assumed that each drone can offload at most one task to the base station (BS) for further execution in each time slice; therefore,

should satisfy:

;

wherein

.
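As an illustration only, the scheduling constraints described above can be checked with a short Python sketch. The index convention (column 0 for local execution, columns 1 to U for the drones, column U+1 for the base station) and the function name `is_valid_schedule` are assumptions made for this example; they do not come from the original formulas, which are not reproduced in this text.

```python
def is_valid_schedule(schedule, num_drones):
    """Check a binary schedule against the task-placement rules described above.

    schedule[m] is a list of 0/1 flags for the task of mobile device m.
    Assumed layout: index 0 = local, 1..num_drones = drones, num_drones + 1 = BS.
    """
    bs = num_drones + 1
    relays_per_drone = [0] * (num_drones + 1)  # index 0 is unused
    for row in schedule:
        if len(row) != num_drones + 2 or any(v not in (0, 1) for v in row):
            return False
        drones_on = [u for u in range(1, bs) if row[u] == 1]
        if row[bs] == 1:
            # A BS task needs exactly one relay drone and no local flag.
            if row[0] == 1 or len(drones_on) != 1:
                return False
            relays_per_drone[drones_on[0]] += 1
        elif sum(row) != 1:
            # Local or drone execution: exactly one flag is set.
            return False
    # Each drone forwards at most one task to the BS per time slice.
    return all(count <= 1 for count in relays_per_drone)
```

For example, with two drones, the schedule `[[1, 0, 0, 0], [0, 1, 0, 1], [0, 0, 1, 0]]` is accepted, whereas two tasks relayed to the base station by the same drone are rejected.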

It should be added that, once the variable

is introduced, the constraints on the computing resources allocated by the mobile device and the drone become:

;

;

2. Establishing a computation model:

Computing tasks may be executed on the mobile device, on the drone, or at the base station, which are referred to as local computation, drone-side computation, and BS-side computation, respectively. If task

is computed locally, i.e.

, then the computation time of the task is:

;

and the energy consumed is:

;

wherein

and

are positive coefficients that depend on the CPU of mobile device

.
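The formulas for local computation are not reproduced in this text. The following Python sketch therefore relies on a widely used CPU model as an assumption: computation time equals the required CPU cycles divided by the allocated frequency, and energy equals kappa times the frequency raised to the power nu, multiplied by the computation time. Here `kappa` and `nu` stand in for the two positive CPU-dependent coefficients mentioned above; the function name and the default values are hypothetical.

```python
def local_compute_cost(cpu_cycles, cpu_hz, kappa=1e-27, nu=3.0):
    """Return (time_s, energy_j) for executing a task on a local CPU.

    Assumed model (not taken from the source formulas):
        time   = cpu_cycles / cpu_hz
        energy = kappa * cpu_hz ** nu * time
    """
    if cpu_hz <= 0:
        raise ValueError("allocated CPU frequency must be positive")
    time_s = cpu_cycles / cpu_hz
    energy_j = kappa * cpu_hz ** nu * time_s
    return time_s, energy_j
```

Under the same assumption, the function could be reused for drone-side computation by passing the coefficients of the drone CPU.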

If computing task

is offloaded to drone

for execution, i.e.

, the computation time of the task is:

;

wherein

;

the corresponding energy consumed is:

;

wherein

;

wherein

and

are positive coefficients that depend on the CPU of drone

. Note that each computing task

can be offloaded to only one of the drones.

If task

is executed at the base station, i.e.

, then, given the assumed strong computing capability and energy supply of the base station, the execution time of the task is approximately zero and the energy consumption it generates is not considered.

3. Establishing a communication model:

The communication links of the whole network system are divided into two types: the link between a mobile device and a drone, and the link between a drone and the base station. To avoid possible communication interference between drones, each drone is allocated an orthogonal communication frequency. Because the drones fly at high altitude, the wireless channel between a drone and a mobile device or the base station is dominated by line-of-sight transmission.

In time slice

, the distance between mobile device

and drone

is:

;

in time slice

, the distance between drone

and the base station is:

;

thus, the wireless channel gain between mobile device

and drone

is:

;

and the wireless channel gain between drone

and the base station is:

;

wherein

is the received power gain at a reference distance of 1 meter.

If computing task

is offloaded from mobile device

to drone

, the transmission rate of the task data is:

;

if computing task

is offloaded from drone

to the base station, the transmission rate of the task data is:

;

wherein B denotes the bandwidth of the network system,

and

denote the wireless transmission power, in time slice

, of mobile device

and drone

, respectively,

denotes the frequency of the communication noise, and

and

respectively satisfy the following conditions:

;

wherein

and

denote the maximum available transmission power of mobile device

and drone

, respectively.
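The channel gain and rate formulas are missing from this text. As a hedged sketch, the Python code below assumes a line-of-sight gain that decays with the square of the distance (reference gain `g0` at 1 meter) and a Shannon-type rate; treating the noise term as a noise power is also an assumption. The function and parameter names are hypothetical.

```python
import math


def los_channel_gain(g0, tx_pos, rx_pos):
    """Assumed line-of-sight gain: g0 divided by the squared distance in meters."""
    distance = math.dist(tx_pos, rx_pos)
    if distance <= 0:
        raise ValueError("transmitter and receiver must be at different positions")
    return g0 / distance ** 2


def transmission_rate(bandwidth_hz, tx_power_w, channel_gain, noise_power_w):
    """Assumed Shannon-type rate in bit/s: B * log2(1 + p * g / noise)."""
    snr = tx_power_w * channel_gain / noise_power_w
    return bandwidth_hz * math.log2(1.0 + snr)
```

With these helpers, the rate of a device-to-drone link would be obtained by first computing the gain from the two positions and then passing it, together with the transmission power and bandwidth, to `transmission_rate`.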

The time required and the energy consumed for mobile device

to offload a computing task to drone

are, respectively:

;

;

the time required and the energy consumed for drone

to offload the computing task to the base station are, respectively:

;

;

Let

and

denote the energy budgets of mobile device

and drone

, respectively, such that for

the following constraints are satisfied:

;

;

The optimization goal of the network system is to minimize the total energy consumption of the mobile devices and the drones under the task delay constraint and the system constraints (such as the maximum speed of the drones, the minimum distance between drones, and the maximum computing capacity). When computing task

is executed on the mobile device, on a drone, or at the base station, the corresponding task delays are respectively expressed as:

;

after the task scheduling decision variable

is introduced, the delay of computing task

can be uniformly expressed as:

;

therefore, the corresponding task execution delay constraint is:

;
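The three delay cases can be sketched in Python as follows. The composition of each delay is inferred from the description (local computation only; upload to the drone plus drone computation; upload to the drone plus forwarding to the base station, whose execution time is taken as zero) and is an assumption, since the original formulas are not reproduced here. The function names and string labels are hypothetical.

```python
def task_delay(destination, t_local=0.0, t_up_drone=0.0, t_drone=0.0, t_up_bs=0.0):
    """Return the end-to-end delay of one task for the chosen destination."""
    if destination == "local":
        return t_local
    if destination == "drone":
        # Device-to-drone transmission followed by drone-side computation.
        return t_up_drone + t_drone
    if destination == "bs":
        # Two-hop transmission; BS execution time is assumed negligible.
        return t_up_drone + t_up_bs
    raise ValueError(f"unknown destination: {destination!r}")


def meets_deadline(delay, deadline):
    """Delay constraint: the task must finish within its deadline."""
    return delay <= deadline
```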

In time slice

, the energy consumption generated by executing task

falls into two categories:

1) If task

is executed locally on mobile device

, i.e.

, then the energy consumption of mobile device

is

;

2) If task

is offloaded to drone

or executed by the base station, i.e.

, then the energy consumption of mobile device

is

;

Thus, the energy consumed by mobile device

in executing computing task

can be uniformly expressed as:

;

and the energy consumption of all mobile devices during task execution can be expressed as:

;

In summary, in order to minimize the total energy consumption of all tasks of the mobile devices during the operation of the mobile edge computing network system, the optimization problem (optimization objective function) is defined as follows:

wherein

,

,

, and

are the variables to be optimized.

In problem P, constraint C1 states that the maximum speed of the drones and the minimum distance between drones must not be violated. Constraint C2 ensures that the computing task generated at a mobile device in each time slice is executed at exactly one of the mobile device, a drone, or the base station, and that each drone sends at most one task to the base station in each time slice. Constraint C3 ensures that the computing resources allocated to local computation and drone-side computation in each time slice do not exceed the maximum computing capacity of the mobile device and the drone, respectively. Constraint C4 states that the mobile devices and drones must not exceed their respective energy budgets during execution. Constraint C5 states that the mobile devices and drones cannot allocate transmission power beyond the maximum allowed value. Constraint C6 ensures that the execution of each task meets its delay requirement.

For step S450, it should be noted that the specific way of training the model is not limited, and the training mode may be set according to the actual application requirement. For example, in an alternative example in which the scheduling optimization model includes an unmanned aerial vehicle trajectory planning model, a computational task joint scheduling model, and a resource allocation model, step S450 may include the following sub-steps:

splitting the optimization objective function to obtain a first optimization objective function, a second optimization objective function and a third optimization objective function; training the initial model according to the first optimization objective function to obtain an unmanned aerial vehicle trajectory planning model, training the initial model according to the second optimization objective function to obtain a calculation task joint scheduling model, and training the initial model according to the third optimization objective function to obtain a resource allocation model.

In detail, problem P is difficult to solve, mainly for the following reasons: 1) because A is a discrete binary variable while L, P and F are continuous variables, the problem is a mixed-integer nonlinear programming problem and is NP-hard; 2) because the network system requires fast responses, the scheduling optimization algorithm must make real-time scheduling decisions in every time slice; 3) because the positions of both the mobile devices and the drones change, P must be solved in a dynamically changing environment. For these reasons, the optimization objective function P is decomposed into three sub-problems: unmanned aerial vehicle trajectory planning (P1, i.e., the first optimization objective function), computing task joint scheduling (P2, i.e., the second optimization objective function), and computation and transmission resource allocation (P3, i.e., the third optimization objective function). In this way, an efficient scheduling strategy for the mobile edge computing network can be obtained, and the complexity of solving the optimization problem is greatly reduced.

In order to reduce the computational complexity of the original optimization problem, P is split into the following three sub-problems:

1. Unmanned aerial vehicle trajectory planning:

Among the scheduling variables L, A, P and F to be optimized in problem P, the drone trajectory position L depends little on the other three variables. Its optimization is based mainly on observations of the mobile device positions, and the optimization goal is for each drone to be as close as possible to the mobile devices and the base station. The drone trajectory optimization can therefore be expressed as:

;

wherein

denotes the cluster of mobile devices within the service range of drone

, and satisfies the condition

.

2. Computing task joint scheduling:

Once, in time slice

, the position L of the drones has been determined, the task offloading decision variable A must be optimized before the variables P and F. Based on the current mobile device clusters (

), A is optimized with the goal of minimizing the maximum computation delay of all tasks

, so that constraint C6 in the original problem P is more easily satisfied. The computing task joint scheduling sub-problem can therefore be expressed as:

3. Computation and transmission resource allocation:

After problems P1 and P2 are solved, the remaining variables P and F are optimized under constraints C3, C4 and C5, with the goal of minimizing the energy consumption of the system, as follows:

Based on the above decomposition, as shown in fig. 6, the optimization framework provided by the present application consists of three models, namely an unmanned aerial vehicle trajectory planning (UAV Trajectory Planning, UTP) model, a task association scheduling (Task Association Scheduling, TAS) model, and a resource allocation (Resource Allocation, RA) model, which correspond to optimization sub-problems P1, P2 and P3, respectively. At the beginning of each time slice, the network system environment generates two state variables (

and

).

is the input to the UTP model, and

is the input to the TAS and RA models.

1) The UTP model processes

. Because the positions of the mobile devices differ between time slices, the UTP model predicts the motion of the mobile devices and guides the drones to suitable positions. Because the motion pattern of the mobile devices follows neither a Gaussian nor a linear distribution, a long short-term memory (LSTM) network can be used to model it. After the prediction is completed, the mobile devices are divided into U clusters according to the number of drones, so that each drone serves the mobile devices in its cluster. To perform soft clustering, i.e., to allow each mobile device to be served by different drones in different time slices (but by no more than one drone in the same time slice), the UTP model adopts the fuzzy C-means (FCM) clustering method and clusters according to the similarity of channel power gains. After clustering, the center point of each cluster is used as the drone position output by the UTP model, i.e.

.

2) The TAS model receives, from the UTP model and the network environment respectively,

and

. According to the time-varying channel conditions and the computing task requirements, the TAS model generates the value of the task scheduling decision variable

. The present application may use a deep reinforcement learning (DRL) method, the deep deterministic policy gradient (DDPG) algorithm, which gains experience from the interaction between the algorithm model and the environment and outputs an optimized decision action

. In other alternative examples, other reinforcement learning algorithms suitable for continuous actions (e.g., the TD3 algorithm, the PPO algorithm, etc.) may be used. The action output in each time slice,

, is a one-dimensional vector consisting of

terms, where each term is a continuous variable relaxed to the range between 0 and 1.

can be regarded as the probability that task

is executed on computing platform k (which is why each term is set to a continuous value between 0 and 1). Since the task scheduling decision variable should be a two-dimensional binary value,

is reshaped and rounded to 1 or 0 according to the task-related constraints of the optimization problem and is taken as the output of the TAS model, i.e.

.

3)

and

are used as inputs to the RA model for final processing. According to sub-problem P3, the optimization variables P and F can be solved directly with the CVXPY convex optimization toolkit, and the P and F output by the RA model interact with the environment.

The environment receives the actions output by the three models and generates a reward

(as input to the DDPG algorithm) and a new state (each component of the state is sent to the corresponding component of the algorithm framework). Thereafter, the algorithm proceeds to the next time slice and repeats the three steps described above.

It should be noted that the optimal position plan of the drones can be obtained through a long short-term memory network and the fuzzy C-means clustering method; the drone trajectory planning is divided into two parts, namely mobile device motion prediction and mobile device clustering.

In the network system, the distance between a drone and the mobile devices is a main factor affecting the other scheduling variables, so the ideal motion trajectory of a drone is to move gradually toward the mobile devices and stay as close to them as possible. To this end, the algorithm proposed in the present application predicts the positions of the mobile devices

to assist the movement of the drones. Because

depends mainly on the positions of the mobile devices in previous time slices, the present invention uses a recurrent neural network, the LSTM, to model the temporal distribution of

.

As shown in FIG. 7, the long short-term memory (LSTM) network is a recurrent neural network that accepts an external input

and feedback inputs (

and

). The output of the LSTM includes two terms (

and

), both of which are fed back into the LSTM itself for processing in the next time slice. Of the two outputs,

is obtained by the following operations:

;

;

;

;

wherein

,

and

denote the output values of the neural network,

and

denote the sigmoid and hyperbolic tangent activation functions, respectively,

,

and

denote the network weights of the corresponding neural network layers, and

,

and

denote the bias vectors of the corresponding neural network layers; both the weights and the biases are parameters to be learned by the neural network.

Based on

,

is calculated from the following formulas:

;

;

wherein

and

are parameters to be learned by the neural network.
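For reference, the textbook form of an LSTM cell is given below. The original formulas are not reproduced in this text, so the symbols used here (input x_n, hidden state h_n, cell state c_n, gates f_n, i_n, o_n, weights W and biases b) are assumptions and may differ from the notation of the figures.

```latex
\begin{aligned}
f_n &= \sigma\left(W_f\,[h_{n-1}, x_n] + b_f\right), \\
i_n &= \sigma\left(W_i\,[h_{n-1}, x_n] + b_i\right), \\
o_n &= \sigma\left(W_o\,[h_{n-1}, x_n] + b_o\right), \\
\tilde{c}_n &= \tanh\left(W_c\,[h_{n-1}, x_n] + b_c\right), \\
c_n &= f_n \odot c_{n-1} + i_n \odot \tilde{c}_n, \\
h_n &= o_n \odot \tanh(c_n).
\end{aligned}
```

Here \sigma is the sigmoid function, \odot is element-wise multiplication, and [h_{n-1}, x_n] denotes concatenation of the previous hidden state with the current input.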

Based on the above formulas, the present application proposes an LSTM-based mobile device location observation model to predict the mobile device locations; its unrolling over the time series is shown in fig. 8. In each time slice, the current mobile device location is input to the LSTM network, and the LSTM outputs

. To predict the mobile device location in the next time slice, a fully connected layer is added to fine-tune the output

, specifically as follows:

;

wherein relu is the ReLU activation function, and

and

are variables to be learned in neural network training.

Based on the predicted locations of the mobile devices in the next time slice,

, the mobile devices need to be clustered into U groups so that the drones can serve the mobile devices in a load-balanced manner. To cluster the mobile devices, the FCM method, which is based on fuzzy theory, can be adopted: for each cluster

, mobile device

is assigned, in time slice

, a metric value

, which is calculated as follows:

;

wherein

denotes the location of the drone in the n-th time slice, and

denotes the center point of the k-th cluster, i.e.

;

By minimizing the objective function O to be optimized,

and

are solved iteratively until the difference between two successively calculated metric values is less than a specified threshold

:

;

Before the iteration is performed, all

should be initialized; each

is initialized with

, because the mobile devices can move only within a small range, so their new center points are likely to be close to the previous center points (which were planned as the drone positions

).

After the iteration ends, each mobile device

is given a metric value

representing its membership in the u-th cluster. An exploration strategy may further be used to convert

into a binary clustering decision, which reduces the likelihood of the optimization objective O being trapped in a local minimum. Using

to denote the exploration threshold, mobile device

is, with probability

, clustered into the cluster with the largest metric value and, with probability

, clustered into another cluster. The algorithm of FIG. 9 details the FCM-based clustering process of the mobile devices in the n-th time slice; the output of Algorithm 1,

, guides the drones to move to

.
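The clustering procedure described above can be sketched in Python as follows. This is a minimal standard fuzzy C-means implementation and not the algorithm of FIG. 9: it assumes Euclidean distance between positions (rather than channel power gain similarity), a fuzzifier of 2, and an exploration rule in which a device joins its best cluster with probability 1 - epsilon. All function and parameter names are hypothetical.

```python
import math
import random


def fuzzy_c_means(points, init_centers, fuzzifier=2.0, tol=1e-6, max_iter=100):
    """Return (membership, centers); membership[m][u] is device m's degree in cluster u."""
    centers = [tuple(c) for c in init_centers]
    exponent = 2.0 / (fuzzifier - 1.0)
    dims = len(points[0])
    membership = None
    for _ in range(max_iter):
        # Membership step: closer centers receive higher membership values.
        new_membership = []
        for p in points:
            dists = [max(math.dist(p, c), 1e-12) for c in centers]
            new_membership.append(
                [1.0 / sum((d_u / d_k) ** exponent for d_k in dists) for d_u in dists]
            )
        # Center step: membership-weighted mean of the device positions.
        centers = []
        for u in range(len(init_centers)):
            weights = [row[u] ** fuzzifier for row in new_membership]
            total = sum(weights)
            centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dims)
            ))
        # Stop when memberships change by less than the threshold.
        if membership is not None:
            change = max(
                abs(a - b)
                for new_row, old_row in zip(new_membership, membership)
                for a, b in zip(new_row, old_row)
            )
            if change < tol:
                membership = new_membership
                break
        membership = new_membership
    return membership, centers


def assign_clusters(membership, epsilon, rng=random):
    """Turn soft memberships into hard labels with epsilon exploration (assumed rule)."""
    labels = []
    for row in membership:
        best = max(range(len(row)), key=row.__getitem__)
        others = [u for u in range(len(row)) if u != best]
        if others and rng.random() < epsilon:
            labels.append(rng.choice(others))
        else:
            labels.append(best)
    return labels
```

Passing the cluster centers of the previous time slice as `init_centers` mirrors the initialization described above; the returned centers would then serve as candidate drone positions.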

It should be noted that the reinforcement-learning-based deep deterministic policy gradient algorithm can be used to find the task scheduling decision variable of each mobile device. The computing task joint scheduling includes two parts, namely DDPG-based optimization of the task scheduling decision variables and rounding of the scheduling variables. Once the motion trajectory of the drones is known, the algorithm framework learns the scheduling policy of the computing tasks with the DDPG reinforcement learning algorithm, namely:

;

the policy

is a mapping function from the environment state to the decision action, and the state of the network environment is:

;

the decision action output by policy

is:

;

which is a continuous variable from 0 to 1, of size:

.

Through reinforcement learning, a near-optimal solution for policy π can be obtained by maximizing the total utility value (also called the Q-value):

;

wherein

is, given state

and decision action

, the resulting new state of the environment,

is the reward in time slice

, and γ is the discount coefficient for future rewards. Since the state and action spaces of the environment are high-dimensional, two neural networks are employed: an actor neural network (Actor) π (with parameters

) and a critic neural network (Critic) Q (with parameters θ), as shown in fig. 10. To make the learning process more stable, target networks (a target actor network and a target critic network, with

and

as their respective parameters) that are periodically updated can be adopted.

In time slice

, the environment accepts the action output by the algorithm model,

, transitions from state

to state

, and generates a reward

. The four items are packed into a tuple

and stored in an experience replay pool. During training, a batch of samples is randomly selected from the experience replay pool, and the critic neural network (i.e., parameter θ) is trained according to the following loss function:

;

The actor network is trained to minimize the following gradient function:

;

wherein

is the state sampled from the state distribution under the current policy, and

is the number of samples in a batch used for back-propagation training of the network. The DDPG-based task joint scheduling training algorithm is shown in detail in figure 11.
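The experience replay pool and the critic loss can be sketched in Python as follows. This is a simplified illustration rather than the algorithm of figure 11: it assumes the usual DDPG target, namely the reward plus the discounted target-critic value of the next state, and a mean squared error loss, and it represents the networks as plain callables. The class and function names are hypothetical.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size pool of (state, action, reward, next_state) tuples."""

    def __init__(self, capacity, seed=None):
        self._items = deque(maxlen=capacity)  # oldest samples are dropped first
        self._rng = random.Random(seed)

    def push(self, state, action, reward, next_state):
        self._items.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform random mini-batch without replacement.
        return self._rng.sample(list(self._items), batch_size)

    def __len__(self):
        return len(self._items)


def critic_loss(batch, gamma, critic, target_actor, target_critic):
    """Mean squared error between the critic output and the assumed DDPG target."""
    total = 0.0
    for state, action, reward, next_state in batch:
        next_action = target_actor(next_state)
        target = reward + gamma * target_critic(next_state, next_action)
        total += (target - critic(state, action)) ** 2
    return total / len(batch)
```

In a full implementation the callables would be neural networks, and the loss would be minimized by gradient descent on the critic parameters while the target networks are refreshed periodically.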

Because the decision action output by the actor network,

, is a one-dimensional vector and

is a continuous value in the range of 0 to 1,

needs to be reshaped into two dimensions and rounded to 0 or 1 for further task scheduling. FIG. 12 shows the reshaping and rounding algorithm for

, whose time complexity is

. After the task scheduling variables are reshaped and rounded, the output a[m][k] of Algorithm 3 is passed to the RA module for optimized resource allocation.
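One possible way to reshape and round the actor output is sketched below in Python. It is not the algorithm of FIG. 12, which is not reproduced in this text: the column layout (0 for local execution, 1 to U for the drones, U+1 for the base station), the greedy choice of the relay drone, and the fallback used when no relay drone is free are all assumptions made for illustration.

```python
def round_schedule(action, num_devices, num_drones):
    """Reshape a flat action vector into a binary schedule a[m][k]."""
    k = num_drones + 2  # assumed layout: local, drones 1..U, base station
    if len(action) != num_devices * k:
        raise ValueError("action length must equal num_devices * (num_drones + 2)")
    rows = [action[m * k:(m + 1) * k] for m in range(num_devices)]
    schedule = [[0] * k for _ in range(num_devices)]
    relay_used = set()  # drones that already forward a task to the base station
    for m, row in enumerate(rows):
        best = max(range(k), key=row.__getitem__)
        if best == k - 1:
            free = [u for u in range(1, k - 1) if u not in relay_used]
            if free:
                # Base station execution also marks the chosen relay drone.
                relay = max(free, key=row.__getitem__)
                relay_used.add(relay)
                schedule[m][relay] = 1
                schedule[m][best] = 1
                continue
            # No free relay: fall back to the best non-base-station option.
            best = max(range(k - 1), key=row.__getitem__)
        schedule[m][best] = 1
    return schedule
```

For instance, with three devices and one drone, the action `[0.9, 0.05, 0.05, 0.1, 0.2, 0.7, 0.1, 0.3, 0.6]` yields `[[1, 0, 0], [0, 1, 1], [0, 1, 0]]`: the second task goes to the base station through the drone, and the third falls back to drone execution because the drone already relays one task.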

It should be noted that a convex-optimization-based method can be used to determine the allocation of the computation and transmission resources of the network system:

and

are used as inputs to the RA module for final processing. According to sub-problem P3, the optimization variables P and F can be solved directly with the CVXPY tool using a convex optimization method.

For step S420, it should be noted that the specific manner of obtaining the scheduling policy is not limited, and may be set according to the actual application requirement. For example, in an alternative example, step S420 may include the following sub-steps:

inputting the current position information into an unmanned aerial vehicle trajectory planning model, and calculating to obtain the predicted position information of at least one mobile device; inputting the task to be processed and the predicted position information into a task joint scheduling model, and calculating to obtain a task scheduling decision variable of at least one mobile device; and inputting the tasks to be processed and the task scheduling decision variables into a resource allocation model, and calculating to obtain a scheduling strategy.

The specific manner of inputting the current position information into the unmanned aerial vehicle trajectory planning model and calculating the predicted position information of the at least one mobile device is not limited, and may be set according to actual application requirements. For example, in one alternative example, the step may include the following sub-steps:

performing motion prediction processing according to the current position information to obtain next position information of at least one mobile device; and clustering the next position information of at least one mobile device to obtain the predicted position information.

It should be noted that, the steps of performing the prediction processing and the clustering processing may refer to the process of training to obtain the trajectory planning model of the unmanned aerial vehicle.

The specific mode of inputting the task to be processed and the predicted position information into the task joint scheduling model and calculating to obtain the task scheduling decision variable of at least one mobile device is not limited and can be set according to the actual application requirements. For example, in one alternative example, the step may include the following sub-steps:

performing task joint scheduling training processing according to the task to be processed and the predicted position information to obtain a decision action of at least one mobile device; and performing integrated processing on the decision action to obtain a task scheduling decision variable.

It should be noted that, the steps of performing the training process and the integrating process may refer to the process of training to obtain the computation task joint scheduling model.

By the above method, a mobile edge computing network consisting of a single base station, a plurality of unmanned aerial vehicles and a large number of mobile devices is deployed, in which each computing task can be executed on the mobile device, offloaded to an unmanned aerial vehicle for computation, or further offloaded to the base station, with an unmanned aerial vehicle acting as a relay, for more intensive computation. With the goal of minimizing the energy consumption of the network system, a joint optimization problem of unmanned aerial vehicle trajectory, task association, and computation and transmission resource allocation is formulated. In view of the high complexity of the problem, the invention decomposes the optimization problem into three sub-problems, which greatly reduces the energy consumption of the whole network system, prolongs the service life of the network, reduces the computation delay of all mobile devices in the communication network, and improves the service quality of computation-intensive applications.

With reference to fig. 13, an embodiment of the present application further provides a scheduling optimization apparatus 1300, where functions implemented by the scheduling optimization apparatus 1300 correspond to steps executed by the foregoing method. The schedule optimization apparatus 1300 may be understood as a processor of the electronic device 100, or may be understood as a component that is independent from the electronic device 100 or the processor and implements the functions of the present application under the control of the electronic device 100. The scheduling optimization apparatus 1300 may include a task obtaining module 1310, a policy obtaining module 1320, and a policy sending module 1330.

The task obtaining module 1310 is configured to obtain a to-be-processed task and current location information of at least one mobile device, where the to-be-processed task includes a first task and a second task. In this embodiment of the application, the task obtaining module 1310 may be configured to perform step S410 shown in fig. 4, and reference may be made to the foregoing description of step S410 for relevant contents of the task obtaining module 1310.

A policy obtaining module 1320, configured to input the to-be-processed task and the current position information into a preset scheduling optimization model to obtain a scheduling policy, where the scheduling optimization model is obtained by training based on the established initial model. In this embodiment of the application, the policy obtaining module 1320 may be configured to perform step S420 shown in fig. 4, and reference may be made to the foregoing description of step S420 for relevant content of the policy obtaining module 1320.

The policy sending module 1330 is configured to send the scheduling policy to the at least one mobile device, so that the at least one mobile device sends the first task to the at least one drone for processing based on the scheduling policy, and forwards the second task to the at least one base station for processing through the at least one drone. In the embodiment of the present application, the policy sending module 1330 may be configured to perform step S430 shown in fig. 4, and reference may be made to the foregoing description of step S430 regarding the relevant content of the policy sending module 1330.

In addition, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program performs the steps of the scheduling optimization method.

The computer program product of the scheduling optimization method provided in the embodiment of the present application includes a computer-readable storage medium storing a program code, where instructions included in the program code may be used to execute steps of the scheduling optimization method in the foregoing method embodiment, which may be referred to specifically in the foregoing method embodiment, and details are not described here again.

To sum up, according to the scheduling optimization method and apparatus, the electronic device, and the storage medium provided in the embodiments of the present application, the scheduling policy is obtained by inputting the task to be processed and the current location information into the preset scheduling optimization model, and the scheduling policy is sent to the at least one mobile device, so that the at least one mobile device sends the first task to the at least one unmanned aerial vehicle for processing based on the scheduling policy, and forwards the second task to the at least one base station for processing through the at least one unmanned aerial vehicle, thereby achieving the purpose of scheduling the first task to the unmanned aerial vehicle for processing, and scheduling the second task to the base station for processing, and avoiding the problem in the prior art that the tasks are either all executed locally on the mobile device or all executed remotely on the unmanned aerial vehicle or the base station, which results in low efficiency of scheduling optimization.

The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims

1. A scheduling optimization method, applied to an electronic device that is in communication connection with a mobile edge computing network system, wherein the mobile edge computing network system comprises at least one base station, an unmanned aerial vehicle and a mobile device, and the scheduling optimization method comprises the following steps:

2. The method of scheduling optimization of claim 1 further comprising the step of obtaining a scheduling optimization model comprising:

3. The method for scheduling optimization of claim 2 wherein the step of building an initial model and optimizing an objective function based on initial parameters of the mobile edge computing network system comprises:

4. The scheduling optimization method of claim 2, wherein the scheduling optimization model comprises an unmanned aerial vehicle trajectory planning model, a computational task joint scheduling model and a resource allocation model, and the step of training the initial model according to the optimization objective function to obtain the scheduling optimization model comprises:

5. The scheduling optimization method of claim 4, wherein the step of inputting the to-be-processed task and the current location information into a preset scheduling optimization model to obtain the scheduling policy comprises:

6. The method of claim 5, wherein the step of inputting the current location information into the unmanned aerial vehicle trajectory planning model to calculate the predicted location information for the at least one mobile device comprises:

7. The method of claim 5, wherein the step of inputting the to-be-processed task and the predicted location information into the task joint scheduling model and calculating the task scheduling decision variable of the at least one mobile device comprises:

8. A scheduling optimization device applied to an electronic device, wherein the electronic device is in communication connection with a mobile edge computing network system, the mobile edge computing network system comprises at least one base station, a drone and a mobile device, and the scheduling optimization device comprises:

9. An electronic device, comprising: memory, processor and computer program stored on the memory and executable on the processor, the processor implementing the method of schedule optimization of any of claims 1 to 7 when executing the program.

10. A storage medium, comprising a computer program, which when executed controls an electronic device in which the storage medium is located to perform the schedule optimization method of any of claims 1 to 7.