CN115514769B - Satellite elastic Internet resource scheduling method, system, computer equipment and medium - Google Patents

Satellite elastic Internet resource scheduling method, system, computer equipment and medium

Publication number
CN115514769B
Authority
CN
China
Prior art keywords
task
model
satellite
resource scheduling
delay
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211125448.7A
Other languages
Chinese (zh)
Other versions
CN115514769A (en
Inventor
罗志勇
林天豪
黄澳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University
Priority to CN202211125448.7A
Publication of CN115514769A
Priority to ZA2023/05873A (ZA202305873B)
Application granted
Publication of CN115514769B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/104 Peer-to-peer [P2P] networks
    • H04L67/1074 Peer-to-peer [P2P] networks for supporting data block transmission mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W28/00 Network traffic management; Network resource management
    • H04W28/02 Traffic management, e.g. flow control or congestion control
    • H04W28/08 Load balancing or load distribution
    • H04W28/09 Management thereof
    • H04W28/0958 Management thereof based on metrics or performance parameters
    • H04W28/0967 Quality of Service [QoS] parameters
    • H04W28/0975 Quality of Service [QoS] parameters for reducing delays
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Radio Relay Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a satellite elastic Internet resource scheduling method, system, computer equipment and medium. A delay-sensitive satellite elastic Internet architecture is established based on a many-to-many mode between LEO satellites and user terminals; a satellite elastic Internet resource scheduling model is established according to that architecture; a minimum delay optimization model is established according to the resource scheduling model; the minimum delay optimization model is converted into a corresponding Markov decision model; and the Markov decision model is solved to obtain a resource scheduling strategy. The method makes full use of the computing and storage resources of the satellites, avoids the performance impact of queuing delay, meets the diversified requirements of user terminals, and improves service quality. Because it relies on a deep reinforcement learning algorithm, it also improves the computational efficiency of resource scheduling, thereby achieving resource scheduling with prioritized service, lower delay and balanced load, with strong generalization capability and high practical value.

Description

Satellite elastic Internet resource scheduling method, system, computer equipment and medium
Technical Field
The invention relates to the technical field of satellite resource scheduling, in particular to a satellite elastic internet resource scheduling method, system, computer equipment and storage medium based on deep reinforcement learning.
Background
With the continuous growth of Internet users, the number of user devices accessing the network has also proliferated, bringing more demanding requirements on the balance between delay and energy consumption. However, owing to cost and technical constraints, regional network coverage cannot be achieved by large-scale deployment of ground base stations in many complex natural environments such as deserts, deep seas and forests, and the satellite Internet has become a reliable way to guarantee efficient communication in such regions.
Traditional satellite network systems are independent of one another, and their networking mechanisms and related protocols are markedly heterogeneous, which causes a serious silo ("chimney") phenomenon and greatly limits the utilization efficiency of network resources in space. How to build an efficient, flexible and agile satellite Internet architecture, and how to offload tasks efficiently and schedule limited resources reasonably across different task demands, have become important research directions for improving the resource utilization efficiency of the integrated space-ground network.
Existing satellite Internet resource scheduling methods mainly sink computing power to LEO satellites, deploying MEC servers on the LEO satellites as an edge computing technique that shortens the physical distance between the MEC server and the user, in order to achieve a better balance between delay and energy consumption. Existing methods for solving the resulting optimization problem fall mainly into two types: 1) solving the task scheduling problem with the Hungarian method; 2) proving that the optimization problem is convex and solving it with the KKT conditions. Although both can solve the satellite Internet resource scheduling problem to a certain extent, each has application defects. The Hungarian method reduces delay compared with traditional algorithms, but it considers only the scenario in which a single server is responsible for computing a task, and ignores the load imbalance that easily arises because computing and storage resources are scarce in satellite communication. The convex-optimization approach with the KKT conditions considers only the efficiency of the solving process, and ignores the queuing delay that tasks incur when resources are limited and the task volume is large. In other words, existing solutions do not distinguish the satellite communication scenario from the terrestrial Internet scenario and cannot schedule resources effectively under realistic conditions such as very scarce computing and storage resources and long queuing delays for large numbers of tasks, so their practicability is low.
Therefore, there is an urgent need for a reasonable resource scheduling method that makes full use of the limited resources on satellites, provides users around the world with low-delay, high-quality and high-security services, and solves the problem of high delay sensitivity.
Disclosure of Invention
The invention aims to provide a satellite elastic Internet resource scheduling method. Under the many-to-many mode, it considers resource scheduling among edge servers in satellite network edge computing, constructs a delay-sensitive satellite elastic Internet architecture based on SDN/NFV technology and the TSN security protocol, derives the corresponding delay optimization model, and solves it with a delay optimization algorithm built on a deep reinforcement learning architecture to obtain a resource scheduling strategy. This effectively overcomes the application defects of existing satellite resource scheduling schemes: the computing and storage resources of the satellites are fully utilized, prioritized service is provided, average delay performance is optimized, and genuinely effective load balancing is achieved.
To achieve the above objective and address the above technical problems, a satellite elastic Internet resource scheduling method, system, computer device and storage medium are provided.
In a first aspect, an embodiment of the present invention provides a satellite elastic internet resource scheduling method, where the method includes the following steps:
establishing a delay-sensitive satellite elastic Internet architecture based on a many-to-many mode between LEO satellites and user terminals, wherein each LEO satellite corresponds to one MEC server;
establishing a satellite elastic Internet resource scheduling model according to the delay-sensitive satellite elastic Internet architecture;
establishing a minimum delay optimization model according to the satellite elastic Internet resource scheduling model;
converting the minimum delay optimization model into a corresponding Markov decision model;
and solving the Markov decision model to obtain a resource scheduling strategy.
In a second aspect, an embodiment of the present invention provides a satellite elastic internet resource scheduling system, where the system includes:
an architecture construction module, used for establishing a delay-sensitive satellite elastic Internet architecture based on a many-to-many mode between LEO satellites and user terminals, wherein each LEO satellite corresponds to one MEC server;
a first modeling module, used for establishing a satellite elastic Internet resource scheduling model according to the delay-sensitive satellite elastic Internet architecture;
a second modeling module, used for establishing a minimum delay optimization model according to the satellite elastic Internet resource scheduling model;
a model conversion module, used for converting the minimum delay optimization model into a corresponding Markov decision model;
and a strategy solving module, used for solving the Markov decision model to obtain a resource scheduling strategy.
In a third aspect, embodiments of the present invention further provide a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the above method when executing the computer program.
In a fourth aspect, embodiments of the present invention also provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the above method.
The application provides a technical scheme in which a delay-sensitive satellite elastic Internet architecture is established based on a many-to-many mode between LEO satellites and user terminals; a satellite elastic Internet resource scheduling model is established according to that architecture; a minimum delay optimization model is established according to the resource scheduling model and converted into a corresponding Markov decision model; and the Markov decision model is solved to obtain a resource scheduling strategy. Compared with the prior art, the method models the optimization problem comprehensively and effectively for the real application scenario, in which resources are limited and task queuing delay matters. It makes full use of the computing and storage resources of the satellites and avoids the performance impact of queuing delay; it meets the diversified requirements of user terminals through task priority division and improves service quality; and, with a delay optimization algorithm purpose-built on a deep reinforcement learning algorithm, it improves the computational efficiency of scheduling and allocating resources for massive services. It thereby achieves intelligent and efficient satellite elastic Internet resource scheduling that provides prioritized service, improves delay performance and guarantees load balancing, with strong generalization capability and high practical value.
Drawings
FIG. 1 is a schematic diagram of an application scenario of a satellite elastic Internet resource scheduling method in an embodiment of the invention;
FIG. 2 is a schematic flow chart of a satellite elastic Internet resource scheduling method in an embodiment of the invention;
FIG. 3 is a schematic diagram of a delay-sensitive satellite elastic Internet architecture in an embodiment of the invention;
FIG. 4 is a pseudocode diagram of the DDRA algorithm designed on the TD3 architecture of the deep reinforcement learning algorithm in an embodiment of the invention;
FIG. 5 is a schematic diagram of the relevant parameters of a satellite elastic Internet resource scheduling model in an embodiment of the invention;
FIG. 6 is a graph of the effect of the discount rate on the convergence of the DDRA algorithm in an embodiment of the invention;
FIG. 7 is a graph comparing the convergence of the DDRA algorithm and the SAC algorithm in an embodiment of the invention;
FIG. 8 is a graph comparing, for different algorithms, the effect of the task computation amount on average delay performance under computing resource allocation when the task data amount follows a uniform distribution in an embodiment of the invention;
FIG. 9 is a graph comparing, for different algorithms, the effect of the task computation amount on average delay performance under computing resource allocation when the task data amount follows a normal distribution in an embodiment of the invention;
FIG. 10 is a graph comparing, for different algorithms, the effect of the task computation amount on average delay performance under computing resource allocation when the task computation amount follows a uniform distribution in an embodiment of the invention;
FIG. 11 is a graph comparing, for different algorithms, the effect of the task data amount on average delay performance under computing resource allocation when the task computation amount follows a normal distribution in an embodiment of the invention;
FIG. 12 is a graph comparing, for different algorithms, the effect of the number of users on average delay performance under computing resource allocation when the task data amount and the task computation amount follow a normal distribution in an embodiment of the invention;
FIG. 13 is a schematic diagram of a satellite elastic Internet resource scheduling system in an embodiment of the invention;
FIG. 14 is an internal structural view of a computer device in an embodiment of the invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantageous effects of the present application more apparent, the present invention will be further described in detail with reference to the accompanying drawings and examples, and it should be understood that the examples described below are only illustrative of the present invention and are not intended to limit the scope of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The satellite elastic Internet resource scheduling method provided by the invention can be applied to the scenario shown in FIG. 1, in which user terminals and MEC servers operate in a many-to-many mode, to achieve intelligent and efficient satellite elastic Internet resource scheduling that provides prioritized service, improves delay performance and guarantees load balancing. The following embodiments describe the method in detail.
In one embodiment, as shown in fig. 2, a satellite elastic internet resource scheduling method is provided, which includes the following steps:
S11, establishing a delay-sensitive satellite elastic Internet architecture based on a many-to-many mode between LEO satellites and user terminals; each LEO satellite corresponds to one MEC server, and the MEC servers communicate with one another so that load balancing can be achieved. The delay-sensitive satellite elastic Internet architecture can be understood as the satellite Internet architecture shown in FIG. 3, which is based on SDN/NFV technology and incorporates the basic idea of IEEE 802.1Qcc in TSN: the MEC server of each LEO satellite provides task offloading service for the ground data nodes (user terminals), and functions such as satellite resource configuration, route forwarding and network configuration are managed in real time over the converged network, so that the architecture meets the diversified task requirements of the user terminals as far as possible. Specifically, the step of establishing the delay-sensitive satellite elastic Internet architecture based on the many-to-many mode between LEO satellites and user terminals includes:
virtualizing, based on SDN/NFV technology, the computing and storage resources of the MEC server corresponding to each LEO satellite, and, in combination with the TSN security protocol and various delay-related protocols, managing satellite resource configuration, route forwarding and network configuration according to the principle of minimizing delay as the optimization target, thereby establishing the delay-sensitive satellite elastic Internet architecture.
S12, establishing a satellite elastic Internet resource scheduling model according to the delay-sensitive satellite elastic Internet architecture. The satellite elastic Internet resource scheduling model can be understood as a scheduling model that, based on the architecture of step S11, considers on the data plane a scenario in which multiple LEO satellites cover multiple ground data nodes (user terminals) and determines the resource allocation strategy when user terminals offload tasks; it mainly comprises a communication model, a task model and a calculation model. The task model takes task priority into account, and the calculation model is built on a preset task division principle and mainly comprises a local offloading task calculation model and an MEC offloading task calculation model. Specifically, the step of establishing a satellite elastic Internet resource scheduling model according to the delay-sensitive satellite elastic Internet architecture includes:
acquiring the data transmission rate at which each user terminal uploads to each MEC server, and constructing the communication model from these data transmission rates based on the many-to-many mode; the communication model is expressed as:
[formula image BDA0003845522590000061]
where
[formula image BDA0003845522590000071]
Here, the sets shown in [formula image BDA0003845522590000072] and [formula image BDA0003845522590000073] denote the user terminal set and the MEC server set, respectively; the symbols in [formula image BDA0003845522590000074], together with $r_{i,j}(t)$, $I_{i,j}$, $h_{i,j}(t)$ and $s_{i,j}$, denote respectively the transmission delay, transmission energy consumption, transmission rate, inter-cell interference power, channel gain and straight-line distance when user terminal $u_i$ offloads a task to MEC server $b_j$ in time slot $t$; $W$ denotes the channel bandwidth; $\sigma^2$ denotes the noise power of the user equipment; $z_i(t)$ denotes the data size of the task $Q_i(t)$ generated by user terminal $u_i$ in time slot $t$; $c$ denotes the speed of light; and $p_i(t)$ denotes the transmission power of user terminal $u_i$ in time slot $t$;
it should be noted that, because the data volume of a processed task is very small, the communication model of the invention never considers the energy consumption or delay of downloading; in addition, in the many-to-many scenario of multiple user terminals and multiple LEO nodes, the handover delay of satellite communication is negligible under existing conditions; and because the user terminals are located far from the LEO satellites, the distance from a user terminal to an MEC server can be regarded as approximately equal to its distance to the corresponding LEO satellite; the task model below is built with attention restricted to the end-edge model, without considering the influence of the cloud;
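The formula images for the communication model are not reproduced in this text, so the following is only a minimal sketch of how such a model is commonly written, not the patent's own expressions. It assumes a Shannon-type rate with interference added to the noise term, an upload delay equal to the upload time plus one-way propagation time, and a transmission energy equal to transmit power times upload time. All function and parameter names are hypothetical.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # c, in m/s


def transmission_rate(bandwidth_hz, tx_power_w, channel_gain, noise_power_w, interference_w):
    """Assumed Shannon-type uplink rate r_{i,j}(t) in bit/s."""
    sinr = tx_power_w * channel_gain / (noise_power_w + interference_w)
    return bandwidth_hz * math.log2(1.0 + sinr)


def transmission_delay(data_bits, rate_bps, distance_m):
    """Assumed uplink delay: time to push z_i(t) bits plus propagation over s_{i,j}."""
    return data_bits / rate_bps + distance_m / SPEED_OF_LIGHT


def transmission_energy(data_bits, rate_bps, tx_power_w):
    """Assumed uplink energy: transmit power p_i(t) times the upload time."""
    return tx_power_w * data_bits / rate_bps
```

Download delay and energy are deliberately left out, matching the note above that the model ignores them.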
constructing the task model from the task computation amount, the task data amount and the task priority, based on a load balancing principle; the task model is expressed as:
$Q_i(t) = \{\omega_i(t),\ z_i(t),\ pri_i(t)\}$
where $Q_i(t)$ denotes the task generated by user terminal $u_i$ in time slot $t$; $\omega_i(t)$ denotes the amount of computation required by task $Q_i(t)$, i.e., the number of CPU cycles required to complete the task; $z_i(t)$ denotes the data size of task $Q_i(t)$; $pri_i(t)$ denotes the priority of task $Q_i(t)$, with $pri_i(t) \in \{1, 2, \dots, PN\}$, where $PN$ denotes the number of priority levels;
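As an illustration, the task triple can be held in a small immutable record. This is a sketch only: the class name, field names and the value chosen for PN are hypothetical, and the computation amount is assumed to be measured in CPU cycles.

```python
from dataclasses import dataclass

PN = 3  # hypothetical number of priority levels


@dataclass(frozen=True)
class Task:
    """One task Q_i(t) generated by a user terminal in a time slot."""

    cycles: float     # omega_i(t): computation amount (assumed unit: CPU cycles)
    data_bits: float  # z_i(t): data size of the task
    priority: int     # pri_i(t): integer in 1..PN

    def __post_init__(self):
        if self.cycles < 0 or self.data_bits < 0:
            raise ValueError("computation amount and data size must be non-negative")
        if not 1 <= self.priority <= PN:
            raise ValueError(f"priority must lie in 1..{PN}")
```

Validating the priority at construction time keeps later queueing calculations from silently receiving an out-of-range priority level.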
dividing the tasks of each user terminal into local offloading tasks and MEC offloading tasks, and constructing a corresponding local offloading task calculation model and MEC offloading task calculation model, respectively; the local offloading task calculation model can be understood as the model of the processing delay and energy consumption incurred when the user terminal device itself processes a local offloading task, and can be expressed as:
[formula image BDA0003845522590000081]
where the symbols in [formula image BDA0003845522590000082] and [formula image BDA0003845522590000083] denote, respectively, the processing delay and the corresponding energy consumption of user terminal $u_i$ for the local task $Q_i^L$; $f_i^L$ denotes the local CPU frequency of user terminal $u_i$; and $\rho_i$ denotes the power coefficient of the energy consumed per CPU cycle;
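Since the formula image for the local model is missing, the sketch below uses a form that is common in the mobile edge computing literature and is an assumption here: delay equals required cycles divided by the local CPU frequency, and energy per cycle equals the power coefficient times the square of the frequency. The patent's exact expressions may differ, and the function names are hypothetical.

```python
def local_processing_delay(cycles, local_cpu_hz):
    """Assumed local delay: omega_i / f_i^L."""
    if local_cpu_hz <= 0:
        raise ValueError("CPU frequency must be positive")
    return cycles / local_cpu_hz


def local_processing_energy(cycles, local_cpu_hz, power_coeff):
    """Assumed local energy: rho_i * (f_i^L)^2 joules per cycle, times omega_i cycles."""
    return power_coeff * local_cpu_hz ** 2 * cycles
```

For example, under these assumptions a task of 2e9 cycles on a 1 GHz device takes 2 s.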
the MEC offloading task calculation model can be understood as an MEC server processing delay model that accounts not only for the processing delay of a task offloaded by a user terminal to the MEC server but also for the queuing delay of the task, since several user terminals may offload tasks to the same MEC server and the tasks offloaded by different user terminals may have different priorities; it is expressed as:
[formula image BDA0003845522590000084]
where [formula image BDA0003845522590000085] denotes the processing delay when user terminal $u_i$ offloads a task to MEC server $b_j$ in time slot $t$; [formula image BDA0003845522590000086] denotes the CPU frequency of MEC server $b_j$; [formula image BDA0003845522590000087] denotes the proportion of computing resources that MEC server $b_j$ allocates to the MEC offloading task $Q_i^E$ in time slot $t$; [formula image BDA0003845522590000088] denotes the average queuing delay of an MEC offloading task $Q_i^E$ with priority $pri_i$; it should be noted that the energy consumed by the MEC server in processing tasks is not considered here;
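A minimal sketch of the MEC-side delay follows, assuming it is the sum of an execution time (required cycles divided by the share of the server's CPU frequency allocated to the task) and the average queuing delay for the task's priority. This additive form is inferred from the symbol definitions above and is not confirmed by the missing formula image; the names are hypothetical.

```python
def mec_processing_delay(cycles, server_cpu_hz, share, queuing_delay):
    """Assumed MEC delay: omega_i / (kappa_{i,j} * f_j) plus the average queuing delay."""
    if not 0.0 < share <= 1.0:
        raise ValueError("allocated share must lie in (0, 1]")
    if server_cpu_hz <= 0:
        raise ValueError("server CPU frequency must be positive")
    return cycles / (share * server_cpu_hz) + queuing_delay
```

Server-side energy is omitted, in line with the note that the MEC server's processing energy is not considered.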
The calculation model of the average queuing delay [formula image BDA0003845522590000089] depends on the queuing model actually adopted. For example, assume that the task queuing model is a non-preemptive priority queuing model (M/M/N queue) in which tasks of the same priority are served on a first-come, first-served basis; that, in any time slot, tasks of any priority arrive at the queue according to a Poisson distribution with parameter $\lambda_i(t)$; that the processing time of the MEC server follows an exponential distribution with parameter $\mu(t)$; and that there are multiple servers with sufficiently large storage space. With $PN$ priority levels as set in the task model, the average queuing delay of a task $Q_i$ with priority $pri_i$ is:
[formula image BDA00038455225900000810]
where
[formula image BDA0003845522590000091]
[formula image BDA0003845522590000092]
the total arrival rate is:
[formula image BDA0003845522590000093]
and the constraints include:
[formula image BDA0003845522590000094]
[formula image BDA0003845522590000095]
[formula image BDA0003845522590000096]
where $O_j$ denotes the set of computing tasks offloaded to MEC server $b_j$.
Specifically, the average queuing delay of tasks $Q_i$ of different priorities at MEC server $b_j$ can be calculated as follows:
when the task priority is $pri = 1$,
[formula image BDA0003845522590000097]
when the task priority is $pri = 2$,
[formula image BDA0003845522590000098]
and so on for the other priority levels; when the task priority is $pri = PN$,
[formula image BDA0003845522590000099]
which gives the queuing delay of the corresponding task.
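The per-priority expressions are lost with the formula images. As a hedged illustration, the sketch below uses the textbook result for a non-preemptive priority M/M/N queue in which every priority class shares the same service rate: the mean wait of class k is the Erlang C waiting probability divided by N times the service rate, then divided by (1 - sigma_{k-1})(1 - sigma_k), where sigma_k is the cumulative utilisation of classes 1..k. It assumes priority 1 is served first; whether the patent uses exactly this form is not confirmed, and the function names are hypothetical.

```python
import math


def erlang_c(servers, offered_load):
    """Probability that an arriving task has to wait in an M/M/N queue."""
    rho = offered_load / servers
    if rho >= 1.0:
        raise ValueError("queue is unstable: utilisation must be below 1")
    top = offered_load ** servers / math.factorial(servers) / (1.0 - rho)
    bottom = sum(offered_load ** k / math.factorial(k) for k in range(servers)) + top
    return top / bottom


def priority_queue_delays(arrival_rates, service_rate, servers):
    """Mean queuing delay per priority class; index 0 is priority 1 (assumed highest)."""
    total_rate = sum(arrival_rates)
    base = erlang_c(servers, total_rate / service_rate) / (servers * service_rate)
    delays, sigma_prev = [], 0.0
    for rate in arrival_rates:
        sigma = sigma_prev + rate / (servers * service_rate)
        delays.append(base / ((1.0 - sigma_prev) * (1.0 - sigma)))
        sigma_prev = sigma
    return delays
```

With a single class the function reduces to the ordinary M/M/N waiting time, and the arrival-rate-weighted mean over classes equals the first-come, first-served waiting time at the same total load.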
S13, establishing a minimum delay optimization model according to the satellite elastic Internet resource scheduling model; the minimum delay optimization model is a delay optimization model that accounts for the dynamic generation of tasks at the nodes and takes as its optimization target the minimization of the average processing delay of all tasks generated within the preset time range corresponding to the time slot set; specifically, the step of establishing a minimum delay optimization model according to the satellite elastic Internet resource scheduling model includes:
calculating, according to the satellite elastic Internet resource scheduling model, the average task processing delay of each time slot within the preset time range; the average task processing delay of each time slot can be calculated from the transmission delay in the communication model and the MEC offloading task calculation model; for example, the total delay generated by all tasks in time slot $t$ is:
[formula image BDA0003845522590000101]
where $l$ denotes the total number of user terminals that offload tasks to the MEC servers in time slot $t$;
based on the total delay, the average task processing delay per user terminal is
[formula image BDA0003845522590000102]
averaging the average task processing delays of all time slots gives the average task processing delay of the preset time range; for the preset time range corresponding to the time slot set $T$ it is:
[formula image BDA0003845522590000103]
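The two averaging steps can be sketched as follows, assuming the per-slot total is the sum of each offloading user terminal's transmission delay and MEC processing delay, and that the horizon average weights every time slot equally. The function names are hypothetical and the formula images themselves are not reproduced.

```python
def slot_average_delay(transmission_delays, processing_delays):
    """Average delay per offloading user terminal in one time slot."""
    if not transmission_delays or len(transmission_delays) != len(processing_delays):
        raise ValueError("need one transmission and one processing delay per user terminal")
    total = sum(tx + proc for tx, proc in zip(transmission_delays, processing_delays))
    return total / len(transmission_delays)


def horizon_average_delay(slot_averages):
    """Average of the per-slot averages over the preset time range."""
    if not slot_averages:
        raise ValueError("at least one time slot is required")
    return sum(slot_averages) / len(slot_averages)
```

For two user terminals with transmission delays 0.1 s and 0.3 s and processing delays 0.4 s and 0.2 s, the slot total is 1.0 s and the slot average is 0.5 s.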
taking minimization of the average task processing delay over the preset time range as the optimization target, constructing the minimum delay optimization model; the objective function of the minimum delay optimization model is expressed as:
[formula image BDA0003845522590000104]
where
[formula image BDA0003845522590000105]
Here $d(t)$ denotes the total delay generated by all tasks in time slot $t$; [formula image BDA0003845522590000106] and [formula image BDA0003845522590000107] denote, respectively, the transmission delay and the processing delay when user terminal $u_i$ offloads a task to MEC server $b_j$ in time slot $t$; $L$ denotes the total number of user terminals; $\kappa$ denotes the matrix of computing resource proportions that each MEC server allocates to the different user terminals;
the constraints of the minimum delay optimization model are expressed as:
[formula image BDA0003845522590000108]
[formula image BDA0003845522590000111]
where $O_j$ denotes the set of computing tasks offloaded to MEC server $b_j$; [formula image BDA0003845522590000112] denotes the transmission energy consumption when user terminal $u_i$ offloads a task to MEC server $b_j$ in time slot $t$; $T$ denotes the total number of time slots; $E_i$ denotes the upper limit on the transmission energy consumption of user terminal $u_i$; [formula image BDA0003845522590000113] denotes the proportion of computing resources that MEC server $b_j$ allocates in time slot $t$ to task $Q_i$ of user terminal $u_i$;
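The constraint images are missing, so the check below encodes only what the symbol definitions suggest and should be read as an assumption: every allocated proportion lies in [0, 1], the proportions handed out by one MEC server sum to at most 1, and each user terminal's accumulated transmission energy stays within its limit. The function name and argument layout are hypothetical.

```python
def allocation_is_feasible(shares, energy_used, energy_limits, tol=1e-9):
    """Check one candidate allocation against the assumed constraints.

    shares[j][i]     : proportion of server j's computing resources given to user i
    energy_used[i]   : transmission energy accumulated by user i
    energy_limits[i] : upper limit E_i for user i
    """
    for server_shares in shares:
        if any(k < -tol or k > 1.0 + tol for k in server_shares):
            return False
        if sum(server_shares) > 1.0 + tol:
            return False
    return all(used <= limit + tol for used, limit in zip(energy_used, energy_limits))
```

A boolean result of this kind is what a reward function needs in order to decide whether to apply a constraint-violation penalty.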
S14, converting the minimum delay optimization model into a corresponding Markov decision model; the Markov decision process (Markov Decision Process, MDP) model can be understood as a 4-tuple
[formula image BDA0003845522590000114]
where $S$ denotes the state space, $A$ denotes the action space, $R$ denotes the reward function, and $\chi \in [0, 1]$ denotes the discount coefficient; specifically, the step of converting the minimum delay optimization model into a corresponding Markov decision model includes:
constructing the state space of the Markov decision model from the environment state of each time slot; the environment state of each time slot is expressed as:
[formula image BDA0003845522590000115]
where $s(t)$ denotes the environment state of time slot $t$; $\omega(t)$, $z(t)$ and $pri(t)$ denote, respectively, the computation amounts, data amounts and priorities of all tasks in time slot $t$; [formula image BDA0003845522590000116] denotes the offloading strategy of the user terminals;
constructing the action space of the Markov decision model from the action of the agent in each time slot; the agent action of each time slot is expressed as:
$a(t) = \kappa(t)$
where $a(t)$ denotes the agent action of time slot $t$; $\kappa(t)$ denotes the proportions of computing resources that each MEC server allocates to the different user terminals in time slot $t$;
constructing a reward function of the Markov decision model according to Agent rewards of the agents in each time interval; the Agent action rewards are expressed as:
Figure BDA0003845522590000117
wherein r (t) represents an Agent action reward for the time slot t;
it should be noted that, for the conventional MDP problem, the cumulative reward function is maximized, and the objective of the minimum delay optimization model according to the present invention is to minimize the average delay, so that, on the basis of the objective function, the opposite number of delays is selected as the reward function, and the reward function is set to a minimum value when the constraint is not satisfied.
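As a non-limiting illustration, the reward rule described above (negative delay, with a minimum value when a constraint is violated) may be sketched in Python as follows. The function name, the argument names, the value of PENALTY and the exact form of the two constraint checks are assumptions made for this sketch; the constraint formulas of the embodiment appear only in the figures.

```python
# Illustrative sketch only; all names and the PENALTY value are assumptions.
PENALTY = -1e3  # stands in for the "minimum value" used when a constraint fails


def reward(total_delay_s, num_tasks, energy_used_j, energy_limit_j, alloc_shares):
    """Return the negative average delay, or PENALTY if a constraint is violated."""
    # Assumed energy constraint: each terminal stays within its energy limit.
    energy_ok = all(e <= lim for e, lim in zip(energy_used_j, energy_limit_j))
    # Assumed resource constraint: each server's shares are non-negative
    # and sum to at most 1.
    shares_ok = all(
        all(x >= 0.0 for x in row) and sum(row) <= 1.0 + 1e-9
        for row in alloc_shares
    )
    if not (energy_ok and shares_ok):
        return PENALTY
    return -total_delay_s / num_tasks
```

Because the agent maximizes the reward, a larger (less negative) value corresponds to a lower average delay.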
S15: solving the Markov decision model to obtain a resource scheduling strategy. Various existing methods can be used to solve the Markov decision model; to ensure efficient and accurate solving, this embodiment preferably builds a DDRA algorithm on the framework of the reinforcement learning algorithm TD3 and outputs the resource scheduling strategy through the trained neural network model. Specifically, the step of solving the Markov decision model to obtain the resource scheduling strategy includes:
constructing a delay-optimization DDRA algorithm based on the TD3 reinforcement learning framework, and solving the Markov decision model with the delay-optimization DDRA algorithm to obtain the resource scheduling strategy.
When the Markov decision model is solved based on the reinforcement learning algorithm TD3, the following six networks need to be trained:
the Actor network has parameter φ; its input is the current training environment state s(t), and its output is the current Agent policy π(a(t)|s(t); φ), i.e., the action probability distribution of the current time slot. The corresponding Target Actor network has parameter φ′; its input is the current implementation environment state s′(t), and its output is the current Agent policy π(a′(t)|s′(t); φ′), i.e., the action probability distribution of the current time slot;
the first Critic network has parameter θ_1; its inputs are the current Agent policy π(a(t)|s(t); φ) and the current training environment state s(t), and its output is the Q function of taking action a(t) in the current state,
Figure BDA0003845522590000121
i.e., the cumulative expected value of taking a particular resource allocation action in the current satellite communication environment state. The corresponding first Target Critic network has parameter θ′_1; its inputs are the current Agent policy π(a′(t)|s′(t); φ′) and the implementation environment state s′(t), and its output is the Q function of taking action a′(t) in the current state,
Figure BDA0003845522590000122
i.e., the cumulative expected value of taking a particular resource allocation action in the current satellite communication environment state;
the second Critic network has parameter θ_2; its inputs are the current Agent policy π(a(t)|s(t); φ) and the current training environment state s(t), and its output is the Q function of taking action a(t) in the current state,
Figure BDA0003845522590000131
i.e., the cumulative expected value of taking a particular resource allocation action in the current satellite communication environment state. The corresponding second Target Critic network has parameter θ′_2; its inputs are the current Agent policy π(a′(t)|s′(t); φ′) and the implementation environment state s′(t), and its output is the Q function of taking action a′(t) in the current state,
Figure BDA0003845522590000132
i.e., the cumulative expected value of taking a particular resource allocation action in the current satellite communication environment state;
The pseudocode for training the neural networks with the delay-optimization DDRA algorithm built on the TD3 framework is shown in FIG. 4, and the process comprises the following steps:
1) using the Actor network to interact with the environment, and storing the result tuple {s(t), a(t), r(t), s(t+1)} obtained at each interaction step in a cache;
2) randomly drawing a batch of tuples {s(t), a(t), r(t), s(t+1)} from the cache, and computing a(t+1) and the Q function; where the action a(t+1) taken in the current state is:
Figure BDA0003845522590000133
Figure BDA0003845522590000134
correspondingly, the Q function Q in the current state:
Figure BDA0003845522590000135
where σ' is the variance, c is the upper limit, γ is the Q function update rate;
3) computing and updating the parameter θ_1 of the first Critic network and the parameter θ_2 of the second Critic network; the corresponding update rules are as follows:
parameters of the first Critic network
Figure BDA0003845522590000136
Parameters of the second Critic network
Figure BDA0003845522590000137
4) updating, at a preset interval of d steps, the parameter φ of the Actor network, the parameter φ′ of the Target Actor network, the parameter θ′_1 of the first Target Critic network and the parameter θ′_2 of the second Target Critic network; the corresponding update rules are as follows:
parameter of the Actor network:
Figure BDA0003845522590000138
parameter of the Target Actor network: φ′ ← τφ + (1-τ)φ′
parameter of the first Target Critic network: θ′_1 ← τθ_1 + (1-τ)θ′_1
parameter of the second Target Critic network: θ′_2 ← τθ_2 + (1-τ)θ′_2
where
Figure BDA0003845522590000141
η is the Actor network update rate and τ is the Target network update rate;
5) Repeating the steps until the neural network converges.
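As a non-limiting illustration of steps 2) to 4), the clipped exploration noise, the target value formed from the two Target Critic outputs, and the soft update of the target parameters may be sketched as follows. The sketch follows the commonly published TD3 formulation, which is assumed here because the formulas of the embodiment appear only in the figures; in particular, sigma is treated as a standard deviation, gamma is applied as the factor multiplying the smaller of the two target Q values, and all function names are hypothetical.

```python
import random


def clipped_noise(sigma, c, rng=random):
    """Gaussian noise (standard deviation sigma, an assumption) clipped to [-c, c]."""
    return max(-c, min(c, rng.gauss(0.0, sigma)))


def td3_target(r, q1_next, q2_next, gamma):
    """Reward plus gamma times the smaller of the two Target Critic outputs."""
    return r + gamma * min(q1_next, q2_next)


def soft_update(target, online, tau):
    """Return tau * online + (1 - tau) * target, element by element."""
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online, target)]
```

Taking the minimum of the two target values is what limits overestimation of the Q function; `soft_update` implements the rule θ′ ← τθ + (1-τ)θ′ given in step 4).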
To make task offloading decisions, the Actor network must be trained repeatedly according to the above steps, with the first Critic network, the second Critic network, the Target Actor network, the first Target Critic network and the second Target Critic network updated and trained using the experiences in the cache; finally, the resource scheduling strategy is determined using the trained Target Actor network, first Target Critic network and second Target Critic network.
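As a non-limiting illustration, the cache that stores the tuples {s(t), a(t), r(t), s(t+1)} and returns random batches may be sketched as a fixed-capacity buffer. The class name, the method names and the eviction of the oldest entries are assumptions of this sketch, not features stated in the embodiment.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (s, a, r, s_next) tuples; oldest entries are dropped."""

    def __init__(self, capacity):
        self._items = deque(maxlen=capacity)

    def store(self, s, a, r, s_next):
        """Append one interaction result to the cache."""
        self._items.append((s, a, r, s_next))

    def sample(self, batch_size):
        """Draw a random batch without replacement (smaller if the cache is short)."""
        return random.sample(list(self._items), min(batch_size, len(self._items)))

    def __len__(self):
        return len(self._items)
```

Sampling at random rather than in arrival order reduces the correlation between consecutive training samples.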
The method addresses resource scheduling among edge servers under satellite network edge computing. A delay-sensitive satellite elastic Internet architecture is built on SDN/NFV technology and the TSN security protocol, a corresponding delay optimization model is derived, and the resource scheduling strategy is obtained by solving that model with a delay optimization algorithm built on a deep reinforcement learning framework. Because the optimization problem is modeled comprehensively for a practical scenario with limited resources and task queuing delay, the computing and storage resources of the satellites can be fully utilized and the performance impact of queuing delay is avoided. Task priorities can be distinguished, the diverse requirements of user terminals can be met, and service quality is improved. The purpose-built delay optimization algorithm also improves the computational efficiency of scheduling and allocating massive service resources, thereby achieving intelligent and efficient satellite elastic Internet resource scheduling with priority service, improved delay performance and guaranteed load balancing. This effectively overcomes the application defects of existing satellite resource scheduling schemes, and the method has strong generalization capability and high practical value.
In order to verify the application effect and performance of the DDRA scheme, the application also implements a relevant simulation comparison experiment, and the specific simulation process is as follows:
The simulation platform uses Python 3.9. Three LEO satellites at an altitude of 784 km fly over a square area of 1200 m × 1200 m, and by default 24 user terminals (the number is adjustable) are randomly distributed on the ground; each user terminal can offload its task only to the MEC server of one LEO satellite or process it locally. Because the altitude is far greater than the extent of the ground area, the distance between each user terminal and an MEC server is approximated by the altitude of the LEO satellite. Because the LEO satellites considered form a constellation, the loss from channel switching and the effect of the communication window are negligible; that is, a user terminal can be considered to be in communication with an LEO satellite at all times, and the channel gain can be obtained in advance through sensing technology. In addition, the transmit power of each user terminal is set to 23 dBm, the channel bandwidth is 20 MHz, and a free-space fading channel model is used. The task parameters considered are mainly the computation amount, data amount, priority and possible energy consumption. The task offloading decision of the previous time slot must be taken into account during computation, and the amount of computing resources that each LEO satellite can provide in each time slot is assumed to be a random value within a range. In the DDRA algorithm, after careful tuning, every neural network has 4 layers, namely 1 input layer, 2 hidden layers and 1 output layer; the hidden layers of the actor network have 2048 and 1024 neurons, and those of the critic network have 1024 and 512 neurons. The learning rate of the model is 0.001 and the discount factor is 0.75. The remaining parameter settings are detailed in FIG. 5;
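As a non-limiting illustration, the link quantities implied by the simulation settings may be checked with a short calculation. The sketch assumes that a satellite lies directly above one corner of the ground area, so that the largest horizontal offset is the diagonal of the square, and it uses a hypothetical carrier frequency of 2 GHz, which is not given in the text; all function and constant names are likewise assumptions.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def dbm_to_watt(power_dbm):
    """Convert a power level in dBm to watts."""
    return 10 ** ((power_dbm - 30.0) / 10.0)


def slant_range(altitude_m, ground_offset_m):
    """Straight-line distance to a satellite at the given altitude and horizontal offset."""
    return math.hypot(altitude_m, ground_offset_m)


def propagation_delay(distance_m):
    """One-way propagation delay in seconds."""
    return distance_m / SPEED_OF_LIGHT


def free_space_path_loss_db(distance_m, carrier_hz):
    """Free-space path loss in dB for the given distance and carrier frequency."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * carrier_hz / SPEED_OF_LIGHT)


ALTITUDE_M = 784e3
WORST_OFFSET_M = 1200.0 * math.sqrt(2.0)  # diagonal of the 1200 m x 1200 m area
ASSUMED_CARRIER_HZ = 2e9  # hypothetical value, not given in the text

tx_power_w = dbm_to_watt(23.0)
range_error_m = slant_range(ALTITUDE_M, WORST_OFFSET_M) - ALTITUDE_M
delay_s = propagation_delay(ALTITUDE_M)
loss_db = free_space_path_loss_db(ALTITUDE_M, ASSUMED_CARRIER_HZ)
```

Under these assumptions the transmit power of 23 dBm is about 0.2 W, the one-way propagation delay is about 2.6 ms, and replacing the slant range by the altitude changes the distance by less than 2 m, which is consistent with approximating the distance by the satellite altitude.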
Simulation A: given the continuity of the action space, the SAC algorithm, whose performance is similar to that of the DDRA algorithm, is selected, and the convergence rates of the DDRA algorithm and the other reinforcement learning algorithm are analyzed and compared. The SAC algorithm is likewise an improvement based on the DQN algorithm, and its Critic network computation likewise guards against overestimation. The difference is that SAC introduces policy entropy into the reward function, encouraging the Agent to explore more while maximizing reward; the design intent is for the Agent to search for a globally optimal solution rather than a locally optimal one as far as possible, which in turn takes more training time and makes training less efficient;
Simulation B: the optimized delay results of the DDRA algorithm and other resource scheduling algorithms (the SAC algorithm and a local optimization algorithm are selected) are analyzed and compared. The model based on the local optimization algorithm (denoted LOA) does not consider the effects of priority and queuing delay; its optimization problem can be shown to be convex, and the optimal solution is obtained by solving the Lagrangian of the problem through the KKT conditions. In the LOA, this optimal solution is then substituted into the objective function of the minimized delay optimization model to obtain the average delay;
Simulation C: with the mean data amount set to 3.5 Mb, the effect of the task computation amount on average delay performance under computing resource allocation is studied in scenarios where the computation amount follows a uniform distribution and a normal distribution, respectively;
Simulation D: with the mean computation amount set to 5.5 Gcycle, the effect of the task data amount on average delay performance under computing resource allocation is studied in scenarios where the data amount follows a uniform distribution and a normal distribution, respectively;
Simulation E: with the mean task computation amount and mean task data amount set to 5 Gcycle and 1 Mb respectively, and both following a normal distribution, the effect of the number of users on average delay performance under computing resource allocation is studied.
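As a non-limiting illustration, task workloads with a given mean under the two distributions used in simulations C to E may be generated as follows. The half-width of the uniform range, the standard deviation of the normal distribution, the seed and the truncation of non-positive samples are assumptions of this sketch; the text specifies only the means.

```python
import random


def sample_uniform(mean, half_width, n, rng):
    """Draw n workloads uniformly from [mean - half_width, mean + half_width]."""
    return [rng.uniform(mean - half_width, mean + half_width) for _ in range(n)]


def sample_normal(mean, std, n, rng):
    """Draw n workloads from a normal distribution, floored at a small positive value."""
    return [max(1e-6, rng.gauss(mean, std)) for _ in range(n)]


rng = random.Random(0)  # fixed seed so a run can be repeated
NUM_TERMINALS = 24      # default number of user terminals in the simulation

# Mean computation amount of 5.5 Gcycle, as in simulation D; spreads are assumed.
cycles_uniform = sample_uniform(5.5, 1.0, NUM_TERMINALS, rng)
cycles_normal = sample_normal(5.5, 0.5, NUM_TERMINALS, rng)
```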
Based on the simulation experiment, the following results are obtained:
The analysis results shown in FIGS. 6-7 were obtained from the simulation A experiments. FIG. 6 compares the effect of the discount factor on the convergence of the DDRA algorithm; in deep reinforcement learning, the discount factor is an important hyperparameter of the Markov decision process. Setting the discount factor to 0.05, 0.75 and 0.95 in turn shows that a discount factor that is too large or too small has an adverse effect: at 0.05 and 0.95 the average delay fluctuates considerably, indicating that the system is unstable in both cases, and at 0.05 in particular a large peak appears at around episode = 50. At 0.75 the system converges stably and at a moderate speed, so this value is used as the default hyperparameter of the model in the subsequent analysis of task computation amount and task data amount. The discount factor must therefore be chosen carefully, otherwise the system becomes unstable or converges to an unsuitable value. FIG. 7 compares the convergence of different reinforcement learning algorithms under SMTOM: DDRA has already converged to about 1.8 s at episode = 100, whereas the SAC algorithm converges to about 2 s only after episode = 5000. The DDRA algorithm thus converges significantly faster than the SAC algorithm, and the final average delay it converges to is also lower; that is, DDRA has higher computational efficiency and a better optimization result in the second scenario under SMTOM, which is very important in satellite communication scenarios where resources are scarce.
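As a non-limiting illustration of why the discount factor matters, the weight that a discounted sum places on future rewards may be computed directly. The quantity 1/(1 - discount) is the sum of the geometric weights and is often read as an effective look-ahead horizon; the function names are hypothetical, and the sketch does not reproduce the simulation itself.

```python
def effective_horizon(discount):
    """Sum of the weights 1 + d + d^2 + ... for a discount factor d in [0, 1)."""
    return 1.0 / (1.0 - discount)


def discounted_return(rewards, discount):
    """Sum of rewards weighted by increasing powers of the discount factor."""
    total = 0.0
    for k, r in enumerate(rewards):
        total += (discount ** k) * r
    return total


horizons = {d: effective_horizon(d) for d in (0.05, 0.75, 0.95)}
```

For the three values tested, the horizon is about 1 step at 0.05, 4 steps at 0.75 and 20 steps at 0.95, so 0.05 makes the agent nearly myopic while 0.95 makes its targets depend on many uncertain future steps.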
Based on simulation B and simulation C, the analysis results shown in FIGS. 8-9 were obtained. FIGS. 8-9 show that as the task computation amount increases, the average system delay tends to increase, with the SAC algorithm increasing less steadily. The DDRA proposed in this application and the local optimization algorithm LOA both perform better, with lower average delay than the SAC algorithm. In addition, when the computation amount is below 4.5 Gcycle, the DDRA algorithm performs worse than LOA, but above 4.5 Gcycle it performs better. Because LOA does not consider queuing delay, a larger task computation amount lengthens the processing time of each task and increases the queuing delay, and continuing to optimize only locally inevitably degrades delay performance. This reflects the advantage of the DDRA algorithm in application scenarios with a large computation amount, and the advantage of SMTMO in the satellite communication scenario.
When the data amount follows a uniform distribution in FIG. 8, the average delay of the SAC algorithm is 37.8% higher than that of the LOA algorithm, and that of the DDRA algorithm is 2.7% lower. FIG. 9 shows that when the data amount follows a normal distribution and the computation amount is relatively low, the average delay of each algorithm is relatively stable and the algorithms differ little in average delay performance, with LOA outperforming DDRA and SAC; as the computation amount increases, the SAC algorithm produces results with larger variance, possibly because it fails to converge to the optimal solution within the limited number of training iterations, while the DDRA algorithm overtakes LOA in delay performance, which reflects the superiority of the DDRA algorithm at high task computation amounts under SMTMO.
Based on simulation B and simulation D, the analysis results shown in FIGS. 10-11 were obtained. FIGS. 10-11 show that the task data amount has little effect on task offloading performance: as the data amount increases, the average delay of each algorithm generally rises only slightly, because the data amount affects only the transmission delay, which is usually a small part of the overall delay, and under LOA the optimal value obtained by gradient solving is independent of the task data amount. Of the three algorithms, SAC shows larger jitter, whereas LOA and DDRA rise smoothly. In addition, when the computation amount follows a uniform distribution in FIG. 10, the average delay of the SAC algorithm is 1.9% lower than that of LOA, and that of the DDRA algorithm is 11.5% lower; FIG. 11 shows that when the data amount follows a normal distribution, the average delay of the SAC algorithm is 2.1% higher than that of LOA, and that of the DDRA algorithm is 11.2% lower. Therefore, at higher task computation amounts, the average delay performance of the SAC algorithm is similar to that of LOA, while the DDRA proposed by the invention is about 11% better than LOA.
Based on simulation B and simulation E, the analysis results shown in FIG. 12 were obtained. FIG. 12 shows that, in general, the average delay under each algorithm increases gradually with the number of users; for the local optimization algorithm in particular, the growth of the average delay resembles exponential growth. Comparing the three algorithms, the DDRA algorithm has the best average delay performance, with an average delay of 6.5 s across the different user numbers in the simulation; the SAC algorithm performs slightly worse, with an average delay of 6.8 s; the local optimization algorithm has an average delay of 13.1 s. Because the local optimization algorithm ignores queuing delay, the effect is small when there are few users, but as the number of users grows the average delay rises sharply, so that delay performance with many user tasks degrades considerably and user experience suffers. Overall, across the different numbers of user terminals, the average delay of the DDRA algorithm is 4.4% lower than that of the SAC algorithm and 50.4% lower than that of the local optimization algorithm.
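As a non-limiting illustration, the two percentages quoted above follow from the reported average delays of 6.5 s, 6.8 s and 13.1 s when each reduction is taken relative to the slower algorithm. The function name is hypothetical.

```python
def relative_reduction(value, baseline):
    """Fraction by which value is lower than baseline."""
    return (baseline - value) / baseline


DDRA_S, SAC_S, LOA_S = 6.5, 6.8, 13.1  # average delays reported for FIG. 12

vs_sac_percent = round(100.0 * relative_reduction(DDRA_S, SAC_S), 1)
vs_loa_percent = round(100.0 * relative_reduction(DDRA_S, LOA_S), 1)
```

Here `vs_sac_percent` evaluates to 4.4 and `vs_loa_percent` to 50.4, matching the figures stated in the text.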
Based on the results of the simulation experiments, the invention establishes a satellite elastic Internet architecture and a satellite elastic Internet resource scheduling model for the resource scheduling problem in the satellite elastic Internet scenario. On this basis, it proposes an average delay objective under energy constraints, and further proposes a technical scheme that solves the resulting non-convex optimization problem using the reinforcement learning algorithm TD3 framework, so that the average delay of user terminal task offloading can be significantly reduced and practical, effective satellite resource scheduling is achieved.
Although the steps in the flowcharts described above are shown in order as indicated by arrows, these steps are not necessarily executed in order as indicated by the arrows. The steps are not strictly limited to the order of execution unless explicitly recited herein, and the steps may be executed in other orders.
In one embodiment, as shown in fig. 13, there is provided a satellite-resilient internet resource scheduling system, the system comprising:
an architecture construction module 1, used for establishing a delay-sensitive satellite elastic Internet architecture based on the many-to-many mode of LEO satellites and user terminals, where each LEO satellite corresponds to one MEC server;
a first modeling module 2, used for establishing a satellite elastic Internet resource scheduling model according to the delay-sensitive satellite elastic Internet architecture;
a second modeling module 3, used for establishing a minimized delay optimization model according to the satellite elastic Internet resource scheduling model;
a model conversion module 4, used for converting the minimized delay optimization model into a corresponding Markov decision model;
and a strategy solving module 5, used for solving the Markov decision model to obtain a resource scheduling strategy.
For specific limitations of the satellite elastic Internet resource scheduling system, reference may be made to the limitations of the satellite elastic Internet resource scheduling method above, which are not repeated here. Each module in the satellite elastic Internet resource scheduling system may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor of the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
Fig. 14 shows an internal structural diagram of a computer device, which may be a terminal or a server in particular, in one embodiment. As shown in fig. 14, the computer device includes a processor, a memory, a network interface, a display, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program when executed by a processor implements a satellite-resilient internet resource scheduling method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, can also be keys, a track ball or a touch pad arranged on the shell of the computer equipment, and can also be an external keyboard, a touch pad or a mouse and the like.
Those of ordinary skill in the art will appreciate that the structure shown in FIG. 14 is merely a block diagram of part of the structure relevant to the present application and does not limit the computer device to which the present application is applied; a particular computer device may include more or fewer components than shown, may combine certain components, or may have a different arrangement of components.
In one embodiment, a computer device is provided comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the above method when the computer program is executed.
In one embodiment, a computer readable storage medium is provided having a computer program stored thereon, which when executed by a processor, implements the steps of the above method.
In summary, the satellite elastic Internet resource scheduling method and system provided by the embodiments of the invention establish a delay-sensitive satellite elastic Internet architecture based on the many-to-many mode of LEO satellites and user terminals, establish a satellite elastic Internet resource scheduling model according to that architecture, establish a minimized delay optimization model according to the resource scheduling model, convert the minimized delay optimization model into a corresponding Markov decision model, and solve the Markov decision model to obtain a resource scheduling strategy.
In this specification, the embodiments are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the corresponding description of the method embodiments. It should be noted that the technical features of the foregoing embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered within the scope of this description.
The foregoing examples represent only a few preferred embodiments of the present application, which are described in more detail and are not thereby to be construed as limiting the scope of the invention. It should be noted that modifications and substitutions can be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and substitutions should also be considered to be within the scope of the present application. Therefore, the protection scope of the patent application is subject to the protection scope of the claims.

Claims (8)

1. The satellite elastic internet resource scheduling method is characterized by comprising the following steps of:
based on LEO satellite and user end many-to-many mode, establishing time delay sensitive satellite elastic Internet architecture; the LEO satellite corresponds to one MEC server;
establishing a satellite elastic Internet resource scheduling model according to the time delay sensitive satellite elastic Internet architecture; the satellite elastic Internet resource scheduling model comprises a communication model, a task model and a calculation model; the computing model comprises a local off-load task computing model and an MEC off-load task computing model;
establishing a minimum time delay optimization model according to the satellite elastic Internet resource scheduling model;
converting the minimized time delay optimization model into a corresponding Markov decision model;
solving a Markov decision model to obtain a resource scheduling strategy;
the step of establishing a satellite elastic internet resource scheduling model according to the time delay sensitive satellite elastic internet architecture comprises the following steps:
acquiring data transmission rates of the MEC servers uploaded by the user terminals, and constructing the communication model according to the data transmission rates based on the many-to-many mode; the communication model is expressed as:
Figure QLYQS_1
in which
Figure QLYQS_2
where
Figure QLYQS_3
and
Figure QLYQS_4
denote the set of user terminals and the set of MEC servers, respectively;
Figure QLYQS_5
r_{i,j}(t), I_{i,j}, h_{i,j}(t) and s_{i,j} denote, respectively, the transmission delay, transmission energy consumption, transmission rate, inter-cell interference power, channel gain and straight-line distance for user terminal u_i offloading its task to MEC server b_j in time slot t; W denotes the channel bandwidth; σ^2 denotes the noise power of the user equipment; z_i(t) denotes the data size of task Q_i(t) generated by user terminal u_i in time slot t; c denotes the speed of light; p_i(t) denotes the transmit power with which user terminal u_i transmits its signal in time slot t;
based on a load balancing principle, constructing a task model according to the task calculation amount, the task data amount and the task priority; the task model is expressed as:
Q_i(t)={ω_i(t), z_i(t), pri_i(t)}
where Q_i(t) denotes the task generated by user terminal u_i in time slot t; ω_i(t) denotes the computation amount required by task Q_i(t); z_i(t) denotes the data size of task Q_i(t); pri_i(t) denotes the priority of task Q_i(t), with pri_i(t)∈[1,2,…,PN], where PN denotes the number of priority levels;
dividing each user end task into a local offloading task and an MEC offloading task, and respectively constructing a corresponding local offloading task calculation model and an MEC offloading task calculation model; the local offload task computation model is expressed as:
Figure QLYQS_6
where
Figure QLYQS_7
and
Figure QLYQS_8
denote, respectively, the processing delay and the corresponding energy consumption of user terminal u_i for the local task Q_i^L; f_i^L denotes the local CPU frequency of user terminal u_i; ρ_i denotes the power coefficient of the energy consumed per CPU cycle;
the MEC off-load task calculation model is expressed as:
Figure QLYQS_9
where
Figure QLYQS_10
denotes the processing delay of user terminal u_i offloading its task to MEC server b_j in time slot t;
Figure QLYQS_11
denotes the CPU frequency of MEC server b_j;
Figure QLYQS_12
denotes the proportion of computing resources that MEC server b_j allocates to the MEC offloading task Q_i^E in time slot t;
Figure QLYQS_13
denotes the average queuing delay of the MEC offloading task Q_i^E with priority pri_i.
2. The method for scheduling satellite elastic internet resources according to claim 1, wherein the step of establishing a delay-sensitive satellite elastic internet architecture based on the LEO satellite and the many-to-many mode of the user terminal comprises:
virtualizing, based on SDN/NFV technology, the computing resources and storage resources of the MEC server corresponding to each LEO satellite, and, in combination with the TSN security protocol, integrating various delay-related protocols, so that satellite resource configuration, route forwarding and network configuration are managed according to the principle of minimizing delay as the optimization target, thereby establishing the delay-sensitive satellite elastic Internet architecture.
3. The method for scheduling satellite elastic internet resources according to claim 1, wherein the step of establishing a minimum delay optimization model according to the satellite elastic internet resource scheduling model comprises:
calculating the task processing average time delay of each time interval in a preset time range according to the satellite elastic Internet resource scheduling model;
averaging the task processing average time delays of all the time slots to obtain the task processing average time delay of a preset time range;
taking the task processing average time delay of the minimum preset time range as an optimization target, and constructing a minimum time delay optimization model; the objective function of the minimum delay optimization model is expressed as:
Figure QLYQS_14
in which
Figure QLYQS_15
where d(t) denotes the total delay generated by all tasks in time slot t;
Figure QLYQS_16
and
Figure QLYQS_17
denote, respectively, the transmission delay and the processing delay of user terminal u_i offloading its task to MEC server b_j in time slot t; L denotes the total number of user terminals; κ denotes the matrix of computing resource proportions that each MEC server allocates to the different user terminals;
the constraints of the minimum delay optimization model are expressed as:
Figure QLYQS_18
Figure QLYQS_19
where O_j represents the set of computing tasks offloaded to MEC server b_j;
Figure QLYQS_20
represents the transmission energy consumed when user terminal u_i offloads a task to MEC server b_j in time slot t; T represents the total number of time slots; E_i represents the upper limit of the transmission energy consumption of user terminal u_i;
Figure QLYQS_21
represents the proportion of computing resources that MEC server b_j allocates to task Q_i of user terminal u_i in time slot t.
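As a non-limiting sketch of the optimization model of claim 3, the following code evaluates the objective and checks feasibility. It assumes that the objective is the mean over the T time slots of d(t)/L, that the shares allocated by each server sum to at most 1 in every slot, and that each user terminal's transmission energy summed over all slots must not exceed E_i. The exact objective and constraints appear only in the formula figures, so these forms and all names are assumptions.

```python
import numpy as np


def average_delay(d, num_users):
    """d[t] is the total delay of all tasks in slot t; returns the
    per-task delay averaged over all slots (assumed objective)."""
    d = np.asarray(d, dtype=float)
    return float(np.mean(d / num_users))


def is_feasible(kappa, tx_energy, energy_cap):
    """Check the assumed constraints.

    kappa[t, i, j]  -- share of server j given to user i in slot t
    tx_energy[t, i] -- transmission energy of user i in slot t
    energy_cap[i]   -- upper energy limit E_i of user i
    """
    kappa = np.asarray(kappa, dtype=float)
    tx_energy = np.asarray(tx_energy, dtype=float)
    energy_cap = np.asarray(energy_cap, dtype=float)
    # Shares are non-negative and each server hands out at most 100% per slot.
    shares_ok = np.all(kappa >= 0) and np.all(kappa.sum(axis=1) <= 1 + 1e-9)
    # Total energy over all slots stays under each user's cap (assumed form).
    energy_ok = np.all(tx_energy.sum(axis=0) <= energy_cap + 1e-12)
    return bool(shares_ok and energy_ok)


# Hypothetical instance: 2 slots, 2 users, 1 server.
objective = average_delay([3.0, 5.0], num_users=2)
feasible = is_feasible(
    kappa=[[[0.4], [0.6]], [[0.5], [0.5]]],
    tx_energy=[[0.2, 0.3], [0.1, 0.1]],
    energy_cap=[0.5, 0.5],
)
```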
4. The satellite elastic Internet resource scheduling method according to claim 3, wherein the step of converting the minimum delay optimization model into a corresponding Markov decision model comprises:
constructing a state space of the Markov decision model according to the environment state of each time slot; the environment state of each time slot is expressed as:
Figure QLYQS_22
where s(t) represents the environment state of time slot t; ω(t), z(t) and pri(t) respectively represent the computation amounts, data sizes and priorities of all tasks in time slot t;
Figure QLYQS_23
represents the offloading strategy of the user terminals;
constructing an action space of the Markov decision model according to the agent action of each time slot; the agent action of each time slot is expressed as:
a(t)=κ(t)
where a(t) represents the agent action of time slot t; κ(t) represents the proportion of computing resources that each MEC server allocates to the different user terminals in time slot t;
constructing a reward function of the Markov decision model according to the agent action reward of each time slot; the agent action reward is expressed as:
Figure QLYQS_24
where r(t) represents the agent action reward of time slot t.
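The Markov decision model of claim 4 can be illustrated, without limitation, by the sketch below. The state concatenates ω(t), z(t), pri(t) and the offloading strategy, and the action is the allocation matrix κ(t), as recited above. The reward r(t) = -d(t)/L and the normalization of raw actor outputs into valid shares are assumptions, since the reward expression is given only in the formula figure; all function names are hypothetical.

```python
import numpy as np


def build_state(omega, z, pri, offload):
    """Flatten computation amounts, data sizes, priorities and the
    offloading decisions of one time slot into a single state vector."""
    parts = [np.ravel(omega), np.ravel(z), np.ravel(pri), np.ravel(offload)]
    return np.concatenate(parts).astype(float)


def normalize_action(raw, offload_mask):
    """Turn raw actor outputs (users x servers) into shares kappa:
    zero where no task is offloaded, each server's column summing to 1."""
    kappa = np.clip(np.asarray(raw, dtype=float), 0.0, None) * offload_mask
    col = kappa.sum(axis=0, keepdims=True)
    return np.divide(kappa, col, out=np.zeros_like(kappa), where=col > 0)


def reward(total_delay, num_users):
    """Assumed reward: negative average delay, so less delay is better."""
    return -total_delay / num_users


# Hypothetical slot with 2 users and 2 servers.
mask = np.array([[1, 0], [1, 1]])
state = build_state([1e9, 2e9], [3e6, 4e6], [1, 2], mask)
kappa = normalize_action(np.array([[1.0, 2.0], [3.0, -1.0]]), mask)
r = reward(total_delay=4.0, num_users=2)
```

A negative-delay reward makes maximizing the return equivalent to minimizing the accumulated delay, which is why it is a common choice for delay-minimization problems.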
5. The satellite elastic Internet resource scheduling method according to claim 1, wherein the step of solving the Markov decision model to obtain a resource scheduling strategy comprises:
constructing a delay optimization DDRA algorithm based on the framework of the TD3 reinforcement learning algorithm, and solving the Markov decision model through the delay optimization DDRA algorithm to obtain the resource scheduling strategy.
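For orientation only, the sketch below shows the critic target used by the standard TD3 framework, on which the DDRA algorithm is stated to be built. It is not the DDRA algorithm itself, whose details are not given in this claim. TD3 combines target policy smoothing (clipped noise added to the target action), clipped double-Q learning (the minimum of two target critics) and delayed policy updates; the first two appear here. The callables, hyperparameter values and the action range [0, 1] chosen for resource shares are assumptions.

```python
import numpy as np


def td3_target(reward, next_state, done,
               actor_target, critic1_target, critic2_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               a_low=0.0, a_high=1.0, rng=None):
    """Standard TD3 target: r + gamma * (1 - done) * min(Q1', Q2')."""
    rng = np.random.default_rng() if rng is None else rng
    a_next = np.asarray(actor_target(next_state), dtype=float)
    # Target policy smoothing: perturb the target action with clipped noise.
    noise = rng.normal(0.0, noise_std, size=a_next.shape)
    a_next = np.clip(a_next + np.clip(noise, -noise_clip, noise_clip),
                     a_low, a_high)
    # Clipped double-Q: use the smaller of the two target critic estimates.
    q_min = np.minimum(critic1_target(next_state, a_next),
                       critic2_target(next_state, a_next))
    return reward + gamma * (1.0 - done) * q_min


# Stub networks (hypothetical) to show the call shape.
y = td3_target(
    reward=-1.0, next_state=np.zeros(4), done=0.0,
    actor_target=lambda s: np.full(2, 0.5),
    critic1_target=lambda s, a: 10.0,
    critic2_target=lambda s, a: 4.0,
    rng=np.random.default_rng(0),
)
```

In a full training loop the actor and the target networks would be updated less frequently than the critics, which is the delayed policy update of TD3.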
6. A satellite elastic Internet resource scheduling system, the system comprising:
an architecture construction module, configured to establish a delay-sensitive satellite elastic Internet architecture based on a many-to-many mode between LEO satellites and user terminals, wherein each LEO satellite corresponds to one MEC server;
a first modeling module, configured to establish a satellite elastic Internet resource scheduling model according to the delay-sensitive satellite elastic Internet architecture; the satellite elastic Internet resource scheduling model comprises a communication model, a task model and a calculation model; the calculation model comprises a local offload task calculation model and an MEC offload task calculation model;
a second modeling module, configured to establish a minimum delay optimization model according to the satellite elastic Internet resource scheduling model;
a model conversion module, configured to convert the minimum delay optimization model into a corresponding Markov decision model;
a strategy solving module, configured to solve the Markov decision model to obtain a resource scheduling strategy;
wherein establishing the satellite elastic Internet resource scheduling model according to the delay-sensitive satellite elastic Internet architecture comprises:
acquiring the data transmission rate at which each user terminal uploads to each MEC server, and constructing the communication model from the data transmission rates based on the many-to-many mode; the communication model is expressed as:
Figure QLYQS_25
where
Figure QLYQS_26
in which
Figure QLYQS_27
and
Figure QLYQS_28
respectively represent the set of user terminals and the set of MEC servers;
Figure QLYQS_29
r_{i,j}(t), I_{i,j}, h_{i,j}(t) and s_{i,j} respectively represent the transmission delay, transmission energy consumption, transmission rate, inter-cell interference power, channel gain and straight-line distance when user terminal u_i offloads a task to MEC server b_j in time slot t; W represents the channel bandwidth; σ^2 represents the noise power of the user equipment; z_i(t) represents the data size of task Q_i(t) generated by user terminal u_i in time slot t; c represents the speed of light; p_i(t) represents the transmission power with which user terminal u_i transmits its signal in time slot t;
constructing, based on a load balancing principle, the task model according to the task computation amount, the task data size and the task priority; the task model is expressed as:
Q_i(t) = {ω_i(t), z_i(t), pri_i(t)}
where Q_i(t) represents the task generated by user terminal u_i in time slot t; ω_i(t) represents the computation amount required by task Q_i(t); z_i(t) represents the data size of task Q_i(t); pri_i(t) represents the priority of task Q_i(t), with pri_i(t) ∈ [1, 2, …, PN], where PN represents the number of priority levels;
dividing the tasks of each user terminal into local offload tasks and MEC offload tasks, and respectively constructing the corresponding local offload task calculation model and MEC offload task calculation model; the local offload task calculation model is expressed as:
Figure QLYQS_30
where
Figure QLYQS_31
and
Figure QLYQS_32
respectively represent the processing delay and the corresponding energy consumption of user terminal u_i for the local task Q_i^L; f_i^L represents the local CPU frequency of user terminal u_i; ρ_i represents the power coefficient of the energy consumed per CPU cycle;
the MEC offload task calculation model is expressed as:
Figure QLYQS_33
where
Figure QLYQS_34
represents the processing delay when user terminal u_i offloads a task to MEC server b_j in time slot t;
Figure QLYQS_35
represents the CPU frequency of MEC server b_j;
Figure QLYQS_36
represents the proportion of computing resources that MEC server b_j allocates to the MEC offload task Q_i^E in time slot t;
Figure QLYQS_37
represents the average queuing delay of an MEC offload task Q_i^E with priority pri_i.
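The communication model, task model and local offload task calculation model recited in claim 6 can be sketched, without limitation, as follows. The formulas themselves are given only in the formula figures, so the sketch assumes commonly used forms: a Shannon-type rate W·log2(1 + p·h/(σ^2 + I)), a transmission delay equal to the upload time z/r plus the propagation time s/c, a transmission energy of p·z/r, a local delay of ω/f, and a local energy of ρ·f^2 per CPU cycle. All class and function names are hypothetical.

```python
import math
from dataclasses import dataclass

C_LIGHT = 299_792_458.0  # speed of light c in m/s


@dataclass
class Task:
    omega: float  # required computation amount (CPU cycles)
    z: float      # data size (bits)
    pri: int      # priority in 1..PN


def transmission_rate(W, p, h, interference, sigma2):
    """Assumed Shannon-type uplink rate in bits per second."""
    return W * math.log2(1.0 + p * h / (sigma2 + interference))


def transmission_delay(task, rate, distance):
    """Assumed: upload time plus propagation time over the link."""
    return task.z / rate + distance / C_LIGHT


def transmission_energy(task, rate, p):
    """Assumed: transmit power multiplied by the upload time."""
    return p * task.z / rate


def local_cost(task, f_local, rho):
    """Assumed local model: delay omega / f, energy rho * f^2 per cycle."""
    delay = task.omega / f_local
    energy = rho * f_local ** 2 * task.omega
    return delay, energy


# Hypothetical numbers chosen so that the SINR equals 3 (rate = 2 * W).
task = Task(omega=1e9, z=4e6, pri=1)
rate = transmission_rate(W=1e6, p=1.0, h=3.0, interference=0.5, sigma2=0.5)
t_tx = transmission_delay(task, rate, distance=C_LIGHT * 0.002)
e_tx = transmission_energy(task, rate, p=1.0)
t_loc, e_loc = local_cost(task, f_local=1e9, rho=1e-27)
```

For LEO links the propagation term s/c is typically a few milliseconds, so it is not negligible next to the upload time for small tasks; this is presumably why the straight-line distance and the speed of light are listed among the model symbols.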
7. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of any of claims 1 to 5 when the computer program is executed.
8. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 5.
CN202211125448.7A 2022-09-14 2022-09-14 Satellite elastic Internet resource scheduling method, system, computer equipment and medium Active CN115514769B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202211125448.7A CN115514769B (en) 2022-09-14 2022-09-14 Satellite elastic Internet resource scheduling method, system, computer equipment and medium
ZA2023/05873A ZA202305873B (en) 2022-09-14 2023-06-01 Resource scheduling method and system for satellite elastic internet, computer device, and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211125448.7A CN115514769B (en) 2022-09-14 2022-09-14 Satellite elastic Internet resource scheduling method, system, computer equipment and medium

Publications (2)

Publication Number Publication Date
CN115514769A CN115514769A (en) 2022-12-23
CN115514769B true CN115514769B (en) 2023-06-06

Family

ID=84504538

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211125448.7A Active CN115514769B (en) 2022-09-14 2022-09-14 Satellite elastic Internet resource scheduling method, system, computer equipment and medium

Country Status (2)

Country Link
CN (1) CN115514769B (en)
ZA (1) ZA202305873B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116346206A (en) * 2023-03-27 2023-06-27 Guangzhou Aipu Road Network Technology Co., Ltd. AI/ML model distributed transmission method, device and system based on low orbit satellite and 5GS

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111211830A (en) * 2020-01-13 2020-05-29 University of Electronic Science and Technology of China Satellite uplink bandwidth resource allocation method based on Markov prediction
CN113346944A (en) * 2021-06-28 2021-09-03 Shanghai Jiao Tong University Time delay minimization calculation task unloading method and system in air-space-ground integrated network
CN114362810A (en) * 2022-01-11 2022-04-15 Chongqing University of Posts and Telecommunications Low-orbit satellite beam hopping optimization method based on migration depth reinforcement learning
CN114884949A (en) * 2022-05-07 2022-08-09 Chongqing University of Posts and Telecommunications Low-orbit satellite Internet of things task unloading method based on MADDPG algorithm
CN114900225A (en) * 2022-04-24 2022-08-12 Nanjing University Low-orbit giant constellation-based civil aviation Internet service management and access resource allocation method
CN114928394A (en) * 2022-04-06 2022-08-19 Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences Low-orbit satellite edge computing resource allocation method with optimized energy consumption
CN114980039A (en) * 2022-05-24 2022-08-30 Sun Yat Sen University Random task scheduling and resource allocation method in MEC system of D2D cooperative computing

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11187548B2 (en) * 2019-02-05 2021-11-30 International Business Machines Corporation Planning vehicle computational unit migration based on mobility prediction

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111211830A (en) * 2020-01-13 2020-05-29 University of Electronic Science and Technology of China Satellite uplink bandwidth resource allocation method based on Markov prediction
CN113346944A (en) * 2021-06-28 2021-09-03 Shanghai Jiao Tong University Time delay minimization calculation task unloading method and system in air-space-ground integrated network
CN114362810A (en) * 2022-01-11 2022-04-15 Chongqing University of Posts and Telecommunications Low-orbit satellite beam hopping optimization method based on migration depth reinforcement learning
CN114928394A (en) * 2022-04-06 2022-08-19 Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences Low-orbit satellite edge computing resource allocation method with optimized energy consumption
CN114900225A (en) * 2022-04-24 2022-08-12 Nanjing University Low-orbit giant constellation-based civil aviation Internet service management and access resource allocation method
CN114884949A (en) * 2022-05-07 2022-08-09 Chongqing University of Posts and Telecommunications Low-orbit satellite Internet of things task unloading method based on MADDPG algorithm
CN114980039A (en) * 2022-05-24 2022-08-30 Sun Yat Sen University Random task scheduling and resource allocation method in MEC system of D2D cooperative computing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
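Online distributed access scheduling algorithm for space-based laser networks; Wang Shichao; Wu Bin; Wang Bo; Laser & Optoelectronics Progress (No. 03); pp. 1-3 *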

Also Published As

Publication number Publication date
ZA202305873B (en) 2023-12-20
CN115514769A (en) 2022-12-23

Similar Documents

Publication Publication Date Title
CN109684075B (en) Method for unloading computing tasks based on edge computing and cloud computing cooperation
CN109818786B (en) Method for optimally selecting distributed multi-resource combined path capable of sensing application of cloud data center
CN109151864B (en) Migration decision and resource optimal allocation method for mobile edge computing ultra-dense network
CN112911648A (en) Air-ground combined mobile edge calculation unloading optimization method
Shu et al. Dependency-aware and latency-optimal computation offloading for multi-user edge computing networks
CN113778648A (en) Task scheduling method based on deep reinforcement learning in hierarchical edge computing environment
Yuan et al. Online dispatching and fair scheduling of edge computing tasks: A learning-based approach
EP4024212A1 (en) Method for scheduling interference workloads on edge network resources
Huang et al. Toward decentralized and collaborative deep learning inference for intelligent IoT devices
CN113377533A (en) Dynamic computation unloading and server deployment method in unmanned aerial vehicle assisted mobile edge computation
CN115190033B (en) Cloud edge fusion network task unloading method based on reinforcement learning
CN115514769B (en) Satellite elastic Internet resource scheduling method, system, computer equipment and medium
Qi et al. Vehicular edge computing via deep reinforcement learning
CN113573363A (en) MEC calculation unloading and resource allocation method based on deep reinforcement learning
Chiang et al. Deep Q-learning-based dynamic network slicing and task offloading in edge network
CN113821346B (en) Edge computing unloading and resource management method based on deep reinforcement learning
Hu et al. Dynamic task offloading in MEC-enabled IoT networks: A hybrid DDPG-D3QN approach
CN115499875B (en) Satellite internet task unloading method, system and readable storage medium
Henna et al. Distributed and collaborative high-speed inference deep learning for mobile edge with topological dependencies
CN116781532A (en) Optimization mapping method of service function chains in converged network architecture and related equipment
Ge et al. Mobile edge computing against smart attacks with deep reinforcement learning in cognitive MIMO IoT systems
CN113157344B (en) DRL-based energy consumption perception task unloading method in mobile edge computing environment
CN114698125A (en) Method, device and system for optimizing computation offload of mobile edge computing network
CN114785692A (en) Virtual power plant aggregation regulation and control communication network flow balancing method and device
CN116418808A (en) Combined computing unloading and resource allocation method and device for MEC

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant