CN115412156B - Urban monitoring-oriented satellite energy-carrying Internet of things resource optimal allocation method - Google Patents


Info

Publication number
CN115412156B
Authority
CN
China
Prior art keywords
leo
node
monitoring
energy
time slot
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211006067.7A
Other languages
Chinese (zh)
Other versions
CN115412156A (en)
Inventor
李源
许海涛
徐佳康
杨仁金
张海旺
吕挺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Penghu Wuyu Technology Development Co ltd
Original Assignee
Beijing Penghu Wuyu Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Penghu Wuyu Technology Development Co ltd filed Critical Beijing Penghu Wuyu Technology Development Co ltd
Priority to CN202211006067.7A priority Critical patent/CN115412156B/en
Publication of CN115412156A publication Critical patent/CN115412156A/en
Application granted granted Critical
Publication of CN115412156B publication Critical patent/CN115412156B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B7/00 Radio transmission systems, i.e. using radiation field
    • H04B7/14 Relay systems
    • H04B7/15 Active relay systems
    • H04B7/185 Space-based or airborne stations; Stations for satellite systems
    • H04B7/1851 Systems using a satellite or space-based relay
    • H04B7/18513 Transmission in a satellite or space-based system
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B17/00 Monitoring; Testing
    • H04B17/30 Monitoring; Testing of propagation channels
    • H04B17/391 Modelling the propagation channel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B7/00 Radio transmission systems, i.e. using radiation field
    • H04B7/14 Relay systems
    • H04B7/15 Active relay systems
    • H04B7/185 Space-based or airborne stations; Stations for satellite systems
    • H04B7/1851 Systems using a satellite or space-based relay
    • H04B7/18519 Operations control, administration or maintenance
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W16/00 Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
    • H04W16/02 Resource partitioning among network components, e.g. reuse partitioning
    • H04W16/10 Dynamic resource partitioning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W16/00 Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
    • H04W16/22 Traffic simulation tools or models
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W28/00 Network traffic management; Network resource management
    • H04W28/16 Central resource management; Negotiation of resources or communication parameters, e.g. negotiating bandwidth or QoS [Quality of Service]
    • H04W28/18 Negotiating wireless communication parameters
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W28/00 Network traffic management; Network resource management
    • H04W28/16 Central resource management; Negotiation of resources or communication parameters, e.g. negotiating bandwidth or QoS [Quality of Service]
    • H04W28/24 Negotiating SLA [Service Level Agreement]; Negotiating QoS [Quality of Service]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Astronomy & Astrophysics (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • General Physics & Mathematics (AREA)
  • Electromagnetism (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Radio Relay Systems (AREA)

Abstract

The invention provides an urban-monitoring-oriented method for optimal resource allocation in a satellite energy-carrying Internet of Things, comprising the following steps: S1, constructing a low-Earth-orbit (LEO) satellite-assisted urban monitoring network model in which the LEO provides energy transmission and data acquisition services for multiple monitoring nodes; S2, performing cluster head election and resource allocation using a K-Means-based monitoring node cluster head election algorithm; S3, optimizing network resource allocation by jointly considering data acquisition, energy transmission, and LEO energy consumption; S4, based on a Markov decision process problem model and the DDPG algorithm, performing multi-objective joint optimization of data acquisition and energy transmission in the LEO-assisted urban monitoring network. The optimization objective is to maximize the uplink data collection amount and the downlink energy transmission amount by jointly optimizing the LEO flight decision and resource allocation, while reducing LEO energy consumption as much as possible.

Description

Urban monitoring-oriented satellite energy-carrying Internet of things resource optimal allocation method
Technical Field
The invention relates to the field of wireless communication of the low-orbit satellite Internet of things, in particular to an urban monitoring-oriented satellite energy-carrying Internet of things resource optimal allocation method.
Background
In a low-Earth-orbit (LEO) satellite-assisted urban monitoring network, the LEO provides energy transmission and data acquisition services for multiple monitoring nodes through mobile deployment. However, because the monitoring nodes perform different tasks and are unevenly distributed, they differ in data generation rate, distribution density, energy consumption rate, and so on. In addition, monitoring nodes closer to the information collecting node carry more communication load and therefore tend to exhaust their energy prematurely. If the monitoring nodes do not receive timely data acquisition and energy transmission service, serious energy holes and data loss can result.
Disclosure of Invention
The optimization objective of the invention is to maximize the uplink data collection amount and the downlink energy transmission amount by jointly optimizing the LEO flight decision and the resource allocation, while reducing LEO energy consumption as much as possible. When the multi-objective optimization problem is actually formulated, the three objectives conflict to a certain extent. Finding the best coverage service position for the flight decision while also optimizing the resource allocation decision is a very complex process and incurs considerable computational cost. Furthermore, because the environment is only partially observable, conventional model-based methods such as dynamic programming cannot solve the problem effectively.
Therefore, the invention divides the problem into two parts: cluster head election and resource allocation. First, a clustering model for the LEO-assisted urban monitoring network is proposed together with a corresponding cluster head election algorithm. After the nodes are clustered, a suitable monitoring node is selected from each cluster as the cluster head; the cluster head collects the data of the monitoring nodes in its cluster and forwards it to the LEO.
Then, a resource allocation strategy for the LEO-assisted urban monitoring network is proposed together with a corresponding algorithm. The problem can be described as a Markov decision process, from which the problem model is built. The monitoring nodes in this scenario are densely distributed, and the DQN algorithm is not suited to continuous action spaces. DDPG, a classical DRL algorithm, has been shown to learn effective policies in continuous action spaces from low-dimensional observations, which makes it suitable for the LEO flight decision problem. Note, however, that the reward in the original DDPG algorithm is a scalar value.
The technical scheme of the invention is as follows:
An urban-monitoring-oriented method for optimal resource allocation in a satellite energy-carrying Internet of Things comprises the following steps:
S1, constructing a LEO-assisted urban monitoring network model in which the LEO provides energy transmission and data acquisition services for multiple monitoring nodes;
S2, performing cluster head election and resource allocation using a K-Means-based monitoring node cluster head election algorithm;
S3, optimizing network resource allocation by jointly considering data acquisition, energy transmission, and LEO energy consumption;
S4, based on a Markov decision process problem model and the DDPG algorithm, performing multi-objective joint optimization of data acquisition and energy transmission in the LEO-assisted urban monitoring network.
The method specifically comprises the following steps:
S1, constructing the LEO-assisted urban monitoring network model. In the specific scenario, a single LEO provides energy transmission and data acquisition services for a plurality of monitoring nodes through mobile deployment. The LEO is equipped with a single antenna and each node with multiple antennas; information decoding and energy harvesting are performed separately based on an antenna switching structure.
A transmission queue model of the system is constructed.
A probabilistic channel model is established by jointly considering the occurrence probabilities of line-of-sight and non-line-of-sight links, and is used as the channel model between the LEO and the ground monitoring nodes.
The LEO determines its next action in real time while moving and updates its position; an energy consumption model of the system is constructed by jointly considering flight energy consumption, coverage service energy consumption, and communication energy consumption.
The LEO moves to the target node location according to its flight speed and yaw angle decisions and, within a sub-slot, sends radio frequency signals to the monitoring nodes at a specific transmit power. Within that sub-slot, all monitoring nodes within the energy transmission coverage of the satellite are charged, and an energy transmission model of the system is built accordingly.
S2, dividing all M monitoring nodes into K clusters using the K-Means algorithm and selecting a suitable monitoring node from each cluster as the cluster head. The cluster head collects the data of the monitoring nodes in its cluster and forwards it to the LEO. The LEO transfers energy to all nodes in the cluster during the coverage service stage. In each time slot, the LEO selects one cluster head node as the target node of the next service, taking the service priority of the nodes into account;
S3, jointly considering the data acquisition requirement, the energy transmission requirement, and LEO energy consumption; maximizing the uplink data collection amount and the downlink energy transmission amount through joint optimization of the LEO flight decision and resource allocation; and defining and solving the resulting multi-objective optimization problem.
S4, constructing the problem model as a Markov decision process. The state space of the system is described first. Then, based on the system state and environment in the LEO-assisted urban monitoring network scenario, the action space is constructed: the action selected by the LEO in a given time slot comprises its flight speed, flight angle, time slot allocation, and transmit power allocation. Finally, a reward function is established as the quantitative evaluation of the action taken by the agent in reinforcement learning.
Further:
S1 specifically comprises the following steps:
S101, constructing the LEO-assisted urban monitoring network model
In this scenario, a single LEO provides energy transmission and data acquisition services for multiple monitoring nodes through mobile deployment. The LEO is equipped with a single antenna and each node with multiple antennas; information decoding and energy harvesting are performed separately based on an antenna switching structure;
The duration of each flight task is T > 0, and the total time is divided into equal-length time slots, i.e., t = 1, 2, …. The monitoring node receives the radio frequency signal through the antenna switching structure, decoding information and harvesting energy at the same time, and the cluster head node uploads its monitoring data to the LEO in the uplink sub-slot.
S102, constructing a transmission queue model
In this scenario, the monitoring nodes are denoted by
Figure BDA0003809164500000031
and the position of node m is denoted $[x_m, y_m]$. For monitoring node
Figure BDA0003809164500000032
let $\lambda_m(t)$ denote the data generation rate of node m while executing the monitoring task in time slot t. The $\lambda_m(t)$ of the different nodes obey a Poisson distribution whose parameter is constant during the monitoring task, i.e., $\lambda_m(t) = \lambda_m$. Let
Figure BDA0003809164500000033
denote the length of the data waiting to be uploaded in the data transmission queue of monitoring node m in time slot t.
Figure BDA0003809164500000034
is expressed as:
Figure BDA0003809164500000035
where
Figure BDA0003809164500000036
Figure BDA0003809164500000037
is the maximum storage capacity of the data transmission queue; the
Figure BDA0003809164500000038
of all monitoring nodes is assumed to be the same. When
Figure BDA0003809164500000039
exceeds
Figure BDA00038091645000000310
newly collected data cannot be placed in the node data buffer and is discarded, causing data overflow.
For the energy transmission requirement, let
Figure BDA00038091645000000311
denote the residual energy of monitoring node m in time slot t, and let $\mu_m(t)$ denote the energy consumption rate of the node in time slot t. $\mu_m(t)$ is the same in every time slot, i.e., $\mu_m(t) = \mu_m$, but $\mu_m$ differs between monitoring nodes because of hardware factors and deployment locations.
Figure BDA00038091645000000312
is expressed as:
Figure BDA00038091645000000313
where
Figure BDA00038091645000000314
Figure BDA00038091645000000315
is the maximum storage capacity of the energy transfer queue; the
Figure BDA00038091645000000316
of all monitoring nodes is assumed to be the same. When
Figure BDA00038091645000000317
the monitoring node is exhausted of energy and cannot provide normal service, and an energy hole occurs.
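The two queue models above can be illustrated with a one-slot simulation step. This is a minimal sketch and not the patent's formulas (1) and (2), which survive only as image placeholders. It assumes that the data queue removes the uploaded amount, adds Poisson arrivals with mean $\lambda_m$, and is clipped at the maximum capacity, and that the residual energy subtracts the consumption $\mu_m$, adds the harvested energy, and is clipped between zero and the maximum capacity. All function and parameter names are hypothetical.

```python
import numpy as np


def update_data_queue(q, lam, uploaded, q_max, rng):
    """One-slot update of a node's data queue (hypothetical helper).

    Assumed form: remove uploaded data, add Poisson arrivals, clip at q_max.
    Returns the new queue length and the amount dropped by overflow.
    """
    arrivals = float(rng.poisson(lam))           # data generated in this slot
    backlog = max(q - uploaded, 0.0) + arrivals  # queue cannot go negative
    dropped = max(backlog - q_max, 0.0)          # data that does not fit is lost
    return min(backlog, q_max), dropped


def update_energy(e, mu, harvested, e_max):
    """One-slot update of a node's residual energy (hypothetical helper).

    Assumed form: subtract consumption mu, add harvested energy,
    clip to [0, e_max]. The second return value flags an energy hole,
    taken here to mean that the residual energy has reached zero.
    """
    level = min(max(e - mu + harvested, 0.0), e_max)
    return level, level <= 0.0
```

A positive `dropped` value corresponds to the data overflow case, and a `True` flag from `update_energy` corresponds to the energy hole case described in the text.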
S103, constructing a system channel model
A probabilistic channel model is established by jointly considering the occurrence probabilities of line-of-sight (LOS) and non-line-of-sight (NLOS) links, and is used as the communication model between the LEO and the ground monitoring nodes. The corresponding path loss under this model is expressed as:
Figure BDA0003809164500000041
where $\gamma_0 = (4\pi f_c / c)^{-2}$ is the channel power gain at the reference distance $d_0 = 1$ m, $f_c$ is the carrier frequency, and $c$ is the speed of light; $d_m(t)$ is the distance between the LEO and the target node m;
Figure BDA0003809164500000042
is the path loss exponent; and $\mu_{NLOS}$ is the attenuation coefficient of the NLOS link.
For monitoring node m, the LOS probability at time t is:
Figure BDA0003809164500000043
where a and b are constants that depend on the carrier frequency and the type of environment, and $\theta_m(t)$ is the elevation angle between the LEO and the target monitoring node, expressed as:
$\theta_m(t) = (180/\pi)\,\sin^{-1}\left(H / d_m(t)\right)$ (5)
The NLOS probability is given by $P_t^{NLOS}(\theta_m(t)) = 1 - P_t^{LOS}(\theta_m(t))$. The downlink and uplink channel power gains of the communication link between the LEO and the target monitoring node m are denoted $h_m(t)$ and $g_m(t)$, respectively. The channel power gain between the LEO and the target node is expressed as:
Figure BDA0003809164500000044
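The elevation angle of equation (5) and the mixing of LOS and NLOS links can be sketched in Python. Only equation (5) is taken directly from the text. The sigmoid form of the LOS probability, 1 / (1 + a·exp(−b(θ − a))), and the averaged gain expression are common modelling choices assumed here, because the corresponding formulas survive only as image placeholders. The function names and the numeric values used for a and b in the usage note are illustrative and do not come from the patent.

```python
import math


def elevation_deg(height, distance):
    """Elevation angle in degrees, following equation (5)."""
    return math.degrees(math.asin(height / distance))


def p_los(theta_deg, a, b):
    """LOS probability for an elevation angle in degrees.

    Assumed sigmoid form; a and b are environment-dependent constants.
    """
    return 1.0 / (1.0 + a * math.exp(-b * (theta_deg - a)))


def expected_gain(height, distance, a, b, gamma0, alpha, mu_nlos):
    """Average channel power gain (assumed form).

    The LOS term and the NLOS term (attenuated by mu_nlos) are weighted
    by their probabilities and scaled by gamma0 * distance^(-alpha).
    """
    pl = p_los(elevation_deg(height, distance), a, b)
    return (pl + (1.0 - pl) * mu_nlos) * gamma0 * distance ** (-alpha)
```

For example, `p_los(elevation_deg(500.0, 1000.0), 9.61, 0.16)` evaluates the LOS probability at an elevation of 30 degrees; larger elevation angles give a larger LOS probability under this assumed form.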
s104, constructing a system energy consumption model
Assume the LEO flies at a fixed height H > 0, with its horizontal position in time slot t denoted $[x_u(t), y_u(t)]$. The LEO determines its next action in real time while moving and updates its position. In this scenario the flight control of the LEO is described by a flight speed $v(t)$, bounded by the maximum flight speed, and a yaw angle $\theta(t)$, bounded by $\theta(t) \in [-\pi, \pi]$. The energy consumption model jointly considers flight energy consumption, coverage service energy consumption, and communication energy consumption, where the propulsion power consumption of the LEO flying at speed V is calculated as:
Figure BDA0003809164500000045
where $P_0$ is the blade profile power during coverage service (hovering) and $U_{tip}$ is the tip speed of the rotor blade. $P_i$ and $v_0$ are the induced power and the mean rotor induced velocity under coverage service conditions. For the parasitic power, $d_0$, $\rho$, $s$, and $A$ are the fuselage drag ratio, air density, rotor solidity, and rotor disc area, respectively. The propulsion power consumption of the LEO comprises the blade profile power, the induced power, and the parasitic power, corresponding to the three terms of equation (7). The power consumption during coverage service is obtained by setting V = 0:
$P_{hov} = P(V=0) = P_0 + P_i$ (8)
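The three power terms can be checked numerically. The sketch below assumes the widely used rotary-wing propulsion model, whose blade profile, induced, and parasitic terms use exactly the symbols defined above; the patent's own formula (7) survives only as an image placeholder, so this form is an assumption. It does reproduce equation (8) at V = 0. The per-slot flight energy is likewise assumed to be power multiplied by flight time, and the values in `ILLUSTRATIVE` are placeholder numbers, not values from the patent.

```python
import math

# Illustrative rotary-wing parameters (assumptions, not from the patent).
ILLUSTRATIVE = dict(p0=79.86, p_i=88.63, u_tip=120.0, v0=4.03,
                    d0=0.6, rho=1.225, s=0.05, area=0.503)


def propulsion_power(v, p0, p_i, u_tip, v0, d0, rho, s, area):
    """Propulsion power at speed v under the assumed rotary-wing model."""
    # Blade profile power grows quadratically with speed.
    blade = p0 * (1.0 + 3.0 * v ** 2 / u_tip ** 2)
    # Induced power decreases as speed increases.
    induced = p_i * math.sqrt(
        math.sqrt(1.0 + v ** 4 / (4.0 * v0 ** 4)) - v ** 2 / (2.0 * v0 ** 2)
    )
    # Parasitic power grows with the cube of speed.
    parasitic = 0.5 * d0 * rho * s * area * v ** 3
    return blade + induced + parasitic


def flight_energy(v, flight_time, **params):
    """Energy spent flying at constant speed v for flight_time seconds (assumed form)."""
    return propulsion_power(v, **params) * flight_time
```

Calling `propulsion_power(0.0, **ILLUSTRATIVE)` returns the sum of `p0` and `p_i`, which matches the hovering power of equation (8).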
The flight energy consumed by the LEO in time slot t is expressed as:
Figure BDA0003809164500000051
During the coverage service stage, the LEO performs energy transmission and data acquisition for the monitoring nodes within its coverage range.
S105, constructing an energy transmission model
The LEO moves to the target node location according to its flight speed and yaw angle decisions and, in sub-slot $\tau(t)$, transmits a radio frequency signal to the monitoring nodes at transmit power $P_d(t)$, where $P_d(t)$ is subject to the constraint
Figure BDA0003809164500000052
Within $\tau(t)$, all monitoring nodes within the energy transmission coverage of the LEO are charged, and the received power at monitoring node m is expressed as:
Figure BDA0003809164500000053
A nonlinear energy harvesting model is applied as the air-to-ground energy transfer model. Under the RF-EH model, the actual harvested power at the receiver is expressed as:
Figure BDA0003809164500000054
where $P_{limit}$ is the maximum output DC power, and c and d are constants determined by the circuit characteristics.
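A saturating harvester of this kind can be sketched as follows. The patent's RF-EH formula survives only as an image placeholder, so the code assumes the commonly used logistic model, normalized so that zero input power gives zero output and large input power approaches the maximum output DC power. The function name and the numeric values in the usage note are illustrative assumptions.

```python
import math


def harvested_power(p_in, p_limit, c, d):
    """Harvested DC power for received RF power p_in (assumed logistic model).

    p_limit is the maximum output DC power; c and d are circuit constants.
    """
    omega = 1.0 / (1.0 + math.exp(c * d))               # offset so that output is 0 at p_in = 0
    psi = p_limit / (1.0 + math.exp(-c * (p_in - d)))   # logistic saturation curve
    return (psi - p_limit * omega) / (1.0 - omega)
```

With `p_limit=0.02`, `c=150.0`, and `d=0.014`, the output rises monotonically from 0 and saturates near 0.02, which is the nonlinear behaviour that a linear conversion efficiency would miss.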
S2 specifically comprises the following steps:
S201, designing the distance formula of the K-Means algorithm
For the distance formula of the K-Means algorithm, the monitoring node characteristics and the Euclidean distance are combined into a joint distance:
Figure BDA0003809164500000055
where a and b are the length and width of the urban monitoring network model, respectively.
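A joint distance of this kind can be plugged into a standard K-Means loop. The sketch below is an illustration under stated assumptions, not the patent's algorithm: the joint distance is taken to be the Euclidean distance in a feature space made of the position normalized by the region length and width plus the normalized data generation rate and energy consumption rate; seeding uses a deterministic farthest-point rule for reproducibility; and the cluster head is taken to be the member closest to the cluster centre. All names are hypothetical.

```python
import numpy as np


def joint_features(xy, lam, mu, length, width):
    """Normalized position and node characteristics (assumed feature space)."""
    return np.column_stack([
        xy[:, 0] / length,
        xy[:, 1] / width,
        lam / lam.max(),
        mu / mu.max(),
    ])


def farthest_point_init(features, k):
    """Deterministic seeding: start at node 0, then repeatedly add the farthest node."""
    centers = [features[0]]
    for _ in range(k - 1):
        diff = features[:, None, :] - np.array(centers)[None, :, :]
        nearest = np.linalg.norm(diff, axis=2).min(axis=1)
        centers.append(features[nearest.argmax()])
    return np.array(centers)


def kmeans(features, k, iters=50):
    """Plain K-Means; returns the cluster label of each node and the centres."""
    centers = farthest_point_init(features, k)
    for _ in range(iters):
        diff = features[:, None, :] - centers[None, :, :]
        labels = np.linalg.norm(diff, axis=2).argmin(axis=1)
        new_centers = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


def elect_heads(features, labels, centers):
    """Pick, per cluster, the member closest to the centre (assumed election rule)."""
    heads = {}
    for j in range(len(centers)):
        members = np.flatnonzero(labels == j)
        if members.size:
            dist = np.linalg.norm(features[members] - centers[j], axis=1)
            heads[j] = int(members[dist.argmin()])
    return heads
```

The election rule could equally weigh residual energy or queue length; the centre-distance rule is used here only because it needs no further assumptions about the node state.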
The M monitoring nodes are divided into K clusters by the clustering algorithm, and the node subset corresponding to each cluster is denoted
Figure BDA0003809164500000056
The nodes in a cluster transmit their monitoring data to the cluster head node, and the LEO transmits energy to all nodes in the cluster during the coverage service stage. The monitoring node serving as cluster head uploads the monitoring data over the uplink during the time $1-\tau(t)$. The uplink transmit power of the cluster head node,
Figure BDA0003809164500000057
depends on the total energy collected during $\tau(t)$, i.e.,
Figure BDA0003809164500000058
is positively related to
Figure BDA0003809164500000059
and is expressed as:
Figure BDA00038091645000000510
where $\zeta$ is the energy conversion efficiency, a constant.
The upload data rate of cluster head node k is expressed as:
Figure BDA0003809164500000061
In each time slot, the LEO selects one cluster head node as the target node of the next service. If the target node is still the current node, the LEO remains in the coverage service state in the next time slot; if the target node changes, the LEO enters the flight state and moves to the target node position according to its flight speed and yaw angle decisions. The selection of the target node takes the service priority of the node into account, which comprises a data acquisition priority, an energy supply priority, and the node distance. The data acquisition priority is set based on the data queue length and data generation rate of the monitoring node, and the energy supply priority is set analogously. The service priority of cluster head node k in time slot t is defined as $Q_k(t)$:
Figure BDA0003809164500000062
where the weights of the data acquisition priority and the energy supply priority are $\alpha = 1$ and $\beta = 5$, respectively.
S3 specifically comprises the following steps:
Network performance is optimized by jointly considering three aspects: the data acquisition requirement, the energy transmission requirement, and LEO energy consumption:
(1) Data acquisition amount
During coverage service at cluster head node k in time slot t, the LEO acquires data through the cluster head node, and the corresponding data acquisition amount is expressed as:
$D_k(t) = R_k(t)\,(1 - \tau(t))$ (16)
The total data acquisition amount of the LEO over the task period T is expressed as:
Figure BDA0003809164500000063
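Equation (16) can be made concrete with a short calculation. In the sketch below, the per-slot amount follows equation (16) directly. The total over the task period is assumed to be the sum of the per-slot amounts, since the formula for the total survives only as an image placeholder, and slots in which the LEO is flying are represented by a rate of zero. The function names are hypothetical.

```python
def slot_data(rate, tau):
    """Data collected in one slot, equation (16): rate times the uplink share 1 - tau."""
    return rate * (1.0 - tau)


def total_data(rates, taus):
    """Total collected data over the task period (assumed to be the per-slot sum)."""
    return sum(slot_data(r, t) for r, t in zip(rates, taus))
```

For instance, with an upload rate of 2e6 bits per slot and a quarter of the slot spent on downlink energy transmission, `slot_data(2.0e6, 0.25)` gives 1.5e6 bits. A larger $\tau(t)$ raises the harvested energy but shrinks the uplink share, which is the trade-off the allocation has to balance.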
(2) Energy transmission amount
In time slot t, the LEO sends data acquisition information and transmits energy to cluster head node k and the other monitoring nodes within its coverage range. During coverage service at the position of cluster head node k, the energy transmission amount of the LEO in time slot t is expressed as:
Figure BDA0003809164500000064
The total energy transmission amount of the LEO over the task period T is expressed as:
Figure BDA0003809164500000065
(3) LEO energy consumption
According to the state of the LEO in time slot t, LEO energy consumption is divided into flight energy consumption and coverage service energy consumption. The energy consumption during coverage service comprises the coverage service energy consumption of the LEO and the total downlink transmission energy, expressed as:
Figure BDA0003809164500000071
In the flight state, LEO energy consumption is $E_{uav}(t) = P(v)\,t$, so the energy consumption of the LEO over a task period is expressed as:
Figure BDA0003809164500000072
setting up
Figure BDA0003809164500000073
I.e. the data volume in the monitoring node is larger than the threshold +.>
Figure BDA0003809164500000074
Data overflow is considered to occur. Set->
Figure BDA0003809164500000075
Monitoring node energy less than->
Figure BDA0003809164500000076
The energy void situation is considered to occur, where both α and β are constant values.
The optimization aims at maximizing the uplink data collection amount and the downlink energy transmission amount by jointly optimizing LEO flight decisions and resource allocation.
S301, based on the above, the multi-objective optimization problem is defined as follows:
P1:
Figure BDA0003809164500000077
Figure BDA0003809164500000078
Figure BDA0003809164500000079
Figure BDA00038091645000000710
Figure BDA00038091645000000711
Figure BDA00038091645000000712
In the constraints of the optimization problem, C1 and C2 are the flight speed and yaw angle constraints of the LEO, C3 is the LEO transmit power constraint, and C4 and C5 are the constraints on the data queue and energy queue of the monitoring nodes, respectively.
A multi-objective joint optimization algorithm, MJDDPG, is adopted for data acquisition and energy transmission in the LEO-assisted urban monitoring network; it introduces weight parameters to describe the preference among the optimization objectives.
In the MJDDPG algorithm, the reward value is expressed as $r = \mathbf{r}\mathbf{w}^T$, which converts the reward vector into scalar form. All weight parameters are chosen from the interval [0.0, 1.0] according to the importance of each optimization objective, and the rest of the network structure is the same as in the DDPG algorithm. For the target network, the target value $y_t$ is calculated as:
$y_t = \mathbf{r}\mathbf{w}^T + \gamma\, Q'\left(s_{t+1}, \mu'(s_{t+1} \mid \theta^{\mu'}) \mid \theta^{Q'}\right)$ (28)
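The scalarization in equation (28) amounts to a dot product followed by the usual bootstrapped target. The sketch below shows that step in isolation; the target critic output `next_q` stands in for $Q'(s_{t+1}, \mu'(s_{t+1}))$ and would come from the target networks in a full implementation. The function names and the example numbers are hypothetical.

```python
import numpy as np


def scalarize(reward_vec, weights):
    """Collapse the reward vector into a scalar via the weighted sum r w^T."""
    return float(np.dot(reward_vec, weights))


def td_target(reward_vec, weights, gamma, next_q):
    """Target value y_t of equation (28) given the target critic's estimate next_q."""
    return scalarize(reward_vec, weights) + gamma * next_q
```

With the reward vector `[1.0, 2.0, -3.0, -1.0]` and weights `[0.5, 0.25, 0.1, 1.0]`, the scalar reward is the weighted sum of the four components; changing the weights shifts the preference between the objectives without touching the rest of the DDPG structure.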
S4 specifically comprises the following steps:
S401, defining the state space
In the LEO-assisted urban monitoring network, the state space is jointly determined by the monitoring nodes, the LEO, and environmental information. In time slot t,
Figure BDA00038091645000000713
is the relative distance between the target monitoring node and the LEO in a Cartesian coordinate system, and $N_f(t)$ records the cumulative number of times the LEO has left the permitted region during the task period up to time t. The absolute position of the LEO is $[x_u(t), y_u(t)]$. The number of data-loss nodes $N_d(t)$ and the number of energy-depleted nodes $N_e(t)$ prompt the LEO to serve high-demand monitoring nodes in time, and the amount of data to be uploaded at the current target node,
Figure BDA0003809164500000081
guides the LEO to make effective resource allocation decisions for time slots and power. The state space is defined as:
Figure BDA0003809164500000082
where at time t = 0, $N_f(t)$, $N_d(t)$, and $N_e(t)$ are all 0.
S402, defining the action space
The state space is mapped to a continuous action space, and multi-objective optimization is achieved by jointly optimizing the LEO flight decision, time slot allocation, and power allocation in the LEO-assisted urban monitoring network scenario. Based on the system state and environment, the action selected by the LEO in time slot t comprises the flight speed $v(t)$, the yaw angle $\theta(t)$, the time slot allocation $\tau(t)$, and the transmit power allocation $p_d(t)$; all action variables are continuous. The action that the LEO, as the agent, can take in time slot t is therefore expressed as:
Figure BDA0003809164500000083
where the flight speed $v(t)$ and the yaw angle $\theta(t)$ lie in the intervals $[0, v_{max}]$ and $[-\pi, \pi]$, respectively, and the yaw is represented by $\cos(\theta(t))$; $p_d(t)$ lies in the range $[0, P_{max}]$; and $\tau(t)$ is the proportion of a single time slot allocated to downlink energy transmission.
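The action ranges above suggest a simple mapping from bounded actor outputs to physical quantities. The sketch below assumes an actor whose four outputs lie in [-1, 1] (for example after a tanh layer) in the order speed, yaw, time share, power, and rescales them to the stated intervals. The position update is an assumed kinematic step for a slot of length `slot_len`; neither the output order nor the update rule is specified by the text, and all names are hypothetical.

```python
import math


def scale_action(raw, v_max, p_max):
    """Map four raw outputs in [-1, 1] to (v, theta, tau, p_d)."""
    clipped = [min(max(x, -1.0), 1.0) for x in raw]  # guard against out-of-range outputs
    v = (clipped[0] + 1.0) / 2.0 * v_max             # flight speed in [0, v_max]
    theta = clipped[1] * math.pi                     # yaw angle in [-pi, pi]
    tau = (clipped[2] + 1.0) / 2.0                   # downlink time share in [0, 1]
    p_d = (clipped[3] + 1.0) / 2.0 * p_max           # transmit power in [0, p_max]
    return v, theta, tau, p_d


def step_position(x, y, v, theta, slot_len):
    """Advance the horizontal position by one slot (assumed kinematics)."""
    return (x + v * slot_len * math.cos(theta),
            y + v * slot_len * math.sin(theta))
```

Keeping the actor outputs in a common range and rescaling afterwards lets one network head serve variables with very different physical units.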
S403, establishing the reward function
The optimization objective is to optimize network performance by maximizing the data acquisition amount and the energy transmission amount and minimizing LEO energy consumption while guaranteeing network quality. The reward is therefore defined as a multidimensional vector:
Figure BDA0003809164500000084
where $r_{dc}(t)$, $r_{eh}(t)$, and $r_{ec}(t)$ correspond to the optimization objectives and $r_{aux}(t)$ is a penalty term.
According to the service the LEO provides to the nodes in time slot t, the reward values are expressed as:
Figure BDA0003809164500000085
where $D_k(t)$ gives the reward for the total amount of data collected by the LEO during the coverage service stage at monitoring node k: the larger the total data amount, the larger the reward. $E_k(t)$ gives the reward for the total amount of energy transmitted by the LEO during the coverage service stage at monitoring node k: the larger the total energy transmission amount, the larger the reward.
Figure BDA0003809164500000086
is the energy consumption of the LEO in time slot t. If the LEO is in the flight state,
Figure BDA0003809164500000087
If it is in the coverage service state, the energy consumption comprises the coverage service energy consumption and the communication energy consumption, i.e.,
Figure BDA0003809164500000088
Once the target monitoring node falls within the coverage radius of the LEO, the LEO enters coverage service and performs data acquisition and energy transfer. Otherwise, the LEO is in the flight stage, and $r_{dc}(t)$ and $r_{eh}(t)$ are both 0. The auxiliary penalty function $r_{aux}(t)$ is:
Figure BDA0003809164500000091
The first two terms of $r_{aux}(t)$ relate to the distance between the LEO and the target monitoring node. Penalizing wrong flight decisions motivates the LEO to preserve network quality during learning.
Drawings
FIG. 1 is a model building diagram of S1;
FIG. 2 is a flowchart of a cluster head election algorithm of S2;
FIG. 3 is a flowchart of the formulation of the LEO-assisted urban monitoring network resource allocation strategy of S3;
FIG. 4 is the Markov decision process problem model of S4;
FIG. 5 is the DDPG algorithm architecture diagram.
Detailed Description
Embodiments of the invention are described in detail below with reference to the accompanying drawings. It should be understood that the scope of the invention is not limited to these specific embodiments.
As shown in FIG. 1, S1 specifically comprises:
S101, constructing the LEO-assisted urban monitoring network model
Specifically, in this scenario a single LEO provides energy transmission and data acquisition services for multiple monitoring nodes through mobile deployment. The LEO is equipped with a single antenna and each node with multiple antennas; information decoding and energy harvesting are performed separately based on an antenna switching structure. Depending on their monitoring tasks, the monitoring nodes may differ in data generation rate, distribution density, energy consumption rate, and so on.
Because the LEO has limited energy, the duration of each flight task is T > 0. For ease of analysis, the total time is divided into equal-length time slots, i.e., t = 1, 2, …. The monitoring node receives the radio frequency signal through the antenna switching structure, decoding information and harvesting energy at the same time, and the cluster head node uploads its monitoring data to the LEO in the uplink sub-slot.
S102, constructing a transmission queue model
In this scenario, the set of monitoring nodes is denoted by
Figure BDA0003809164500000092
and the position of node m is represented as $[x_m, y_m]$. For each monitoring node
Figure BDA0003809164500000093
let $\lambda_m(t)$ denote the data generation rate of node m while it executes the monitoring task in time slot t. Because of deployment location and hardware factors, $\lambda_m(t)$ differs between monitoring nodes. It is assumed that $\lambda_m(t)$ of each node obeys a Poisson distribution whose parameter is constant during the monitoring task, i.e. $\lambda_m(t) = \lambda_m$. Let
Figure BDA0003809164500000101
represent the length of data waiting to be uploaded in the data transmission queue of monitoring node m at time slot t.
Figure BDA0003809164500000102
can be expressed as:
Figure BDA0003809164500000103
where
Figure BDA0003809164500000104
is the maximum capacity of the data transmission queue, and it is assumed that for all monitoring nodes
Figure BDA0003809164500000105
is the same. When
Figure BDA0003809164500000106
exceeds
Figure BDA0003809164500000107
newly collected data cannot be placed in the node data buffer and is discarded, causing data overflow.
For the energy transmission requirement, let
Figure BDA0003809164500000108
represent the residual energy of monitoring node m at time slot t, and let $\mu_m(t)$ represent the energy consumption rate of the node in time slot t. $\mu_m(t)$ is the same in every time slot, i.e. $\mu_m(t) = \mu_m$, but $\mu_m$ differs between monitoring nodes because of hardware factors and deployment locations.
Figure BDA0003809164500000109
can be expressed as:
Figure BDA00038091645000001010
where
Figure BDA00038091645000001011
Figure BDA00038091645000001012
is the maximum capacity of the energy queue, and it is assumed that for all monitoring nodes
Figure BDA00038091645000001013
is the same. When
Figure BDA00038091645000001014
holds, the energy of the monitoring node is exhausted, normal service cannot be provided, and an energy hole occurs.
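The behaviour of the two queues can be illustrated with a minimal sketch. The exact update rules are given by the equations in the figures and are not reproduced here; the sketch assumes the common form "previous level plus inflow minus outflow, clipped to the queue capacity". The function and argument names are illustrative and do not come from the patent.

```python
def update_data_queue(backlog, arrivals, uploaded, capacity):
    """Return (new_backlog, overflow) for one time slot.

    Assumed update: first remove the uploaded data, then add the newly
    generated data, then clip to the buffer capacity.
    """
    remaining = max(backlog - uploaded, 0.0)   # data left after the upload
    candidate = remaining + arrivals           # add data generated in this slot
    new_backlog = min(candidate, capacity)     # the buffer cannot exceed capacity
    return new_backlog, candidate - new_backlog  # second value: discarded data


def update_energy(energy, consumed, harvested, capacity):
    """Return (new_energy, depleted) for one time slot.

    Assumed update: energy - consumed + harvested, clipped to [0, capacity].
    `depleted` is True when no energy is left (an energy hole).
    """
    new_energy = min(max(energy - consumed + harvested, 0.0), capacity)
    return new_energy, new_energy <= 0.0
```

For instance, a node with backlog 8 that uploads 2 and generates 5 in a buffer of capacity 10 ends the slot full, with 1 unit discarded.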
S103, constructing a system channel model
Since the studied scenario is an urban area with heavy building blockage, the free-space propagation channel model is no longer applicable. A probabilistic channel model is therefore built by jointly considering the occurrence probabilities of Line of Sight (LOS) and Non-Line of Sight (NLOS) links, and is used as the communication model between the LEO and the ground monitoring nodes. Under this model the corresponding path loss can be expressed as:
Figure BDA00038091645000001015
where $\gamma_0 = (4\pi f_c/c)^{-2}$ represents the channel power gain at the reference distance $d_0 = 1$ m, $f_c$ represents the carrier frequency, and c represents the speed of light; $d_m(t)$ is the distance between the LEO and target node m;
Figure BDA00038091645000001016
represents the path loss exponent; and $\mu_{NLOS}$ is the attenuation coefficient of the NLOS link.
For monitoring node m, the LOS probability at time t is:
Figure BDA00038091645000001017
where a and b are constants depending on the carrier frequency and the type of environment, and $\theta_m(t)$ is the elevation angle between the LEO and the target monitoring node, expressed as:
$\theta_m(t) = (180/\pi)\sin^{-1}(H/d_m(t))$ (5)
The non-line-of-sight link probability is given by $P_t^{NLOS}(\theta_m(t)) = 1 - P_t^{LOS}(\theta_m(t))$. The uplink and downlink channels are assumed to be approximately the same. The downlink and uplink channel power gains of the communication link between the LEO and target monitoring node m are denoted $h_m(t)$ and $g_m(t)$, respectively, so the channel power gain between the LEO and the target node can be expressed as:
Figure BDA0003809164500000111
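As a rough illustration of the probabilistic channel model, the sketch below computes the elevation angle of equation (5) and the LOS/NLOS probabilities. The LOS expression itself is only available as a figure; the sketch assumes the widely used sigmoid form $1/(1 + a\,e^{-b(\theta - a)})$ with $\theta$ in degrees, which may differ from the patent's equation. Function names and the sample constants are illustrative.

```python
import math


def elevation_angle_deg(height, distance):
    """Equation (5): elevation angle in degrees (requires distance >= height > 0)."""
    return (180.0 / math.pi) * math.asin(height / distance)


def los_probability(theta_deg, a, b):
    """Assumed sigmoid LOS model: 1 / (1 + a * exp(-b * (theta - a)))."""
    return 1.0 / (1.0 + a * math.exp(-b * (theta_deg - a)))


def nlos_probability(theta_deg, a, b):
    """The NLOS probability is the complement of the LOS probability."""
    return 1.0 - los_probability(theta_deg, a, b)


# Illustrative constants only; a and b depend on carrier frequency and environment.
theta = elevation_angle_deg(height=1.0, distance=2.0)
p_los = los_probability(theta, a=9.61, b=0.16)
```

Under this assumed form the LOS probability grows with the elevation angle, so a node directly below the LEO is the most likely to see a LOS link.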
S104, constructing a system energy consumption model
Assume the LEO flies at a fixed height H > 0, and denote its horizontal position at time slot t as $[x_u(t), y_u(t)]$; the change of the LEO's position in the vertical direction is not considered in this scenario. The LEO determines its next action in real time while moving and updates its position. The flight control of the LEO is described by a flight speed v(t), limited by the maximum flight speed, and a yaw angle $\theta(t)$, limited to $\theta(t) \in [-\pi, \pi]$. The energy consumption model of the LEO jointly considers flight energy consumption, coverage service energy consumption and communication energy consumption, where the propulsion power consumption of the LEO flying at speed V can be calculated by:
Figure BDA0003809164500000112
where $P_0$ is the blade profile power in the coverage service state and $U_{tip}$ is the tip speed of the rotor blade. $P_i$ and $v_0$ are the induced power and the mean rotor induced velocity in the coverage service state. For the parasitic power, $d_0$, $\rho$, s and A represent the fuselage drag ratio, air density, rotor solidity and rotor disc area, respectively. The propulsion power consumption of the LEO thus comprises the blade profile power, the induced power and the parasitic power, corresponding to the three terms of equation (7). The power consumption of the coverage service is obtained by setting V = 0:
$P_{hov} = P(V=0) = P_0 + P_i$ (8)
The energy consumed by the LEO for flight in time slot t can therefore be expressed as:
Figure BDA0003809164500000113
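The three-term structure of the propulsion power can be sketched as follows. Equation (7) is only available as a figure, so the sketch assumes the standard rotary-wing propulsion model that matches the symbols defined above and reduces to equation (8) at V = 0; it should be read as an assumption, not as the patent's exact expression. The per-slot energy is assumed to be power times slot length. All numeric values are placeholders.

```python
import math


def propulsion_power(v, p0, p_i, u_tip, v0, d0, rho, s, area):
    """Assumed propulsion power at speed v: blade profile + induced + parasitic."""
    blade_profile = p0 * (1.0 + 3.0 * v**2 / u_tip**2)
    induced = p_i * math.sqrt(
        math.sqrt(1.0 + v**4 / (4.0 * v0**4)) - v**2 / (2.0 * v0**2)
    )
    parasitic = 0.5 * d0 * rho * s * area * v**3
    return blade_profile + induced + parasitic


def flight_energy(v, slot_length, **params):
    """Energy spent flying at constant speed v for one slot (assumed P(v) * length)."""
    return propulsion_power(v, **params) * slot_length


# Placeholder parameters for illustration only (not taken from the patent).
params = dict(p0=79.86, p_i=88.63, u_tip=120.0, v0=4.03,
              d0=0.6, rho=1.225, s=0.05, area=0.503)
hover_power = propulsion_power(0.0, **params)  # equals p0 + p_i, as in equation (8)
```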
For communication energy consumption, the energy transmission loss incurred while the LEO charges the monitoring nodes is mainly considered. In a practical scenario the LEO has a limited energy transmission range, so only the monitoring nodes within the LEO coverage receive energy transmission and data acquisition during the coverage service stage.
S105, constructing an energy transmission model
The LEO moves to the position of the target node through its flight speed and yaw angle decisions, and in the sub-time slot τ(t) transmits a radio frequency signal to the monitoring nodes with transmit power $P_d(t)$, where $P_d(t)$ is subject to the constraint
Figure BDA0003809164500000114
Within τ(t), all monitoring nodes within the energy transmission coverage of the LEO are charged, and the received power at monitoring node m can be expressed as:
Figure BDA0003809164500000121
To more closely approximate a real scene, a nonlinear energy transfer model is applied here as the air-to-ground energy transfer model. Compared with a linear model, the nonlinear energy transmission model considers the saturation limit of a circuit, and has more generality and practicability. By the RF-EH model, the actual power at the receiving end can be expressed as:
Figure BDA0003809164500000122
where $P_{limit}$ is the maximum output DC power, and c and d are constants related to the circuit characteristics.
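The saturation behaviour of a nonlinear energy harvesting circuit can be sketched as below. The patent's expression is only available as a figure; the sketch assumes the commonly used sigmoid RF-EH model built from $P_{limit}$, c and d, normalised so that zero input gives zero output. It may differ in detail from the patent's equation, and the numeric constants are illustrative.

```python
import math


def harvested_power(p_in, p_limit, c, d):
    """Assumed sigmoid RF-EH model: 0 at zero input, saturating at p_limit."""
    omega = 1.0 / (1.0 + math.exp(c * d))              # offset that zeroes the output at p_in = 0
    psi = p_limit / (1.0 + math.exp(-c * (p_in - d)))  # logistic response of the circuit
    return (psi - p_limit * omega) / (1.0 - omega)


# Illustrative constants: 24 mW saturation level, c = 150, d = 0.014.
low = harvested_power(0.01, p_limit=0.024, c=150.0, d=0.014)
high = harvested_power(0.05, p_limit=0.024, c=150.0, d=0.014)
```

Unlike a linear model, the output here stops growing once the input power is well above d, which is the circuit saturation the text refers to.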
S2, as shown in FIG. 2, is the flow of the cluster head election algorithm based on K-Means.
The monitoring nodes in the urban monitoring network scenario are numerous and densely distributed. If the LEO traversed all monitoring nodes in turn to perform energy transmission and data acquisition, its energy consumption would be severe; in addition, monitoring nodes that do not receive timely data acquisition and energy transmission service would suffer serious energy holes and data loss. The problem is therefore divided into two parts, cluster head election and resource allocation. After the nodes are clustered, a suitable monitoring node is selected from each cluster as the cluster head, and the cluster head node collects the data of the monitoring nodes in its cluster and forwards it to the LEO. The K-Means algorithm is selected to cluster all monitoring nodes, as shown in FIG. 2.
The specific algorithm is shown in the following table:
Figure BDA0003809164500000123
S201, distance formula design of K-Means algorithm
In designing the distance formula of the K-Means algorithm, the similarity of adjacent monitoring nodes in data generation rate and energy consumption rate is considered, and the node characteristics are combined with the Euclidean distance to form a joint distance:
Figure BDA0003809164500000131
wherein a and b are the length and width of the city monitoring network model, respectively.
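A possible shape of such a joint distance, together with one K-Means assignment step, is sketched below. The patent's formula is only available as a figure, so the specific combination used here (coordinate differences scaled by the area length and width, plus squared differences of the data generation rate and energy consumption rate) is an assumption. The tuple layout `(x, y, lam, mu)` and the function names are hypothetical.

```python
import math


def joint_distance(node_i, node_j, length, width):
    """Assumed joint distance between two nodes given as (x, y, lam, mu).

    Coordinates are scaled by the area length and width; lam and mu are
    assumed to be pre-normalised to comparable ranges.
    """
    xi, yi, lam_i, mu_i = node_i
    xj, yj, lam_j, mu_j = node_j
    return math.sqrt(((xi - xj) / length) ** 2 + ((yi - yj) / width) ** 2
                     + (lam_i - lam_j) ** 2 + (mu_i - mu_j) ** 2)


def assign_to_clusters(nodes, centroids, length, width):
    """One K-Means assignment step: index of the nearest centroid for each node."""
    return [min(range(len(centroids)),
                key=lambda k: joint_distance(n, centroids[k], length, width))
            for n in nodes]
```

A full K-Means run would alternate this assignment step with recomputing each centroid as the mean of its assigned nodes until the assignment stops changing.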
The M monitoring nodes are divided into K clusters by the clustering algorithm, and the node subset corresponding to each cluster is expressed as
Figure BDA0003809164500000132
The nodes in a cluster transmit their monitoring data to the cluster head node, and the LEO transmits energy to all nodes in the cluster during the coverage service stage. The monitoring node serving as the cluster head uploads the monitoring data through the uplink during the time 1-τ(t). The uplink transmit power of the cluster head node,
Figure BDA0003809164500000133
depends on the total energy collected during τ(t), i.e.
Figure BDA0003809164500000134
is positively related to
Figure BDA0003809164500000135
and can be expressed as:
Figure BDA0003809164500000136
where ζ represents the energy conversion efficiency, which is a constant.
Thus, the upload data rate for cluster head node k is expressed as:
Figure BDA0003809164500000137
In each time slot, the LEO selects one cluster head node as the target node of the next service. If the target node is still the current node, the LEO keeps the coverage service state in the next time slot; if the target node changes, the LEO enters the flight state and moves to the position of the new target node through its flight speed and yaw angle decisions. The selection of the target node takes the service priority of the nodes into account, which comprises a data acquisition priority, an energy supply priority and the node distance. The data acquisition priority is set based on the data queue length and the data generation rate of the monitoring node, and the energy supply priority is set analogously. The service priority of cluster head node k at time slot t is defined as $Q_k(t)$:
Figure BDA0003809164500000138
Wherein the data acquisition priority and the energy supply priority weights are α=1 and β=5, respectively.
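Target selection by service priority can be sketched as an argmax over the cluster heads. Formula (15) is only available as a figure; the score below, a weighted sum of the two priorities discounted by distance, is purely an assumed stand-in that uses the stated weights α = 1 and β = 5. How the two priority terms and the distance are actually combined in the patent is not recoverable from the text.

```python
def service_priority(data_priority, energy_priority, distance, alpha=1.0, beta=5.0):
    """Assumed score: weighted priorities, discounted by the distance to the LEO."""
    return (alpha * data_priority + beta * energy_priority) / (1.0 + distance)


def select_target(cluster_heads, alpha=1.0, beta=5.0):
    """Return the index of the cluster head with the highest assumed priority.

    `cluster_heads` is a list of (data_priority, energy_priority, distance) tuples.
    """
    scores = [service_priority(dp, ep, dist, alpha, beta)
              for dp, ep, dist in cluster_heads]
    return max(range(len(scores)), key=scores.__getitem__)
```

With β larger than α, a node that is short of energy outranks a node with an equally urgent data backlog, which is consistent with the weights given in the text.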
S3, as shown in FIG. 3, is the flow for formulating the low orbit satellite LEO assisted city monitoring network resource allocation strategy.
The research goal is to find a dynamic resource allocation strategy based on the antenna switching structure in the LEO-assisted urban monitoring network scenario. The strategy must jointly consider data acquisition requirements, energy transmission requirements and LEO energy consumption to optimize network performance. The data acquisition requirement is mainly represented by the total amount of data the LEO collects from the monitoring nodes; the energy transmission requirement is the total amount of energy transmitted by the LEO; and the LEO energy consumption is the sum of its communication, flight and coverage service energy consumption in one task period. The problem is therefore a multi-objective one: maximize the total data acquisition amount and the total energy transmission amount in the urban monitoring network while minimizing the LEO energy consumption. In deciding the LEO flight path and coverage service positions, both the state of the monitoring nodes and the LEO energy consumption are considered, and data overflow and energy holes at the monitoring nodes are avoided as far as possible. The LEO is set to visit the monitoring nodes in order of their real-time service priority, i.e. the target node of the LEO in time slot t is selected based on formula (15).
(1) Data acquisition amount
After the monitoring nodes are clustered, the nodes in each cluster send the buffered data in their data transmission queues to the cluster head node in a single-hop or multi-hop manner, so the LEO collects data through the cluster head node. When the LEO provides coverage service to cluster head node k in time slot t, the corresponding data acquisition amount can be expressed as:
$D_k(t) = R_k(t)(1 - \tau(t))$ (16)
The total data acquisition amount of the LEO from the monitoring nodes in the task period T can be expressed as:
Figure BDA0003809164500000141
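Equation (16) and its accumulation over a task period can be sketched as follows. The per-slot expression follows equation (16) directly; the total is assumed to be the sum over the slots in which the LEO is in the coverage service state, since no data is collected during flight. The tuple layout is hypothetical.

```python
def data_collected(upload_rate, tau):
    """Equation (16): data uploaded in the uplink share (1 - tau) of one slot."""
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    return upload_rate * (1.0 - tau)


def total_data_collected(slots):
    """Sum equation (16) over slots given as (upload_rate, tau, serving) tuples.

    Slots where `serving` is False (LEO in flight) contribute nothing.
    """
    return sum(data_collected(rate, tau) for rate, tau, serving in slots if serving)


# Three slots: served, in flight, served.
total = total_data_collected([(10.0, 0.25, True), (8.0, 0.5, False), (4.0, 0.5, True)])
```

The trade-off in τ(t) is visible here: a larger downlink share leaves less of the slot for uploading data.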
(2) Energy transmission quantity
During the time slot t, the LEO sends data acquisition information to the cluster head node k and the rest monitoring nodes in the coverage area of the LEO and transmits energy, and during the coverage service of the cluster head node k, the energy transmission amount of the LEO in the time slot t can be expressed as:
Figure BDA0003809164500000142
the total amount of energy transfer of the LEO during the mission period T can thus be expressed as:
Figure BDA0003809164500000143
(3) Low orbit satellite LEO energy consumption
LEO energy consumption can be classified into flight energy consumption and coverage service energy consumption according to the state of LEO in time slot t, wherein energy consumption during coverage service includes LEO coverage service energy consumption and downlink transmission total energy, and can be expressed as:
Figure BDA0003809164500000144
If the LEO is in the flight state, its energy consumption is expressed as $E_{uav}(t) = P(v)t$, so the energy consumption of the LEO in one task period can be expressed as:
Figure BDA0003809164500000145
Meanwhile, considering the practical scenario, thresholds are set in order to minimize data overflow and energy holes. Set
Figure BDA0003809164500000146
i.e. when the data volume in a monitoring node is larger than the threshold
Figure BDA0003809164500000151
data overflow is considered to occur. Similarly, set
Figure BDA0003809164500000152
i.e. when the energy of a monitoring node is less than
Figure BDA0003809164500000153
an energy hole is considered to occur. Here α and β are both constants.
The optimization aims to maximize the uplink data collection amount and the downlink energy transmission amount by jointly optimizing LEO flight decision and resource allocation, and reduce LEO energy consumption as much as possible.
S301, based on the above, defining a multi-objective optimization problem is as follows:
P1:
Figure BDA0003809164500000154
Figure BDA0003809164500000155
Figure BDA0003809164500000156
Figure BDA0003809164500000157
Figure BDA0003809164500000158
Figure BDA0003809164500000159
Among the constraints of the optimization problem, C1 and C2 are the flight speed and yaw angle constraints of the LEO, C3 is the LEO transmit power constraint, and C4 and C5 are the constraints on the data queue and the energy queue of the monitoring nodes, respectively. In the optimization objective, maximizing the data collection amount mainly depends on how the LEO allocates time slots and power while serving the target monitoring node, i.e. the amount of data uploaded by the current target node can be increased by optimizing time slots and power. However, allocating excessive resources increases the LEO energy consumption. Maximizing the total energy transmission likewise depends on the time slot and power allocation during the coverage service: allocating more time slots and power provides the monitoring nodes with sufficient energy but costs the LEO more energy.
From the above analysis, the three optimization objectives conflict to some extent. Finding the best coverage service positions, making flight decisions and optimizing resource allocation decisions is a very complex process with considerable computational cost. Furthermore, conventional model-based methods such as dynamic programming cannot solve this problem effectively, because the environment is only partially observable. Considering the dense distribution of monitoring nodes in this scenario, the DQN algorithm, which is not applicable to continuous action spaces, is unsuitable, whereas DDPG, a classical DRL algorithm, has been shown to learn effective policies in continuous action spaces from low-dimensional observations, which makes it suitable for the LEO flight decision problem. Because the original DDPG algorithm uses a scalar reward, it is extended here to a multi-dimensional reward according to the multi-objective optimization problem. The resulting Multi-objective Joint DDPG (MJDDPG) algorithm is proposed for data acquisition and energy transmission in the low orbit satellite LEO assisted city monitoring network, and preferences among the optimization objectives are expressed by introducing weight parameters.
FIG. 5 is a diagram of the DDPG algorithm architecture. The original DDPG addresses a single-objective MDP with a scalar reward signal; in MJDDPG, by contrast, the reward in each experience tuple is a vector. Since the value of an action depends on the preference between competing objectives, a linear weighting method is used to compute a weighted sum of the elements of the reward vector, the corresponding reward value being $r = \mathbf{r}\mathbf{w}^T$; that is, the reward vector is converted to scalar form during the calculation.
In the setting used here, all weight parameters are selected from the interval [0.0, 1.0] according to the importance of each optimization objective, and the rest of the network structure is the same as in the DDPG algorithm. For the target network, the target value $y_t$ is calculated as follows:
$y_t = \mathbf{r}\mathbf{w}^T + \gamma Q'(s_{t+1}, \mu'(s_{t+1} \mid \theta^{\mu'}) \mid \theta^{Q'})$ (28)
the complete MJDDPG algorithm is shown in algorithm 2.
Figure BDA0003809164500000161
Figure BDA0003809164500000171
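The linear weighting of the reward vector and the target value of equation (28) can be sketched in a few lines. The sketch is framework-free: the value of the target critic at the target actor's action, $Q'(s_{t+1}, \mu'(s_{t+1}))$, is passed in as a number (`next_q`), since the network definitions are not part of the text. The function names and sample numbers are illustrative.

```python
def scalarize_reward(reward_vector, weights):
    """Linear weighting r . w^T that turns the reward vector into a scalar."""
    if len(reward_vector) != len(weights):
        raise ValueError("reward and weight vectors must have equal length")
    return sum(r * w for r, w in zip(reward_vector, weights))


def td_target(reward_vector, weights, gamma, next_q):
    """Equation (28): y_t = r w^T + gamma * Q'(s_{t+1}, mu'(s_{t+1})).

    `next_q` is the target critic's value at the target actor's action.
    """
    return scalarize_reward(reward_vector, weights) + gamma * next_q


# Illustrative reward [r_dc, r_eh, r_ec, r_aux] and weights in [0, 1].
y = td_target([2.0, 1.0, -4.0, -1.0], [0.5, 0.5, 0.25, 1.0], gamma=0.5, next_q=3.0)
```

Because only the reward is scalarized, the actor and critic updates can stay as in standard DDPG; changing the weights changes which objective the learned policy favours.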
S4, a Markov decision process problem model is shown in FIG. 4.
S401, defining a state space
In the low orbit satellite LEO assisted urban monitoring network, the state space is jointly determined by the monitoring nodes, the LEO and environmental information. At time slot t,
Figure BDA0003809164500000172
is the relative distance between the target monitoring node and the LEO in the Cartesian coordinate system. After the LEO completes the service of the current target node, a new monitoring node is selected as the target node according to the current state of the system, so this component helps direct the LEO to bring the target monitoring node into its coverage. $N_f(t)$ records the cumulative number of times, up to time t in the task period, that the LEO has flown beyond the restricted area. Combined with the absolute position of the LEO, $[x_u(t), y_u(t)]$, it discourages the LEO from flying out of the designated area and wasting resources. The number of nodes with data loss, $N_d(t)$, and the number of nodes with depleted energy, $N_e(t)$, prompt the LEO to serve high-demand monitoring nodes in time. The amount of data to be uploaded in the current target node,
Figure BDA0003809164500000173
guides the LEO to make efficient time slot and power allocation decisions. Thus, the state space is defined as follows:
Figure BDA0003809164500000174
where at time t = 0, $N_f(t)$, $N_d(t)$ and $N_e(t)$ are all 0.
S402, defining an action space
In the studied scenario, the state space needs to be mapped to a continuous action space. In the LEO-assisted city monitoring network scenario, multi-objective optimization is achieved by jointly optimizing the LEO flight decision, time slot allocation and power allocation. Based on the current system state and environment, the action selected by the LEO at time slot t comprises the flight speed v(t), the yaw angle θ(t), the time slot allocation τ(t) and the transmit power allocation $P_d(t)$; all action variables are continuous. Thus, the action that the LEO, as the agent, can take in time slot t can be expressed as:
Figure BDA0003809164500000175
where the flight speed v(t) and the yaw angle θ(t) lie in the intervals $[0, v_{max}]$ and $[-\pi, \pi]$, respectively, the yaw is represented by [cos(θ(t))], $P_d(t)$ lies in the range $[0, P_{max}]$, and τ(t) represents the proportion of a single time slot allocated to downlink energy transmission.
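One way to realise these ranges in practice is to rescale bounded actor outputs. The sketch below assumes the actor emits four values in [-1, 1] (for example from a tanh output layer, which the text does not specify) and maps them to the stated intervals; it also shows an assumed horizontal position update for one slot. Function names are illustrative.

```python
import math


def scale_action(raw, v_max, p_max):
    """Map four actor outputs in [-1, 1] to (v, theta, tau, p_d)."""
    raw = [max(-1.0, min(1.0, x)) for x in raw]  # clip to the assumed output range
    v = (raw[0] + 1.0) / 2.0 * v_max             # flight speed in [0, v_max]
    theta = raw[1] * math.pi                     # yaw angle in [-pi, pi]
    tau = (raw[2] + 1.0) / 2.0                   # downlink share of the slot in [0, 1]
    p_d = (raw[3] + 1.0) / 2.0 * p_max           # transmit power in [0, p_max]
    return v, theta, tau, p_d


def next_position(x, y, v, theta, slot_length):
    """Assumed horizontal move for one slot at speed v and yaw angle theta."""
    return (x + v * math.cos(theta) * slot_length,
            y + v * math.sin(theta) * slot_length)
```

For example, `scale_action([1.0, 0.0, -1.0, 0.0], 20.0, 2.0)` yields full speed, zero yaw, no downlink share and half the maximum transmit power.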
S403, establishing a reward function
In reinforcement learning, the reward function quantitatively evaluates the action taken by the agent. A suitable reward function is particularly important to the performance of a deep reinforcement learning algorithm, and the LEO learns its control policy through a well-designed reward function. The optimization goal is to optimize network performance by maximizing the data acquisition amount and the energy transmission amount and minimizing the LEO energy consumption, on the premise of guaranteeing network quality. The reward can thus be defined as a multidimensional vector:
Figure BDA0003809164500000181
Wherein r is dc (t)、r eh (t) and r ec (t) is an optimization objective, r aux And (t) is a penalty term.
Depending on how the LEO is serving the node at time slot t, the prize value may be expressed as:
Figure BDA0003809164500000182
where $D_k(t)$ represents the reward corresponding to the total amount of data collected by the LEO during the coverage service stage at monitoring node k; the larger the total data amount, the larger the reward. $E_k(t)$ represents the reward corresponding to the total amount of energy transmitted by the LEO during the coverage service stage at monitoring node k; the larger the total energy transmission, the larger the reward.
Figure BDA0003809164500000183
indicates the energy consumption of the LEO in time slot t. If the LEO is in the flight state,
Figure BDA0003809164500000184
If it is in the coverage service state, the energy consumption includes the coverage service energy consumption and the communication energy consumption, namely
Figure BDA0003809164500000185
Once the target monitoring node falls within the coverage radius of the LEO, the LEO enters the coverage service state and performs data acquisition and energy transfer. Otherwise, the LEO is in the flight phase, and $r_{dc}(t)$ and $r_{eh}(t)$ are both 0. The auxiliary penalty function $r_{aux}(t)$ is:
Figure BDA0003809164500000186
It can be seen that the first two terms of $r_{aux}(t)$ relate to the distance between the LEO and the target monitoring node. The farther the LEO is from the target monitoring node, the smaller the value of the penalty term, which helps the LEO identify the location of the target monitoring node and approach it. In addition, the LEO receives a negative reward if it attempts to fly out of the area, or if untimely service causes a monitoring node's data to overflow or its energy to be exhausted. Penalizing wrong flight decisions in this way encourages the LEO to preserve network quality during learning. In the simulation environment used here, the weight $w_{aux}$ corresponding to $r_{aux}(t)$ is always set to 1.

Claims (1)

1. The city monitoring-oriented satellite energy-carrying Internet of things resource optimization allocation method is characterized by comprising the following steps of:
s1, constructing a low orbit satellite LEO auxiliary city monitoring network model, wherein a specific scene is that single low orbit satellite LEO provides energy transmission and data acquisition service for multiple monitoring nodes through mobile deployment; the LEO is provided with a single antenna, the node is provided with a plurality of antennas, and information decoding and energy collection are respectively carried out based on an antenna switching structure;
constructing a transmission queue model of the system;
establishing a probability channel model by comprehensively considering the occurrence probability of the sight link and the non-sight link channels, and taking the probability channel model as a channel model of LEO and a ground monitoring node;
the LEO determines the next action in real time in the moving process, updates the position of the LEO, and jointly considers the flight energy consumption, the coverage service energy consumption and the communication energy consumption to construct an energy consumption model of the system;
the LEO moves to the position of a target node through decisions on flight speed and yaw angle, and sends a radio frequency signal to the monitoring nodes at a specific transmit power in a sub-time slot; in the sub-time slot, all monitoring nodes within the satellite energy transmission coverage are charged, and an energy transmission model of the system is constructed accordingly;
S1 specifically comprises the following steps:
s101, constructing a low orbit satellite LEO auxiliary city monitoring network model
The scene is that a single LEO provides energy transmission and data acquisition service for a plurality of monitoring nodes through mobile deployment; the LEO is provided with a single antenna, the node is provided with a plurality of antennas, and information decoding and energy collection are respectively carried out based on an antenna switching structure;
the duration of each flight task is T > 0, and the total time is divided into equal-length time slots, i.e. t = 1, 2, ..., T; a flight-coverage service communication protocol is adopted in the LEO work, wherein the LEO does not communicate with the monitoring nodes during flight and performs energy transmission and data acquisition on the monitoring nodes only during the coverage service; the coverage service time slot is divided into two parts corresponding to uplink and downlink communication respectively, and the LEO transmits information to the monitoring nodes in the cluster in the downlink sub-time slot while simultaneously performing energy transmission; the monitoring node receives the radio frequency signal based on the antenna switching structure to perform information decoding while simultaneously performing energy collection, and the cluster head node uploads monitoring data to the LEO in the uplink sub-time slot;
s102, constructing a transmission queue model
in this scenario, the set of monitoring nodes is denoted by
Figure QLYQS_1
and the position of node m is represented as $[x_m, y_m]$; for each monitoring node
Figure QLYQS_2
$\lambda_m(t)$ represents the data generation rate of node m during the execution of the monitoring task at time slot t; it is assumed that $\lambda_m(t)$ of each node obeys a Poisson distribution whose parameter is constant during the monitoring task, i.e. $\lambda_m(t) = \lambda_m$;
Figure QLYQS_3
represents the length of data waiting to be uploaded in the data transmission queue of monitoring node m at time slot t, and
Figure QLYQS_4
is expressed as:
Figure QLYQS_5
wherein
Figure QLYQS_6
Figure QLYQS_7
is the maximum capacity of the data transmission queue, assuming that for all monitoring nodes
Figure QLYQS_8
is the same; when
Figure QLYQS_9
exceeds
Figure QLYQS_10
newly collected data cannot be placed in the node data buffer and is discarded, causing data overflow;
for the energy transmission requirement,
Figure QLYQS_11
represents the residual energy of monitoring node m at time slot t, and $\mu_m(t)$ represents the energy consumption rate of the node in time slot t; $\mu_m(t)$ is the same in every time slot, i.e. $\mu_m(t) = \mu_m$, while $\mu_m$ differs between monitoring nodes because of hardware factors and deployment locations;
Figure QLYQS_12
is expressed as:
Figure QLYQS_13
wherein
Figure QLYQS_14
Figure QLYQS_15
is the maximum capacity of the energy queue, assuming that for all monitoring nodes
Figure QLYQS_16
is the same; when
Figure QLYQS_17
holds, the energy of the monitoring node is exhausted, normal service cannot be provided, and an energy hole occurs;
s103, constructing a system channel model
a probability channel model is established by comprehensively considering the occurrence probabilities of the line-of-sight (LOS) link and the non-line-of-sight (NLOS) link, and is used as the communication model of the LEO and the ground monitoring nodes; the corresponding path loss under the model is expressed as:
Figure QLYQS_18
wherein $\gamma_0 = (4\pi f_c/c)^{-2}$ represents the channel power gain at the reference distance $d_0 = 1$ m, $f_c$ represents the carrier frequency, and c represents the speed of light; $d_m(t)$ is the distance between the LEO and target node m;
Figure QLYQS_19
represents the path loss exponent; $\mu_{NLOS}$ is the attenuation coefficient of the NLOS link;
for monitoring node m, the LOS probability at time t is:
Figure QLYQS_20
where a and b are constants depending on the carrier frequency and the type of environment, and $\theta_m(t)$ is the elevation angle between the LEO and the target monitoring node, expressed as:
Figure QLYQS_21
the non-line-of-sight link probability is represented by
Figure QLYQS_22
the downlink channel power gain and the uplink channel power gain of the communication link between the LEO and target monitoring node m are denoted $h_m(t)$ and $g_m(t)$, respectively; i.e. the channel power gain between the LEO and the target node is expressed as:
Figure QLYQS_23
s104, constructing a system energy consumption model
assuming the LEO flies at a fixed height H > 0, the horizontal position at time slot t is denoted as $[x_u(t), y_u(t)]$; the LEO determines its next action in real time while moving and updates its position; the flight control of the LEO in this scenario is described by a flight speed v(t) limited by the maximum flight speed and a yaw angle $\theta(t)$ limited to $\theta(t) \in [-\pi, \pi]$; the energy consumption model of the LEO jointly considers flight energy consumption, coverage service energy consumption and communication energy consumption, wherein the propulsion power consumption of the LEO at speed V during flight is calculated by the following formula:
Figure QLYQS_24
wherein $P_0$ is the blade profile power in the coverage service state and $U_{tip}$ is the tip speed of the rotor blade; $P_i$ and $v_0$ represent the induced power and the mean rotor induced velocity in the coverage service state; for the parasitic power, $d_0$, $\rho$, s and A represent the fuselage drag ratio, air density, rotor solidity and rotor disc area, respectively; the propulsion power consumption of the LEO comprises the blade profile power, the induced power and the parasitic power, corresponding to the three terms of equation (7); the power consumption of the coverage service is obtained by setting V = 0:
$P_{hov} = P(V=0) = P_0 + P_i$ (8)
the energy consumed by the LEO for flight in time slot t is expressed as:
Figure QLYQS_25
in the coverage service stage, carrying out energy transmission and data acquisition on the monitoring nodes in the LEO coverage area;
s105, constructing an energy transmission model
the LEO moves to the position of the target node through its flight speed and yaw angle decisions, and in the sub-time slot τ(t) transmits a radio frequency signal to the monitoring nodes with transmit power $P_d(t)$, wherein $P_d(t)$ is subject to the constraint
Figure QLYQS_26
within τ(t), all monitoring nodes within the energy transmission coverage of the LEO are charged, and the received power at monitoring node m is expressed as:
Figure QLYQS_27
Applying a nonlinear energy transmission model as an air-ground energy transmission model; by the RF-EH model, the actual power of the receiving end is expressed as:
Figure QLYQS_28
wherein P_limit is the maximum output direct current power, and c and d are constants related to the circuit characteristics;
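A minimal sketch of a nonlinear RF-EH model with the three parameters named above follows. The patent's own expression is an image and is not reproduced in this text, so the code assumes the commonly used sigmoid-type saturation model; the function name and the numeric values of P_limit, c and d are hypothetical.

```python
import math

def harvested_power(p_in, p_limit, c, d):
    """DC output power (W) for RF input power p_in (W).

    Assumed sigmoid-type model: zero output at zero input,
    saturating at p_limit for large input.
    """
    num = p_limit * math.exp(c * d) - p_limit * math.exp(-c * (p_in - d))
    den = math.exp(c * d) * (1.0 + math.exp(-c * (p_in - d)))
    return num / den

# Hypothetical circuit constants, for illustration only.
P_LIMIT, C, D = 0.024, 150.0, 0.014

low = harvested_power(0.01, P_LIMIT, C, D)
high = harvested_power(0.02, P_LIMIT, C, D)
```

The point of the nonlinear model is the saturation: past a certain input power, raising the LEO transmit power yields almost no extra harvested power at the node.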
s2, dividing all M monitoring nodes into K clusters by using a K-Means algorithm, selecting a proper monitoring node from each cluster as a cluster head, collecting data of monitoring nodes in the clusters by the cluster head nodes, and forwarding the data to LEO; LEO transmits energy to all nodes in the cluster in the coverage service stage; in each time slot, LEO selects a cluster head node as a target node of the next service; the service priority of the node is considered for the selection of the target node;
s2 specifically comprises the following steps:
s201, distance formula design of K-Means algorithm
For the distance formula of the K-Means algorithm, the monitoring node characteristics and the Euclidean distance are combined into a joint distance:
Figure QLYQS_29
a and b are the length and the width of the city monitoring network model respectively;
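The clustering step can be sketched as follows. This is only an illustration under stated assumptions: the joint distance of the patent also includes monitoring node characteristics that are not recoverable from this text, so the sketch uses only the Euclidean distance on coordinates normalized by a and b, and it picks the member nearest to each centroid as cluster head, which is a hypothetical selection rule.

```python
import numpy as np

def kmeans_with_heads(points, k, a, b, iters=50, seed=0):
    """Lloyd's K-Means on positions scaled by the region size (a, b).

    Returns (labels, centers, heads); heads[j] is the index of the
    member closest to centroid j, or -1 if cluster j is empty.
    """
    rng = np.random.default_rng(seed)
    scaled = np.asarray(points, dtype=float) / np.array([a, b], dtype=float)
    centers = scaled[rng.choice(len(scaled), size=k, replace=False)]
    for _ in range(iters):
        dist = np.linalg.norm(scaled[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            scaled[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Recompute assignments against the final centers.
    dist = np.linalg.norm(scaled[:, None, :] - centers[None, :, :], axis=2)
    labels = dist.argmin(axis=1)
    heads = []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        heads.append(int(members[dist[members, j].argmin()]) if members.size else -1)
    return labels, centers, heads

nodes = [[1, 1], [2, 1], [1.5, 1.5], [90, 95], [92, 94], [91, 96]]
labels, centers, heads = kmeans_with_heads(nodes, k=2, a=100.0, b=100.0)
```

Normalizing by a and b keeps both coordinates in [0, 1], so a further feature term could be added on a comparable scale.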
dividing M monitoring nodes into K clusters according to a clustering algorithm, wherein node subsets corresponding to each cluster are expressed as
Figure QLYQS_30
wherein the intra-cluster nodes transmit their own monitoring data to the cluster head node, and the LEO transmits energy to all nodes in the cluster during the coverage service stage; the monitoring node serving as the cluster head uploads the monitoring data through the uplink in the time 1-τ(t), and the uplink transmit power of the cluster head node,
Figure QLYQS_31
depends on the total energy collected during τ(t), i.e.
Figure QLYQS_32
is positively correlated with
Figure QLYQS_33
and is expressed as:
Figure QLYQS_34
wherein ζ represents energy conversion efficiency, which is a constant value;
the upload data rate for cluster head node k is expressed as:
Figure QLYQS_35
in each time slot, the LEO selects a cluster head node as the target node of the next service; if the target node is still the current node, the LEO remains in the coverage service state in the next time slot; if the target node changes, the LEO enters the flight state and moves to the position of the target node through decisions on flight speed and yaw angle; the selection of the target node takes into account the service priorities of the nodes, which comprise the data acquisition priority, the energy supply priority and the node distance; the data acquisition priority is set based on the data queue length and the data generation rate of the monitoring node, and the energy supply priority is set in the same manner; the service priority of cluster head node k in time slot t is defined as Q_k(t):
Figure QLYQS_36
wherein the data acquisition priority weight and the energy supply priority weight are α = 1 and β = 5, respectively;
s3, comprehensively considering three aspects of data acquisition requirements, energy transmission requirements and low orbit satellite LEO energy consumption, realizing maximization of uplink data collection quantity and downlink energy transmission quantity through combined optimization of LEO flight decision and resource allocation, defining a multi-objective optimization problem, and optimizing;
S3 specifically comprises the following steps:
network performance is required to be optimized by comprehensively considering three aspects of data acquisition requirements, energy transmission requirements and LEO energy consumption:
(1) Data acquisition amount
The data acquisition during the coverage service of the LEO at the monitoring node k in the time slot t is realized based on the cluster head node, and the corresponding data acquisition amount is expressed as:
D_k(t) = R_k(t)(1 - τ(t)) (16)
the total data acquisition amount of the LEO on the monitoring node in the task period T is expressed as:
Figure QLYQS_37
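Equation (16) and its accumulation over the task period can be illustrated with a short Python sketch. The total is assumed here to be a plain sum of the per-slot amounts, since the exact expression is an image; the function names are hypothetical, and the slot length is normalized to 1 so that τ(t) is a fraction of the slot.

```python
def data_collected(rate, tau):
    """Equation (16): D_k(t) = R_k(t) * (1 - tau(t))."""
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    return rate * (1.0 - tau)

def total_data(rates, taus):
    """Assumed total: sum of per-slot amounts over the task period."""
    return sum(data_collected(r, t) for r, t in zip(rates, taus))

# Two slots: 2 Mbit/s with tau = 0.25, then 1 Mbit/s with tau = 0.5.
total = total_data([2e6, 1e6], [0.25, 0.5])
```

The example makes the trade-off visible: a larger τ(t) leaves more time for downlink energy transmission but shrinks the uplink window 1 - τ(t).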
(2) Energy transmission quantity
In time slot t, the LEO sends data acquisition information to cluster head node k and to the remaining monitoring nodes within the LEO coverage range and transmits energy to them; during coverage service at the position of cluster head node k, the energy transmission amount of the LEO in time slot t is expressed as:
Figure QLYQS_38
the total amount of energy transfer of the LEO during the mission period T is expressed as:
Figure QLYQS_39
(3) Low orbit satellite LEO energy consumption
According to the state of LEO in time slot t, LEO energy consumption is divided into flight energy consumption and coverage service energy consumption, wherein the energy consumption during coverage service comprises LEO coverage service energy consumption and downlink transmission total energy, and the energy consumption is expressed as:
Figure QLYQS_40
if the LEO is in the flight state, its energy consumption is denoted as E_uav(t) = P(v)t; the energy consumption of the LEO over a task period is expressed as:
Figure QLYQS_41
setting
Figure QLYQS_42
i.e. when the data volume at a monitoring node is larger than the threshold
Figure QLYQS_43
data overflow is considered to occur; setting
Figure QLYQS_44
i.e. when the energy of a monitoring node is less than
Figure QLYQS_45
an energy hole is considered to occur, wherein the data acquisition priority weight and the energy supply priority weight are α = 1 and β = 5, respectively;
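The two threshold conditions can be checked with a few lines of Python. The threshold symbols are images in this text, so the parameter names data_max and energy_min are hypothetical stand-ins, as are the sample values.

```python
def count_violations(data_queues, energy_levels, data_max, energy_min):
    """Count nodes in data overflow and nodes in an energy hole."""
    n_overflow = sum(1 for q in data_queues if q > data_max)
    n_energy_hole = sum(1 for e in energy_levels if e < energy_min)
    return n_overflow, n_energy_hole

# Three nodes: one exceeds the data threshold, one is below the energy threshold.
counts = count_violations([3, 8, 12], [0.5, 0.05, 0.2],
                          data_max=10, energy_min=0.1)
```

Counts of this kind are the natural inputs for the data queue and energy queue constraints of the optimization problem.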
the optimization aims at maximizing the uplink data collection amount and the downlink energy transmission amount by jointly optimizing LEO flight decisions and resource allocation;
s301, based on the above, the multi-objective optimization problem is defined as follows:
Figure QLYQS_46
Figure QLYQS_47
Figure QLYQS_48
Figure QLYQS_49
Figure QLYQS_50
Figure QLYQS_51
in constraint conditions of the optimization problem, C1 and C2 are flight speed and deflection angle constraints of LEO, C3 is LEO transmitting power constraints, and C4 and C5 are constraints of a monitoring node data queue and an energy queue respectively;
adopting the multi-objective joint optimization MJDDPG algorithm for data acquisition and energy transmission in the low orbit satellite LEO assisted urban monitoring network, and describing the preference among the optimization objectives by introducing weight parameters;
in the MJDDPG algorithm, the corresponding reward value is expressed as r = r w^T, which converts the reward vector into scalar form; according to the importance preference of each optimization objective, all weight parameters are selected in the interval [0.0, 1.0], and the rest of the network structure is the same as in the DDPG algorithm; for the target network, the target value y_t is calculated as follows:
y_t = r w^T + γ Q'(s_{t+1}, μ'(s_{t+1} | θ^{μ'}) | θ^{Q'}) (28);
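The scalarization and the target value of equation (28) can be written as a short NumPy sketch. The target actor and critic networks are not implemented here: next_q stands for the value Q'(s_{t+1}, μ'(s_{t+1} | θ^{μ'}) | θ^{Q'}) that they would return, and the function names and numeric values are hypothetical.

```python
import numpy as np

def scalarize(reward_vec, weights):
    """Convert the reward vector into a scalar: r w^T."""
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0.0) or np.any(w > 1.0):
        raise ValueError("weights must lie in [0.0, 1.0]")
    return float(np.dot(np.asarray(reward_vec, dtype=float), w))

def td_target(reward_vec, weights, gamma, next_q):
    """Equation (28): y_t = r w^T + gamma * Q'(s_{t+1}, mu'(s_{t+1}))."""
    return scalarize(reward_vec, weights) + gamma * next_q

# Example reward vector with four components and one weight per component.
y = td_target([2.0, 1.0, -0.5, -1.0], [1.0, 0.5, 0.5, 1.0],
              gamma=0.9, next_q=10.0)
```

Changing the weight vector shifts the preference among the objectives without touching the rest of the DDPG update.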
s4, constructing the problem model as a Markov decision process: first describing the state space of the system; then constructing the action space based on the system state and environment in the low orbit satellite LEO assisted urban monitoring network scenario, the actions selected by the LEO in a given time slot including the flight speed, flight angle, time slot allocation and transmit power allocation of the LEO; and establishing a reward function as the quantitative evaluation of the action taken by the agent in reinforcement learning;
S4 specifically comprises the following steps:
s401, defining a state space
In the low orbit satellite LEO assisted urban monitoring network, the state space is jointly determined by the monitoring nodes, the LEO and the environmental information; at time slot t,
Figure QLYQS_52
is the relative distance between the target monitoring node and the LEO in a Cartesian coordinate system; N_f(t) records the cumulative number of times, up to time slot t within the task period, that the LEO has exceeded the restricted area; the absolute position of the LEO is [x_u(t), y_u(t)]; the number of data-loss nodes N_d(t) and the number of power-off nodes N_e(t) prompt the LEO to serve high-demand monitoring nodes in time; the amount of data to be uploaded at the current target node,
Figure QLYQS_53
guides the LEO to make effective decisions on time slot and power resource allocation; the state space is defined as follows:
Figure QLYQS_54
wherein at time t = 0, N_f(t), N_d(t) and N_e(t) are all 0;
s402, defining an action space
The state space is mapped to a continuous action space, and multi-objective optimization is realized by jointly optimizing the LEO flight decision, time slot allocation and power allocation in the LEO-assisted urban monitoring network scenario; based on the current system state and environment, the actions selected by the LEO at time slot t include the flight speed v(t), the flight angle θ(t), the time slot allocation τ(t) and the transmit power allocation P_d(t), all of which are continuous variables; the action taken by the LEO as an agent in time slot t is expressed as:
Figure QLYQS_55
wherein the flight speed v(t) and the yaw angle θ(t) lie in the intervals [0, v_max] and [-π, π], respectively, the yaw being represented by [cos(θ(t))]; P_d(t) lies in the range P_d ∈ [0, P_max]; τ(t) represents the proportion of a single time slot allocated to downlink energy transmission;
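The ranges above suggest a simple mapping from a bounded actor output to physical actions. The sketch below assumes the actor emits four values in [-1, 1] (for example from a tanh output layer, which the source does not state) and that τ(t) lies in [0, 1]; the function name and numeric values are hypothetical.

```python
import math

def map_action(raw, v_max, p_max):
    """Map four actor outputs in [-1, 1] to (v, theta, tau, P_d)."""
    a_v, a_theta, a_tau, a_p = (max(-1.0, min(1.0, x)) for x in raw)
    v = 0.5 * (a_v + 1.0) * v_max      # flight speed in [0, v_max]
    theta = a_theta * math.pi          # yaw angle in [-pi, pi]
    tau = 0.5 * (a_tau + 1.0)          # energy-transfer share in [0, 1]
    p_d = 0.5 * (a_p + 1.0) * p_max    # transmit power in [0, P_max]
    return v, theta, tau, p_d

action = map_action([1.0, 0.0, 0.0, -1.0], v_max=20.0, p_max=5.0)
```

Clipping before scaling keeps every action inside its interval even if exploration noise pushes the raw output outside [-1, 1].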
s403, establishing a reward function
The optimization aim is to optimize the network performance by maximizing the data acquisition amount, the energy transmission amount and minimizing the LEO energy consumption on the premise of guaranteeing the network quality; rewards are defined as multidimensional vectors:
Figure QLYQS_56
wherein r_dc(t), r_eh(t) and r_ec(t) are the optimization objectives, and r_aux(t) is a penalty term;
according to the service condition of LEO to the node in time slot t, the rewarding value is expressed as:
Figure QLYQS_57
wherein D_k(t) represents the reward value corresponding to the total data volume collected by the LEO during the coverage service stage at monitoring node k: the larger the total data volume, the larger the reward value; E_k(t) represents the reward value corresponding to the total energy transmitted by the LEO during the coverage service stage at monitoring node k: the larger the total energy transmitted, the larger the reward value;
Figure QLYQS_58
denotes the energy consumption of the LEO in time slot t; if the LEO is in the flight state,
Figure QLYQS_59
if it is in the coverage service state, the consumption includes the coverage service energy consumption and the communication energy consumption, i.e.
Figure QLYQS_60
Once the target monitoring node falls within the coverage radius of the LEO, the LEO enters the coverage service state and performs data acquisition and energy transmission; otherwise, the LEO is in the flight phase and r_dc(t) and r_eh(t) are both 0; furthermore, the auxiliary penalty function r_aux(t) is:
Figure QLYQS_61
The first two terms in r_aux(t) are the relative distances between the target monitoring node and the LEO in a Cartesian coordinate system; by penalizing erroneous flight decisions of the LEO, the LEO is motivated to preserve network quality during learning.
CN202211006067.7A 2022-08-22 2022-08-22 Urban monitoring-oriented satellite energy-carrying Internet of things resource optimal allocation method Active CN115412156B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211006067.7A CN115412156B (en) 2022-08-22 2022-08-22 Urban monitoring-oriented satellite energy-carrying Internet of things resource optimal allocation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211006067.7A CN115412156B (en) 2022-08-22 2022-08-22 Urban monitoring-oriented satellite energy-carrying Internet of things resource optimal allocation method

Publications (2)

Publication Number Publication Date
CN115412156A CN115412156A (en) 2022-11-29
CN115412156B true CN115412156B (en) 2023-07-14

Family

ID=84161215

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211006067.7A Active CN115412156B (en) 2022-08-22 2022-08-22 Urban monitoring-oriented satellite energy-carrying Internet of things resource optimal allocation method

Country Status (1)

Country Link
CN (1) CN115412156B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant