CN114389678B - Multi-beam satellite resource allocation method based on decision performance evaluation - Google Patents

Info

Publication number
CN114389678B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210033327.3A
Other languages
Chinese (zh)
Other versions
CN114389678A (en)
Inventor
王朝炜
崔高峰
王力男
胡东伟
刘丽哲
王卫东
庞明亮
邓丹昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
CETC 54 Research Institute
Original Assignee
Beijing University of Posts and Telecommunications
CETC 54 Research Institute
Application filed by Beijing University of Posts and Telecommunications and CETC 54 Research Institute
Priority to CN202210033327.3A
Publication of CN114389678A
Application granted
Publication of CN114389678B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B7/00 Radio transmission systems, i.e. using radiation field
    • H04B7/14 Relay systems
    • H04B7/15 Active relay systems
    • H04B7/185 Space-based or airborne stations; Stations for satellite systems
    • H04B7/1851 Systems using a satellite or space-based relay
    • H04B7/18519 Operations control, administration or maintenance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B17/00 Monitoring; Testing
    • H04B17/30 Monitoring; Testing of propagation channels
    • H04B17/309 Measuring or estimating channel quality parameters
    • H04B17/336 Signal-to-interference ratio [SIR] or carrier-to-interference ratio [CIR]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B17/00 Monitoring; Testing
    • H04B17/30 Monitoring; Testing of propagation channels
    • H04B17/309 Measuring or estimating channel quality parameters
    • H04B17/345 Interference values

Abstract

The invention discloses a multi-beam satellite resource allocation method based on decision performance evaluation, belonging to the field of satellite communication. First, for the M beams of a single satellite, each serving N users, a co-frequency networking communication scenario is built. In the t-th time slot, user (m, n) requests data from the satellite, and the co-channel interference $I_{m,n,t}$, the signal-to-interference-plus-noise ratio $SINR_{m,n,t}$ and the communication rate $C_{m,n,t}$ of the communication link [m, n, t] are calculated. Then the total throughput $C_{total}$ of the satellite communication system over the time period T is calculated from the communication rate of each user in each time slot, and a multi-objective optimization model that considers both the throughput of the communication system and the fairness of users is built. Finally, the multi-objective optimization model is solved with DDPG to obtain a joint allocation of bandwidth and power resources that satisfies both system throughput and user fairness. The method keeps co-channel interference among users low in a full frequency reuse scenario and improves the total throughput of the satellite system while taking system fairness into account.

Description

Multi-beam satellite resource allocation method based on decision performance evaluation
Technical Field
The invention belongs to the field of satellite communication, and particularly relates to a multi-beam satellite resource allocation method based on decision performance evaluation.
Background
As an important part of the national information network infrastructure, satellite communication systems compensate well for the shortcomings of terrestrial networks and have become an indispensable part of the national information network.
With the rapid development of Internet of Things technology, and in particular the growing number of terminals, traffic volume has increased sharply and the spectrum resources of satellite communication systems are increasingly strained. Future communication satellites must raise their capacity to throughput on the order of TB/s, and the limited available bandwidth is the main bottleneck to achieving such throughput.
The introduction of multi-beam satellites has greatly improved the spectrum utilization of satellite communication systems and promoted the development of high-throughput satellites. It provides favorable conditions for meeting growing service demands, for reliable and flexible connectivity, and for the transition of satellites from broadcast to broadband missions.
To improve frequency band utilization with a multi-beam satellite, the same frequency band must be reused among a large number of spot beams. Reusing the entire system bandwidth in every beam achieves the highest frequency reuse, but it also causes very strong co-channel interference, which makes resource allocation in the satellite system very complicated.
At present, much research on multi-beam satellite resource allocation is carried out in the beam domain, where each beam uses only part of the frequency band. This reduces the co-channel interference of the system, but such an allocation scheme cannot make full use of the total frequency resources of the system. When all beams use the same frequency band, jointly optimizing the frequency domain and the power domain for users, so that total system throughput improves while user fairness is ensured, is of great significance for the development of next-generation satellite communication technology.
Disclosure of Invention
To address the above problems, the invention provides a multi-beam satellite resource allocation method based on decision performance evaluation, which improves the total throughput of the satellite system while taking system fairness into account by reasonably allocating bandwidth and power resources to users.
The multi-beam resource allocation method specifically comprises the following steps:
Step one: build a co-frequency networking communication scenario comprising a plurality of users and a single multi-beam satellite.
The multi-beam satellite has M beams with N users under each beam, where the n-th user under the m-th beam is denoted (m, n) and is located at position $\omega_{m,n}$.
Step two: in the t-th time slot, user (m, n) requests a data download from the multi-beam satellite, and the co-channel interference $I_{m,n,t}$ on the communication link [m, n, t] between the multi-beam satellite and the user is calculated.
The calculation formula is as follows:
$$I_{m,n,t} = \sum_{[m',n',t] \in \Xi_f,\ (m',n') \neq (m,n)} P_{m',n',t}\, h_{m',n'}$$
Here [m, n, t] is the communication link between user (m, n) and the multi-beam satellite in the t-th time slot, and $\varphi$ is the angle between the direction of the received signal and the axis of the receiving antenna. $\Xi_f$ denotes the set of all co-frequency channels at frequency f, where f is the frequency used by user (m, n) in the t-th time slot; $[m',n',t]$ denotes a channel that causes co-channel interference to the communication link [m, n, t], $P_{m',n',t}$ is the data transmission power of link $[m',n',t]$, and $h_{m',n'}$ is the channel gain of link $[m',n',t]$.
Step three: use the co-channel interference $I_{m,n,t}$ to calculate the signal-to-interference-plus-noise ratio $SINR_{m,n,t}$ of the communication link [m, n, t], and from it the communication rate $C_{m,n,t}$ of the link.
The formula for calculating the signal-to-interference-plus-noise ratio is as follows:
$$SINR_{m,n,t} = \frac{P_{m,n,t}\, h_{m,n}}{N_0 B_{m,n,t} + I_{m,n,t}}$$
$P_{m,n,t}$ is the transmission power of the communication link [m, n, t], $N_0$ is the power spectral density of Gaussian white noise, $B_{m,n,t}$ is the bandwidth of the communication link [m, n, t], and $h_{m,n}$ is the satellite-to-ground link gain:
[equation image BDA0003467364550000028: $h_{m,n}$ expressed in terms of $G_t(\theta)$, $G_r(\varphi)$ and $PL$]
$G_t(\theta)$ is the transmit antenna gain of link [m, n, t], where $\theta$ is the angle of user (m, n) from the antenna axis of the beam in which it is located; $PL$ is the loss and fading of signal power caused by the channel environment; $G_r(\varphi)$ is the receive antenna gain of link [m, n, t].
The communication rate $C_{m,n,t}$ is calculated as follows:
$$C_{m,n,t} = B_{m,n,t} \log_2\left(1 + SINR_{m,n,t}\right)$$
Step four: use the communication rate $C_{m,n,t}$ of each user in each time slot to calculate the total throughput $C_{total}$ of the multi-beam satellite communication system over the time period T.
The calculation formula is as follows:
$$C_{total} = \sum_{t=1}^{T} \sum_{m=1}^{M} \sum_{n=1}^{N} C_{m,n,t}$$
Step five: build a multi-objective optimization model that considers both the throughput of the communication system and the fairness of users.
First, the user fairness F is calculated from the Jain index; the calculation formula is as follows:
$$F = \frac{\left(\sum_{m=1}^{M}\sum_{n=1}^{N} \frac{C_{m,n,t}}{D_{m,n}^{0}}\right)^{2}}{M N \sum_{m=1}^{M}\sum_{n=1}^{N} \left(\frac{C_{m,n,t}}{D_{m,n}^{0}}\right)^{2}}$$
$D_{m,n}^{0}$ is the initial service demand of user (m, n). The ratio $C_{m,n,t} / D_{m,n}^{0}$, which represents the service satisfaction index of user (m, n), replaces the communication rate $C_{m,n,t}$ of the original Jain index formula when the fairness is calculated.
The multi-objective optimization model is then as follows:
$$P1:\ \max\ C_{total}$$
$$P2:\ \max\ F$$
$$\begin{aligned}
\text{s.t.}\quad & C1:\ C_{m,n,t} \le D_{m,n,t}, \quad \forall m, n, t \\
& C2:\ \sum_{m=1}^{M}\sum_{n=1}^{N} P_{m,n,t} \le P_{total}, \quad \forall t \\
& C3:\ \sum_{n=1}^{N} B_{m,n,t} \le B_{total}, \quad \forall m, t
\end{aligned}$$
The optimization objective P1 maximizes the total throughput of the communication system; the optimization objective P2 maximizes the fairness of the communication system.
Constraint C1 states that the throughput $C_{m,n,t}$ of a user in the t-th time slot is not greater than the user's remaining traffic $D_{m,n,t}$ in that time slot.
After the satellite has served user (m, n) for t time slots, the remaining traffic of the user in time slot t+1 is:
$$D_{m,n,t+1} = D_{m,n}^{0} - \sum_{\tau=1}^{t} C_{m,n,\tau}$$
where $D_{m,n}^{0}$ is the initial service demand of user (m, n).
Constraint C2 states that the total power of all users is not greater than the maximum transmit power $P_{total}$ of the satellite.
Constraint C3 states that the total bandwidth of all users in the same beam is not greater than the total bandwidth $B_{total}$ of the communication system.
Step six: solve the multi-objective optimization model with the deep reinforcement learning network DDPG to obtain a joint allocation of bandwidth and power resources that satisfies both system throughput and user fairness.
The DDPG deep reinforcement learning network comprises a resource allocation decision network and a decision performance evaluation network.
The resource allocation process is modeled as a Markov process and solved with DDPG.
The network state vector of the t-th time slot in the decision network is defined as $s_t = \{H_t, I_t, C_t, D_t\}$, where $H_t$ represents the channel gain of each user, $I_t$ the co-channel interference strength received by each user, $C_t$ the throughput of each user, and $D_t$ the traffic of each user.
The actions in the decision network comprise the bandwidth and power resources allocated in each beam. For each time slot $t \in \{1, 2, 3, \dots\}$ the action is $A_t = \{P_t, B_t\}$, where $P_t$ is the data transmission power allocated by the system to the users and $B_t$ is the data transmission bandwidth allocated by the system to the users.
Performance evaluation network: it calculates the current Q value from the current resource allocation strategy output by the decision network, and updates its network parameters by computing a loss function.
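The state and action of step six can be laid out as flat vectors before they are passed to a neural network. The following is a minimal Python sketch under stated assumptions: the per-user quantities are stored as M × N nested lists, the state concatenates $H_t$, $I_t$, $C_t$ and $D_t$ in that order, and the first M·N action entries hold powers while the last M·N hold bandwidths. The names `build_state` and `split_action` and this ordering are illustrative only and are not specified by the invention.

```python
from typing import List, Tuple

Grid = List[List[float]]  # one value per user, indexed [m][n]


def flatten(grid: Grid) -> List[float]:
    """Row-major flattening of an M x N grid."""
    return [value for row in grid for value in row]


def build_state(H: Grid, I: Grid, C: Grid, D: Grid) -> List[float]:
    """Concatenate H_t, I_t, C_t and D_t into one state vector of length 4*M*N."""
    return flatten(H) + flatten(I) + flatten(C) + flatten(D)


def split_action(action: List[float], M: int, N: int) -> Tuple[Grid, Grid]:
    """Split a length-2*M*N action vector into the power grid P_t and the bandwidth grid B_t.

    The [powers | bandwidths] ordering is an assumption made for this sketch.
    """
    if len(action) != 2 * M * N:
        raise ValueError("action must contain 2*M*N entries")
    half = M * N
    P = [action[m * N:(m + 1) * N] for m in range(M)]
    B = [action[half + m * N:half + (m + 1) * N] for m in range(M)]
    return P, B
```

With M = 2 and N = 2, the state vector has 16 entries and an 8-entry action splits into two 2 × 2 grids.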
The total reward value of the evaluation network is designed as: $R = \omega_1 R_1 + \omega_2 R_2 + \omega_3 R_3$
where $\omega_1$ to $\omega_3$ are the weights of the different rewards; $R_1$ is the reward for the total throughput of the system, $R_1 = C_{total}$; $R_2$ is the reward corresponding to the system fairness coefficient, $R_2 = F$; $R_3$ is an auxiliary reward:
[equation image BDA0003467364550000033: definition of $R_3$ in terms of $x_{mn}$, $u_{mn}$ and $z_t$]
where $x_{mn}$ indicates whether the resource allocation of user (m, n) is reasonable,
[equation image BDA0003467364550000034: definition of $x_{mn}$]
$u_{mn}$ indicates whether the throughput of user (m, n) in the t-th time slot is greater than its traffic volume,
[equation image BDA0003467364550000041: definition of $u_{mn}$]
and $z_t$ indicates whether the power allocated to all users exceeds the total power,
[equation image BDA0003467364550000042: definition of $z_t$]
An unreasonable resource allocation means that the agent has allocated bandwidth (or power) to a user without allocating power (or bandwidth) to that user.
The invention has the following advantages:
1) The multi-beam satellite resource allocation method based on decision performance evaluation considers both the total throughput of the multi-beam satellite communication system and the fairness among users, and optimizes the two objectives simultaneously. This avoids the situation where, under limited resources, optimizing system throughput alone leaves some users permanently unserved.
2) The method adopts a decision performance evaluation network. The parameters of the decision network are adjusted according to the evaluation result of the evaluation network so as to optimize the resource allocation scheme, the parameters of the evaluation network are updated as well, and accurate prediction by the decision network is achieved through iterative optimization.
Drawings
Fig. 1 is a schematic diagram of a multi-beam satellite resource allocation method based on decision performance evaluation according to the present invention.
Fig. 2 is a flowchart of a multi-beam satellite resource allocation method based on decision performance evaluation according to the present invention.
Fig. 3 is a diagram of the multi-beam GEO satellite downlink communication scenario constructed in the present invention.
Fig. 4 is a detailed design diagram of the decision-evaluation network according to the present invention.
Fig. 5 compares the throughput of the present invention with that of average resource allocation and random resource allocation under different bandwidths.
Fig. 6 compares the system fairness of the present invention with that of average resource allocation and random resource allocation under different bandwidths.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples.
The invention provides a multi-beam satellite resource allocation method based on decision performance evaluation. It targets the scenario in which users request data downloads from the satellite in a multi-beam satellite co-frequency networking system, and optimizes the bandwidth and power allocated to each user. As shown in Fig. 1, the users in each beam of the multi-beam satellite first generate service demands and forward them to the satellite via a gateway station, requesting data download via the satellite. After receiving the requests, the satellite collects the position information of each user and the resource information of the system and passes this information to the resource allocation decision network. The decision network allocates power and bandwidth to the users according to the state of each user, and sends the resource allocation strategy to the multi-beam satellite system and to the resource allocation evaluation network. The resource allocation evaluation network evaluates the decision made by the resource allocation decision network according to the actual return of the system and feeds the parameters back to the resource allocation decision network. This iterative loop continuously improves the accuracy of the decisions made by the resource allocation decision network, thereby maximizing system throughput and user fairness.
Based on deep reinforcement learning theory, the method models the optimization of total satellite system throughput and user fairness as a multi-objective optimization problem, models the continuous, time-correlated resource allocation process as a Markov process, builds a resource allocation decision network and a decision performance evaluation network using a DDPG neural network, and solves the multi-objective optimization problem. Bandwidth and power resources are reasonably allocated to the users in each time slot, so that co-channel interference among users stays low in the full frequency reuse scenario and the total throughput of the satellite system improves while system fairness is taken into account.
As shown in fig. 2, the multi-beam satellite resource allocation method specifically includes the following steps:
Step one: build a co-frequency networking communication scenario comprising a plurality of users and a single multi-beam satellite.
As shown in Fig. 3, consider a single multi-beam GEO satellite having M beams with N users per beam; the users include mobile terminals, multimedia terminals, vehicle terminals, and the like. The users in each beam generate service demands that are forwarded to the satellite via the gateway station, requesting data download via the satellite. After receiving the requests, the satellite acquires the position information of each user and the resource information of the system and transmits them to the gateway station through the user traffic generation module and the resource allocation module; the resource allocation decision network then allocates power and bandwidth to the users according to their states. The n-th user under the m-th beam is denoted (m, n) and is located at position $\omega_{m,n}$.
Step two: in the t-th time slot, user (m, n) requests a data download from the multi-beam satellite, and the co-channel interference $I_{m,n,t}$ on the communication link [m, n, t] between the multi-beam satellite and the user is calculated.
In the t-th time slot the satellite transmits data to user (m, n) with power $P_{m,n,t}$; [m, n, t] denotes this communication link and $h_{m,n}$ is the satellite-to-ground link gain. Assume the multi-beam satellite channel matrix is $H_{matrix}$:
$$H_{matrix} = [h_{1,1}, h_{1,2}, \dots, h_{m,n}, \dots, h_{M,N}]$$
[equation image BDA0003467364550000051: $h_{m,n}$ expressed in terms of $G_t(\theta)$, $G_r(\varphi)$ and $PL$]
where $G_t(\theta)$ is the transmit antenna gain of link [m, n, t] and $\theta$ is the angle of user (m, n) from the antenna axis of the beam in which it is located; $PL$ is the loss and fading of signal power caused by the channel environment; $G_r(\varphi)$ is the receive antenna gain of link [m, n, t], and $\varphi$ is the angle between the direction of the received signal and the axis of the receiving antenna.
Considering co-channel interference, let the frequency used by user (m, n) in the t-th time slot be f; the co-channel interference it receives is $I_{m,n,t}$:
$$I_{m,n,t} = \sum_{[m',n',t] \in \Xi_f,\ (m',n') \neq (m,n)} P_{m',n',t}\, h_{m',n'}$$
where $\Xi_f$ denotes the set of all co-frequency channels at frequency f, $[m',n',t]$ denotes a channel that causes co-channel interference to the communication link [m, n, t], $P_{m',n',t}$ is the data transmission power of link $[m',n',t]$, and $h_{m',n'}$ is the channel gain of link $[m',n',t]$.
Step three: use the co-channel interference $I_{m,n,t}$ to calculate the signal-to-interference-plus-noise ratio $SINR_{m,n,t}$ of the communication link [m, n, t], and from it the communication rate $C_{m,n,t}$ of the link.
The formula for calculating the signal-to-interference-plus-noise ratio is as follows:
$$SINR_{m,n,t} = \frac{P_{m,n,t}\, h_{m,n}}{N_0 B_{m,n,t} + I_{m,n,t}}$$
$P_{m,n,t}$ is the transmission power of the communication link [m, n, t], $N_0$ is the power spectral density of Gaussian white noise, and $B_{m,n,t}$ is the bandwidth of the communication link [m, n, t].
The communication rate $C_{m,n,t}$ is calculated as follows:
$$C_{m,n,t} = B_{m,n,t} \log_2\left(1 + SINR_{m,n,t}\right)$$
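The two quantities of step three can be sketched in a few lines of Python. The sketch assumes the usual form of the signal-to-interference-plus-noise ratio (received signal power over noise power $N_0 B$ plus interference) and the Shannon capacity formula for the rate; all inputs are linear-scale values (not dB), and the function names are illustrative.

```python
import math


def sinr(p: float, h: float, n0: float, b: float, interference: float) -> float:
    """SINR of one link: received power P*h over noise power N0*B plus interference."""
    return p * h / (n0 * b + interference)


def shannon_rate(b: float, sinr_value: float) -> float:
    """Communication rate B * log2(1 + SINR) in bits per second when B is in Hz."""
    return b * math.log2(1.0 + sinr_value)
```

For example, a link with P = 2, h = 0.5, N0 = 0.001, B = 100 and I = 0.9 has an SINR of about 1, which gives a rate of about 100 (one bit per second per hertz).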
Step four: use the communication rate $C_{m,n,t}$ of each user in each time slot to calculate the total throughput $C_{total}$ of the multi-beam satellite communication system over the time period T.
After the satellite has served user (m, n) for t time slots, the remaining traffic of user (m, n) in time slot t+1 is:
$$D_{m,n,t+1} = D_{m,n}^{0} - \sum_{\tau=1}^{t} C_{m,n,\tau}$$
where $D_{m,n}^{0}$ is the initial service demand of user (m, n).
The total throughput of the multi-beam satellite communication system over the time period T is calculated as:
$$C_{total} = \sum_{t=1}^{T} \sum_{m=1}^{M} \sum_{n=1}^{N} C_{m,n,t}$$
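The bookkeeping of step four can be illustrated with a short Python sketch. It assumes time slots of unit duration, so that the rate achieved in a slot equals the traffic delivered in that slot, and it stores rates as a nested list indexed `[t][m][n]`. Both choices, and the function names, are assumptions made for illustration.

```python
from typing import List


def remaining_traffic(initial_demand: float, rates_so_far: List[float]) -> float:
    """Remaining traffic of one user after the slots in `rates_so_far` have been served."""
    return initial_demand - sum(rates_so_far)


def total_throughput(rates: List[List[List[float]]]) -> float:
    """Sum the per-user rates over all time slots t, beams m and users n."""
    return sum(rate for slot in rates for beam in slot for rate in beam)
```

A user with an initial demand of 10 that was served 3 and then 2 units has 5 units of traffic left for the next slot.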
Step five: build a multi-objective optimization model that considers both the throughput of the communication system and the fairness of users.
While considering system throughput, the invention also takes user fairness as an optimization objective. Without a fairness constraint, some users might never be allocated any resources when resources are limited, which is clearly unreasonable. First, the user fairness F is calculated from the Jain index; the calculation formula is as follows:
$$F = \frac{\left(\sum_{m=1}^{M}\sum_{n=1}^{N} \frac{C_{m,n,t}}{D_{m,n}^{0}}\right)^{2}}{M N \sum_{m=1}^{M}\sum_{n=1}^{N} \left(\frac{C_{m,n,t}}{D_{m,n}^{0}}\right)^{2}}$$
where $D_{m,n}^{0}$ is the initial service demand of user (m, n). Since the service requests of different users differ, the ratio $C_{m,n,t} / D_{m,n}^{0}$, which represents the service satisfaction index of user (m, n), is used in place of the communication rate $C_{m,n,t}$ of the original formula, and the fairness is calculated according to the Jain index formula.
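The Jain index over service satisfaction ratios can be sketched as follows. The sketch assumes that the satisfaction index of a user is its rate divided by its initial demand and that every demand is positive; returning 0 when no user is served at all is a choice made here to avoid a division by zero, not something stated in the text.

```python
from typing import List


def jain_fairness(rates: List[float], demands: List[float]) -> float:
    """Jain index of the satisfaction ratios rate/demand; 1.0 means perfectly fair."""
    ratios = [r / d for r, d in zip(rates, demands)]
    square_sum = sum(x * x for x in ratios)
    if square_sum == 0.0:
        return 0.0  # nobody is served: treated as the worst case in this sketch
    return sum(ratios) ** 2 / (len(ratios) * square_sum)
```

Two users with rates 1 and 2 and demands 2 and 4 are both 50% satisfied, so the index is 1.0 even though their raw rates differ. Serving only one of four users gives 0.25, the minimum for four users.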
The complete multi-objective optimization model is then as follows:
$$P1:\ \max\ C_{total}$$
$$P2:\ \max\ F$$
$$\begin{aligned}
\text{s.t.}\quad & C1:\ C_{m,n,t} \le D_{m,n,t}, \quad \forall m, n, t \\
& C2:\ \sum_{m=1}^{M}\sum_{n=1}^{N} P_{m,n,t} \le P_{total}, \quad \forall t \\
& C3:\ \sum_{n=1}^{N} B_{m,n,t} \le B_{total}, \quad \forall m, t
\end{aligned}$$
The optimization objective P1 maximizes the total throughput of the communication system; the optimization objective P2 maximizes the fairness of the communication system.
Constraint C1 states that the throughput $C_{m,n,t}$ of a user in the t-th time slot is not greater than the user's remaining traffic $D_{m,n,t}$ in that time slot.
Constraint C2 states that the total power of all users is not greater than the maximum transmit power $P_{total}$ of the satellite.
Constraint C3 states that the total bandwidth of all users in the same beam is not greater than the total bandwidth $B_{total}$ of the communication system.
The total bandwidth $B_{total}$ of the satellite system is uniformly divided into $N_B$ sub-channels.
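The three constraints can be checked for one time slot with a short Python sketch. It assumes all per-user quantities are M × N nested lists indexed `[m][n]`; the function names and the tuple return value are illustrative.

```python
from typing import List, Tuple

Grid = List[List[float]]  # indexed [m][n] for one time slot


def sub_channel_bandwidth(b_total: float, n_sub: int) -> float:
    """Width of one sub-channel when B_total is divided uniformly into N_B parts."""
    return b_total / n_sub


def check_constraints(
    rate: Grid,
    remaining: Grid,
    power: Grid,
    bandwidth: Grid,
    p_total: float,
    b_total: float,
) -> Tuple[bool, bool, bool]:
    """Return (C1, C2, C3) satisfaction flags for one time slot."""
    # C1: no user receives more than its remaining traffic
    c1 = all(
        r <= d
        for r_row, d_row in zip(rate, remaining)
        for r, d in zip(r_row, d_row)
    )
    # C2: power summed over all users stays within the satellite budget
    c2 = sum(sum(row) for row in power) <= p_total
    # C3: bandwidth summed within each beam stays within the system bandwidth
    c3 = all(sum(row) <= b_total for row in bandwidth)
    return c1, c2, c3
```

Note that C2 is a single system-wide budget while C3 is checked once per beam, which reflects the full frequency reuse scenario in which every beam may use the whole bandwidth.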
Step six: solve the multi-objective optimization model with the deep reinforcement learning network DDPG to obtain a joint allocation of bandwidth and power resources that satisfies both system throughput and user fairness.
Because the resource allocation decision at time t+1 is influenced by the resource allocation at the previous time t, the method models the continuous, time-correlated resource allocation process as a Markov process and builds the resource allocation decision with a deep reinforcement learning network, so as to achieve the joint optimization target of system throughput and user fairness.
As shown in Fig. 4, the performance evaluation scheme based on the decision-evaluation algorithm combines a policy-gradient-based resource allocation decision network with a value-function-based decision performance evaluation network. The multi-beam satellite communication system serves as the environment; the system state it outputs is fed to the DDPG network, which selects a resource allocation decision according to the environment state.
The network structure, state, actions and rewards are designed as follows:
Resource allocation decision network: it is responsible for selecting the current resource allocation action a according to the service demands of the users and for interacting with the multi-beam satellite communication system to generate the total system throughput, the observed system fairness index and the state. Its input is the environment state of the multi-beam satellite communication system and its output is the resource allocation action. The decision network corrects its network parameters according to the resource allocation decision evaluation result returned by the evaluation network.
State design
The state is a description of the external environment; the agent relies on the state parameters to make subsequent decisions. The state in the decision network is defined as s. The state changes over time, and the network state vector of the t-th time slot is $s_t = \{H_t, I_t, C_t, D_t\}$, where $H_t$ represents the channel gain of each user, $I_t$ the co-channel interference strength received by each user, $C_t$ the throughput of each user, and $D_t$ the traffic of each user.
Action design
The action is the output of the agent and is used to adjust variables in the system environment. The action in the decision network is defined as a. The network action a is a resource allocation decision for the predicted situation at the next moment and must be applied to the real system to adjust the resource variables.
The decision network action mainly comprises the bandwidth and power resources allocated in each beam, and the feasible solutions of these resource parameters form the action space A of the agent. For each time slot $t \in \{1, 2, 3, \dots\}$, the system allocates power and bandwidth resources to users with traffic demands, taking the influence of co-channel interference into account.
The set of actions is denoted $A_t = \{P_t, B_t\}$, where $P_t$ is the data transmission power allocated by the system to the users, $P_t = \{P_{1,1,t}, P_{1,2,t}, \dots, P_{M,N,t}\}$, and $B_t$ is the data transmission bandwidth allocated by the system to the users, $B_t = \{B_{1,1,t}, B_{1,2,t}, \dots, B_{M,N,t}\}$.
System performance evaluation network: it calculates the current Q value from the current resource allocation strategy output by the decision network, and updates its network parameters by computing a loss function.
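The text does not spell out the loss function of the evaluation network. In DDPG the critic is commonly trained on the mean squared error between its Q value and a one-step temporal-difference target, and the Python sketch below shows that computation on plain numbers standing in for network outputs. The discount factor `gamma` is not given in the text and is an assumption, as are the function names.

```python
from typing import List


def td_targets(rewards: List[float], next_q: List[float], gamma: float) -> List[float]:
    """One-step targets y = r + gamma * Q'(s', a') for a batch of transitions."""
    return [r + gamma * q for r, q in zip(rewards, next_q)]


def critic_loss(q_values: List[float], targets: List[float]) -> float:
    """Mean squared error between the critic's Q values and the TD targets."""
    errors = [(q - y) ** 2 for q, y in zip(q_values, targets)]
    return sum(errors) / len(errors)
```

In a full implementation `next_q` would come from target copies of the decision and evaluation networks, and the gradient of this loss would update the evaluation network parameters.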
Reward design
The reward value of the evaluation network must reflect how good the system performance resulting from the decision network's resource allocation decision is. For each time slot, the environment produces a system reward value based on the current state, the action taken in the current state, and the next state. The design of the reward value should be tied to the goal of the resource allocation decision; the total reward value of the evaluation network comprises the following three rewards:
$R_1$: considering the optimization objective P1, the total system throughput is taken as the first reward:
$$R_1 = C_{total}$$
$R_2$: considering the optimization objective P2, the system fairness coefficient is taken as the second reward:
$$R_2 = F = \frac{\left(\sum_{m=1}^{M}\sum_{n=1}^{N} \frac{C_{m,n,t}}{D_{m,n}^{0}}\right)^{2}}{N_{total} \sum_{m=1}^{M}\sum_{n=1}^{N} \left(\frac{C_{m,n,t}}{D_{m,n}^{0}}\right)^{2}}$$
where $C_{m,n,t}$ is the communication rate of user (m, n) in the t-th time slot, $D_{m,n}^{0}$ is the user's initial service demand, and $N_{total}$ is the number of all users in the satellite communication system, i.e. $N_{total} = M \times N$.
$R_3$: to accelerate the convergence of the model, an auxiliary reward $R_3$ is set:
[equation image BDA0003467364550000083: definition of $R_3$ in terms of $x_{mn}$, $u_{mn}$ and $z_t$]
where $x_{mn}$ indicates whether the resource allocation of user (m, n) is reasonable,
[equation image BDA0003467364550000084: definition of $x_{mn}$]
$u_{mn}$ indicates whether the throughput of user (m, n) in the t-th time slot is greater than its traffic volume,
[equation image BDA0003467364550000085: definition of $u_{mn}$]
and $z_t$ indicates whether the power allocated to all users exceeds the total power,
[equation image BDA0003467364550000086: definition of $z_t$]
An unreasonable resource allocation means that the agent allocates bandwidth (or power) to a user without allocating power (or bandwidth) to that user.
The total reward is designed as: $R = \omega_1 R_1 + \omega_2 R_2 + \omega_3 R_3$
where $\omega_1$ to $\omega_3$ are the weights of the different rewards.
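The weighted total reward and two of the conditions behind the auxiliary reward can be sketched in Python. The sketch treats $R_3$ as a value supplied by the caller, because its exact formula is not reproduced here. The helper names are hypothetical, and reading "allocated" as "strictly greater than zero" is an assumption.

```python
from typing import Sequence, Tuple


def is_unreasonable(power: float, bandwidth: float) -> bool:
    """True when a user received power without bandwidth, or bandwidth without power."""
    return (power > 0) != (bandwidth > 0)


def exceeds_total_power(powers: Sequence[float], p_total: float) -> bool:
    """True when the power allocated to all users exceeds the satellite budget."""
    return sum(powers) > p_total


def total_reward(
    r1: float, r2: float, r3: float, weights: Tuple[float, float, float]
) -> float:
    """Weighted sum R = w1*R1 + w2*R2 + w3*R3."""
    w1, w2, w3 = weights
    return w1 * r1 + w2 * r2 + w3 * r3
```

Because $R_1$ is a throughput and $R_2$ lies between 0 and 1, the weights also have to compensate for the different scales of the three terms; the values used in any experiment would need to be chosen with that in mind.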
Network design
The invention uses the DDPG algorithm from deep reinforcement learning to optimize the allocation of satellite system resources, which requires training the parameters of the DDPG network.
The decision network of the DDPG network has two hidden layers with 128 neurons each, uses sigmoid as the activation function, and is optimized with the Adam optimizer at a learning rate of 1e-4.
The evaluation network has two hidden layers with 256 neurons each, uses ReLU as the activation function, and is optimized with the Adam optimizer at a learning rate of 2e-4. The capacity of the experience pool is set to 10000, the batch size sampled each time is 256, and the variance of the exploration noise is 0.7.
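The training settings above can be collected in one place, together with a minimal experience pool and exploration noise, as in the Python sketch below. The hyperparameter values come from the text. The `ReplayBuffer` class, the Gaussian form of the noise, and the clipping of actions to [0, 1] (chosen here because the decision network ends in a sigmoid) are assumptions made for illustration.

```python
import math
import random
from collections import deque
from typing import Deque, List, Tuple

HYPERPARAMS = {
    "actor_hidden": (128, 128),   # decision network: two hidden layers, sigmoid
    "critic_hidden": (256, 256),  # evaluation network: two hidden layers, ReLU
    "actor_lr": 1e-4,
    "critic_lr": 2e-4,
    "buffer_capacity": 10000,
    "batch_size": 256,
    "noise_variance": 0.7,
}


class ReplayBuffer:
    """Fixed-capacity experience pool; the oldest transitions are dropped first."""

    def __init__(self, capacity: int) -> None:
        self._buffer: Deque[Tuple] = deque(maxlen=capacity)

    def push(self, transition: Tuple) -> None:
        self._buffer.append(transition)

    def sample(self, batch_size: int, rng: random.Random) -> List[Tuple]:
        """Draw a batch uniformly at random without replacement."""
        return rng.sample(list(self._buffer), batch_size)

    def __len__(self) -> int:
        return len(self._buffer)


def noisy_action(action: List[float], variance: float, rng: random.Random) -> List[float]:
    """Add zero-mean Gaussian noise and clip each entry back into [0, 1]."""
    sigma = math.sqrt(variance)
    return [min(1.0, max(0.0, a + rng.gauss(0.0, sigma))) for a in action]
```

A typical loop would push one transition per time slot and start sampling batches of 256 only once the pool holds at least that many transitions.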
Performance analysis
The proposed method was compared with average resource allocation and random resource allocation schemes. As shown in Fig. 5 and Fig. 6, the results show that the proposed joint bandwidth and power allocation method for multi-beam satellites, based on decision performance evaluation and multi-objective optimization, improves system throughput while ensuring user fairness under different system resource conditions.

Claims (5)

1. A multi-beam satellite resource allocation method based on decision performance evaluation, characterized by comprising the following steps: first, for a single multi-beam satellite with M beams and N users under each beam, a communication scenario in which the users and the satellite operate in a co-frequency network is established;
in the t-th time slot, user (m, n) requests a data download from the satellite, and for the communication link [m, n, t] between the satellite and the user, the co-channel interference I_{m,n,t}, the signal-to-interference-plus-noise ratio SINR_{m,n,t}, and the communication rate are calculated, the communication rate being denoted
Figure FDA0003830739060000011
then, the communication rate of each user in each time slot is used to calculate the total throughput C_total of the multi-beam satellite communication system over the time period T, and a multi-objective optimization model that considers both the throughput of the communication system and the fairness among users is built;
the complete multi-objective optimization model is as follows:
P1: max C_total
P2: max F
Figure FDA0003830739060000012
where the optimization objective P1 represents maximizing the total throughput of the communication system;
the optimization objective P2 represents maximizing the fairness F of the communication system;
the calculation formula is as follows:
Figure FDA0003830739060000013
Figure FDA0003830739060000014
is the initial service demand, and
Figure FDA0003830739060000015
represents the service satisfaction index of user (m, n), which replaces, in the original formula, the communication rate
Figure FDA0003830739060000016
and the fairness is calculated according to Jain's index formula;
the constraint C1 requires that the throughput of a user in the t-th time slot not exceed the remaining traffic of that time slot
Figure FDA0003830739060000017
the constraint C2 requires that the total power of all users not exceed the maximum transmit power P_total of the satellite;
the constraint C3 requires that the total bandwidth of all users in the same beam not exceed the total bandwidth B_total of the communication system;
finally, the multi-objective optimization model is solved using the deep reinforcement learning network DDPG (deep deterministic policy gradient) to obtain a joint allocation of bandwidth and power resources that satisfies both system throughput and user fairness.
2. The multi-beam satellite resource allocation method based on decision performance evaluation according to claim 1, wherein the co-channel interference I_{m,n,t} is calculated as follows:
Figure FDA0003830739060000021
Figure FDA0003830739060000022
is the angle by which the direction of the received signal deviates from the axis of the receiving antenna; ξ_f represents the set of all co-channel channels with frequency f, where f is the frequency used by user (m, n) in the t-th time slot;
Figure FDA0003830739060000023
represents the channels that cause co-channel interference to the communication link [m, n, t],
Figure FDA0003830739060000024
is the data transmission power of link [m, n, t], and
Figure FDA0003830739060000025
is the channel gain of link [m, n, t].
3. The multi-beam satellite resource allocation method based on decision performance evaluation according to claim 1, wherein the signal-to-interference-plus-noise ratio SINR_{m,n,t} is calculated as follows:
Figure FDA0003830739060000026
where P_{m,n,t} is the transmission power of communication link [m, n, t], N_0 is the power spectral density of Gaussian white noise, B_{m,n,t} is the bandwidth of communication link [m, n, t], and h_{m,n} is the gain of the satellite-to-ground link,
Figure FDA0003830739060000027
G_t(θ) is the transmit antenna gain of link [m, n, t], where θ is the angle of user (m, n) from the antenna axis of the beam in which it is located; PL is the loss and fading of signal power caused by the channel environment;
Figure FDA0003830739060000028
is the receive antenna gain of link [m, n, t].
4. The multi-beam satellite resource allocation method based on decision performance evaluation according to claim 1, wherein the communication rate
Figure FDA0003830739060000029
is calculated as follows:
Figure FDA00038307390600000210
5. The multi-beam satellite resource allocation method based on decision performance evaluation according to claim 1, wherein the deep reinforcement learning network DDPG comprises a resource allocation decision network and a decision performance evaluation network;
the process of allocating resources with DDPG is modeled as a Markov process;
the network state vector of the t-th time slot in the decision network is defined as s_t = {H_t, I_t, C_t, D_t}, where H_t represents the channel gain of each user; I_t is the co-channel interference strength experienced by the user; C_t represents the throughput of the user; and D_t represents the traffic of the user;
the actions in the decision network are defined to include the bandwidth and power resources allocated to each beam; for each time slot t ∈ {1, 2, 3, ..., T}, the action is a_t = {P_t, B_t}, where P_t denotes the data transmission power allocated by the system to the user and B_t denotes the data transmission bandwidth allocated by the system to the user;
the performance evaluation network calculates the current Q value from the current resource allocation strategy output by the decision network, and updates the network parameters by computing a loss function;
the total reward value of the evaluation network is designed as: r = ω_1 R_1 + ω_2 R_2 + ω_3 R_3
where ω_1 to ω_3 are the weights corresponding to the different rewards; R_1 is the reward for the total system throughput: R_1 = C_total; R_2 is the reward for the system fairness coefficient: R_2 = F; and R_3 is the auxiliary reward:
Figure FDA0003830739060000031
where x_{mn} is the indicator of whether the resource allocation of the user is reasonable,
Figure FDA0003830739060000032
u_{mn} is the indicator of whether the throughput of the user in the t-th time slot exceeds the traffic volume,
Figure FDA0003830739060000033
z_t is the indicator of whether the power allocated to all users exceeds the total power,
Figure FDA0003830739060000034
an unreasonable resource allocation is when the agent allocates bandwidth resources to the user but does not allocate power to the user, or when the agent allocates power to the user but does not allocate bandwidth to the user.
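For reference, Jain's fairness index named in claim 1 has the standard form F = (Σ x_i)^2 / (K Σ x_i^2) over K users. The Python sketch below applies it to per-user service satisfaction indices. How each satisfaction index is computed is given by a formula image that is not reproduced in this text, so the function simply takes the indices as input; the function name, the symbol K, and the all-zero convention are assumptions made for illustration.

```python
from typing import Sequence


def jain_fairness(satisfaction: Sequence[float]) -> float:
    """Jain's index: (sum x)^2 / (K * sum x^2), in [1/K, 1] for non-negative x."""
    k = len(satisfaction)
    if k == 0:
        raise ValueError("at least one user is required")
    square_sum = sum(x * x for x in satisfaction)
    if square_sum == 0:
        # Assumed convention: if no user is served at all, report zero fairness.
        return 0.0
    total = sum(satisfaction)
    return (total * total) / (k * square_sum)


if __name__ == "__main__":
    print(jain_fairness([1.0, 1.0, 1.0, 1.0]))  # all users equally satisfied
    print(jain_fairness([1.0, 0.0, 0.0, 0.0]))  # one user takes everything
```

The index equals 1 when all users are equally satisfied and falls toward 1/K as service concentrates on a single user.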
CN202210033327.3A 2022-01-12 2022-01-12 Multi-beam satellite resource allocation method based on decision performance evaluation Active CN114389678B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210033327.3A CN114389678B (en) 2022-01-12 2022-01-12 Multi-beam satellite resource allocation method based on decision performance evaluation

Publications (2)

Publication Number Publication Date
CN114389678A CN114389678A (en) 2022-04-22
CN114389678B true CN114389678B (en) 2022-11-01

Family

ID=81201312

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210033327.3A Active CN114389678B (en) 2022-01-12 2022-01-12 Multi-beam satellite resource allocation method based on decision performance evaluation

Country Status (1)

Country Link
CN (1) CN114389678B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115189721B (en) * 2022-04-29 2023-12-19 中国人民解放军国防科技大学 Multi-beam satellite bandwidth power meter joint optimization allocation method and application
CN115001611B (en) * 2022-05-18 2023-09-26 西安交通大学 Resource allocation method of beam hopping satellite spectrum sharing system based on reinforcement learning
CN114916051A (en) * 2022-05-24 2022-08-16 桂林电子科技大学 LEO satellite power control method based on BP neural network
CN114793126B (en) * 2022-05-24 2023-06-23 北京航空航天大学 Multi-beam low-orbit satellite user grouping and resource allocation method
CN115441939B (en) * 2022-09-20 2024-03-22 深圳泓越信息科技有限公司 MADDPG algorithm-based multi-beam satellite communication system resource allocation method
CN116156631B (en) * 2023-01-09 2023-08-22 中国人民解放军军事科学院系统工程研究院 Self-adaptive distribution method for satellite communication multi-beam interference power
CN117528784B (en) * 2023-11-09 2024-05-24 中国人民解放军军事科学院系统工程研究院 Multi-domain cross-layer cooperative control method and device for multi-beam satellite communication network
CN117240797B (en) * 2023-11-15 2024-01-30 华北电力大学 Combined resource allocation method, device, equipment and medium for electric power hybrid service
CN117255334B (en) * 2023-11-17 2024-01-26 国网浙江省电力有限公司信息通信分公司 Multistage cooperative scheduling method and system for emergency satellite communication
CN117639903B (en) * 2024-01-23 2024-05-07 南京控维通信科技有限公司 Multi-user satellite communication method and system based on NOMA assistance
CN117833997B (en) * 2024-03-01 2024-05-31 南京控维通信科技有限公司 Multidimensional resource allocation method of NOMA multi-beam satellite communication system based on reinforcement learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104394596A (en) * 2014-12-22 2015-03-04 哈尔滨工业大学 Upstream resource distribution method capable of considering both throughput capacity and fairness in TD-LTE-Advanced (Time Division-Long Term Evolution-Advanced) relay system
CN106507366A (en) * 2016-11-25 2017-03-15 中国航空无线电电子研究所 The repeater satellite space-time frequency domain resource dynamic dispatching method of facing multiple users
CN107835528A (en) * 2017-10-25 2018-03-23 哈尔滨工业大学 The resource allocation methods avoided in the ground integrated network of star based on interference
CN113644964A (en) * 2021-08-06 2021-11-12 北京邮电大学 Multi-dimensional resource joint allocation method of multi-beam satellite same-frequency networking system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130104337A (en) * 2012-03-13 2013-09-25 한국전자통신연구원 Apparatus and method for allocating resource in multi-beam satellite communication

Also Published As

Publication number Publication date
CN114389678A (en) 2022-04-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant