CN113163479A - Cellular Internet of things uplink resource allocation method and electronic equipment - Google Patents


Info

Publication number
CN113163479A
Authority
CN
China
Prior art keywords
node
agent
strategy
representing
channel
Prior art date
Legal status
Pending
Application number
CN202110164357.3A
Other languages
Chinese (zh)
Inventor
孙德栋
欧清海
张宁池
姚贤炯
王艳茹
刘椿枫
李温静
丰雷
刘卉
马文洁
张洁
陈毅龙
郭丹丹
佘蕊
杨志祥
王志强
贺军
Current Assignee
State Grid Corp of China SGCC
State Grid Information and Telecommunication Co Ltd
Beijing University of Posts and Telecommunications
State Grid Shanghai Electric Power Co Ltd
State Grid Shaanxi Electric Power Co Ltd
Beijing Zhongdian Feihua Communication Co Ltd
Priority date
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, State Grid Information and Telecommunication Co Ltd, Beijing University of Posts and Telecommunications, State Grid Shanghai Electric Power Co Ltd, State Grid Shaanxi Electric Power Co Ltd, Beijing Zhongdian Feihua Communication Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN202110164357.3A
Publication of CN113163479A

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W 72/00: Local resource management
    • H04W 72/04: Wireless resource allocation
    • H04W 72/044: Wireless resource allocation based on the type of the allocated resource
    • H04W 72/0473: Wireless resource allocation based on the type of the allocated resource, the resource being transmission power
    • H04W 52/00: Power management, e.g. TPC [Transmission Power Control], power saving or power classes
    • H04W 52/04: TPC
    • H04W 52/06: TPC algorithms
    • H04W 52/14: Separate analysis of uplink or downlink
    • H04W 52/146: Uplink power control
    • H04W 72/50: Allocation or scheduling criteria for wireless resources
    • H04W 72/53: Allocation or scheduling criteria for wireless resources based on regulatory allocation policies
    • H04W 72/56: Allocation or scheduling criteria for wireless resources based on priority criteria

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

One or more embodiments of the present specification provide a cellular Internet of Things uplink resource allocation method and an electronic device. In the method, each edge node and each direct transmission node of the cellular Internet of Things is treated as an agent. According to the current system state, each agent selects an action a_i from its action space A_i using an exploration-exploitation strategy and executes it; a reward value is then calculated for each agent from the executed action a_i through a reward function; the agent's Q function in the current system state is determined from its Q function, and the agent moves from the current system state into the next system state; based on the agent's estimation strategy and average estimation strategy, the average estimation strategy and estimation strategy when the agent performs action a_i are determined; once a preset number of iterations is reached, uplink resources of the cellular Internet of Things are allocated according to the agent's optimal estimation strategy. The method provided by the disclosure can achieve effective allocation of cellular Internet of Things uplink resources.

Description

Cellular Internet of things uplink resource allocation method and electronic equipment
Technical Field
One or more embodiments of the present disclosure relate to the field of wireless communication technologies, and in particular, to an uplink resource allocation method for a cellular internet of things and an electronic device.
Background
As one of the three major application scenarios of 5G, massive machine-type communication (mMTC) is intended to provide connectivity for large-scale Internet of Things (IoT) deployments. mMTC supports more than 1 million connected devices with various QoS requirements per square kilometer, bringing opportunities for the interconnection of everything while posing new challenges for spectrum utilization, transmission delay, data throughput, and other aspects. Non-orthogonal multiple access (NOMA) is considered a key technology that can effectively address these challenges. Compared with traditional orthogonal multiple access, NOMA allocates limited resources non-orthogonally among devices in the power domain and code domain, improving spectrum efficiency, reducing access delay and signaling overhead, and offering clear advantages when supporting massive connectivity. The basic idea of NOMA is to use non-orthogonal transmission at the transmitting end, actively introducing interference information, and to demodulate at the receiving end via the successive interference cancellation (SIC) technique. SIC can markedly improve spectrum efficiency and effectively enhance the network capacity of both uplink and downlink. In view of these unique advantages, NOMA has been incorporated by 3GPP into the technical part of the 5G mMTC standard, and resource management in NOMA is a hot research issue in the field of wireless communication.
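The power-domain NOMA and SIC principle described above can be sketched in a minimal, self-contained example (not part of the patent; the BPSK mapping, the power split, and the noise level are our own illustrative assumptions): the receiver first decodes the higher-power user's signal treating the other as noise, subtracts its reconstructed contribution, and then decodes the lower-power user from the residual.

```python
import random

random.seed(0)
n = 100
bits_strong = [random.randint(0, 1) for _ in range(n)]  # higher received power user
bits_weak = [random.randint(0, 1) for _ in range(n)]    # lower received power user
p_strong, p_weak = 1.0, 0.25                            # assumed power allocation

def sym(b):
    # BPSK mapping {0, 1} -> {-1.0, +1.0}
    return 2.0 * b - 1.0

# Superposed (non-orthogonal) uplink signal at the receiver, plus mild AWGN.
rx = [(p_strong ** 0.5) * sym(bs) + (p_weak ** 0.5) * sym(bw)
      + random.gauss(0.0, 0.05)
      for bs, bw in zip(bits_strong, bits_weak)]

# SIC step 1: decode the stronger user, treating the weaker one as noise.
dec_strong = [1 if y > 0 else 0 for y in rx]
# SIC step 2: subtract the reconstructed strong signal, decode the weaker user.
residual = [y - (p_strong ** 0.5) * sym(d) for y, d in zip(rx, dec_strong)]
dec_weak = [1 if r > 0 else 0 for r in residual]
```

With this power gap and noise level both users decode correctly; shrinking the gap between p_strong and p_weak is what makes SIC ordering and power control matter.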
At present, because the performance of Internet of Things devices in large-scale cellular IoT application scenarios is generally poor, they cannot carry out the successive interference cancellation (SIC) technique required in NOMA transmission, and the relay nodes used for forwarding cannot communicate effectively with the base station; meanwhile, NOMA spectrum resource sharing produces complicated interference situations. As a result, uplink resources of the cellular Internet of Things cannot be allocated effectively.
Disclosure of Invention
In view of this, one or more embodiments of the present disclosure provide a cellular internet of things uplink resource allocation method and an electronic device, so as to solve the problem that effective resource allocation cannot be performed on cellular internet of things uplink resources.
In view of the foregoing, one or more embodiments of the present specification provide a method for allocating uplink resources of a cellular internet of things, including:
taking each edge node and each direct transmission node of the cellular Internet of Things as an agent, and performing the following operations for each agent until a preset number of iterations is reached:
the agent selects an action a_i from the action space A_i using an exploration-exploitation strategy according to its current system state, and performs the action a_i;
according to the performed action a_i, calculating a reward value for each of the agents through a reward function; and
determining the Q function of the agent in the current system state according to the agent's Q function, and moving the agent from the current system state into the next system state;
determining, based on the estimation strategy and the average estimation strategy of the agent, the average estimation strategy and the estimation strategy when the agent performs the action a_i; and
in response to determining that the estimation strategy value when the agent performs the action a_i is greater than the average estimation strategy value, adjusting the current estimation strategy using the learning rate δ_w, otherwise adjusting the current estimation strategy using the learning rate δ_l, where δ_l > δ_w;
when the above operations performed by the agent reach the preset number of iterations, the optimal estimation strategy is obtained;
and according to the optimal estimation strategy, allocating the uplink resources of the cellular Internet of Things.
Further, before taking each edge node and each direct transmission node of the cellular Internet of Things as an agent and performing the above operations until the preset number of iterations is reached, the method further comprises:
recording the initial Q function value of the agent as 0, and determining a counter X_i(S) for recording the number of occurrences of the system state S, an initial estimation strategy π(S, a_i) of the agent, and an average estimation strategy π̄(S, a_i), wherein the initial estimation strategy and the initial average estimation strategy are uniform over the actions available in state S:

π(S, a_i) = π̄(S, a_i) = 1 / |A_i(S)|
Further, the system state S is composed of the state s_w of the direct transmission node and the state s_n of the edge node, where S = {s_w, s_n, w ∈ W, n ∈ N};
specifically, the state s_w of the direct transmission node comprises the channel allocation coefficient λ_{w,c} of the direct transmission node, and the state s_n of the edge node comprises the channel allocation coefficient η_{n,r,c} of the edge node n and the transmission power control coefficient θ_n, where λ_{w,c} ∈ {0, 1}, s_w = {λ_{w,c}, w ∈ W, c ∈ C}, η_{n,r,c} ∈ {0, 1}, θ_n ∈ {0.0, 0.2, 0.4, 0.6, 0.8, 1.0}, and s_n = {η_{n,r,c}, θ_n, n ∈ N, r ∈ R, c ∈ C}.
Further, the reward function is denoted rew(S, a_i), and the received signal-to-noise ratio of the agent's transmitted signal at the base station is used as its reward. If the agent is an edge node n, the reward function rew(S, a_i) is calculated as:

rew(S, a_i) = τ_n

If the agent is a direct transmission node w, the reward function rew(S, a_i) is calculated as:

rew(S, a_i) = τ_w
further, the method for determining the Q function in the current system state of the agent comprises:
recording said Q function as Qi(S,ai),
Figure BDA0002937115550000033
Wherein, deltaqRepresents the Q function learning rate, beta represents the jackpot discount coefficient,
Figure BDA0002937115550000034
respectively the next arriving system state and the action performed.
Further, the exploration-exploitation strategy is specifically the greedy strategy ε-greedy, calculated as follows:
the probability that agent i selects action a_i given system state S is denoted p(a_i | S) and is calculated as:

p(a_i | S) = 1 − ε + ε / |A_i(S)|   if a_i = argmax_a Q_i(S, a)
p(a_i | S) = ε / |A_i(S)|            otherwise

where ε represents the exploration probability, 0 < ε < 1, Q_i(S, a_i) denotes the Q function, and |A_i(S)| represents the number of actions that agent i can perform in system state S.
Further, the average estimation strategy when the agent performs action a_i is updated as:

π̄_i(S, a_i) ← π̄_i(S, a_i) + (1 / X_i(S))·( π_i(S, a_i) − π̄_i(S, a_i) )

and the estimation strategy when the agent performs action a_i is updated as:

π_i(S, a_i) ← π_i(S, a_i) + Δ_{S,a_i}

where Δ_{S,a_i} represents the update step of the estimation strategy, calculated as:

Δ_{S,a_i} = −δ_{S,a_i}               if a_i ≠ argmax_a Q_i(S, a)
Δ_{S,a_i} = Σ_{a' ≠ a_i} δ_{S,a'}    otherwise

with δ_{S,a_i} = min( π_i(S, a_i), δ / (|A_i(S)| − 1) ), where δ is the learning rate, taking a different value depending on the following two cases:

δ = δ_w   if Σ_a π_i(S, a)·Q_i(S, a) > Σ_a π̄_i(S, a)·Q_i(S, a)
δ = δ_l   otherwise
further, the method also comprises the following steps: determining a signal transmission model for communication among the edge node, the direct transfer node, the relay node and the base station based on a non-orthogonal multiple access (NOMA) technology and an Open Mobile Alliance (OMA) technology, wherein the signal transmission model specifically comprises:
determining N edge nodes, R relay nodes, W direct transmission nodes, and C channels under the base station, where N is {1,2,3, …, N }, R is {1,2,3, …, R }, W is {1,2,3, …, W }, and C is {1,2,3, …, C };
the relay node receives a signal sent by the edge node through NOMA technology to obtain a first signal yrThe first signal yrThe algorithm is as follows:
Figure BDA0002937115550000046
wherein Hn,rRepresenting the channel gain, θ, of the edge node n to the relay node rnRepresenting the transmission power control coefficient, P, of the edge node nnRepresenting the maximum transmit power, S, of the edge node nnRepresenting signals from edge nodes n, etan,r,cDenotes a channel allocation coefficient, ξ denotes an additive white Gaussian noise signal, and
Figure BDA0002937115550000047
σ2representing additive white Gaussian noise power, wherein N belongs to N, and R belongs to R;
further, Hn,rThe algorithm is as follows:
Figure BDA0002937115550000051
wherein the content of the first and second substances,
Figure BDA0002937115550000052
representing small-scale fading of the channel to the relay node r of the edge node n and satisfying a gaussian distribution
Figure BDA0002937115550000053
dn,rDenotes the distance from the edge node n to the relay node r, λ is the path loss exponent;
the base station receives, through the OMA technology, the first signal forwarded by the relay node and, through the NOMA technology, the signals sent by the direct transmission nodes, and obtains a second signal y_BS after decoding by the successive interference cancellation (SIC) technique, calculated as:

y_BS = Σ_{w∈W} λ_{w,c}·√(P_w)·H_{w,BS}·S_w + Σ_{r∈R} μ_r·H_{r,BS}·y_r + ξ

wherein H_{w,BS} represents the channel gain from the direct transmission node w to the base station, H_{r,BS} represents the channel gain from the relay node r to the base station, P_w represents the transmission power of the direct transmission node, S_w represents the signal from the direct transmission node, λ_{w,c} denotes the channel allocation coefficient, and μ_r is the relay gain factor;
H_{w,BS} is calculated as:

H_{w,BS} = g_{w,BS}·d_{w,BS}^(−λ/2)

where g_{w,BS} represents the small-scale fading of the channel from the direct transmission node w to the base station and satisfies the Gaussian distribution g_{w,BS} ~ CN(0, 1), and d_{w,BS} represents the distance from the direct transmission node w to the base station;
H_{r,BS} is calculated as:

H_{r,BS} = g_{r,BS}·d_{r,BS}^(−λ/2)

where g_{r,BS} represents the small-scale fading of the channel from the relay node r to the base station and satisfies the Gaussian distribution g_{r,BS} ~ CN(0, 1), and d_{r,BS} represents the distance from the relay node r to the base station;
based on Shannon's theorem, calculating the receiving rate R of the base station for receiving the second signalsumThe receiving rate RsumThe algorithm is as follows:
Figure BDA00029371155500000511
where B denotes the channel bandwidth, τnIndicating that the signal sent by the edge node n is amplified and forwarded by the relay node r on the channel c, and the received signal-to-noise ratio, tau, at the base stationwThe signal transmitted by the direct transmission node w reaches the base station through the channel c, and the receiving signal-to-noise ratio at the base station is represented;
in particular, taunThe calculation method comprises the following steps:
Figure BDA00029371155500000512
wherein Hi,rDenotes the channel gain, θ, from edge node i to relay node riRepresenting the transmission power control coefficient, P, of the edge node iiRepresenting the maximum transmit power, σ, of the edge node i2For additive white Gaussian noise power, i belongs to N, i is not equal to N, and thetaiPinPn
τwThe calculation method comprises the following steps:
Figure BDA0002937115550000061
further, the method also includes, after:
limiting the transmission power of the edge node multiplexing the same channel specifically includes:
when etan,r,cWhen 1, satisfy
Figure BDA0002937115550000062
Wherein, PtotnFor the threshold value of the transmission power, i ≠ n, αiPinPn
Determining that each transmission link meets the QoS requirement of the system, and specifically meeting the following conditions:
τnw≥τo,
Figure BDA0002937115550000063
wherein, tauoA minimum value representing a received signal-to-noise ratio;
limiting each edge node, the direct transfer node and the relay node to only allocate one channel, and specifically satisfying the following conditions:
Figure BDA0002937115550000064
limiting the number of the edge nodes accessed by each channel, and specifically meeting the following conditions:
Figure BDA0002937115550000065
r∈R
wherein q ismaxRepresenting the maximum number of edge nodes allowed to access per channel.
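The channel-allocation constraints above amount to a feasibility check on the 0/1 coefficients η_{n,r,c} and λ_{w,c}. A minimal sketch follows (the helper name and container layout are ours, not from the patent; the power and QoS constraints are omitted since they also require channel gains):

```python
def feasible(eta, lam, q_max):
    """Check the channel-allocation constraints: each edge node uses at most
    one (relay, channel) pair, each direct node at most one channel, and at
    most q_max edge nodes share any (relay, channel). eta[n][r][c] and
    lam[w][c] are 0/1 allocation coefficients."""
    N = len(eta)
    R = len(eta[0]) if N else 0
    C = len(eta[0][0]) if R else 0
    # each edge node is allocated at most one (relay, channel) pair
    for n in range(N):
        if sum(eta[n][r][c] for r in range(R) for c in range(C)) > 1:
            return False
    # each direct node is allocated at most one channel
    for row in lam:
        if sum(row) > 1:
            return False
    # per-channel NOMA access limit q_max at each relay
    for r in range(R):
        for c in range(C):
            if sum(eta[n][r][c] for n in range(N)) > q_max:
                return False
    return True
```

For example, two edge nodes sharing relay 0 on channel 0 are feasible when q_max = 2 but not when q_max = 1.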
Based on the same inventive concept, one or more embodiments of the present specification further provide an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method as described in any one of the above items when executing the program.
As can be seen from the above description, in one or more embodiments of the present disclosure, each edge node and each direct transmission node is treated as an agent, and each agent performs its own action according to the state of the whole system. When the reward obtained by an agent is worse than expected, its learning rate is quickly adjusted to adapt to the policy changes of the other agents; when the reward obtained is better than expected, the agent learns cautiously, giving the other agents time to adapt to its policy changes. Finally, each agent converges to its optimal estimation strategy, and resources are allocated to each edge node and each direct transmission node based on those optimal estimation strategies.
Drawings
In order to more clearly illustrate one or more embodiments or prior art solutions of the present specification, the drawings that are needed in the description of the embodiments or prior art will be briefly described below, and it is obvious that the drawings in the following description are only one or more embodiments of the present specification, and that other drawings may be obtained by those skilled in the art without inventive effort from these drawings.
Fig. 1 is a flowchart of a method for allocating uplink resources in a cellular internet of things according to one or more embodiments of the present disclosure;
FIG. 2 is a flow diagram of determining a signal transmission model in accordance with one or more embodiments of the present disclosure;
FIG. 3 is a flow diagram of optimizing a signal transmission model in accordance with one or more embodiments of the present disclosure;
fig. 4 is a schematic structural diagram of a cellular internet of things uplink resource allocation device according to one or more embodiments of the present disclosure;
fig. 5 is a schematic structural diagram of an electronic device according to one or more embodiments of the present disclosure.
Detailed Description
For the purpose of promoting a better understanding of the objects, aspects and advantages of the present disclosure, reference is made to the following detailed description taken in conjunction with the accompanying drawings.
It is to be noted that unless otherwise defined, technical or scientific terms used in one or more embodiments of the present specification should have the ordinary meaning as understood by those of ordinary skill in the art to which this disclosure belongs. The use of "first," "second," and similar terms in one or more embodiments of the specification is not intended to indicate any order, quantity, or importance, but rather is used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items.
As described in the background section, existing NOMA-based cellular Internet of Things application scenarios cannot allocate uplink resources efficiently. In the process of implementing the present disclosure, the applicant found that, due to the poor performance of Internet of Things devices, the relay nodes used for forwarding cannot communicate effectively with the base station; meanwhile, NOMA spectrum resource sharing produces complicated interference situations, so that uplink resources ultimately cannot be allocated effectively.
Hereinafter, the technical means of the present disclosure will be described in further detail with reference to specific examples.
In the multi-agent reinforcement learning algorithm WoLF-PHC (Win or Learn Fast - Policy Hill Climbing), "WoLF" means that parameters need to be adjusted only slowly when the agent's behavior is better than expected, and quickly when the agent's behavior is worse than expected. PHC is a learning algorithm for a single agent in a stationary environment; through reinforcement learning, it increases the selection probability of the action that maximizes the cumulative expected reward, and finally converges to the optimal strategy.
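The "win or learn fast" rule can be sketched as a small helper that picks between the two learning rates by comparing the expected Q-value under the estimation strategy with that under the average strategy (an illustrative sketch; the function name, dict layout, and default rates are our own assumptions):

```python
def wolf_delta(pi, pi_bar, q, delta_w=0.05, delta_l=0.2):
    """Win-or-Learn-Fast rule: if the current estimation strategy pi earns a
    higher expected Q-value than the average strategy pi_bar (the agent is
    'winning'), learn cautiously with delta_w; otherwise learn fast with
    delta_l (delta_l > delta_w). pi, pi_bar and q are dicts keyed by action."""
    winning = sum(pi[a] * q[a] for a in q) > sum(pi_bar[a] * q[a] for a in q)
    return delta_w if winning else delta_l
```

A strategy already concentrated on the high-Q action is "winning" and gets the slow rate; one concentrated on the low-Q action gets the fast rate.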
Under the same base station of a cell, an edge node represents an edge terminal node device, a direct transmission node represents a direct-transmission terminal node device, and a relay node represents a relay forwarding node device. The relay nodes and direct transmission nodes have good channel conditions and can communicate directly with the base station, while edge nodes in the cell with poor channel conditions cannot communicate directly with the base station and must communicate with it through a relay node in amplify-and-forward mode.
Referring to fig. 1, an uplink resource allocation method for a cellular internet of things according to an embodiment of the present specification includes the following steps:
step S101: taking each edge node and each direct transmission node of the cellular Internet of things as an agent, and executing the following operations of step S102-step S104 for each agent until the preset iteration number is reached.
Before this step, the initial Q function value of the agent is recorded as 0, and a counter X_i(S) for recording the number of occurrences of the system state S, an initial estimation strategy π(S, a_i) of the agent, and an average estimation strategy π̄(S, a_i) are determined, where the initial estimation strategy and the initial average estimation strategy are uniform:

π(S, a_i) = π̄(S, a_i) = 1 / |A_i(S)|
The estimation strategy represents the probability of selecting each action in a given system state, and the average estimation strategy is the benchmark against which the estimation strategy is measured, so that the estimation strategy changes toward the optimal estimation strategy.
Here, a_i represents an action performed by the agent from its action space A_i, and the system state S is composed of the state s_w of the direct transmission node and the state s_n of the edge node, expressed as S = {s_w, s_n, w ∈ W, n ∈ N}.
Further, the state s_w of the direct transmission node comprises the channel allocation coefficient λ_{w,c} of the direct transmission node, and the state s_n of the edge node comprises the channel allocation coefficient η_{n,r,c} of the edge node n and the transmission power control coefficient θ_n, where

λ_{w,c} ∈ {0, 1}
s_w = {λ_{w,c}, w ∈ W, c ∈ C}
η_{n,r,c} ∈ {0, 1}
θ_n ∈ {0.0, 0.2, 0.4, 0.6, 0.8, 1.0}
s_n = {η_{n,r,c}, θ_n, n ∈ N, r ∈ R, c ∈ C}
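The discrete per-agent state components above can be enumerated directly; a minimal sketch (the coefficient domains come from the description, while the helper names and container layout are our own):

```python
from itertools import product

# Power control levels theta_n as defined in the description.
THETA = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]

def direct_states(C):
    """All states of one direct transmission node: a one-hot channel choice,
    i.e. the vector (lambda_{w,1}, ..., lambda_{w,C})."""
    return [tuple(1 if i == c else 0 for i in range(C)) for c in range(C)]

def edge_states(R, C):
    """All states of one edge node: a (relay, channel, power level) triple
    standing for its eta_{n,r,c} and theta_n choices."""
    return list(product(range(R), range(C), THETA))
```

With R = 2 relays and C = 3 channels, one edge node has 2 x 3 x 6 = 36 states, which makes the tabular Q function and strategies used below tractable.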
step S102: the agent selects an action space A by adopting an exploration-utilization strategy according to the current system state of the agentiAction a iniAnd executed.
In this step, an action space AiComprises thatThe following actions: adjusting signal transmission channels, adjusting connected relay nodes, and adjusting transmission power control. For example, there is an agent i, action ai∈AiIf the agent i directly transmits the node, lambda needs to be adjustedw,cIf the agent is an edge node, adjusting the channel allocation coefficient etan,r,cAnd a transmission power control coefficient thetanAnd (4) finishing.
The exploration-exploitation strategy is the greedy strategy (ε-greedy), and an action a_i is selected from the action space A_i using the greedy strategy, calculated specifically as follows:
the probability that agent i selects action a_i given system state S is denoted p(a_i | S) and is calculated as:

p(a_i | S) = 1 − ε + ε / |A_i(S)|   if a_i = argmax_a Q_i(S, a)
p(a_i | S) = ε / |A_i(S)|            otherwise

where ε represents the exploration probability, 0 < ε < 1, Q_i(S, a_i) denotes the Q function, and |A_i(S)| represents the number of actions that agent i can perform in system state S.
That is, with probability ε (0 < ε < 1), agent i selects any action of the action space A_i uniformly at random in system state S.
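The ε-greedy selection above can be sketched in a few lines (an illustrative sketch; the function name and the dict-of-Q-values layout are our own assumptions):

```python
import random

def epsilon_greedy(q_values, epsilon=0.1, rng=random):
    """epsilon-greedy selection over a dict {action: Q-value}: with
    probability epsilon explore uniformly at random, otherwise exploit the
    action with the maximum Q-value."""
    if rng.random() < epsilon:
        return rng.choice(list(q_values))
    return max(q_values, key=q_values.get)
```

With epsilon = 0 the choice is purely greedy; with epsilon = 1 it is purely exploratory.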
Step S103: according to the executed action a_i, calculating the reward value of each agent through a reward function; determining the Q function of the agent in the current system state according to the agent's Q function, and moving the agent from the current system state into the next system state.
In this step, after each agent has performed its action, the system calculates the agent's reward value, taking the received signal-to-noise ratio of the transmitted signal at the base station as the agent's reward. Specifically, the reward function is denoted rew(S, a_i). If the agent is an edge node n, the reward function rew(S, a_i) is calculated as:

rew(S, a_i) = τ_n

If the agent is a direct transmission node w, the reward function rew(S, a_i) is calculated as:

rew(S, a_i) = τ_w
it will be appreciated that the greater the received signal-to-noise ratio value, the greater the reward value received by the agent. Each agent only needs to observe the state at the current moment without observing the action executed by other agents and the acquired reward value, and takes corresponding action to generate corresponding influence on the system, so that the system enters a new system state at the next moment.
The agent then updates the Q function Q_i(S, a_i), calculated specifically as:

Q_i(S, a_i) ← (1 − δ_q)·Q_i(S, a_i) + δ_q·[ rew(S, a_i) + β·max_{a_i'} Q_i(S', a_i') ]

where δ_q represents the Q function learning rate, β represents the cumulative reward discount coefficient, and S' and a_i' are respectively the system state reached and the action performed at the next moment.
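One tabular step of this update can be sketched as follows (an illustrative sketch of the standard Q-learning rule the description uses; the function name, dict keying, and default rates are our own assumptions):

```python
def q_update(q, state, action, reward, next_state, delta_q=0.1, beta=0.9):
    """One tabular Q-learning step matching the update rule above; q is a
    dict keyed by (state, action), with missing entries defaulting to 0."""
    # max over the next state's known Q-values (0 if none recorded yet)
    next_best = max((v for (s, _a), v in q.items() if s == next_state),
                    default=0.0)
    old = q.get((state, action), 0.0)
    q[(state, action)] = (1 - delta_q) * old + delta_q * (reward + beta * next_best)
    return q[(state, action)]
```

Starting from an empty table, a reward of 1.0 yields delta_q * 1.0 = 0.1; once the next state has a known value, the discounted bootstrap term beta * max Q(S', .) is folded in.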
Step S104: determining, based on the agent's estimation strategy and average estimation strategy, the average estimation strategy and the estimation strategy when the agent performs action a_i; and, in response to determining that the estimation strategy value when the agent performs action a_i is greater than the average estimation strategy value, adjusting the current estimation strategy using the learning rate δ_w, otherwise adjusting the current estimation strategy using the learning rate δ_l, where δ_l > δ_w.
In this step, the average estimation strategy for the currently executed action a_i is updated as:

π̄_i(S, a_i) ← π̄_i(S, a_i) + (1 / X_i(S))·( π_i(S, a_i) − π̄_i(S, a_i) )

Further, the estimation strategy for the currently executed action a_i is updated as:

π_i(S, a_i) ← π_i(S, a_i) + Δ_{S,a_i}

where Δ_{S,a_i} represents the update step of the estimation strategy, calculated as:

Δ_{S,a_i} = −δ_{S,a_i}               if a_i ≠ argmax_a Q_i(S, a)
Δ_{S,a_i} = Σ_{a' ≠ a_i} δ_{S,a'}    otherwise

with δ_{S,a_i} = min( π_i(S, a_i), δ / (|A_i(S)| − 1) ), where δ is the learning rate, taking a different value depending on the following two cases:

δ = δ_w   if Σ_a π_i(S, a)·Q_i(S, a) > Σ_a π̄_i(S, a)·Q_i(S, a)
δ = δ_l   otherwise
Specifically, the estimation strategy π_i(S, a_i) of agent i is compared with its average estimation strategy π̄_i(S, a_i): if Σ_a π_i(S, a)·Q_i(S, a) > Σ_a π̄_i(S, a)·Q_i(S, a) is satisfied, the estimation strategy π_i(S, a_i) is considered better; otherwise, the average estimation strategy π̄_i(S, a_i) is considered better. If the current action a_i is not the action that maximizes the Q function value, Δ_{S,a_i} is negative, and positive otherwise, thereby increasing the selection probability of the action that maximizes the Q function value.
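One WoLF-PHC strategy update for a single state can be sketched as follows (an illustrative sketch of the hill-climbing step described above; the function name and dict layout are our own, and δ is assumed to have been chosen already via the win/lose comparison):

```python
def phc_update(pi, pi_bar, q, count, delta):
    """One WoLF-PHC step for one state: move the average strategy pi_bar
    toward the estimation strategy pi by 1/count (count = visits X_i(S)),
    then shift up to delta/(|A|-1) probability mass from each non-greedy
    action to the action maximising Q. All dicts are keyed by action."""
    actions = list(pi)
    for a in actions:
        pi_bar[a] += (pi[a] - pi_bar[a]) / count
    best = max(actions, key=lambda a: q[a])
    moved = 0.0
    for a in actions:
        if a != best:
            step = min(pi[a], delta / (len(actions) - 1))  # delta_{S,a}
            pi[a] -= step
            moved += step
    pi[best] += moved  # Delta for the greedy action is the sum of the others
    return pi, pi_bar
```

Note that probability mass is conserved: what the non-greedy actions lose, the greedy action gains, so π remains a valid distribution.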
Step S105: when the number of occurrences of the system state reaches the preset number of iterations, the optimal estimation strategy of the agent is obtained, and the uplink resources of the cellular Internet of Things are allocated according to the optimal estimation strategy.
In summary, when the estimation strategy is better, the learning rate of the estimation strategy update becomes slower; when the average estimation strategy is better, the learning rate of the estimation strategy update becomes faster. That is, when the agent's behavior is better than expected, parameters are adjusted slowly via δ_w; when the agent's behavior is worse than expected, parameters are adjusted quickly via δ_l.
Therefore, the method provided by this embodiment is an online reinforcement learning uplink resource allocation scheme. Considering the complex interference caused by NOMA spectrum resource sharing, in real, complex cellular Internet of Things communication, computational complexity grows rapidly as the number of terminal devices increases. The multi-agent reinforcement learning model, however, enables the system to converge to a stable resource allocation scheme within the specified number of iterations. Therefore, the present disclosure can achieve effective allocation of cellular Internet of Things uplink resources.
It is to be appreciated that the method can be performed by any apparatus, device, platform, cluster of devices having computing and processing capabilities.
As an optional embodiment, before step S101, the method further includes: determining a signal transmission model for communication between the edge nodes, the direct transmission nodes, the relay nodes and the base station based on the NOMA technology and the orthogonal multiple access (OMA) technology.
With reference to fig. 2, the signal transmission system model specifically includes:
step S201: n edge nodes, R relay nodes, W direct transfer nodes and C channels under a base station are determined.
In this step, N ═ 1,2,3, …, N }, R ═ 1,2,3, …, R }, W ═ 1,2,3, …, W }, C ═ 1,2,3, …, C }
Step S202: the relay node receives a signal sent by an edge node through the NOMA technology to obtain a first signal y_r.
In this step, the first signal y_r is calculated as follows:
Figure BDA0002937115550000121
wherein H_{n,r} represents the channel gain from edge node n to relay node r, θ_n represents the transmission power control coefficient of edge node n, P_n represents the maximum transmit power of edge node n, S_n represents the signal from edge node n, η_{n,r,c} denotes the channel allocation coefficient, ξ denotes an additive white Gaussian noise signal, and
Figure BDA0002937115550000122
σ² represents the power of the additive white Gaussian noise signal, n ∈ N, and r ∈ R.
Further, H_{n,r} is calculated as follows:
Figure BDA0002937115550000123
wherein,
Figure BDA0002937115550000124
represents the small-scale fading of the channel from edge node n to relay node r and follows the Gaussian distribution
Figure BDA0002937115550000125
d_{n,r} denotes the distance from edge node n to relay node r, and λ is the path-loss exponent.
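A channel gain of this form can be sketched numerically as below. The exact equation in the patent appears only as an image, so the concrete form H = g / √(d^λ), with Rayleigh small-scale fading g ~ CN(0, 1), is an assumption based on the surrounding definitions:

```python
import math
import random

def channel_gain(distance, path_loss_exponent=3.0, rng=None):
    """Magnitude of the channel gain from an edge node to a relay:
    complex Gaussian small-scale fading attenuated by distance-based
    path loss. The exact formula is an assumption, not the patent's."""
    rng = rng or random.Random(0)
    # Small-scale fading g ~ CN(0, 1): independent real/imag Gaussians
    g = complex(rng.gauss(0, math.sqrt(0.5)), rng.gauss(0, math.sqrt(0.5)))
    return abs(g) / math.sqrt(distance ** path_loss_exponent)
```

With the same fading realization, a more distant link has a smaller gain, which matches the role of d_{n,r} and λ in the model.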
Communication between an edge node and the base station requires two hops. In the first hop, the edge node sends a signal to a relay node; using NOMA, multiple edge nodes can multiplex the same subchannel to transmit to the relay, and edge nodes multiplexing the same subchannel perform NOMA power control during transmission. The relay forwards the signal in amplify-and-forward (AF) mode, and when the signal finally reaches the base station it is demodulated using the SIC technique.
Step S203: the base station receives the first signal forwarded by the relay node through the OMA technology and the signal sent by the direct-transfer node through the NOMA technology to obtain a second signal y_BS.
In this step, the second signal y_BS is calculated as follows:
Figure BDA0002937115550000131
wherein H_{w,BS} represents the channel gain from direct-transfer node w to the base station, H_{r,BS} represents the channel gain from relay node r to the base station, P_w represents the transmission power of the direct-transfer node, S_w represents the signal from the direct-transfer node, λ_{w,c} denotes the channel allocation coefficient, and μ_r is the relay gain factor.
H_{w,BS} is calculated as follows:
Figure BDA0002937115550000132
wherein,
Figure BDA0002937115550000133
represents the small-scale fading of the channel from direct-transfer node w to the base station and follows the Gaussian distribution
Figure BDA0002937115550000134
d_{w,BS} represents the distance from direct-transfer node w to the base station.
H_{r,BS} is calculated as follows:
Figure BDA0002937115550000135
wherein,
Figure BDA0002937115550000136
represents the small-scale fading of the channel from relay node r to the base station and follows the Gaussian distribution
Figure BDA0002937115550000137
d_{r,BS} denotes the distance from relay node r to the base station.
Further, λ_{w,c} = 1 if channel c is assigned to direct-transfer node w for transmitting signals to the base station; otherwise λ_{w,c} = 0.
It can be understood that the second hop refers to the transmission from the relay node to the base station. Considering the performance limits of the relay, the second hop transmits the signal directly in AF mode over OMA. In AF mode the relay node merely receives the signal from the edge node, amplifies it, and forwards it to the base station without performing any decoding; the SIC decoding operation is carried out by the base station.
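The AF operation can be sketched as below. The patent does not give μ_r explicitly, so the common power-normalizing choice μ_r = √(P_relay / E[|y_r|²]) is an assumption used here for illustration:

```python
import math

def af_relay_gain(relay_power, received_signal_power):
    """Amplify-and-forward gain factor: scale the received signal so
    the retransmitted power meets the relay's power budget (a common
    AF normalization; the patent's mu_r may differ)."""
    return math.sqrt(relay_power / received_signal_power)

def af_forward(y_r, relay_power, received_signal_power):
    """Relay output: the received signal amplified, with no decoding;
    SIC decoding is left to the base station."""
    return af_relay_gain(relay_power, received_signal_power) * y_r
```

The key property shown is that the relay applies only a scalar gain to y_r, never re-encoding it.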
Step S204: based on Shannon's theorem, the receiving rate R_sum at which the base station receives the second signal is calculated.
In this step, the receiving rate R_sum is calculated as follows:
Figure BDA0002937115550000138
where B denotes the channel bandwidth; τ_n denotes the received signal-to-noise ratio at the base station of the signal sent by edge node n and amplified and forwarded by relay node r on channel c; τ_w denotes the received signal-to-noise ratio at the base station of the signal transmitted by direct-transfer node w over channel c;
in particular, τ_n is calculated as follows:
Figure BDA0002937115550000139
wherein H_{i,r} denotes the channel gain from edge node i to relay node r, θ_i denotes the transmission power control coefficient of edge node i, P_i denotes the maximum transmit power of edge node i, σ² is the additive white Gaussian noise power, i ∈ N, i ≠ n, and θ_i·P_i < θ_n·P_n.
H_{i,r} is calculated in the same way as H_{n,r} above and is not described again here.
τ_w is calculated as follows:
Figure BDA0002937115550000141
as an alternative embodiment, with reference to fig. 3, the following steps may further be included after step S204:
step S301: the transmission power of edge nodes multiplexing the same channel is limited.
The method specifically comprises the following steps:
when η_{n,r,c} = 1,
Figure BDA0002937115550000142
wherein P_totn is the transmission power threshold, i ≠ n, and θ_i·P_i < θ_n·P_n.
That is, the power of edge node n minus the total power of all edge nodes whose power is smaller than that of node n must exceed the transmission power threshold P_totn; the threshold P_totn can be set according to actual conditions.
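This power-gap condition can be checked with a short sketch. The inequality is reconstructed from the prose (the patent's exact formula appears only as an image), and the list-based representation of co-channel powers is an illustrative assumption:

```python
def satisfies_power_gap(powers_on_channel, n_index, p_threshold):
    """Check the NOMA power-control constraint for edge node n: its
    power minus the total power of all weaker co-channel nodes must
    reach p_threshold, so the base station can decode via SIC."""
    p_n = powers_on_channel[n_index]
    weaker_sum = sum(p for i, p in enumerate(powers_on_channel)
                     if i != n_index and p < p_n)
    return p_n - weaker_sum >= p_threshold
```

For example, a node at power 1.0 sharing a channel with nodes at 0.2 and 0.3 leaves a gap of 0.5, which passes a threshold of 0.4 but would fail a threshold of 0.6.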
Step S302: it is determined that each transmission link satisfies system quality of service (QoS) requirements.
In this step, the conditions to be satisfied are as follows:
τ_n, τ_w ≥ τ_o,
Figure BDA0002937115550000143
wherein τ_o represents the minimum value of the received signal-to-noise ratio.
It can be understood that each transmission link satisfies the QoS requirement of the system when the above condition holds; the value of τ_o can be set according to actual conditions and is not particularly limited here.
Step S303: each edge node, direct-transfer node, and relay node is restricted to being allocated only one channel.
In this step, the conditions to be satisfied are as follows:
Figure BDA0002937115550000144
step S304: limiting the number of access edge nodes per channel.
In this step, the conditions to be satisfied are as follows:
Figure BDA0002937115550000145
r∈R
wherein q_max represents the maximum number of edge nodes allowed to access each channel.
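The two allocation constraints of steps S303–S304 — each node holds exactly one channel, and each channel admits at most q_max edge nodes — can be validated with a sketch like this; the dict-based representation of the allocation is an illustrative assumption:

```python
def allocation_is_valid(node_channel, q_max):
    """node_channel maps each edge node to the single channel it uses.
    The mapping itself enforces one channel per node; the check below
    enforces that no channel serves more than q_max edge nodes."""
    counts = {}
    for channel in node_channel.values():
        counts[channel] = counts.get(channel, 0) + 1
    return all(c <= q_max for c in counts.values())
```

An allocation putting three edge nodes on one channel would violate q_max = 2, while spreading them over two channels would not.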
This embodiment performs system optimization for the hybrid transmission system model and ensures that the base station can successfully decode the received signal using the SIC technique.
It should be noted that the method of one or more embodiments of the present disclosure may be performed by a single device, such as a computer or server. The method of the embodiment can also be applied to a distributed scene and completed by the mutual cooperation of a plurality of devices. In such a distributed scenario, one of the devices may perform only one or more steps of the method of one or more embodiments of the present disclosure, and the devices may interact with each other to complete the method.
It should be noted that the above description describes certain embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Based on the same inventive concept, corresponding to any of the above embodiments, one or more embodiments of the present specification further provide a cellular internet of things uplink resource allocation device.
Referring to fig. 4, the uplink resource allocation apparatus for cellular internet of things includes:
the estimation strategy iteration module 401: configured to take each edge node and each direct-transfer node of the cellular Internet of Things as an agent and perform the following operations on each agent until a preset number of iterations is reached: the agent, according to its current system state, selects an action a_i from the action space A_i using an exploration-exploitation strategy and performs the action a_i; according to the performed action a_i, the reward value of each agent is calculated by a reward function; the Q function of the agent in the current system state is determined according to the agent's Q function, and the agent moves from the current system state to the next system state; based on the agent's estimation strategy and average estimation strategy, the average estimation strategy and the estimation strategy when the agent performs action a_i are determined; in response to determining that the estimation strategy value when the agent performs action a_i is greater than the average estimation strategy value, the current estimation strategy is adjusted with learning rate δ_w, and otherwise with learning rate δ_l, where δ_l > δ_w; when the above operations reach the preset number of iterations, the optimal estimation strategy is obtained.
the uplink resource allocation module 402: configured to allocate the uplink resources of the cellular Internet of Things according to the optimal estimation strategy.
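The exploration-exploitation step used by the estimation strategy iteration module (the ε-greedy rule detailed in claim 6) can be sketched as follows; the tabular Q representation as a dict is an illustrative assumption:

```python
import random

def epsilon_greedy(q_values, epsilon=0.1, rng=None):
    """Pick the highest-Q action with probability 1 - epsilon,
    otherwise a uniformly random action.
    q_values: mapping from action to its Q value in the current state."""
    rng = rng or random.Random(0)
    actions = list(q_values)
    if rng.random() < epsilon:
        return rng.choice(actions)          # explore
    return max(actions, key=q_values.get)   # exploit
```

With ε = 0 the selection is purely greedy; larger ε trades exploitation for exploration of the action space.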
For convenience of description, the above devices are described as being divided into various modules by functions, and are described separately. Of course, the functionality of the modules may be implemented in the same one or more software and/or hardware implementations in implementing one or more embodiments of the present description.
The apparatus of the foregoing embodiment is used to implement the corresponding method in the foregoing embodiment, and has the beneficial effects of the corresponding method embodiment, which are not described herein again.
Fig. 5 is a schematic diagram illustrating a more specific hardware structure of an electronic device according to this embodiment, where the electronic device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein the processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 are communicatively coupled to each other within the device via bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present disclosure.
The Memory 1020 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs, and when the technical solution provided by the embodiments of the present specification is implemented by software or firmware, the relevant program codes are stored in the memory 1020 and called to be executed by the processor 1010.
The input/output interface 1030 is used for connecting an input/output module to input and output information. The i/o module may be configured as a component in a device (not shown) or may be external to the device to provide a corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 1040 is used for connecting a communication module (not shown in the drawings) to implement communication interaction between the present apparatus and other apparatuses. The communication module can realize communication in a wired mode (such as USB, network cable and the like) and also can realize communication in a wireless mode (such as mobile network, WIFI, Bluetooth and the like).
Bus 1050 includes a path that transfers information between various components of the device, such as processor 1010, memory 1020, input/output interface 1030, and communication interface 1040.
It should be noted that although the above-mentioned device only shows the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
The electronic device of the foregoing embodiment is used to implement the corresponding method in the foregoing embodiment, and has the beneficial effects of the corresponding method embodiment, which are not described herein again.
Computer-readable media of the present embodiments, including permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is meant to be exemplary only, and is not intended to imply that the scope of the disclosure, including the claims, is limited to these examples; within the spirit of the present disclosure, features from the above embodiments or from different embodiments may also be combined, steps may be implemented in any order, and there are many other variations of different aspects of one or more embodiments of the present description as described above, which are not provided in detail for the sake of brevity.
While the present disclosure has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of these embodiments will be apparent to those of ordinary skill in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the discussed embodiments.
It is intended that the one or more embodiments of the present specification embrace all such alternatives, modifications and variations as fall within the broad scope of the appended claims. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of one or more embodiments of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (10)

1. A cellular Internet of things uplink resource allocation method is characterized by comprising the following steps:
taking each edge node and each direct-transfer node of the cellular Internet of Things as an agent, and performing the following operations on the agent until a preset number of iterations is reached:
the agent, according to its current system state, selects an action a_i from the action space A_i using an exploration-exploitation strategy and performs the action a_i;
according to the performed action a_i, calculating a reward value for each agent by a reward function; and
determining the Q function of the agent in the current system state according to the agent's Q function, and causing the agent to enter the next system state from the current system state;
determining, based on the agent's estimation strategy and average estimation strategy, the average estimation strategy and the estimation strategy when the agent performs the action a_i; and
in response to determining that the estimation strategy value when the agent performs the action a_i is greater than the average estimation strategy value, adjusting the current estimation strategy with learning rate δ_w, and otherwise adjusting the current estimation strategy with learning rate δ_l, where δ_l > δ_w;
when the above operations performed by the agent reach the preset number of iterations, obtaining the optimal estimation strategy;
and allocating the uplink resources of the cellular Internet of Things according to the optimal estimation strategy.
2. The method according to claim 1, wherein before taking each edge node and each direct-transfer node of the cellular Internet of Things as an agent and performing the operations on the agent until the preset number of iterations is reached, the method further comprises:
recording the initial Q function value of the agent as 0, and determining a counter X_i(S) for recording the number of occurrences of the system state S, an initial estimation strategy π(S, a_i) of the agent, and an average estimation strategy
Figure FDA0002937115540000011
Wherein the initial estimation strategy
Figure FDA0002937115540000012
Initial mean estimation strategy
Figure FDA0002937115540000013
3. The method of claim 2, wherein the system state S is composed of the state s_w of the direct-transfer node and the state s_n of the edge node, wherein S = {s_w, s_n, w ∈ W, n ∈ N};
in particular, the state s_w of the direct-transfer node comprises the channel allocation coefficient λ_{w,c} of the direct-transfer node, and the state s_n of the edge node comprises the channel allocation coefficient η_{n,r,c} of the edge node n and the transmission power control coefficient θ_n, wherein λ_{w,c} ∈ {0, 1}, s_w = {λ_{w,c}, w ∈ W, c ∈ C}, η_{n,r,c} ∈ {0, 1}, θ_n ∈ {0.0, 0.2, 0.4, 0.6, 0.8, 1.0}, s_n = {η_{n,r,c}, θ_n, n ∈ N, r ∈ R, c ∈ C}.
4. The method according to claim 3, wherein the reward function is recorded as rew(S, a_i); if the agent is an edge node, the reward function rew(S, a_i) is calculated as follows:
Figure FDA0002937115540000021
if the agent is a direct-transfer node, the reward function rew(S, a_i) is calculated as follows:
Figure FDA0002937115540000022
5. The method of claim 4, wherein the Q function of the agent in the current system state is determined as follows:
recording the Q function as Q_i(S, a_i),
Figure FDA0002937115540000023
wherein δ_q represents the Q-function learning rate, β represents the cumulative reward discount coefficient, and
Figure FDA0002937115540000024
are respectively the next system state reached and the action performed there.
6. The method according to claim 5, wherein the exploration-exploitation strategy is specifically a greedy strategy ε-greedy, which is calculated as follows:
the probability that agent i selects action a_i given system state S is denoted as p(a_i|S), and p(a_i|S) is calculated as follows:
Figure FDA0002937115540000025
wherein ε represents the action selection probability, 0 < ε < 1, Q_i(S, a_i) denotes the Q function, and A_i(S) represents the number of actions that agent i can perform in system state S.
7. The method of claim 6, wherein the average estimation strategy when the agent performs action a_i is determined as follows:
Figure FDA0002937115540000031
the estimation strategy when the agent performs action a_i is determined as follows:
Figure FDA0002937115540000032
wherein,
Figure FDA0002937115540000033
represents the update step size of the estimation strategy, which is calculated as follows:
Figure FDA0002937115540000034
where δ is the learning rate; δ takes different values in the following two cases:
Figure FDA0002937115540000035
8. The method of claim 2, wherein before the method, the method further comprises: determining a signal transmission model for communication between the edge nodes, direct-transfer nodes, relay nodes, and the base station based on the non-orthogonal multiple access (NOMA) technology and the orthogonal multiple access (OMA) technology, wherein the signal transmission model specifically comprises:
determining N edge nodes, R relay nodes, W direct transmission nodes, and C channels under the base station, where N is {1,2,3, …, N }, R is {1,2,3, …, R }, W is {1,2,3, …, W }, and C is {1,2,3, …, C };
the relay node receives a signal sent by the edge node through the NOMA technology to obtain a first signal y_r, wherein the first signal y_r is calculated as follows:
Figure FDA0002937115540000036
wherein H_{n,r} represents the channel gain from edge node n to relay node r, θ_n represents the transmission power control coefficient of edge node n, P_n represents the maximum transmit power of edge node n, S_n represents the signal from edge node n, η_{n,r,c} represents the channel allocation coefficient, ξ represents an additive white Gaussian noise signal, and
Figure FDA0002937115540000041
σ² represents the additive white Gaussian noise power, n ∈ N, r ∈ R;
further, H_{n,r} is calculated as follows:
Figure FDA0002937115540000042
wherein,
Figure FDA0002937115540000043
represents the small-scale fading of the channel from edge node n to relay node r and follows the Gaussian distribution
Figure FDA0002937115540000044
d_{n,r} denotes the distance from edge node n to relay node r, and λ is the path-loss exponent;
the base station receives the first signal forwarded by the relay node through the OMA technology and the signal sent by the direct-transfer node through the NOMA technology, and decodes them through the successive interference cancellation (SIC) technique to obtain a second signal y_BS, wherein the second signal y_BS is calculated as follows:
Figure FDA0002937115540000045
wherein H_{w,BS} represents the channel gain from direct-transfer node w to the base station, H_{r,BS} represents the channel gain from relay node r to the base station, P_w represents the transmission power of the direct-transfer node, S_w represents the signal from the direct-transfer node, λ_{w,c} represents the channel allocation coefficient, and μ_r is the relay gain factor;
H_{w,BS} is calculated as follows:
Figure FDA0002937115540000046
wherein,
Figure FDA0002937115540000047
represents the small-scale fading of the channel from direct-transfer node w to the base station and follows the Gaussian distribution
Figure FDA0002937115540000048
d_{w,BS} represents the distance from direct-transfer node w to the base station;
H_{r,BS} is calculated as follows:
Figure FDA0002937115540000049
wherein,
Figure FDA00029371155400000410
represents the small-scale fading of the channel from relay node r to the base station and follows the Gaussian distribution
Figure FDA00029371155400000411
d_{r,BS} represents the distance from relay node r to the base station;
based on Shannon's theorem, calculating the receiving rate R_sum at which the base station receives the second signal, wherein the receiving rate R_sum is calculated as follows:
Figure FDA00029371155400000412
where B denotes the channel bandwidth; τ_n denotes the received signal-to-noise ratio at the base station of the signal sent by edge node n and amplified and forwarded by relay node r on channel c; τ_w denotes the received signal-to-noise ratio at the base station of the signal transmitted by direct-transfer node w over channel c;
in particular, τ_n is calculated as follows:
Figure FDA0002937115540000051
wherein H_{i,r} denotes the channel gain from edge node i to relay node r, θ_i denotes the transmission power control coefficient of edge node i, P_i denotes the maximum transmit power of edge node i, σ² is the additive white Gaussian noise power, i ∈ N, i ≠ n, and θ_i·P_i < θ_n·P_n;
τ_w is calculated as follows:
Figure FDA0002937115540000052
9. the method of claim 8, further comprising, after the method:
limiting the transmission power of the edge node multiplexing the same channel specifically includes:
when η_{n,r,c} = 1, the following is satisfied:
Figure FDA0002937115540000053
wherein P_totn is the transmission power threshold, i ≠ n, and θ_i·P_i < θ_n·P_n;
Determining that each transmission link meets the QoS requirement of the system, and specifically meeting the following conditions:
Figure FDA0002937115540000054
wherein τ_o represents the minimum value of the received signal-to-noise ratio;
limiting each edge node, the direct transfer node and the relay node to only allocate one channel, and specifically satisfying the following conditions:
Figure FDA0002937115540000055
limiting the number of the edge nodes accessed by each channel, and specifically meeting the following conditions:
Figure FDA0002937115540000056
wherein q_max represents the maximum number of edge nodes allowed to access each channel.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 9 when executing the program.
CN202110164357.3A 2021-02-05 2021-02-05 Cellular Internet of things uplink resource allocation method and electronic equipment Pending CN113163479A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110164357.3A CN113163479A (en) 2021-02-05 2021-02-05 Cellular Internet of things uplink resource allocation method and electronic equipment


Publications (1)

Publication Number Publication Date
CN113163479A true CN113163479A (en) 2021-07-23

Family

ID=76882780

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110164357.3A Pending CN113163479A (en) 2021-02-05 2021-02-05 Cellular Internet of things uplink resource allocation method and electronic equipment

Country Status (1)

Country Link
CN (1) CN113163479A (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190156197A1 (en) * 2017-11-22 2019-05-23 International Business Machines Corporation Method for adaptive exploration to accelerate deep reinforcement learning
CN110418416A (en) * 2019-07-26 2019-11-05 东南大学 Resource allocation methods based on multiple agent intensified learning in mobile edge calculations system
CN111385894A (en) * 2020-03-17 2020-07-07 全球能源互联网研究院有限公司 Transmission mode selection method and device based on online reinforcement learning
CN111695690A (en) * 2020-07-30 2020-09-22 航天欧华信息技术有限公司 Multi-agent confrontation decision-making method based on cooperative reinforcement learning and transfer learning


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114339788A (en) * 2022-01-06 2022-04-12 中山大学 Multi-agent ad hoc network planning method and system
CN114339788B (en) * 2022-01-06 2023-11-17 中山大学 Multi-agent ad hoc network planning method and system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210723