CN111132083A - NOMA-based distributed resource allocation method in vehicle formation mode - Google Patents


Info

Publication number
CN111132083A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911214993.1A
Other languages
Chinese (zh)
Other versions
CN111132083B (en)
Inventor
郭彩丽
许世琳
冯春燕
王兆丰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Application filed by Beijing University of Posts and Telecommunications
Priority to CN201911214993.1A
Publication of CN111132083A
Application granted
Publication of CN111132083B

Classifications

    • H04W4/44 Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P] for communication between vehicles and infrastructures, e.g. vehicle-to-cloud [V2C] or vehicle-to-home [V2H]
    • H04W4/46 Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P] for vehicle-to-vehicle communication [V2V]
    • H04W52/243 TPC [Transmission Power Control] being performed according to specific parameters using SIR [Signal to Interference Ratio] or other wireless path parameters taking into account interferences
    • H04W52/265 TPC being performed according to specific parameters using transmission rate or quality of service QoS [Quality of Service] taking into account the quality of service QoS
    • H04W52/267 TPC being performed according to specific parameters using transmission rate or quality of service QoS [Quality of Service] taking into account the information rate
    • H04W52/38 TPC being performed in particular situations
    • H04W72/0453 Wireless resource allocation based on the type of the allocated resource: resources in frequency domain, e.g. a carrier in FDMA
    • H04W72/0473 Wireless resource allocation based on the type of the allocated resource the resource being transmission power
    • H04W72/541 Allocation or scheduling criteria for wireless resources based on quality criteria using the level of interference
    • H04W72/542 Allocation or scheduling criteria for wireless resources based on quality criteria using measured or perceived quality
    • H04W72/543 Allocation or scheduling criteria for wireless resources based on quality criteria based on requested quality, e.g. QoS

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a NOMA (non-orthogonal multiple access)-based distributed resource allocation method in vehicle formation mode, belonging to the field of wireless communication. The method first decouples the resource allocation problem into two parts, power allocation and sub-channel allocation, and then solves them with, respectively, a power allocation scheme based on the driving state of the formation and a spectrum allocation scheme based on distributed multi-agent reinforcement learning (RL). In the power allocation part, compared with a fixed power allocation scheme, the proposed scheme, which takes the safety distance into account, provides fairer communication performance for vehicle formations on different lanes. In the spectrum allocation part, the proposed scheme exploits the autonomous learning capability of reinforcement learning and achieves faster convergence by using a neighborhood iteration sequence based on formation position in multi-agent Q-learning. While guaranteeing V2I communication, the invention maximizes the total throughput of the V2mV links and improves the communication performance of the system through NOMA-based distributed resource allocation.

Description

NOMA-based distributed resource allocation method in vehicle formation mode
Technical Field
The invention belongs to the field of wireless communication, relates to a non-orthogonal multiple access (NOMA) communication system, and in particular relates to a distributed resource allocation method in vehicle formation mode in the Internet of Vehicles.
Background
With the advent of the autonomous driving era, the way vehicles are driven will change greatly. To reduce driving cost, environmental pollution and traffic accidents, vehicle formation will become one of the important driving modes of that era [1]. Within each formation, vehicles need to share information about surrounding traffic and road conditions, as well as rich entertainment application content. Specifically, resource-rich vehicles in the formation share information with the other vehicles, which maintains a stable driving mode for the whole formation and a high-quality experience for drivers and passengers. This process, however, is vehicle-to-multi-vehicle (V2mV) communication, which conventional vehicle-to-vehicle (V2V) communication cannot provide. Given the increasingly serious shortage of spectrum resources, and to meet the V2mV communication demand within a vehicle formation, the invention introduces NOMA technology into the Internet of Vehicles. NOMA lets users access the same channel non-orthogonally, mainly through power-domain [2] or code-domain [3] multiplexing, and the receiver demodulates the received signal using successive interference cancellation (SIC). NOMA can therefore greatly improve system throughput while reducing dependence on large amounts of spectrum, and can meet the large-scale connectivity requirement of vehicle formation scenarios.
Currently, research on NOMA-based resource allocation in the Internet of Vehicles has only just begun; existing work on NOMA spectrum resources mainly concerns centralized schemes, and distributed schemes are few. Boya Di [4] discussed a spectrum resource allocation algorithm based on matching theory to support NOMA-based vehicle-to-everything (V2X) communication. Yiyi Xu [5] proposed a centralized NOMA-based spectrum allocation scheme by classifying and grouping vehicles in the Internet of Vehicles. Chen [6] studied NOMA-based resource allocation using interference hypergraphs and graph coloring theory. Although centralized schemes have been studied extensively, they suffer from incomplete channel state information (CSI) and delayed communication requests and responses, and therefore cannot meet the high-reliability, low-latency requirements of vehicular communication. A distributed approach to NOMA-based resource allocation is therefore needed.
[1] Huang, D. Chu, C. Wu, and Y. He, IEEE Transactions on Intelligent Transportation Systems, vol. 20, no. 3, pp. 959-974, 2018.
[2] Y. Saito, Y. Kishiyama, A. Benjebbour, T. Nakamura, A. Li, and K. Higuchi, "Wireless access for cellular networks by non-orthogonal multiple access (NOMA)," in 2013 IEEE 77th Vehicular Technology Conference (VTC Spring), pp. 1-5, IEEE, 2013.
[3] L. Dai, B. Wang, Y. Yuan, S. Han, I. Chih-Lin, and Z. Wang, "Non-orthogonal multiple access for 5G: solutions, challenges, opportunities and future research trends," IEEE Communications Magazine, vol. 53, no. 9, pp. 74-81, 2015.
[4] B. Di, L. Song, Y. Li, and G. Y. Li, "V2X communication for high reliability and low latency in 5G systems using non-orthogonal multiple access," IEEE Journal on Selected Areas in Communications, vol. 35, no. 10, pp. 2383-2397, 2017.
[5] Xu and X. Gu, "NOMA-based V2V system resource allocation," in 2018 International Conference on Network Infrastructure and Digital Content (IC-NIDC), pp. 239-243, IEEE, 2018.
[6] Chen, B. Wang, and R. Zhang, "Resource allocation for interference maps in NOMA-based V2X networks," IEEE Internet of Things Journal, vol. 6, no. 1, pp. 161-170, 2018.
Disclosure of Invention
The invention aims to solve the above problems. Following the distributed resource allocation principle, it uses NOMA (non-orthogonal multiple access) technology to reuse the same resources and provides a NOMA-based distributed resource allocation method applied to the vehicle formation mode in the Internet of Vehicles. The invention considers an Internet of Vehicles scenario in which vehicle-to-infrastructure (V2I) links and V2mV links coexist, maximizes the total throughput of the V2mV links, and guarantees normal communication on the V2I links.
To achieve this technical effect, the NOMA-based distributed resource allocation method in vehicle formation mode of the invention comprises the following steps:
step one, establishing a channel model that accounts for the large-scale fading and small-scale fading of the wireless channel in the system model;
step two, maximizing the transmission rate of the V2mV links while protecting normal communication on the V2I links, with the optimization objective set to the maximum total throughput of the V2mV links;
step three, considering the effect that frequency reuse between the V2I links and the V2mV links has on normal V2I communication, characterizing the transmission rate of the V2I links under interference, and expressing the constraint on that rate;
step four, with the maximum total throughput of the V2mV links as the optimization objective, and with the V2I transmission rate threshold, the power allocation constraints and the sub-channel allocation constraints as the constraints of the optimization problem, constructing a NOMA-based distributed resource allocation model for vehicle formations, and decoupling the optimization problem into two parts, power allocation and sub-channel allocation;
step five, adopting a power allocation scheme based on lane conditions;
step 501, analyzing and deriving the channel state of the V2mV links;
step 502, generating the inter-NOMA power allocation scheme according to the channel states of the V2mV links on different lanes;
step six, characterizing sub-channel allocation with a distributed multi-agent Q-learning algorithm, and obtaining faster convergence by using a neighborhood iteration sequence based on formation position;
step 601, constructing the multi-agent Q-learning framework;
step 602, updating the Q table and the policy;
step 603, determining the sub-channel allocation scheme.
The invention has the advantages that:
(1) without affecting the basic communication quality of V2I, the V2I links and the V2mV links share spectrum resources, which relieves the spectrum shortage;
(2) NOMA technology is introduced into the Internet of Vehicles, and users are allowed to access the same channel non-orthogonally through power-domain and code-domain multiplexing, so that system throughput is greatly improved while dependence on large amounts of spectrum is reduced;
(3) NOMA-based resource allocation is realized with a distributed scheme, the total throughput of the V2mV links is maximized while V2I communication is maintained, and the high-reliability, low-latency requirements of vehicular communication are met.
Drawings
FIG. 1: schematic diagram of the NOMA-based V2mV communication system model in vehicle formation mode in the Internet of Vehicles, according to an embodiment of the invention;
FIG. 2: flowchart of the NOMA-based distributed resource allocation method in vehicle formation mode, according to an embodiment of the invention;
FIG. 3: comparison of the average throughput of V2mV links on different lanes under the power allocation scheme of the invention and under the fixed power allocation scheme mentioned in the summary of the invention;
FIG. 4: comparison of the cumulative distribution function of V2I link throughput between the invention and other resource allocation schemes;
FIG. 5: comparison of the total throughput of V2mV links between the invention and other resource allocation schemes;
FIG. 6: comparison of the average run time between the invention and other resource allocation schemes;
FIG. 7: comparison of the convergence performance between the invention and other resource allocation schemes.
Detailed Description
In order that the technical principles of the present invention may be more clearly understood, embodiments of the present invention are described in detail below with reference to the accompanying drawings.
The communication system model of the invention is shown in FIG. 1. It comprises an autonomous-driving road section with $U$ unidirectional lanes, in which V2mV links coexist with V2I links, and different lanes specify different driving speeds ($\{v_1, \dots, v_U\}$) and safety distances ($\{d_1, \dots, d_U\}$). In the model, $SV_k$ and $SV_{k'}$ ($k, k' \in \{1, 2, \dots, K\}$, $k \ne k'$) denote the $K$ individually traveling vehicles, and $PV_n$ and $PV_m$ ($n, m \in \{1, 2, \dots, N\}$, $n \ne m$) denote the $N$ autonomous vehicle formations. Each formation travels in order at the speed specified for its lane, $V = \{v_1, \dots, v_N\}$, with the corresponding vehicle safety distance $D = \{d_1, \dots, d_N\}$. Formations $n$ and $m$ comprise the vehicle sets $\Psi_n$ and $\Psi_m$ respectively, in which
Figure BDA0002299264410000041
and
Figure BDA0002299264410000042
are the transmitting vehicles in formations $n$ and $m$, respectively, and
Figure BDA0002299264410000043
Figure BDA0002299264410000044
and
Figure BDA0002299264410000045
Figure BDA0002299264410000046
and
Figure BDA0002299264410000047
are the $v$-th and $w$-th receiving vehicles in formations $n$ and $m$, respectively.
The scenario mainly comprises two communication modes, V2I and V2mV. In V2I communication, the channel gains between the base station and $SV_k$ and $SV_{k'}$ are respectively
Figure BDA0002299264410000048
In V2mV communication, the channel gains between
Figure BDA0002299264410000049
and
Figure BDA00022992644100000410
and
Figure BDA00022992644100000411
are respectively
Figure BDA00022992644100000412
and the channel gains between
Figure BDA00022992644100000413
and
Figure BDA00022992644100000414
and
Figure BDA00022992644100000415
are respectively
Figure BDA00022992644100000416
In addition, the interference from the base station to
Figure BDA00022992644100000417
is respectively
Figure BDA00022992644100000418
the interference to
Figure BDA00022992644100000419
and
Figure BDA00022992644100000420
is respectively
Figure BDA00022992644100000421
and the interference to
Figure BDA00022992644100000422
is
Figure BDA00022992644100000423
For the V2I links, each individually traveling vehicle receives information from the base station through orthogonal frequency division multiple access (OFDMA). To relieve the spectrum shortage, the invention assumes that the V2mV links reuse the spectrum allocated to the V2I links in the underlay mode of a cognitive radio (CR) network. For convenience, the invention refers to NOMA-based intra-formation communication collectively as V2mV.
Referring to fig. 2, a flowchart of a distributed resource allocation method based on NOMA in a vehicle formation mode according to the present invention includes the steps of:
step one, characterizing the channel model S1: the system model mainly considers large-scale fading caused by path loss and small-scale fading caused by the Doppler effect. The large-scale fading $G_L$, based on the distance $d$ and the path loss exponent $\gamma$, is defined as:
Figure BDA0002299264410000051
Figure BDA0002299264410000052
where $G_0$ is the attenuation at the reference distance $d_0$, $G_{rx}$ and $G_{tx}$ denote the antenna gains, and $\lambda = c / f_c$ is the wavelength determined by the carrier frequency $f_c$ and the speed of light $c$. Because of the Doppler effect caused by relative velocity, fast fading is present and can be described with a Rayleigh channel model. By statistical distribution theory and the law of large numbers, the channel impulse response $h(t, \tau)$ follows a complex Gaussian distribution, and its amplitude $|h_i(t)|$ obeys the Rayleigh distribution:
$$f(|h_i(t)|) = \frac{|h_i(t)|}{\sigma^2} \exp\left(-\frac{|h_i(t)|^2}{2\sigma^2}\right)$$
where $\sigma$ is a constant and $\sigma > 0$.
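The channel model of step one can be illustrated with a minimal Python sketch. It assumes the common log-distance form $G_L = G_0 (d/d_0)^{-\gamma}$ with a Friis-type reference gain $G_0 = G_{tx} G_{rx} (\lambda / (4\pi d_0))^2$; this form, the function names and the default parameter values are assumptions for illustration and may differ from the expressions in the original equations.

```python
import math
import random

C = 3.0e8  # speed of light in m/s

def reference_gain(f_c, d0=1.0, g_tx=1.0, g_rx=1.0):
    """Gain G_0 at reference distance d0 (assumed Friis free-space form)."""
    wavelength = C / f_c  # lambda = c / f_c
    return g_tx * g_rx * (wavelength / (4.0 * math.pi * d0)) ** 2

def large_scale_gain(d, gamma, f_c, d0=1.0, g_tx=1.0, g_rx=1.0):
    """Assumed log-distance path loss: G_L = G_0 * (d / d0) ** (-gamma)."""
    return reference_gain(f_c, d0, g_tx, g_rx) * (d / d0) ** (-gamma)

def rayleigh_amplitude(sigma, rng=random):
    """Draw |h| as the modulus of a zero-mean complex Gaussian whose real and
    imaginary parts each have standard deviation sigma (Rayleigh distributed)."""
    return math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))

def channel_power_gain(d, gamma, f_c, sigma=math.sqrt(0.5), rng=random):
    """Combined power gain: large-scale gain times |h|^2 (mean |h|^2 = 2*sigma^2)."""
    return large_scale_gain(d, gamma, f_c) * rayleigh_amplitude(sigma, rng) ** 2
```

With `sigma = sqrt(0.5)` the fading term has unit mean power, so the average of `channel_power_gain` equals the large-scale gain alone.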
Step two, characterizing the optimization objective S2: the invention sets a reference transmission rate as the threshold for judging whether a V2I link is in outage, and maximizes the transmission rate of the V2mV links while protecting V2I communication, so as to meet the demand for information sharing within formations. The optimization objective of the invention is therefore to maximize the total throughput of the V2mV links. The invention studies the case of two receiving users per formation, which can be generalized to formations with more than two receiving users. The internal interference of V2mV $n$, the mutual interference caused by other V2mV links reusing the same channel $l$ as V2mV $n$, and the interference caused by the base station are respectively defined as
Figure BDA0002299264410000055
and
Figure BDA0002299264410000056
Figure BDA0002299264410000057
Figure BDA0002299264410000058
Figure BDA0002299264410000059
Further, the throughputs of user $v$ and user $w$ in formation $n$,
Figure BDA00022992644100000510
and
Figure BDA00022992644100000511
are respectively:
Figure BDA00022992644100000512
Figure BDA00022992644100000513
where $\Omega_l = \{1, 2, \dots, L\}$ represents the set of available spectrum resources, $l \in \Omega_l$ is the frequency band allocated to V2mV $n$, and $\mu_v$ and $\mu_w$ are the power allocation factors of users $v$ and $w$, respectively. According to the NOMA power multiplexing rule, assuming that within the formation the channel gain
Figure BDA0002299264410000068
is lower than
Figure BDA0002299264410000069
then $\mu_v > \mu_w$ and $\mu_v + \mu_w = 1$. $P_n$ and $P_m$ are the transmission powers of V2mV $n$ and $m$, respectively; $P_l^c$ and $B_l$ are the base station transmission power and the bandwidth on frequency band $l$, respectively; and $N_0$ is the power spectral density of the additive white Gaussian noise (AWGN).
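As an illustration of the two-user case described above, the following sketch evaluates the standard two-user power-domain NOMA rates with SIC. It is an assumption that the original throughput expressions take exactly this form; the function name and the scalar arguments `interference_v` and `interference_w` (which lump together the interference from other V2mV links and from the base station) are hypothetical.

```python
import math

def noma_pair_throughput(p_n, mu_v, g_v, g_w, bandwidth, n0,
                         interference_v=0.0, interference_w=0.0):
    """Return (R_v, R_w) in bit/s for one formation on one sub-channel.

    Assumes g_v < g_w, so v is the weak user and receives the larger
    power share mu_v, with mu_v + mu_w = 1.
    """
    if not 0.5 < mu_v < 1.0:
        raise ValueError("weak user needs mu_v in (0.5, 1) so that mu_v > mu_w")
    mu_w = 1.0 - mu_v
    noise = n0 * bandwidth
    # Weak user v decodes its own signal and treats w's signal as interference.
    sinr_v = mu_v * p_n * g_v / (mu_w * p_n * g_v + interference_v + noise)
    # Strong user w first cancels v's signal by SIC, then decodes its own.
    sinr_w = mu_w * p_n * g_w / (interference_w + noise)
    return (bandwidth * math.log2(1.0 + sinr_v),
            bandwidth * math.log2(1.0 + sinr_w))
```

The total V2mV throughput targeted by the optimization would then be the sum of `R_v + R_w` over all formations and their allocated bands.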
The optimization objective is to maximize the total throughput of the V2mV link, characterized by:
Figure BDA0002299264410000061
wherein
Figure BDA0002299264410000062
Step three, characterizing the rate constraint under interference S3: since the V2mV link set $\Omega_k$ shares the same frequency band with V2I link $k$, the interference on V2I link $k$ is
Figure BDA0002299264410000063
and the corresponding rate under interference, $R_k^c$, is:
Figure BDA0002299264410000064
where
Figure BDA0002299264410000065
is the interference from V2mV $n$ to $SV_k$.
To guarantee the communication quality of the V2I links, the throughput $R_k^c$ of V2I link $k$ should exceed a predetermined threshold
Figure BDA0002299264410000066
with probability $p_0$, namely:
Figure BDA0002299264410000067
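The probabilistic rate constraint can be checked numerically. The sketch below is a Monte Carlo estimate under stated assumptions: the V2I rate is a Shannon rate with co-channel V2mV interference treated as noise, and fading power is exponentially distributed (Rayleigh amplitude). The function names and the `interferers` format are hypothetical, and the source may evaluate the constraint analytically instead.

```python
import math
import random

def v2i_rate(p_c, g_k, interference, bandwidth, n0):
    """Rate of V2I link k with co-channel V2mV interference treated as noise."""
    sinr = p_c * g_k / (interference + n0 * bandwidth)
    return bandwidth * math.log2(1.0 + sinr)

def satisfies_rate_constraint(p_c, mean_gain, interferers, bandwidth, n0,
                              rate_threshold, p0, trials=10000, seed=0):
    """Estimate whether Pr{R_k >= rate_threshold} >= p0 by simulation.

    interferers: list of (tx_power, mean_gain) pairs, one per V2mV link
    reusing the band of V2I link k (an illustrative input format).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Exponential fading power corresponds to a Rayleigh amplitude.
        g_k = mean_gain * rng.expovariate(1.0)
        interference = sum(p * g * rng.expovariate(1.0) for p, g in interferers)
        if v2i_rate(p_c, g_k, interference, bandwidth, n0) >= rate_threshold:
            hits += 1
    return hits / trials >= p0
```

A sub-channel assignment that makes this check fail for some V2I link would violate the first constraint of the optimization model.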
step four, establishing the optimization model S4: taking the throughput of each V2mV link as the optimization variable, the maximization of the total throughput of the V2mV links as the optimization objective, and the conditions that spectrum reuse must satisfy together with the maximum power limits of the V2mV and V2I links as the constraints, the optimization model of the NOMA-based resource allocation problem is established:
Figure BDA0002299264410000071
where the first constraint states that the throughput of V2I link $k$,
Figure BDA0002299264410000072
should exceed a predetermined threshold
Figure BDA0002299264410000073
with probability $p_0$. In the second constraint, $S_{n,l} = 1$ indicates that frequency band $l$ has been allocated to V2mV $n$, and $S_{n,l} = 0$ indicates that it has not. The third and fourth constraints bound the number of times sub-channel $l$ may be reused, where $L_{max}$ is the maximum reuse number of a frequency band. In the fifth and sixth constraints,
Figure BDA0002299264410000074
is the received power of the $v$-th vehicle in V2mV $n$, and
Figure BDA0002299264410000075
and
Figure BDA0002299264410000076
limit the maximum power of V2mV $n$ and of the base station, respectively.
The optimization problem is a non-convex mixed-integer nonlinear programming (MINLP) problem, because the channel allocation variables are discrete while the power allocation variables are continuous. An exhaustive search has extremely high computational complexity, so obtaining the global solution that way is impractical. Therefore, as in other resource allocation approaches, the invention decouples the overall resource allocation problem into two sub-problems: power allocation and sub-channel allocation.
Step five, characterizing power allocation S5: power allocation can be divided into power allocation between V2mV links (inter-V2mV) and power allocation inside a V2mV link (intra-V2mV). Intra-V2mV power allocation is essentially NOMA power multiplexing, which has already been studied extensively; the invention therefore uses the power multiplexing scheme proposed by Zhiguo Ding for intra-V2mV power allocation. The invention then focuses on the inter-V2mV power allocation problem, i.e., the optimization problem with the first, fifth and sixth constraints, and proposes a power allocation scheme based on lane conditions.
Step 501, channel state analysis and derivation S51: in conventional channel models, path loss generally affects the useful signal far more than fast fading does. The invention therefore proposes to adjust the power allocation according to the safety distances of the different lanes, so that NOMA formations on different lanes obtain fair system performance. Based on reasonable assumptions and theoretical derivation, the invention provides an effective inter-V2mV power allocation scheme. From
Figure BDA0002299264410000077
and
Figure BDA0002299264410000078
the throughput $R_n$ of V2mV $n$ is:
Figure BDA0002299264410000081
Arbitrarily select vehicles $v$ and $v+1$ ($2 \le v < v+1 \le |\Psi_n|$) whose channel gains satisfy
Figure BDA0002299264410000082
According to the successive interference cancellation (SIC) demodulation principle of NOMA, the message for receiving vehicle $v+1$ is demodulated at both vehicle $v+1$ and vehicle $v$; the corresponding signal-to-interference-plus-noise ratios (SINR), denoted
Figure BDA0002299264410000083
and
Figure BDA0002299264410000084
satisfy:
Figure BDA0002299264410000085
where
Figure BDA0002299264410000089
denotes an equivalent derivation. Likewise, the message for receiving vehicle $v$ is demodulated at vehicle $v+1$ and at vehicle $v$; the corresponding SINRs, denoted
Figure BDA0002299264410000086
and
Figure BDA0002299264410000087
satisfy:
Figure BDA0002299264410000088
The throughput $R_n$ of V2mV $n$ is then approximated as:
Figure BDA0002299264410000091
where
Figure BDA0002299264410000092
denotes equivalence under condition $\Delta$:
Figure BDA0002299264410000093
Figure BDA0002299264410000094
The invention assumes that this approximation holds as long as SIC can be performed successfully in each formation.
Step 502, generating the inter-NOMA power allocation scheme S52: the invention assumes that formations $n$ and $m$ travel on different lanes, with throughputs $R_n$ and $R_m$ respectively. The difference between $R_n$ and $R_m$ is:
Figure BDA0002299264410000095
Suppose that
Figure BDA0002299264410000096
Then:
Figure BDA0002299264410000097
In this way, a reference power for the NOMA-enabled V2mV links is introduced. The reference power is assigned to the lane with the smallest safety distance, and the power of the NOMA-based vehicle formations on the other lanes is then obtained from the above formula.
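The idea of step 502 can be sketched as follows. The sketch assumes that fairness is achieved by compensating path loss, i.e. that the power of a lane scales as $(d / d_{ref})^{\gamma}$ relative to the reference lane; this scaling rule, the function name and the cap `p_max` are illustrative assumptions, and the closed-form expression in the source may differ.

```python
def lane_power_allocation(safety_distances, p_ref, gamma, p_max):
    """Assign transmit power per lane from the lane safety distances.

    The lane with the smallest safety distance receives the reference power
    p_ref. Other lanes are scaled by (d / d_ref) ** gamma so that the
    path-loss-dominated received power is equalised across lanes (an assumed
    fairness rule), capped at the maximum power p_max.
    """
    d_ref = min(safety_distances)
    return [min(p_max, p_ref * (d / d_ref) ** gamma) for d in safety_distances]
```

For example, with safety distances of 10 m and 20 m, `p_ref = 1.0` and `gamma = 2.0`, the second lane would be given four times the reference power, provided this stays below `p_max`.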
Step six, characterizing sub-channel allocation S6: because Q-learning has strong autonomous learning capability in complex, unknown environments, the invention introduces a Q-learning-based reinforcement learning framework to solve the sub-channel allocation problem with the first, second, third and fourth constraints. Unlike conventional Q-learning, the proposed algorithm decomposes the globally optimal solution into several approximately optimal local solutions. In this process each agent considers only the states and actions of its neighboring agents, so the state space and action space of each agent shrink to a relatively small scale and the convergence of each agent improves markedly. Since path loss dominates the channel gain, neighboring agents affect an agent far more than distant agents do, so solving for local solutions instead of the global solution is feasible. The invention assumes that each agent can receive the states of its neighboring agents and makes decisions based on those states without considering more distant agents, which reduces the dimension of the feasible solution space while preserving solution quality.
Step 601, constructing a Q-learning frame S61: the proposed Q-learning framework mainly comprises five basic components: a) the intelligent agent, b) action, c) state, d) reward and e) iteration sequence, the algorithm is characterized in that the state and the iteration sequence of adjacent intelligent agents are considered, and the specific meaning of each part is as follows:
a) agent-each agent corresponds to each V2mV, i.e., {1,2, …, N }, and thus, there are multiple agents in the proposed reinforcement learning framework.
b) The actions are as follows: the action set a ═ (1,2, …, L) is the set of subchannels that the agent chooses in a uniformly distributed manner, each action a ∈ a corresponding to each spectrum L.
c) The state is as follows: the state of each agent is defined as S ═ { V, W, P, Ω }, S ∈ S, where V, W, P and Ω represent the states of the agent' S relative velocity, position, power allocation, and subchannel allocation, respectively, given the limited number of neighboring agents.
d) Rewarding: for agent n, reward function Re will be based on its previous state and selected actionn(s, a) is defined as the throughput of agent n and is passed through
Figure BDA0002299264410000104
Wherein l is implemented as a.
e) And (3) iteration sequence: stackDetermining the Q-learning sequence of the V2mV link according to the distance d from the formation position to the start point of the road section1≥…≥dn≥…≥dNIs sorted in descending order, defined as
Figure BDA0002299264410000101
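The position-based iteration sequence can be sketched as a simple sort. This is only an illustration under stated assumptions: the dictionary `distances` and the function `iteration_order` are hypothetical names, and the distances are made-up values measured from the start of the road section.

```python
from typing import Dict, List


def iteration_order(distances: Dict[int, float]) -> List[int]:
    """Order agents by descending distance from the start of the road section.

    Ties are broken by agent id (an assumption; the text does not specify it).
    """
    return sorted(distances, key=lambda n: (-distances[n], n))


# Hypothetical distances (metres) of three V2mV formations.
distances = {1: 120.0, 2: 480.0, 3: 305.0}

order = iteration_order(distances)
print(order)  # the formation farthest from the start learns first
```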
Step 602, updating the Q table and policy S62: in order to obtain, for agent
Figure BDA0002299264410000102
the optimal subchannel allocation solution based on the iteration sequence, the proposed algorithm uses a Q table to store the reward values resulting from different states and actions. According to the Bellman optimality equation, for agent
Figure BDA0002299264410000103
the optimal Q value is defined as:
Figure BDA0002299264410000111
wherein
Figure BDA0002299264410000112
where p_ss' is the transition probability from state s to s', r(s, a) is the reward obtained by taking action a in state s, γ is the discount factor, φ is the number of adjacent agents, Z is the set of integers, and a' is the action taken in state s'. At each iteration, the Q table is updated as:
Figure BDA0002299264410000113
where α is the learning rate and a* is the optimal action in state s, i.e.
Figure BDA0002299264410000114
s' is the next state reached after completing action a at state s.
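As a rough illustration of one Q-table update, the sketch below uses the standard tabular Q-learning rule Q(s, a) ← Q(s, a) + α[r + γ max_a' Q(s', a') − Q(s, a)]. The patent's exact update expression is given only as a figure, so this textbook form is an assumption; the function name `q_update`, the string state labels, and the numeric values are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, List, Tuple

QTable = DefaultDict[Tuple[Hashable, int], float]


def q_update(q: QTable, s: Hashable, a: int, reward: float, s_next: Hashable,
             actions: List[int], alpha: float = 0.1, gamma: float = 0.9) -> float:
    """Apply one standard tabular Q-learning update and return the new Q(s, a)."""
    # Value of the best action available in the next state s'.
    best_next = max(q[(s_next, a2)] for a2 in actions)
    # Move Q(s, a) toward the bootstrapped target by the learning rate alpha.
    q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])
    return q[(s, a)]


q: QTable = defaultdict(float)   # unseen (state, action) pairs start at 0
actions = [1, 2, 3]              # subchannel indices, i.e. A = {1, ..., L} with L = 3

# The reward stands in for the agent's throughput (arbitrary illustrative value).
new_value = q_update(q, s="s0", a=2, reward=5.0, s_next="s1", actions=actions)
print(new_value)
```

In a multi-agent setting each agent would hold its own table and call such an update in the order fixed by the iteration sequence.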
The policy π_a by which the agent selects action a to update the Q table is:
Figure BDA0002299264410000115
where π_a is the probability of selecting action a, ε is the exploration probability, and |A| is, for agent
Figure BDA0002299264410000116
the total number of selectable actions.
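A common way to realize a policy with exploration probability ε over |A| actions is the ε-greedy rule, in which the greedy action receives probability 1 − ε + ε/|A| and every other action receives ε/|A|. The patent's policy expression is given only as a figure, so treating it as ε-greedy is an assumption; the names `action_probabilities` and `select_action` are hypothetical.

```python
import random
from typing import Dict


def action_probabilities(q_row: Dict[int, float], epsilon: float) -> Dict[int, float]:
    """Return epsilon-greedy selection probabilities for one state's Q values."""
    n = len(q_row)                       # |A|, the number of selectable actions
    best = max(q_row, key=q_row.get)     # greedy action a*
    probs = {a: epsilon / n for a in q_row}
    probs[best] += 1.0 - epsilon
    return probs


def select_action(q_row: Dict[int, float], epsilon: float, rng: random.Random) -> int:
    """Sample one action (subchannel index) according to the epsilon-greedy policy."""
    probs = action_probabilities(q_row, epsilon)
    actions = list(probs)
    return rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]


# Hypothetical Q values of one agent in its current state, for L = 3 subchannels.
q_row = {1: 0.2, 2: 1.5, 3: 0.7}
probs = action_probabilities(q_row, epsilon=0.3)
chosen = select_action(q_row, epsilon=0.3, rng=random.Random(0))
print(probs, chosen)
```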
Step 603, determining a sub-channel allocation scheme S63: a converged Q table is obtained through S62, the optimal action and state are selected according to the converged Q table, and the optimal sub-channel allocation scheme is thereby determined.
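Reading the allocation out of converged Q tables can be sketched as a per-agent argmax. The nested-dictionary layout, the function name `greedy_allocation`, and the numeric values below are illustrative assumptions, not the patent's data structures.

```python
from typing import Dict, Hashable


def greedy_allocation(q_tables: Dict[int, Dict[Hashable, Dict[int, float]]],
                      states: Dict[int, Hashable]) -> Dict[int, int]:
    """For each agent, pick the subchannel with the largest Q value in its state."""
    allocation = {}
    for agent, q in q_tables.items():
        row = q[states[agent]]           # Q values of this agent's current state
        allocation[agent] = max(row, key=row.get)
    return allocation


# Hypothetical converged Q tables for two agents and two subchannels.
q_tables = {
    1: {"s": {1: 0.4, 2: 2.1}},
    2: {"s": {1: 1.8, 2: 0.3}},
}
states = {1: "s", 2: "s"}

allocation = greedy_allocation(q_tables, states)
print(allocation)  # agent -> chosen subchannel
```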
Fig. 3 verifies the effectiveness of the proposed power allocation scheme by simulating the average throughput of the V2mV links on different lanes, where "power allocation scheme" denotes the safe-distance-based inter-NOMA power allocation scheme and "no power allocation" denotes allocating the same power to all V2mV links. As can be seen from the figure, with the proposed scheme the average throughput of the V2mV links is distributed more uniformly across lanes than in the case without power allocation; the V2mV links on each lane therefore obtain fair communication service.
Fig. 4, 5, 6 and 7 show simulation results, based on the proposed inter-NOMA power allocation scheme, for performance indexes such as the average total throughput of the V2I and V2mV links, the running time, and the convergence performance. The proposed algorithm is named the distributed NOMA resource allocation algorithm based on multi-agent Q-learning and is compared with other distributed and centralized algorithms. The distributed comparison scheme is the distributed V2V resource allocation algorithm based on multi-agent Q-learning, which divides each V2mV link of the NOMA scheme into |Ψ_n| − 1 D2D-based V2V links; its other parameters are the same as those of the NOMA scheme. The centralized schemes include a group theory algorithm, a greedy algorithm, and a random algorithm. The software and hardware parameters of the server used for simulation are as follows: Windows Server 2019, Intel(R) Xeon(R) 2.6 GHz processor, 16 GB RAM.
Fig. 4 compares the cumulative distribution function of the V2I link throughput under the present invention and under the other resource allocation schemes. The scheme of the present invention is superior to the V2V scheme; it is slightly inferior to the greedy and random resource allocation algorithms, but more than 90% of the V2I links still reach the reference rate.
Fig. 5 compares the total throughput of the V2mV links under the present invention and under the other resource allocation schemes. Compared with the proposed scheme, the exhaustive method achieves only slightly better performance, at the cost of enormous computational complexity. Compared with the V2V scheme, the proposed scheme is generally more advantageous except when the number of formations is small, because when the spectrum is not congested the V2V scheme can use more spectrum resources than the NOMA scheme; the advantage of the proposed algorithm emerges gradually as the number of V2mV links increases. Furthermore, the performance of the proposed algorithm is clearly superior to that of the centralized algorithms described above.
Fig. 6 and Fig. 7 compare the average running time and the convergence performance, respectively, of the present invention and the other resource allocation schemes. Compared with the conventional Q-learning resource allocation algorithm, the multiple agents of the proposed algorithm can update their Q tables simultaneously, so the algorithm converges in less time. This superior convergence verifies the effectiveness of the strategy of considering the iteration sequence and the states of adjacent agents. As shown in Fig. 6, the proposed algorithm consumes less running time than the V2V scheme, which is also confirmed in Fig. 7 by its smaller number of iterations. Fig. 6 also shows that the proposed algorithm consumes more running time than the centralized schemes, but remains within an acceptable range.
In summary, by implementing the NOMA-based distributed resource allocation method in the vehicle formation mode according to the embodiment of the present invention, fairer and more efficient communication can be realized for the V2mV links on different lanes, and the transmission rate of the V2mV links is greatly increased while the communication of the V2I links is guaranteed.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims (4)

1. A distributed resource allocation method based on NOMA (non-orthogonal multiple access) in a vehicle formation mode is characterized by comprising the following steps:
considering the influence of large-scale fading and small-scale fading of a wireless channel in a system model, and establishing a channel model;
maximizing the transmission rate of the V2mV link while protecting the normal communication of the V2I link, and setting the optimization target to maximize the total throughput of the V2mV link;
considering the influence of frequency reuse of the V2I and V2mV links on normal communication of the V2I link, characterizing the transmission rate of the V2I link considering interference, and performing constraint characterization on the transmission rate;
taking the maximum total throughput of the V2mV link as an optimization target, taking a transmission rate threshold value, power allocation constraint and subchannel allocation constraint of the V2I link as constraint conditions of an optimization problem, constructing a distributed resource allocation model based on NOMA under vehicle formation, and decoupling the optimization problem into two parts, namely power allocation and subchannel allocation;
adopting a power allocation scheme based on lane conditions, and reasonably adjusting the power allocation according to the safe distances corresponding to different lanes, so as to realize fair system performance among the NOMA-enabled V2mV links on different lanes;
and characterizing sub-channel allocation by using a distributed multi-agent Q-learning algorithm, and obtaining a faster convergence speed by considering a neighborhood iteration sequence based on a formation position.
2. The method of claim 1, wherein the power allocation scheme based on lane conditions comprises:
step one: analyzing the calculation of the throughput R_n of the n-th V2mV formation, deriving the inequality relation among the signal-to-interference-plus-noise ratios within the formation, and obtaining an approximate formula for R_n;
step two: in order to achieve fairer system performance between NOMA on different lanes, calculating the difference function R_n − R_m of the throughputs of formations n and m on different lanes, the difference being related to the vehicle formation transmission powers P_n and P_m, thereby completing the NOMA-based vehicle formation power allocation.
3. The method according to claim 2, characterized in that the fair system performance between NOMA on different lanes is embodied as:
Figure FDA0002299264400000011
wherein Ψ_n and Ψ_m are the sets of vehicles contained in vehicle formations n and m,
Figure FDA0002299264400000012
is the channel gain between the sending vehicle of the n-th vehicle formation and the |Ψ_n|-th vehicle, and
Figure FDA0002299264400000013
is the channel gain between the sending vehicle of the m-th vehicle formation and the |Ψ_m|-th vehicle;
in order to realize fair system performance, a reference power for the NOMA-enabled V2mV link is introduced and allocated to the lane with the minimum safe distance, and, according to
Figure FDA0002299264400000014
the NOMA-based vehicle formation power allocation on the other lanes is completed.
4. The method of claim 1, wherein the distributed multi-agent Q-learning algorithm comprises:
step one: constructing a Q-learning framework, and defining the agents, actions, states, rewards and iteration sequence in the framework; the agent is a V2mV formation, the action is a subchannel selected by the agent in a uniformly distributed manner, the state consists of the agent's relative velocity, position, power allocation and subchannel allocation states, the reward is the throughput of the agent, and the iteration sequence determines the Q-learning order of the V2mV links;
step two: obtaining the optimal sub-channel allocation solution of the agent based on the iteration sequence, storing the reward values obtained from different states and actions in a Q table, obtaining the optimal Q value according to the Bellman optimality equation, and updating the Q table according to the policy π_a;
step three: and obtaining a converged Q table, selecting the optimal action and state according to the converged Q table, and determining the optimal sub-channel allocation scheme.
CN201911214993.1A 2019-12-02 2019-12-02 NOMA-based distributed resource allocation method in vehicle formation mode Active CN111132083B (en)

Publications (2)

Publication Number Publication Date
CN111132083A true CN111132083A (en) 2020-05-08
CN111132083B CN111132083B (en) 2021-10-22

Family

ID=70496869

Country Status (1)

Country Link
CN (1) CN111132083B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170134080A1 (en) * 2015-11-05 2017-05-11 Samsung Electronics Co., Ltd Method and apparatus for fd-mimo based multicasting in vehicular communication systems
WO2017116108A1 (en) * 2015-12-28 2017-07-06 Samsung Electronics Co., Ltd. Methods and apparatus for resource collision avoidance in device to device communication
WO2018084524A1 (en) * 2016-11-02 2018-05-11 엘지전자(주) Method for performing sidelink transmission in wireless communication system and apparatus therefor
CN109905918A (en) * 2019-02-25 2019-06-18 重庆邮电大学 A kind of NOMA honeycomb car networking dynamic resource scheduling method based on efficiency
CN110418399A (en) * 2019-07-24 2019-11-05 东南大学 A kind of car networking resource allocation methods based on NOMA

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HANYU ZHENG 等: "Joint Resource Allocation With Weighted Max-Min Fairness for NOMA-Enabled V2X Communications", 《IEEE ACCESS》 *
张璐: "NOMA技术与车联网V2I通信结合系统性能仿真研究", 《信息通信》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112055335A (en) * 2020-09-18 2020-12-08 深圳恩步通信技术有限公司 Uplink vehicle-mounted communication resource allocation method and system based on NOMA
CN112055335B (en) * 2020-09-18 2024-02-09 深圳恩步通信技术有限公司 NOMA-based uplink vehicle-mounted communication resource allocation method and system
CN112272353A (en) * 2020-10-09 2021-01-26 山西大学 Device-to-device proximity service method based on reinforcement learning
CN112272353B (en) * 2020-10-09 2021-09-28 山西大学 Device-to-device proximity service method based on reinforcement learning
CN113163368A (en) * 2021-05-19 2021-07-23 浙江凡双科技有限公司 Resource allocation method of low-delay high-reliability V2V system
CN113163368B (en) * 2021-05-19 2022-09-13 浙江凡双科技有限公司 Resource allocation method of low-delay high-reliability V2V system
TWI830235B (en) * 2022-05-24 2024-01-21 國立成功大學 Resource allocation method in downlink multi-user superposition transmission based on artificial intelligence

Also Published As

Publication number Publication date
CN111132083B (en) 2021-10-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant