CN112367638A - Intelligent frequency spectrum selection method for vehicle-vehicle communication of urban rail transit vehicle

Publication number
CN112367638A
Authority
CN
China
Prior art keywords
vehicle
link
communication
agent
resource block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110032707.0A
Other languages
Chinese (zh)
Inventor
赵军辉
陈垚
张青苗
廖龙霞
周天清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Jiaotong University
Original Assignee
East China Jiaotong University
Application filed by East China Jiaotong University

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W 4/00: Services specially adapted for wireless communication networks; Facilities therefor
    • H04W 4/30: Services specially adapted for particular environments, situations or purposes
    • H04W 4/40: Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P]
    • H04W 4/42: Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P] for mass transport vehicles, e.g. buses, trains or aircraft
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W 16/00: Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
    • H04W 16/14: Spectrum sharing arrangements between different networks
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W 4/00: Services specially adapted for wireless communication networks; Facilities therefor
    • H04W 4/30: Services specially adapted for particular environments, situations or purposes
    • H04W 4/40: Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P]
    • H04W 4/46: Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P] for vehicle-to-vehicle communication [V2V]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W 72/00: Local resource management
    • H04W 72/04: Wireless resource allocation
    • H04W 72/044: Wireless resource allocation based on the type of the allocated resource
    • H04W 72/0453: Resources in frequency domain, e.g. a carrier in FDMA
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W 72/00: Local resource management
    • H04W 72/50: Allocation or scheduling criteria for wireless resources
    • H04W 72/54: Allocation or scheduling criteria for wireless resources based on quality criteria
    • H04W 72/541: Allocation or scheduling criteria for wireless resources based on quality criteria using the level of interference
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention provides an intelligent spectrum selection method for train-to-train communication in urban rail transit. Building on existing research, it proposes a new spectrum selection scheme that aims to reduce the impact on the overall communication system of the interference generated among trains operating in different communication modes. The scheme introduces multi-agent low-dimensional fingerprints into a conventional multi-agent reinforcement learning algorithm, which greatly reduces the dimension of the Q function and improves learning stability. Each agent learns the train operating environment and the decisions of the other agents, and from this determines a suitable communication spectrum and transmit power, thereby reducing the impact on the system of the interference caused by spectrum reuse. The method can be applied to train-to-train communication environments of urban rail transit trains and is highly practical.

Description

Intelligent frequency spectrum selection method for vehicle-vehicle communication of urban rail transit vehicle
Technical Field
The invention relates to the technical field of wireless communication networks, in particular to an intelligent frequency spectrum selection method for vehicle-to-vehicle communication of urban rail transit vehicles.
Background
With the rapid development of the urban rail transit industry, the two-way train-to-ground (T2G) communication structure of communication-based train control (CBTC) systems can no longer meet the efficiency and safety requirements of current rail transit systems, owing to the large amount of equipment to be configured, the complex system structure, and the strong interference affecting the communication technology used. A new type of train communication system is therefore urgently needed. Direct train-to-train (T2T) communication can significantly improve the operating efficiency and safety of trains and can also greatly reduce the amount of trackside equipment, so applying T2T technology to new train communication systems has become a trend.
Although applying T2T communication to urban rail transit systems offers great advantages for system performance, T2G technology remains necessary at the present stage; for example, switch state information and train line information still have to be obtained through communication between the train and trackside equipment. Where T2G and T2T communication coexist, T2T communication links reuse the uplink spectrum resources of T2G communication links in order to make full use of the limited spectrum, which in turn creates co-channel interference. An effective spectrum sharing scheme is therefore needed to manage this interference and reduce its impact as far as possible.
Disclosure of Invention
Therefore, the invention aims to provide an intelligent frequency spectrum selection method for vehicle-to-vehicle communication of urban rail transit, so as to reduce the influence of interference generated by frequency spectrum sharing on the system performance.
An intelligent frequency spectrum selection method for vehicle-to-vehicle communication of urban rail transit vehicles comprises the following steps:
Step one: construct a train-to-train communication system model in a rail transit wireless communication network, in which the cell where a train is located has radius L. A single cell contains M train-to-ground (T2G) communication links and N train-to-train (T2T) communication links, and the available bandwidth is divided into R resource blocks. Define R = M, so that each T2G link uses a single resource block. Within one coherence time period, the channel power gain of the t-th T2T link on the m-th resource block is expressed as:
Figure 967106DEST_PATH_IMAGE001
(1)
where
Figure 312637DEST_PATH_IMAGE002
is the large-scale fading coefficient, including path loss and shadow fading, and
Figure 807203DEST_PATH_IMAGE003
is the small-scale fading power component;
Step two: to address the interference present in the system model, establish the signal-to-interference-plus-noise ratio (SINR) of trains in the different communication modes. On resource block m, the channel gain from the t-th T2T vehicle to the T2T vehicle with index
Figure 808657DEST_PATH_IMAGE004
is denoted
Figure 273137DEST_PATH_IMAGE005
The channel gain from the t-th T2T vehicle to the base station is denoted
Figure 597808DEST_PATH_IMAGE006
The channel gain from the m-th T2G vehicle to the base station is denoted
Figure 653488DEST_PATH_IMAGE007
The channel gain from the m-th T2G vehicle to the t-th T2T vehicle is denoted
Figure 548763DEST_PATH_IMAGE008
Then, on resource block m, the SINR of the m-th T2G link is expressed as:
Figure 285775DEST_PATH_IMAGE009
(2)
The SINR of the t-th T2T link is expressed as:
Figure 605898DEST_PATH_IMAGE010
(3)
where
Figure 425955DEST_PATH_IMAGE011
denotes the transmit power of the m-th T2G vehicle,
Figure 402001DEST_PATH_IMAGE012
denotes the transmit power of the t-th T2T vehicle,
Figure 473863DEST_PATH_IMAGE013
denotes the transmit power of the T2T vehicle with index
Figure 523858DEST_PATH_IMAGE014
and
Figure 859025DEST_PATH_IMAGE015
denotes the noise power. The quantities
Figure 119105DEST_PATH_IMAGE016
and
Figure 119291DEST_PATH_IMAGE017
are spectrum resource allocation indicators. When
Figure 351689DEST_PATH_IMAGE016
or
Figure 654494DEST_PATH_IMAGE018
equals 1, the t-th T2T link or the T2T link with index
Figure 277237DEST_PATH_IMAGE014
uses the m-th resource block; when
Figure 894163DEST_PATH_IMAGE016
or
Figure 043384DEST_PATH_IMAGE018
equals 0, the resource block is not used;
Step three: using the SINRs calculated in step two, calculate the channel capacities of the T2T link and the T2G link respectively, wherein:
the channel capacity of the m-th T2G link using the m-th resource block is expressed as:
Figure 454774DEST_PATH_IMAGE019
(4)
the channel capacity of the t-th T2T link using the m-th resource block is expressed as:
Figure 548501DEST_PATH_IMAGE020
(5)
where B is the bandwidth of each spectrum resource block;
Step four: construct a deep reinforcement learning model that introduces multi-agent low-dimensional fingerprints, comprising a training stage and a testing stage;
S4.1 Training stage
S4.1.1 In coherence time period k, the current environment state S_k is given;
S4.1.2 Each T2T communication link is treated as an agent. Each agent t obtains from the environment the observation
Figure 031435DEST_PATH_IMAGE021
where O denotes the observation function. After the low-dimensional fingerprint features are added, the observation becomes
Figure 644950DEST_PATH_IMAGE022
where
Figure 492820DEST_PATH_IMAGE023
is the greedy (exploration) coefficient and e is the number of training iterations. The strategy (action) adopted by the agent is
Figure 214788DEST_PATH_IMAGE024
and the joint strategy adopted by all agents is A_k;
S4.1.3 The reward obtained by the agent is
Figure 094889DEST_PATH_IMAGE025
S4.1.4 With probability
Figure 156386DEST_PATH_IMAGE026
the environment moves to the next state
Figure 971895DEST_PATH_IMAGE027
where S_{k+1} denotes the environment state in time period k+1,
Figure 790946DEST_PATH_IMAGE028
denotes the value of the environment state in time period k+1, r denotes the value of the reward in time period k+1, S denotes the value of the current state, and
Figure 553366DEST_PATH_IMAGE029
denotes the action taken in time period k;
S4.1.5 Each agent obtains a new observation
Figure 266107DEST_PATH_IMAGE030
Throughout the environment, all agents share the same reward;
S4.2 Testing stage
S4.2.1 In coherence time period k, each agent first estimates its environment observation
Figure 642731DEST_PATH_IMAGE031
S4.2.2 Using the trained Q network, the agent selects the strategy with the largest value
Figure 073712DEST_PATH_IMAGE024
S4.2.3 The agent begins transmitting data using the transmit power and spectrum resources determined by the selected strategy.
The intelligent spectrum selection method for train-to-train communication of urban rail transit provided by the invention has the following beneficial effects:
(1) The spectrum sharing and transmit power of T2T communication trains are optimized, and the system channel capacity is maximized while the communication quality of T2G communication trains is guaranteed;
(2) Multi-agent low-dimensional fingerprints are introduced and each T2T communication link is set as an agent, which greatly improves the stability of the experience replay pool during agent training and makes the training result more accurate and effective;
(3) A distributed T2T resource allocation algorithm is designed. It can be executed offline under different spectrum conditions and network topologies, is easy to implement, and can be deployed in a network quickly; the trained networks of the agents need to be updated only when the environmental characteristics change significantly. Simulation results show that the method performs well in terms of both practical applicability and system performance.
Drawings
FIG. 1 is a schematic diagram of the train-to-train communication system model;
FIG. 2 is a schematic diagram of the interference between a T2T link and a T2G link;
FIG. 3 is a schematic diagram of the neural network used by the MARL of the present invention;
FIG. 4 is a schematic diagram of model convergence during the training stage of the present invention;
FIG. 5 is a simulation graph comparing the total system channel capacity of the scheme of the present invention with that of prior art schemes for different transmission packet sizes;
FIG. 6 is a simulation graph comparing the T2T link data transmission success probability of the scheme of the present invention with that of prior art schemes for different transmission packet sizes.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides an intelligent spectrum selection method for train-to-train communication of urban rail transit, which comprises steps one to four as follows.
Step one: construct a train-to-train communication system model in a rail transit wireless communication network. The constructed model is shown in FIG. 1, and the interference between the T2T link and the T2G link is shown in FIG. 2. Specifically, a cellular urban rail transit train communication network is considered, in which the cell where a train is located has radius L. Unlike in a conventional cellular network, base stations in an urban rail transit system are distributed linearly along the track, and the number of trains per cell is limited. Given these particular features of the train operating environment, a single cell contains M train-to-ground (T2G) communication links and N train-to-train (T2T) communication links, and the available bandwidth is divided into R resource blocks. Without loss of generality, define R = M, so that each T2G link uses a single resource block and there is no spectrum sharing among T2G links. Within one coherence time period, the channel power gain of the t-th T2T link on the m-th resource block (the one occupied by the m-th T2G link) is expressed as:
Figure 374243DEST_PATH_IMAGE032
(1)
where
Figure 082436DEST_PATH_IMAGE002
is the large-scale fading coefficient, including path loss and shadow fading, and
Figure 239748DEST_PATH_IMAGE003
is the small-scale fading power component.
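Equation (1) survives here only as an image placeholder. The following is a minimal Python sketch of one common reading of the text, namely that the channel power gain is the product of a large-scale term (path loss plus shadowing) and a small-scale fading power term. The function name, its parameters and the numeric values are illustrative assumptions; the unit-mean exponential distribution corresponds to Rayleigh fading, which the simulation parameters name as the fast-fading model.

```python
import random


def channel_power_gain(path_loss_db, shadowing_db, rng):
    """Return a linear channel power gain: large-scale term times small-scale term."""
    # Large-scale fading: path loss plus shadowing, converted from dB to linear scale.
    large_scale = 10 ** (-(path_loss_db + shadowing_db) / 10)
    # Small-scale fading power: unit-mean exponential (assumed Rayleigh fading).
    small_scale = rng.expovariate(1.0)
    return large_scale * small_scale


# Illustrative values only (not taken from the patent).
rng = random.Random(0)
gain = channel_power_gain(path_loss_db=100.0, shadowing_db=3.0, rng=rng)
```

Passing the random generator explicitly keeps the sketch reproducible; averaging many draws recovers the large-scale term, since the small-scale power has unit mean.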
Step two: to address the interference present in the system model, establish the signal-to-interference-plus-noise ratio (SINR) of trains in the different communication modes. On resource block m, the channel gain from the t-th T2T vehicle to the T2T vehicle with index
Figure 095709DEST_PATH_IMAGE004
is denoted
Figure 386882DEST_PATH_IMAGE005
The channel gain from the t-th T2T vehicle to the base station may be expressed as
Figure 074215DEST_PATH_IMAGE006
The channel gain from the m-th T2G vehicle to the base station may be expressed as
Figure 605690DEST_PATH_IMAGE007
The channel gain from the m-th T2G vehicle to the t-th T2T vehicle may be expressed as
Figure 886630DEST_PATH_IMAGE008
Then, on resource block m, the SINR of the m-th T2G link can be expressed as:
Figure 325702DEST_PATH_IMAGE009
(2)
The SINR of the t-th T2T link may be expressed as:
Figure 070804DEST_PATH_IMAGE010
(3)
where
Figure 960131DEST_PATH_IMAGE011
denotes the transmit power of the m-th T2G vehicle,
Figure 587422DEST_PATH_IMAGE012
denotes the transmit power of the t-th T2T vehicle,
Figure 033447DEST_PATH_IMAGE013
denotes the transmit power of the T2T vehicle with index
Figure 570738DEST_PATH_IMAGE014
and
Figure 506333DEST_PATH_IMAGE015
denotes the noise power. The quantities
Figure 558603DEST_PATH_IMAGE016
and
Figure 995269DEST_PATH_IMAGE033
are spectrum resource allocation indicators. When
Figure 511701DEST_PATH_IMAGE016
or
Figure 555881DEST_PATH_IMAGE034
equals 1, the t-th T2T link or the T2T link with index
Figure 033130DEST_PATH_IMAGE035
uses the m-th resource block; when
Figure 820957DEST_PATH_IMAGE016
or
Figure 191895DEST_PATH_IMAGE034
equals 0, the resource block is not used.
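Equations (2) and (3) are likewise available only as images. The sketch below assumes the usual SINR structure implied by the definitions in step two: the T2G link is interfered with by every T2T link that reuses its resource block, and a T2T link is interfered with by the T2G link on that block and by the other T2T links sharing it. All function and parameter names are hypothetical, and the exact form of the patent's equations may differ.

```python
def sinr_t2g(p_t2g, g_t2g_bs, noise, t2t_powers, t2t_to_bs_gains, uses_block):
    """SINR of the T2G link on its own resource block m (assumed form)."""
    # Interference from every T2T link whose allocation indicator is 1 on block m.
    interference = sum(
        rho * p * g
        for rho, p, g in zip(uses_block, t2t_powers, t2t_to_bs_gains)
    )
    return p_t2g * g_t2g_bs / (noise + interference)


def sinr_t2t(t, p_t2g, g_t2g_to_t, noise, t2t_powers, gains_to_t, uses_block):
    """SINR of T2T link t on resource block m (assumed form).

    gains_to_t[j] is the gain from T2T transmitter j to the receiver of link t,
    so gains_to_t[t] is the desired-signal gain.
    """
    if not uses_block[t]:
        return 0.0  # link t does not transmit on this block
    signal = t2t_powers[t] * gains_to_t[t]
    # Interference from the T2G link plus the other T2T links sharing block m.
    interference = p_t2g * g_t2g_to_t + sum(
        uses_block[j] * t2t_powers[j] * gains_to_t[j]
        for j in range(len(t2t_powers))
        if j != t
    )
    return signal / (noise + interference)


# Illustrative linear-scale values only (not taken from the patent).
example_t2g = sinr_t2g(1.0, 0.5, 0.1, [1.0, 2.0], [0.1, 0.2], [1, 0])
example_t2t = sinr_t2t(0, 1.0, 0.05, 0.1, [1.0, 2.0], [0.8, 0.3], [1, 1])
```

In the first example only T2T link 0 reuses the block, so the T2G interference term is 1.0 × 0.1; in the second, both T2T links share the block, so link 0 sees interference from the T2G vehicle and from link 1.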
Step three: using the SINRs calculated in step two, calculate the channel capacities of the T2T link and the T2G link respectively, wherein:
the channel capacity of the m-th T2G link using the m-th resource block can be expressed as:
Figure 593927DEST_PATH_IMAGE019
(4)
the channel capacity of the t-th T2T link using the m-th resource block can be expressed as:
Figure 886368DEST_PATH_IMAGE020
(5)
where B is the bandwidth of each spectrum resource block.
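Equations (4) and (5) are not recoverable from the image placeholders. A reasonable assumption, given that the text speaks of channel capacity computed from an SINR and a bandwidth B, is the Shannon form C = B log2(1 + SINR). The helper below is a sketch under that assumption; the function name is hypothetical.

```python
import math


def channel_capacity(bandwidth_hz, sinr):
    """Shannon capacity in bit/s for a linear-scale SINR (assumed form of eqs. 4 and 5)."""
    return bandwidth_hz * math.log2(1.0 + sinr)


# Illustrative values: a 1 MHz resource block at SINR 1 (0 dB) and SINR 3.
capacity_low = channel_capacity(1e6, 1.0)
capacity_high = channel_capacity(1e6, 3.0)
```

The same function serves both link types; only the SINR passed in differs between a T2G link and a T2T link.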
Spectrum reuse is necessary in order to make full use of the limited spectrum resources. However, the formulas above show clearly how the interference caused by spectrum reuse affects the system, so a reasonable and effective spectrum reuse strategy is important.
Step four, spectrum reuse:
In the urban rail transit train communication network, multiple T2T communication links reuse the spectrum resources of the T2G communication links, so the spectrum reuse problem can be modeled as a multi-agent reinforcement learning (MARL) problem. Each T2T communication link acts as an agent, and the agents jointly explore the unknown communication environment in which the trains operate in order to gain experience. Based on their observations of the environment, the agents improve their spectrum sharing and power control strategies, which then guide them to make appropriate decisions.
Referring to FIG. 3, the MARL-based spectrum reuse scheme is divided into two stages: a training stage and a testing stage.
S4.1 Training stage
S4.1.1 In coherence time period k, the current environment state S_k is given.
S4.1.2 Each T2T communication link is treated as an agent. The observation that each agent t obtains from the environment can be expressed as:
Figure 274624DEST_PATH_IMAGE036
(6)
where
Figure 641014DEST_PATH_IMAGE037
O denotes the observation function,
Figure 026996DEST_PATH_IMAGE023
is the greedy (exploration) coefficient, e is the number of training iterations, G_m is the channel state information of the T2G link, H_t is the channel state information of the T2T link, B_t is the size of the data to be transmitted, and T_t is the duration available for transmitting the data. The symbol
Figure 603471DEST_PATH_IMAGE038
likewise denotes the greedy coefficient.
The strategy (action) adopted according to the observation is
Figure 733101DEST_PATH_IMAGE024
which comprises the transmit power used by the train and the selected spectrum resource; the joint strategy adopted by all agents is A_k.
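The low-dimensional fingerprint described in S4.1.2 amounts to appending the training iteration number e and the greedy coefficient to each agent's local observation. The sketch below illustrates this in Python. The flat-list layout of the observation and the linear annealing schedule for the greedy coefficient are assumptions made for illustration; the patent does not specify either.

```python
def augment_observation(local_observation, iteration, epsilon):
    """Append the low-dimensional fingerprint (e, epsilon) to a local observation."""
    return list(local_observation) + [float(iteration), float(epsilon)]


def linear_epsilon(iteration, anneal_iterations, eps_start=1.0, eps_end=0.02):
    """Linearly anneal the greedy coefficient, then hold it at eps_end (assumed schedule)."""
    if iteration >= anneal_iterations:
        return eps_end
    fraction = iteration / anneal_iterations
    return eps_start + fraction * (eps_end - eps_start)


# Hypothetical local observation: [G_m, H_t, B_t, T_t] flattened to numbers.
local_obs = [0.4, 0.7, 1200.0, 0.08]
eps = linear_epsilon(iteration=50, anneal_iterations=100)
observation = augment_observation(local_obs, iteration=50, epsilon=eps)
```

Because the fingerprint adds only two numbers, the input dimension of the Q network grows very little compared with feeding in the other agents' full policies.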
S4.1.3 The reward obtained by the agent can be expressed as:
Figure 468845DEST_PATH_IMAGE039
(7)
where
Figure 822466DEST_PATH_IMAGE040
is the weight assigned to the T2G link, and
Figure 823920DEST_PATH_IMAGE041
Figure 694924DEST_PATH_IMAGE042
denote the channel capacities of the T2G communication link and the T2T communication link, respectively, within coherence time period k.
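Equation (7) is not recoverable from the image placeholder. The text states only that the reward involves a weight for the T2G link together with the T2G and T2T channel capacities in coherence time period k. The sketch below therefore assumes a weighted sum in which the T2T capacities receive the complementary weight; that complementary weighting, the summation over links, and all names are assumptions rather than the patent's stated formula.

```python
def shared_reward(t2g_capacities, t2t_capacities, t2g_weight):
    """Weighted sum of T2G and T2T capacities (assumed form of equation 7)."""
    # Assumption: T2T capacities are weighted by (1 - t2g_weight).
    return (
        t2g_weight * sum(t2g_capacities)
        + (1.0 - t2g_weight) * sum(t2t_capacities)
    )


# Illustrative capacities in arbitrary units.
reward = shared_reward([2.0, 4.0], [1.0, 3.0], t2g_weight=0.5)
```

Since all agents receive this same value, the reward encourages cooperation rather than competition among the T2T links.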
S4.1.4 With probability
Figure 894961DEST_PATH_IMAGE043
the environment moves to the next state
Figure 622746DEST_PATH_IMAGE027
where S_{k+1} denotes the environment state in time period k+1,
Figure 032867DEST_PATH_IMAGE028
denotes the value of the environment state in time period k+1, r denotes the value of the reward in time period k+1, S denotes the value of the current state, and
Figure 832196DEST_PATH_IMAGE029
denotes the action taken in time period k.
S4.1.5 Each agent obtains a new observation
Figure 762106DEST_PATH_IMAGE030
Throughout the environment, all agents share the same reward.
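Steps S4.1.1 to S4.1.5 produce, for each agent and each coherence period, a tuple of observation, action, shared reward and next observation. In deep Q-learning such tuples are typically stored in an experience pool and sampled at random for training. The class below is a minimal sketch of such a pool; its name, capacity and interface are illustrative assumptions, not details given in the patent.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (observation, action, reward, next_observation) tuples."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest entry when full.
        self.buffer = deque(maxlen=capacity)

    def add(self, observation, action, reward, next_observation):
        self.buffer.append((observation, action, reward, next_observation))

    def sample(self, batch_size, rng):
        # Sample without replacement, capped at the number of stored tuples.
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)


# Store five illustrative transitions in a pool that holds three.
pool = ReplayBuffer(capacity=3)
for step in range(5):
    pool.add([float(step)], step, 1.0, [float(step + 1)])
batch = pool.sample(2, random.Random(0))
```

Storing the fingerprint-augmented observation in each tuple lets the Q network tell how old a sampled experience is, which is the stabilizing effect the description attributes to the fingerprint.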
S4.2 Testing stage
S4.2.1 In coherence time period k, each agent first estimates its environment observation
Figure 660792DEST_PATH_IMAGE031
S4.2.2 Using the trained Q network, the agent selects the strategy with the largest value
Figure 699155DEST_PATH_IMAGE024
S4.2.3 The agent begins transmitting data using the transmit power and spectrum resources determined by the selected strategy.
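The testing stage can be sketched as a greedy selection over the Q values that the trained network outputs for one observation. The sketch below assumes that each discrete action index encodes a (power level, resource block) pair as power_level × number_of_blocks + block; this encoding, the function name and the sample Q values are hypothetical.

```python
def select_greedy_action(q_values, num_resource_blocks):
    """Pick the action with the largest Q value and decode it.

    Assumed encoding: action = power_level * num_resource_blocks + resource_block.
    """
    best = max(range(len(q_values)), key=lambda a: q_values[a])
    power_level, resource_block = divmod(best, num_resource_blocks)
    return power_level, resource_block


# Illustrative Q values for 2 power levels x 4 resource blocks.
q_values = [0.1, 0.2, 0.0, 0.3, 0.9, 0.5, 0.4, 0.2]
power_level, resource_block = select_greedy_action(q_values, num_resource_blocks=4)
```

No exploration is applied at this stage: the agent always takes the highest-valued action and then transmits with the decoded power level on the decoded resource block.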
To verify the system performance of the algorithm, it was simulated using Python and MATLAB R2018b. The main parameters of the train communication system are as follows:
cell radius L = 3 km; number of resource blocks R = 7; base station antenna gain 7 dBi; train antenna gain 4 dBi; train operating speed 70-90 km/h; T2G link path loss model 128 + 37.6 log10(d); T2T link path loss model 148 + 40 log10(d); fast fading: Rayleigh fading; weight
Figure 895650DEST_PATH_IMAGE044
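The two path loss models listed in the simulation parameters can be written directly as functions. The sketch below assumes that d is the transmitter-receiver distance in kilometres and that the result is in dB, which is the usual convention for models of this form; the patent text does not state the units, and the function names are hypothetical.

```python
import math


def t2g_path_loss_db(distance_km):
    """T2G link path loss, 128 + 37.6 log10(d), with d assumed to be in km."""
    return 128.0 + 37.6 * math.log10(distance_km)


def t2t_path_loss_db(distance_km):
    """T2T link path loss, 148 + 40 log10(d), with d assumed to be in km."""
    return 148.0 + 40.0 * math.log10(distance_km)


# Path loss at 1 km equals the constant term of each model.
t2g_at_1km = t2g_path_loss_db(1.0)
t2t_at_1km = t2t_path_loss_db(1.0)
```

With these models, a T2T link is 20 dB weaker than a T2G link at 1 km, and the gap widens with distance because of the larger distance exponent.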
As shown in FIG. 4, the cumulative reward increases with the number of training iterations, which demonstrates the effectiveness of the proposed training algorithm. Once the number of training iterations reaches about 1000, the overall system performance shows a tendency to converge, although channel fading caused by mobility in the urban rail transit environment still causes some fluctuation. Based on this observation, the Q network of each agent was trained for 1500 iterations in the subsequent evaluations of T2T and T2G link performance, to ensure that all models had converged.
As shown in FIG. 5, the performance of all schemes decreases as the size of the packets transmitted over the T2T links increases. When the packet size increases, a T2T link must transmit for longer and at higher power in order to raise its probability of successful data transmission. As the transmission time of the T2T link grows, the T2T link also causes stronger interference to the T2G link because of spectrum sharing. FIG. 5 shows that, across the different T2T packet sizes, the present invention achieves better performance than the other schemes and comes very close to the performance upper bound. This further indicates that introducing the multi-agent low-dimensional fingerprints greatly improves the stability of the experience replay pool, so that the training result is more accurate.
As shown in FIG. 6, as the size of the data packet to be transmitted increases, the success rate of all schemes except the maximum-capacity benchmark gradually decreases, and the transmission success probability of the present invention is closer to that of the maximum-capacity benchmark than is the MARL scheme without the low-dimensional fingerprint. Together with the observations from FIG. 6, it can be concluded that, with the trained deep Q network of the present invention, the spectrum sharing scheme outperforms the other schemes even in cases not covered by training.
In summary, the intelligent spectrum selection method for train-to-train communication of urban rail transit provided by this embodiment has the following beneficial effects:
(1) The spectrum sharing and transmit power of T2T communication trains are optimized, and the system channel capacity is maximized while the communication quality of T2G communication trains is guaranteed;
(2) Multi-agent low-dimensional fingerprints are introduced and each T2T communication link is set as an agent, which greatly improves the stability of the experience replay pool during agent training and makes the training result more accurate and effective;
(3) A distributed T2T resource allocation algorithm is designed. It can be executed offline under different spectrum conditions and network topologies, is easy to implement, and can be deployed in a network quickly; the trained networks of the agents need to be updated only when the environmental characteristics change significantly. Simulation results show that the method performs well in terms of both practical applicability and system performance.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims (1)

1. An intelligent frequency spectrum selection method for vehicle-to-vehicle communication of urban rail transit is characterized by comprising the following steps:
Step one: construct a train-to-train communication system model in a rail transit wireless communication network, in which the cell where a train is located has radius L. A single cell contains M train-to-ground (T2G) communication links and N train-to-train (T2T) communication links, and the available bandwidth is divided into R resource blocks. Define R = M, so that each T2G link uses a single resource block. Within one coherence time period, the channel power gain of the t-th T2T link on the m-th resource block is expressed as:
Figure 217276DEST_PATH_IMAGE001
(1)
where
Figure 075510DEST_PATH_IMAGE002
is the large-scale fading coefficient, including path loss and shadow fading, and
Figure 094282DEST_PATH_IMAGE003
is the small-scale fading power component;
step two, for the interference present in the system model, establishing the signal-to-interference-plus-noise ratio (SINR) of the trains in the different communication modes, wherein, on resource block m, the channel gain from the t-th T2T vehicle to the [symbol image]-th T2T vehicle is denoted [symbol image], the channel gain from the t-th T2T vehicle to the base station is denoted [symbol image], the channel gain from the m-th T2G vehicle to the base station is denoted [symbol image], and the channel gain from the m-th T2G vehicle to the t-th T2T vehicle is denoted [symbol image]; then, on resource block m, the SINR of the m-th T2G link is expressed as:
[equation image] (2)
and the SINR of the t-th T2T link is expressed as:
[equation image] (3)
wherein [symbol image] denotes the transmit power of the m-th T2G vehicle, [symbol image] denotes the transmit power of the t-th T2T vehicle, [symbol image] denotes the transmit power of the [symbol image]-th T2T vehicle, [symbol image] denotes the noise power, and [symbol image] and [symbol image] are spectrum resource allocation indicators: when [symbol image] or [symbol image] equals 1, the t-th T2T link or the [symbol image]-th T2T link uses the m-th resource block; when [symbol image] or [symbol image] equals 0, the resource block is not used;
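The formula images for equations (1) to (3) are not reproduced in this text. The block below is only a sketch of one form that is consistent with the verbal definitions in steps one and two; it is not taken from the original figures. Every symbol is an assumption introduced here for illustration: g_t[m] for the T2T channel power gain, \alpha_t and h_t[m] for the large-scale and small-scale components, t' for the index of another T2T link, g_{t,B}, g_{m,B}, g_{m,t} and g_{t',t} for the gains named in step two, P^G_m and P^T_t for the transmit powers, \sigma^2 for the noise power, and \rho_t[m] for the allocation indicator.

```latex
% Assumed notation throughout; the original equation images may differ.
\begin{align}
g_t[m] &= \alpha_t \, h_t[m] \tag{1}\\
\gamma^{G}_{m}[m] &= \frac{P^{G}_{m}\, g_{m,B}[m]}
  {\sigma^{2} + \sum_{t=1}^{N} \rho_t[m]\, P^{T}_{t}\, g_{t,B}[m]} \tag{2}\\
\gamma^{T}_{t}[m] &= \frac{P^{T}_{t}\, g_t[m]}
  {\sigma^{2} + P^{G}_{m}\, g_{m,t}[m]
   + \sum_{t' \neq t} \rho_{t'}[m]\, P^{T}_{t'}\, g_{t',t}[m]} \tag{3}
\end{align}
```

Under this reading, the T2G link on resource block m is interfered with by every T2T link whose indicator for that block equals 1, while a T2T link is interfered with by the T2G link that owns the block and by the other T2T links sharing it.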
step three, calculating the channel capacities of the T2T links and the T2G links from the SINRs obtained in step two, wherein:
the channel capacity of the m-th T2G link using the m-th resource block is expressed as:
[equation image] (4)
the channel capacity of the t-th T2T link using the m-th resource block is expressed as:
[equation image] (5)
wherein B is the bandwidth of each spectrum resource block;
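As a numerical illustration of steps two and three, the following Python sketch computes the SINR of one T2G link and one T2T link on a shared resource block and converts each to a capacity. It assumes the standard Shannon form C = B log2(1 + SINR) for equations (4) and (5) and a conventional signal-over-noise-plus-interference form for the SINRs; the formula images are not reproduced in this text, so these forms, all function names, and all numeric values are assumptions made for illustration.

```python
import math


def t2g_sinr(p_g, g_mb, noise, rho, p_t, g_tb):
    """SINR of the T2G link that owns resource block m (assumed form).

    rho[t] is 1 if T2T link t reuses the block, else 0.
    g_tb[t] is the gain from T2T transmitter t to the base station.
    """
    interference = sum(r * p * g for r, p, g in zip(rho, p_t, g_tb))
    return p_g * g_mb / (noise + interference)


def t2t_sinr(t, p_t, g_tt, noise, rho, p_g, g_mt):
    """SINR of T2T link t on resource block m, assuming it transmits there.

    g_tt[a][b] is the gain from T2T transmitter a to T2T receiver b.
    g_mt[t] is the gain from the T2G transmitter to T2T receiver t.
    """
    interference = p_g * g_mt[t]
    for other in range(len(p_t)):
        if other != t:
            interference += rho[other] * p_t[other] * g_tt[other][t]
    return p_t[t] * g_tt[t][t] / (noise + interference)


def capacity(bandwidth_hz, sinr):
    """Shannon capacity in bit/s for a block of the given bandwidth."""
    return bandwidth_hz * math.log2(1.0 + sinr)


# Hypothetical linear-scale values: one T2G link, two T2T links,
# and only T2T link 0 reusing the resource block.
noise = 1.0
p_g, g_mb = 2.0, 4.0
rho = [1, 0]
p_t = [1.0, 1.0]
g_tb = [1.0, 5.0]
g_tt = [[6.0, 0.5], [0.5, 6.0]]
g_mt = [1.0, 1.0]

sinr_g = t2g_sinr(p_g, g_mb, noise, rho, p_t, g_tb)     # 8 / (1 + 1) = 4.0
sinr_t = t2t_sinr(0, p_t, g_tt, noise, rho, p_g, g_mt)  # 6 / (1 + 2) = 2.0
c_g = capacity(1e6, sinr_g)
c_t = capacity(1e6, sinr_t)
```

Because T2T link 1 has its indicator set to 0, it contributes no interference to either SINR in this example, which mirrors the role of the allocation indicators in step two.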
step four, constructing a deep reinforcement learning model that introduces multi-agent low-dimensional fingerprints, specifically comprising a training stage and a testing stage;
S4.1 training stage
S4.1.1 in coherence time period k, the current environment state S_k is given;
S4.1.2 the T2T communication links are regarded as agents; each agent t obtains from the environment an observation [symbol image], where O denotes the observation function; the low-dimensional fingerprint features are added to the observation, which becomes [symbol image], wherein [symbol image] is the greedy coefficient and e is the number of training iterations; the strategy is [symbol image], and the strategy adopted by all the agents is A_k;
S4.1.3 the reward obtained by the agents is [symbol image];
S4.1.4 the environment enters the next state [symbol image] with probability [symbol image], wherein S_{k+1} denotes the environment state of time period k+1, [symbol image] denotes the value of the environment state of time period k+1, r denotes the value of the reward of time period k+1, S denotes the value of the current state, and [symbol image] denotes the action taken in time period k;
S4.1.5 each agent obtains a new observation [symbol image]; the agents share the same reward throughout the environment;
S4.2 testing stage
S4.2.1 in coherence time period k, each agent first estimates its environment observation [symbol image];
S4.2.2 according to the trained Q network, each agent selects the strategy with the maximum value, [symbol image];
S4.2.3 each agent begins transmitting data using the transmit power and spectrum resources determined by the selected strategy.
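The agent-side logic of step four can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the fingerprint is taken to be the pair (greedy coefficient, training iteration e) appended to the local observation, exploration is taken to be epsilon-greedy, and each action index is assumed to encode a (resource block, power level) pair. The function names, the `q_network` callable, and the discrete power levels are all hypothetical.

```python
import random


def augment_observation(observation, epsilon, iteration):
    """Append the low-dimensional fingerprint (epsilon, e) to an observation."""
    return list(observation) + [float(epsilon), float(iteration)]


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy choice; epsilon = 0 gives the greedy testing-stage choice."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore during training
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploit


def decode_action(action, n_power_levels):
    """Map an action index to (resource block index, power level index)."""
    return divmod(action, n_power_levels)


def agent_step(q_network, observation, epsilon, iteration, rng=random):
    """One decision of one T2T agent; q_network is any callable that maps
    a feature list to a list of action values (hypothetical interface)."""
    features = augment_observation(observation, epsilon, iteration)
    return select_action(q_network(features), epsilon, rng)


# Testing-stage usage with a stand-in for a trained Q network.
def stub_q(features):
    # Placeholder scores only; a real Q network would be learned.
    return [0.2, 0.9, 0.1, 0.4]


action = agent_step(stub_q, [0.3, 0.7], epsilon=0.0, iteration=100)
resource_block, power_level = decode_action(action, n_power_levels=2)
```

Appending the exploration rate and iteration count lets each agent's Q network condition on how far training has progressed, which is the usual motivation for fingerprints when several agents learn at once and the environment each one sees is non-stationary.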

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110032707.0A CN112367638A (en) 2021-01-12 2021-01-12 Intelligent frequency spectrum selection method for vehicle-vehicle communication of urban rail transit vehicle

Publications (1)

Publication Number Publication Date
CN112367638A true CN112367638A (en) 2021-02-12

Family

ID=74534791

Country Status (1)

Country Link
CN (1) CN112367638A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114199592A (en) * 2021-12-31 2022-03-18 商洛学院 Automobile horsepower machine capable of preventing car washing
CN114235440A (en) * 2021-12-31 2022-03-25 商洛学院 Protective horsepower machine for automobile

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106686737A (en) * 2017-01-12 2017-05-17 北京交通大学 Resource management method based on train position and cargo handling capacity maximization
US20190124667A1 (en) * 2017-10-23 2019-04-25 Commissariat A L'energie Atomique Et Aux Energies Alternatives Method for allocating transmission resources using reinforcement learning
CN112153744A (en) * 2020-09-25 2020-12-29 哈尔滨工业大学 Physical layer security resource allocation method in ICV network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JUNHUI ZHAO, YANG ZHANG, YIWEN NIE, JIN LIU: "Intelligent Resource Allocation for Train-to-Train Communication: A Multi-Agent Deep Reinforcement Learning Approach", 《IEEE》 *

Similar Documents

Publication Publication Date Title
CN109729528B (en) D2D resource allocation method based on multi-agent deep reinforcement learning
CN109068391B (en) Internet of vehicles communication optimization algorithm based on edge calculation and Actor-Critic algorithm
Zhang et al. RIS-aided next-generation high-speed train communications: Challenges, solutions, and future directions
CN112367638A (en) Intelligent frequency spectrum selection method for vehicle-vehicle communication of urban rail transit vehicle
CN111917509B (en) Multi-domain intelligent communication system and communication method based on channel-bandwidth joint decision
Zhao et al. Intelligent resource allocation for train-to-train communication: A multi-agent deep reinforcement learning approach
Vu et al. Multi-agent reinforcement learning for channel assignment and power allocation in platoon-based C-V2X systems
CN111586697A (en) Channel resource allocation method based on directed hyper-graph greedy coloring
CN109788566B (en) Network resource allocation method based on deep reinforcement learning
Bi et al. Deep reinforcement learning based power allocation for D2D network
CN112153744B (en) Physical layer security resource allocation method in ICV network
CN115866787A (en) Network resource allocation method integrating terminal direct transmission communication and multi-access edge calculation
Zafar et al. Resource allocation in moving small cell network using deep learning based interference determination
CN116095770A (en) Cross-region cooperation self-adaptive switching judgment method in ultra-dense heterogeneous wireless network
CN116347635A (en) NB-IoT wireless resource allocation method based on NOMA and multi-agent reinforcement learning
CN116582860A (en) Link resource allocation method based on information age constraint
CN115551065A (en) Internet of vehicles resource allocation method based on multi-agent deep reinforcement learning
Liang et al. Multi-agent reinforcement learning for spectrum sharing in vehicular networks
CN117412391A (en) Enhanced dual-depth Q network-based Internet of vehicles wireless resource allocation method
CN109041009B (en) Internet of vehicles uplink power distribution method and device
Xiao et al. Power allocation for device-to-multi-device enabled HetNets: A deep reinforcement learning approach
Zhang et al. DDQN based handover scheme in heterogeneous network
CN111132298A (en) Power distribution method and device
CN112637812B (en) Vehicle-mounted cooperative communication relay selection method based on supervised machine learning
CN115915454A (en) SWIPT-assisted downlink resource allocation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20210212)