CN109195135A - Base station selection method based on deep reinforcement learning in LTE-V - Google Patents
Base station selection method based on deep reinforcement learning in LTE-V
- Publication number
- CN109195135A, CN201810885951.XA, CN109195135B
- Authority
- CN
- China
- Prior art keywords
- base station
- lte
- dqn
- function
- state
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/30—Services specially adapted for particular environments, situations or purposes
- H04W4/40—Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W28/00—Network traffic management; Network resource management
- H04W28/02—Traffic management, e.g. flow control or congestion control
- H04W28/0289—Congestion control
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W28/00—Network traffic management; Network resource management
- H04W28/02—Traffic management, e.g. flow control or congestion control
- H04W28/08—Load balancing or load distribution
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W48/00—Access restriction; Network selection; Access point selection
- H04W48/20—Selecting an access point
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Computer Security & Cryptography (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
The present invention relates to a base station selection method based on deep reinforcement learning in LTE-V, comprising the following steps: 1) constructing a Q function according to the LTE-V network communication features and the base station selection performance indicators; 2) a mobility management entity obtains the status information of vehicles in the network, constructs state matrices, and stores them in an experience replay pool; 3) taking the experience replay pool as the sample source and, based on the constructed Q function, training a main DQN for selecting the optimal access base station with a dueling-double training method; 4) the trained main DQN processes the input information and outputs the selected access base station. Compared with the prior art, the present invention jointly considers the delay performance and load-balancing performance of the communication, enables vehicles to communicate timely and reliably, and has the advantages of high base station selection efficiency and high accuracy.
Description
Technical field
The present invention relates to LTE-V communication technology and deep reinforcement learning (DRL) technology, and in particular to a base station selection method based on sequential decision-making with neural networks, for reducing the congestion rate of LTE-V networks.
Background art
LTE-V (Long Term Evolution-Vehicle) is a V2X technology for which China holds independent intellectual property rights. It is an intelligent transportation system (ITS) solution based on Time Division Long Term Evolution (TD-LTE) and is an important application branch of the subsequent evolution of LTE. In February 2015, the LTE-V standardization work of the 3GPP working group formally started; the proposal of Release 14 marked the formal inclusion of LTE-V standardization in the 3GPP work plan, and LTE-V will also obtain compatibility and substantially improved performance in 5G. The LTE V2V core part was finished at the end of 2016 and the LTE V2X core part at the beginning of 2017. V2V is the core of LTE-V and was expected to be completed by the end of 2018, while systems and equipment based on the LTE-V technical standard were expected to enter commercial use after 2020.
At peak times and on congested road sections, road safety and traffic efficiency applications generate a heavy load of periodic broadcast messages. Without a reasonable congestion control scheme, the load caused by these messages leads to serious message delays and poses a severe test for LTE network capacity. In addition, vehicles select the base station with the best channel conditions through random contention, which easily causes network congestion when the traffic flow is large. It is therefore necessary to design an efficient and robust eNB (evolved Node B) selection algorithm for LTE-V.
Summary of the invention
The purpose of the present invention is to overcome the deficiencies in delay performance and network congestion of cellular communication networks that introduce LTE-V communication technology, and to provide a base station selection method based on deep reinforcement learning in LTE-V.
The purpose of the present invention can be achieved through the following technical solutions:
A base station selection method based on deep reinforcement learning in LTE-V, comprising the following steps:
1) constructing a Q function according to the LTE-V network communication features and the base station selection performance indicators;
2) a mobility management entity obtaining the status information of vehicles in the network, constructing state matrices, and storing them in an experience replay pool;
3) taking the experience replay pool as the sample source and, based on the constructed Q function, training a main DQN for selecting the optimal access base station with a dueling-double training method;
4) the trained main DQN processing the input information and outputting the selected access base station.
Further, the LTE-V network communication features include communication bandwidth and signal-to-noise ratio, and the base station selection performance indicators include user receive rate and base station load.
Further, the Q function is constructed as follows:

R = w1 × μ − w2 × L

Q(st, at) ← Q(st, at) + α × [R + γ × max_{a'} Q(s', a') − Q(st, at)]

In the formulas, μ denotes the user receive rate, L denotes the base station load, R denotes the reward function, α denotes the learning rate, Q(st, at) denotes the expected reward obtained by taking action a in state s at time t, s' denotes the next state entered after taking action a in state s, γ ∈ [0, 1] is the discount factor, w1 and w2 are weight coefficients, and max_{a'} Q(s', a') denotes the greatest expected reward obtainable over the actions available in state s' at time t+1.
Further, in the dueling-double training method:
A target DQN and a main DQN are established based on the Q function; the base station is selected by the main DQN, and the Q function maximum of that base station is calculated by the target DQN.
Further, in the dueling-double training method, whether the loss function has converged is used as the criterion for judging whether training is finished. The loss function is:

Loss = (rt+1 + γ × Qtarget − Qmain)²

In the formula, rt+1 denotes the reward received at time t+1 after taking action a in state s, Qtarget denotes the Q function maximum generated by the target DQN, Qmain denotes the Q function maximum generated by the main DQN, and γ ∈ [0, 1] is the discount factor.
Further, in the dueling-double training method, the access base station is selected in each training step using the ε-greedy algorithm, while the network parameters are updated using the back-propagation algorithm and the adaptive moment estimation algorithm.
Further, the exploration probability of the ε-greedy algorithm is as follows:

εt+1(s) = δ × f(s, a, σ) + (1 − δ) × εt(s)

In the formula, δ is the total number of actions selectable in the current state, f(s, a, σ) characterizes the uncertainty of the environment, σ ∈ [0, 1] denotes direction and sensitivity, and εt+1(s) denotes the probability of taking the DQN-generated action a in state s at time t+1.
Further, in the dueling-double training method, the optimal hyperparameters are selected using the cross-validation method.
Further, the capacity of the experience replay pool is T; when the number of stored state matrices exceeds T, the state matrices stored earliest are deleted first.
Compared with the prior art, the present invention jointly considers the delay performance and load-balancing performance of the communication, enables vehicles to communicate timely and reliably, and has the following advantages:
1) The present invention designs the relevant Q function according to the characteristics of LTE-V communication, converting the congestion control problem into an optimal decision problem in reinforcement learning and improving the efficiency of base station selection.
2) The present invention uses the MME (Mobility Management Entity) as the agent, designs the reward function by considering the network congestion probability on the base station side and the receive rate at the receiving end in the Internet of Vehicles, and models the Q (action-value) function in combination with the characteristics of vehicle communication in LTE-V, thereby proposing an eNB selection method based on deep reinforcement learning that keeps the congestion probability of the network below a maximum value, so as to guarantee the load balancing of the whole network.
3) The present invention fits the Q function modeled under the LTE-V network with a Dueling-Double Deep Q Network, takes the reception delay and the network congestion probability as base station selection criteria, and selects for the vehicle the base station least prone to network congestion, guaranteeing the delay performance and load balancing of the LTE-V network and thereby improving communication performance.
4) The present invention selects the access base station in each training step using the ε-greedy algorithm, while updating the network parameters using the back-propagation algorithm and the adaptive moment estimation (Adam) algorithm, effectively increasing the richness of the action space.
5) The present invention selects hyperparameters using the cross-validation method, so that a better network model can be obtained, improving the precision of base station selection.
Detailed description of the invention
Fig. 1 is application scenarios schematic diagram of the invention;
Fig. 2 is flow diagram of the invention.
Specific embodiment
The present invention is described in detail below with reference to the accompanying drawings and a specific embodiment. This embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation and a specific operation process are given, but the protection scope of the present invention is not limited to the following embodiment.
Aiming at the problem that vehicles randomly contend for network access under Long Term Evolution-Vehicle (LTE-V), which easily causes network congestion, the present invention provides a base station selection method based on deep reinforcement learning in LTE-V. It jointly considers the delay performance and load-balancing performance of the communication and enables vehicles to communicate timely and reliably; the application scenario is shown in Fig. 1. The present invention uses the Mobility Management Entity (MME) in the LTE core network as the agent and, considering both the network-side load and the receive rate at the receiving end, completes the matching of vehicles and eNBs, reducing the network congestion probability and the network delay. A Dueling-Double Deep Q Network (DQN) is used to fit the target action-value function, completing the conversion from high-dimensional state input to low-dimensional action output.
As shown in Fig. 2, the method comprises the following steps:
Step 1: construct the Q function according to the LTE-V network communication features and the base station selection performance indicators.
The LTE-V network communication features include the communication bandwidth Bandwidth and the signal-to-noise ratio SINR; the base station selection performance indicators include the user receive rate μ and the base station load L. The Q function is then constructed as follows:

μ = Bandwidth × log2(1 + SINR)

R = w1 × μ − w2 × Lk

Q(st, at) ← Q(st, at) + α × [R + γ × max_{a'} Q(s', a') − Q(st, at)]

In the formulas, μ denotes the user receive rate, L denotes the base station load, R denotes the reward function, α denotes the learning rate, Q(st, at) denotes the expected reward obtained by taking action a in state s at time t, s' denotes the next state entered after taking action a in state s, the subscript k denotes the k-th base station, γ ∈ [0, 1] is the discount factor, w1 and w2 are weight coefficients, and max_{a'} Q(s', a') denotes the greatest expected reward obtainable over the actions available in state s' at time t+1.
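The rate, reward, and Q-update relations of step 1 can be sketched in a few lines. This is a minimal tabular stand-in for the DQN, with hypothetical function names (`receive_rate`, `reward`, `q_update`) and arbitrary default weights w1, w2 chosen for illustration; the patent itself leaves these coefficients unspecified.

```python
import math

def receive_rate(bandwidth_hz, sinr):
    # Shannon rate from step 1: mu = Bandwidth * log2(1 + SINR)
    return bandwidth_hz * math.log2(1.0 + sinr)

def reward(mu, load, w1=1.0, w2=1.0):
    # Reward trades off user receive rate against base-station load:
    # R = w1 * mu - w2 * L (w1, w2 are unspecified weight coefficients).
    return w1 * mu - w2 * load

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    # One Q-learning step toward the target r + gamma * max_a' Q(s', a').
    # Q is a dict keyed by (state, action); unseen pairs default to 0.
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return Q[(s, a)]
```

For example, a 10 MHz channel at an SINR of 15 gives `receive_rate(10e6, 15)` = 40 Mbit/s, which `reward` then discounts by the candidate base station's load before the Q update.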
Step 2: the mobility management entity obtains the status information of vehicles in the network, constructs state matrices, and stores them in the experience replay pool. The capacity of the experience replay pool is T; when the number of stored state matrices exceeds T, the state matrices stored earliest are deleted first.
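The capacity-T, delete-oldest-first replay pool of step 2 maps directly onto a bounded FIFO queue. A minimal sketch (the class and method names are illustrative, not from the patent):

```python
from collections import deque
import random

class ReplayPool:
    """Experience replay pool of capacity T: once full, the state matrix
    stored earliest is deleted first, as described in step 2."""
    def __init__(self, capacity):
        # deque(maxlen=...) silently evicts the oldest entry when full
        self.buffer = deque(maxlen=capacity)

    def store(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Random minibatch drawn for DQN training in step 3
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)
```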
Step 3: at each training step, a batch of samples is randomly drawn from the experience replay pool D and fed to the DQN for learning. Taking the experience replay pool as the sample source and based on the constructed Q function, a main DQN for selecting the optimal access base station is trained with the dueling-double training method.
The present invention uses a Dueling-Double Deep Q Network (DQN) to fit the target action-value function, completing the conversion from high-dimensional state input to low-dimensional action output. The dueling-double training method is as follows: a target DQN and a main DQN are established based on the Q function; the main DQN selects an eNB through its Q function maximum (Q value for short), and the Q value of that action is then obtained from the target DQN. In this way the main network is responsible for selecting the eNB, while the Q value of the chosen eNB is generated by the target DQN.
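The split described above — main DQN selects, target DQN evaluates — is the double-DQN target. A minimal sketch using dict-based Q tables as stand-ins for the two networks (all identifiers here are illustrative):

```python
def double_dqn_target(q_main, q_target, s_next, actions, r, gamma=0.9):
    # The main DQN picks the action (eNB) with the largest Q value...
    a_star = max(actions, key=lambda a: q_main[(s_next, a)])
    # ...but that action is valued by the target DQN, decoupling
    # selection from evaluation to reduce overestimation.
    return r + gamma * q_target[(s_next, a_star)]
```

The loss of the method is then the squared gap between this target and the main network's own Q value, `(target - q_main[(s, a)]) ** 2`, matching the convergence criterion stated below.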
In the dueling-double training method, whether the loss function has converged is used as the criterion for judging whether training is finished. The loss function is:

Loss = (rt+1 + γ × Qtarget − Qmain)²

In the formula, rt+1 denotes the reward received at time t+1 after taking action a in state s, Qtarget denotes the Q value generated by the target DQN, Qmain denotes the Q value generated by the main DQN, and γ ∈ [0, 1] is the discount factor.
In order to increase the richness of the action space, the access base station is selected in each training step using the ε-greedy algorithm, while the network parameters are updated using the back-propagation algorithm and the adaptive moment estimation algorithm.
In each state, the ε-greedy algorithm selects the action generated by the DQN (exploitation) with a certain probability and takes a random action (exploration) otherwise, in order to expand the selectable action space. In the training process of the present invention, whether to explore is decided according to the probability ε: if exploring, a base station is selected at random; otherwise, the base station corresponding to the Q function maximum is selected.
The exploration probability of the ε-greedy algorithm is as follows:

εt+1(s) = δ × f(s, a, σ) + (1 − δ) × εt(s)

In the formula, δ is the total number of actions selectable in the current state, f(s, a, σ) characterizes the uncertainty of the environment, σ ∈ [0, 1] denotes direction and sensitivity, and εt+1(s) denotes the probability of taking the DQN-generated action a in state s at time t+1.
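The ε recursion and the explore/exploit choice above can be sketched as follows. This is a simplified reading under stated assumptions: `f_value` stands in for the environment-uncertainty term f(s, a, σ), `delta` is treated as a plain mixing coefficient, and the sketch follows the explore-with-probability-ε convention from the training-process description (the patent's prose is ambiguous on which branch ε weights):

```python
import random

def update_epsilon(eps_t, delta, f_value):
    # eps_{t+1}(s) = delta * f(s, a, sigma) + (1 - delta) * eps_t(s)
    return delta * f_value + (1.0 - delta) * eps_t

def epsilon_greedy(q_values, epsilon, rng=random):
    # Explore: pick a random base station with probability epsilon;
    # exploit: otherwise pick the base station with the maximal Q value.
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```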
Forward propagation of the neural network, i.e. the inference process, computes the loss function Loss from the input. A deep neural network can be regarded as a multi-level nested function; back-propagation differentiates each variable of this function using the chain rule and updates the variables with the resulting gradients.
Adam is an adaptive learning-rate optimization algorithm: the exponentially weighted average of the first-order derivatives corrects the direction and size of the gradient update, and the second-order moment adjusts the learning rate of each update, so that rapidly changing gradients slow the variable updates down.
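The two moments described above can be shown in a single scalar Adam step. This is the standard Adam update (with the usual default coefficients) rather than anything specific to the patent; the function name is illustrative:

```python
import math

def adam_step(param, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    # First moment: exponentially weighted average of the gradient,
    # correcting the direction and size of the update.
    m = b1 * m + (1 - b1) * grad
    # Second moment: weighted average of squared gradients; large recent
    # gradients inflate it and shrink the effective learning rate.
    v = b2 * v + (1 - b2) * grad * grad
    # Bias correction for the zero-initialised moment estimates (step t >= 1)
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v
```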
In the dueling-double training method, the optimal hyperparameters are selected using the cross-validation method. Cross-validation is a hyperparameter selection algorithm, i.e. so-called parameter tuning. The training data is divided into K parts; K−1 of them are used for training and the remaining part as the test set. This is repeated K times, and the average value on the test sets is taken as the performance of the model under the current hyperparameter set. Repeating this M times, the optimal model can be obtained from the M models.
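The K-fold selection loop described above can be sketched as follows; `score_fn` is a hypothetical callback that trains on the training indices and scores on the test indices, since the patent does not specify the scoring metric:

```python
def kfold_indices(n, k):
    # Split n sample indices into k folds: each fold serves once as the
    # test set, the remaining k-1 folds as the training data.
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        splits.append((train, test))
    return splits

def select_hyperparams(candidates, score_fn, n, k):
    # Average the test-set score over the k folds for each hyperparameter
    # set, and keep the best-scoring set (the repeat-M-times selection).
    def avg_score(hp):
        scores = [score_fn(hp, tr, te) for tr, te in kfold_indices(n, k)]
        return sum(scores) / len(scores)
    return max(candidates, key=avg_score)
```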
Step 4: the trained main DQN processes the input information and outputs the selected access base station.
After the DQN parameters converge, only the main DQN needs to be retained in practical application, and the selected access base station is output directly from its forward propagation.
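Deployment in step 4 reduces to one forward pass and an argmax. A minimal sketch, with `main_dqn` standing in for the trained network as any callable that maps a state to per-base-station Q values:

```python
def select_base_station(main_dqn, state):
    # After convergence only the main DQN is kept: a single forward pass
    # yields the Q value of each candidate base station, and the argmax
    # is the access base station output for the vehicle.
    q_values = main_dqn(state)  # forward propagation only, no training
    return max(range(len(q_values)), key=lambda a: q_values[a])
```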
The preferred embodiment of the present invention has been described in detail above. It should be understood that those skilled in the art can make many modifications and variations according to the concept of the present invention without creative work. Therefore, all technical solutions that can be obtained by those skilled in the art through logical analysis, reasoning or limited experiments on the basis of the prior art under the concept of the present invention shall fall within the protection scope determined by the claims.
Claims (9)
1. A base station selection method based on deep reinforcement learning in LTE-V, characterized by comprising the following steps:
1) constructing a Q function according to the LTE-V network communication features and the base station selection performance indicators;
2) a mobility management entity obtaining the status information of vehicles in the network, constructing state matrices, and storing them in an experience replay pool;
3) taking the experience replay pool as the sample source and, based on the constructed Q function, training a main DQN for selecting the optimal access base station with a dueling-double training method;
4) the trained main DQN processing the input information and outputting the selected access base station.
2. The base station selection method based on deep reinforcement learning in LTE-V according to claim 1, characterized in that the LTE-V network communication features include communication bandwidth and signal-to-noise ratio, and the base station selection performance indicators include user receive rate and base station load.
3. The base station selection method based on deep reinforcement learning in LTE-V according to claim 2, characterized in that the Q function is constructed as follows:

R = w1 × μ − w2 × L

Q(st, at) ← Q(st, at) + α × [R + γ × max_{a'} Q(s', a') − Q(st, at)]

In the formulas, μ denotes the user receive rate, L denotes the base station load, R denotes the reward function, α denotes the learning rate, Q(st, at) denotes the expected reward obtained by taking action a in state s at time t, s' denotes the next state entered after taking action a in state s, γ ∈ [0, 1] is the discount factor, w1 and w2 are weight coefficients, and max_{a'} Q(s', a') denotes the greatest expected reward obtainable over the actions available in state s' at time t+1.
4. The base station selection method based on deep reinforcement learning in LTE-V according to claim 1, characterized in that, in the dueling-double training method:
a target DQN and a main DQN are established based on the Q function; the base station is selected by the main DQN, and the Q function maximum of that base station is calculated by the target DQN.
5. The base station selection method based on deep reinforcement learning in LTE-V according to claim 1, characterized in that, in the dueling-double training method, whether the loss function has converged is used as the criterion for judging whether training is finished, the loss function being:

Loss = (rt+1 + γ × Qtarget − Qmain)²

In the formula, rt+1 denotes the reward received at time t+1 after taking action a in state s, Qtarget denotes the Q function maximum generated by the target DQN, Qmain denotes the Q function maximum generated by the main DQN, and γ ∈ [0, 1] is the discount factor.
6. The base station selection method based on deep reinforcement learning in LTE-V according to claim 1, characterized in that, in the dueling-double training method, the access base station is selected in each training step using the ε-greedy algorithm, while the network parameters are updated using the back-propagation algorithm and the adaptive moment estimation algorithm.
7. The base station selection method based on deep reinforcement learning in LTE-V according to claim 6, characterized in that the exploration probability of the ε-greedy algorithm is as follows:

εt+1(s) = δ × f(s, a, σ) + (1 − δ) × εt(s)

In the formula, δ is the total number of actions selectable in the current state, f(s, a, σ) characterizes the uncertainty of the environment, σ ∈ [0, 1] denotes direction and sensitivity, and εt+1(s) denotes the probability of taking the DQN-generated action a in state s at time t+1.
8. The base station selection method based on deep reinforcement learning in LTE-V according to claim 1, characterized in that, in the dueling-double training method, the optimal hyperparameters are selected using the cross-validation method.
9. The base station selection method based on deep reinforcement learning in LTE-V according to claim 1, characterized in that the capacity of the experience replay pool is T, and when the number of stored state matrices exceeds T, the state matrices stored earliest are deleted first.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810885951.XA CN109195135B (en) | 2018-08-06 | 2018-08-06 | Base station selection method based on deep reinforcement learning in LTE-V |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810885951.XA CN109195135B (en) | 2018-08-06 | 2018-08-06 | Base station selection method based on deep reinforcement learning in LTE-V |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109195135A true CN109195135A (en) | 2019-01-11 |
CN109195135B CN109195135B (en) | 2021-03-26 |
Family
ID=64920254
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810885951.XA Active CN109195135B (en) | 2018-08-06 | 2018-08-06 | Base station selection method based on deep reinforcement learning in LTE-V |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109195135B (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109743210A (en) * | 2019-01-25 | 2019-05-10 | 电子科技大学 | UAV network multi-user access control method based on deep reinforcement learning |
CN109803338A (en) * | 2019-02-12 | 2019-05-24 | 南京邮电大学 | A kind of dual link base station selecting method based on degree of regretting |
CN110493826A (en) * | 2019-08-28 | 2019-11-22 | 重庆邮电大学 | Heterogeneous cloud radio access network resource allocation method based on deep reinforcement learning |
CN110717600A (en) * | 2019-09-30 | 2020-01-21 | 京东城市(北京)数字科技有限公司 | Sample pool construction method and device, and algorithm training method and device |
CN110809306A (en) * | 2019-11-04 | 2020-02-18 | 电子科技大学 | Terminal access selection method based on deep reinforcement learning |
CN111065131A (en) * | 2019-12-16 | 2020-04-24 | 深圳大学 | Switching method and device and electronic equipment |
CN111083767A (en) * | 2019-12-23 | 2020-04-28 | 哈尔滨工业大学 | Heterogeneous network selection method based on deep reinforcement learning |
CN111181618A (en) * | 2020-01-03 | 2020-05-19 | 东南大学 | Intelligent reflection surface phase optimization method based on deep reinforcement learning |
CN111243299A (en) * | 2020-01-20 | 2020-06-05 | 浙江工业大学 | Single cross port signal control method based on 3 DQN-PSER algorithm |
WO2021002866A1 (en) * | 2019-07-03 | 2021-01-07 | Nokia Solutions And Networks Oy | Reinforcement learning based inter-radio access technology load balancing under multi-carrier dynamic spectrum sharing |
CN112468984A (en) * | 2020-11-04 | 2021-03-09 | 国网上海市电力公司 | Method for selecting address of power wireless private network base station and related equipment |
CN113507503A (en) * | 2021-06-16 | 2021-10-15 | 华南理工大学 | Internet of vehicles resource allocation method with load balancing function |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105245608A (en) * | 2015-10-23 | 2016-01-13 | 同济大学 | Telematics network node screening and accessibility routing construction method based on self-encoding network |
CN106910351A (en) * | 2017-04-19 | 2017-06-30 | 大连理工大学 | A kind of traffic signals self-adaptation control method based on deeply study |
CN107705557A (en) * | 2017-09-04 | 2018-02-16 | 清华大学 | Road network signal control method and device based on depth enhancing network |
US20180052825A1 (en) * | 2016-08-16 | 2018-02-22 | Microsoft Technology Licensing, Llc | Efficient dialogue policy learning |
CN107832836A (en) * | 2017-11-27 | 2018-03-23 | 清华大学 | Model-free depth enhancing study heuristic approach and device |
US20180089553A1 (en) * | 2016-09-27 | 2018-03-29 | Disney Enterprises, Inc. | Learning to schedule control fragments for physics-based character simulation and robots using deep q-learning |
CN108365874A (en) * | 2018-02-08 | 2018-08-03 | 电子科技大学 | Based on the extensive MIMO Bayes compressed sensing channel estimation methods of FDD |
- 2018-08-06: CN CN201810885951.XA patent/CN109195135B/en active Active
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105245608A (en) * | 2015-10-23 | 2016-01-13 | 同济大学 | Telematics network node screening and accessibility routing construction method based on self-encoding network |
US20180052825A1 (en) * | 2016-08-16 | 2018-02-22 | Microsoft Technology Licensing, Llc | Efficient dialogue policy learning |
US20180089553A1 (en) * | 2016-09-27 | 2018-03-29 | Disney Enterprises, Inc. | Learning to schedule control fragments for physics-based character simulation and robots using deep q-learning |
CN106910351A (en) * | 2017-04-19 | 2017-06-30 | 大连理工大学 | A kind of traffic signals self-adaptation control method based on deeply study |
CN107705557A (en) * | 2017-09-04 | 2018-02-16 | 清华大学 | Road network signal control method and device based on depth enhancing network |
CN107832836A (en) * | 2017-11-27 | 2018-03-23 | 清华大学 | Model-free depth enhancing study heuristic approach and device |
CN108365874A (en) * | 2018-02-08 | 2018-08-03 | 电子科技大学 | Based on the extensive MIMO Bayes compressed sensing channel estimation methods of FDD |
Non-Patent Citations (2)
Title |
---|
MATTEO GADALETA: "D-DASH: A Deep Q-Learning Framework for DASH Video Streaming", IEEE Transactions on Cognitive Communications and Networking * |
LIU Quan: "A Survey of Deep Reinforcement Learning" (in Chinese), Chinese Journal of Computers * |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109743210A (en) * | 2019-01-25 | 2019-05-10 | 电子科技大学 | Unmanned plane network multi-user connection control method based on deeply study |
CN109803338A (en) * | 2019-02-12 | 2019-05-24 | 南京邮电大学 | A kind of dual link base station selecting method based on degree of regretting |
CN109803338B (en) * | 2019-02-12 | 2021-03-12 | 南京邮电大学 | Dual-connection base station selection method based on regret degree |
WO2021002866A1 (en) * | 2019-07-03 | 2021-01-07 | Nokia Solutions And Networks Oy | Reinforcement learning based inter-radio access technology load balancing under multi-carrier dynamic spectrum sharing |
CN114287145A (en) * | 2019-07-03 | 2022-04-05 | 诺基亚通信公司 | Reinforcement learning based inter-radio access technology load balancing under multi-carrier dynamic spectrum sharing |
CN110493826A (en) * | 2019-08-28 | 2019-11-22 | 重庆邮电大学 | A kind of isomery cloud radio access network resources distribution method based on deeply study |
CN110493826B (en) * | 2019-08-28 | 2022-04-12 | 重庆邮电大学 | Heterogeneous cloud wireless access network resource allocation method based on deep reinforcement learning |
CN110717600A (en) * | 2019-09-30 | 2020-01-21 | 京东城市(北京)数字科技有限公司 | Sample pool construction method and device, and algorithm training method and device |
CN110717600B (en) * | 2019-09-30 | 2021-01-26 | 京东城市(北京)数字科技有限公司 | Sample pool construction method and device, and algorithm training method and device |
CN110809306A (en) * | 2019-11-04 | 2020-02-18 | 电子科技大学 | Terminal access selection method based on deep reinforcement learning |
CN111065131A (en) * | 2019-12-16 | 2020-04-24 | 深圳大学 | Switching method and device and electronic equipment |
CN111065131B (en) * | 2019-12-16 | 2023-04-18 | 深圳大学 | Switching method and device and electronic equipment |
CN111083767B (en) * | 2019-12-23 | 2021-07-27 | 哈尔滨工业大学 | Heterogeneous network selection method based on deep reinforcement learning |
CN111083767A (en) * | 2019-12-23 | 2020-04-28 | 哈尔滨工业大学 | Heterogeneous network selection method based on deep reinforcement learning |
CN111181618A (en) * | 2020-01-03 | 2020-05-19 | 东南大学 | Intelligent reflection surface phase optimization method based on deep reinforcement learning |
CN111243299B (en) * | 2020-01-20 | 2020-12-15 | 浙江工业大学 | Single cross port signal control method based on 3 DQN-PSER algorithm |
CN111243299A (en) * | 2020-01-20 | 2020-06-05 | 浙江工业大学 | Single cross port signal control method based on 3 DQN-PSER algorithm |
CN112468984A (en) * | 2020-11-04 | 2021-03-09 | 国网上海市电力公司 | Method for selecting address of power wireless private network base station and related equipment |
CN112468984B (en) * | 2020-11-04 | 2023-02-10 | 国网上海市电力公司 | Method for selecting address of power wireless private network base station and related equipment |
CN113507503A (en) * | 2021-06-16 | 2021-10-15 | 华南理工大学 | Internet of vehicles resource allocation method with load balancing function |
Also Published As
Publication number | Publication date |
---|---|
CN109195135B (en) | 2021-03-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109195135A (en) | Base station selection method based on deep reinforcement learning in LTE-V | |
CN111369042B (en) | Wireless service flow prediction method based on weighted federal learning | |
EP3583797B1 (en) | Methods and systems for network self-optimization using deep learning | |
CN109743210B (en) | Unmanned aerial vehicle network multi-user access control method based on deep reinforcement learning | |
CN103607737B (en) | A kind of heterogeneous-network service shunt method and system | |
TWI700649B (en) | Deep reinforcement learning based beam selection method in wireless communication networks | |
US20200366385A1 (en) | Systems and methods for wireless signal configuration by a neural network | |
CN106658603A (en) | Wireless sensor network routing energy-saving method with load balancing | |
CN107690176A (en) | A kind of network selecting method based on Q learning algorithms | |
CN109787699B (en) | Wireless sensor network routing link state prediction method based on mixed depth model | |
CN106789408A (en) | A kind of IPRAN network access layers equipment cyclization rate computational methods | |
CN114071661A (en) | Base station energy-saving control method and device | |
Xu et al. | Fuzzy Q-learning based vertical handoff control for vehicular heterogeneous wireless network | |
CN104954278A (en) | Bee colony optimization based network traffic scheduling method under multiple QoS (quality of service) constraints | |
Klus et al. | Deep learning based localization and HO optimization in 5G NR networks | |
CN105246117A (en) | Energy-saving routing protocol realization method suitable for mobile wireless sensor network | |
CN103781166B (en) | Mobile terminal power distribution method in heterogeneous wireless network cooperative communication system | |
CN105844370B (en) | Urban road vehicle degree of communication optimization method based on particle swarm algorithm | |
CN104951832B (en) | A kind of car networking roadside unit Optimization deployment method based on artificial fish-swarm algorithm | |
CN102300269B (en) | Genetic algorithm based antenna recognition network end-to-end service quality guaranteeing method | |
CN106211183A (en) | A kind of self-organizing of based on Cooperation microcellulor alliance opportunistic spectrum access method | |
CN110445825A (en) | Super-intensive network small station coding cooperative caching method based on intensified learning | |
CN108063802A (en) | User location dynamic modeling optimization method based on edge calculations | |
Bhadauria et al. | QoS based deep reinforcement learning for V2X resource allocation | |
CN105959978A (en) | Massive M2M (Machine-to-Machine) communication packet access method based on LTE (Long Term Evolution) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |