CN112188600B - Method for optimizing heterogeneous network resources by reinforcement learning - Google Patents
- Publication number
- CN112188600B (application CN202011002522.7A)
- Authority
- CN
- China
- Prior art keywords
- learning
- sub
- cre
- reinforcement learning
- heterogeneous network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W52/00—Power management, e.g. TPC [Transmission Power Control], power saving or power classes
- H04W52/02—Power saving arrangements
- H04W52/0203—Power saving arrangements in the radio access network or backbone network of wireless communication networks
- H04W52/0206—Power saving arrangements in the radio access network or backbone network of wireless communication networks in access points, e.g. base stations
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W24/00—Supervisory, monitoring or testing arrangements
- H04W24/02—Arrangements for optimising operational condition
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W24/00—Supervisory, monitoring or testing arrangements
- H04W24/06—Testing, supervising or monitoring using simulated traffic
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Feedback Control In General (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
The invention discloses a method for optimizing heterogeneous network resources by reinforcement learning, belonging to the technical field of communication. The method integrates reinforcement learning with convex optimization theory and divides the action space according to the correlation of the actions, namely the ABS, CRE and small base station dormancy strategies. To address the problem that the system energy efficiency used as the reward function value is too large in order of magnitude during reinforcement learning modeling, the reward function value is redesigned: the energy efficiency is first negated and then its reciprocal is taken as the new reward function value. The invention reduces the action space of reinforcement learning, guarantees system convergence by means of convex optimization theory, and accelerates the convergence of reinforcement learning. Simulation experiments show that the method converges, has lower complexity, and improves the convergence speed by 60% compared with conventional tabular Q-Learning while almost reaching the theoretical value of the system energy efficiency.
Description
Technical Field
The invention belongs to the technical field of communication, and particularly relates to a method for optimizing heterogeneous network resources by reinforcement learning.
Background
As access to wireless devices increases, higher demands are placed on the communication capacity of network systems. One effective way to address this problem is to build heterogeneous networks, where introducing eICIC (enhanced inter-cell interference coordination) can effectively overcome the interference problem and improve the signal-to-interference-plus-noise ratio between mobile devices and base stations. At the same time, more stringent requirements are placed on the performance and energy efficiency of heterogeneous networks. As their complexity continues to increase, energy efficiency optimization faces growing challenges and has become one of the hot spots of communication network research, especially for heterogeneous networks equipped with 5G base stations. The key question is how to configure heterogeneous network resources effectively so as to maximize the energy efficiency of the network system.
Research on heterogeneous network resource allocation at the underlying resource level mainly focuses on jointly considering almost blank subframes (Almost Blank Subframe, ABS), cell range expansion (Cell Range Expansion, CRE) and small base station dormancy strategies to solve the system energy efficiency allocation problem. Many scholars ultimately arrive at a non-convex NP-hard problem, which is converted into a convex problem through relaxation and the Karush-Kuhn-Tucker (KKT) conditions. The most effective approach decomposes the joint consideration of ABS, CRE and the base station dormancy strategy into three sub-problems, namely ABS, CRE and small base station dormancy, each of which is convex; according to convex optimization theory, a solution to the original non-convex NP-hard problem is then obtained by cyclically iterating the solutions of the three sub-problems. The disadvantage of this scheme is that traditional mathematical methods still require a large amount of computation when actually solving the sub-problems, and the computation process is quite complex, which limits its practical application.
In recent years, machine learning techniques have been increasingly applied to many fields such as big data analysis, advertisement precision delivery, image classification, and the like. At present, a plurality of students introduce machine learning technology into a communication system for resource optimization research, mainly based on deep learning and reinforcement learning.
Among deep neural network approaches, deep learning has the advantage of good fitting performance: a deep learning method can closely approximate the relation between heterogeneous network resources and system performance, thereby maximizing heterogeneous network performance. Its disadvantages are that neural networks can suffer from overfitting and slow learning. The advantage of reinforcement learning is that, like deep learning, it can adopt either model-free or model-based schemes to solve practical problems, which makes solving a specific problem more efficient and timely.
Some researchers map the relations between base stations, and between base stations and users, in a heterogeneous network to the field of graph theory, and then combine reinforcement learning with graph theory to decompose the initial Q-Learning problem into several Q-Learning sub-problems, so as to solve the network resource allocation and optimize system performance.
Disclosure of Invention
The invention aims to: the invention aims to provide a method for optimizing heterogeneous network resources by reinforcement learning, addressing the defect that directly applying reinforcement learning to heterogeneous network resource allocation leads to an excessively large action space; the convergence speed is improved by 60% compared with conventional tabular Q-Learning while almost reaching the theoretical value of the system energy efficiency.
The technical scheme is as follows: in order to achieve the above purpose, the invention adopts the following technical scheme:
a method for optimizing heterogeneous network resources using reinforcement learning, comprising the steps of:
step 1, establishing a Markov decision process according to a heterogeneous network energy efficiency target which needs to be optimized;
step 2, designing a traditional Q-Learning according to a Markov decision process;
step 3, redesigning the reward function value to address the problem that its order of magnitude in Q-Learning is too large: the value is first negated and then the reciprocal is taken, compressing the reward function value into (-1, 0);
step 4, dividing the conventional Q-Learning action space into three sub-Q-Learning action spaces according to the correlation of the actions, namely the ABS, CRE and small base station dormancy strategies;
step 5, cyclically iterating the stable solutions obtained by the three sub-Q-Learnings; to accelerate convergence, the stable solution of each loop iteration is not necessarily the optimal solution of the three sub-Q-Learnings;
step 6, substituting the solution obtained by each sub-problem into the conditions for solving the following two sub-problems, so that through mutual cyclic iteration the solutions of the three sub-problems reach a stable state simultaneously; the stable solutions of the three sub-problems are combined, and the optimal solution A_ABSo, A_CREo and A_picoo of the original problem is output.
Further, in step 1, a Markov decision process (S, A, P, R) is established; specifically, S is defined as the state space, i.e. the set of user positions within the heterogeneous network cell; A is defined as the action space, i.e. the set of actions selected by the agent in state S; P is defined as the transition probability, namely P(s_{t+1} = s' | s_t = s, a_t = a); and R is defined as the reward function.
Further, in step 3, the reward function value is redesigned: the system energy efficiency E is first negated and then the reciprocal is taken, compressing the reward function value into (-1, 0), namely R = 1/(-E) = -1/E, where E is the system energy efficiency function; since -1/E increases monotonically with E, consistency between the reward function and the system energy efficiency is ensured.
Further, in step 4, the conventional Q-Learning action space is divided into three sub-Q-Learning action spaces, i.e. A is decomposed into A_ABS, A_CRE and A_pico, the action space sets for optimizing the ABS, CRE and small base station dormancy strategies in turn; A_ABS is defined as the ABS configuration set, A_CRE as the CRE configuration set, and A_pico as the small base station dormancy strategy set; the ABS, CRE and small base station dormancy strategy solutions are then solved respectively.
Further, in step 5, the cyclic iteration satisfies
R_ABS ~ P(R|S, A_ABS) ≤ R_ABSo ~ P(R|S, A_ABSo),
R_CRE ~ P(R|S, A_CRE) ≤ R_CREo ~ P(R|S, A_CREo),
R_pico ~ P(R|S, A_pico) ≤ R_picoo ~ P(R|S, A_picoo),
where A_ABSo, A_CREo and A_picoo are the optimal actions of the three sub-Q-Learnings.
The principle of the invention: the initial problem is decomposed into several sub-problems according to the correlation of the configured resources, and the solution of the initial problem is obtained by cyclically iterating the solutions of the sub-problems. Q-Learning, rather than traditional mathematical methods, is used to solve the sub-problems: the initial problem is mapped to the reinforcement learning domain, the action space is divided according to the correlation of the actions, the original Q-Learning is decomposed into several sub-Q-Learnings according to this division, and the optimal strategy of the initial Q-Learning is obtained by cyclically iterating the optimal strategies of the sub-Q-Learnings. The system energy efficiency is redesigned as the reward function: it is first negated and then the reciprocal is taken, which compresses the reinforcement learning reward function value into (-1, 0) while keeping the new reward function consistent with the system energy efficiency.
Beneficial effects: compared with the prior art, the method for optimizing heterogeneous network resources by reinforcement learning integrates reinforcement learning with convex optimization theory and divides the action space according to the correlation of the actions, namely the ABS, CRE and small base station dormancy strategies; to address the problem that the system energy efficiency used as the reward function value is too large in order of magnitude during reinforcement learning modeling, it redesigns the reward function value by first negating it and then taking the reciprocal as the new reward function value. The invention reduces the action space of reinforcement learning, guarantees system convergence by means of convex optimization theory, and accelerates the convergence of reinforcement learning; simulation experiments show that the method converges, has lower complexity, and improves the convergence speed by 60% compared with conventional tabular Q-Learning while almost reaching the theoretical value of the system energy efficiency.
Drawings
FIG. 1 is a flow chart of a method construction process of the present invention;
FIG. 2 is a schematic diagram of iterative operation of a sub-Q-Learning loop of the present invention;
FIG. 3 is a chart showing the convergence rate of the conventional Q-Learning method under the same parameter setting;
FIG. 4 is a chart showing the convergence speed of the method of the present invention under the same parameter setting;
FIG. 5 is a diagram of the system energy efficiency achieved by the method of the present invention.
Detailed Description
The invention is further described below in conjunction with specific embodiments.
As shown in fig. 1-5, a method for optimizing heterogeneous network resources by reinforcement learning includes the following steps:
step 1: establishing a Markov decision process (Markov Decision Process, MDP) (S, A, P, R) according to the heterogeneous network energy efficiency target to be optimized, wherein S is defined as the state space, i.e. the set of user positions within the heterogeneous network cell; A is defined as the action space, i.e. the set of actions selected by the agent in state S; P is defined as the transition probability, namely
P(s_{t+1} = s' | s_t = s, a_t = a); and R is defined as the reward function.
Step 2: designing a traditional Q-Learning according to a Markov decision process;
step 3: to address the problem that the reward function value in Q-Learning is too large in order of magnitude, the reward function value is redesigned: it is first negated and then the reciprocal is taken, compressing it into (-1, 0), namely R = 1/(-E) = -1/E, where E is the system energy efficiency function; this simultaneously ensures consistency between the reward function and the system energy efficiency;
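The reward redesign of step 3 (negate the energy efficiency, then take the reciprocal) is a one-line transformation; the function name and the sample efficiency values below are assumptions for illustration.

```python
def reward(energy_efficiency):
    # Negate the system energy efficiency E, then take the reciprocal:
    #   R = 1 / (-E) = -1/E
    # For E > 1 this compresses R into (-1, 0), and R increases
    # monotonically with E, so maximizing R still maximizes E.
    return 1.0 / (-energy_efficiency)

# A larger energy efficiency yields a larger (less negative) reward.
assert reward(10.0) == -0.1
assert reward(1000.0) > reward(10.0)
```

Because the mapping is strictly increasing in E, the policy that maximizes the compressed reward is the same policy that maximizes the original energy efficiency.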
step 4: according to the correlation of the actions, namely the ABS, CRE and small base station dormancy strategies, the conventional Q-Learning action space is divided into three sub-Q-Learning action spaces, i.e. A is decomposed into A_ABS, A_CRE and A_pico, the action space sets for optimizing the ABS, CRE and small base station dormancy strategies in turn; A_ABS is defined as the ABS configuration set, A_CRE as the CRE configuration set, and A_pico as the small base station dormancy strategy set. The ABS, CRE and small base station dormancy strategy solutions are then solved respectively.
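The benefit of the division in step 4 can be sketched numerically: the joint action space grows multiplicatively in the three configuration sets, while the divided sub-Q-Learnings explore only additively many actions per round. The set sizes below are assumptions for illustration, not values from the patent.

```python
from itertools import product

# Hypothetical configuration sets (sizes chosen only for illustration):
A_ABS  = list(range(8))   # candidate ABS subframe ratios
A_CRE  = list(range(8))   # candidate CRE bias values
A_pico = list(range(8))   # candidate small-base-station sleep patterns

# Joint action space of the original Q-Learning: one action per combination.
joint = list(product(A_ABS, A_CRE, A_pico))

# After the division, each sub-Q-Learning searches only its own dimension.
joint_size = len(joint)                             # 8 * 8 * 8 = 512
split_size = len(A_ABS) + len(A_CRE) + len(A_pico)  # 8 + 8 + 8 = 24
print(joint_size, split_size)
```

The gap widens quickly as the individual sets grow, which is why the division shrinks the effective action space of reinforcement learning.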
Step 5: the loop iteration process is to perform loop iteration on stable solutions obtained by three sub Q-Learning. To increase the convergence rate, the stable solution for each loop iteration is not necessarily the optimal solution for three sub-Q-Learning, i.e
R ABS ~P(R|S,A ABS )≤R ABSo ~P(R|S,A ABSo ),
R CRE ~P(R|S,A CRE )≤R CREo ~P(R|S,A CREo ),
R CRE ~P(R|S,A CRE )≤R Picoo ~P(R|S,A Picoo ) Wherein A is ABSo ,A CREo And
A picoo is the optimal action of three sub Q-Learning;
step 6: the solution obtained by each sub-problem is substituted into the conditions for solving the following two sub-problems; through mutual cyclic iteration the solutions of the three sub-problems reach a stable state simultaneously, the stable solutions of the three sub-problems are combined, and the optimal solution A_ABSo, A_CREo and A_picoo of the original problem is output.
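Steps 5 and 6 can be sketched as a generic loop. The three solver functions below are stand-ins for the sub-Q-Learnings and are assumptions for illustration, not the patent's implementation; each one takes the current solutions of the other two sub-problems as fixed conditions.

```python
def cyclic_iteration(solve_abs, solve_cre, solve_pico, init, max_rounds=100):
    """Iterate the three sub-solvers until their solutions stop changing.

    Each solver conditions on the other two sub-problems' current
    solutions and returns its own (possibly non-optimal but stable)
    solution, mirroring steps 5 and 6 of the method.
    """
    a_abs, a_cre, a_pico = init
    for _ in range(max_rounds):
        new_abs  = solve_abs(a_cre, a_pico)    # condition on CRE and sleep
        new_cre  = solve_cre(new_abs, a_pico)  # condition on ABS and sleep
        new_pico = solve_pico(new_abs, new_cre)
        if (new_abs, new_cre, new_pico) == (a_abs, a_cre, a_pico):
            break  # all three stable at once: combine and output
        a_abs, a_cre, a_pico = new_abs, new_cre, new_pico
    return a_abs, a_cre, a_pico
```

With toy solvers whose outputs depend only on their conditions, the loop reaches the joint fixed point in one extra confirming round.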
FIG. 1 is a flow chart of the construction process of the method of the present invention. For complex problems, conventional tabular Q-Learning has a high-dimensional action space, so applying Q-Learning directly is impractical. As shown in fig. 1, an MDP is established according to the energy efficiency to be optimized, and a conventional tabular Q-Learning is built on it. To address the oversized action space, the action space is divided according to the relations among the actions that the system energy efficiency requires to be optimized. The original tabular Q-Learning is decomposed into three sub-Q-Learnings, from which the actions to be optimized are obtained. When the solutions of the three sub-Q-Learnings all remain stable during the loop iteration, they are combined and output, yielding the solution of the original Q-Learning.
FIG. 2 shows the flow of the three sub-Q-Learning loop iterations: the current sub-Q-Learning solution is updated from the solution of the previous loop iteration and is then used as a condition for the two sub-Q-Learnings solved next; through the loop iteration the solutions of the three sub-problems reach a stable state simultaneously, the stable solutions of the three sub-problems are combined, and the optimal solution of the original problem is generated and output.
Based on the flowcharts of fig. 1 and fig. 2, the simulation experiments set the number of users to 50, 100, 150 and 200, respectively, with users entering the cell at random. The wireless channel is modeled with deterministic path loss attenuation and random shadow fading, and the system bandwidth is set to 10 MHz. FIGS. 3 and 4 show the relationship between the number of iterations and the accuracy for the conventional tabular Q-Learning method and for the proposed method with improved reinforcement learning actions (TQL), where the learning rate, discount factor and greedy rate are each set to 0.1. Fig. 3 shows that the conventional Q-Learning method converges after about 80 × 10000 = 800000 iterations under different load conditions, while in fig. 4 the proposed TQL method converges after about 800 × 400 = 320000 iterations; in figs. 3-4, Accuracy denotes the accuracy rate, Learning rate the learning rate, Discount factor the discount factor, Greedy rate the greedy rate, and Itersteps the number of iteration steps. As can be seen from figs. 3 and 4, the convergence rate of the proposed TQL method is improved by about 60% compared with the conventional Q-Learning method.
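The reported iteration counts and the 60% improvement figure can be checked with simple arithmetic:

```python
# Conventional tabular Q-Learning: about 80 episodes x 10000 steps.
conventional = 80 * 10_000   # 800000 iterations to converge
# Proposed TQL method: about 800 episodes x 400 steps.
tql = 800 * 400              # 320000 iterations to converge

# Fraction of iterations saved relative to the conventional method.
speedup = 1 - tql / conventional
print(f"convergence speed improved by {speedup:.0%}")  # prints "60%"
```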
Fig. 5 compares the proposed TQL method with the conventional Q-Learning method and the ADPs ES IC method for heterogeneous network energy efficiency optimization, where Energy Efficiency denotes the system energy efficiency and UEs the number of users. From fig. 5(a) it can be seen that the system energy efficiency achieved by the proposed method is already very close to its theoretical value, and that it far exceeds the performance of the ADPs ES IC method proposed by other scholars. Fig. 5(b) shows the gap between the optimized heterogeneous network energy efficiency and its theoretical optimum; the gap arises mainly because in a few individual states the proposed method finds a relatively good solution rather than the optimal one. Fig. 5(b) verifies that the resulting loss of system energy efficiency is small.
The foregoing is merely a preferred embodiment of the present invention. It will be apparent to those skilled in the art that modifications and variations can be made without departing from the technical principles of the present invention, and such modifications and variations also fall within the protection scope of the invention.
Claims (5)
1. A method for optimizing heterogeneous network resources by using reinforcement learning is characterized in that: the method comprises the following steps:
step 1, establishing a Markov decision process according to a heterogeneous network energy efficiency target which needs to be optimized;
step 2, designing a traditional Q-Learning according to a Markov decision process;
step 3, redesigning the reward function value to address the problem that its order of magnitude in Q-Learning is too large: the value is first negated and then the reciprocal is taken, compressing the reward function value into (-1, 0);
step 4, dividing the conventional Q-Learning action space into three sub-Q-Learning action spaces according to the correlation of the actions, namely the ABS, CRE and small base station dormancy strategies;
step 5, cyclically iterating the stable solutions obtained by the three sub-Q-Learnings; to accelerate convergence, the stable solution of each loop iteration is not necessarily the optimal solution of the three sub-Q-Learnings;
step 6, substituting the solution obtained by each sub-problem into the conditions for solving the following two sub-problems, so that through mutual cyclic iteration the solutions of the three sub-problems reach a stable state simultaneously; the stable solutions of the three sub-problems are combined, and the optimal solution A_ABSo, A_CREo and A_picoo of the original problem is output.
2. The method for optimizing heterogeneous network resources using reinforcement learning of claim 1, wherein: in step 1, a Markov decision process (S, A, P, R) is established; specifically, S is defined as the state space, i.e. the set of user positions within the heterogeneous network cell; A is defined as the action space, i.e. the set of actions selected by the agent in state S; P is defined as the transition probability, namely
P(s_{t+1} = s' | s_t = s, a_t = a); and R is defined as the reward function.
3. The method for optimizing heterogeneous network resources using reinforcement learning of claim 2, wherein: in step 3, the reward function value is redesigned: it is first negated and then the reciprocal is taken, compressing the reward function value into (-1, 0), namely R = 1/(-E) = -1/E, where E is the system energy efficiency function; this simultaneously ensures consistency between the reward function and the system energy efficiency.
4. A method for optimizing heterogeneous network resources using reinforcement learning as recited in claim 3, wherein: in step 4, the conventional Q-Learning action space is divided into three sub-Q-Learning action spaces, i.e. A is decomposed into A_ABS, A_CRE and A_pico, the action space sets for optimizing the ABS, CRE and small base station dormancy strategies in turn; A_ABS is defined as the ABS configuration set, A_CRE as the CRE configuration set, and A_pico as the small base station dormancy strategy set; the ABS, CRE and small base station dormancy strategy solutions are solved respectively.
5. The method for optimizing heterogeneous network resources using reinforcement learning of claim 4, wherein: in step 5, the cyclic iteration satisfies
R_ABS ~ P(R|S, A_ABS) ≤ R_ABSo ~ P(R|S, A_ABSo),
R_CRE ~ P(R|S, A_CRE) ≤ R_CREo ~ P(R|S, A_CREo),
R_pico ~ P(R|S, A_pico) ≤ R_picoo ~ P(R|S, A_picoo),
where A_ABSo, A_CREo and A_picoo are the optimal actions of the three sub-Q-Learnings.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011002522.7A CN112188600B (en) | 2020-09-22 | 2020-09-22 | Method for optimizing heterogeneous network resources by reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112188600A CN112188600A (en) | 2021-01-05 |
CN112188600B true CN112188600B (en) | 2023-05-30 |
Family
ID=73955731
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011002522.7A Active CN112188600B (en) | 2020-09-22 | 2020-09-22 | Method for optimizing heterogeneous network resources by reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112188600B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113709882B (en) * | 2021-08-24 | 2023-10-17 | 吉林大学 | Internet of vehicles communication resource allocation method based on graph theory and reinforcement learning |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9622133B1 (en) * | 2015-10-23 | 2017-04-11 | The Florida International University Board Of Trustees | Interference and mobility management in UAV-assisted wireless networks |
CN108521673A (en) * | 2018-04-09 | 2018-09-11 | 湖北工业大学 | Resource allocation and power control combined optimization method based on intensified learning in a kind of heterogeneous network |
CN109726866A (en) * | 2018-12-27 | 2019-05-07 | 浙江农林大学 | Unmanned boat paths planning method based on Q learning neural network |
CN110691422A (en) * | 2019-10-06 | 2020-01-14 | 湖北工业大学 | Multi-channel intelligent access method based on deep reinforcement learning |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10091785B2 (en) * | 2014-06-11 | 2018-10-02 | The Board Of Trustees Of The University Of Alabama | System and method for managing wireless frequency usage |
- 2020-09-22: application CN202011002522.7A filed in China; granted as CN112188600B (status: Active)
Non-Patent Citations (1)
Title |
---|
"Deep Reinforcement Learning Methods for Intelligent Communications"; Tan Junjie; Journal of University of Electronic Science and Technology of China; full text *
Also Published As
Publication number | Publication date |
---|---|
CN112188600A (en) | 2021-01-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112492691B (en) | Downlink NOMA power distribution method of depth deterministic strategy gradient | |
CN112188600B (en) | Method for optimizing heterogeneous network resources by reinforcement learning | |
Li et al. | Energy efficiency maximization oriented resource allocation in 5G ultra-dense network: Centralized and distributed algorithms | |
Xu et al. | Dynamic client association for energy-aware hierarchical federated learning | |
Li et al. | Deep neural network based computational resource allocation for mobile edge computing | |
CN104640185A (en) | Cell dormancy energy-saving method based on base station cooperation | |
Hu et al. | Multi-agent DRL-based resource allocation in downlink multi-cell OFDMA system | |
Zhao et al. | Price-based power allocation in two-tier spectrum sharing heterogeneous cellular networks | |
CN114615730A (en) | Content coverage oriented power distribution method for backhaul limited dense wireless network | |
US11961409B1 (en) | Air-ground joint trajectory planning and offloading scheduling method and system for distributed multiple objectives | |
Wang et al. | Joint heterogeneous tasks offloading and resource allocation in mobile edge computing systems | |
CN111065121B (en) | Intensive network energy consumption and energy efficiency combined optimization method considering cell difference | |
Huang et al. | Drop Maslow's Hammer or not: machine learning for resource management in D2D communications | |
CN116233984A (en) | Energy-saving control method and device of base station, electronic equipment and storage medium | |
CN116132997A (en) | Method for optimizing energy efficiency in hybrid power supply heterogeneous network based on A2C algorithm | |
Mohammad et al. | Optimal task allocation for mobile edge learning with global training time constraints | |
CN115915454A (en) | SWIPT-assisted downlink resource allocation method and device | |
CN101729105A (en) | Power control structure and method thereof based on game theory model in network | |
CN107995034A (en) | A kind of dense cellular network energy and business collaboration method | |
Guo et al. | Deep reinforcement learning based traffic offloading scheme for vehicular networks | |
CN115720341A (en) | Method, medium and device for 5G channel shutoff | |
Besser et al. | Deep learning based resource allocation: How much training data is needed? | |
CN115250156A (en) | Wireless network multichannel frequency spectrum access method based on federal learning | |
CN103607759A (en) | Zoom dormancy method and apparatus for micro base station in cellular network | |
CN104507111B (en) | Collaborative communication method and device based on cluster in cellular network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 20210707; Address after: No.333 Xishan Avenue, Xishan District, Wuxi City, Jiangsu Province; Applicant after: Binjiang College of Nanjing University of Information Engineering and ICTEHI TECHNOLOGY DEVELOPMENT Co.,Ltd.; Address before: No.333 Xishan Avenue, Xishan District, Wuxi City, Jiangsu Province; Applicant before: Binjiang College of Nanjing University of Information Engineering ||
GR01 | Patent grant | ||