CN111741531B - An optimization method for the best operating state of communication equipment under 5G base stations - Google Patents
- Publication number
- CN111741531B
- Authority
- CN
- China
- Prior art keywords
- state
- time
- real
- optimization
- communication equipment
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W72/00—Local resource management
- H04W72/04—Wireless resource allocation
- H04W72/044—Wireless resource allocation based on the type of the allocated resource
- H04W72/0473—Wireless resource allocation based on the type of the allocated resource the resource being transmission power
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W52/00—Power management, e.g. Transmission Power Control [TPC] or power classes
- H04W52/04—Transmission power control [TPC]
- H04W52/18—TPC being performed according to specific parameters
- H04W52/20—TPC being performed according to specific parameters using error rate
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W52/00—Power management, e.g. Transmission Power Control [TPC] or power classes
- H04W52/04—Transmission power control [TPC]
- H04W52/18—TPC being performed according to specific parameters
- H04W52/26—TPC being performed according to specific parameters using transmission rate or quality of service QoS [Quality of Service]
- H04W52/265—TPC being performed according to specific parameters using transmission rate or quality of service QoS [Quality of Service] taking into account the quality of service QoS
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W52/00—Power management, e.g. Transmission Power Control [TPC] or power classes
- H04W52/04—Transmission power control [TPC]
- H04W52/18—TPC being performed according to specific parameters
- H04W52/26—TPC being performed according to specific parameters using transmission rate or quality of service QoS [Quality of Service]
- H04W52/267—TPC being performed according to specific parameters using transmission rate or quality of service QoS [Quality of Service] taking into account the information rate
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W72/00—Local resource management
- H04W72/50—Allocation or scheduling criteria for wireless resources
- H04W72/54—Allocation or scheduling criteria for wireless resources based on quality criteria
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W72/00—Local resource management
- H04W72/50—Allocation or scheduling criteria for wireless resources
- H04W72/54—Allocation or scheduling criteria for wireless resources based on quality criteria
- H04W72/542—Allocation or scheduling criteria for wireless resources based on quality criteria using measured or perceived quality
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W72/00—Local resource management
- H04W72/50—Allocation or scheduling criteria for wireless resources
- H04W72/54—Allocation or scheduling criteria for wireless resources based on quality criteria
- H04W72/543—Allocation or scheduling criteria for wireless resources based on quality criteria based on requested quality, e.g. QoS
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W72/00—Local resource management
- H04W72/04—Wireless resource allocation
- H04W72/044—Wireless resource allocation based on the type of the allocated resource
- H04W72/0446—Resources in time domain, e.g. slots or frames
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W72/00—Local resource management
- H04W72/04—Wireless resource allocation
- H04W72/044—Wireless resource allocation based on the type of the allocated resource
- H04W72/046—Wireless resource allocation based on the type of the allocated resource the resource being in the space domain, e.g. beams
Landscapes
- Engineering & Computer Science (AREA)
- Quality & Reliability (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
The invention discloses a method for optimizing the operating state of communication equipment under a 5G base station. The equipment is divided into two classes: equipment whose state is not updated in real time, and equipment whose state is updated in real time. For non-real-time equipment, the time slots, the supplied energy, and the uncontrollable parameter groups are known in advance; an optimization problem is constructed and solved, and the operating state of the equipment is optimized according to the solution. For real-time equipment, only real-time information is available: the supplied energy and the uncontrollable parameters for a time slot become known only when that slot arrives. A Markov decision process is constructed, its optimal policy is found, and the operating state of the equipment is optimized according to that policy. By covering both the real-time and the non-real-time case, the method allows the overall performance of the equipment over its entire operation to be optimized.
Description
Technical Field
The invention relates to the field of network communication, and in particular to a method for optimizing the operating state of communication equipment under a 5G base station.
Background
Conventional communication mechanisms usually assume that a device has enough energy to perform the required operations and do not take the device's energy into account. When the device's energy storage capacity is small and its power supply is uncertain, such mechanisms increase the risk of the device losing power.
To address this problem, this application studies the design of energy-harvesting-aware communication mechanisms, in which the communication process and its parameters are adjusted dynamically as the device's stored and supplied energy change. The modeling and analysis method is chosen according to the device's power supply characteristics, energy storage characteristics, and data generation characteristics. For devices whose power supply is controllable (for example, powered by a dedicated energy source), whose stored energy is known, and whose data generation is predictable, the communication process is simulated offline using the known information, a suitable mathematical model is built from the service scenario and the relevant performance metrics, and the communication process is optimized with mathematical tools. For devices with uncontrollable or unpredictable information, models such as the Markov decision process are used, and algorithms such as dynamic programming optimize the device's operations and energy management in real time during communication.
Summary of the Invention
To address the deficiencies of the prior art, the invention provides a method for optimizing the operating state of communication equipment under a 5G base station.
The technical solution adopted in this application is as follows. In the method, the communication equipment under the 5G base station is divided into equipment whose state is not updated in real time and equipment whose state is updated in real time.
For equipment whose state is not updated in real time, the time slot $i$, the supplied energy $E_i$, and the uncontrollable parameter group $\boldsymbol{\theta}_i$ are known; an optimization problem is constructed and solved in the following steps:
a) Determine the optimization objective: the equipment performance metric under the 5G base station is quantified as $f(\boldsymbol{x}_i,\boldsymbol{\theta}_i)$, where $\boldsymbol{x}_i$ is the controllable parameter group;
b) Determine the power demand guarantee constraint: the energy held by the device at the start of each time slot must cover the power demand of that slot;
c) Model the optimization problem: obtain the optimal performance metric subject to the power demand guarantee constraint, modeled as
$$\max_{\boldsymbol{x}_1,\dots,\boldsymbol{x}_N}\ \sum_{i=1}^{N} f(\boldsymbol{x}_i,\boldsymbol{\theta}_i) \quad \text{s.t.} \quad B_1 + \sum_{j=1}^{i-1}\bigl[E_j - e(\boldsymbol{x}_j,\boldsymbol{\theta}_j)\bigr] \ge e(\boldsymbol{x}_i,\boldsymbol{\theta}_i), \quad i = 1,\dots,N$$
where $N$ is the total number of time slots, $B_1$ is the energy held by the device at the start of the first time slot, and $e(\boldsymbol{x}_i,\boldsymbol{\theta}_i)$ is the energy consumed by operating with controllable parameter group $\boldsymbol{x}_i$ under uncontrollable parameter group $\boldsymbol{\theta}_i$;
d) Solve the optimization problem of step c) and optimize the operating state of the communication equipment under the 5G base station according to the solution;
For equipment whose state is updated in real time, only real-time information is available: the supplied energy and the uncontrollable parameter group for a time slot become known only when that slot arrives. A Markov decision process is constructed and its optimal policy is solved for in order to optimize the operating state, in the following steps:
A) Determine the state space, action space, and reward: in the Markov decision process, if the state $s$ of the real-time equipment consists of the supplied energy $E$, the battery energy $B$, and the uncontrollable parameter group $\boldsymbol{\theta}$, and the action $a$ taken is to select the controllable parameter group $\boldsymbol{x}$ of the equipment, then the reward $r(s,a)$ is the equipment performance metric of interest at that time;
B) Determine the decision rule and policy: let the current state-action history be $h_t = (s_1, a_1, \dots, s_{t-1}, a_{t-1}, s_t)$, where $t$ denotes the $t$-th time slot; under decision rule $d_t$, the action $a_t = d_t(h_t)$ is determined by the current state-action history; the policy is written $\pi = (d_1, d_2, \dots, d_N)$;
C) Determine the optimization objective and model the problem: a policy $\pi$ is judged by the expected reward sum rather than by the reward sum itself; when the initial state is $s$, the expected reward sum $v_N^{\pi}(s)$ from slot 1 to slot $N$ is
$$v_N^{\pi}(s) = \mathbb{E}_s^{\pi}\Bigl[\sum_{t=1}^{N} r_t(X_t, Y_t)\Bigr]$$
where $r_t$ is the reward in slot $t$, $\mathbb{E}_s^{\pi}$ is the expectation under policy $\pi$, and $X_t$ and $Y_t$ are elements of the random state sequence and the random action sequence respectively; the final goal is to find the optimal policy $\pi^*$ such that
$$v_N^{\pi^*}(s) \ge v_N^{\pi}(s), \quad \forall s \in \mathcal{S},\ \forall \pi \in \Pi$$
where $\Pi$ is the set of all possible policies and $\mathcal{S}$ is the state space;
D) Solve the optimization objective of step C) to obtain the optimal policy, and optimize the operating state of the communication equipment under the 5G base station according to that policy.
Further, the controllable parameter group of both the non-real-time and the real-time equipment includes the transmit power and the coding rate.
Further, the uncontrollable parameter group of both the non-real-time and the real-time equipment includes the channel conditions, the amount of data generated, and the allocated time and spatial resources.
Further, the equipment performance metrics under the 5G base station include bit error rate, throughput, and quality of service (QoS).
Further, step b) is specifically as follows: let $B_i$ be the energy held by the device at the start of the $i$-th time slot; then
$$B_i = B_1 + \sum_{j=1}^{i-1}\bigl[E_j - e(\boldsymbol{x}_j,\boldsymbol{\theta}_j)\bigr]$$
where $E_j$ is the energy supplied in slot $j$ and $e(\boldsymbol{x}_j,\boldsymbol{\theta}_j)$ is the energy consumed by operating with controllable parameter group $\boldsymbol{x}_j$ under uncontrollable parameter group $\boldsymbol{\theta}_j$. To guarantee the device's power demand, the following energy constraints must hold:
$$B_i \ge e(\boldsymbol{x}_i,\boldsymbol{\theta}_i), \quad i = 1,\dots,N$$
where $N$ is the total number of time slots, giving $N$ energy constraints: in every time slot, the energy held by the device covers the power demand of that slot.
Further, in step d), if the optimization problem is convex, it is solved with a standard convex optimization method; if it is not convex, a standard convex optimization method is combined with a genetic algorithm to reduce the chance of converging to a suboptimal solution.
Further, the standard convex optimization method is Newton's method or an interior-point method.
Further, in step A), denote the set of values of the device's supplied energy by $\mathcal{E}$, the set of values of the device's battery energy by $\mathcal{B}$, and the set of values of the uncontrollable parameter group by $\Theta$. The state space is then $\mathcal{S} = \mathcal{E} \times \mathcal{B} \times \Theta$, and a state $s$ is an element of the state space, $s \in \mathcal{S}$. The set of values of the controllable parameter group, $\mathcal{A}$, is the action space; any controllable parameter group is an element of $\mathcal{A}$, called an action, and selecting a controllable parameter group is selecting an action in the Markov decision process.
Further, in step D), the optimal policy of the Markov decision process is obtained by dynamic programming, value iteration, policy iteration, or linear programming.
Beneficial effects of the invention: different communication equipment must attend to different performance metrics in different operating environments, and conventional optimization schemes have difficulty offering a unified, general solution across metrics. The same equipment may also operate in either of two states, with real-time or non-real-time updates, and performance optimization for equipment in the real-time update state is usually not well solved. The invention quantifies the performance metric of interest as $f(\boldsymbol{x}_i,\boldsymbol{\theta}_i)$, a function of the controllable parameter group $\boldsymbol{x}_i$ and the uncontrollable parameter group $\boldsymbol{\theta}_i$, which suits metric optimization in different situations. It also handles equipment in both the real-time and the non-real-time update state, so that overall performance over the entire operation can be optimized.
Description of the Drawings
FIG. 1 is a flowchart of the method of the invention for optimizing the operating state of communication equipment under a 5G base station.
Detailed Description
Specific embodiments of the invention are described in further detail below with reference to the drawing.
As shown in FIG. 1, the invention provides a method for optimizing the operating state of communication equipment under a 5G base station. In the method, the equipment falls into two cases: equipment whose state is not updated in real time and equipment whose state is updated in real time. In both cases the controllable parameter group includes the transmit power and the coding rate. The uncontrollable parameter group in both cases describes the external conditions of the equipment and includes the channel conditions, the amount of data generated, and the allocated time and spatial resources.
For equipment whose state is not updated in real time, the time slot $i$, the supplied energy $E_i$, and the uncontrollable parameter group $\boldsymbol{\theta}_i$ are known. On this basis the method first constructs an optimization problem and then solves it: a) determine the optimization objective; b) characterize the power demand guarantee constraint; c) model the optimization problem; d) solve the optimization problem. The specific steps are as follows:
a) Determine the optimization objective;
The equipment performance metric of interest under the 5G base station is quantified as $f(\boldsymbol{x}_i,\boldsymbol{\theta}_i)$; the optimization objective is then to select the most suitable $\boldsymbol{x}_i$ so that $f(\boldsymbol{x}_i,\boldsymbol{\theta}_i)$ is maximized. The specific form and properties of $f$ depend on the performance metric of interest, such as bit error rate, throughput, or quality of service (QoS). Here $\boldsymbol{\theta}_i$ is the uncontrollable parameter group determined by the external environment or service scenario, $\boldsymbol{x}_i$ is the controllable parameter group, and the performance $f$ is determined jointly by $\boldsymbol{x}_i$ and $\boldsymbol{\theta}_i$: $\boldsymbol{x}_i$ is adjusted according to $\boldsymbol{\theta}_i$ so that $f$ is maximized. In this embodiment, let the time to send one frame be the length of one time slot, the number of information bits in one frame be $L$, and the frame error rate be $p$. If the performance metric of interest is the average number of correctly decoded information bits per time slot, then
$$f(\boldsymbol{x}_i,\boldsymbol{\theta}_i) = L\,(1 - p)$$
where $L$ depends only on controllable parameters (coding rate, bandwidth, etc.), while the frame error rate $p$ depends on both controllable parameters (transmit power, etc.) and uncontrollable parameters (channel fading, etc.). That is, $L$ is determined by $\boldsymbol{x}_i$, and $p$ is determined jointly by $\boldsymbol{x}_i$ and $\boldsymbol{\theta}_i$.
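The per-slot metric in this embodiment (information bits per frame multiplied by one minus the frame error rate) can be sketched in a few lines. The patent does not give a frame error rate model, so the outage-style formula below, the 1000 coded bits per frame, and all function and parameter names are illustrative assumptions rather than part of the disclosed method.

```python
import math


def info_bits_per_frame(code_rate: float, coded_bits: int = 1000) -> float:
    """Information bits per frame; depends only on controllable parameters.

    coded_bits is a hypothetical frame size chosen for illustration.
    """
    return code_rate * coded_bits


def frame_error_rate(tx_power: float, code_rate: float, channel_gain: float) -> float:
    """Toy frame error rate in [0, 1] (assumed model, not from the patent).

    Depends on controllable parameters (tx_power, code_rate) and on an
    uncontrollable one (channel_gain).
    """
    snr = tx_power * channel_gain
    if snr <= 0.0:
        return 1.0  # nothing is decoded without signal power
    return 1.0 - math.exp(-(2.0 ** code_rate - 1.0) / snr)


def performance(tx_power: float, code_rate: float, channel_gain: float) -> float:
    """Average number of correctly decoded information bits in one slot."""
    bits = info_bits_per_frame(code_rate)
    return bits * (1.0 - frame_error_rate(tx_power, code_rate, channel_gain))
```

With this toy model, raising the transmit power lowers the frame error rate and so raises the metric, while the number of information bits is set by the coding rate alone, which mirrors the split between controllable and uncontrollable parameters described above.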
b) Characterize the power demand guarantee constraint;
Let $B_i$ be the energy held by the device at the start of the $i$-th time slot; then
$$B_i = B_1 + \sum_{j=1}^{i-1}\bigl[E_j - e(\boldsymbol{x}_j,\boldsymbol{\theta}_j)\bigr]$$
where $E_j$ is the energy supplied in slot $j$ and $e(\boldsymbol{x}_j,\boldsymbol{\theta}_j)$ is the energy consumed by operating with controllable parameter group $\boldsymbol{x}_j$ under uncontrollable parameter group $\boldsymbol{\theta}_j$. To guarantee the device's power demand, the following energy constraints must hold:
$$B_i \ge e(\boldsymbol{x}_i,\boldsymbol{\theta}_i), \quad i = 1,\dots,N$$
where $N$ is the total number of time slots, giving $N$ energy constraints: in every time slot, the energy held by the device covers the power demand of that slot;
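A short sketch can make the energy constraint concrete. It assumes the battery evolves as "energy at the start of the next slot = energy at the start of this slot + energy supplied in this slot - energy consumed in this slot", with no capacity limit; that recursion and the function names are assumptions made for illustration.

```python
def battery_levels(b1, supplied, consumed):
    """Energy held at the start of each slot, [B_1, ..., B_N].

    Assumed recursion: B_{i+1} = B_i + supplied[i] - consumed[i].
    """
    levels = [b1]
    for e_in, e_out in zip(supplied[:-1], consumed[:-1]):
        levels.append(levels[-1] + e_in - e_out)
    return levels


def is_feasible(b1, supplied, consumed):
    """True if, in every slot, the energy held at the start covers the demand."""
    levels = battery_levels(b1, supplied, consumed)
    return all(b >= e for b, e in zip(levels, consumed))
```

For example, with an initial energy of 2.0, supplied energy [1.0, 1.0, 1.0], and consumption [1.5, 1.0, 2.0], the levels are [2.0, 1.5, 1.5], so the third slot's demand of 2.0 cannot be met and the plan is infeasible.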
c) Model the optimization problem;
To make the overall performance of the communication equipment under the 5G base station best over its entire operation, the task is cast as the problem of obtaining the optimal performance metric subject to the power demand guarantee constraint, modeled as follows:
$$\max_{\boldsymbol{x}_1,\dots,\boldsymbol{x}_N}\ \sum_{i=1}^{N} f(\boldsymbol{x}_i,\boldsymbol{\theta}_i) \quad \text{s.t.} \quad B_1 + \sum_{j=1}^{i-1}\bigl[E_j - e(\boldsymbol{x}_j,\boldsymbol{\theta}_j)\bigr] \ge e(\boldsymbol{x}_i,\boldsymbol{\theta}_i), \quad i = 1,\dots,N$$
d) Solve the optimization problem. If the problem is convex, it can be solved with standard convex optimization methods (Newton's method, interior-point methods, etc.). If it is not convex, one approach is to first try, by inspection, to transform it into a convex problem and then solve it; another is to combine Newton's method, interior-point methods, or similar methods with other algorithms such as genetic algorithms, for example R-genetic optimization using derivatives, to reduce the chance of converging to a suboptimal solution. The operating state of the communication equipment under the 5G base station is then optimized according to the solution. Some non-convex problems with a particular structure can also be solved directly, for example by projected gradient descent, alternating minimization, expectation maximization, or stochastic optimization.
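For a very small instance, the offline problem can be checked by exhaustive search, which is a different technique from the Newton, interior-point, or genetic methods named above and is shown only to make the problem statement concrete. The sketch assumes a single controllable parameter (a transmit power taken from a discrete set), a throughput-like metric log2(1 + power * gain), energy consumption equal to the chosen power, and an uncapped battery; all of these, and the function name, are illustrative assumptions.

```python
import itertools
import math


def solve_offline(b1, supplied, gains, power_levels):
    """Brute-force the best per-slot power plan under the energy constraint.

    supplied[i] and gains[i] are the (known) supplied energy and channel gain
    of slot i. Returns (best_plan, best_total_metric).
    """
    best_value, best_plan = float("-inf"), None
    for plan in itertools.product(power_levels, repeat=len(supplied)):
        battery, value, feasible = b1, 0.0, True
        for power, e_in, gain in zip(plan, supplied, gains):
            if power > battery:  # energy held must cover this slot's demand
                feasible = False
                break
            value += math.log2(1.0 + power * gain)  # assumed metric
            battery += e_in - power                 # assumed battery recursion
        if feasible and value > best_value:
            best_value, best_plan = value, plan
    return best_plan, best_value
```

With an initial energy of 2, no supplied energy, gains [1.0, 3.0], and power levels (0, 1, 2), the search returns the plan (1, 1): splitting the energy across both slots beats spending it all in either one. The cost grows exponentially with the number of slots, which is why the methods listed in step d) are used in practice.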
For equipment whose state is updated in real time, only real-time information is available: the supplied energy and the uncontrollable parameters for a time slot become known only when that slot arrives. On this basis a Markov decision process is constructed and its optimal policy is found. The specific steps are as follows: a) characterize the state space, action space, and reward; b) characterize the decision rule and policy; c) characterize the optimization objective and model the problem; d) solve for the optimal policy of the Markov decision process.
a) Characterize the state space, action space, and reward;
Denote the set of values of the energy supplied to the device by $\mathcal{E}$, the set of values of the energy held by the device by $\mathcal{B}$, and the set of values of the uncontrollable parameter group by $\Theta$. The state space can then be written $\mathcal{S} = \mathcal{E} \times \mathcal{B} \times \Theta$, and a state $s$ is an element of the state space, $s \in \mathcal{S}$. The set of values of the controllable parameter group, $\mathcal{A}$, is the action space; any controllable parameter group is thus an element of $\mathcal{A}$, called an action, and selecting a controllable parameter group is selecting an action in the Markov decision process;
The payoff obtained by taking action $a$ in state $s$ of the Markov decision process is defined as the reward $r(s,a)$. If the controllable parameter group $\boldsymbol{x}$ is selected when the supplied energy of the real-time equipment is $E$, its battery energy is $B$, and its uncontrollable parameter group is $\boldsymbol{\theta}$, the reward is the equipment performance metric of interest at that time, i.e.
$$r(s,a) = f(\boldsymbol{x},\boldsymbol{\theta}), \quad s = (E, B, \boldsymbol{\theta}),\ a = \boldsymbol{x}$$
where $f$ is the performance metric of interest for the real-time equipment;
b) Characterize the decision rule and policy. A decision rule is a method for selecting the action in a given time slot. Specifically, let the current state-action history be $h_t = (s_1, a_1, \dots, s_{t-1}, a_{t-1}, s_t)$, where $t$ denotes the $t$-th time slot; under decision rule $d_t$, the action $a_t = d_t(h_t)$ is determined by the current state-action history. A policy is a sequence of decision rules, denoted $\pi$, i.e. $\pi = (d_1, d_2, \dots, d_N)$, where $N$ is the total number of time slots. The decision rules are Markovian and deterministic, i.e. the choice of action depends only on the current state;
c) Characterize the optimization objective and model the problem;
Because in practice the energy supplied to the device and the uncontrollable parameter group are random, a policy $\pi$ is judged here by the expected reward sum rather than by the reward sum itself. With initial state $s$, the expected reward sum is denoted $v_N^{\pi}(s)$, i.e.
$$v_N^{\pi}(s) = \mathbb{E}_s^{\pi}\Bigl[\sum_{t=1}^{N} r_t(X_t, Y_t)\Bigr]$$
where $X_t$ and $Y_t$ are elements of the random state sequence $(X_1,\dots,X_N)$ and the random action sequence $(Y_1,\dots,Y_N)$ respectively, $r_t$ is the reward in slot $t$, and $\mathbb{E}_s^{\pi}$ is the expectation under policy $\pi$. The final goal is to find the optimal policy $\pi^*$ such that
$$v_N^{\pi^*}(s) \ge v_N^{\pi}(s), \quad \forall s \in \mathcal{S},\ \forall \pi \in \Pi$$
where $\Pi$ is the set of all possible policies;
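The expected reward sum of a fixed policy can be computed exactly for a small discrete model by propagating the state distribution forward one slot at a time. The sketch below is a toy under stated assumptions: a state is a tuple (supplied energy, battery energy, channel gain), the action is an integer transmit power no larger than the battery energy, the reward is log2(1 + power * gain), the supplied energy and gain of the next slot are drawn independently from fixed distributions, and the battery is capped at `battery_max`. None of these modeling choices or names come from the patent.

```python
import math


def expected_reward_sum(policy, horizon, init_state, energies, gains, battery_max):
    """Exact expected N-slot reward of a Markov deterministic policy.

    policy(t, state) -> transmit power; energies and gains map each possible
    value to its probability (assumed independent across slots).
    """
    dist = {init_state: 1.0}  # probability of each state in the current slot
    total = 0.0
    for t in range(horizon):
        nxt = {}
        for (e, b, g), prob in dist.items():
            a = policy(t, (e, b, g))
            assert 0 <= a <= b, "action must respect the energy constraint"
            total += prob * math.log2(1.0 + a * g)  # assumed reward
            nb = min(b + e - a, battery_max)        # assumed battery update
            for e2, pe in energies.items():
                for g2, pg in gains.items():
                    key = (e2, nb, g2)
                    nxt[key] = nxt.get(key, 0.0) + prob * pe * pg
        dist = nxt
    return total


def spend_all(t, state):
    """Greedy rule: use the whole battery in every slot."""
    return state[1]


def save_then_spend(t, state):
    """Hold back in the first slot, then use the whole battery."""
    return state[1] if t >= 1 else 0
```

Starting from state (0, 1, 1.0) with two slots, supplied energy 0 or 1 with equal probability, and gain 1.0 or 3.0 with equal probability, `spend_all` earns 1.0 in expectation and `save_then_spend` earns 1.5, so the two policies are ranked by their expected reward sums exactly as the objective above prescribes.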
d) Solve for the optimal policy of the Markov decision process. For a standard Markov decision process, the optimal policy can be obtained by dynamic programming, value iteration, policy iteration, linear programming, or similar methods. In addition, greedy policies have been widely shown to reach locally optimal solutions. Therefore, when the performance loss is acceptable, a locally optimal policy of low computational complexity, such as a greedy policy, may be used instead. The operating state of the communication equipment under the 5G base station is then optimized according to the chosen policy.
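Of the methods listed, dynamic programming is the most direct for a finite number of slots: backward induction computes the best action for every state, starting from the last slot. The sketch below uses a toy model that is assumed for illustration only: a state is (supplied energy, battery energy, channel gain) with integer energies, the action is a transmit power from a discrete set that must include 0, the reward is log2(1 + power * gain), the next slot's supplied energy and gain are independent draws from fixed distributions, and the battery is capped at `battery_max`.

```python
import math


def backward_induction(horizon, energies, gains, battery_max, powers):
    """Finite-horizon dynamic programming for the toy energy model.

    energies, gains: dict mapping each possible value to its probability.
    Returns (value, policy): value[state] is the optimal expected reward sum
    from the first slot, and policy[t][state] is the optimal power in slot t.
    """
    states = [(e, b, g) for e in energies
              for b in range(battery_max + 1) for g in gains]
    value = {s: 0.0 for s in states}  # nothing is earned after the last slot
    policy = []
    for _ in range(horizon):          # iterate from the last slot backwards
        new_value, rule = {}, {}
        for (e, b, g) in states:
            best_a, best_q = None, float("-inf")
            for a in powers:
                if a > b:             # cannot spend more than the battery holds
                    continue
                nb = min(b + e - a, battery_max)
                future = sum(pe * pg * value[(e2, nb, g2)]
                             for e2, pe in energies.items()
                             for g2, pg in gains.items())
                q = math.log2(1.0 + a * g) + future
                if q > best_q:
                    best_a, best_q = a, q
            new_value[(e, b, g)] = best_q
            rule[(e, b, g)] = best_a
        value = new_value
        policy.insert(0, rule)        # rule for an earlier slot goes in front
    return value, policy
```

In a two-slot example with supplied energy 0 or 1 and gain 1.0 or 3.0, each equally likely, the first-slot rule spends one unit of energy when the gain is 3.0 but saves it when the gain is 1.0. A greedy rule that always spends would lose part of the expected reward in the second case, which illustrates the performance loss that the text says must be acceptable before a greedy policy is adopted.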
The above embodiments are intended to explain the invention, not to limit it. Any modification or change made to the invention within its spirit and the scope of protection of the claims falls within the scope of protection of the invention.
Claims (7)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010809016 | 2020-08-12 | ||
CN2020108090162 | 2020-08-12 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111741531A (en) | 2020-10-02 |
CN111741531B (en) | 2020-11-24 |
Family
ID=72658804
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010863911.2A (Active, CN111741531B) | 2020-08-12 | 2020-08-25 | An optimization method for the best operating state of communication equipment under 5G base stations |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111741531B (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107809764A (en) * | 2017-09-21 | 2018-03-16 | 浙江理工大学 | A kind of multiple affair detection method based on Markov chain |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106713346B (en) * | 2017-01-13 | 2021-01-12 | 电子科技大学 | WLAN protocol design and analysis method based on wireless radio frequency energy transmission |
CN107105438B (en) * | 2017-04-20 | 2020-06-26 | 成都瑞沣信息科技有限公司 | QoS-based data and energy integrated transmission strategy design method |
CN108880893B (en) * | 2018-06-27 | 2021-02-09 | 重庆邮电大学 | Mobile edge computing server combined energy collection and task unloading method |
CN110113195B (en) * | 2019-04-26 | 2021-03-30 | 山西大学 | Method for joint unloading judgment and resource allocation in mobile edge computing system |
Also Published As
Publication number | Publication date |
---|---|
CN111741531A (en) | 2020-10-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110381541B (en) | Smart grid slice distribution method and device based on reinforcement learning | |
CN111524034B (en) | High-reliability low-time-delay low-energy-consumption power inspection system and inspection method | |
Hou et al. | Packets with deadlines: A framework for real-time wireless networks | |
CN113438315A (en) | Internet of things information freshness optimization method based on dual-network deep reinforcement learning | |
CN112788629B (en) | Online combined control method for power and modulation mode of energy collection communication system | |
Xiong et al. | Index-aware reinforcement learning for adaptive video streaming at the wireless edge | |
CN113395723B (en) | 5G NR downlink scheduling delay optimization system based on reinforcement learning | |
Qiao et al. | Age-optimal power control for status update systems with packet-based transmissions | |
Ghosh et al. | Achieving sub-linear regret in infinite horizon average reward constrained mdp with linear function approximation | |
US11323192B2 (en) | Adaptive modulation method for Bayes classifier-based energy harvesting relay system | |
CN111741531B (en) | An optimization method for the best operating state of communication equipment under 5G base stations | |
CN113329419B (en) | Online combined control method for power and rate of energy collection communication system | |
Nadeem et al. | HARQ optimization for real-time remote estimation in wireless networked control | |
CN118520934A (en) | Wireless Bayesian federal learning method and device based on simulated message aggregation | |
Le et al. | Radio link level performance evaluation in wireless networks using multi-rate transmission with ARQ-based error control | |
Ma et al. | A variation-aware approach for task allocation in wireless distributed computing systems | |
Pielli et al. | Minimizing data distortion of periodically reporting iot devices with energy harvesting | |
CN112218378A (en) | Wireless resource allocation method in imperfect channel state information fading channel | |
Suljović et al. | Leveraging outage probability in systems limited by BX fading and Nakagami-m co-channel interference for classification-based QoS estimation | |
Fu et al. | A new theoretic foundation for cross-layer optimization | |
Bodin et al. | Energy harvesting communication system with a finite set of transmission rates | |
CN110996398A (en) | Wireless network resource scheduling method and device | |
CN116980982B (en) | Task processing method and system under flexible networking architecture | |
Elumar et al. | Multi-armed bandits with probing | |
Zhou | Robust cross-layer design with reinforcement learning for IEEE 802.11 n link adaptation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||