CN111741531B

CN111741531B - An optimization method for the best operating state of communication equipment under 5G base stations

Info

Publication number: CN111741531B
Application number: CN202010863911.2A
Authority: CN
Inventors: 李传煌; 倪郑威; 李军; 毛建洋; 梁刚; 陈青松; 诸葛斌; 鲁佳; 陈超
Original assignee: Zhejiang Gongshang University; Sunwave Communications Co Ltd
Current assignee: Zhejiang Gongshang University; Sunwave Communications Co Ltd
Priority date: 2020-08-12
Filing date: 2020-08-25
Publication date: 2020-11-24
Anticipated expiration: 2040-08-25
Also published as: CN111741531A

Abstract

The invention discloses a method for optimizing the operating state of communication equipment under a 5G base station. The communication equipment is divided into equipment in a non-real-time update state and equipment in a real-time update state. For equipment in the non-real-time update state, the time slots, the power supply and the uncontrollable parameter group are known in advance; an optimization problem is constructed and solved, and the operating state of the equipment is optimized according to the solution. For equipment in the real-time update state, the power supply and the uncontrollable parameters of a time slot become available only when that time slot arrives; a Markov decision process is constructed, its optimal policy is found, and the operating state of the equipment is optimized according to that policy. Because the invention addresses both the real-time and the non-real-time update states, the overall performance over the entire operating process can be made optimal.

Description

An optimization method for the best operating state of communication equipment under 5G base stations

Technical Field

The invention relates to the field of network communication, and in particular to a method for optimizing the operating state of communication equipment under a 5G base station.

Background

Traditional communication mechanisms usually assume that a device has enough energy to perform the required operations and do not take the device's energy into account. When the device's energy storage capacity is small and its power supply is uncertain, such mechanisms increase the risk that the device loses power.

To address this problem, this application studies the design of energy-harvesting-aware communication mechanisms, in which the communication process and related parameters are adjusted dynamically as the stored and supplied energy of the device changes. A suitable modeling and analysis method is selected according to the power supply characteristics, energy storage characteristics and data generation characteristics of each device. For devices whose power supply is controllable (for example, devices powered by a dedicated energy source), whose stored energy is known and whose data generation is predictable, the communication process is simulated offline from the known information, a suitable mathematical model is built from the service scenario and the relevant performance indicators, and the communication process is optimized with mathematical tools. For devices whose information is uncontrollable or unpredictable, models such as the Markov decision process and algorithms such as dynamic programming are used to optimize the device's operations and energy management in real time during communication.

Summary of the Invention

In view of the deficiencies of the prior art, the present invention provides a method for optimizing the operating state of communication equipment under a 5G base station.

The technical solution adopted in this application is as follows. The invention provides a method for optimizing the operating state of communication equipment under a 5G base station, in which the communication equipment under the 5G base station is divided into communication equipment in a non-real-time update state and communication equipment in a real-time update state.

For communication equipment in the non-real-time update state, the power supply $E_i$ and the uncontrollable parameter group $\boldsymbol{\theta}_i$ are known for every time slot $i$. An optimization problem is constructed and solved in the following steps:

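a) Determine the optimization objective: the device performance index under the 5G base station is quantified as $f(\mathbf{x}_i, \boldsymbol{\theta}_i)$, where $\mathbf{x}_i$ is the controllable parameter group and $\boldsymbol{\theta}_i$ is the uncontrollable parameter group of time slot $i$;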

b) Determine the power demand guarantee constraint: the energy held by the device at the start of each time slot must cover the power demand of that time slot;

c) Model the optimization problem: obtain the optimal performance index subject to the power demand guarantee constraint. The model is:

$$\max_{\mathbf{x}_1,\dots,\mathbf{x}_N} \sum_{i=1}^{N} f(\mathbf{x}_i, \boldsymbol{\theta}_i) \quad \text{s.t.} \quad B_1 + \sum_{j=1}^{i-1} \bigl(E_j - c(\mathbf{x}_j, \boldsymbol{\theta}_j)\bigr) \ge c(\mathbf{x}_i, \boldsymbol{\theta}_i), \quad i = 1, \dots, N$$

where $N$ is the total number of time slots, $B_1$ is the energy held by the device at the start of the first time slot, $E_j$ is the power supplied in time slot $j$, and $c(\mathbf{x}_i, \boldsymbol{\theta}_i)$ is the energy consumed by operating with controllable parameter group $\mathbf{x}_i$ under uncontrollable parameter group $\boldsymbol{\theta}_i$;

d) Solve the optimization problem of step c), and optimize the operating state of the communication equipment under the 5G base station according to the solution.

For communication equipment in the real-time update state, the device has only real-time information: the power supply and the uncontrollable parameter group of a time slot become known only when that time slot arrives. The operating state is optimized by constructing a Markov decision process and solving for its optimal policy, in the following steps:

A) Determine the state space, action space and reward: in the Markov decision process, if the state of a device in the real-time update state consists of power supply $e$, battery energy $b$ and uncontrollable parameter group $\boldsymbol{\theta}$, and the action taken is to select controllable parameter group $\mathbf{x}$ for that device, then the reward is the device performance index of interest at that moment;

B) Determine the decision rules and the policy: let the current state-action history be $h_t = (s_1, a_1, \dots, s_{t-1}, a_{t-1}, s_t)$, where $t$ denotes the $t$-th time slot. Under decision rule $d_t$, the action is determined by the current state-action history. The policy is expressed as $\pi = (d_1, d_2, \dots, d_N)$;

C) Determine the optimization objective and model the problem: a policy $\pi$ is judged by the expected sum of rewards rather than by the sum of rewards itself. When the initial state is $s$, the expected sum of rewards from the 1st to the $N$-th time slot is

$$v_N^{\pi}(s) = \mathbb{E}_s^{\pi}\left[\sum_{t=1}^{N} r(S_t, A_t)\right]$$

where $r(S_t, A_t)$ is the reward of time slot $t$, $\mathbb{E}_s^{\pi}$ is the expectation under policy $\pi$, and $S_t$ and $A_t$ are elements of the random state sequence and the random action sequence, respectively. The ultimate goal is to find the optimal policy $\pi^*$ such that

$$v_N^{\pi^*}(s) \ge v_N^{\pi}(s), \quad \forall \pi \in \Pi,\ \forall s \in \mathcal{S}$$

where $\Pi$ is the set of all possible policies and $\mathcal{S}$ is the state space;

D) Solve the optimization objective of step C) to obtain the optimal policy, and optimize the operating state of the communication equipment under the 5G base station according to that policy.

Further, the controllable parameter group of the communication equipment in the non-real-time update state and in the real-time update state includes the transmit power and the coding rate.

Further, the uncontrollable parameter group of the communication equipment in the non-real-time update state and in the real-time update state includes the channel conditions, the amount of data generated, and the allocated time and space resources.

Further, the device performance indicators under the 5G base station include bit error rate, throughput and quality of service (QoS).

Further, step b) is specifically as follows. Let $B_i$ denote the energy held by the device at the start of the $i$-th time slot. Then

$$B_{i+1} = B_i + E_i - c(\mathbf{x}_i, \boldsymbol{\theta}_i)$$

where $E_i$ is the power supplied in time slot $i$ and $c(\mathbf{x}_i, \boldsymbol{\theta}_i)$ is the energy consumed by operating with controllable parameter group $\mathbf{x}_i$ under uncontrollable parameter group $\boldsymbol{\theta}_i$. To guarantee the power demand of the device, the following energy constraints must hold:

$$B_i \ge c(\mathbf{x}_i, \boldsymbol{\theta}_i), \quad i = 1, \dots, N$$

where $N$ is the total number of time slots, giving $N$ energy constraints: in every time slot, the energy held by the device covers the power demand of that time slot.
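The energy constraint of step b) can be checked mechanically for a candidate schedule. The sketch below is illustrative only: it assumes the battery evolves as stored energy plus supply minus consumption, with no capacity limit, and the function names `battery_trace` and `is_feasible` are hypothetical.

```python
from typing import List, Sequence


def battery_trace(b1: float, supplied: Sequence[float],
                  consumed: Sequence[float]) -> List[float]:
    """Return [B_1, ..., B_{N+1}], assuming B_{i+1} = B_i + E_i - c_i."""
    levels = [b1]
    for e_i, c_i in zip(supplied, consumed):
        levels.append(levels[-1] + e_i - c_i)
    return levels


def is_feasible(b1: float, supplied: Sequence[float],
                consumed: Sequence[float]) -> bool:
    """True if the stored energy covers the demand (B_i >= c_i) in every slot."""
    levels = battery_trace(b1, supplied, consumed)
    # zip stops at N items, so the final level B_{N+1} is not compared.
    return all(b_i >= c_i for b_i, c_i in zip(levels, consumed))


if __name__ == "__main__":
    # Three slots, integer energy units chosen for illustration.
    print(battery_trace(5, [2, 0, 3], [4, 3, 0]))
    print(is_feasible(5, [2, 0, 3], [4, 3, 0]))
    print(is_feasible(5, [2, 0, 3], [4, 4, 0]))
```

In this sketch the energy supplied during slot i only becomes usable from slot i+1 onward, which matches a constraint stated on the energy held at the start of each slot.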

Further, in step d), if the optimization problem is convex, it is solved with a standard method for convex optimization; if it is not convex, a standard method for convex optimization is combined with a genetic algorithm to reduce the chance of converging to a suboptimal solution.

Further, the standard method for convex optimization is Newton's method or the interior point method.

Further, in step A), the set of values of the device's power supply is denoted $\mathcal{E}$, the set of values of the device's battery energy is denoted $\mathcal{B}$, and the set of values of the uncontrollable parameter group is denoted $\Theta$. The state space is then $\mathcal{S} = \mathcal{E} \times \mathcal{B} \times \Theta$, and a state $s$ is an element of the state space, $s = (e, b, \boldsymbol{\theta}) \in \mathcal{S}$. The set of values of the controllable parameter group, $\mathcal{A}$, is the action space: every controllable parameter group is an element of $\mathcal{A}$ and is called an action, so choosing a controllable parameter group is choosing an action in the Markov decision process.

Further, in step D), the optimal policy of the Markov decision process is obtained by dynamic programming, value iteration, policy iteration or linear programming.

Beneficial effects of the invention: different communication devices call for different performance indicators in different operating environments, and conventional optimization schemes rarely offer one general solution across those indicators. The same device also operates in two different states, real-time update and non-real-time update, and performance optimization in the real-time update state is usually not handled well. The invention quantifies the performance index of interest as a function $f(\mathbf{x}, \boldsymbol{\theta})$ of the controllable parameter group $\mathbf{x}$ and the uncontrollable parameter group $\boldsymbol{\theta}$, which makes it applicable to the optimization of different performance indicators. It also covers communication equipment in both the real-time and the non-real-time update states, so that the overall performance over the entire operating process can be optimal.

Description of Drawings

FIG. 1 is a flowchart of the method for optimizing the operating state of communication equipment under a 5G base station according to the present invention.

Detailed Description

Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawing.

As shown in FIG. 1, the invention provides a method for optimizing the operating state of communication equipment under a 5G base station. In this method, the communication equipment under the 5G base station is handled in two cases: equipment in the non-real-time update state and equipment in the real-time update state. In both cases the controllable parameter group includes the transmit power and the coding rate. The uncontrollable parameter group describes the external conditions of the equipment and includes the channel conditions, the amount of data generated, and the allocated time and space resources.

For communication equipment in the non-real-time update state, the power supply $E_i$ and the uncontrollable parameter group $\boldsymbol{\theta}_i$ are known for every time slot $i$. On this basis the method first constructs an optimization problem and then solves it: a) determine the optimization objective; b) characterize the power demand guarantee constraint; c) model the optimization problem; d) solve the optimization problem. The specific steps are as follows:

a) Determine the optimization objective.

The device performance index of interest under the 5G base station is quantified as $f(\mathbf{x}_i, \boldsymbol{\theta}_i)$; the optimization goal is to select the most suitable $\mathbf{x}_i$ so that $f(\mathbf{x}_i, \boldsymbol{\theta}_i)$ is maximized. The specific form and properties of $f$ depend on the performance index of interest, such as bit error rate, throughput or quality of service (QoS). Here $\boldsymbol{\theta}_i$ is the uncontrollable parameter group, fixed by the external environment or the service scenario, and $\mathbf{x}_i$ is the controllable parameter group. Performance is determined jointly by $\mathbf{x}_i$ and $\boldsymbol{\theta}_i$, so $\mathbf{x}_i$ is adjusted according to $\boldsymbol{\theta}_i$ to maximize $f$. In this embodiment, the time to send one frame is taken as the length of one time slot, the number of information bits in a frame is $L$, and the frame error rate is $P_e$. If the performance index of interest is the average number of correctly decoded information bits per time slot, then

$$f(\mathbf{x}_i, \boldsymbol{\theta}_i) = L\,(1 - P_e)$$

where the number of information bits per frame $L$ depends only on controllable parameters (coding rate, bandwidth, etc.), while the frame error rate $P_e$ depends on both controllable parameters (transmit power, etc.) and uncontrollable parameters (channel fading, etc.). That is, $L$ is determined by $\mathbf{x}_i$, whereas $P_e$ is determined jointly by $\mathbf{x}_i$ and $\boldsymbol{\theta}_i$.
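The trade-off described above, where a higher coding rate carries more information bits per frame but raises the frame error rate, can be illustrated with a short sketch. The frame-error model `toy_fer` is a made-up placeholder chosen only so that the error rate falls with received signal strength and rises with coding rate; it is not taken from the patent, and all function names are hypothetical.

```python
import math
from typing import Sequence


def expected_correct_bits(info_bits: int, fer: float) -> float:
    """Average correctly decoded information bits per slot: L * (1 - P_e)."""
    return info_bits * (1.0 - fer)


def toy_fer(tx_power: float, channel_gain: float, code_rate: float) -> float:
    """Placeholder frame error rate (assumption, not from the patent)."""
    snr = tx_power * channel_gain
    return math.exp(-snr * (1.0 - code_rate))


def best_code_rate(coded_bits: int, tx_power: float, channel_gain: float,
                   rates: Sequence[float]) -> float:
    """Pick the candidate coding rate with the highest expected correct bits."""
    def score(rate: float) -> float:
        info_bits = int(coded_bits * rate)  # L depends on the coding rate only
        fer = toy_fer(tx_power, channel_gain, rate)
        return expected_correct_bits(info_bits, fer)
    return max(rates, key=score)


if __name__ == "__main__":
    rates = [0.25, 0.5, 0.75]
    # Strong received signal versus weak received signal.
    print(best_code_rate(1000, 10.0, 1.0, rates))
    print(best_code_rate(1000, 1.0, 1.0, rates))
```

Under this toy model a strong channel favors the highest candidate rate, while a weak channel favors a more conservative rate, which is the kind of adjustment of the controllable parameters to the uncontrollable ones that the text describes.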

b) Characterize the power demand guarantee constraint.

Let $B_i$ denote the energy held by the device at the start of the $i$-th time slot. Then

$$B_{i+1} = B_i + E_i - c(\mathbf{x}_i, \boldsymbol{\theta}_i)$$

where $E_i$ is the power supplied in time slot $i$ and $c(\mathbf{x}_i, \boldsymbol{\theta}_i)$ is the energy consumed by operating with controllable parameter group $\mathbf{x}_i$ under uncontrollable parameter group $\boldsymbol{\theta}_i$. To guarantee the power demand of the device, the following energy constraints must hold:

$$B_i \ge c(\mathbf{x}_i, \boldsymbol{\theta}_i), \quad i = 1, \dots, N$$

where $N$ is the total number of time slots, giving $N$ energy constraints: in every time slot, the energy held by the device covers the power demand of that time slot.

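c) Model the optimization problem.

To make the overall performance of the communication equipment under the 5G base station the best over the entire operating process, the task is cast as an optimization problem that seeks the optimal performance index subject to the power demand guarantee constraint:

$$\max_{\mathbf{x}_1,\dots,\mathbf{x}_N} \sum_{i=1}^{N} f(\mathbf{x}_i, \boldsymbol{\theta}_i) \quad \text{s.t.} \quad B_1 + \sum_{j=1}^{i-1} \bigl(E_j - c(\mathbf{x}_j, \boldsymbol{\theta}_j)\bigr) \ge c(\mathbf{x}_i, \boldsymbol{\theta}_i), \quad i = 1, \dots, N$$

where $f$ is the performance index, $B_1$ is the energy held at the start of the first time slot, $E_j$ is the power supplied in time slot $j$, and $c$ is the energy consumed.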

d) Solve the optimization problem. If the problem is convex, it can be solved with a standard method for convex optimization (Newton's method, the interior point method, etc.). If it is not convex, one approach is to look for a reformulation that turns it into a convex problem and then solve that; another is to combine Newton's method, the interior point method or similar methods with other algorithms such as genetic algorithms, for example genetic optimization using derivatives (the R-GENOUD approach), to reduce the chance of converging to a suboptimal solution. Some non-convex problems with a specific structure can also be solved directly, for example by projected gradient descent, alternating minimization, the expectation-maximization algorithm or stochastic optimization. The operating state of the communication equipment under the 5G base station is then optimized according to the solution.
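For a very small instance with a discrete set of transmit power levels, the offline problem can be solved by plain exhaustive search, which is useful as a reference when checking a faster solver. The sketch below is not one of the solvers named in the text. It assumes, purely for illustration, that the performance index of a slot is the rate log2(1 + g·p) for channel gain g and transmit power p, that the energy consumed in a slot equals p, and that the battery has no capacity limit; `solve_offline` is a hypothetical name.

```python
import itertools
import math
from typing import Optional, Sequence, Tuple


def solve_offline(b1: float, supplied: Sequence[float], gains: Sequence[float],
                  power_levels: Sequence[float]
                  ) -> Tuple[Optional[Tuple[float, ...]], float]:
    """Enumerate every power schedule and keep the best feasible one."""
    best_plan, best_value = None, -math.inf
    for plan in itertools.product(power_levels, repeat=len(supplied)):
        battery, value, feasible = b1, 0.0, True
        for p, e_i, g_i in zip(plan, supplied, gains):
            if p > battery:  # stored energy must cover this slot's demand
                feasible = False
                break
            value += math.log2(1.0 + g_i * p)  # assumed performance index
            battery += e_i - p                 # assumed battery update
        if feasible and value > best_value:
            best_plan, best_value = plan, value
    return best_plan, best_value


if __name__ == "__main__":
    plan, value = solve_offline(b1=2, supplied=[0, 3, 0],
                                gains=[1.0, 1.0, 1.0],
                                power_levels=[0, 1, 2, 3])
    print(plan, value)
```

The search visits every combination of levels, so its cost grows exponentially with the number of time slots; it is only a sanity check for tiny cases, not a substitute for the convex or hybrid solvers discussed above.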

For communication equipment in the real-time update state, the device has only real-time information: the power supply and the uncontrollable parameters of a time slot are known only once that time slot arrives. A Markov decision process is therefore constructed and its optimal policy is found, in the following steps: a) characterize the state space, action space and reward; b) characterize the decision rules and the policy; c) characterize the optimization objective and model the problem; d) solve for the optimal policy of the Markov decision process.

a) Characterize the state space, action space and reward.

The set of values of the power supplied to the device is denoted $\mathcal{E}$, the set of values of the energy held by the device is denoted $\mathcal{B}$, and the set of values of the uncontrollable parameter group is denoted $\Theta$. The state space can then be written as $\mathcal{S} = \mathcal{E} \times \mathcal{B} \times \Theta$, and a state $s$ is an element of the state space, $s = (e, b, \boldsymbol{\theta}) \in \mathcal{S}$. The set of values of the controllable parameter group, $\mathcal{A}$, is the action space. Every controllable parameter group is thus an element of $\mathcal{A}$ and is called an action; choosing a controllable parameter group is choosing an action in the Markov decision process.
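The state and action spaces described above are Cartesian products of small value sets, which can be enumerated directly. The value sets in this sketch (integer energy units, two channel conditions, a few power levels and coding rates) are invented for illustration, and the function names are hypothetical.

```python
import itertools
from typing import Hashable, Iterable, List, Tuple


def build_state_space(supply_values: Iterable[Hashable],
                      battery_values: Iterable[Hashable],
                      uncontrollable_values: Iterable[Hashable]
                      ) -> List[Tuple]:
    """All states (supply, battery, uncontrollable parameter group)."""
    return list(itertools.product(supply_values, battery_values,
                                  uncontrollable_values))


def build_action_space(tx_powers: Iterable[Hashable],
                       code_rates: Iterable[Hashable]) -> List[Tuple]:
    """All actions, each a controllable parameter group (power, coding rate)."""
    return list(itertools.product(tx_powers, code_rates))


if __name__ == "__main__":
    states = build_state_space([0, 1, 2], [0, 1, 2, 3], ["bad", "good"])
    actions = build_action_space([0, 1, 2], [0.5, 0.75])
    print(len(states), len(actions))
```

Because the state space is a product, its size is the product of the sizes of the three value sets, so coarse quantization of supply, battery level and channel condition keeps the Markov decision process tractable.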

In the Markov decision process, the gain obtained by taking action $a$ in state $s$ is defined as the reward $r(s, a)$. If the controllable parameter group $\mathbf{x}$ is selected for a device in the real-time update state whose power supply is $e$, whose battery energy is $b$ and whose uncontrollable parameter group is $\boldsymbol{\theta}$, then the reward is the device performance index of interest at that moment, i.e.

$$r(s, a) = f(\mathbf{x}, \boldsymbol{\theta}), \quad s = (e, b, \boldsymbol{\theta}),\ a = \mathbf{x}$$

where $f$ is the performance index of interest for the communication equipment in the real-time update state.

b) Characterize the decision rules and the policy. A decision rule is the method for selecting an action in a given time slot. Specifically, let the current state-action history be $h_t = (s_1, a_1, \dots, s_{t-1}, a_{t-1}, s_t)$, where $t$ denotes the $t$-th time slot; under decision rule $d_t$, the action is determined by the current state-action history. A policy is a sequence of decision rules, denoted $\pi$, i.e. $\pi = (d_1, d_2, \dots, d_N)$, where $N$ is the total number of time slots. The decision rules used here are Markov and deterministic, i.e. the choice of action depends only on the current state.

c) Characterize the optimization objective and model the problem.

In practice the power supplied to the device and the uncontrollable parameter group are random, so a policy $\pi$ is judged here by the expected sum of rewards rather than by the sum of rewards itself. This quantity is denoted $v_N^{\pi}(s)$, i.e.

$$v_N^{\pi}(s) = \mathbb{E}_s^{\pi}\left[\sum_{t=1}^{N} r(S_t, A_t)\right]$$

where $S_t$ and $A_t$ are elements of the random state sequence $(S_1, \dots, S_N)$ and the random action sequence $(A_1, \dots, A_N)$, respectively, $r(S_t, A_t)$ is the reward of time slot $t$, and $\mathbb{E}_s^{\pi}$ is the expectation under policy $\pi$ with initial state $s$. The ultimate goal is to find the optimal policy $\pi^*$ such that

$$v_N^{\pi^*}(s) \ge v_N^{\pi}(s), \quad \forall \pi \in \Pi,\ \forall s \in \mathcal{S}$$

where $\Pi$ is the set of all possible policies and $\mathcal{S}$ is the state space.
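For a small model, the expected sum of rewards of a given Markov deterministic policy can be computed exactly by recursion over the time slots. The sketch below rests on assumptions not stated in the text: the supply and the uncontrollable parameter of each slot are drawn independently from a fixed joint distribution `exo_pmf`, the action is the energy spent in the slot, the battery has no capacity limit, and the reward function is supplied by the caller. All names are hypothetical.

```python
from typing import Callable, Dict, Tuple

State = Tuple[int, int, float]  # (supply e, battery b, uncontrollable theta)


def policy_value(policy: Callable[[int, State], int], t: int, state: State,
                 horizon: int, exo_pmf: Dict[Tuple[int, float], float],
                 reward: Callable[[int, float], float]) -> float:
    """Expected reward sum from slot t to the horizon, starting in `state`."""
    if t > horizon:
        return 0.0
    e, b, theta = state
    action = policy(t, state)  # Markov rule: depends on slot and state only
    if action > b:
        raise ValueError("policy spends more energy than the battery holds")
    total = reward(action, theta)
    b_next = b + e - action    # assumed battery update
    for (e_next, theta_next), prob in exo_pmf.items():
        total += prob * policy_value(policy, t + 1,
                                     (e_next, b_next, theta_next),
                                     horizon, exo_pmf, reward)
    return total


def spend_all(t: int, state: State) -> int:
    """Example policy: use the whole battery in every slot."""
    return state[1]


if __name__ == "__main__":
    pmf = {(0, 1.0): 0.5, (2, 1.0): 0.5}  # supply is 0 or 2, each with prob 0.5
    value = policy_value(spend_all, 1, (2, 1, 1.0), 3, pmf,
                         reward=lambda a, theta: a * theta)
    print(value)
```

The recursion branches over every outcome in `exo_pmf` at each slot, so it is exact but only practical for short horizons; comparing such values across candidate policies is what the optimality condition above formalizes.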

d) Solve for the optimal policy of the Markov decision process. For a standard Markov decision process, the optimal policy can be obtained by dynamic programming, value iteration, policy iteration, linear programming or similar methods. In addition, greedy strategies have been widely shown to reach locally optimal solutions. When the resulting performance loss is acceptable, a locally optimal policy of low computational complexity, such as a greedy policy, can therefore be used instead. The operating state of the communication equipment under the 5G base station is optimized according to the policy obtained.
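The difference between the optimal policy and a greedy one can be seen on a toy finite-horizon model solved by backward induction (dynamic programming). The model is an assumption made for illustration: energy is counted in integer units, the action is the energy spent in a slot, the reward is log2(1 + theta·a), the supply and channel of each slot are drawn independently from `exo_pmf`, and the battery has no capacity limit. The name `make_solvers` is hypothetical.

```python
import math
from functools import lru_cache
from typing import Dict, Tuple


def make_solvers(horizon: int, exo_pmf: Dict[Tuple[int, float], float],
                 max_power: int):
    """Return (optimal, greedy) evaluators for the toy energy model."""
    exo = tuple(exo_pmf.items())

    def reward(a: int, theta: float) -> float:
        return math.log2(1.0 + theta * a)  # assumed performance index

    @lru_cache(maxsize=None)
    def optimal(t: int, e: int, b: int, theta: float):
        """Backward induction: (best expected reward sum, best action)."""
        if t > horizon:
            return 0.0, None
        best_value, best_action = -math.inf, None
        for a in range(min(b, max_power) + 1):  # cannot spend more than b
            value = reward(a, theta) + sum(
                p * optimal(t + 1, e2, b + e - a, th2)[0]
                for (e2, th2), p in exo)
            if value > best_value:
                best_value, best_action = value, a
        return best_value, best_action

    @lru_cache(maxsize=None)
    def greedy(t: int, e: int, b: int, theta: float) -> float:
        """Expected reward sum when each slot maximizes its own reward."""
        if t > horizon:
            return 0.0
        a = min(b, max_power)  # reward increases with a, so spend the maximum
        return reward(a, theta) + sum(
            p * greedy(t + 1, e2, b + e - a, th2) for (e2, th2), p in exo)

    return optimal, greedy


if __name__ == "__main__":
    optimal, greedy = make_solvers(horizon=2, exo_pmf={(0, 1.0): 1.0},
                                   max_power=2)
    print(optimal(1, 0, 2, 1.0), greedy(1, 0, 2, 1.0))
```

In the two-slot example with no further supply, the optimal policy splits the two stored units evenly across the slots, while the greedy policy spends both at once and collects less in total; the gap is the performance loss that has to be weighed against the greedy policy's lower computational cost.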

The above embodiments are intended to explain the invention, not to limit it. Any modification or change made to the invention within its spirit and within the scope of protection of the claims falls within the scope of protection of the invention.

Claims

1. A method for optimizing the operating state of communication equipment under a 5G base station, characterized in that the communication equipment under the 5G base station is divided into communication equipment in a non-real-time update state and communication equipment in a real-time update state;

for communication equipment in the non-real-time update state, the power supply $E_i$ and the uncontrollable parameter group $\boldsymbol{\theta}_i$ are known for each time slot $i$, and an optimization problem is constructed and solved in the following steps:

a) determining the optimization objective: the device performance index under the 5G base station is quantified as $f(\mathbf{x}_i, \boldsymbol{\theta}_i)$, where $\mathbf{x}_i$ is the controllable parameter group;

b) determining the power demand guarantee constraint: the energy held by the device at the start of each time slot covers the power demand of that time slot; specifically, let $B_i$ denote the energy held by the device at the start of the $i$-th time slot; then

$$B_{i+1} = B_i + E_i - c(\mathbf{x}_i, \boldsymbol{\theta}_i)$$

where $c(\mathbf{x}_i, \boldsymbol{\theta}_i)$ is the energy consumed by operating with controllable parameter group $\mathbf{x}_i$ under uncontrollable parameter group $\boldsymbol{\theta}_i$; to guarantee the power demand of the device, the following energy constraints must hold:

$$B_i \ge c(\mathbf{x}_i, \boldsymbol{\theta}_i), \quad i = 1, \dots, N$$

where $N$ is the total number of time slots, giving $N$ energy constraints, i.e. in every time slot the energy held by the device covers the power demand of that time slot;

c) modeling the optimization problem: the optimal performance index is obtained subject to the power demand guarantee constraint, modeled as:

$$\max_{\mathbf{x}_1,\dots,\mathbf{x}_N} \sum_{i=1}^{N} f(\mathbf{x}_i, \boldsymbol{\theta}_i) \quad \text{s.t.} \quad B_1 + \sum_{j=1}^{i-1} \bigl(E_j - c(\mathbf{x}_j, \boldsymbol{\theta}_j)\bigr) \ge c(\mathbf{x}_i, \boldsymbol{\theta}_i), \quad i = 1, \dots, N$$

where $N$ is the total number of time slots, $B_1$ is the energy held by the device at the start of the first time slot, and $c(\mathbf{x}_i, \boldsymbol{\theta}_i)$ is the energy consumed by operating with controllable parameter group $\mathbf{x}_i$ under uncontrollable parameter group $\boldsymbol{\theta}_i$;

d) solving the optimization problem of step c), and optimizing the operating state of the communication equipment under the 5G base station according to the solution;

for communication equipment in the real-time update state, the device has only real-time information, i.e. the power supply and the uncontrollable parameter group of a time slot are obtained only when that time slot arrives; the operating state is optimized by constructing a Markov decision process and solving for its optimal policy, in the following steps:

A) determining the state space, action space and reward: the set of values of the device's power supply is denoted $\mathcal{E}$, the set of values of the device's battery energy is denoted $\mathcal{B}$, and the set of values of the uncontrollable parameter group is denoted $\Theta$; the state space is expressed as $\mathcal{S} = \mathcal{E} \times \mathcal{B} \times \Theta$; a state $s$ is an element of the state space, $s = (e, b, \boldsymbol{\theta}) \in \mathcal{S}$; the set of values of the controllable parameter group, $\mathcal{A}$, is the action space; every controllable parameter group is an element of $\mathcal{A}$ and is called an action; selecting a controllable parameter group is selecting an action in the Markov decision process; in the Markov decision process, if the state of the communication equipment in the real-time update state consists of power supply $e$, battery energy $b$ and uncontrollable parameter group $\boldsymbol{\theta}$, and the action taken is to select controllable parameter group $\mathbf{x}$ for that equipment, then the reward is the device performance index of interest at that moment;

B) determining the decision rules and the policy: let the current state-action history be $h_t = (s_1, a_1, \dots, s_{t-1}, a_{t-1}, s_t)$, where $t$ denotes the $t$-th time slot; under decision rule $d_t$, the action is determined by the current state-action history; the policy is expressed as $\pi = (d_1, d_2, \dots, d_N)$;

C) determining the optimization objective and modeling the problem: a policy $\pi$ is judged by the expected sum of rewards rather than by the sum of rewards; when the initial state is $s$, the expected sum of rewards from the 1st to the $N$-th time slot is:

$$v_N^{\pi}(s) = \mathbb{E}_s^{\pi}\left[\sum_{t=1}^{N} r(S_t, A_t)\right]$$

where $r(S_t, A_t)$ is the reward of time slot $t$, $\mathbb{E}_s^{\pi}$ is the expectation under policy $\pi$, and $S_t$ and $A_t$ are elements of the random state sequence and the random action sequence, respectively; the ultimate goal is to find the optimal policy $\pi^*$ such that

$$v_N^{\pi^*}(s) \ge v_N^{\pi}(s), \quad \forall \pi \in \Pi,\ \forall s \in \mathcal{S}$$

where $\Pi$ is the set of all possible policies and $\mathcal{S}$ is the state space;

D) solving the optimization objective of step C) to obtain the optimal policy, and optimizing the operating state of the communication equipment under the 5G base station according to the optimal policy.

2. The method of claim 1, wherein the set of controllable parameters for the communication device in the non-real-time update state and the communication device in the real-time update state includes a transmission power and a coding rate.

3. The method of claim 1, wherein the uncontrollable parameter set of the communication device in the non-real-time update state and the communication device in the real-time update state comprises channel conditions, generated data amount, allocated time resources and space resources.

4. The method of claim 1, wherein the device performance indicators under the 5G base station include bit error rate, throughput, and quality of service (QoS).

5. The method according to claim 1, wherein in step d), if the optimization problem is a convex optimization problem, the solution of a standard convex optimization problem is used to solve; if the optimization problem is not a convex optimization problem, a solution of a standard convex optimization problem is combined with a genetic algorithm to solve so as to reduce the occurrence of convergence to a suboptimal solution.

6. The method as claimed in claim 5, wherein the solution of the convex optimization problem is Newton's method or interior point method.

7. The method as claimed in claim 1, wherein in step D), the optimal policy of the Markov decision process is obtained by dynamic programming, value iteration, policy iteration or linear programming.